The Minnesota Search Sprint

Continuing the great and growing tradition of bringing people together in small groups to attack focused problems, a search-related code sprint has been planned. From May 9 to 11, at the headquarters of the University of Minnesota Libraries, a small but dedicated group of Drupal coders will be melding minds to bring forth the next generation of Drupal search.

## Why Search?

Drupal has a great search module. The search index it builds powers search on Drupal.org and thousands of other sites. It is a critical piece of the Drupal project and fundamental to countless sites built on Drupal. Being able to effectively search for issues and solutions is a cornerstone of keeping the Drupal.org community happy and productive, so investing in making search even better is akin to investing in Drupal's overall success.

Search is a complicated subject of study. The text processing that goes into building a search index, including the one that has shipped with Drupal for the past few versions, is complex and outside of the typical range of tasks that most Drupal developers deal with on a day-to-day basis. Performance is also an issue, and expectations for the relevancy and speed of search have become very high in the last decade.

As a framework for building custom searches, the current search module has many subtle but important limitations. For example, the core searches, user and node, can't be turned off. A truly pluggable architecture hasn't been attained, simply because one can't replace what is already there. <a href="http://api.drupal.org/api/function/hook_search/6">hook\_search</a>, the workhorse of the search framework, is also in need of some extensions, especially in the area of discovering the capabilities of each search implementation. <a href="http://api.drupal.org/api/function/do_search/6">do\_search</a>, the function which is supposed to be the keystone of the search API, is so confusing that it is rarely used outside of the core search module. All of these problems are fixable, but they need coordination, investigation, and elbow grease.

## Who's involved?

The Minnesota search sprint was born during the BoF session on ApacheSolr search that took place at the Boston Drupalcon in March. Many of us who have been working on search have been doing so as solo efforts. Whether it is faceted search, ApacheSolr, or Sphinx integration, very little collaboration between projects has been happening. This means that some of the limitations of the core search framework have been addressed several times by different people in different ways. While this is evidence of duplicated effort and hints at the need for better efficiency, the good news is that it will make it easier to decide which approaches are successful and which can be learned from for Drupal 7.

The Minnesota Search Sprint is being hosted by the University of Minnesota Libraries. The individuals attending are being sponsored by their respective companies, including Acquia, McDean, Inc. / OpenBand, Workhabit, CivicActions, The University of Michigan, Laboratoire NT2, and BoldSource. The attendees will be: Earnest Berry, Robert Douglass, Chad Fennell, Doug Green, Michael Hess, Djun Kim, David Lesieur, and Blake Luchessi.

You can participate in improving Drupal's search too! Join the search group, review some patches, and follow the Search Sprint's progress and updates on groups.drupal.org.

## Comments

Posted by tjholowaychuk (not verified).

Nice! It really annoys me that user and node were required; for that reason alone I have strayed away from the core search on many occasions. Having the ability to use the same search functionality on a specific content type, with a specific display, would be great: not just a flat list, but fully rendered teaser lists!

Posted by David Lesieur (not verified).

I'll be participating in the search sprint too, thanks to support from Laboratoire NT2 and Acquia. I'm impatient to work with all those great minds! This will be a fantastic opportunity to join forces.

Posted by Anonymous (not verified).

Great!

One massive limitation of core search for me has been the lack of support for paging.

I have a site where comments can easily be 100 pages long. Now, a search result linking you to the first page of the node is less than useless.