search sprint

Drupal Search: How indexing works

This article explores the process of taking HTML content from Drupal nodes and indexing it for the purpose of search and text retrieval at a later time. The code examples apply to Drupal 6.

## Finding what to index

Minnesota Search Sprint: Your top-five feature requests

In the same way that the Internet itself would not have achieved greatness without the ability to search it easily and efficiently, Drupal's greatness will always be tied directly to the effectiveness of its core search solution. Improving core search for Drupal 7 will be no small task, however. The current implementation is elegant but complex, robust yet inflexible. The seven coders participating in the Minnesota Search Sprint this weekend have a great challenge as well as a great opportunity. Here are some of the things we hope to achieve:

- Identify the most important weaknesses in Drupal search and create a project plan for fixing them.
- Identify the most important new features currently missing from Drupal search and clear the roadblocks for implementing them.
- Increase the test coverage for Drupal search.
- Increase general developer awareness and knowledge of search.

A large part of what we will be doing is evaluating and planning. Without a roadmap and a common understanding of what search is to become, little progress will be made in the Drupal 7 development cycle. However, a coding sprint is all about code, and we'll be writing some of that, too. Specifically, I'm hoping that we'll be able to fix one of the top-five bugs, increase the search module's test coverage, and come up with a first attempt at one of the top-five new features.

That's a lot! No matter what we manage to code during the three days together, we'll walk away with a high level of agreement about our goals for the next months, and plenty of homework to do.

We'll post regular updates that you can follow on Planet Drupal, as well as in the search group, and we're all ears if you have suggestions or wishes. For anyone wanting to catch up on their search-related reading, here are some links:

Drupal's search compared to Google and Yahoo!

When Drupal does a content search, it optionally weights the results using up to four scoring factors: keyword relevancy, recency of the content, number of comments, and (if the statistics module is enabled) the number of page views. Site administrators can adjust the relative weighting of these scoring factors from the example.com/admin/settings/search administration page. Setting any scoring factor to zero disables it.
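To make the idea of relative weighting concrete, here is a minimal sketch in Python. It is not Drupal's actual scoring code: the `Result` class, the `combined_score` function, the factor names, and the assumption that every factor is already normalized to the range 0 to 1 are all hypothetical and chosen only for illustration. It shows one plausible way to blend four factors by weight, with a weight of zero switching a factor off.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """A hypothetical search hit; each factor is assumed normalized to 0..1."""
    title: str
    relevance: float  # keyword relevancy
    recency: float    # how recently the content was posted
    comments: float   # number of comments, normalized
    views: float      # page views, normalized (needs statistics data)


def combined_score(result, weights):
    """Blend the factors as a weighted average (illustrative formula only).

    `weights` maps a factor name to its administrator-chosen weight.
    A weight of zero disables that factor entirely.
    """
    active = {name: w for name, w in weights.items() if w > 0}
    total = sum(active.values())
    if total == 0:
        return 0.0  # every factor disabled, so nothing to rank by
    return sum(w * getattr(result, name) for name, w in active.items()) / total


# Hypothetical settings: relevancy and recency count equally,
# comments and views are disabled by setting them to zero.
weights = {"relevance": 5, "recency": 5, "comments": 0, "views": 0}

old_but_exact = Result("Exact match, old post", 0.9, 0.1, 1.0, 1.0)
new_but_loose = Result("Loose match, new post", 0.5, 0.9, 0.0, 0.0)

ranked = sorted([old_but_exact, new_but_loose],
                key=lambda r: combined_score(r, weights),
                reverse=True)
```

With these made-up numbers the newer post outranks the closer keyword match, because recency carries as much weight as relevancy. The disabled factors have no effect even though the older post scores highly on both of them.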

In this article, which applies primarily to Drupal 6 but is relevant for Drupal 5 as well, I explore how useful these scoring factors really are, and whether they help Drupal search live up to the high standards set by leaders like Google and Yahoo!. This article is part of a series of search-related articles in preparation for the Minnesota Search Sprint.

Drupal's Search Framework: The execution of a search

Drupal's ambitious search module provides a framework for building searches of all kinds. By isolating the tasks involved in searching, and allowing the actual search implementations to be handled by other modules, the search framework sets the stage for all sorts of creative search applications. This article, which applies to Drupal 6, explores the structure of the search framework by following the steps needed to execute a search.

## Structure of a search

The Minnesota Search Sprint

Continuing the great and growing tradition of bringing people together in small groups to attack focused problems, a search-related code sprint has been planned. From May 9 to 11, at the headquarters of the University of Minnesota Libraries, a small but dedicated group of Drupal coders will be melding minds to bring forth the next generation of Drupal search.

## Why Search?

Drupal has a great search module. The search index it builds powers search on Drupal.org and thousands of other sites. It is a critical piece of the Drupal project and fundamental to countless sites built on Drupal. Being able to effectively search for issues and solutions is a cornerstone of keeping the Drupal.org community happy and productive, so investing in making search even better is akin to investing in Drupal's overall success.

Minneapolis: Hub of the Drupalverse?

As a native Bostonian, I'm pretty hesitant about calling something the "hub" of anything. I have to reconsider my instinct as I look at the Drupal events scheduled for the week of May 5-11 in Minneapolis. That week, the Twin Cities will host two opportunities for training, the Drupal search sprint, and the Minnesota Bar Camp (minnebar). For that week, the Hub of the Drupalverse has clearly shifted to the American Midwest.