Delivering the "Right" Search Results

The Apache Solr search server that powers Acquia Search has many powerful features. One of the less appreciated ones is the ability to specify at query time that documents matching certain criteria should get an extra "boost" in their relevancy score. This means that they appear higher in the search results.

Imagine that you are maintaining a site and you have recently added Acquia Search. Your boss, Bob, is not pleased, however. He says, "I thought you told me this new search would do a better job of finding the most relevant results, but when I try it, the ones I expect to see come up first are not there." After you protest that the results are good matches for the keywords, further discussion reveals that Bob expects his blog posts to be the most relevant matches!

In the Apache Solr settings you can use the "Content bias settings" tab and the "Search fields" tab to adjust the boost (see screenshot below). The boost can be set based on a range of properties, including content type and node properties, as well as for cases where a keyword matches a certain node field or taxonomy vocabulary. By changing these configuration options, you can in most cases shift the results so they match the needs of your site. Given the problem with Bob's blog posts, you adjust the settings so that all Blog content gets an extra boost.
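To make the idea of a query-time boost more concrete, here is a minimal sketch in Python of what a content-type boost might look like at the level of the Solr request. It relies on the `bq` (boost query) parameter of Solr's DisMax query parser. The field name `type`, the value `blog`, the factor `5.0`, and the helper name `build_boosted_query` are assumptions made for illustration; the parameters your site actually sends depend on your module version and schema.

```python
from urllib.parse import urlencode


def build_boosted_query(keywords, boosts):
    """Build a URL-encoded Solr query string with optional boost queries.

    `boosts` is a list of (field, value, factor) tuples. Each tuple becomes
    one `bq` parameter of the form field:value^factor, which raises the
    score of matching documents without filtering out the others.
    """
    params = [("q", keywords), ("defType", "dismax")]
    for field, value, factor in boosts:
        params.append(("bq", f"{field}:{value}^{factor}"))
    return urlencode(params)


if __name__ == "__main__":
    # Hypothetical example: favor documents whose `type` field is "blog".
    print(build_boosted_query("solr relevancy", [("type", "blog", 5.0)]))
```

Because `bq` only adds to the score, documents of other types still appear in the results; they simply rank lower when they are otherwise equally good matches.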

However, you may still find that the search results are not optimally relevant, especially if you have certain pieces of content that you think should be highlighted, or some pieces of content that you know are of particularly high quality. In this case, you can add a search boost at the node level to make these "important" nodes come to the top. You can write a very small amount of custom code in a site-specific module to get the desired result.
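As a rough illustration of what a node-level boost amounts to, the following Python sketch builds a Lucene-syntax clause that boosts a hand-picked set of nodes. It is not the Drupal code itself, only the query fragment such code might produce. The field name `nid`, the helper name `node_boost_clause`, and the boost factor are assumptions; check your own Solr schema for the field that stores the node ID.

```python
def node_boost_clause(node_ids, factor, field="nid"):
    """Return a Lucene clause boosting the given node IDs, or None if empty.

    The clause has the form field:(id1 OR id2 ...)^factor, which is
    standard Lucene query syntax for boosting a grouped sub-query.
    """
    if not node_ids:
        return None
    # Coerce to int so that only numeric IDs can end up in the query.
    ids = " OR ".join(str(int(n)) for n in node_ids)
    return f"{field}:({ids})^{factor}"


if __name__ == "__main__":
    # Hypothetical node IDs for the "important" pieces of content.
    print(node_boost_clause([12, 34], 100))
```

A clause like this would typically be passed as a boost query rather than added to the main query, so that it changes the ordering of results without changing which documents match.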

In our imagined scenario, Bob is still upset. The developers also write blog posts, and those tend to include more of the keywords, so they are better matches. He is also annoyed that when one of his blog posts does show up, it's one he wrote last month. If you have some way to automatically identify the "important" nodes, and those rules can be formulated as a Lucene query, then you may be able to turn them into code. For example, consider this hook implementation: