
ApacheSolr

Acquia Search Available for Drupal 7

On Monday we made a series of module and service releases to bring Acquia Search to Drupal 7:

  • The Apache Solr Search Integration project version 7.x-1.0-beta4 includes a number of new features, API improvements, and fixes, including changes to the Solr schema.
  • Acquia Network Connector project version 7.x-1.1 now includes the Acquia Search module which integrates the Drupal 7.x version of Apache Solr Search Integration with the Acquia Network
  • The Acquia Network was updated to auto-detect your module version and use the correct set of Solr configuration files when you first enable Acquia Search
  • The Acquia Netowrk also now offers a settings page where you can manually select from a list and your selection will be propagated to the Solr servers.

This release of Apache Solr Search Integration sets the stage for the virtual code sprint starting this Friday, April 8. The sprint will encompass topics including integration with Facet API, integration with Views, UI improvements, and expanded test coverage. Sign up to participate in the Sprints group. If you look at the CHANGELOG.txt you'll see that we worked on at least 58 different issues for this release. If you are running Apache Solr yourself, you will need to update your configuration files and re-index your content.

This screen shot shows the administrative UI with Acquia Search enabled:

Virtual Code Sprint for Apache Solr Search Integration

Would you like to help get the Apache Solr Search Integration module to a stable and full-featured release for Drupal 7? We are planning a virtual code sprint for Friday, April 8 through April 13. That spans a weekend, with the expectation that some sprinters will be able to participate more actively during the weekend, and others during work days. The sprint will encompass topics including integration with Facet API, integration with Views, UI improvements, and expanded test coverage.

Apache Solr module Chicago Slides and Roadmap

At Drupalcon Chicago, Chris Pliakas and I presented Attain Apache Solr Coding Chops to an enthusiastic audience. While some of the content was similar to past presentations, we updated it to highlight more of the changes in the Drupal 7.x-1.x version, and Chris added new recipes for building customized search forms and results.

Delivering the "Right" Search Results

The Apache Solr search server that powers Acquia Search has many powerful features. One of the less appreciated ones is the ability to specify at query time that documents matching certain criteria should get an extra "boost" in their relevancy score. This means that they appear higher in the search results.
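At the level of the Solr request, a query-time boost of this kind can be expressed with the DisMax `bq` (boost query) parameter. The following is only a minimal sketch of how such request parameters might be assembled, not the module's own code; the field name `type`, the value `blog`, and the boost factor are assumptions chosen for illustration.

```python
from urllib.parse import urlencode, parse_qs


def build_boosted_query(keywords, boosts):
    """Return a Solr query string with one bq clause per boost rule.

    `boosts` is a list of (lucene_query, factor) pairs. The field names
    used in those queries are hypothetical and depend on the schema.
    """
    params = [("q", keywords), ("defType", "dismax")]
    for lucene_query, factor in boosts:
        # "type:blog^2.0" asks Solr to add extra score to matching documents.
        params.append(("bq", f"{lucene_query}^{factor}"))
    return urlencode(params)


# Hypothetical rule: give documents of type "blog" an extra boost.
query_string = build_boosted_query("drupal search", [("type:blog", 2.0)])
print(parse_qs(query_string)["bq"])
```

Documents that do not match the boost query are still returned; they simply receive no additional score from it.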

Imagine that you are maintaining a site and you have recently added Acquia Search. Your boss, Bob, is not pleased, however. He says "I thought you told me this new search would do a better job of finding the most relevant results - but when I try it the ones I expect to see come up first are not there." After you protest that the results are good matches to the keywords, further discussion reveals that Bob expects his blog posts to be the most relevant matches!

In the Apache Solr settings you can use the "Content bias settings" tab and "Search fields" tab to adjust the boost (see screen shot below). The boost can be set based on a range of properties including content types and node properties, as well as for cases where a keyword matches a certain node field or taxonomy vocabulary. By changing these configuration options, in most cases you can shift the results so they match the needs of your site. Given the problem with Bob's blog posts, you adjust the settings so that all Blog content gets an extra boost.

However, you may still find that the search results are not optimally relevant, especially if you have certain pieces of content that you think should be highlighted, or some pieces of content that you know are of particularly high quality. In this case, you can add a search boost at the node level to make these "important" nodes come to the top. You can write a very small amount of custom code in a site-specific module to get the desired result.

In our imagined scenario, Bob is still upset because the developers also write blog posts, and those tend to include more of the keywords and so are better matches. He's also annoyed that when one of his blog posts does show up, it's one he wrote last month. If you have some way to automatically identify the "important" nodes, you may be able to transform those rules into code, provided the rules can be formulated as a Lucene query. For example, like this hook implementation:

Facet queries? Making custom Solr facets for fun and profit.

It sounded like a really simple request: "Is it easy to add a search filter for 'My posts'?". In other words, add a search result facet for posts by the current (logged in) user through the Apache Solr Search Integration module APIs?

But then the wheels start turning - we want not just one blind link, but a real facet link that tells us how many results we'll get. Also, if we are filtering by 'My posts' then we probably have an equal use case for the opposite filter 'Posts not by me'. So we really need a facet block with two links and facet counts.
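In terms of the raw Solr request, a two-link block like this maps onto two `facet.query` parameters, whose counts come back under `facet_counts.facet_queries` in the response. The sketch below illustrates that idea only and is not the module's API; the field name `is_uid` and the function names are assumptions.

```python
from urllib.parse import urlencode


def my_posts_facet_queries(uid, uid_field="is_uid"):
    """Return (label, facet query) pairs for the two links in the block.

    `uid_field` is a hypothetical name for the indexed author ID field.
    """
    return [
        ("My posts", f"{uid_field}:{uid}"),
        # Match everything, then exclude the current user's documents.
        ("Posts not by me", f"(*:* -{uid_field}:{uid})"),
    ]


def request_params(keywords, uid):
    """Build the query string, with one facet.query per link."""
    params = [("q", keywords), ("facet", "true")]
    params += [("facet.query", fq) for _, fq in my_posts_facet_queries(uid)]
    return urlencode(params)


def facet_links(solr_response, uid):
    """Pair each label with the count Solr returned for its facet query."""
    counts = solr_response["facet_counts"]["facet_queries"]
    return [(label, fq, counts.get(fq, 0))
            for label, fq in my_posts_facet_queries(uid)]
```

Each returned tuple carries what a facet link needs: the label, the query to apply as a filter when the link is clicked, and the number of results it would produce.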

Drupal 7 Apache Solr Search Mastery

It is day two at Drupalcon Copenhagen, and Robert Douglass and I presented this afternoon on Apache Solr Search Mastery. While the concepts in this talk apply to the Drupal 6 versions, all the code examples are taken from the Drupal 7 port of the Apache Solr module.

Acquia Search release features

We have marked the one-year anniversary of our hosted search service by rolling out a significant update with new features and some fixes. This was released Wednesday night (June 30).

Advanced Apache Solr Example: IP-based Access

In the run-up to our talk "Apache Solr Search Mastery" at Drupalcon San Francisco, we decided that we would not have time to really cover all the advanced topics in the session. So we're going to put up a couple of blog posts beforehand to invite some discussion and encourage people to dig into the code ahead of time. Then we can take questions at the end of the session or during a BoF.

This first post describes the elements of a module that implements a customized IP-address-based scheme for access control on Solr searches. It's a simplified version of the access controls that some universities or companies use on a library website, for example to show journal articles purchased under license only to students or employees who are on-site, as the license requires. The attached module demonstrates how such a scheme for controlling which nodes appear in search results can be implemented. The code there should be contrasted with the code in the apachesolr_nodeaccess module.
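The general idea can be sketched independently of the attached module: decide from the client's IP address whether the request is on-site, and if it is not, add a Solr filter query (`fq`) that excludes licensed documents. The code below is a rough illustration under stated assumptions, not the module's implementation; the network range, the field name `bs_licensed`, and the function name are all hypothetical.

```python
import ipaddress

# Hypothetical on-site range (a reserved documentation network).
ON_SITE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def access_filter(client_ip, licensed_field="bs_licensed"):
    """Return an fq value that hides licensed documents, or None if on-site.

    `licensed_field` is an assumed boolean field that would have to be
    set on each document at index time.
    """
    addr = ipaddress.ip_address(client_ip)
    if any(addr in network for network in ON_SITE_NETWORKS):
        return None  # On-site visitors may see every result.
    # Off-site: match everything except documents flagged as licensed.
    return f"(*:* -{licensed_field}:true)"


for ip in ("192.0.2.17", "203.0.113.5"):
    print(ip, access_filter(ip))
```

Because the restriction is applied as a filter query rather than inside the keyword query, it narrows the result set without changing relevancy scores.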

Use Apache Solr to search in files

Drupal's file handling capabilities keep getting better. Beyond the core upload module, the filefield module for CCK has enabled us to build sites with all sorts of files: documents, images, music, videos, and so forth. Searching within these documents, however, has never been a common feature on Drupal sites. Some solutions have existed, particularly for extracting text from PDFs and common word processing documents. With Apache Solr, the attachments module, and an extension library called Tika, things can be much better. With Tika you can extract text not only from Microsoft Office, Open Office, and PDF documents; you can also get text and metadata from images, songs, Flash movies, and zipped archives. Searching this text is done as part of the normal Apache Solr driven site search.
