Check your version

This video covers a topic in Drupal 7 which may or may not be the version you're using. We're keeping this tutorial online as a courtesy to users of Drupal 7, but we consider it archived.

Better Search Results with Solr and Drupal 7

  • 0:00
    Better Search Results with Solr and Drupal 7 with Joe Shindelar
  • 0:03
    Hi. Welcome to the Drupalize.Me series
  • 0:06
    on using Apache Solr to build super-fast, super-accurate
  • 0:10
    and super-awesome search applications with Drupal 7.
  • 0:13
    Solr is a Java-based application that provides an API
  • 0:17
    for interacting with Apache Lucene via HTTP
  • 0:21
    in order to facilitate the creation of applications
  • 0:23
    for performing full text searches of content,
  • 0:27
    with a special focus on internet-based search applications.
  • 0:30
    The quick pitch for why you should use Solr—
  • 0:33
    it's insanely fast, especially when compared with Drupal's default search module.
  • 0:38
    It can be scaled to handle millions of search queries per second
  • 0:41
    and huge piles of data.
  • 0:43
    Twitter, for example, uses it to do just that.
  • 0:47
    It's better at finding the best results, because it's designed
  • 0:51
    specifically to be a search application,
  • 0:54
    and it works well with faceted searches,
  • 0:56
    and best of all, it's open source, so it fits in well with the Drupal ecosphere
  • 1:01
    and Drupal projects base.
  • 1:03
    In this series, we'll take a more in-depth look
  • 1:06
    at various capabilities of Solr and talk about how they can be best used
  • 1:09
    in conjunction with data stored in a Drupal-based site.
  • 1:13
    Because Solr is an application in and of itself,
  • 1:16
    we're going to need to set up a Solr server with a single Solr core
  • 1:20
    for storing our indexed data.
  • 1:23
    For our production site, we're going to look
  • 1:25
    at how you might host Solr on an Ubuntu-based server
  • 1:28
    as part of a larger web infrastructure and walk through installing things like Java
  • 1:33
    and Tomcat and Solr itself, and generally talk
  • 1:35
    about best practices for setting all those things up.
  • 1:39
    In a lot of cases, you might already be hosting
  • 1:41
    with a provider that has Solr integrated into their network,
  • 1:44
    and you can simply connect your production site
  • 1:46
    to their Solr service. No need to set everything up yourself.
  • 1:50
    And that's a great way to get started with Solr,
  • 1:53
    but you'll probably still want to be able to do some local development to test things out—
  • 1:57
    see what happens when you change configuration, etcetera—
  • 2:00
    and to do so in a safe and sandboxed area.
  • 2:03
    Luckily, running a single instance of Solr for a development environment is super easy.
  • 2:08
    Download a few files and run a couple commands in your terminal,
  • 2:11
    and bam—you've got a server.
  • 2:13
    We'll go over that setup as well, so that you're prepared
  • 2:15
    to do development on your local host.
  • 2:17
    Since Solr is a third-party application, we need a way to bridge the gap
  • 2:22
    between Solr and Drupal, and really, there's two parts to this puzzle.
  • 2:26
    Getting the data out of Drupal and into Solr
  • 2:29
    so that it can be processed and indexed,
  • 2:31
    and passing a search query from Drupal to Solr
  • 2:35
    in order to retrieve and display search results.
  • 2:38
    There's also two key parts to the solution: the Search API module
  • 2:42
    and the Search API Solr module.
  • 2:45
    The Search API module takes on the responsibility
  • 2:48
    of understanding Drupal and the specifics of our site's content,
  • 2:52
    what, of all that data, should be indexed for searching,
  • 2:54
    and any special considerations for handling different field types,
  • 2:57
    like big blobs of text versus integers, versus dates.
  • 3:02
    In this series, we'll look at configuring a new index for our site
  • 3:05
    and walk through each of the various options
  • 3:07
    that are available to us when doing so.
  • 3:10
    We'll also talk about using boost values to influence the relevancy
  • 3:13
    of search results based on the field in which the keywords are located
  • 3:17
    and how to promote or sponsor results for specific keywords,
  • 3:21
    some of which is done in the Search API module's configuration,
  • 3:24
    and some of which we'll do directly in the Solr server's configuration.
  • 3:29
    The Search API module is awesome, because it creates
  • 3:32
    this generic index configuration that describes what to index and when,
  • 3:36
    but it doesn't actually do any of the indexing on its own.
  • 3:38
    Instead, it exposes the ability for other modules
  • 3:42
    to provide one or more service classes that can be used
  • 3:45
    to translate between what Search API knows about your data
  • 3:48
    and your search appliance of choice.
  • 3:50
    The nice thing is, this means we're not necessarily tied to using Solr,
  • 3:54
    and could pretty easily switch to Elasticsearch or Xapian
  • 3:57
    or one of the many other indexers available without having to change much configuration.
  • 4:02
    We'll use the Search API Solr module to configure a connection
  • 4:06
    between our Solr server and Drupal,
  • 4:08
    so that the Search API module can send content to Solr for indexing
  • 4:12
    and retrieve search results from Solr once someone performs a query.
  • 4:17
    Throughout this series, we're going to be using some sample data
  • 4:20
    that I've assembled about fish.
  • 4:22
    The hypothetical scenario is that you're working for the local DNR.
  • 4:25
    You've got a Drupal site that contains a giant database of fish,
  • 4:28
    and you've just been asked to improve the speed and relevancy
  • 4:31
    of search results performed on the site.
  • 4:35
    Your data also contains extensive information
  • 4:38
    that can be used to filter search results, so you would also like
  • 4:41
    to be able to provide faceted search capabilities
  • 4:43
    and some other bonus features as well.
  • 4:46
    By the end of this series, you should be able to install
  • 4:49
    and configure Apache Solr to act as a search indexer for Drupal,
  • 4:52
    use the Search API and Search API Solr modules
  • 4:56
    to connect Drupal to a Solr server and create amazing search experiences,
  • 5:00
    all within Drupal.

One of the best ways to improve both the speed and relevancy of search results for a Drupal site is to stop using the Drupal core Search module and start using Apache Solr. Solr is a Java-based application that provides an HTTP API for interacting with Apache Lucene, which makes it easier to build full-text search applications, with a special focus on internet-based search. The quick pitch for why you should use Solr: it's insanely fast, especially when compared with Drupal's default Search module, and it can be scaled to handle millions of search queries per second and huge piles of data.

Since Solr is a third-party application, we need a way to bridge the gap between Solr and Drupal. Really, there are two parts to this puzzle: getting the data out of Drupal and into Solr so it can be processed and indexed, and passing a search query from Drupal to Solr in order to retrieve and display search results. For that, we'll use the Search API module and the Search API Solr module.
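
To make those two parts concrete, here is a minimal Python sketch of what talking to Solr over HTTP can look like. It is not how the Search API Solr module is implemented; the host, port, core name (`fish`), document fields, and helper function names are all assumptions made for illustration, and the exact endpoint paths can vary between Solr versions. The sketch only builds the requests and does not send them.

```python
import json
from urllib.parse import urlencode

# Hypothetical local Solr core; host, port, and core name are assumptions.
SOLR_CORE_URL = "http://localhost:8983/solr/fish"


def build_update_request(docs):
    """Part one: return (url, body) for sending documents to Solr for indexing."""
    url = f"{SOLR_CORE_URL}/update?{urlencode({'commit': 'true'})}"
    body = json.dumps(docs)  # Solr accepts a JSON array of documents
    return url, body


def build_select_url(keywords, rows=10):
    """Part two: return the URL for a keyword search against the core."""
    params = {"q": keywords, "rows": rows, "wt": "json"}
    return f"{SOLR_CORE_URL}/select?{urlencode(params)}"


# Illustrative document; the field names are made up for this example.
update_url, update_body = build_update_request(
    [{"id": "node/1", "title": "Northern pike"}]
)
select_url = build_select_url("northern pike")

print(update_url)
print(select_url)
```

In a Drupal site, the Search API and Search API Solr modules take care of both steps, so you would not normally write requests like these by hand.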

To demonstrate a real-world use case, we'll pretend that we're the owner of a website that contains a database of fish species. As the database has grown over time, we've begun to feel the limits of Drupal's MySQL full-text search and want to improve our search tools. Using Solr will allow for better matches in full-text search, faster searches, and a lot of additional functionality like partial word matches, spell checking, facets, and more.

In this series we'll cover:

  • What Apache Solr is and why you should consider using it
  • Installing Solr and configuring it to work well with Drupal content
  • The contributed Search API module
  • The contributed Search API Solr module
  • Configuring Drupal to send content to Solr for indexing
  • Retrieving search results from Solr and displaying them in Drupal on both a stand-alone page and with the Views module
  • Using Solr field boosting to influence result relevancy
  • Using the contributed Facet API module with Solr to allow for faceted search results
  • Configuring stop words, synonyms, and promoted search results in Solr
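
As a rough preview of the field boosting and faceting topics in the list above, the following Python sketch builds a Solr query URL that weights matches in one field more heavily than another and asks for facet counts. It assumes Solr's `edismax` query parser with its `qf` parameter and the standard `facet` and `facet.field` parameters; the base URL, field names, weights, and function name are hypothetical, and in practice the Drupal modules generate the equivalent parameters from your configuration.

```python
from urllib.parse import urlencode


def build_boosted_faceted_url(base_url, keywords, boosts, facet_fields):
    """Build a select URL with per-field boosts and facet counts.

    boosts maps a field name to its weight, e.g. {"title": 5, "body": 1}.
    """
    # edismax reads field weights from "qf" as "field^weight" pairs.
    qf = " ".join(f"{field}^{weight}" for field, weight in boosts.items())
    params = [
        ("q", keywords),
        ("defType", "edismax"),
        ("qf", qf),
        ("facet", "true"),
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    params.append(("wt", "json"))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core URL and field names, for illustration only.
url = build_boosted_faceted_url(
    "http://localhost:8983/solr/fish",
    "pike",
    boosts={"title": 5, "body": 1},
    facet_fields=["habitat", "family"],
)
print(url)
```

Here a keyword found in `title` counts five times as much toward relevancy as one found in `body`, and the response would include counts of matching documents per `habitat` and `family` value.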

This series is for anyone who wants to improve the quality of the search functionality of their Drupal-powered site. There is some system administration required to install Solr, but it's pretty straightforward. Almost everything else is done via configuration in Drupal's or Solr's user interface and by editing simple XML configuration files, so no PHP or module development experience is required. We do, however, assume that you're already familiar with basic Drupal administration.