This week we start our new series on improving Drupal's search with Apache Solr.
When it comes to integrating Apache Solr with Drupal, there are currently two different modules that can be used: Search API and the Apache Solr module. While both are valid options, for this series we've chosen to focus on the Search API module because, amongst other things, it's generally more flexible. Based on conversations with people in the community who are working on Solr integration, it is also currently seeing more focused development efforts and will likely supersede the Apache Solr module sometime in the future.
This tutorial provides some background information on the Search API module and why we've chosen to use it. We'll look at how the Search API module bridges the gap between Solr and Drupal, and explain some of the commonly used terms we'll encounter in the module's UI and codebase.
By the end of this lesson you should be able to explain the Search API module's terminology, requirements, and position in the Drupal ecosphere, as well as be able to make a good case for why someone should choose the Search API module as a starting point for creating better search tools in Drupal.
Apache Solr is a world-class search application built on top of the Lucene indexer. Before we start trying to integrate Solr with Drupal, let's talk about what Solr is and what makes it so good, as well as how Solr differs from the Drupal core database-backed Search module. This tutorial is a short presentation explaining Solr, Lucene, and things to consider when choosing Solr as a search technology.
Lucene is an open-source search indexer written in Java and governed by the Apache Software Foundation. It is the underlying library that handles storing indexed content, and does so in a way that makes it extremely flexible. By treating each record as a document made up of any number of different fields, Lucene is capable of storing just about anything you throw at it, as long as the resource can be broken up into fields and the textual data can be extracted from those fields. This makes it a good choice for indexing web-based content where you might be dealing with HTML, PDF, XML, Microsoft Word, and all kinds of other document formats.
Solr is an HTTP API for interacting with Lucene that makes it easier to create custom search applications. Like Lucene, it is also open source, written in Java, and governed by the Apache Software Foundation. Solr's extensive use of XML configuration files allows you to modify almost everything about how Solr works without having to write any Java. This makes it a great choice for anyone that's familiar with PHP but doesn't have Java experience.
When compared with Drupal core's Search module, or any MySQL full-text search tool, Solr has some distinct advantages, including:
- Best-in-class stemming and tokenization
- Scalability; it's designed to scale both vertically and horizontally as needed
- Built-in support for facets, geospatial searches, and other advanced query options
In addition to these advantages, using Solr for your search can dramatically improve your Drupal site's performance by eliminating costly full-text queries, which can quickly turn MySQL into a bottleneck for sites with even a modest amount of content.
By the end of this tutorial you should be able to explain the advantages that Solr provides over Drupal core's search module and why it's a good choice for building ultra-fast, and accurate, search applications.
One of the best ways to improve both the speed and relevancy of search results for a Drupal site is to stop using the Drupal core Search module and start using Apache Solr. Solr is a Java-based application that provides an API for interacting with Apache Lucene via HTTP to facilitate the creation of excellent applications for performing full-text content searches, with a special focus on internet-based search applications. The quick pitch for why you should use Solr is that it's insanely fast, especially when compared with Drupal's default Search module, and it can be scaled to handle millions of search queries per second and huge piles of data.
Since Solr is a third party application we need a way to bridge the gap between Solr and Drupal. Really, there are two parts to this puzzle: getting the data out of Drupal and into Solr so it can be processed and indexed, and passing a search query from Drupal to Solr in order to retrieve, and display, search results. For that, we'll use the Search API module, and the Search API Solr module.
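To make the two halves of that puzzle concrete, here is a minimal sketch of the kind of HTTP requests involved. It is a dry run: it only builds and prints the curl commands, so it can be read without a running Solr server. The host, port, core name (`drupal`), and field names (`id`, `title`, `body`) are assumptions for illustration; the Search API Solr module builds its own requests and uses its own schema, so the real traffic will look different.

```shell
#!/bin/sh
# Hypothetical Solr location: adjust host, port, and core name to your setup.
SOLR_URL="http://localhost:8983/solr/drupal"

# Part 1: indexing. A document is a set of named fields sent to Solr's
# update handler. The field names here are made up for the example.
DOC='[{"id":"node/1","title":"Clownfish","body":"A small reef fish."}]'
INDEX_CMD="curl -s -H 'Content-Type: application/json' -d '$DOC' '$SOLR_URL/update?commit=true'"

# Part 2: searching. A query is sent to the select handler and the
# matching documents come back in the response (JSON here).
QUERY_CMD="curl -s '$SOLR_URL/select?q=title:clownfish&wt=json'"

# Print the commands instead of running them.
echo "$INDEX_CMD"
echo "$QUERY_CMD"
```

In practice you rarely write these requests by hand; the point of the Search API and Search API Solr modules is to generate both kinds of request from Drupal's configuration.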
In order to demonstrate a real-world use case we'll pretend that we're the owner of a website that contains a database of fish species. As the database has grown over time we've begun to feel the limits of Drupal's MySQL full-text search and want to improve our search tools. Using Solr will allow for better matches in full-text search, faster searches, and a lot of additional functionality like partial word matches, spell checking, facets, and more.
In this series we'll cover:
- What Apache Solr is and why you should consider using it
- Installing Solr and configuring it to work well with Drupal content
- The contributed Search API module
- The contributed Search API Solr module
- Configuring Drupal to send content to Solr for indexing
- Retrieving search results from Solr and displaying them in Drupal on both a stand-alone page and with the Views module
- Using Solr field boosting to influence result relevancy
- Using the contributed Facet API module with Solr to allow for faceted search results
- Configuring stop words, synonyms, and promoted search results in Solr
This series is for anyone that wants to improve the quality of the search functionality of their Drupal-powered site. There is some system administration required to install Solr, but it's pretty straightforward. Almost everything else is done via configuration in Drupal's, or Solr's, user interface and by editing simple XML configuration files. So, no PHP, or module development experience required. We do however assume that you're already familiar with basic Drupal administration.
Before we can start building a search application we need some sample data that we can index and use for testing, not to mention a site we can use to test this all out on. In this tutorial we'll walk through installing Drupal 7 and importing some sample data.
In order for this to work I built a Drupal 7 site with a content type named Fish, and then imported a whole bunch of descriptions of various fish from Wikipedia. You should be able to use the provided database dump in order to get up and running with a sample Drupal site pre-populated with some sample data.
If you're not planning on following along and building the fish finder application in the Search API and Solr series, or are planning on implementing Solr search on your own site instead, you can probably skip this tutorial. Just note that the rest of the tutorials in the series assume you've got a working Drupal 7 site with some content.
By the end of this tutorial you should have a working Drupal 7 site with sample content running on your localhost for playing with while watching the rest of the series.
Drupalize.Me on Your TV
Blog post: One thing people like to do with online learning is work and watch at the same time. Members have let us know that they want to be able to watch our videos on their TVs while using their computers to work along with the trainer. Luckily Drupalize.Me has several options to make this happen.
Debugging is a discipline that requires patience and a fervent attention to detail. In the oftentimes fast-paced world of software development, when we're faced with deadlines and an ever-growing list of new features to add and bugs to resolve, it can be difficult to slow down and proceed in a meticulous, measured fashion. When it comes to solving difficult problems, though, this fastidious approach is exactly what's required to locate and resolve a problem's root cause.
This week we're wrapping up our Introduction to Domain Access for Drupal 7 series and adding a handy tutorial to our existing free Command Line Basics series.
In addition to different content, you may also want to differentiate your domains in how they look and change some of the basic site settings to make them appear more as separate sites. In this tutorial we'll use the Domain Config and Domain Theme modules (included in the Domain Access package) to let us do just this. We'll change our settings on one of the sites to set the homepage node to the About page we created earlier. Then we'll make the Alumni site look quite different by giving it a new theme. Through this process you will understand things you need to watch out for when configuring Domain Access sites, and how to be appropriately cautious with your settings.
To really make Domain Access work the way you need it, you need to make sure you set up your roles, users, and permissions correctly. We've been setting things up on our site as the administrator, but so far our site is not configured for other people to be involved. In this tutorial we're going to configure the permissions so that we have authenticated users who can create and edit content on particular domains. We'll also have several editors. Two of the editors can only manage content on their particular domains, while one editor will have access to all content across all three domains.
In the process of setting this up we'll review the Domain Access permissions documentation, then dive into configuring them. We will also look at how we can set a default domain for a role, even though we won't need that for this use case. To test things out, we'll create some content as different users and see how the editors can or cannot interact with that content.
To get things moving in this lesson, we are starting off having already created a number of users and added an editor role to the site. We don't walk through this process in the lesson, so if you need a refresher for creating roles and users, you can watch Hands-On: Creating Roles and Users from the Using Drupal series.
Additional resources
Domain Access Permissions (drupal.org handbook)
With the basics of our three domains set up, you're ready to build out your sites. We've covered the main steps to get you started, but you'll find that there are a lot more options available to you as you build. Which additional modules you use will depend heavily on your particular needs. In this tutorial we'll talk about the other modules that are included in the Domain Access package, which we haven't used in this series. We'll also look at a list of other contributed modules that work with Domain Access to extend its feature set even further.
Additional resources
Domain Access modules (drupal.org handbook)
Domain Access related contributed modules (drupal.org handbook)
This is an introduction to the tail command, available on Unix/Linux systems. The command has many applications, but this video concentrates on its basic usage and useful options, as they pertain to Drupal developers.
You'll learn how to take a quick peek at recent log messages from a single log file, how to do the same thing with multiple logs, as well as watching log files in real time! We'll finish up with a practical application, to see why this is useful.
Commands used in this video:
To view the documentation (or manual) for the tail command:
man tail
To show the last 10 lines of the webserver's access log file (10 is tail's default; add -n 20 to show 20):
tail /var/log/apache2/access.log
To show the last 10 lines of the webserver's error log file:
tail /var/log/apache2/error.log
To show the last 10 lines of the webserver's access log file and continue to print new lines added to the file:
tail -f /var/log/apache2/access.log
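If you want to try these options without touching real server logs, the following sketch may help. It creates two throwaway files in a temporary directory (the file names and contents are made up for the example), then shows tail's default line count, the -n option, and what happens when you pass more than one file.

```shell
#!/bin/sh
# Build two throwaway "log" files so the example does not need Apache.
LOG_DIR=$(mktemp -d)
i=1
while [ "$i" -le 25 ]; do
  echo "access line $i" >> "$LOG_DIR/access.log"
  echo "error line $i" >> "$LOG_DIR/error.log"
  i=$((i + 1))
done

# With no options, tail prints the last 10 lines of the file.
DEFAULT_COUNT=$(tail "$LOG_DIR/access.log" | wc -l | tr -d '[:space:]')

# -n sets how many lines to print.
LAST_THREE=$(tail -n 3 "$LOG_DIR/access.log")

# Given several files, tail prints a "==> name <==" header before each one.
BOTH=$(tail -n 2 "$LOG_DIR/access.log" "$LOG_DIR/error.log")

echo "default line count: $DEFAULT_COUNT"
echo "$LAST_THREE"
echo "$BOTH"

# Clean up the temporary files.
rm -r "$LOG_DIR"
```

The -f option is left out here because it keeps running until you interrupt it with Ctrl+C, which makes it better suited to an interactive terminal than to a script.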
Installing and Configuring Dreditor
Blog post: With DrupalCon Los Angeles underway we thought it might be a good time to introduce (or reintroduce) folks to Dreditor (short for "Drupal editor"). Dreditor is a collection of user scripts, which alter browser behavior on specific pages on the drupal.org domain. The features of Dreditor are mostly helpful in the issue queue and during the patch review process.
We have a video, Installing and using Dreditor, if you'd like to follow along, but since it was recorded, installing Dreditor has become even easier. Let's take a look at the changes, and how we can use this powerful tool to make interacting with the issue queue easier.
The redesigned Syfy.com website is a beautiful example of the latest in front-end technology. Lullabot developers Mike Herchel and Chris Albrecht join the Drupalize.Me podcast to explain it all.
Monthly Update
Blog post: April was a busy month for our team! We published tons of new Drupal content and updated lots of site features. Here's a quick recap:
In this tutorial we will get hands-on with Domain Access by getting the module installed. This is a more involved process than a regular module installation, but we just need to make sure we have a few things in place first. We're going to need to make sure we have our domains functioning correctly through Apache, and then add the Domain Access include file to our settings.php. With the configuration and module in place, we'll also verify that it is working properly and take a look at our domain list.
After watching this tutorial you will be able to properly install the Domain Access module, with its additional steps, and then verify that the installation was correct.
Additional resources
Domain Access project (drupal.org)
Domain Access Configuring settings.php (drupal.org handbook)
Installing the Domain Access module (Drush instructions) (drupal.org handbook)
With the main Domain Access site installed, we now need to get our other domain names added to the site and working. In this tutorial we'll review the settings for domains, add the Alumni and News domain names, and then test that all three domains are working properly.
Additional resources
Basic Domain Access module configuration (drupal.org handbook)
One of the biggest reasons to use Domain Access is to control the content for multiple domain names. In this tutorial we'll dive into content on our three sites. We'll start by sharing content across all the domains, and then create domain-specific content. To make managing the content across our domains easier, we'll then enable the included Domain Content module. This will provide us with some nice administrative tools to keep track of things.
Release Day: Hands-on with Domain Access
Blog post: This week we get hands-on with Domain Access as we continue the Introduction to Domain Access for Drupal 7 series. We'll walk through configuring our Apache virtual host (vhost) so all three of our domain names are pointing to the same Drupal site. With that configured properly we can get the Drupal site installed and then install Domain Access. There is a little extra installation step required for Domain Access to do its magic. After we have it up and running we spend some time looking at and understanding the main Domain Access settings, get our other domain names added to the site, and dive into working with content. We're going to learn how to share content across all three domains, as well as be able to restrict some content to only certain domains.