In this tutorial we'll walk through downloading and installing the Search API module, the Search API Solr module, and their dependencies. Then we'll look at using the Search API Solr configuration files with our Solr server. These configuration files are specially crafted to help with indexing data contained in a Drupal site and allow Solr to have a better understanding of Drupal's entities, fields, and the data that they contain. For example, they map a Drupal Field API body field on a page node type to the appropriate field type in Solr.
The gist of this tutorial is locating the Solr configuration files in the search_api_solr/solr-conf/ directory, talking a little bit about what each one does, and then demonstrating how to copy those files into the configuration for your Solr server so that Solr will start using them.
After looking at the various configuration files, and then placing them into our Solr instance, we'll connect Drupal to our Solr server by creating a new Search API server configuration within Drupal's UI. This will allow us to confirm that our Apache Solr server, and Drupal, will be able to talk to one another.
By the end of this tutorial you should be able to configure Solr to work with Drupal, connect the two, and verify that the connection is working.
Additional resources
In order for Solr to provide search results we need to first send our Drupal content to the Solr server so that it can be indexed. In this tutorial we'll look at connecting the Search API module with the Solr server and creating an index that maps content in Drupal to data types in Solr. We'll also look at how the various configuration options affect the Solr index.
After creating a Search API index configuration we'll look at running the indexer, essentially queuing all the content on our site for indexing, and then telling Drupal to send the documents on our site to the Solr server for indexing. This can be done either via the UI or with Drush. You can choose to index content as it's created, or for sites with higher rates of new content you can send it to the indexer in periodic batches. Whichever you choose, making sure that you've got a system in place for periodically sending items queued for indexing to Solr is a critical step.
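The idea of sending queued items to the indexer in periodic batches can be sketched in a few lines of Python. This is only an illustration of the queue-and-batch pattern, not how Search API or Drush implement it; the `index_batch` function, the `send` callback, and the batch size are all hypothetical names chosen for the example.

```python
from collections import deque


def index_batch(queue, send, batch_size=50):
    """Remove up to batch_size items from the queue and hand them to send().

    `send` is a hypothetical stand-in for whatever delivers documents to the
    search server. Returns the number of items processed in this run.
    """
    batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
    if batch:
        send(batch)
    return len(batch)


# Usage: 120 queued item IDs, processed 50 at a time until the queue is empty.
queue = deque(range(120))
sent_batches = []
while index_batch(queue, sent_batches.append, batch_size=50):
    pass
```

Each run drains at most one batch, so a periodic job (cron, for example) can call it repeatedly without trying to index everything in one request.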
By the end of this tutorial you should be able to send content from Drupal to Solr for indexing, and verify that the content is showing up in the Solr server's index.
The Search API module by itself doesn't provide a UI for submitting a search query, or a page for displaying results. Instead, it exposes an API that other modules can use to provide those features. This makes it super flexible, but it also means we've got some extra work to do in order to allow someone to actually perform a search and see the results.
In this tutorial we'll look at using the Search API Pages module to create a simple search page with a form at the top and a list of results ordered by relevancy. Search API Pages is the quickest and easiest way to replace the Drupal core search module's functionality with a form that uses Solr for a search backend instead of MySQL.
When creating a new page with the Search API Pages module we can choose the view mode that we would like to use for displaying results. It works very nicely with Drupal's built-in view modes, as well as contributed modules like Display Suite, in order to allow for a high level of customization of view modes, and thus of the displayed results.
You can also configure the query type to use, choosing from one of: multiple terms, single term, or direct query. For integration with Solr you'll likely want to choose direct query, and allow Solr to handle the query parsing since it has a lot of advanced options that go far beyond what Search API handles on its own. However, we'll look at the different query type configurations, and demonstrate things we can do with direct query searches and the powerful Solr query syntax that we can't do with the other modes.
Finally, we'll look at the block that Search API Pages provides, and use it to replace the search form on the home page of our site with a form that points to our new Search API Pages search results page.
By the end of this tutorial you should be able to expose a page on your site that will allow your visitors to perform a search using the Solr index and have the results displayed in Drupal.
There are a couple of configuration options available when configuring a Search API index that we haven't looked at yet: adding additional fields, and using boost values to increase the relevance of a keyword when found in a specific field.
Solr allows you to index any number of additional fields, so we'll add a species and genus field to our index. This is one of the reasons using Search API to interface with Solr is so great. Through its use of the Entity API, the Search API module has a deep understanding of all the content types on your site and the fields that are attached to them, without you having to write any code, or do anything other than configure things in the UI.
One of the benefits of creating your own search index is that you know your data better than anyone, and you know what people are hoping to find in your content. Solr allows you to configure a boosting value that can be used to increase the relevancy of keywords depending on where in the data they're found. For example, when someone searches for a keyword we can probably assume that a match in the page title is worth more relevancy points than a match in the page body. With boosting we can affect the relevancy ranking of results and help our users more quickly find what they are looking for.
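To make the effect of boosting concrete, here is a deliberately simplified Python sketch. It is not Solr's actual scoring algorithm, which weighs many more factors; it only counts keyword occurrences per field and multiplies each count by a boost. The boost values, the `boosted_score` function, and the sample documents are assumptions made up for this example.

```python
# Hypothetical boost values per field; not Search API or Solr defaults.
FIELD_BOOSTS = {"title": 5.0, "body": 1.0}


def boosted_score(document, keyword, boosts=FIELD_BOOSTS):
    """Sum keyword occurrences in each field, weighted by that field's boost."""
    keyword = keyword.lower()
    score = 0.0
    for field, boost in boosts.items():
        words = document.get(field, "").lower().split()
        score += boost * words.count(keyword)
    return score


# Illustrative documents: "a" mentions the keyword twice in the body,
# "b" mentions it once, but in the title.
docs = [
    {"id": "a", "title": "Freshwater tanks",
     "body": "A guppy is a small fish. The guppy is hardy."},
    {"id": "b", "title": "Guppy care", "body": "Feed twice a day."},
]

ranked = sorted(docs, key=lambda d: boosted_score(d, "guppy"), reverse=True)
```

With these numbers the single title match in document "b" outranks the two body matches in document "a", which is the kind of outcome a title boost is meant to produce.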
By the end of this tutorial you should be able to add additional fields to your Solr index so their content is available for searching, as well as assign a relevancy boosting value when keywords are found in specific fields.
The Search API module supports a handful of data alterations and processors: additional operations that can be performed on a document before it's indexed or during the display of search results. While Solr actually handles the majority of these for us already, this tutorial will look at the available options, talk about what each one does, and explain which ones are still relevant when using Solr as a backend.
Looking at data alterations in the Search API module also raises an important point about security. By default, Search API doesn't care about your content's access control settings. In order to prevent people from seeing results for their searches that contain data they shouldn't have access to we need to make sure we account for that in our configuration.
Here's a good list of the currently available data alterations and processors, though it's worth noting that not all of them are available for all search backends. Also, as we'll see, not all of them are recommended when using Solr even if they are available. Solr's tokenizer for example is much more full featured than the Search API tokenizer, so when using Solr as a backend it's best to keep the Search API tokenizer turned off and let Solr do its thing.
By the end of this lesson you should be able to use data alterations and processors to filter out specific content types from your Solr index and to highlight keywords found when displaying search results. You'll also be able to explain why some alterations and processors are better left off so that Solr can handle those tasks directly.
Being able to display search results using the Views module provides a huge amount of flexibility with respect to what is listed, what it looks like, and more. In this tutorial we'll look at using the Search API Views module, included in the Search API project, to create a view that allows users to search our Solr index and display the results as a table, or really, in any other way that Views can display content. We'll also cover some special considerations regarding access control and entity relationships that we need to keep in mind when using Views to display search results.
The biggest difference between creating a view that lists a bunch of nodes, and one that displays search results is that you need to use your Search API index as the base table from which you're building your view. Then, by default, the view only has access to the fields that are in the Solr index. This allows you to build the entire view without having to query the database. Or you can use the Views module's ability to define relationships to other buckets of content to query the database and pull in additional information. There's a huge amount of flexibility.
When building views from the Solr index you can optionally expose one or more filters, essentially creating a form that allows someone to construct a search query. This can be as simple as exposing a keyword text field, or as complex as you would like to get. We'll look at using exposed filters to create a form that users can perform a search with and create a more complete search experience. We'll also look at how you can move those exposed filters into a block that can be displayed on the home page of our site, allowing us to replace the functionality provided by the Drupal core Search module with Views and Solr.
By the end of this tutorial you should be able to create a view that displays search results using the Search API Views module.
Using facets allows users of your search application to further narrow the results returned from a keyword search by selecting one or more attributes of the returned content and saying either show me only these, or show me everything but these. In this tutorial we'll take a look at some examples of faceted searching in practice, and then we'll use the Facet API module to expose facets for our genus and species fields.
One of the most common uses of facets is on e-commerce sites like Zappos.com that have huge collections of products that users can browse through, and narrow down, to focus in on exactly the pair of shoes they are after. In this example facets allow you to do things like narrow the results returned from your initial keyword search to just shoes for men, which are brown, size 10.5, and on sale. You can also see faceting in action any time you perform a search on our site.
We'll use facets to allow users of the fish finder application to limit the results returned to just those of a specific species or genus. In doing so we'll also look at the options available for determining how facets should be displayed, whether or not we should show a facet that has zero documents in our result set, and how to combine multiple facets together into a single query using either AND or OR logic.
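The difference between counting facet values and combining selected facets with AND or OR logic can be sketched in plain Python. This is a conceptual illustration only, not how Facet API or Solr compute facets; the result set, the function names, and the genus and species values are assumptions invented for the example.

```python
from collections import Counter

# Hypothetical result set from a keyword search.
RESULTS = [
    {"id": 1, "genus": "Poecilia", "species": "reticulata"},
    {"id": 2, "genus": "Poecilia", "species": "sphenops"},
    {"id": 3, "genus": "Betta", "species": "splendens"},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(doc[field] for doc in results)


def apply_facets(results, selections, operator="AND"):
    """Filter results by a list of (field, value) selections.

    AND keeps documents matching every selection; OR keeps documents
    matching at least one of them.
    """
    if not selections:
        return list(results)
    combine = all if operator == "AND" else any
    return [
        doc for doc in results
        if combine(doc[field] == value for field, value in selections)
    ]


selected = [("genus", "Poecilia"), ("species", "splendens")]
narrowed_and = apply_facets(RESULTS, selected, operator="AND")
narrowed_or = apply_facets(RESULTS, selected, operator="OR")
```

Here the AND combination returns nothing, because no Poecilia in the sample data has the species "splendens", while the OR combination returns all three documents. A facet value with a count of zero in `facet_counts` output is exactly the case where you decide whether to show or hide it.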
By the end of this tutorial you should be able to use the Facet API module in conjunction with Search API in order to provide facets that your users can use to further narrow and refine their search results.
Depending on the data that is being searched, some shorter general words, like "a", "the", or "is" can adversely affect search result relevancy. Consider the word "the", which in a standard description of a fish in our database could easily appear hundreds of times or more. When a search is performed, part of the algorithm that calculates the relevancy of any document in the index is to count the number of times a word appears in the text being searched. The more often it appears, the more relevant the document. Words like "the" however often have little to no real bearing on a document's actual relevancy. These words should instead be excluded from the ranking algorithm.
Stop words can also serve another purpose. You can filter out words that are so common in a particular set of data that the system can't handle them in a useful way. For example, consider the word "fish" in our dataset. It's probably very common. With only 500 fish being indexed it's not really going to make much difference, but what if we were indexing five million fish, and each one had the word "fish" in the description even just five times? That's 25 million occurrences of the word "fish". Eventually we might start to hit the upper limit of what Solr can handle. The word "fish" in this case is probably also not very useful in a search query. You're browsing a fish database. Are you really likely to search for the query fish and expect any meaningful results? Likely it would instead return every result. It would be like going to Drupal.org and searching for the word "drupal" and expecting to get something useful. Not going to happen.
Solr has the ability to read in a list of stop words, or words that should be ignored during indexing, so that those words do not clutter your index and are removed from influencing result relevancy. In this tutorial we'll take a look at configuring stop words for Solr.
First, we'll use the Solr web UI to see the most common terms in our index for the body field. Then, based on that list, and the list of common stop words provided by the Solr team, we'll configure our stopwords.txt file. Finally, we'll re-index all the content of our site so that it makes use of the new stop words configuration and re-examine the most common terms noting that our stop words no longer appear in the list.
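The workflow described above (look at the most common terms, configure stop words, then look again) can be mimicked with a small Python sketch. It is only an analogy for what Solr does during analysis: the tokenizer here is a naive regular expression, and the stop word set and sample body text are assumptions invented for the example.

```python
import re
from collections import Counter

# Hypothetical stop words, chosen after inspecting the most common terms.
STOP_WORDS = {"a", "the", "is", "fish"}

# Illustrative body text standing in for indexed content.
BODIES = [
    "The guppy is a small fish. The guppy is a hardy fish.",
    "The betta is a colorful fish.",
]


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_terms(texts, limit=3, stop_words=frozenset()):
    """Return the most common terms, skipping any listed stop words."""
    counts = Counter(
        token
        for text in texts
        for token in tokenize(text)
        if token not in stop_words
    )
    return counts.most_common(limit)


before = top_terms(BODIES, limit=4)
after = top_terms(BODIES, limit=1, stop_words=STOP_WORDS)
```

Before filtering, the top of the list is dominated by "the", "is", "a", and "fish". After filtering, a term that actually distinguishes documents ("guppy") rises to the top.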
By the end of this tutorial you should be able to use the Solr web UI to get a list of the most common terms in your index, and know how to add terms to Solr's stopwords.txt file to prevent them from showing up in your index.
Solr provides the option to configure synonyms for use during both indexing and querying of textual data. A synonym is a word or phrase that means exactly or nearly the same thing as another word or phrase in the same language. For example, shut is a synonym of close. Synonyms, if not accounted for, can cause a dilution of search result relevancy when searching for keywords that have lots of variations in your index.
Consider for example the words, "ipod", "i-pod", and "i pod". It's pretty easy to imagine a scenario in which the content of our site could contain all three variations of the word. When someone searches though they are likely just going to search for one, but expect results for all three. In order to not break those expectations we need to make sure we account for this scenario. Another example from the Drupal world would be the terms "CMI" and "configuration management". Chances are if you search for one you would be happy to see results for the other.
In this tutorial we'll look at using the synonyms.txt file that is part of our Solr configuration in order to account for synonyms in our data. Of course the exact words you use will depend on the content of your site, but we can at least cover how they work and how to configure them.
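As a rough illustration of what synonym handling achieves, the Python sketch below rewrites every known variant of a term to one canonical form, so that a search for any variant matches text containing any other. This is a simplified model and not Solr's synonym filter; the synonym groups, the choice of canonical form, and the function names are assumptions made for the example.

```python
import re

# Hypothetical synonym groups; the first entry is treated as the canonical form.
SYNONYM_GROUPS = [
    ["ipod", "i-pod", "i pod"],
    ["configuration management", "cmi"],
]


def build_synonym_map(groups):
    """Map every variant in a group to that group's canonical form."""
    mapping = {}
    for canonical, *variants in groups:
        for variant in variants:
            mapping[variant] = canonical
    return mapping


def normalize(text, mapping):
    """Lowercase the text and replace each known variant with its canonical form."""
    text = text.lower()
    # Handle longer variants first so multi-word phrases are matched whole.
    for variant in sorted(mapping, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
        text = re.sub(pattern, mapping[variant], text)
    return text


synonyms = build_synonym_map(SYNONYM_GROUPS)
indexed = normalize("My I-Pod and my i pod", synonyms)
query = normalize("CMI explained", synonyms)
```

Applying the same normalization to both the indexed text and the incoming query is what lets the two sides meet, which mirrors the idea of using synonyms during both indexing and querying.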
By the end of this tutorial you should be able to configure Solr to be aware of synonyms in your data in order to improve the quality of your search results.
One of the benefits of building our own search application is that we have ultimate control over the ranking of items. Combined with our superior knowledge of our own content we can use this to ensure that when someone searches for a specific keyword we bubble our best content for that term to the top of the list, regardless of whatever Solr might rank it based on its internal algorithms. This is commonly referred to as promoted, or sponsored, results; the artificial boosting of a particular document to the top of the result list for a specific query.
A similar, but not exactly the same, example would be sponsored results on Google searches, where you can pay to have your page listed at the top of the results for a specific keyword or set of keywords. We are going to be doing all of this except for the part where we let people pay to promote results, though you could certainly build that part on your own if you need that.
Solr uses a configuration file named elevate.xml, in conjunction with a processor, to elevate results at the time a query is performed. We can promote specific documents in our Solr index by figuring out the unique Solr ID for a document and then adding it to the elevate.xml file along with some information about a query, or queries, this document should be promoted for.
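To give a feel for what such a file contains, the following Python sketch generates a minimal elevation document with one query and one promoted document. The `elevate`, `query`, and `doc` element names follow the commonly documented layout of Solr's elevation file, but treat the exact structure as something to verify against your Solr version. The document ID used here is a hypothetical placeholder; real IDs have to be looked up in your own index.

```python
import xml.etree.ElementTree as ET


def build_elevate_xml(promotions):
    """Build elevation XML from a mapping of query text to document IDs.

    The order of IDs in each list is the order in which the documents
    are listed for that query.
    """
    root = ET.Element("elevate")
    for query_text, doc_ids in promotions.items():
        query = ET.SubElement(root, "query", {"text": query_text})
        for doc_id in doc_ids:
            ET.SubElement(query, "doc", {"id": doc_id})
    return ET.tostring(root, encoding="unicode")


# "hypothetical-doc-id-42" is a made-up ID used purely for illustration.
xml_text = build_elevate_xml({"fish": ["hypothetical-doc-id-42"]})
print(xml_text)
```

Generating the file from a simple mapping like this is optional; for a handful of promoted documents, editing the file by hand works just as well.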
In this tutorial we'll learn how to find a Solr document's unique ID, and then configure Solr to use an elevate.xml file that will promote the "How to Use the Fish Finder" page to the top of the results when someone searches for the term "fish". This configuration is all within the Solr application itself and doesn't really rely on Drupal in any way. As such, the material in this tutorial should be applicable to your Solr search applications even if you're not building them with Drupal.
By the end of this lesson you should be able to configure promoted documents in your own Solr-based search application.
By far, the best way to keep up-to-date on which modules are the most useful, and to ensure that those modules do what you need, is to actually get directly involved and help. The Drupal community offers a myriad of ways for everyone, from the person who just installed Drupal for the first time yesterday to the person who has been coding since she was in diapers, to give something back. In this tutorial we'll look at all of these options and explain how you can dive in.
In this tutorial, you will learn how to get a Drupalize.Me tutorial demo site up and running using Pantheon. You'll learn about the various components that make up the Drupalize.Me demo site downloads and how each part should be imported. By the end of this lesson, you'll know how to create a Drupalize.Me demo site on a free Pantheon Dev instance so that you can follow along with the trainer in the Drupalize.Me video tutorial.
In this tutorial, you will learn how to use Acquia Dev Desktop 2 to get a Drupalize.Me tutorial demo site up and running. You will learn how to import a Drupal codebase and database containing a finished site for an individual tutorial on Drupalize.Me. This will enable you to walk through the lesson and see what was accomplished on the site during the lesson.
Drupal's content moderation and workflow tools allow you to configure and support a flexible multistep publication process.
An overview of some of our favorite Drupal documentation resources.
Do you want to know how to contribute translations to Drupal core or other contributed modules and themes? Have you ever wondered how translations are managed in Drupal? It all happens in the community at localize.drupal.org. This tutorial gives a tour of localize.drupal.org and then teaches you how to join translation groups and contribute translated strings back to the Drupal community.
Building one Drupal site is a fair amount of work in and of itself. But what about working with multiple Drupal sites? Sometimes you have a few sites that make sense together, either from a maintenance perspective, or due to an overlap in content or users. There are a number of different ways to approach this in Drupal, and which path you follow varies considerably depending on the exact use case you need to fulfill. In this lesson we'll get a good look at the problem multiple sites can pose, and list out some common use cases. Then we'll take a look at three different broad categories of solutions, with some specific architectural approaches. The rest of this series will walk through managing multiple sites using Drupal core's built-in multisite system.
If you are interested in working with the Domain Access project instead of core multisite, you should look at the Introduction to Domain Access series.
When working with domain names and getting a website to show up in your browser, it can be a little confusing to sort out which bits of the puzzle are where. You need to be able to properly configure the domain name server (DNS) so your browser can match up a domain name with a web server, and then make sure the Apache web server knows which files to direct that incoming domain name to. In this lesson we're going to walk through the process from the browser request to the website files. We'll take a look at the Apache documentation on virtual hosts (or vhosts) and discuss where to find this configuration. Then we'll take a look at some example vhost files to see what's going on in there.
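The last step of that journey, where the web server decides which files to serve for an incoming domain name, can be modeled in a few lines of Python. This is a conceptual sketch of name-based virtual host matching and not Apache's implementation; the hostnames, directory paths, and the `resolve_docroot` function are all assumptions invented for the example.

```python
# Hypothetical vhost table mapping server names to document roots.
VHOSTS = {
    "example.com": "/var/www/example",
    "blog.example.com": "/var/www/blog",
}

# Hypothetical fallback used when no server name matches.
DEFAULT_DOCROOT = "/var/www/default"


def resolve_docroot(host_header, vhosts=VHOSTS, default=DEFAULT_DOCROOT):
    """Pick a document root based on the Host header sent by the browser.

    Any port suffix is dropped and the name is compared case-insensitively.
    """
    hostname = host_header.split(":")[0].strip().lower()
    return vhosts.get(hostname, default)


docroot = resolve_docroot("Blog.Example.com:80")
```

DNS only gets the request to the right server's IP address. The hostname carried in the request is what lets one server host several sites, which is the job the vhost configuration performs.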