Depending on the data that is being searched, some shorter general words, like "a", "the", or "is" can adversely effect search result relevancy. Consider the word "the", which in a standard description of a fish in our database could easily appear hundreds of times or more. When a search is performed, part of the algorithm that calculates the relevancy of any document in the index is to count the number of times a word appears in the text being searched. The more often it appears, the more relevant the document. Words like "the" however often have little to no real bearing on a document's actual relevancy. These words should instead be excluded from the ranking algorithm.
Stop words can also serve another purpose. You can filter out words that are so common in a particular set of data that the system can't handle them in a useful way. For example, consider the word "fish" in our dataset. It's probably very common. With only 500 fish being indexed it's not really going to make much difference, but what if we were indexing five million fish, and each one had the word "fish" in the description even just five times? That's 25 million occurrences of the word "fish". Eventually we might start to hit the upper limit of what Solr can handle. The word "fish" in this case is probably also not very useful in a search query. You're browsing a fish database. Are you really likely to search for the query fish and expect any meaningful results? Likely it would instead return every result. It would be like going to Drupal.org and searching for the word "drupal" and expecting to get something useful. Not going to happen.
Solr has the ability to read in a list of stop words, or words that should be ignored during indexing, so that those words do not clutter your index and are removed from influencing result relevancy. In this tutorial we'll take a look at configuring stop words for Solr.
First, we'll use the Solr web UI to see the most common terms in our index for the body field. Then, based on that list, and the list of common stop words provided by the Solr team, we'll configure our stopwords.txt file. Finally, we'll re-index all the content of our site so that it makes use of the new stop words configuration and re-examine the most common terms noting that our stop words no longer appear in the list.
By the end of this tutorial you should be able to use the Solr web UI to get a list of the most common terms in your index, and know how to add terms to Solr's stopwords.txt file to prevent them from showing up in your index.
Additional resources
Solr provides the option to configure synonyms for use during both indexing and querying of textual data. A synonym is a word or phrase that means exactly or nearly the same thing as another word or phrase in the same language. For example, shut is a synonym of close. Synonyms, if not accounted for, can cause a dilution of search result relevancy when searching for a keywords that have lots of variations in your index.
Consider for example the words, "ipod", "i-pod", and "i pod". It's pretty easy to imagine a scenario in which the content of our site could contain all three variations of the word. When someone searches though they are likely just going to search for one, but expect results for all three. In order to not break those expectations we need to make sure we account for this scenario. Another example from the the Drupal world would be the terms "CMI" and "configuration management". Chances are if you search for one you would be happy to see results for the other.
In this tutorial we'll look at using the synonyms.txt file that is part of our Solr configuration in order to account for synonyms in our data. Of course the exact words you use will depend on the content of your site, but we can at least cover how they work and how to configure them.
By the end of this tutorial you should be able to configure Solr to be aware of synonyms in your data in order to improve the quality of your search results.
Additional resources
One of the benefits of building our own search application is that we have ultimate control over the ranking of items. Combined with our superior knowledge of our own content we can use this to ensure that when someone searches for a specific keyword we bubble our best content for that term to the top of the list, regardless of whatever Solr might rank it based on its internal algorithms. This is commonly referred to as promoted, or sponsored, results; the artificial boosting of a particular document to the top of the result list for a specific query.
A similar, but not exactly the same, example would be sponsored results on Google searches, where you can pay to have your page listed at the top of the results for a specific keyword or set of keywords. We are going to be doing all of this except for the part where we let people pay to promote results, though you could certainly build that part on your own if you need that.
Solr uses a configuration file named elevate.xml, in conjunction with a processor, to elevate results at the time a query is performed. We can promote specific documents in our Solr index by figuring out the unique Solr ID for a document and then adding it to the elevate.xml file along with some information about a query, or queries, this document should be promoted for.
In this tutorial we'll learn how to find a Solr document's unique ID, and then configure Solr to use an elevate.xml file that will promote the "How to Use the Fish Finder" page to the top of the results when someone searches for the term "fish". This configuration is all within the Solr application itself and doesn't really rely on Drupal in anyway. As such, the material in this tutorial should be applicable to your Solr search applications even if you're not building them with Drupal.
By the end of this lesson you should be able to configure promoted documents in your own Solr-based search application.
Additional resources
By far, the best way to keep up-to-date on which modules are the most useful, and to ensure that those modules do what you need, is to actually get directly involved and help. The Drupal community offers a myriad of ways for everyone, from the person who just installed Drupal for the first time yesterday to the person who has been coding since she was in diapers, to give something back. In this tutorial we'll look at all of these options and explain how you can dive in.
Additional resources
In this tutorial you will learn how to create a site archive for an existing Drupal site so that you can import it into Pantheon. You'll learn how to create one single archive file that contains all the components we need: database, code base, and files. First, we'll demonstrate how to create an archive using the Backup and Migrate module, using a Drupal site's UI, and then you'll learn how to accomplish the same task using the Drush archive-dump
command. In this tutorial we assume that you either already have Backup and Migrate module installed on your site, or that you have Drush set up and are familiar with the basics of using it.
Additional resources
Lullabot Module Monday: Backup and Migrate video tutorial
Introduction to Drush video series
In this tutorial, you will learn how to get a Drupalize.Me tutorial demo site up and running using Pantheon. You'll learn about the various components that make up the Drupalize.Me demo site downloads and how each part should be imported. By the end of this lesson, you'll know how to create a Drupalize.Me demo site on a free Pantheon Dev instance so that you can follow along with the trainer in the Drupalize.Me video tutorial.
Additional resources
In this tutorial, you will learn how to use Acquia Dev Desktop 2 to get a Drupalize.Me tutorial demo site up and running. You will learn how to import a Drupal codebase and database containing a finished site for an individual tutorial on Drupalize.Me. This will enable you to walk through the lesson and see what was accomplished on the site during the lesson.
Additional resources
What Is Node.js?
FreeThis series is about integrating Node.js with Drupal 7 using the Node.js Integration contributed module on Drupal.org. The Node.js Integration project contains a number of submodules, and a separate Node.js application written in Javascript, that uses the Express, Socket.io, and Request packages.
Node.js is really fantastic for real time communications, something that Drupal is not particularly good at, out-of-the-box. The Drupal Node.js Integration module brings a host of real time capabilities and a client for your site to enable notifications when a variety of events occur, so your users can receive overlay notifications directly in their browser without page reloads!
In this series we'll cover:
- What Is Node.js?
- Install Node.js and the Node.js Drupal Module
- Configure Your Node.js Application
- Node.js Notifications Module
- Node.js Actions in Action!
- Content Update Messages with Node.js Subscribe
- Real-Time Log Viewing with Node.js Watchdog
- Node.js Checker, a Status Tool
- Get Ready for Production with Node.js
This series is a walk-through of how to get Node.js installed in your system, and how to install and configure the Node.js Drupal module, as well as a look at the related submodules. We'll install a different module that's dependent on the Node.js Drupal module. With this, you'll see how dependencies work in the context of the Node.js Drupal module's Node.js application's configuration.
Now, this series is not for the faint of heart. You'll need some Drupal administration knowledge, installing and configuring modules. We shall also be wildly typing things at the command line, so if you have some experience with command line use, that will definitely be helpful!
If you want to take your learning further, look for the self-check questions in the description for each tutorial in this series. These questions are presented to help you make sure you’re understanding the material, and to encourage you to explore how what you've just learned could apply to your own use case.
If you want to integrate Node.js and Drupal 7 so you can provide real time update notifications on your site, as well as a number of other real time communication options, this series is for you!
Additional resources
To use the Drupal Node.js Integration module, you need Node.js installed on your system, so in this tutorial, I will show you how to get this and the other dependencies installed on your system.
In this lesson, you will learn how to:
- Install Node.js on your system
- Test the version of Node.js installed
- Test the version of NPM installed
- Install the Node.js Drupal module
- Install the Node.js project dependencies with NPM
At the time of recording, I installed version v0.12.5 of Node.js. As the note on Node.js Previous Releases states: "Releases 1.x through 3.x were called "io.js" as they were part of the io.js fork. As of Node.js 4.0.0 the former release lines of io.js converged with Node.js 0.12.x into unified Node.js releases." - which explains the rapid version jump to the current 4.x releases.
You should make sure you have the latest version Node.js installed, particularly if there are security releases.
Note: Starting with version 7.x-1.11 of the Drupal module, the Node.js application is no longer bundled with the download. It can be downloaded seperately from GitHub or NPM.
Self-check question: Which version of Node.js do you have installed?
Additional resources
Note: This video is out of date and will be archived. Check this page for instructions: Node.js integration documentation.
To be able to use the Node.js Integration module, you need to configure both a Node.js application, as well as the Drupal Node.js Integration module itself.
In this lesson, you will learn how to:
- Enable the Node.js Config module
- Understand of all of the configuration options
- Generate a config file for our Node.js application
- Edit the generated configuration for a basic working setup
- Start the Node.js application
Self-check question: Did you remember to change the protocols array in your nodejs.config.js?
Additional resources
The Node.js Notifications module is the User Interface (UI) component of the set of Node.js Integration modules. In this video, you'll see the Node.js Integration module's basic functionality.
In this lesson, you will learn how to:
- Enable the Node.js Notifications module
- Explore the configuration options
- Broadcast a notification message
The Node.js Actions module makes use of the Drupal Core module, Trigger, and adds a new Action type that allows for real time notifications to be sent for various Trigger events. Check out this lesson to see it all in action!
In this lesson, I will show you how to:
- Enable the Node.js Actions module
- Explore the various notification types
- Enable notifications for users logging in and out of the site
The Node.js Subscribe module lets you configure your site to allow users to subscribe to real time notifications through your site's UI for particular content types. The Node.js Subscribe module uses the Node.js Notifications module to provide the UI for the notifications, which we covered in Node.js Notifications Module.
In this lesson, I will show you how to:
- Enable Node.js Subscribe
- Set permissions to allow users to subscribe to content
- Subscribe to some content and see Node.js Subscribe in action
Have you ever wanted to watch the "Recent Log Messages" page of Drupal load automatically? This module can be really helpful if you'd like to do just that.
In this lesson, I will show you how to:
- Enable Node.js Watchdog module
- Generate and watch log messages in real time
The Node.js Integration module has a number of other contributed modules that depend on it. Node.js Checker is one such module. We'll take a look at how this is set up with the Node.js Integration module by adding a dependency to our Node.js server configuration, and then see it all in action!
In this lesson, I will show you how to:
- Install the Node.js Checker module
- Enable the extension in the Node.js server configuration
- Configure the Node.js Checker block
Additional resources
Once you're happy with your Node.js setup in a local development environment, there are some further steps to take before running this set up in a production environment.
While not a requirement, I would highly recommend you use an SSL certificate, and configure your Node.js application and the Drupal Node.js module to run in HTTPS. You can read more about SSL certificates at The Linux Documentation Project.
Up until this point, we have been starting our Node.js application manually, from the command line. This is not a good method for a production environment, so we'll use PM2 to manage the running of our application.
In this lesson, I will show you how to:
- Install PM2, a Node.js module to manage your Node applications
- Configure PM2
- Add SSL configuration for the Node.js server
- Start up our Node.js application with PM2