Importing Data With Migrate and Drupal 7

This page is archived

We're keeping this page up as a courtesy to folks who may need to refer to old instructions. We don't plan to update this page.

Alternate resources

This series is focused on using the Migrate module to take data that exists in various different sources and import into a Drupal 7 website. The Migrate module provides an extremely flexible and robust framework for accessing data from various sources, and importing or migrating that data to Drupal. With built in support for creating core Drupal data types likes nodes, users, and taxonomy terms, the Migrate module is one of the best solutions available for importing content into Drupal.

This series kicks off with Joe Shindelar explaining the basic components that make up a data migration, and the terminology and code that is specific to the Migrate module. Then continues with a series of lessons that take you from installing the Migrate module to writing and running your own custom data migration.

Throughout the series Joe teaches us how to run a data migration using both the Migrate module's UI and drush, and some of the plusses and minuses of both methods. Joe also talks about the various different sources, or types of data, from which the Migrate module can read data and how to map the unique fields in a row of source data to their corresponding Drupal content types and fields.

By migrating from a single source into two different Drupal content types we'll also have the opportunity to learn about creating relationships during a migration and mapping the resulting information to an entity reference field. During the course of writing a custom migration Joe will show us how and where we can add code to perform additional runtime data munging during our import process. We'll learn about importing data into multi-value fields, and even providing defaults for fields that don't have information. Then look at some of the tools the migrate module provides for collaborating with team members in order to create a successful migration path.

Finally we'll wrap up the series looking at a couple of different techniques for debugging our migrations and dealing with pesky source material that just doesn't want to be imported.

Because this series is focused primarily on writing custom data migrations and since the Migrate module itself requires at least some amount of code to be written to perform a migration it is suggested that students be familiar with PHP and basic Object Oriented Programming techniques. And although not required to run a migration Joe uses the drush command line tool extensively in this series. If you need a refresher on using drush take a moment to watch our drush series.

Tutorials in this course
More information

This series is focused on using the Migrate module to import data that exists in various different sources into a Drupal 7 website. The Migrate module provides an extremely flexible and robust framework for accessing data from various sources and importing or migrating that data to Drupal. With built in support for creating core Drupal data types likes nodes, users, and taxonomy terms, the Migrate module is one of the best solutions available for importing content into Drupal.

This series kicks off with Joe Shindelar explaining the basic components that make up a data migration, and the terminology and code that is specific to the Migrate module. Then continues with a series of lessons that take you from installing the Migrate module to writing and running your own custom data migration.

Throughout the series Joe teaches us how to run a data migration using both the Migrate module's UI and drush, and some of the plusses and minuses of both methods. Joe also talks about the various different sources, or types of data, from which the Migrate module can read data and how to map the unique fields in a row of source data to their corresponding Drupal content types and fields.

By migrating from a single source into two different Drupal content types we'll also have the opportunity to learn about creating relationships during a migration and mapping the resulting information to an entity reference field. During the course of writing a custom migration Joe will show us how and where we can add code to perform additional runtime data munging during our import process. We'll learn about importing data into multi-value fields, and even providing defaults for fields that don't have information. Then we'll look at some of the tools the migrate module provides for collaborating with team members in order to create a successful migration path.

Finally we'll wrap up the series by looking at a couple of different techniques for debugging our migrations and dealing with pesky source material that just doesn't want to be imported.

Because this series is focused primarily on writing custom data migrations, and since the Migrate module itself requires at least some amount of code to be written to perform a migration, it is suggested that students be familiar with PHP and basic Object Oriented Programming techniques. Although not required to run a migration, Joe uses the drush command line tool extensively in this series. If you need a refresher on using drush take a moment to watch our drush series.

Additional resources

Migrate module — Drupal.org

More information

This lesson includes a short presentation that explains the basics terminology and architecture of the migrate module and the components that make up a custom data migration. We'll talk about the Extract / Transform / Load process and how it relates to data migrations, the types of data sources that the migration module can read from, and a little bit about how the code in both the migrate module and our own custom migrations will be organized.

Additional resources

Migrate module architecture documentation

More information

In this lesson we'll cover downloading and installing the Migrate module (version 7.x-2.6) and ensuring that our local environment is ready to be able to run migrations via both the UI and drush. Once that's setup we'll take a high level look at the migrate module's UI and drush commands to familiarize ourselves with the tools that we'll be using throughout the rest of the series. This will also help formalize some of concepts introduced in the previous lesson.

Additional resources

Migrate module project page
migrate-7.x-2.6-rc1 download
Migrate module documentation
Introduction to Drush series
Drupalize.Me Migrate module series code on GitHub

More information

In this lesson we'll take a more in-depth look at the migrate module's UI with a focus on being able to identify and execute custom migrations. For now we'll work with the provided example migrations just so that we have something to work with. Throughout the lesson we'll learn how to run a migration to import it's data into Drupal, rollback a migration that was previously run in order to set a clean slate, and other ways we can interact with a migration via the UI. Then we'll discuss some of the challenges inherit in running migrations via the UI and Drupal's Batch API and how to identify them.

Additional resources

More information

In this lesson we'll take a look at running migrations via drush rather than via the Migrate module's UI. We'll take a look at the commands provided by the Migrate module and talk about what they do. Then we'll practice running, rolling back, and otherwise interacting with migrations via drush commands. Throughout the lesson we'll learn about some of the functionality you get from running migrations via drush that are not provided by the UI, like the ability to specify a single record to migrate with the --idlist flag. Finally we'll learn about why in most cases drush is a better tool for running large data migrations because of the limitations imposed on the UI. Pay close attention to this lesson since throughout the remainder of this series we'll be running all of our migrations via drush.

Additional resources

Drush project
Introduction to Drush series

More information

The Migrate module itself contains some excellent examples of data migrations implementing the various APIs provided by the module and serves as the canonical documentation for how to write a migration. In this lesson we'll take a look at the beer and wine import examples provided with the migrate module as a way to familiarize ourselves with the concepts discussed earlier and to be able to see the code that makes up a basic migration before attempting to write our own. In practice these examples serve as a great starting point and can often times be copy/pasted and adjusted for your own needs.

Additional resources

Migrate module

More information

Before we can write our own custom migration we need to construct the site that we're going to import data into and of course we need some source data to import. In this lesson we'll obtain some source data to work with and configure our Drupal site by installing a couple contributed modules, and creating the content types and fields necessary for our information architecture. During this process we'll also be taking a look at the baseball player and team data that we'll be importing and familiarizing ourselves a bit more with the tables and columns in our source database.

Note: The Lahman database structure has changed since this video was recorded, and the latest files provided by Lahman don't match what is used in the video. When following along with these examples you'll either need to use the 2012 source data, or make a few adjustments to field/table names throughout the source code as you follow along so that they match the current structure of the Lahman data.

Additional resources

More information

Running your own migration requires a bit of setup and boilerplate code, Drupal needs to know where to find the source data, and the Migrate module needs to be provided with some basic information about our custom code. In this lesson we'll look at setting up Drupal to be able to connect to multiple databases, and then create the skeleton of a simple module which will house our custom migration code and an implementation of hook_migrate_api(). Finally we'll create a base class from which we can begin writing our own custom migrations, and talk about why this is a good way to start organizing our migration.

More information

In this lesson we're going to write our first custom data migration and start importing some of the player data in to Drupal. The primary concepts covered in this lesson are the creation of a migration class following the pattern necessary for Migrate module to be able to discover our code and then dealing with defining source and destination objects so that the Migrate module will know where to read data from and where to write data to. Finally we'll add a simple single field mapping where we map the player's name to the node.title field allowing us to run the migration for the first time and import some real data.

Additional resources

More information

In this lesson we'll continue to add field mappings to the basic migration class created in the previous lesson. We'll see how to add more information about the available source fields. We'll map more of the player source data fields to their equivalent destination fields and learn about some of the many ways that fields can be mapped. Finally we'll also cover mapping sub-fields which allows us to import data for things like the text format of a node's body field or the alt column in an image field. Information that's contained within a meta field.

Additional resources

Migration class documentation

More information

In this lesson we're going to finish mapping the fields for the player migration and learn about how to deal with source data that requires some additional massaging before being saved to the destination. We'll learn about the use of field mapping callback functions and the migration's prepareRow method as possible spots to perform additional logic during a migration. Then we'll use these techniques to combine our player first and last name fields together for the node title field, deal with our birth and death date fields by concatenating the three source columns together into a single date string, and finally add some additional information to the notes field during import that will allow us to track imported records in the future.

 

Note: This lesson was recorded using the 7.x-2.6 version of the date module, however the 7.x-2.7 version is now out which contains some changes to the module's integration with the migrate module. The biggest change being that the date_migrate module is no longer required and has been deprecated. You can read more about the changes here: https://drupal.org/node/2034231

Additional resources

More information

Although not required when writing your own migration, the Migrate module provides ways for us to decorate our migrations with additional information making it easier to keep track of who is working on what, what needs to get done, and related issues. We've already seen some of this in previous lessons with with the Do Not Migrate option for fields and ability to provide field mapping descriptions. In this lesson we'll take a look at how we can make use of the additional tools to do things like; Show the name, contact info, and roles of individuals working on the migration, pose questions to other team members via the migration UI, and link individual field mappings to related tickets in our bug tracking software. Making it easier for team members who don't want to be involved with the code to help move our migration along.

Additional resources

Field Mappings documentation

More information

In this lesson we're going to take a look at creating relationships between two Drupal nodes during a migration. In our case we've got player and team nodes, and each player node has an entity reference to a team node which we need to populate during our migration. In order for this to work we need to ensure that the team node has already been created so that we know the unique node.nid to use in the entity reference field for the player.

To accomplish this we're going to write a migration for team data and ensure that it is run prior to our player migration being run. Then we're going to make use of the mapping between source and destination rows that the Migrate module is tracking for teams so that during a player migration we can lookup the corresponding team node's nid and make use of it.

More information

This lesson is a short one but it covers an important topic, multi-value fields. Almost any field in Drupal can be configured to support more than one value being entered for a single field. Our teams entityreference field is a good example of this, a player could have played for one or more team over the course of their career. This lesson will look at two different ways to map multiple values for a single field.

First we'll look at doing it in a callback method where we perform an additional query and then use the values returned by that query. And second we'll look at using the field mapping's separator method to take a column in a source row that has multiple values separated by a comma and import them as individual field values.

More information

The infamous causality dilemma of the chicken and the egg examines which of the two came first constantly battling with the fact that you need one to produce the other. It's a vicious circle. In this lesson we're going to explore this dilemma in the context of data migrations. Imagine a scenario where you've got an article node type that has a reference field for similar articles which you need to populate with the node ID of the similar articles. During a migration when the article is being imported the article that is being referenced may or may not exist already. If it doesn't exist already how do we know what ID we need to put into the reference field?

One option would be to solve this problem using multiple passes. A first pass that goes through and creates all the articles, and a second that comes back through and updates the similar articles field. Though what happens if the similar articles field is required? You wouldn't be able to save the article without a value in that field the first time around? So you see how this quickly becomes another example of the chicken or the egg problem?

Lucky for us the migrate module has a solution to this called stub migrations. A process that allows creating a stub or a shell for the referenced but not yet created article so that we can use it's unique ID, then when that article is encountered in the migration it will update the stub rather than create a new article.

Additional resources

Migrate stub documentation

More information

So far all the data we've been migrating has come from a MySQL database but Migrate supports a number of other data sources which we can access by using a different source migration class. In this lesson we'll look at the source migration classes that are available with the migrate module and talk about what each one could be used for. Then we'll implement a migration that imports data from a CSV file since that's another common way of receiving source data.

For your convenience, we've included a copy of the sample data in the companion files. This is a duplicate of the data from lesson Set Up Migrate Demo Site and Source Data. If you've been following the lessons in order, you should not need to download the sample data set again.

Additional resources

Migrate Source Class documentation

MigrateSourceCSV Class documentation

Migrate Extras module

More information

Up to this point we've focused focused on creating nodes as the result of our migrations. The Migrate module however supports a number of different destinations that we can use when importing data. In this lesson we'll take a look at the destination classes that the Migrate module provides for us and talk about what each one is used for and where to find more information and examples of using them. Then we'll implement a migration that imports data as vocabulary terms using the MigrateDestinationTerm class.

Additional resources

Migrate Destination Classes documentation

More information

It's almost unheard of to write a data import that just works on the first try. Our examples have all been written using data that is known to be in good shape, and to be honest we've avoided even trying to import some of the data because it ended up being problematic and we wanted to focus on working with the Migrate module and not debugging the problems in your source data. In the real world though, you're going to end up with problematic rows in your source data and you'll need to get them resolved.

In this lesson we'll run the complete player migration and end up with a couple of rows that fail to import because of an oversight in our code. We can use the migrate UI to get a sense of what is failing and why. Then we'll use a combination of options available to the drush migrate command and some strategically placed print_r's to debug and resolve the problem rows. Finally, we'll use a trick to get the Migrate module to re-import all the problem rows but not the already imported rows.