Merge the Values of Two Entity Reference Fields During a Drupal 7 to Drupal 10 Migration

Earlier this month I hosted a Drupal-to-Drupal Migration Workshop, and one of the attendees asked about merging two entity reference fields into a single field during the migration. The scenario: you’ve got a Drupal 7 event content type with one entity reference field that relates sessions, and another that relates sponsors. On the Drupal 10 site you want to consolidate these into a single entity reference field that relates both sessions and sponsors.

I wasn't quite sure how to approach this, but we brainstormed some pseudocode and then came up with the following two solutions.

The problem

The value of each of the two reference fields from the D7 source is an array of arrays. The outer array has one row for each value in the D7 field, and the inner array is an associative array with the key target_id that indicates the ID of the node being referenced.

Example:

$field_related_sessions = [
  0 => ['target_id' => 42],
  1 => ['target_id' => 1337],
];

$field_related_sponsors = [
  0 => ['target_id' => 86],
];

Use the merge plugin from Migrate Plus

You can use the merge plugin provided by the Migrate Plus module to merge these two arrays into a single field.

field_merged:
    -
      plugin: merge
      source:
        - field_related_sessions
        - field_related_sponsors

You’ll end up with something like this:

$field_merged = [
  0 => ['target_id' => 42],
  1 => ['target_id' => 1337],
  2 => ['target_id' => 86],
];
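For intuition, the effect of the merge step is roughly comparable to calling PHP's array_merge() on the two source values. The snippet below is a plain-PHP sketch with made-up sample data, not the plugin's actual code:

```php
<?php

// Hypothetical sample values, mirroring the example above.
$field_related_sessions = [
  ['target_id' => 42],
  ['target_id' => 1337],
];
$field_related_sponsors = [
  ['target_id' => 86],
];

// array_merge() renumbers integer keys, so the result is one flat list
// with the sessions first and the sponsors appended after them.
$field_merged = array_merge($field_related_sessions, $field_related_sponsors);
```

Note that the order of the source fields determines the order of the merged values.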

If the node IDs are the same between Drupal 7 and Drupal 10, this should work fine. However, if you need to use the migration_lookup plugin to ensure the correct Drupal 10 node IDs are used, you’ll need to add a sub_process step like this:

field_merged:
    -
      plugin: merge
      source:
        - field_related_sessions
        - field_related_sponsors
    -
      plugin: sub_process
      process:
        target_id:
          -
            plugin: migration_lookup
            source: target_id
            migration:
              - upgrade_d7_node_complete_session
              - upgrade_d7_node_complete_sponsor
          # Leave this step off if you're not migrating translations.
          -
            plugin: extract
            index:
              - 0

The result will look the same as it did before adding the sub_process step, except that each Drupal 7 source ID will be converted to the corresponding Drupal 10 destination ID.

Note the use of the extract plugin at the end there. This may or may not be necessary depending on your specific migration. If you’re migrating translations, the output from migration_lookup will be a composite key. If you’re not, it’ll be a single value. Learn more about this in Debugging inconsistent return values from the Drupal migration_lookup plugin.
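To make the difference concrete, here is a plain-PHP sketch of the two shapes a lookup result might take. The helper name and the sample values (1234, 'en') are made up for illustration, and the exact contents of a composite key depend on your migration:

```php
<?php

/**
 * Hypothetical helper, not part of Drupal.
 *
 * Returns a single ID whether the lookup result is a scalar or a
 * composite (array) key.
 */
function first_id_from_lookup($lookup_result) {
  if (is_array($lookup_result)) {
    // Same idea as the extract plugin configured with index [0].
    return $lookup_result[0] ?? NULL;
  }
  // Already a single value, so there is nothing to extract.
  return $lookup_result;
}
```

Unlike this helper, the extract step in the migration above expects an array as input, which is why it should be left off when the lookup returns a single value.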

What about duplicate values?

If you know that the fields on the Drupal 7 site are not going to contain duplicate values, this should be all you need to do. However, if it’s possible that both fields could reference the same node, you might want to de-duplicate the data. For example, what if your data looked like this?

$field_related_sessions = [
  0 => ['target_id' => 42],
  1 => ['target_id' => 42],
  2 => ['target_id' => 1337],
];

$field_related_sponsors = [
  // Duplicate!
  0 => ['target_id' => 42],
  1 => ['target_id' => 86],
];

Initially I thought Drupal 10 might just handle that automatically and perform some de-duplication logic before it saved the imported field values. But it doesn’t. It just saves whatever you give it, so we need to update the migration to handle this scenario.

Here’s an example that will de-duplicate the array. It works by first merging the two arrays like before, and then using the array_build plugin to create a new array, keyed by target_id, out of the merged values. That array will look like this:

$_field_related_content_deduped = [
  42 => "42",
  1337 => "1337",
  86 => "86",
];

Since an array can’t have duplicate keys, you’ve done the de-duplication work. But now you need to get that back into the format required for the entity reference field. So we use the callback plugin to call PHP’s array_chunk() function with a chunk size of 1 (the ONE constant), which splits the array into one single-element array per ID. Then we iterate over that with sub_process to create a new array in the required format.

source:
  constants:
    ONE: 1

process:
  _field_related_content_deduped:
    -
      plugin: merge
      source:
        - field_related_sessions
        - field_related_sponsors
    -
      plugin: array_build
      key: target_id
      value: target_id

  field_merged:
    -
      plugin: callback
      callable: array_chunk
      unpack_source: true
      source:
        - '@_field_related_content_deduped'
        - constants/ONE
    -
      plugin: sub_process
      process:
        target_id:
          -
            plugin: array_pop
            source:
              - '0'
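If it helps to see the same data flow outside of YAML, the pipeline above can be approximated in plain PHP. This is only a sketch under assumptions: array_column() stands in for the array_build step and array_map() for the sub_process step, and the sample data is made up.

```php
<?php

// Hypothetical merged value containing a duplicate reference to node 42.
$merged = [
  ['target_id' => 42],
  ['target_id' => 1337],
  ['target_id' => 42],
  ['target_id' => 86],
];

// Comparable to array_build with key and value both set to target_id:
// duplicate IDs collapse because they land on the same array key.
$deduped = array_column($merged, 'target_id', 'target_id');

// Comparable to the callback step: split into chunks of one element each.
// array_chunk() does not preserve keys by default, so every chunk is [0 => ID].
$chunks = array_chunk($deduped, 1);

// Comparable to sub_process with array_pop: wrap each ID back up in an
// associative array with a target_id key.
$field_merged = array_map(
  fn(array $chunk) => ['target_id' => array_pop($chunk)],
  $chunks
);
```

The first occurrence of each ID determines its position, so the original ordering of the references is kept.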

Or, write a custom process plugin

Alternatively you could write a custom process plugin. Something like the following should do the trick:

<?php

namespace Drupal\my_custom_module\Plugin\migrate\process;

use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\Row;

/**
 * A process plugin to de-dupe an array.
 *
 * @MigrateProcessPlugin(
 *   id = "my_custom_deduplication_filter"
 * )
 */
class MyCustomDeduplicationFilter extends ProcessPluginBase {

  /**
   * {@inheritdoc}
   */
  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    // Assumption: each item is an entity reference value keyed by target_id.
    $keyField = 'target_id';
    $seen = [];
    $dedupedArray = [];

    foreach ($value as $item) {
      $key = $item[$keyField];

      // Keep only the first item seen for each referenced ID.
      if (!isset($seen[$key])) {
        $seen[$key] = TRUE;
        $dedupedArray[] = $item;
      }
    }

    return $dedupedArray;
  }

}
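If you want to sanity-check the de-duplication loop without running a migration, the same logic can be pulled into a standalone function. The function name and sample data below are hypothetical and only meant as a quick check:

```php
<?php

/**
 * Hypothetical standalone version of the loop in transform().
 */
function dedupe_by_key(array $items, string $key_field = 'target_id'): array {
  $seen = [];
  $deduped = [];
  foreach ($items as $item) {
    $key = $item[$key_field];
    // Keep only the first item seen for each key.
    if (!isset($seen[$key])) {
      $seen[$key] = TRUE;
      $deduped[] = $item;
    }
  }
  return $deduped;
}

// Made-up merged value with node 42 referenced three times.
$result = dedupe_by_key([
  ['target_id' => 42],
  ['target_id' => 42],
  ['target_id' => 1337],
  ['target_id' => 42],
  ['target_id' => 86],
]);
```

Unlike the array_build approach, this keeps each item intact, so any extra keys on the items would survive the de-duplication.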

Conclusion

It might not be the most elegant solution, but we got something that works and learned a bunch along the way. Getting here required a lot of use of the --migrate-debug flag from the Migrate Devel module, as well as exploring the available process plugins that we might be able to use.

How would you go about solving this? Got any suggestions for how to improve it?

Keep going!

You can learn more in our Process Plugins tutorial and our Learn to Migrate to Drupal guide.
