Migrating Drupal 7 Files into Drupal 8 / 9 Media entities

An article from ComputerMinds - Building with Drupal in the UK since 2005
27th Jul 2021

Ross Bale

Developer

This article assumes a basic knowledge of the building of custom modules, the Drupal 8 / 9 Migration system, and the processes behind creating customised migrations from a previous version of Drupal.

One of the more common components of any migration from a previous version of Drupal is the need to migrate files. In Drupal 7 there was a core ‘File’ entity type and on pretty much all of our clients' sites we would also have the contributed module File Entity enabled. This extended the core file functionality and gave the ability to add fields to the file entity, have separate file types, integrate with views and more.

As of Drupal 8.4.x and above, Drupal Core has the concept of a ‘Media’ entity type that allows you to upload, manage and reuse files and multimedia assets. This acts as a replacement for the standard core File entity upload field and once you have configured a Media entity reference field on your entity type, it allows you to reference previously uploaded files from a Media library, instead of having to upload a new file every time.

Internally, a Media entity references a file entity and this process happens automatically for uploads made through the site frontend. When you upload a file to a Media entity reference field that has been set to use the 'Media Library' field widget, Drupal will handle importing the file, creating the entry in the file_managed table, creating the Media entity and then referencing the newly created file in the file reference field on the media entity. You can then reference this newly created media entity in any other media entity reference fields that allow referencing the same media entity bundles as this field.

The Core Drupal 8 / 9 migration of files doesn’t automatically allow you to migrate from Files to Media entities. The core file migration plugin for Drupal 7 to 8 / 9 just provides a migration route for Files to Files, and any auto-generated entity to entity migrations (provided by the migrate_drupal core module) will only support a File reference field to a File reference field. However, we want the ability to migrate into Media entities, so read on to see how to accomplish this with a bit of custom code.

The Process

Getting Files migrated as Media entities takes two steps. First, we need a migration to bring the Drupal 7 Files into our Drupal 8 / 9 site as Files, and then we need one or more secondary migrations (one per file/media type) to create Media entities from those now-migrated Files.

Creating the custom module

First of all, we’ll need to create a new custom module (if you haven’t got a custom migration module in place already!). You can name this anything you want - usually, we’d use a project name prefix followed by the name of the module, but for this tutorial we’ll just call the module ‘example_custom_migration’ for simplicity.

example_custom_migration.info.yml

name: Example Custom Migration
description: Includes a custom migration for files.
package: example

type: module
core_version_requirement: ^8.8.0 || ^9.0
dependencies:
  - drupal:migrate

This is a minimal module info.yml file in which we include a dependency on the core migrate module.

Writing the migration source plugin

Next up we’ll need to create our custom migration source plugin. This is the plugin that we will reference in our migration(s) later on. We are going to be extending the core File migration source plugin to give the ability to filter the source files we want to migrate by file type, which we’ll need later on in order to determine which files will be referenced by which Media entity type.

We’ll name the plugin class file FileByType.php and it will live inside of our custom module in the directory src/Plugin/migrate/source. Be sure that the directory structure matches that exactly as Drupal 8 uses the PSR-4 standard for PHP autoloading. For more information about the PSR-4 standard with regards to Drupal 8 / 9 development, read this article.

We’ll start writing the class by extending the core File migration source plugin.

src/Plugin/migrate/source/FileByType.php

<?php

namespace Drupal\example_custom_migration\Plugin\migrate\source;

use Drupal\file\Plugin\migrate\source\d7\File;

/**
 * Drupal 7 file source (optionally filtered by type) from database.
 *
 * @MigrateSource(
 *   id = "d7_file_by_type",
 *   source_module = "file"
 * )
 */
class FileByType extends File {

}

We only need to override two methods provided by the File class, query() and fields().

Our query() method override will handle filtering the files by a configured file type that we can pass into the plugin. The fields() method override will include the names of any extra fields that we will be returning in addition to the fields from the base File class.

The query() function

First of all, let's write the query() function. It starts by calling the parent query() method, which builds a query against the Drupal 7 file_managed table for all files that are not stored under the temporary:// scheme, and which can optionally restrict the results to a single scheme via the scheme configuration parameter. We don’t need to change anything inside the parent method, so we can just call it here rather than duplicating its code.

/**
 * {@inheritdoc}
 */
public function query() {
  $query = parent::query();
  return $query;
}

Next, we’ll add our customisations to support filtering the files by a specific file type. Later on, this will be passed in as a configuration option from our migration.

// Filter by file type, if configured.
if (isset($this->configuration['type'])) {
  $query->condition('f.type', $this->configuration['type']);
}

As we are migrating into Media entities, we may want additional data such as image alt and title text values from the Drupal 7 database. So we’ll support (optionally) returning this data as well.

// Get the alt text, if configured (and not explicitly set to false).
if (!empty($this->configuration['get_alt'])) {
  $alt_alias = $query->addJoin('left', 'field_data_field_file_image_alt_text', 'alt', 'f.fid = %alias.entity_id');
  $query->addField($alt_alias, 'field_file_image_alt_text_value', 'alt');
}

// Get the title text, if configured (and not explicitly set to false).
if (!empty($this->configuration['get_title'])) {
  $title_alias = $query->addJoin('left', 'field_data_field_file_image_title_text', 'title', 'f.fid = %alias.entity_id');
  $query->addField($title_alias, 'field_file_image_title_text_value', 'title');
}

Later on, when we are writing the migration for ‘Image’ media items we will be including two additional configuration lines in the call to our source plugin, get_alt and get_title. When we write the migration for ‘Document’ media items, we don’t want alt text and title text as they aren’t applicable, so we will omit them from the configuration there.

The finished query() function will look like this:

/**
 * {@inheritdoc}
 */
public function query() {
  $query = parent::query();

  // Filter by file type, if configured.
  if (isset($this->configuration['type'])) {
    $query->condition('f.type', $this->configuration['type']);
  }

  // Get the alt text, if configured (and not explicitly set to false).
  if (!empty($this->configuration['get_alt'])) {
    $alt_alias = $query->addJoin('left', 'field_data_field_file_image_alt_text', 'alt', 'f.fid = %alias.entity_id');
    $query->addField($alt_alias, 'field_file_image_alt_text_value', 'alt');
  }

  // Get the title text, if configured (and not explicitly set to false).
  if (!empty($this->configuration['get_title'])) {
    $title_alias = $query->addJoin('left', 'field_data_field_file_image_title_text', 'title', 'f.fid = %alias.entity_id');
    $query->addField($title_alias, 'field_file_image_title_text_value', 'title');
  }

  return $query;
}

If you wanted to add any further logic in this function to return any other field data (your fieldable files may have extra field data that you need!) then you could do so here.

The fields() function

In the fields function, we want to return the additional fields that this source plugin now provides (file type, alt text, title text), in addition to the standard fields the File plugin already provides.

/**
 * {@inheritdoc}
 */
public function fields() {
  $fields = parent::fields();
  $fields['type'] = $this->t('The type of file.');
  $fields['alt'] = $this->t('Alt text of the file (if present)');
  $fields['title'] = $this->t('Title text of the file (if present)');
  return $fields;
}

We first call parent::fields() to get the list of fields from the parent method and then add our additional fields onto the $fields array afterwards. If you were to add further logic to the query() method which returned further additional fields, you would also want to add these here as well.

The finished source plugin

The finished source plugin should now look like this:

<?php

namespace Drupal\example_custom_migration\Plugin\migrate\source;

use Drupal\file\Plugin\migrate\source\d7\File;

/**
 * Drupal 7 file source (optionally filtered by type) from database.
 *
 * @MigrateSource(
 *   id = "d7_file_by_type",
 *   source_module = "file"
 * )
 */
class FileByType extends File {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = parent::query();

    // Filter by file type, if configured.
    if (isset($this->configuration['type'])) {
      $query->condition('f.type', $this->configuration['type']);
    }

    // Get the alt text, if configured (and not explicitly set to false).
    if (!empty($this->configuration['get_alt'])) {
      $alt_alias = $query->addJoin('left', 'field_data_field_file_image_alt_text', 'alt', 'f.fid = %alias.entity_id');
      $query->addField($alt_alias, 'field_file_image_alt_text_value', 'alt');
    }

    // Get the title text, if configured (and not explicitly set to false).
    if (!empty($this->configuration['get_title'])) {
      $title_alias = $query->addJoin('left', 'field_data_field_file_image_title_text', 'title', 'f.fid = %alias.entity_id');
      $query->addField($title_alias, 'field_file_image_title_text_value', 'title');
    }

    return $query;
  }

  /**
   * {@inheritdoc}
   */
  public function fields() {
    $fields = parent::fields();
    $fields['type'] = $this->t('The type of file.');
    $fields['alt'] = $this->t('Alt text of the file (if present)');
    $fields['title'] = $this->t('Title text of the file (if present)');
    return $fields;
  }

}

Writing the Migrations

As you will hopefully know already, since Drupal 8.1.x, migrations are now Plugins instead of Configuration, which means we can just add them into a migrations folder inside of our custom module. You no longer have to include the migrations as config in a config/install folder inside of your module or worry about deleting and re-importing said config to pick up any changes that were made. You simply need to do a cache rebuild and the changes will be picked up automatically.

The First migration - Drupal 7 Files to Drupal 8 / 9 Files

Create the migrations directory inside of your custom module which will contain this migration and the other ones later on. Then create the file that will contain our migration plugin and call it example_custom_migration.upgrade_d7_file.yml

At this point, the innards of our migration plugin definition will be pretty much the same as the core Drupal 7 file migration plugin. We are not yet using the custom migration source plugin that we wrote earlier; the standard d7_file plugin provided by core will suffice to initially migrate all of the files, regardless of their type.

id: upgrade_d7_file
class: Drupal\migrate\Plugin\Migration
migration_tags:
  - 'Drupal 7'
  - Content
migration_group: migrate_drupal_7
label: 'Public files'
source:
  plugin: d7_file
  scheme: public
  constants:
    source_base_path: /path/to/your/drupal7/webroot
process:
  fid: fid
  filename: filename
  source_full_path:
    -
      plugin: concat
      delimiter: /
      source:
        - constants/source_base_path
        - filepath
    -
      plugin: urlencode
  uri:
    plugin: file_copy
    source:
      - '@source_full_path'
      - uri
  filemime: filemime
  status: status
  created: timestamp
  changed: timestamp
  uid: uid
destination:
  plugin: 'entity:file'

As you can see, this is pretty standard stuff for a migration plugin and we aren’t doing anything overly complex yet. This plugin definition will ensure we grab all the files from the Drupal 7 site and migrate them as files onto our Drupal 8 / 9 site.

You may notice the ‘migration_group’ key which isn’t something you would find with core migration plugins. This is an optional key that the migrate_plus module allows you to define in your migration plugin that allows you to group your migrations together and share configuration between migrations. This enables you to then run the migration(s) as a group through both the Migrate UI and through the command-line (Drush). This is a feature we use quite a lot when building Drupal to Drupal migrations! You don’t have to have migrate_plus in order to run these migrations but the grouping stuff is very useful.

Next up, we need to write the migration(s) to turn these Files into Media entities.

The Secondary migration(s) - Drupal 8 / 9 Files to Media

For this bit I’m assuming that you have already created the ‘image’ and ‘document’ media types in the site UI. If you haven’t, be sure to go ahead and create them, and if you decide to name them something different you’ll have to update the relevant places in the migrations.

Images

Now we have the file migration in place, we’ll begin by writing the migration for the Image media entities. Create a new file named example_custom_migration.upgrade_d7_file_to_media_image.yml inside of the migrations folder with the following contents.

id: upgrade_d7_file_to_media_image
class: Drupal\migrate\Plugin\Migration
migration_tags:
  - 'Drupal 7'
  - Content
migration_group: migrate_drupal_7
label: 'Migrate Media image entities'
source:
  plugin: d7_file_by_type
  scheme: public
  type: image
  get_alt: true
  get_title: true
  constants:
    source_base_path: /path/to/your/drupal7/webroot
process:
  field_media_image/target_id:
    -
      plugin: migration_lookup
      migration: upgrade_d7_file
      source: fid
    -
      plugin: skip_on_empty
      method: row
  thumbnail/target_id:
    plugin: migration_lookup
    migration: upgrade_d7_file
    source: fid
  field_media_image/alt: alt
  field_media_image/title: title
  status: status
  created: timestamp
  changed: timestamp
  uid: uid
destination:
  plugin: 'entity:media'
  default_bundle: image
migration_dependencies:
  required:
    - upgrade_d7_file

Let’s break this down section by section.

In the source section, you can see that this time we are using the custom source plugin d7_file_by_type that we wrote earlier. We are also passing in a couple of additional options to the plugin, get_alt and get_title, that will enable the plugin to return the additional alt text and title text data that we want for our image media entities.

In the process section we write to the appropriate fields on the media entity: field_media_image/target_id and thumbnail/target_id to save the references to our file, and field_media_image/alt and field_media_image/title to save the alt and title values. We also save the same standard entity data as in our file migration: status, created, changed, and uid.

The destination plugin we use this time is ‘entity:media’ because we are now creating a media entity, and the default_bundle is set to ‘image’ as that’s the media bundle type we are using for this migration.

Finally, we add a required migration dependency on our upgrade_d7_file migration. This will ensure that this migration will only run once the file migration has fully run and imported all of the available items.

Documents

The document migration will be pretty similar to the image migration that we wrote above.

id: upgrade_d7_file_to_media_document
class: Drupal\migrate\Plugin\Migration
migration_tags:
  - 'Drupal 7'
  - Content
migration_group: migrate_drupal_7
label: 'Migrate Media file entities'
source:
  plugin: d7_file_by_type
  scheme: public
  type: document
  constants:
    source_base_path: /path/to/your/drupal7/webroot
process:
  field_media_file/target_id:
    -
      plugin: migration_lookup
      migration: upgrade_d7_file
      source: fid
    -
      plugin: skip_on_empty
      method: row
  field_media_file/display:
    plugin: default_value
    default_value: 1
  status: status
  created: timestamp
  changed: timestamp
  uid: uid
destination:
  plugin: 'entity:media'
  default_bundle: document
migration_dependencies:
  required:
    - upgrade_d7_file

The source section is the same except this time we pass in a type of document to our source plugin and as we don’t need the alt and title text data that we needed for the image media, we’ll omit the get_alt and get_title configuration lines.

The process section again is similar, except this time we are setting the referenced file inside of field_media_file/target_id and we also set field_media_file/display to 1, which will enable the file to be displayed when viewing content. Again, we omit the alt and title mappings, and the rest of the configuration here is the same set of entity properties as before.

We keep the destination plugin the same (entity:media) but the default_bundle is changed to ‘document’.

Before running the migrations

Before we are able to run our migrations, we need to make sure that your Drupal 8 / 9 site knows how to connect to your Drupal 7 site. If you are already mid-way through writing migrations from Drupal 7 to Drupal 8 / 9 then you will most likely already have this in place, but if not, you need to ensure your settings.php (or settings.local.php - wherever your current site database connection details are present!) has an entry with the connection details to the Drupal 7 site.

e.g. inside of settings.local.php

$databases['drupal_7']['default'] = [
  'database' => 'your_database_name',
  'username' => 'your_database_user_name',
  'password' => 'your_database_user_pass',
  'prefix' => '',
  'host' => 'localhost',
  'port' => '3306',
  'namespace' => 'Drupal\\Core\\Database\\Driver\\mysql',
  'driver' => 'mysql',
];

Note the connection key drupal_7 there, in the $databases definition. We also need to ensure that our migration group knows that we should be using this key if it doesn’t already.

If you have already got a migration_group configuration setup as part of a Drupal -> Drupal site migration (you most likely will if you have migrate_plus enabled and have begun work on a migration) then ensure in the shared_configuration section at the bottom that you have the following:

shared_configuration:
  source:
    key: drupal_7

If you don’t have a migration group config already, then create a new one named migrate_plus.migration_group.migrate_drupal_7.yml

langcode: en
status: true
dependencies: {  }
id: migrate_drupal_7
label: 'Import from Drupal 7'
description: ''
source_type: 'Drupal 7'
module: null
shared_configuration:
  source:
    key: drupal_7

Put it into your site’s configuration directory (e.g. config/sync) and import the config.

If for some reason you aren’t using migrate_plus with migration groups then you’ll need to ensure in the source section of each migration plugin we have written earlier that you have

key: drupal_7

in the plugin definition. e.g. (for the document plugin this would be as follows)

source:
  plugin: d7_file_by_type
  scheme: public
  type: document
  key: drupal_7
  constants:
    source_base_path: /path/to/your/drupal7/webroot

Running the migrations

Nearly there (I promise!). Now you have your migration plugins all correct and in place, it’s time to try them out! First, ensure your module is enabled if it isn’t already (either via the extend menu in the UI or via Drush - drush en example_custom_migration).

Next, you can either try running the migrations through the UI (admin/structure/migrate) or via Drush. For Drush you will need either the migrate_tools module (version 4.x / 5.x) with Drush 9.x / 10.3.x, or Drush 10.4.x+, which includes most of the Drush commands that migrate_tools provided. I’m a Drush man myself, so I’d either run the migrations individually with these commands

  • drush migrate:import upgrade_d7_file
  • drush migrate:import upgrade_d7_file_to_media_image
  • drush migrate:import upgrade_d7_file_to_media_document

or as a group (if you are using migrate_plus for the group support; note that the --group option is provided by migrate_tools) with

  • drush migrate:import --group=migrate_drupal_7

although if you are writing these migrations as part of a larger migration with other migrations in the group, you’ll probably just want to run them individually to be sure they are working correctly.

All being well, after running the migrations you should find all the relevant files have been copied into your new site’s public files directory and you should be able to see all your lovely files as media entities by visiting /admin/content/media. (Great Job!)

Referencing the media in migrations that previously referenced files

Any migrations that you have previously written that referenced files and fids instead of media entities should now be updated. This is a relatively trivial task, as the following examples show. Please note that the fields you are migrating into will need to be recreated as Entity Reference fields, referencing the Media entity type. If you have let Drupal handle migrating your fields from a previous version then it will have created these fields as File entity reference fields.
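To illustrate what "recreated as Entity Reference fields" means in practice, an exported field storage for ‘field_images’ that references Media might look roughly like this. It is only a sketch: it assumes the field lives on nodes and is multi-valued, and a real export from your site will contain a uuid and may differ in its details.

```yaml
# Sketch of field.storage.node.field_images.yml (assumed: node entity type,
# unlimited cardinality). Export your own config rather than copying this.
langcode: en
status: true
dependencies:
  module:
    - media
    - node
id: node.field_images
field_name: field_images
entity_type: node
type: entity_reference
settings:
  # The key difference from a File/Image field: the target is a media entity.
  target_type: media
module: core
locked: false
cardinality: -1
translatable: true
indexes: {  }
persist_with_no_fields: false
custom_storage: false
```

The important line is target_type: media. It is also why alt, title, width and height are no longer mapped on the field in the "After" examples: an entity reference only stores a target_id, and the alt and title text now live on the media entity itself.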

This is an example of what part of your process section may look like before and after, on an entity migration for an image field named ‘field_images’.

Before (File)

field_images:
  plugin: sub_process
  source: field_images
  process:
    target_id: fid
    alt: alt
    title: title
    width: width
    height: height

After (Media)

field_images:
  plugin: sub_process
  source: field_images
  process:
    target_id:
      plugin: migration_lookup
      migration: upgrade_d7_file_to_media_image
      source: fid
      no_stub: true

And here is an example of what part of your process section would look like before and after for a document field named ‘field_my_documents’.

Before (File)

field_my_documents:
  plugin: sub_process
  source: field_my_documents
  process:
    target_id: fid
    display: display

After (Media)

field_my_documents:
  plugin: sub_process
  source: field_my_documents
  process:
    target_id:
      plugin: migration_lookup
      migration: upgrade_d7_file_to_media_document
      source: fid
      no_stub: true

If your file field in Drupal 7 allows both images and documents, then your field in Drupal 8 / 9 should allow the same. You can reference both media migrations in the process as follows:

field_my_documents:
  plugin: sub_process
  source: field_my_documents
  process:
    target_id:
      plugin: migration_lookup
      migration:
        - upgrade_d7_file_to_media_document
        - upgrade_d7_file_to_media_image
      source: fid
      no_stub: true
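For that combined lookup to be useful, the destination field also has to accept both media types. As a rough sketch, the exported field config might look like the following. The ‘article’ content type and the field label are hypothetical, and the media type names assume the ‘image’ and ‘document’ types used earlier in this article.

```yaml
# Sketch of field.field.node.article.field_my_documents.yml.
# 'article' is a hypothetical bundle; substitute your own content type.
langcode: en
status: true
dependencies:
  config:
    - field.storage.node.field_my_documents
    - media.type.document
    - media.type.image
    - node.type.article
id: node.article.field_my_documents
field_name: field_my_documents
entity_type: node
bundle: article
label: Documents
description: ''
required: false
translatable: false
default_value: {  }
default_value_callback: ''
settings:
  handler: 'default:media'
  handler_settings:
    # Both media types must be allowed, matching the two lookups above.
    target_bundles:
      document: document
      image: image
    sort:
      field: _none
    auto_create: false
    auto_create_bundle: ''
field_type: entity_reference
```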

Extending further

We have covered the basics of how to get image and document files migrated, but you can (and we have on lots of clients’ sites) get this running for Video files as well in a similar manner with minimal changes.

You just need to create another file-to-media migration plugin for video files, in addition to the image and document ones already made. Ensure that the type of file passed into the source section of the migration is ‘video’ and that the default_bundle of your entity:media destination in the destination section is ‘video’. Then you should be hot to trot!
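As a sketch of what that could look like, here is a video variant modelled on the document migration. The migration ID and label are made up for this example, and the field name field_media_video_file is an assumption (it is the source field that core's Standard profile gives its ‘Video’ media type), so check the source field on your own video media type before using it.

```yaml
# migrations/example_custom_migration.upgrade_d7_file_to_media_video.yml
# Hypothetical migration, modelled on the document migration in this article.
id: upgrade_d7_file_to_media_video
class: Drupal\migrate\Plugin\Migration
migration_tags:
  - 'Drupal 7'
  - Content
migration_group: migrate_drupal_7
label: 'Migrate Media video entities'
source:
  plugin: d7_file_by_type
  scheme: public
  # Only Drupal 7 files whose file type is 'video' are returned.
  type: video
process:
  # Assumed field name: adjust to the source field of your video media type.
  field_media_video_file/target_id:
    -
      plugin: migration_lookup
      migration: upgrade_d7_file
      source: fid
    -
      plugin: skip_on_empty
      method: row
  status: status
  created: timestamp
  changed: timestamp
  uid: uid
destination:
  plugin: 'entity:media'
  default_bundle: video
migration_dependencies:
  required:
    - upgrade_d7_file
```

Any entity migration that references these files would then look them up against upgrade_d7_file_to_media_video, in the same way as the image and document examples.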

Bonus - How to have a dynamic file source_base_path

You may be wondering, “What if I don’t want to hard code my source_base_path in the source plugin definition of my migrations?”

This is another common requirement we have when writing our own migrations, and luckily the solution isn’t too tricky with a bit of custom code. You may find this useful if you are going to run this migration across different environments (e.g. your dev machine, testing server, production environment when the time comes) and you don’t want to keep having to change your migration plugin files each time.

In the source section of both the file and file -> media migrations, change

source_base_path: /path/to/your/drupal7/webroot

to

source_base_path: /

You don’t strictly have to do this as we are going to override it anyway, but I think having a placeholder path gives a clearer indication that we haven’t explicitly set one. Of course, you could also have a comment above the line explaining that this will be overridden if you so wish.

Migration Event Subscriber

In order to override the source_base_path, we’ll create an Event Subscriber in our custom module. This will enable us to subscribe to the appropriate Migrate events and will allow us to modify the data at the correct point in time. For more detailed information about event subscribers, see this article we wrote way back in 2016 ‘Drupal 8 Event Subscribers - the successor to alter hooks’.

You’ll first want to create a services.yml file directly inside of the custom module as well named example_custom_migration.services.yml. This will let us define our migration subscriber so that Drupal knows about it.

services:
  example_custom_migration.d7_file_migration_subscriber:
    class: Drupal\example_custom_migration\D7FileMigrationSubscriber
    arguments: ['@config.factory']
    tags:
      - { name: event_subscriber }

Next, create this file inside of the src folder in our custom module. You can name it what you want really but for this example, we’ll call it D7FileMigrationSubscriber.php.

<?php

/**
 * @file
 * Contains \Drupal\example_custom_migration\D7FileMigrationSubscriber
 */

namespace Drupal\example_custom_migration;

use Drupal\Core\Config\ConfigFactoryInterface;
use Drupal\migrate\Event\MigrateEvents;
use Drupal\migrate\Event\MigrateImportEvent;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

/**
 * Act on events during a migration.
 */
class D7FileMigrationSubscriber implements EventSubscriberInterface {

  /**
   * @var \Drupal\Core\Config\ConfigFactoryInterface
   */
  protected $configFactory;

  public function __construct(ConfigFactoryInterface $configFactory) {
    $this->configFactory = $configFactory;
  }

  /**
   * {@inheritdoc}
   */
  public static function getSubscribedEvents() {
    $events[MigrateEvents::PRE_IMPORT] = 'onMigrationPreImport';
    return $events;
  }

  /**
   * Act on a migration before it imports.
   */
  public function onMigrationPreImport(MigrateImportEvent $event) {
    $migrate_id = $event->getMigration()->id();

    // Inject the correct source_base_path to the file migrations.
    $file_migrate_ids = [
      'upgrade_d7_file',
      'upgrade_d7_file_to_media_image',
      'upgrade_d7_file_to_media_document'
    ];

    if (in_array($migrate_id, $file_migrate_ids)) {
      $source_config = $event->getMigration()->getSourceConfiguration();
      $source_base_path = $this->configFactory->get('example_custom_migration.settings')->get('files_source_base_path');
      $source_config['constants']['source_base_path'] = $source_base_path;
      $event->getMigration()->set('source', $source_config);
    }
  }
}

There is a lot going on here which I'm not going to get into in this article. (The article I linked to above should help explain it all if you aren’t familiar with concepts such as event subscribers and dependency injection!) But the end result of this code is to swap out the source_base_path in our source plugin configuration for the value of the ‘files_source_base_path’ key in a config object named ‘example_custom_migration.settings’.

Finally, all you need to do next is just pop this line into the bottom of your settings.local.php (which should be different per environment) replacing ‘/path/to/your/drupal7/webroot’ with the path to the Drupal 7 site webroot for the current environment.

$config['example_custom_migration.settings']['files_source_base_path'] = '/path/to/your/drupal7/webroot';

This will then allow you to have a different path set on a local dev build, another team member’s local dev build, testing and production servers as well! Neat huh?

