Drupal Migrate API: Importing Content from CSV and Legacy Databases

The Migrate API is the canonical way to move data into Drupal 11 — whether that means a one-off CSV import, a recurring feed from a legacy database, or a full Drupal 7 upgrade. It implements an ETL (Extract–Transform–Load) pipeline via plugins, and every step is swappable. This article walks through two real scenarios: importing nodes from a CSV file and pulling records from a legacy MySQL database.

Core Concepts

Every migration is a YAML configuration entity composed of three sections:

  • source — a Source plugin that reads rows from somewhere
  • process — one or more Process plugins that map and transform fields
  • destination — a Destination plugin that writes Drupal entities

Migrations are tracked in a map table so re-running the same migration only processes new or changed rows (incremental migration via highwater marks).
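The map table is named migrate_map_<migration id>, so inspecting it directly is a quick way to see what has been imported. A small illustrative query, assuming the import_articles migration defined later in this article:

-- Source IDs, destination node IDs, and per-row status for one migration
SELECT sourceid1, destid1, source_row_status
FROM migrate_map_import_articles;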

Required Modules and Composer Packages

# Core modules (enable with Drush)
drush en migrate migrate_tools migrate_plus --yes

# CSV source support
composer require drupal/migrate_source_csv

# Remote file handling (provides the file_import process plugin used below)
composer require drupal/migrate_file

# Enable the contrib modules once required
drush en migrate_source_csv migrate_file --yes

For database migrations the source support is built into core (the SqlBase source plugin class). The migrate_tools module adds Drush commands (drush migrate:import, drush migrate:status, etc.).

Scenario 1: Importing Articles from a CSV File

CSV Structure

id,title,body,tags,image_url,published_date
1,"Getting started with Drupal","Drupal is a CMS...","drupal,cms","https://example.com/img1.jpg","2024-01-15"
2,"PHP 8.4 Features","Property hooks are...","php,programming","https://example.com/img2.jpg","2024-02-20"

Module Scaffold

Create a custom module web/modules/custom/my_migrations/:

my_migrations/
├── my_migrations.info.yml
├── my_migrations.module       (optional; can stay empty)
└── config/
    └── install/
        └── migrate_plus.migration.import_articles.yml
# my_migrations.info.yml
name: My Migrations
type: module
core_version_requirement: ^11
package: Custom
dependencies:
  - drupal:migrate
  - migrate_plus:migrate_plus
  - migrate_source_csv:migrate_source_csv

Migration YAML

# config/install/migrate_plus.migration.import_articles.yml
id: import_articles
label: 'Import articles from CSV'
status: true

migration_tags:
  - content
  - csv

source:
  plugin: csv
  path: 'public://import/articles.csv'   # Or an absolute path
  ids:
    - id
  delimiter: ','
  enclosure: '"'
  # migrate_source_csv 3.x: row 0 is the header
  header_offset: 0
  fields:
    - name: id
      label: 'Unique ID'
    - name: title
      label: 'Article title'
    - name: body
      label: 'Body text'
    - name: tags
      label: 'Comma-separated tags'
    - name: image_url
      label: 'Image URL'
    - name: published_date
      label: 'Published date (YYYY-MM-DD)'

process:
  # Map title directly
  title: title

  # Map body with a format
  'body/value': body
  'body/format':
    plugin: default_value
    default_value: basic_html

  # Convert date string to timestamp
  created:
    plugin: format_date
    from_format: 'Y-m-d'
    to_format: U
    source: published_date

  # Set node type
  type:
    plugin: default_value
    default_value: article

  # Publish status
  status:
    plugin: default_value
    default_value: 1

  # Handle tags: split CSV string into array, then look up or create terms
  field_tags:
    - plugin: explode
      source: tags
      delimiter: ','
    - plugin: entity_generate
      entity_type: taxonomy_term
      bundle: tags
      value_key: name
      bundle_key: vid

  # Download the remote image and create a file entity
  # (the file_import plugin is provided by the contrib migrate_file module)
  'field_image/target_id':
    plugin: file_import
    source: image_url
    destination: 'public://article-images'
    uid: 1
  'field_image/alt': title

destination:
  plugin: 'entity:node'
  default_bundle: article

migration_dependencies: {}

Key Process Plugins Used

  • default_value: set a hardcoded value regardless of source
  • format_date: convert date strings between formats or to Unix timestamps
  • explode: split a delimited string into an array
  • entity_generate: look up or create referenced entities (terms, users)
  • file_import: download a remote file and create a Drupal file entity
  • callback: run any PHP callable for custom transformations
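
The callback plugin deserves a quick illustration, since it is the simplest escape hatch. For example, trimming stray whitespace from titles before they are saved:

title:
  - plugin: callback
    callable: trim
    source: title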

Placing the CSV File

# Copy CSV to Drupal's public files
mkdir -p web/sites/default/files/import
cp articles.csv web/sites/default/files/import/

Running the Migration

# Enable your module (imports config entities automatically)
drush en my_migrations --yes

# Check status
drush migrate:status import_articles

# Run the migration
drush migrate:import import_articles

# Force re-processing of rows already tracked in the map table;
# incremental imports via highwater_property need no extra flag
drush migrate:import import_articles --update

# Rollback all imported content
drush migrate:rollback import_articles
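
Two more migrate_tools commands are worth knowing when a run goes wrong:

# Show row-level errors recorded during import
drush migrate:messages import_articles

# Clear a stuck "Importing" status after an interrupted run
drush migrate:reset-status import_articles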

Scenario 2: Migrating from a Legacy MySQL Database

Suppose you have a legacy CMS with a posts table:

CREATE TABLE posts (
  post_id   INT PRIMARY KEY AUTO_INCREMENT,
  title     VARCHAR(255),
  content   LONGTEXT,
  author_id INT,
  status    TINYINT DEFAULT 1,
  created   INT  -- Unix timestamp
);

Database Credentials in settings.php

// web/sites/default/settings.php — add a secondary database connection
$databases['legacy']['default'] = [
  'driver'   => 'mysql',
  'host'     => '127.0.0.1',
  'port'     => '3306',
  'database' => 'legacy_cms',
  'username' => 'migrate_ro',
  'password' => 'secret',
  'prefix'   => '',
  // Do not set 'namespace' — Drupal 10+ auto-detects it from 'driver'.
  // The old 'Drupal\\Core\\Database\\Driver\\mysql' namespace is deprecated.
];
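
Before writing any YAML, it is worth confirming the secondary connection actually resolves. A one-off check (adjust the table name to your schema):

drush php:eval "echo \Drupal\Core\Database\Database::getConnection('default', 'legacy')->query('SELECT COUNT(*) FROM posts')->fetchField();"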

Migration YAML for Database Source

# config/install/migrate_plus.migration.import_legacy_posts.yml
id: import_legacy_posts
label: 'Import posts from legacy MySQL database'
status: true

source:
  plugin: sql
  # Use the secondary DB key from settings.php
  key: legacy
  target: default
  query: >
    SELECT p.post_id,
           p.title,
           p.content,
           p.status,
           p.created,
           u.email AS author_email
    FROM   posts p
    LEFT JOIN users u ON u.user_id = p.author_id
    WHERE  p.status IN (0, 1)

  # Unique ID column(s) from the source
  ids:
    post_id:
      type: integer

  # Incremental migration: only process rows newer than the last import
  highwater_property:
    name: created
    alias: p

process:
  title: title

  'body/value': content
  'body/format':
    plugin: default_value
    default_value: full_html

  status: status

  created: created
  changed: created

  type:
    plugin: default_value
    default_value: article

  # Map legacy author email to a Drupal user ID, falling back to user 1
  # when no account matches
  uid:
    - plugin: entity_lookup
      source: author_email
      entity_type: user
      value_key: mail
    - plugin: default_value
      default_value: 1

destination:
  plugin: 'entity:node'
  default_bundle: article

migration_dependencies: {}
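
One caveat: core's SqlBase is an abstract base class, so if your Migrate version exposes no generic sql source plugin id, the same query can be registered as a thin custom source plugin instead. A minimal sketch (the plugin id legacy_posts is our own name, not a core one):

<?php
// src/Plugin/migrate/source/LegacyPosts.php

namespace Drupal\my_migrations\Plugin\migrate\source;

use Drupal\migrate\Plugin\migrate\source\SqlBase;

/**
 * Reads posts from the legacy database.
 *
 * @MigrateSource(
 *   id = "legacy_posts",
 *   source_module = "my_migrations"
 * )
 */
class LegacyPosts extends SqlBase {

  public function query() {
    // Same query as the YAML version above, built with the query builder.
    $query = $this->select('posts', 'p')
      ->fields('p', ['post_id', 'title', 'content', 'status', 'created'])
      ->condition('p.status', [0, 1], 'IN');
    $query->leftJoin('users', 'u', 'u.user_id = p.author_id');
    $query->addField('u', 'email', 'author_email');
    return $query;
  }

  public function fields() {
    return [
      'post_id'      => 'Legacy post ID',
      'title'        => 'Post title',
      'content'      => 'Body HTML',
      'author_email' => 'Author email',
    ];
  }

  public function getIds() {
    return ['post_id' => ['type' => 'integer', 'alias' => 'p']];
  }

}

The migration YAML then sets plugin: legacy_posts and keeps key: legacy; SqlBase reads key and target from the source configuration to pick the database connection.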

Writing a Custom Source Plugin

When built-in source plugins do not fit (e.g., a REST API), create a custom one:

<?php
// web/modules/custom/my_migrations/src/Plugin/migrate/source/JsonApiSource.php

namespace Drupal\my_migrations\Plugin\migrate\source;

use Drupal\migrate\Plugin\migrate\source\SourcePluginBase;
use Drupal\migrate\Plugin\MigrationInterface;
use GuzzleHttp\ClientInterface;
use Drupal\Core\Plugin\ContainerFactoryPluginInterface;
use Symfony\Component\DependencyInjection\ContainerInterface;

/**
 * Source plugin to pull records from a JSON API endpoint.
 *
 * @MigrateSource(
 *   id = "json_api_source",
 *   source_module = "my_migrations"
 * )
 */
class JsonApiSource extends SourcePluginBase implements ContainerFactoryPluginInterface {

  protected ClientInterface $httpClient;

  public function __construct(
    array $configuration,
    string $plugin_id,
    mixed $plugin_definition,
    MigrationInterface $migration,
    ClientInterface $http_client,
  ) {
    parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
    $this->httpClient = $http_client;
  }

  public static function create(ContainerInterface $container, array $configuration, $plugin_id, $plugin_definition, ?MigrationInterface $migration = NULL): static {
    return new static(
      $configuration,
      $plugin_id,
      $plugin_definition,
      $migration,
      $container->get('http_client'),
    );
  }

  public function getIds(): array {
    return ['id' => ['type' => 'integer']];
  }

  public function fields(): array {
    return [
      'id'    => 'Record ID',
      'title' => 'Title',
      'body'  => 'Body',
    ];
  }

  public function __toString(): string {
    return $this->configuration['url'] ?? 'json_api_source';
  }

  protected function initializeIterator(): \Iterator {
    $url      = $this->configuration['url'];
    $response = $this->httpClient->get($url, ['headers' => ['Accept' => 'application/json']]);
    $data     = json_decode((string) $response->getBody(), true);

    foreach ($data['data'] as $record) {
      yield $record;
    }
  }
}
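
Wiring the plugin into a migration is then just a matter of source configuration (the URL below is a placeholder):

source:
  plugin: json_api_source
  url: 'https://example.com/api/posts'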

Custom Process Plugin

Transformations that built-in plugins cannot handle belong in a custom process plugin:

<?php
// src/Plugin/migrate/process/CleanHtml.php

namespace Drupal\my_migrations\Plugin\migrate\process;

use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\Row;

/**
 * Strip dangerous HTML from legacy content.
 *
 * @MigrateProcessPlugin(
 *   id = "clean_html"
 * )
 */
class CleanHtml extends ProcessPluginBase {

  public function transform(mixed $value, MigrateExecutableInterface $migrate_executable, Row $row, string $destination_property): mixed {
    if (!is_string($value)) {
      return $value;
    }
    // Strip scripts and on* attributes
    // Strip <script> blocks entirely, then inline on* event handlers.
    $clean = preg_replace('@<script\b[^>]*>.*?</script>@si', '', $value);
    $clean = preg_replace('/\son\w+\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)/i', '', $clean);
    return $clean;
  }

}
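
Once the plugin exists, it drops into any process pipeline like the built-in ones. For instance, sanitising the legacy body before a text format is assigned:

'body/value':
  - plugin: clean_html
    source: content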