File import improvements for Migrate 2.4

The Migrate module is the leading tool for migrating data from an external application into Drupal. Migrate has been used to bring many world-class sites onto Drupal, including The Economist, Martha Stewart and thousands more. The main theme of the upcoming Migrate 2.4 release is improved file handling on Drupal 7.

Motivation

Media files and attachments are often the trickiest part of a site migration, and the previous approach taken by Migrate did not help as much as it could.

  1. File migration mappings supported a large and obscure array of arguments, without a clear distinction between options and data, or much clarity on which ones were relevant in any given context.
  2. The available options and behavior of the file field handler and the file destination differed significantly, with no code shared between the two (being in two distinct class hierarchies).
  3. The hard-coded “file functions” provided no mechanism for overriding or extending their behavior.
  4. The file field handler implementation in particular was very convoluted, making it difficult to add enhancements or figure out exactly how it would behave in various circumstances.

For Migrate 2.4, now in the release candidate stage, we were determined to significantly improve the developer experience around file migration.

File classes

The key to addressing these issues was to introduce file classes, more-or-less equivalent to the former file functions (which, truth be told, were never real functions). By implementing the primary tasks of file migration - moving (or creating, or linking to) the actual file, and creating the corresponding file entity - in separate classes, we have achieved better isolation of the different implementations, as well as the ability to share them between the field and destination handlers. Each file class is responsible for interpreting the incoming file representation (such as a URI) and, based on its parameters, returning a corresponding file entity which can serve as the result of a file destination import, or a file field within another entity.

File classes are implementations of MigrateFileInterface, and are required to implement two methods - fields() (documenting any options or subfields they support) and processFile() (which takes some value representing a file or file data, and returns the corresponding file entity). The file field handler and the file destination instantiate their configured file class and have it do its work, so each can focus on its specific work - constructing a field array or managing the file entity. There are three file classes implemented directly in Migrate:

MigrateFileUri is the default file class, and does not need to be specified. It accepts a local file path or a remote URI, and copies (if necessary) the file at that location into the Drupal file system.

MigrateFileBlob accepts a variable containing the actual file data (presumably coming from a database blob field) and saves it to a file in the Drupal file system.

MigrateFileFid is something of a degenerate case, and only applicable to file fields - when you create a separate file migration, and need to link a file field in a later migration to one of the previously-migrated files, you simply pass its fid in the mapping to the file field with this class specified to make the link.
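
To make the contract concrete, here is a minimal sketch of a custom file class. It is not code from Migrate: ExampleFileInPlace, its status option and its constructor are hypothetical, and the two-argument processFile($value, $owner) signature is an assumption (the text above only names the two methods). The stand-in interface is declared only when Migrate is not loaded, so the sketch can run on its own.

```php
<?php
// Stand-in for the interface Migrate provides, declared only when Migrate is
// not loaded. The processFile($value, $owner) signature is an assumption.
if (!interface_exists('MigrateFileInterface')) {
  interface MigrateFileInterface {
    public static function fields();
    public function processFile($value, $owner);
  }
}

/**
 * Hypothetical file class: treats the incoming value as a URI that is already
 * in place, and only builds a file object describing it.
 */
class ExampleFileInPlace implements MigrateFileInterface {
  // Option values supplied by whoever instantiates the class (assumed shape).
  protected $options;

  public function __construct(array $options = array()) {
    // Fall back to a permanent file when no status option is given.
    $this->options = $options + array('status' => 1);
  }

  // Document the options this class understands, keyed by name.
  public static function fields() {
    return array(
      'status' => 'Option: 1 for a permanent file, 0 for a temporary one',
    );
  }

  // Turn the source value into a file object. A real implementation would
  // save a file entity and return that instead of a bare object.
  public function processFile($value, $owner) {
    $file = new stdClass();
    $file->uri = $value;
    $file->filename = basename($value);
    $file->uid = $owner;
    $file->status = $this->options['status'];
    return $file;
  }
}

$file_class = new ExampleFileInPlace();
$file = $file_class->processFile('public://legacy/report.pdf', 1);
print $file->filename . "\n"; // prints "report.pdf"
```

The point of the design is that the caller never needs to know how the file was obtained: it hands over a value and gets back something it can treat as a file entity.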

Field documentation

Different options and subfields are applicable to the different file classes. In the original implementation, all of these were mixed together in the static arguments() method, with little help on which ones would work with which “file function”. In the new world, each file class implements its own fields() method to document what options and subfields it supports, and the file field handler and destination class incorporate them into the Migrate UI’s detail pages, alongside the regular fields available for mapping. Thus, once you have selected a particular file class, you only see the options and subfields that are relevant to that file class. In addition, we have documented each option and subfield on drupal.org, and linked directly to the documentation from the field descriptions:

Migrate web UI

Online documentation
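
As a rough illustration of how per-class documentation could be folded into the list of mappable names, the sketch below merges one option common to every file class with whatever the chosen class reports from fields(). This is not Migrate's implementation: ExampleDocumentedFileClass, example_file_subfields() and the description strings are all hypothetical.

```php
<?php
// Hypothetical file class that documents two options, for illustration only.
class ExampleDocumentedFileClass {
  public static function fields() {
    return array(
      'source_dir' => 'Option: directory that relative source paths start from',
      'destination_file' => 'Subfield: path to store the file under',
    );
  }
}

/**
 * Hypothetical helper: list what can be mapped for a file field by combining
 * an option shared by all file classes with those the chosen class documents.
 */
function example_file_subfields($parent_field, $file_class) {
  $fields = array('file_class' => 'Option: file class to use');
  // fields() is static, so no instance is needed to read the documentation.
  $fields += call_user_func(array($file_class, 'fields'));

  $subfields = array();
  foreach ($fields as $name => $description) {
    // Options and subfields are addressed as parent:name in field mappings.
    $subfields[$parent_field . ':' . $name] = $description;
  }
  return $subfields;
}

$subfields = example_file_subfields('field_image', 'ExampleDocumentedFileClass');
foreach ($subfields as $name => $description) {
  print $name . ' - ' . $description . "\n";
}
```

Because each class reports only its own options, choosing a different file class changes the list without any change to the helper.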

Examples

File destinations

To migrate to a file destination (i.e., to create file entities directly from source data), the key points are to map the representation of the file (usually a URI/file path) to the ‘value’ field on the file destination, and to pass the file class as the second argument to the MigrateDestinationFile constructor (optional for MigrateFileUri). The options and subfields supported by the chosen file class can be mapped directly.

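<?php
class FileBlobMigration extends Migration {
  public function __construct() {
    parent::__construct();

    // Source: one row per legacy attachment, including the raw blob data.
    $query = db_select('legacy_file_data', 'f')
      ->fields('f', array('attachmentid', 'attachmentblob', 'filename', 'file_ownerid'));
    $this->source = new MigrateSourceSQL($query);

    // Destination: file entities, built by the MigrateFileBlob file class.
    $this->destination = new MigrateDestinationFile('file', 'MigrateFileBlob');

    // Note: a complete migration also defines $this->map (omitted here).

    $this->addFieldMapping('value', 'attachmentblob');
    $this->addFieldMapping('destination_file', 'filename');
    // Translate legacy owner IDs through the previously run User migration.
    $this->addFieldMapping('uid', 'file_ownerid')
      ->sourceMigration('User');
  }
}
?>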

File fields

Now, if the blobs we migrated above are referenced in a node’s file field, we can easily reference them by using the MigrateFileFid file class:

<?php
$this->addFieldMapping('field_blob_attachment', 'attachmentid')
  ->sourceMigration('FileBlob');
$this->addFieldMapping('field_blob_attachment:file_class')
  ->defaultValue('MigrateFileFid');
?>

As you can see, the relevant options and subfields are expressed in the mapping using the new subfield syntax in Migrate 2.4, following the parent field and a colon.

Let’s consider migrating image files directly through the field mappings, with the file entities being automatically created. In this example, the source files are mounted at /mnt/files and the image_filename source field is a file path relative to that directory. The source database also contains alt and title information for each file. Because the default file_class is MigrateFileUri, we don’t need to specify it:

<?php
$this->addFieldMapping('field_image', 'image_filename');
$this->addFieldMapping('field_image:source_dir')
  ->defaultValue('/mnt/files');
$this->addFieldMapping('field_image:alt', 'image_alt');
$this->addFieldMapping('field_image:title', 'image_title');
?>

Try it!

We believe these changes will substantially improve the developer experience in implementing file migrations, as well as making this support more maintainable in the long run. They are, however, a complete rewrite, and given the myriad of real-world scenarios and complexity of the options and subfields supported, we can’t test it all ourselves - please install the Migrate 2.4 release candidate and try it in your migration project. Any bug reports, or improvement suggestions, are welcome in the Migrate issue queue.

It is important to note that these changes are incompatible with the previous file support - if you have existing file migrations (particularly if you use MigrateFileFieldHandler::arguments, which has been removed), you must change your mappings to use the new techniques.

For more information, please see the documentation at http://drupal.org/node/1540106.

Comments

Posted by BTMash (not verified).

These are some very cool changes :) I'm trying to wrap my head around the whole thing since I got so used to the json way (will just have to give it a try to fully understand it), which is why I have a few questions. Just so I understand the changes, the major implication seems to be that the various pieces (source_dir, alt, title, destination, etc.) are now split up. So if we're now dealing with a multiple-value image field, would we then just be passing through the mappings and then in our prepareRow be generating each of these pieces separately? So we'd have something like (under this assumption, let's say we're passing through a hardcoded value):

<?php

$row->field_image = array('file_1.jpg', 'path/to/file_2.jpg', 'file_3.png');
$row->image_alt = array('Some text', '', 'Text for file 3');
$row->image_title = array('File 1', '', 'File 3');
?>

Based off this, would it still know to look inside /mnt/files from above? Also, with the example above, if we wanted to preserve the destination path, I imagine then that just having
<?php $this->addFieldMapping('field_image:destination_file', 'image_filename'); ?> would be fine?

Posted by mryan.

Yes, that's exactly it. You can make arrays of the associated subfields (alt, title, etc.); you just need to make sure they line up - put in an empty value as you did for file_2, rather than omitting the array value entirely. Note that the field handlers have some smarts - if you pass a scalar rather than an array to the subfields, that value will be used for all files in the main array.

The source_dir applies whether the source field is populated directly by your query or in prepareRow(); everything gets pulled together at the prepare() stage.

Yes, set destination_file to preserve the full path.

Posted by John Brandenburg.

This must have been a change in 2.5, but

$this->addFieldMapping('field_blob_attachment:file_class', 'MigrateFileFid');

Becomes

$this->addFieldMapping('field_blob_attachment:file_class')
->defaultValue('MigrateFileFid');

Posted by mryan.

Finally fixed that in the post, thanks!
