For some time now I've been working on a Drupal site that consists mainly of scraped content from a proprietary, ASP based CMS from the late nineties. The Simple HTML Dom Parser, used from within a drush script, has been invaluable. It made scraping the old site content and importing it as Drupal nodes a relative breeze. (No access to the database used by the CMS, boo!)
Part of setting up the new site is importing users and their content profile nodes from a different Drupal site, that was setup a year or two ago to manage an event.
I had hoped there would be a way for me to export these users and their profile nodes from one Drupal to the other, but though I found modules to export one or the other, I might still end up with profile nodes that were no longer related to their users. Of course, that's pretty useless.
When I remembered I was also supposed to add all these users to a simplenews newsletter, the proverbial light bulb came on.
A spot of theory
I have previously done some bulk simplenews subscriptions by selecting Drupal data from one database table into another. In theory, that works fine across different databases as well. By inserting selected data from one Drupal's users table into another (avoiding uid clashes) I should be able to copy all users, keeping a predictable uid, so I could then do the same for their profile nodes.
First off, export the profile content type to the new Drupal site using the CCK content export module.
The grab the schema of all the tables I would need to transpose data to. These should be the same as the tables the data will come from. In my case the tables were: users, content_type_profile, node and node_revisions. The easiest way to grab the schema is via the mysql command line tool, using this query:
MySQL [source]> SHOW CREATE TABLE users\G
The \G means the data is shown in a single column, not in a column per field, which wouldn't fit across the terminal. Just repeat this for the other tables that need to be transposed and keep the output in a text editor window or something. Basically, this is just so you can easily copy and paste the field names.
At the bottom of each schema, you'll see a line like this one:
) ENGINE=InnoDB AUTO_INCREMENT=6221 DEFAULT CHARSET=utf8
The one from the node_revisions table is the important one. The AUTO_INCREMENT value tells us the next id that will be assigned on the primary auto_increment field in that table. In this case that is the vid field. It's important, because in all likelihood there is already a vid in this table for each value below the AUTO_INCREMENT value.
If you simply copy the node data across, chances some node.nid or node_revisions.vid will clash, causing the copy to break, leaving you with a cleanup to do. They do need to be included in the copy, though, because they are the way the user, her profile node and its CCK fields are linked together. If I simply copied all the other table columns and had the database assign new uids, nids and vids, I'd have a mess of information without its relational links intact.
My table has at most 6220 node revisions, so if I start with 10000 as the nid and vid for the copied nodes, there will never be a clash. Because I already had some users on the new Drupal, I decided to restart all copied user uid from 10000 as well. The really easy way to do this is by adding 10000 to the uid, nid or vid field.
At the end of the day the user.uid, node.nid and node_revisions.vid are just automatically assigned numbers, so if a chunk of them between 6220 and 10000 are not used, that hardly matters.
Run some queries
Time to migrate some data. In the following examples I execute all queries on the target database. All source data is coming from the Drupal tables in the oldevent database. These queries all need to be run by a MySQL user with INSERT privileges on the target Drupal database and at least SELECT privileges on the source Drupal database.
Note: for each of the queries to be run, it is important that you copy and paste the field names in the SELECT sub-query in the correct order.
First, copy the users and give them a predictable non-clashing new user.uid. I exclude user id 1, as it already exists on the new Drupal site. I knew the other user's names would not clash, so I did not need to limit the query any further.
MySQL [source]> INSERT INTO users (SELECT uid+10000, name, pass, mail, mode, sort, threshold, theme, signature, signature_format, created, access, login, status, timezone, language, picture, init, data, timezone_name FROM oldevent.users WHERE oldevent.users.uid > 1);
Now copy all profile node CCK fields, making sure the nid and vid will not clash.
MySQL [source]> INSERT INTO content_type_profile (SELECT vid+10000, nid+10000, field_profile_firstname_value, field_profile_surname_value, field_profile_position_value, field_profile_organisation_value, field_profile_phone_value FROM oldevent.content_type_profile);
Note that iIf you have multi-value fields in your user's profiles, you will also need to perform this for all content_field_profile_* tables.
Next, copy across all profile nodes:
MySQL [source]> INSERT INTO node (SELECT nid+10000, vid+10000, type, language, title, uid+10000, status, created, changed, comment, promote, moderate, sticky, tnid, translate FROM oldevent.node WHERE oldevent.node.type='profile');
And finally, copy all profile node revisions across as well:
MySQL [source]> INSERT INTO node_revisions (SELECT nid+10000, vid+10000, uid+10000, title, body, teaser, log, timestamp, format FROM oldevent.node_revisions WHERE oldevent.node_revisions.nid IN (SELECT nid FROM oldevent.node WHERE oldevent.node.type='profile'));
And that's it. I did not need to copy any of the user's roles across, so at this point I was done. I migrated just under 1500 users and all revisions of their profiles from one Drupal to another in approximately two minutes.
If your users are set up with profile pictures, you'll need to make sure the users.picture field points at the right directory under files on the new Drupal site. You can do a quick search & replace query to update the path using REPLACE().
MySQL [source]> UPDATE users SET picture=REPLACE(picture, 'old-sub-string', 'new-sub-string');
Note that the picture filename will not match the user's uid anymore. If you use a tool like mmv or a shell script to rename the picture files, be sure to also update the field in the database. (Hint: generate a new file path via CONCAT() , using the user.uid field)