Migrate sites from one Aegir to another

We recently needed to migrate all the sites on one physical server to another. There were more than 200 sites, all hosted with Aegir. The old server was to be decommissioned, so we had to move all of Aegir's data about each site to the new server and import it into a new Aegir master there. We also needed to do this with as little downtime as possible.

In the end we migrated all the sites with about 30 seconds of downtime each. Here's how:

The setup before

For clarity, here's a simplified diagram of the infrastructure before the migration:

Server diagram before migration

You can see we have two servers behind a firewall. The firewall is basically there to translate the public IP address into the internal IP address of the actual web server.

We set up the two web servers to be as identical as possible in terms of the Aegir installation, the names of the platforms hosting the sites, and so on. The process is actually quite flexible, though: for example, we had a dedicated DB server on our old infrastructure, but not on the new one.

We installed the code I'm about to describe on both machines, into ~aegir/.drush/, and ensured that the aegir user on the new machine could SSH into the old machine as the aegir user with its SSH key. We also created a Drush alias for 'old_hostmaster' by creating the file ~aegir/.drush/aliases.drushrc.php and adding the entry:

$aliases['old_hostmaster'] = array(
  'remote-host' => '192.168.1.2',
  'remote-user' => 'aegir',
  'uri' => 'old-hostmaster.computerminds.co.uk',
  'root' => '/var/aegir/hostmaster-6.x-1.5',
);

(You will need to set the details to match your environment.)

Reducing downtime

To reduce the amount of downtime during the server migration, we used the Apache module mod_proxy. This allows the Apache server on the new web server to function as a reverse proxy. Traffic arriving at the new server gets forwarded to the old one, and visitors don't notice any difference.

Setting up 200 or more virtual host files with all the correct details is a pain though, and so naturally we wrote a Drush command to do it for us.

First we need to get a list of sites on an Aegir platform, which we can do like this:

/**
* Lists all sites from a particular platform.
*
* Prints a serialized array of URLs.
*/
function drush_computerminds_migrate_platform_list_sites($platform) {
  $sites = array();

  // Load the platform:
  $platform_node = hosting_context_load($platform);

  // Get all the enabled sites on it:
  $all_sites = hosting_get_sites_by_status($platform_node->nid, HOSTING_SITE_ENABLED);

  foreach ($all_sites as $site) {
    $sites[] = $site->title;
  }
 
  // This is a bit of a fragile way to return the data, but I couldn't seem to
  // get Drush to pass the structured data back properly, so we'll do this,
  // which works.
  drush_print_r(serialize($sites));
}

Because this is a Drush command itself, we can call this on a 'remote' server. So we can run a command from the new server, and go get all the sites we need to migrate from the old server.

/**
* Sets up the current hostmaster ready for migration.
*/
function drush_computerminds_migrate_pre_migrate_setup() {
  $platform = COMPUTERMINDS_PLATFORM_NAME;
  $old_hostmaster = '@' . COMPUTERMINDS_OLD_HOSTMASTER_NAME;
  $platform_context = d('@platform_' . $platform);
  $web_server = d($platform_context->web_server);

  $old_sites = computerminds_migrate_get_all_sites($old_hostmaster, $platform);

  foreach ($old_sites as $site) {
    // Need to ensure that we have a mod_proxy vhost for this site.
    drush_log('Creating a mod proxy vhost for: ' . $site);
    $vhost = new provisionConfig_computerminds_proxy($web_server, array('uri' => $site));
    $vhost->write();
  }

  // Now restart the web server.
  $web_server->service('http')->restart();
}

/**
* Get all sites of a given platform on a given server.
*/
function computerminds_migrate_get_all_sites($target, $platform) {
  $platform = 'platform_' . $platform;
  // Get a list of all the sites on the remote hostmaster.
  $result = drush_backend_invoke_args('@' . ltrim($target, '@') . ' ' . 'platform-list-sites', array($platform), array('root' => NULL, 'uri' => NULL), 'GET', FALSE);
  $sites = unserialize($result['output']);
  if (is_array($sites)) {
    return $sites;
  }
  return array();
}
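
The hand-off between these two commands depends on the serialized array surviving whatever else ends up in the captured output. As a minimal sketch (not code from the module), the decoding step could be made a little more defensive by trimming the output before unserializing it and refusing to create objects. The function name example_decode_site_list() is hypothetical, and the options argument to unserialize() assumes PHP 7 or later:

```php
<?php
/**
 * Hypothetical helper: decode the site list printed by the remote command.
 *
 * Returns a plain list of site URLs, or an empty array if the output could
 * not be decoded.
 */
function example_decode_site_list($output) {
  // The printed output may carry surrounding whitespace; strip it first.
  $payload = trim((string) $output);

  // Refuse to instantiate objects: we only ever expect an array of strings.
  // (The options argument assumes PHP 7 or later.)
  $sites = @unserialize($payload, array('allowed_classes' => FALSE));

  return is_array($sites) ? array_values($sites) : array();
}

// Simulate what the remote command prints: the serialized list plus a newline.
$printed = serialize(array('example.com', 'www.example.org')) . "\n";
$sites = example_decode_site_list($printed);
```

Anything that does not decode to an array is treated as "no sites", which matches the fallback in computerminds_migrate_get_all_sites() above.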

This is a fairly straightforward command in which we go and get a list of sites on a particular platform on the old server, and then create a provisionConfig_computerminds_proxy for each site and write it. The provisionConfig_computerminds_proxy class looks like this:

/**
* Base class for proxied virtual host configuration files.
*/
class provisionConfig_computerminds_proxy extends provisionConfig {
  public $template = 'computerminds_proxy_vhost.tpl.php';
  public $description = 'mod proxy virtual host configuration file';


  function filename() {
    return $this->http_vhostd_path . '/' . $this->data['uri'];
  }

  function process() {
    parent::process();

    $this->data['http_port'] = $this->http_port;
    $this->data['http_proxy_forward'] = COMPUTERMINDS_OLD_SERVER_IP;

    if ($this->aliases && !is_array($this->aliases)) {
      $this->aliases = explode(",", $this->aliases);
    }
    else {
      $this->aliases = array();
    }

    $this->aliases = array_filter($this->aliases, 'trim');

    $uri = $this->data['uri'];
    if (strpos($uri, 'www.') === 0) {
      $this->aliases[] = substr($uri, 4);
    }
    else {
      $this->aliases[] = 'www.' . $uri;
    }
  }
}
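
To make the alias handling in process() easier to follow, here is a standalone sketch of the same idea outside of Aegir: take an optional comma-separated alias list and add the "www." counterpart of the main URI. The function name example_proxy_aliases() is hypothetical. Note one deliberate difference, which is my assumption and not the original behaviour: the class above resets aliases that are already an array to an empty list, whereas this sketch keeps them.

```php
<?php
/**
 * Hypothetical standalone version of the alias logic in process().
 */
function example_proxy_aliases($uri, $aliases = '') {
  // Accept either an array or a comma-separated string of aliases.
  $list = is_array($aliases) ? $aliases : explode(',', (string) $aliases);

  // Trim each entry and drop the empty ones.
  $list = array_filter(array_map('trim', $list), 'strlen');

  // Add the "www." counterpart of the main URI.
  if (strpos($uri, 'www.') === 0) {
    $list[] = substr($uri, 4);
  }
  else {
    $list[] = 'www.' . $uri;
  }

  // Remove duplicates and reindex.
  return array_values(array_unique($list));
}

$bare = example_proxy_aliases('example.com');
$www = example_proxy_aliases('www.example.com');
$extra = example_proxy_aliases('example.com', 'a.example.com, b.example.com');
```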

This class just sets up some basic Aegir config stuff, and requires a simple template, computerminds_proxy_vhost.tpl.php:

<VirtualHost *:<?php print $http_port; ?>>
  ServerName <?php print $this->data['uri']; ?>

  <?php
  if (sizeof($this->aliases)) {
    print "\n ServerAlias " . implode("\n ServerAlias ", $this->aliases) . "\n";
  }

  print " RewriteEngine on\n";
  foreach ($this->aliases as $alias) {
    print " RewriteCond %{HTTP_HOST} ^{$alias}$ [NC]\n";
    print " RewriteRule ^/*(.*)$ http://{$this->data['uri']}/$1 [L,R=301]\n";
  }

  ?>

  ProxyRequests Off
  <Proxy *>
    Order deny,allow
    Allow from all
  </Proxy>

  ProxyPass / http://<?php print $http_proxy_forward; ?>/
  ProxyPassReverse / http://<?php print $http_proxy_forward; ?>/
  ProxyPreserveHost On

</VirtualHost>
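
One detail of the template worth noting: each alias is placed into the RewriteCond pattern as-is, so a dot in the hostname matches any character. For a short-lived proxy vhost that is unlikely to matter, but as a sketch of a stricter variant, the alias could be passed through preg_quote() first. The function name example_rewrite_lines() is hypothetical and this is not part of the original template:

```php
<?php
/**
 * Hypothetical helper: build the rewrite lines for a proxied vhost, escaping
 * each alias so that it is matched literally.
 */
function example_rewrite_lines($uri, array $aliases) {
  $lines = array('RewriteEngine on');
  foreach ($aliases as $alias) {
    // preg_quote() escapes regex metacharacters such as the dots in a hostname.
    $pattern = preg_quote($alias);
    $lines[] = 'RewriteCond %{HTTP_HOST} ^' . $pattern . '$ [NC]';
    $lines[] = 'RewriteRule ^/*(.*)$ http://' . $uri . '/$1 [L,R=301]';
  }
  return $lines;
}

$lines = example_rewrite_lines('example.com', array('www.example.com'));
print implode("\n", $lines) . "\n";
```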

We ran this Drush command before starting the migration. Because our new server had a public IP address of its own, we could test beforehand that when we accessed one of the sites' domains at that IP, we were actually served the site from the old server.
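
A check like that can also be scripted. The following is only a sketch, assuming PHP's http stream wrapper is available: it requests the front page from a given IP address while sending the site's own hostname in the Host header, which is what a browser pointed at that IP would do. The function names, the IP address 192.0.2.10 and the hostname example.com are all hypothetical placeholders:

```php
<?php
/**
 * Hypothetical helper: stream context options that send a specific Host
 * header, so a site can be requested by IP address.
 */
function example_probe_options($host) {
  return array(
    'http' => array(
      'method' => 'GET',
      'header' => 'Host: ' . $host . "\r\n",
      'timeout' => 10,
      // Return the body even for 4xx/5xx responses instead of failing.
      'ignore_errors' => TRUE,
    ),
  );
}

/**
 * Hypothetical helper: TRUE if the server at $ip answers a request for $host.
 */
function example_probe($ip, $host) {
  $context = stream_context_create(example_probe_options($host));
  $body = @file_get_contents('http://' . $ip . '/', FALSE, $context);
  return $body !== FALSE;
}

// Usage (performs a real HTTP request, so it is left commented out here):
// var_dump(example_probe('192.0.2.10', 'example.com'));
```

This only confirms that something answered for that hostname; checking that the response really came from the old server still means looking at the old server's access log or the page content.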

Switching IPs

Once we had our servers set up, we were good to start the migration process. The first stage was to re-assign the public IP address of the sites to the new server instead of the old one. This was then the traffic flow:

Server diagram during the migration phase

Note that if you don't have the ability to re-assign public addresses between servers, you could switch the DNS entries over to the new IP and wait until all the traffic is hitting the new server and not the old one.

Doing the actual migration

The actual migration process is pretty simple: we take advantage of Aegir's built-in Drush commands to do all of the heavy lifting.

Here is the complete Drush command to migrate one site from the old server to the new one.

/**
* Drush command to migrate a single site from the old server to the new one.
*/
function drush_computerminds_migrate_migrate_computerminds_one($site) {
  // Do a backup on the old server.
  drush_log(dt('Backing up old site: @uri', array('@uri' => $site)), 'ok');
  // date() uses the current time by default, so no timestamp argument is needed.
  $suggested = d()->platform->server->backup_path . '/' . $site . '-migrate-' . date("Ymd.His") . '.tar.gz';
  drush_backend_invoke_args('@' . ltrim($site, '@') . ' ' . 'provision-backup', array($suggested), array('uri' => $site, 'root' => d(COMPUTERMINDS_PLATFORM_NAME)->root), 'GET', TRUE, NULL, COMPUTERMINDS_OLD_SERVER_IP, 'aegir');

  // Rsync to this machine.
  drush_log('Copying backup from remote server...', 'ok');
  if (!drush_core_call_rsync(escapeshellarg('aegir@' . COMPUTERMINDS_OLD_SERVER_IP . ':' . $suggested), escapeshellarg($suggested), array(), TRUE, FALSE)) {
    return drush_set_error('RSYNC_FAILED', 'Failed to copy the backup from the remote server.');
  }

  // Copy the Aegir context file over.
  $alias_file = '/var/aegir/.drush/' . $site . '.alias.drushrc.php';
  if (drush_core_call_rsync(escapeshellarg('aegir@' . COMPUTERMINDS_OLD_SERVER_IP . ':' . $alias_file), escapeshellarg($alias_file), array(), TRUE, FALSE)) {
    // Now set the new DB server
    $args = array(
      'uri' => "$site",
      "@$site",
      'db_server' => '@' . COMPUTERMINDS_NEW_DB_SERVER,
      'root' => d('@platform_' . COMPUTERMINDS_PLATFORM_NAME)->root,
      'platform' => '@platform_' . d(COMPUTERMINDS_PLATFORM_NAME)->name,
    );
    drush_backend_invoke('provision-save', $args);
  }
  else {
    return drush_set_error('RSYNC_FAILED', 'Failed to copy the alias from the remote server.');
  }

  drush_log('Copied all files from remote server.', 'ok');

  // Deploy the site.
  provision_backend_invoke($site, 'provision-deploy', array($suggested), array('old_uri' => $site));
  drush_log('Deployed the files and database locally.', 'ok');

  // Import into the frontend, if there are no errors.
  if (!drush_get_error()) {
    drush_log('Importing the site into the frontend...', 'ok');
    provision_backend_invoke('@hostmaster', 'hosting-import', array("@" . $site));
    provision_backend_invoke('@hostmaster', 'hosting-task', array("@" . $site, 'verify'));
    provision_backend_invoke('@hostmaster', 'hosting-task', array("@" . $site, 'enable'));
    drush_bootstrap(DRUSH_BOOTSTRAP_DRUPAL_LOGIN);

    // Hosting will create a dummy install task, but it'll fail. So we remove it here.
    $ref = hosting_context_load("@" . $site);
    if ($ref->nid) {
      if ($task = hosting_get_most_recent_task($ref->nid, 'install')) {
        drush_log(dt('Removed the dummy install task: @nid.', array('@nid' => $task->nid)), 'ok');
        _computerminds_migrate_node_delete($task->nid);
      }
    }

    drush_log(dt('The site: @uri has been imported.', array('@uri' => $site)), 'ok');
  }
}

Note that we copy the context from the old server, so that settings stored about the site are also copied over to the new server. We reset a few of those, such as the database server.

We wanted our site's Aegir generated vhost to be used in place of the proxied one we created earlier, so we implemented a couple of Drush hooks to hook into the above command and remove the proxy vhost when it is run, and put it back if the command to migrate the site fails:

/**
* Implements drush_hook_pre_migrate_computerminds_one().
*
* We use the pre command hook to remove our temporary vhost.
*/
function drush_computerminds_migrate_pre_migrate_computerminds_one($site) {
  // Remove the proxy vhost
  $platform = COMPUTERMINDS_PLATFORM_NAME;
  $platform_context = d('@platform_' . $platform);
  $web_server = d($platform_context->web_server);
  drush_log('REMOVING mod proxy vhost for: ' . $site);
  $vhost = new provisionConfig_computerminds_proxy($web_server, array('uri' => $site));
  $vhost->unlink();
  $web_server->service('http')->restart();
}

/**
* Implements drush_hook_pre_migrate_computerminds_one_rollback().
*
* We use the pre rollback command hook to replace our temporary vhost if
* something went wrong with this migrate.
*/
function drush_computerminds_migrate_pre_migrate_computerminds_one_rollback($site) {
  // Add the proxy vhost.
  $platform = COMPUTERMINDS_PLATFORM_NAME;
  $platform_context = d('@platform_' . $platform);
  $web_server = d($platform_context->web_server);
  drush_log('Creating a mod proxy vhost for: ' . $site);
  $vhost = new provisionConfig_computerminds_proxy($web_server, array('uri' => $site));
  $vhost->write();
  $web_server->service('http')->restart();
}
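
The order of events these two hooks rely on can be illustrated outside of Drush. The sketch below is not how Drush implements its hooks internally; it only mirrors the sequence described above (remove the proxy vhost, run the migration, restore the proxy vhost if the migration failed). The function name and the callables are hypothetical stand-ins:

```php
<?php
/**
 * Hypothetical illustration of the pre-hook / rollback ordering.
 *
 * Each argument after $site is a callable taking the site name.
 */
function example_migrate_with_rollback($site, $remove_proxy, $migrate, $restore_proxy) {
  // Pre hook: take the temporary proxy vhost out of the way.
  call_user_func($remove_proxy, $site);

  // Main command: expected to return TRUE on success.
  $ok = (bool) call_user_func($migrate, $site);

  // Rollback: only if the migration failed, put the proxy vhost back.
  if (!$ok) {
    call_user_func($restore_proxy, $site);
  }
  return $ok;
}

// Record what happens for one successful and one failing migration.
$events = array();
$remove = function ($site) use (&$events) { $events[] = "remove $site"; };
$restore = function ($site) use (&$events) { $events[] = "restore $site"; };
$succeed = function ($site) use (&$events) { $events[] = "migrate $site"; return TRUE; };
$fail = function ($site) use (&$events) { $events[] = "migrate $site"; return FALSE; };

$first = example_migrate_with_rollback('a.example.com', $remove, $succeed, $restore);
$second = example_migrate_with_rollback('b.example.com', $remove, $fail, $restore);
```

The point of the design is that a site whose migration fails goes straight back to being proxied to the old server, so a failure costs a short interruption rather than an outage.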

Finally, to be able to migrate all the sites we have a simple Drush command that gets a list of sites that could be migrated and compares that to the sites on the current server, and offers to migrate them:

/**
* Drush command to migrate a lot of sites from an old Hostmaster to this one.
*/
function drush_computerminds_migrate_migrate_computerminds_all() {
  // Get the remote sites.
  $remote_sites = computerminds_migrate_get_all_sites('@' . COMPUTERMINDS_OLD_HOSTMASTER_NAME, COMPUTERMINDS_PLATFORM_NAME);

  // Get the local sites.
  $local_sites = computerminds_migrate_get_all_sites('@hostmaster', COMPUTERMINDS_PLATFORM_NAME);

  // Migrate the diff.
  $sites_to_migrate = array_diff($remote_sites, $local_sites);
 
  $limit = drush_get_option('limit', 0);
  // Truncate the list of sites to migrate if there is a limit.
  if (!empty($limit)) {
    $sites_to_migrate = array_slice($sites_to_migrate, 0, $limit);
  }

  drush_log(dt('The following sites will be migrated:'), 'ok');

  foreach ($sites_to_migrate as $site) {
    drush_log('  ' . $site, 'ok');
  }

  if (!drush_confirm(dt('Do you want to proceed?'))) {
    return;
  }

  drush_log(dt('Migrating...'), 'ok');

  $success = array();
  $failed = array();
 
  $count = 0;

  foreach ($sites_to_migrate as $site) {
    $result = provision_backend_invoke('@hostmaster', 'migrate-computerminds-one', array($site));
    $count++;
    drush_log(dt('@count of @total sites migrated...', array('@count' => $count, '@total' => count($sites_to_migrate))), 'ok');
    if (!empty($result['error_status'])) {
      $failed[] = $site;
    }
    else {
      $success[] = $site;
    }
  }

  if (!empty($success)) {
    drush_log(dt('The following sites migrated successfully'), 'ok');
    foreach ($success as $site) {
      drush_log('  ' . $site, 'ok');
    }
  }

  if (!empty($failed)) {
    drush_log(dt('The following sites migrated unsuccessfully'), 'error');
    foreach ($failed as $site) {
      drush_log('  ' . $site, 'error');
    }
  }
}

We had to add a --limit option because Drush creates so many log messages that the parent Drush process, which asks all the other processes to do the work, ran out of memory.
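
The selection logic is small enough to try in isolation. As a sketch with made-up site names, the following shows how the diff plus the limit behaves over repeated runs; example_select_batch() is a hypothetical name, and the loop simply pretends that every site in a batch migrates successfully:

```php
<?php
/**
 * Hypothetical helper: pick the next batch of sites to migrate.
 *
 * Sites already present locally are skipped; a limit of 0 means "no limit".
 */
function example_select_batch(array $remote_sites, array $local_sites, $limit = 0) {
  // Sites on the old server that are not yet on the new one.
  $pending = array_values(array_diff($remote_sites, $local_sites));

  // Truncate the list if there is a limit.
  if (!empty($limit)) {
    $pending = array_slice($pending, 0, (int) $limit);
  }
  return $pending;
}

// Simulate repeated runs with a limit of 2 until nothing is left to migrate.
$remote = array('a.example.com', 'b.example.com', 'c.example.com', 'd.example.com', 'e.example.com');
$local = array('a.example.com');
$runs = array();
while ($batch = example_select_batch($remote, $local, 2)) {
  $runs[] = $batch;
  // Pretend every site in the batch migrated successfully.
  $local = array_merge($local, $batch);
}
```

Because the list is recomputed from the difference each time, a site that fails to migrate is not recorded anywhere special; as long as it is still missing locally, it is simply offered again on the next run.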

The actual command we ran to do the migration was:

drush @hostmaster migrate-computerminds-all --limit=50

And we just kept running it until all the sites had been migrated. Each site will in turn be taken off-line, by removing the proxy vhost, and then backed up, migrated, and imported onto the new server. This took about 30 seconds per site for us. After a few hours or so our setup looked like this:

Server diagram showing post migration

We were then free to remove the old server after confirming that no traffic was being routed to it. Note that apart from installing this code onto the old server, we didn't actually change anything on it, so if something had gone wrong we could have just switched the IP back in the firewall and gone back to serving sites from the old server until we fixed the problem.

The code

I've put the code described in this article into a GitHub repository:

https://github.com/computerminds/aegir_sites_migrate

I've cleaned it up and removed names for the purpose of this article, which may mean that I've broken it somewhere along the way. Feel free to fork and send a pull request with any fixes.

You will want to edit the defines at the top of the code to set things up.