Changing a Drupal pathauto alias pattern with working redirects

So we recently found out we’ve been a bit boneheaded with the pathauto alias to our Freebase taxonomy pages. The pattern was http://svenska.yle.fi/term/freebase/politics-1234, where 1234 was the term ID (tid) in Drupal.

This is a stupid alias pattern, since it makes our urls obfuscated – impossible to determine from the outside without access to our Drupal database. Also it feels like bad semantical form, because the url contains something that’s meaningless to most people. So we wanted to change the pathauto alias to use the unique Freebase ID instead.

Here it is important to remember that the best looking url naturally would be just http://svenska.yle.fi/term/freebase/politics, but because of disambiguation (dealing with the case where a word can have many different meanings, such as ”mercury”, which can be an element, a Roman god, a planet and many other things) we want to guarantee that a url is unique.

If we look up the word politics on Freebase, we find that its unique Freebase ID is /m/05qt0 and so we would like our url to have the form http://svenska.yle.fi/term/freebase/politics-m05qt0.

Using our own Freebase module (which may be released to Drupal.org at a later time) has put a field called ”field_freebase_freebaseid” under a taxonomy vocabulary called ”freebase”. This means we have access to the token [term:field-freebase-freebaseid] and that makes the whole pattern for Freebase taxonomy term listings the following:

term/freebase/[term:name]-[term:field-freebase-freebaseid]

The problem

The problem is that when we change the url alias pattern we want to leave the old alias intact and redirect from it to the new one. This functionality is built into the pathauto module: you can open up a taxonomy term for editing, save it and the new alias will be generated and the old one made into a redirect.

However, we have 6 000 Freebase terms and it would take a day to open up them all and save them to get the new alias with a redirect. It seems fortunate then, that we have a bulk update feature in the pathauto module. Bulk update generates aliases for all entities in a specific content type. Unfortunately, bulk update only works on entities, in our case taxonomy terms, that don’t yet have a pathauto alias. What you have to do is delete all current aliases and then start the bulkupdate, which will generate new aliases using the new pattern. If we start by deleting all current aliases, no redirects can be created! Here are some articles and threads discussing this very issue. Apparently it’s been a problem for around four years:

https://drupal.org/node/1661546

http://drupal.stackexchange.com/questions/98350/how-create-redirect-pattern-from-old-to-new-urls

Basically, if you’ve created thousands of pathauto aliases that have been indexed by Google and need to exist as redirects to the new alias, you’re out of luck! This seems like an incomprehensible oversight and part of me thinks I must’ve missed something, because this isn’t acceptable.

The solution

Searching the web has given us several ideas about how to deal with this issue, but most require some kind of manual hacking of the database, which doesn’t really sound like something we want to do.

Instead, we ended up writing a simple drush script that just loads all terms in a taxonomy vocabulary (”freebase” in our example, but the script could easily be modified to take a command line parameter). Writing the script took about a third of the time it took to write this blog text, so hopefully at least two other Drupal-users will find this beneficial.

I am assuming you are familiar with Drush scripts, but to briefly explain, assuming your module is named ”freebase”, you can just create a file called ”freebase.drush.inc” in the same folder and when you activate your module, your drush.inc file will be autoloaded as an available drush script.

The code

The definition of the Drush-command:

function ydd_api_drush_command() {
  $items = array();
  $items['yle-resave-freebase-terms'] = array(
    'description' => "Loads all Freebase terms and saves them to force Drupal to create new pathauto aliases.",
    'callback' => 'drush_fix_freebase',
    'aliases' => array('yrft'),
  );
  return $items;
}

The accompanying function that does the actual work:

function drush_fix_freebase() {
  // Find freebase vocabulary id.
  $fbvoc = taxonomy_vocabulary_machine_name_load('freebase');
  if (!empty($fbvoc)) {
    $vid = $fbvoc->vid;
    // Get a list of all terms in that vocabulary.
    $tree = taxonomy_get_tree($vid);
    $counter = 0;
    // Loop through all term ids, load and save.
    foreach ($tree as $t) {
      drush_print('Fixing '.$t->tid.' ['.++$counter.' / '.count($tree).']');
      // Load and save term to refresh its pathauto alias.
      $term = taxonomy_term_load($t->tid);
      taxonomy_term_save($term);
    }
  }
}

Finally, you need to run the script which will tell you how many terms have been processed out of how many. The command is:

drush yrft

After running this, it’s easy to verify that all terms have indeed received new aliases and a redirect from the old alias.