Home / Comment permalink

Module authors: In SQL, don't assume INCREMENT(1) == 2

Trivia question: Suppose you execute the MySQL queries

CREATE TABLE t1 (  id INT UNSIGNED NOT NULL AUTO_INCREMENT,  n INT NOT NULL,  PRIMARY KEY (id))INSERT INTO t1 (n) VALUES (100)INSERT INTO t1 (n) VALUES (200)

What are the values for the id column in your two rows? If you said "1 and 2", you are in good company but you are mistaken. If you want your module to be good enough to run on Drupal.org or Acquia Hosting some day, keep reading.

The correct answer is "they could be anything." On Acquia Hosting, the results are:

mysql> SELECT * FROM t1;+----+-----+| id | n   |+----+-----+|  1 | 100 | |  6 | 200 | +----+-----+

On Drupal.org, the results would be 1 and 3. Surprised?

Getting this wrong puts you in good company because, until recently, Drupal 6 core got this question wrong too. During installation, it ran the queries

db_query("INSERT INTO {role} (name) VALUES ('%s')", 'anonymous user');db_query("INSERT INTO {role} (name) VALUES ('%s')", 'authenticated user');

and assumed those two roles got ids 1 and 2. This means Drupal 6 core could not be freshly installed on drupal.org! Ooops. Thanks to a recent patch, Drupal 6.17 now handles this correctly:

<?phpdb_query("INSERT INTO {role} (name) VALUES ('%s')", 'anonymous user');$rid_anonymous = db_last_insert_id('role', 'rid');if ($rid_anonymous != DRUPAL_ANONYMOUS_RID) {  db_query("UPDATE {role} SET rid = %d WHERE rid = %d", DRUPAL_ANONYMOUS_RID, $rid_anonymous);}?>

If you are a Drupal contrib module author, it is important that you do not make the same mistake. Avoiding it is easy: Never assume an INSERT query into an auto-increment column will produce a known value. If you need a value in an auto-increment column to be a fixed value, use the db_last_insert_id() function (in Drupal 6) or the return value of $query->execute() (in Drupal 7) to access the actual auto-increment value inserted, and update it explicitly.

So now you know what can happen and what to do about it. But why?

MySQL has two server variables, auto_increment_offset and auto_increment_increment, that control both the first auto-increment value inserted into a table and all subsequent auto-increment values inserted into a table. The variables are used in multi-master or master-slave replication setups when a passive master or slave server might at some point be promoted into an active master server. They ensure that auto-increment values created on the different servers will not conflict with each other when the resulting rows are merged together by replication.

On Acquia Hosting, auto_increment_increment is 5. auto_increment_offset is 1 on each primary master and 2 on each secondary master. The primary master is usually active, so auto-increment columns get values 1, 6, 11, 16, etc. If a primary master fails and the secondary master becomes active, new auto-increment columns get values like 17, 22, 27, etc. When the primary master comes back, the offset auto-increment values will not cause a conflict.

Example: Suppose on the primary master, INSERTs occur creating id values 1, 6, and 11, but the column containing 11 does not get replicated before the primary master crashes. The secondary server will have id values 1 and 6 via replication. The application fails over to the secondary master and runs two more INSERTs, creating the id values 7 and 12. Eventually the primary master comes back up, and the column containing id 11 is replicated to the secondary master. Also, because these are dual masters, the columns containing 7 and 12 are replicated back to the primary master. Now, both servers contain id vlaues 1, 6, 7, 11, and 12, and all is well with the world.

One additional note: Many people think these auto-increment features of MySQL mean they can run Drupal on with dual active-active master MySQL replication. You can't. Not all of Drupal's tables use auto-increment primary keys and if your site makes changes against both masters simultaneously you will quickly have a mess on yours hands. Don't do it. If you think you want dual active-master replication, read about Pipe Dream.

Comments

Posted on by TBarregren (not verified).

Corollary: In SQL, don't assume that auto-incremented values are monotonically increasing.

Posted on by Barry Jaspan.

Excellent point!

Scenario: id 1 is inserted on the primary master and replicated. id 6 is inserted on the primary master and a failure occurs before replication. The application fails over, and id 2 is inserted on the secondary master. Later the primary is restored is ids 6 and 2 are both replicated to both. However, the object behind id 2 is newer than the object behind id 6.

Thank you, Thomas!

Posted on by chris.cohen (not verified).

If you can't assume that the first insert into a new table will have id == 1, does that mean that Drupal's master user might have a uid other than 1? If so, would this not cause problems all over the code where it is assumed that this user has an id of 1?

Posted on by Barry Jaspan.

It would mean that, yes, but (since you pointed it out) I just checked and both Drupal 6 and 7 handle this correctly. In Drupal 7:

<?php  // We need some placeholders here as name and mail are uniques and data is  // presumed to be a serialized array. This will be changed by the settings  // form in the installer.  db_insert('users')    ->fields(array(      'uid' => 1,      'name' => 'placeholder-for-uid-1',      'mail' => 'placeholder-for-uid-1',      'created' => REQUEST_TIME,      'status' => 1,      'data' => NULL,    ))    ->execute();?>

In Drupal 6:

<?php  // We need some placeholders here as name and mail are uniques and data is  // presumed to be a serialized array. Install will change uid 1 immediately  // anyways. So we insert the superuser here, the uid is 2 here for now, but  // very soon it will be changed to 1.  db_query("INSERT INTO {users} (name, mail, created, data) VALUES('%s', '%s', %d, '%s')", 'placeholder-for-uid-1', 'placeholder-for-uid-1', time(), serialize(array()));  // This sets uid 1 (superuser). We skip uid 2 but that's not a big problem.  db_query("UPDATE {users} SET uid = 1 WHERE name = '%s'", 'placeholder-for-uid-1');?>

The first comment is technically wrong; the installer does not know the second uid inserted will be 2. Luckily, the update query below it uses the name column instead of the uid column in the WHERE clause, so everything works out.