Ariejan de Vroom – 19 May 2011
243 words in about 1 minute

Rails migrations are awesome, even for updating data after a migration to keep everything consistent. Arguably, you should not update data in migrations, but it is useful in some scenarios. If done incorrectly, however, data migrations can take a long time, causing unnecessary downtime of your application.

The Problem

Updating data can take a very long time, especially if done incorrectly. Consider the following example:

1
2
3
Photo.all.each do |photo|
  photo.update_attribute :order, 999 if photo.order.nil?
end

This might look nice and all and work okay on your development machine. However, if you run this on your production database with 78k photos… You guessed it, it takes for ever. What happens is that Rails will fetch all photos from the database and instantiate 78k Photo objects. Then for each object it will issue an update-query if necessary. Running this took at least ten minutes or more. Bad!

The Alternative

There is, luckily, an alternative that is quite a bit faster.

1
Photo.update_all("`order` = 999", "`order` IS NULL")

This alternative issues exactly one update-query to the database and achieves the same end result as the previous code example. Also, this query took about 20 seconds to run! You may want to look into http://apidock.com/rails/ActiveRecord/Base/update_all/class for more information about the update_all method.

Conclusion

Thinking about the code you put in your migrations, especially when manipulating data, is very important and will pay you back in less database downtime during deployments.

Ariejan de Vroom

Software Engineer • CodeRetreat Facilitator • Ruby, Go and C Programmer • Electronics Apprentice