YamlDb for Database-Independent Data Dumps
November 23, 2007 by Adam Wiggins
One of the many benefits of Rails is database independence. Migrations are particularly nice in this regard; and the easy-to-read / Rubyified display of your schema (via rake db:schema:dump) in schema.rb is icing on the cake.
But what about data? For import and export of the actual data, we’re stuck with mysqldump (or pg_dump, if you’re so inclined). Further, these dump formats are not terribly readable, contain lots of information you may or may not want to copy (like permissions, schema settings, views, triggers…you know, database features that Rails users are supposed to avoid).
Worst of all, ddata dumps are vendor specific, so you can’t move data between databases (e.g., SQLite, MySQL, PostgreSQL). Rails is database-independent, but you’re pretty much stuck with whatever you picked at the start of the project. That’s hardly agile.
Realizing this problem was going to affect us in a pretty direct way with the import/export features of Heroku, Orion and I threw together a plugin yesterday which we’ve given the (rather uninspired) name YamlDb. Once you’ve installed it:
script/plugin install git://github.com/adamwiggins/yaml_db.git
(use http://github.com/adamwiggins/yaml_db.git for Rails < 2.1)
…you’ll have the additional rake tasks db:data:dump and db:data:load, which access a file data.yml. This is a simple dump of your database’s tables, which along with schema.rb, describe everything you need to make a complete copy of your database. (db:dump and db:load are shortcuts to do both schema and data together.)
What’s awesome about this is that you can switch databases in the blink of an eye. Got a SQLite3 database that’s gotten too big for its britches? rake db:dump, create a MySQL database, update your config/database.yml, and rake db:load. Presto, your app is now running in MySQL. Want to give PostgreSQL a try? Same process. Run it for a week, and decided you miss mysqladmin too much? Reverse the process, bringing the data which was stored in Postgres during that week back into MySQL.
Caveats: it’s much slower than dumping and restoring with your database’s native tool, and probably will stop working altogether with a database that is larger than RAM. But for most apps it should work fine. In the future we will most likely write database-specific extensions to make dump/restore process happen in chunks, and thus be more scalable.
As for Heroku, it now uses yaml_db as its native import/export format. So when you export an app from Heroku, you’ll always see a db/data.yml with your current data. It also makes it easy to import data into your Heroku app, at the same time you’re importing the code, or later on. Just install yaml_db in your app locally and run db:dump. Include data.yml with the import (or run and upload it separately), and then run db:load at the rake console.