Developing applications with a relational database requires specialized knowledge. Traditionally, database knowledge was divided between database application developers and database administrators (DBA). More than a decade ago, I took training at The Learning Tree and got certified as an Oracle Applications Developer and as an Oracle Database Administrator. My goal was to get knowledge in SQL and Oracle that I could apply as a Java application developer. My training as a DBA saw little practical use, because each project that I worked on had access to a DBA, someone who spent all of their time just working with the database. For quick view of a DBA’s duties, see the Wikipedia article.
My job was to design the database schema for the application, map its tables and views into Java, and write queries with Java technologies. I also learned how to automate building a database schema and populating it with data automatically, using Java build tools, such as Ant and Maven. Work such as installing a database, configuring security, managing backups, etc., especially in production environments, was left to a dedicated DBA. And the DBA did one job that seemed mysterious and difficult: creating the database migration script. This was the script, created manually by the DBA for each application release, that modified the existing schema in production, altering table structures, inserting new reference data, and so on, to bring the old version of the database up to the version needed by the application being released with it.
What if your team has no access to a professional DBA?
Let’s assume that environments to support your application have been set up. The environments have a database server and an application server. Continuous integration environments also have an integration server, like Jenkins. Your team is responsible to automate building the database, populating it with reference data. Test environments need to be loaded with some test data, and production environments need to be upgraded, automatically. How do you do it?
The first step is to automate rebuilding of a clean database schema and populate it with data for the purposes of running automated integration and functional tests of the application. This assumes that every environment, including development environments on local workstations, has a database server installed. Then you can write ‘nuke and pave’ scripting that tears down and rebuilds the schema clean on every automated build cycle. What technologies are available? I have used:
Any of these can be used in combination with a build manager such as Maven or Gradle to automate database work. Each uses a different expression for data. Ant SQL can run SQL files directly. Dbunit lets you express your data in XML. Groovy lets you express everything in the Groovy language. All of these can work for local builds and CI. But the production environment is not likely to support these build tools. All you can count on is having access to the database. And it talks SQL.
This makes a case for writing all of your database management scripts as SQL. Then, in addition to being called during automated builds, the same scripts can be delivered to the production environment and run as pure SQL. Now that you are on the hook to write the mysterious database migration script, how will you do that?
Ruby On Rails incorporates a concept of database migrations. For Java and Grails, the Liquibase project offers similar functionality. I have worked with teams that tried these and found them lacking. A goal of such frameworks is to let the developer write database scripts in a more familiar language than SQL. My view is that this approach often fails. Migration files proliferate, spreading changes across many files. Generated SQL is unreadable. SQL is just another language. Learn it! Support for non-SQL migrations might not be available in production environments. SQL always is. Write your scripts in SQL using good coding standards and the scripts will be readable, as well as functional.
So put on your DBA hat and start writing the migration script. When? As soon as you have your first release in production. The next release will have changes to the database. How will you test the migration script? It will be tested on every automated build in your development and CI environments! The key is for each release you have a baseline script that builds the database up to that release version. Every time the database is nuked and paved, it is paved in stages, first applying the baseline of the prior release and then running the migration script that you are continually developing. If you add a table, your migration script creates it. If you alter a table, likewise. Your migration will also insert or delete reference data.
The final piece is to create master scripts to orchestrate the creation of a fresh test database (starting with an empty schema), or the upgrade of an existing production database. The scripts should be simple enough for deployment staff to invoke without having much specialized knowledge. A production migration can be done by executing a single master script. Feedback should be provided on the success or failure of the script by logging, or even by writing the status of the upgrade to a database table.
- Write all DDL and DML scripts in SQL in the flavor of your database.
- Invoke the scripts during build automation.
- Maintain baseline scripts to build the previous release version.
- Maintain migration scripts to upgrade to the next version.
- Run as many of the scripts as possible during every automated build.
- Provide master scripts to invoke the scripts independently of build automation.
- Developers will have to learn SQL. But this is good.
- Re-baselining needs to be done every so often, so you don’t have too many migration scripts, which can slow build times.
- Works best if the release schedule is less frequent.