For the past year I've been working on bringing the various databases at my current job under version control. To do this we are using a product called liquibase in combination with Jenkins to auto deploy changes to our database. This removes any manual user interaction with our databases and allows us to keep code and database changes locked together.
When deploying changes to our databases it is not enough to just modify the data structures and functions. You also need to be able to version control certain table data. While we do not want liquibase modifying user data, we do however need to be able to update reference/lookup tables data ( e.g. a list of countries ) so that we can change that data inline with changing the code that may depend on said data during deployment.
This is easily achieved with liquibase as it allows us to automatically run sql scripts on the database if the contents of that script changes. All we need to do is write our sql scripts for these specific tables so that new records are inserted when they don't exist, otherwise update the existing records based on the records Primary Key.
When working with MySQL/MariaDB databases the seemingly obvious solution to making re-runnable data updates is the REPLACE statement to insert new data or replace existing data. For example:
REPLACE INTO countries (
id
, name
, iso
) VALUES
( 1 , 'Afghanistan', 'AF' )
,( 2 , 'Albania', 'AL' )
,( 3 , 'Algeria', 'DZ' )
,( 4 , 'Andorra', 'AD' )
,( 5 , 'Angola', 'AO' )
There is a problem with this approach however. Lets look at how this statement actually operates.
The problem with this approach happens on step 2a. You will note that REPLACE does not UPDATE the record in question, it delete it. When this statement attempts the DELETE operation it will fail if there are any Foreign Key dependencies on the record being deleted.
The solution fortunately is relatively simple. The INSERT statement has a modifier IGNORE which allows us to add new records to our table ignoring any existing ones. When we pair this with the ON DUPLICATE KEY UPDATE modifier we achieve our goal of updating any existing records.
INSERT IGNORE INTO countries (
id
, name
, iso
) VALUES
( 1 , 'Afghanistan', 'AF' )
,( 2 , 'Albania', 'AL' )
,( 3 , 'Algeria', 'DZ' )
,( 4 , 'Andorra', 'AD' )
,( 5 , 'Angola', 'AO' )
ON DUPLICATE KEY UPDATE
name = values(name)
, iso = values(iso);
Using this pattern we can easily manage our reference tables data via version controlled sql scripts.