Inserting Multiple Rows With A Single SQL Statement

Published: {ts '2012-01-17 00:00:00'}
Author: Steven Neiland
Site Url: http://www.neiland.net/article/inserting-multiple-rows-with-a-single-sql-statement/

A common pattern I find in web applications that store large amounts of data is the practice of using a loop to wrap a sql "INSERT" statement to insert multiple rows. For example consider the following code to insert 5 near identical rows.

Note: While the below code is ColdFusion I have seen this pattern used in many other languages.

INSERT INTO sometable (col1,col2,col3) VALUES (#i#,'somename','sometitle')

While this code does work and produces the desired results it does have a drawback. The drawback being that each insert is a separate call to the database. This can create some unexpected problems when you start to scale up.

Partial Success

If you run the above code and a single insert fails all the previously inserted rows will still be there. This makes re-running the insert problematic as you need to work out what did and what did not get inserted successfully.

What About Transactions

You can of course use transactions to do a rollback so that if any insert fails they all fail. While this does solve the partial success problem it still does not address the issue of performance.

Scaling And Locking

While it may not be a big problem with 5 rows, if you scale the loop value in the above code up to 10,000 rows (or 100,0000) you suddenly have a different situation. You are now hitting the database with 10,000 individual calls in a very short amount of time.

While databases can handle this number of calls you also have to remember that you are locking the table 10,000 times which can slow performance for other users (unless you allow dirty reads). In addition the database must also continue serving all the other process's.

Network Performance

Another thing which gets overlooked very often is network speed. Many databases are housed on separate machines to the application server (sometimes in different physical locations) and communicate over the network. If your application is sending 10,000 (or more) individual insert statements that is potentially a lot of network traffic in a short amount of time.

Multi Row Inserts

The solution to this problem is a multi-row insert. To do this you write a single insert statement which contains all the values to be inserted in one pass. The implementation of this is slightly different for each database but they all show a significant performance improvement.

Note: I have not determined what the maximum number of inserts you can do at once is but I have run up to a thousand myself with no issues. I suspect that the maximum number is significantly higher.

Multi Row Insert In MySQL

To do a multi row insert in mysql is a simple matter of creating a comma delimited list of value groups.

INSERT INTO sometable (col1,col2,col3) VALUES (1,'somename','sometitle'), (2,'somename','sometitle'), (3,'somename','sometitle'), (4,'somename','sometitle'), (5,'somename','sometitle')

Appling this principle to the first code block INSERT INTO sometable (col1,col2,col3) VALUES , (#i#,'somename','sometitle')

Note the "if" condition check to add a separator after the first value group.

Multi Row Insert In SQL Server 2005

To accomplish this in SQL Server 2005 we use "SELECT value,value,value UNION ALL". Otherwise execution is exactly the same.

INSERT INTO sometable (col1,col2,col3) SELECT 1,'somename','sometitle' UNION ALL SELECT 2,'somename','sometitle' UNION ALL SELECT 3,'somename','sometitle' UNION ALL SELECT 4,'somename','sometitle' UNION ALL SELECT 5,'somename','sometitle'

Multi Row Insert In SQL Server 2008

Finally SQL Server 2008 allows you to use either the method used by SQL Server 2005 OR the method used by MySQL. My preference would be to use the MySQL method only because each insert line in the sql would be smaller in size due to the shorter text string.

Comparing

In order to demonstrate the speed difference between the two approaches I put together this simple video. As you can see the single call method is significantly faster.

Note: Best viewed full screen.

Speed Test Summary

 MySQLSQL Server 2005
Multiple Call Method52.573 seconds3.569 seconds
Single Call Method0.191 seconds0.051 seconds

Conclusion

Now when you look at it you could make the point that very few web applications would need to do this many inserts from a single entry point. Normally you would only be inserting one or two records at a time for say a single user. However if you have some type of data sync process being called then this technique would be applicable.

In addition if you were seeing a very high volume of inserts to a particular table and you could allow a small time lag then you could potentially put in place some type of application level queuing. This would do a single multi-line insert whenever a certain time frame had elapsed or the queue had exceeded a specified number of new records.