13 December 2011

Bulk insert efficiently

Using an ORM like Entity Framework or performing single inserts is not efficient to perform bulk operations.
To  bulk insert efficiently, the SqlBulkCopy class ADO.Net must be used.

The SqlBulkCopy class is used to efficiently bulk load a Sql Server table with data from another sources.
SQL Server has a command-prompt utility named bcp to also perform a bulk insert. The SqlBulkCopy is the equivalent for writing managed code.

The data source, like for example a flat file, can be converted to a generic list, that is the source of the SqlBulkCopy or a DataTable can be used directly.

To bulk insert a generic list, it must be converted to a DataTable so that it can be used by the SqlBulkCopy:
public static DataTable ConvertToDataTable<T>(IList<T> list)
    PropertyDescriptorCollection propertyDescriptorCollection = TypeDescriptor.GetProperties(typeof(T));
    DataTable table = new DataTable();
    for (int i = 0; i < propertyDescriptorCollection.Count; i++)
        PropertyDescriptor propertyDescriptor = propertyDescriptorCollection[i];
        Type propType = propertyDescriptor.PropertyType;
        if (propType.IsGenericType && propType.GetGenericTypeDefinition() == typeof(Nullable<>))
            table.Columns.Add(propertyDescriptor.Name, Nullable.GetUnderlyingType(propType));
            table.Columns.Add(propertyDescriptor.Name, propType);
    object[] values = new object[propertyDescriptorCollection.Count];
    foreach (T listItem in list)
        for (int i = 0; i < values.Length; i++)
            values[i] = propertyDescriptorCollection[i].GetValue(listItem);
    return table;

Then the SqlBulkCopy can be used.
The steps to use the SqlBulkCopy are simple:
1) A SqlConnection open must be passed to it's constructor.
2) A DataTable must have the data to bulk insert
3) The mappings between the database columns and the DataTable must be defined, using the ColumnMappings propery.  The Column mappings define the relationships between columns in the data source and columns in the destination.
4) The DestinationTableName property must be set with the name of the destination table on the server.
5) The WriteToServer method copies all the rows of the DataTable to the specified destination table.

In the following example the user table is bulk inserted, where dt is the DataTable with the data to bulk insert.

DataTable dt = new DataTable();
using (SqlConnection connection = new SqlConnection(connectionString))
    using (SqlBulkCopy sqlBulkCopy = new SqlBulkCopy(connection))
        sqlBulkCopy.ColumnMappings.Add("UserID", "UserID");
        sqlBulkCopy.ColumnMappings.Add("UserName", "UserName");
        sqlBulkCopy.ColumnMappings.Add("Password", "Password");
        sqlBulkCopy.DestinationTableName = "User";
The WriteToServer method also supports a DataRow array or a IDataReader.

No comments: