Saturday, August 27, 2011

Dynamic data load in script task without a data flow task.( under copyrights)

This post / development has been pending and pondering in my mind and I am quite relieved now after what I achieved the dynamic data load in script task. Mind you , there is no data flow task involved here :-). This is after some heavy time investment during weekends …

What does this solution talk about ?

1. All about the data load from excel source into destination tables in SSIS using script task. Only script task.
2. While I insist on excel source , this can well be extended to table to table, table to excel , excel to excel.
3. A great characteristic I induced into this : Data validations. It can detect the data type mismatches between source and destination. While I tested this on excel to table. It could well be possible from table to table. But for table to excel and excel to excel , need more insight.
4. Error handling/ data type validation is designed for Decimal, INT, NVARCHAR (length) which can be customized further and further validations for varchar and other types can be added.

What are the inputs provided , prerequisites and limitations for this solution and enhancements which can be made further :

1. The source excel sheet has sheets named Sheet_test1$ and Sheet_test2$. (for test).
2. The table naming conventions are followed according to the source sheet names.(Wrk_Sheet_test1 and Wrk_Sheet_test2). The table names should be in hand before the dynamic data load as obvious it is , the tables are project specific. While as an alternative , all the table list can be stored in a table and they can be looped through to load the data into corresponding tables from the source sheets.

What are the advantages of this solution:

1. This package can load the data from these sheets irrespective of number of columns being added in future or irrespective of number of sheets added in future and it is very flexible.
2. The good feature is data type validation and error column detection. The error is logged into a separate error table. On your mark on error column detection with a very good specific error message with respect to data type validations.

Notes :

1. As far as testing is considered , I had this idea tested quite thoroughly in some form or the other(reliable ), but I would like to still burn down my arms on testing and customizing this.
2. I can really insist on the fact that this solution has some great deal of .NET coding in VB in script task. Now apart from that , you could learn how to loop through the source columns whether it can be excel or excel table.
3. There are many other interesting facts when you go through the package.

Here is some preview of code to loop through the columns :

For excel : ColumnsInTable = cnnXls.GetSchema("COLUMNS", columnRestrictions)
For database connections here is a special case of connection casting(otherwise the connection does not work inside the script):

'CONNECT TO THE DESTINATION DATABASE
DatabaseConnectionManager = DtsConvert.GetExtendedInterface(Dts.Connections("DatabaseConnection"))
DatabaseConnectionString = DatabaseConnectionManager.ConnectionString
'MAKE AN ARRAY OF THE CONNECTION STRING OF THE OLEDB CONNECTION.
arrDatabaseConnectionString = DatabaseConnectionManager.ConnectionString.Split(";")
'CREATE A NEW SQL CONNECTION , WE CANNOT CAST THE EXISTING OLEDB CONNECTION TO SQL CONNECTION AS UNMANAGED
'CONNECTIONS CANNOT BE CASTED UNLESS IT IT ADO.NET CONNECTION
oSQLCnn.ConnectionString = arrDatabaseConnectionString(0) + ";" + arrDatabaseConnectionString(1) + ";" + arrDatabaseConnectionString(3) + ";"
oSQLCnn.Open()


Where can you find the solution(the package source code) , scripts and source sheets to test this and also you can customize on your own way :

 https://skydrive.live.com/?cid=c43b6c5c822dd98a&sc=documents&id=C43B6C5C822DD98A%21107#!/?cid=c43b6c5c822dd98a&permissionsChanged=1&id=C43B6C5C822DD98A%21104






3 comments:

  1. Hi,
    it was really helpful but if I add the script task then in the select statement it is going to take more time in the target and if my select statement has more than 50 crore records than it's gonna take more time.
    you solution was perfect but can you suggest anything else without script task

    ReplyDelete
  2. Hi.I have downloaded and opened the package and in script task I am not able to see the actual code.

    ReplyDelete
  3. Hi Deepak,

    I am not able to find download package. will you please let me forword download link?

    ReplyDelete