"SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server database software that can be used to perform a broad range of data migration tasks.

SSIS is a platform for data integration and workflow applications. It features a fast and flexible data warehousing tool used for data extraction, transformation, and loading (ETL).  The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data."
[Wikipedia]

One of the most important components of the SSIS is "the Data Flow task that encapsulates the data flow engine that moves data between sources and destinations, and lets the user transform, clean, and modify data as it is moved. Addition of a Data Flow task to a package control flow makes it possible for the package to extract, transform, and load data." [TechNet Library]

To utilize the data flow behavior, we need to learn 3 parts of it.

(a) Data Flow Sources
(b) Data Flow Transformations
(c) Data Flow Destinations


Data Flow source picks the data from source defined in the properties of this components
e.g. If you want to extract the data from a flat file and want to get it inserted into Database. Flat File data source can be useful.

Data Flow Transformation is required when you want to do any kind of change in the source of the data before inserting into destination database (or any data flow destination e.g. Flat file, Excel, SQL Server etc.)

The various example of Transformation can be adding a BatchId or Date Time when the data is extracted can be added using Data Flow Transformations task, in this case derived column for example.

One of most used data flow destination is OLEDB Destination for all kind of imports from various sources to OLEDB data source destination, e.g. ACCESS, SQL Server, ORACLE, TERADATA etc.

In OLEDB Data Flow destination we set the properties where you want to insert the data from source

1. Connection Manager :  Set the Connection Manager name
2. Data Access Mode: How will data be accessed in loading into destination?
3. Name of Table/View: Sets the Name of the Table or Views

Here we will discuss the various data access modes and why they are provided in OLEDB Destination

1. Table or View: 
Design time specify the Table or View name where you want to insert the data into. The view you are specifying here must be updatable to work.

2. Table or View Fast Load:  

Specify the Table or View name where you want to insert the data into. The view you are specifying here must be updatable to work.
You can't enable error configuration for row direct which is a major disadvantage. But that makes it’s faster than the other one

 In Fast load you do have options to enable or disable

(a) Keep Identity
(b) Keep Nulls
(c) Table Lock
(d) Check Constraints

We also have to set parameters:

(a) Rows per Batch: 
How many rows will be inserted per batch in loading data from Source to Destination?

(b) Maximum Insert Commit Size: 
This option specifies the maximum possible count of records buffered up and inserted in the destination per batch. It will depend on the hardware/resources available. The default value is 2147483647. There is no best value, it depends greatly on the design of the database, the number of users, the kind of hardware you are operating one etc. You need to test for yourself with your system.

3. Table or View Name from Variable: 
Specify the Table or View name from a SSIS variable where you want to insert the data into. This is useful when you want to put the table/view name in run time rather than design time. Again, the view you are specifying through variable in runtime here must be updatable to work.

4. Table or view name from Variable – Fast Load: 
Same as above with fast Load option

5. SQL Command : 

There are various use of SQL Command compare to Table or View (all the above 4 options discussed)

Few of them are listed below.

 (a) There are several cases where no of data object (tables views etc ) are 1000+ in that case selecting table of view is difficult from drop down list, so there is choice of putting the name manually using SQL command, where you can write the table/view name.

(b) It is better to select only few columns if you don’t want to insert into sparse columns of SQL server tables, sometime not all columns are inserted in dataflow, so why bother about those columns by selecting just table name.

(c) In SQL command you can specify all the SQL HINTS if you want to use at the time of inserting in Destination database. In Fast load only few hints are available like Table LOCK

Any comments or feedback will be appreciated


See Also