none
optimizing dataload in index tables

    Question

  • Hi all,

    We use SSIS DFT to perform dataload using oledbsource and oledb destination (fastload option). The target table is SCD2 type and we are in process of creating Index in target table. As the process is in development stage, no of records (per day) from source is unknown. But decided to built the index as per the rough estimate of rows per day received from client team. Loading data in to indexed table will take additional time.

    Is it good practice to disable and rebuild index before loading data. I read from the link(http://msdn.microsoft.com/en-us/library/ms177445.aspx) that no of incoming rows should be relative to the target row before disable and rebuild index. If the no of input rows is less on a daily basis can we give "order" (clustered index columns) in fasloadoption under oledb destination without disabling index.

    Checkconstraint & tableloack are already enabled as part of fastloadoption.

    Please find the index details in one of the tables created.

    Clustered index (3 columns) on Int
    non clustered no unique (2 columns) on Int

    we could not make standard steps now as we don't know the exact no of rows to come in production. Any suggestion would be highly appreciated.

    Saturday, February 25, 2012 7:47 PM

All replies

  • Are you asking if you should go with the clustered index versa non-clustered?

    Arthur My Blog

    Monday, February 27, 2012 3:11 PM
    Moderator
  • Hi msbi,

    If you want exactly to take the options that whether indexes should be dropped beforehand. You could use 2 "Execute SQL Task" to get the amount of new data imported and the amount of existing data in the table, then did division using the 2 numbers to calculate their proportion to see if the indexes should be dropped. The logic could be illustrated like this -

    [<<Execute SQL Task - to get the row count of the new data imported and assign the count to a Variable>>]
                                                                            |
                                                                            |
    [<<Another Execute SQL Task - to get the row count of existing data in the target table and assign the count to a variable>>]
                                       |                                                                                          |
                      (Count1/Count2<=30%)                                                    (Count1/Count2 > 30%)
                                       |                                                                                          |
                  [<<DFT->without dropping index>>]                                    [<<DFT - > drop index>>]

    In additional, to add the ORDER hint, open the advanced properties of the OLEDB destination componenet and switch to componenet Properties tab where you can find "FastLoadOption", in the box, type something like this: TABLOCK, ORDER(ClusteredColumn, ASC).

    Hope this helpfully,

    thanks,
    Jerry

    Thursday, March 01, 2012 9:37 AM
    Moderator
  • Thanks for your valuable input...will check this option

    Could you please give more details on Order hint..will this improve the performance of loading.

    Wednesday, March 14, 2012 2:39 PM