Traitée Look up Trsformation with 10 million records

  • mardi 1 mai 2012 19:14
     
     

    Hi I am using a look up to find a key based on a key column my input data set -1 has about 200,000 records with about 30 couumns and I am looking in to anothe  source table (data set -2 ) which has two columns I am looking up on to one column and bring the other columns this table has about more than 10 million records so my look up fails what is the best practice to do this.

    Note: I need all the 200,000 records (from data set -1)  to be flow to next stage even if I dont have any corresponding match look with data set-2 so I am using the ignore option in the look up tansformation.

    Thanks in advance.

Toutes les réponses

  • mardi 1 mai 2012 19:54
    Modérateur
     
     

    Hi,

    Do the both lookups on the database side, another approach is caching all to disk (using the Cache Transformation), but you may have a better luck with the databse side lookups using SQL (just using the Execute SQL Task).


    Arthur My Blog

  • mardi 1 mai 2012 20:09
     
     

    If you have both data sets as SQL Table , then use Join and get the required columns. You can use your query in OLE DB Source.

    Thanks

    Aamir


    http://sqlage.blogspot.com/

  • mardi 1 mai 2012 22:49
     
     
    My souure is a raw file and looking on to a oracle table
  • mardi 1 mai 2012 23:36
    Modérateur
     
     Traitée
    Then you can dump the contents of the raw file (I guess it is the smallest denominator here) to a temp table in Oracle and do the join on the oracle table. This way all the looksp occur on the database side.

    Arthur My Blog

  • mercredi 2 mai 2012 00:04
    Modérateur
     
     Traitée

    Here are a couple techniques for working with Lookups that should be useful.


    Todd McDermid's Blog Talk to me now on

  • mercredi 2 mai 2012 05:54
     
     

    What are the datatypes of the columns of dataset 2?
    If they are integers, the table should fit in memory.

    8 bytes * 10 million rows = 80,000,000 bytes = 80 megabyte. Not very spectacular.
    If you have two string columns with a length of 50, you have

    100 bytes * 10M rows = 1 gigabyte. Again, something a decent server should be able to handle.


    MCTS, MCITP - Please mark posts as answered where appropriate.