none
Fuzzy Lookup Transformation question

    Question

  • Experts,

    I'm currently using "Fuzzy Lookup Transformation" where I'm comparing 23 columns from available input columns to available lookup columns. Everything is working fine as per our expectation.

    Now if I add one more column into this Fuzzy Logic meaning total of 24 columns than I'm receiving following error message:-

    "[SSIS.Pipeline] Error: component "LocationFuzzy Lookup" failed the pre-execute phase and returned error code 0x8007007A."

    Is their any limitation with the number of columns comparision?

    Please do let me know what is the work around to find suitable solution with my type of req.

    Thank You in advance.

    Regards,

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Wednesday, February 22, 2012 8:41 PM

Answers

  • Hi SSISJoost,

    Sorry for the deplay in response.

    But I'm lost. :(

    Can you please give me step by step procedure?

    Thank You

    Regards,

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    easy...

    1) for your source component (I guess it is a database table) don't use a table but use a query as source:

    SELECT  Address_Line_1 + ' ' + Address_Line_2 + ' ' + Address_Line_3 AS AddressLineCompleet
    ,       Address_Line_1
    ,       Address_Line_2
    ,       Address_Line_3
    ,       etc.
    FROM    MyTable 

    2) Do the same for the fuzzy lookup. Create for example a view with only the AddressLineCompleet and not the separate addressline columns.
    map the AddressLineCompleet from your source with the AddressLineCompleet from your Fuzzy Lookup.

    3) Then continue as if AddressLineCompleet didn't exists and map the Address_Line_1, Address_Line_2 and Address_Line_3 to your destination columns


    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com

    • Marked as answer by Eileen Zhao Friday, March 02, 2012 7:34 AM
    Monday, February 27, 2012 6:58 PM

All replies

  • There is a limit of some kind, certainly, but I think it blew the memory buffer allocation, thus I think this item is applicable:

    http://connect.microsoft.com/SQLServer/feedback/details/266185/ssis-fuzzy-lookup-fails-when-it-has-10-columns-or-greater-to-match-and-a-high-number-of-pass-through-variables.

    Remarkably, the workaround it too keep the number of columns to minimum.

    I agree, it is just not practical.

    Consider revising the design toward breaking the processing into multi-stage or multi-piece operation.


    Arthur My Blog

    Wednesday, February 22, 2012 9:46 PM
  • Can't find anything about that in msdn documentation. But isn't it just requesting a little bit to much resources/memory? Maybe check out the MaxMemoryUsage property?

    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com

    Wednesday, February 22, 2012 9:49 PM
  • Hi ArthurZ,

    I review the link which you have provided in your last reply. It says work around as below:-

    "In the fuzzy lookup your first aim is to keep the number of pass through variables to an absolute minimum. Use other methods to obtain and populate these columns after the fuzzy lookup has processed.

    Next reducing the number of columns that the fuzzy lookup transformation has to match on also helps (but also partly defeats the object of why you need it in the first place =]) "

    Can you please show me an example about the work around which is described? I wasn't able to understand properly.

    Thanks in advance.

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Thursday, February 23, 2012 2:54 PM
  •  Hi SSISJoost,

    For the "MaxMemoryUsage" properly by default value is 0, what value should I keep to usage more memory in terms of MB. Can you give please show me example?

    I'm unable to find same example on SSIS forum nor on google.

    Thanks in advance

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Thursday, February 23, 2012 3:04 PM
  • The value of 0 instructs the SSIS engine to allocate the memory dynamically, IMHO this should stay as is.

    I suggest this: try running this package on a different machine (preferably with a better hardware set).


    Arthur My Blog

    Thursday, February 23, 2012 3:42 PM
  • Hi ArthurZ,

    I tried running on 3 different Servers but didn't got any luck.

    Any other ideas? please?

    Thanks

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Thursday, February 23, 2012 4:47 PM
  • Hi Apex,

    Then you need to invest some time in breaking the Fuzzy Lookup into parts. You simply have to reduce the amount of fields used in it.


    Arthur My Blog

    • Proposed as answer by Eileen Zhao Friday, February 24, 2012 8:36 AM
    Thursday, February 23, 2012 6:27 PM
  •  Hi SSISJoost,

    For the "MaxMemoryUsage" properly by default value is 0, what value should I keep to usage more memory in terms of MB. Can you give please show me example?

    I'm unable to find same example on SSIS forum nor on google.

    Thanks in advance

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS


    I have never have to use the MaxMemoryUsage property because I always use 2 or 3 columns max... so no examples. Best thing is to decrease the number of columns and records in your lookup. Maybe skip a couple of columns that don't make sense for a fuzzy lookup (like codes and numbers).

    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com

    Thursday, February 23, 2012 8:13 PM
  • Hi ArthurZ/SSISJoost,

    All the columns are mandatory so I can reduce it, I double check with my users and also with our DW team, WE NEED ALL THE COLUMNS.

    Can you throw me some light how to break the fuzzy lookup such that I can use all the 24 columns successfully?

    That will be great help to me.

    Thanks in advance.

    Regards,

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Friday, February 24, 2012 2:08 PM
  • Please post screenshots how you set it up.

    Arthur My Blog

    Friday, February 24, 2012 3:25 PM
  • Hi Again,

    Here you go,

    1. Data Flow Task Screenshot,

    2. Fuzzy Lookup Screenshot,

    Please let me know if you need further more information with this.

    Thank You

    Regards,

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Friday, February 24, 2012 3:58 PM
  • Hi Apex,

    Looked at the screenshot and asked myself why would you put the multicast in front of the lookup if only one output's used?


    Arthur My Blog

    Friday, February 24, 2012 4:01 PM
  • Hi ArthurZ,

    I can remove it, does that make any difference?

    Thanks

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Friday, February 24, 2012 4:03 PM
  • Hi Again,

    I remove "MultiCast" component but still it is failling up, here is my DFT screenshot:-

    Here is the error message from the "Progress" tab:-

    Please help me out.

    Thanks

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Friday, February 24, 2012 4:16 PM
  • Let's do that and see if that changes anything.

    Arthur My Blog

    Friday, February 24, 2012 4:16 PM
  • Didn't help me out. :(

    Please do let us know your feedback. Thank You - KG, MCTS

    Friday, February 24, 2012 4:17 PM
  • You are really hitting the technical limits of the Fuzzy Lookup component (still the MS Connect item https://connect.microsoft.com/SQLServer/feedback/details/266185/ssis-fuzzy-lookup-fails-when-it-has-10-columns-or-greater-to-match-and-a-high-number-of-pass-through-variables applies).

    Sigh, you need to probably move this process outside SSIS if you cannot reduce the number of columns or simplify it.

    One approach I would go with is to create a view that combines any relevant columns together into one string e.g.

    Create View Simple AS

    SELECT Address_Line_1 + Address_Line_2 + Address_Line_3 AS AddressLine, etc. FROM MyTable ...

    And then the FuzzyLookup will get fewer columns!


    Arthur My Blog

    Friday, February 24, 2012 4:23 PM
  • Hi,

    Your view idea looks good, but will inserting into OLEDB destination, how would I split up back to Line1, Line2 & Line3?

    Any idea?

    Thanks

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Friday, February 24, 2012 5:59 PM
  • and i see you also fuzzy match zipcode... doesn't seems right

    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com

    Friday, February 24, 2012 6:02 PM
  • Hi,

    Your view idea looks good, but will inserting into OLEDB destination, how would I split up back to Line1, Line2 & Line3?

    Any idea?

    Thanks

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS


    create an extra column for the combined fields that you only use for the lookup.

    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com


    Friday, February 24, 2012 6:03 PM
  • Agree with SSISJoost,

    you may want to sort of "couple" then "de-couple" the fields. Alas, you have gotten a very hard nut to crack.


    Arthur My Blog

    Friday, February 24, 2012 6:46 PM
  • Hi SSISJoost,

    Sorry for the deplay in response.

    But I'm lost. :(

    Can you please give me step by step procedure?

    Thank You

    Regards,

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    Monday, February 27, 2012 4:07 PM
  • Hi SSISJoost,

    Sorry for the deplay in response.

    But I'm lost. :(

    Can you please give me step by step procedure?

    Thank You

    Regards,

    Apex


    Please do let us know your feedback. Thank You - KG, MCTS

    easy...

    1) for your source component (I guess it is a database table) don't use a table but use a query as source:

    SELECT  Address_Line_1 + ' ' + Address_Line_2 + ' ' + Address_Line_3 AS AddressLineCompleet
    ,       Address_Line_1
    ,       Address_Line_2
    ,       Address_Line_3
    ,       etc.
    FROM    MyTable 

    2) Do the same for the fuzzy lookup. Create for example a view with only the AddressLineCompleet and not the separate addressline columns.
    map the AddressLineCompleet from your source with the AddressLineCompleet from your Fuzzy Lookup.

    3) Then continue as if AddressLineCompleet didn't exists and map the Address_Line_1, Address_Line_2 and Address_Line_3 to your destination columns


    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com

    • Marked as answer by Eileen Zhao Friday, March 02, 2012 7:34 AM
    Monday, February 27, 2012 6:58 PM