Table.FuzzyNestedJoin Really Slow When Working with Excel Data RRS feed

  • Question

  • Hello Microsoft Technet,

    I'm hoping someone can assist me with an issue that I have been having. I have been ENTHUSIASTIC about the implementation of Table.FuzzyNestedJoin in Power Query for Excel as it has allowed us to overcome several efficiency issues with merging tables.

    One issue that I seem to be encountering, however, is using fuzzy with Excel-based datasets. For example, I can provide two workbooks that simulate a process that I would much prefer to use fuzzy rather than an exact join. One uses Table.NestedJoin and loads an exception list against data from two Excel tables near-instantly. However, the other workbook replaces the join code with Table.FuzzyNestedJoin and moves excruciatingly slow on a comparatively tiny dataset (117k - less than 2000 rows x 8 columns for both tables).

    Additionally, it appears to be bugging with some internal functionalities of Excel as, even after the refresh is complete, Excel will continue to eat 50+% of my CPU for up to two minutes after the process - in some cases, the application crashes if I try to do things like copy-paste during this time. I am using the latest version of Office 365 (64 bit - ver 1907), both on a Lenovo Thinkpad i5 Windows 7 machine (work) and on my Windows 10 i7 Surface Pro 4 - both experiencing similar behaviors.

    I have tried removing formulas from the workbook, using named ranges instead of tables, tried using the same basic code in an external workbook (replacing source with Excel.Workbook rather than Excel.CurrentWorkbook), and using Table.Buffer across several points in my code, with no avail. 

    As a test, I ran some similar code against over 40,000 rows x 9 columns from an Access Database and left outer joined (using fuzzy) a SharePoint list - no performance issues. 

    I'm really hoping someone can help me out here. I'm hoping its just a best practices issue that I'm not following. Please let me know if I can clarify any details.

    • Edited by Auburn_Eagle05780 Saturday, August 17, 2019 7:58 AM redundant info in body
    Saturday, August 17, 2019 7:53 AM


  • Hi,

    I tried to reproduce this issue, but the application seems to work fine on my machine. To identify the issue, can you help us by running fuzzy nested join on tables of other sizes?

    1) A table with 2000 rows but fewer columns (maybe once with 2 columns and another time with 5 columns)

    2) A table with 8 columns but fewer rows (maybe with the number of rows = 100, 500, 1000 and 1500)

    Does fuzzy nested join work smoothly on smaller tables? I don't seem to be running into an issue on a system with 32GB RAM, 3.60GHz processor running Excel on Windows 10.

    Thursday, October 17, 2019 6:18 PM