none
filter large table stored in SQL Server with a list of product from the Excel document? RRS feed

  • Question

  • Hi,

    My users are doing this a lot of time:

    creating a list of product in their Excel and want to gather statistics for these products.

    So how I can filter my large sales table using this Excel list without having to do a bunch of complicated stuff (ie: power query formulas, vba etc...)

    my only option for now is to rely on MDS to push the data into the database and then join the list to my fact table on the SQL Server side...

    But is powerquery has something in place to do this? I dont want to start managing a bunch of MDS entities...

    Friday, September 4, 2015 12:59 PM

Answers

  • Yes, there is a simple way to do this in Power Query: You filter your SQL data by your Excel-table by doing a merge in "only include matching rows"-mode (JoinKind.Inner).

    This will enable Query folding, which is a prerequisite to work on large SQL tables in Power Query (http://blog.crossjoin.co.uk/2014/06/04/join-conditions-in-power-query-part-2-events-in-progress-performance-and-query-folding/): This will make sure that the selections is made on the Server and only matching rows are returned. Otherwise performance would probably be inacceptable.

    This is clickable in the UI. However: There's a bug in there currently, that limits this technique to non-SQL-filter-tables that are not longer than 200 rows.

    To solve this, I've created a technique that automatically detects into how many packets your table needs to be split and then executes a function on it to make this technique also work on larger datasets. This however, is not clickable, but needs (just a couple) of code lines in M. If this is of interest for you, I could post the code.


    Imke Feldmann TheBIccountant.com

    Friday, September 4, 2015 2:07 PM
    Moderator

All replies

  • Yes, there is a simple way to do this in Power Query: You filter your SQL data by your Excel-table by doing a merge in "only include matching rows"-mode (JoinKind.Inner).

    This will enable Query folding, which is a prerequisite to work on large SQL tables in Power Query (http://blog.crossjoin.co.uk/2014/06/04/join-conditions-in-power-query-part-2-events-in-progress-performance-and-query-folding/): This will make sure that the selections is made on the Server and only matching rows are returned. Otherwise performance would probably be inacceptable.

    This is clickable in the UI. However: There's a bug in there currently, that limits this technique to non-SQL-filter-tables that are not longer than 200 rows.

    To solve this, I've created a technique that automatically detects into how many packets your table needs to be split and then executes a function on it to make this technique also work on larger datasets. This however, is not clickable, but needs (just a couple) of code lines in M. If this is of interest for you, I could post the code.


    Imke Feldmann TheBIccountant.com

    Friday, September 4, 2015 2:07 PM
    Moderator
  • well... in my case its just a little more complicated...

    the user dont know the surrogate keys of the products (and dont have to know them)

    he setup a items like: ID, Color, Size, etc...

    could be 1 ID in 1 color

    or 1 id all colors, all size

    resulting in a bunch of combination and in few products or large list of products at the end.

    so its not a simple join.

    but maybe I can use an M function receiving all these product filter criterions, producing a list of product surrogate keys from my DW and them filter my sales with a join...

    Does this filter the sales on the server side?

    Friday, September 4, 2015 2:27 PM
  • Yes, server-side filterting is what I tried to explain with "Query-folding" :-)

    It will stop at a certain Point (have a look at Chris' article), so make sure that this filtering is done first. You might create a separate query for the preparations:

    Good thing about this solution is that you can use as many columns as your keys surrogates as you need. So no need to create one key as in Power Pivot.  

    I'm pretty sure that you will be able to solve the multiple-selections-for-one-column challenge as well. Split columns and unpivot might help you. Otherwise just ask or post a sample file.


    Imke Feldmann TheBIccountant.com

    Friday, September 4, 2015 2:35 PM
    Moderator