none
Merging Queries by Column Key RRS feed

Answers

  • Cannot download file anymore, but tested on my sample.

    A litle bit shorter code:

    let Source = Excel.CurrentWorkbook(), Source2 = Source[Content], Custom1 = List.Accumulate(List.Skip(Source2, 1),Source2{0},(s, c)=>Table.Join(s, "Key", c, "Key")) in Custom1


    Assuming that in this file you have only source tables with keys. But if you have another tables in the workbook, you can apply a filter (I recommend to do it) before 2nd step. The idea is to get the list of tables-to-join before last step.

    But there no limit (almost) for amount of joined tables, any number.


    Maxim Zelensky Excel Inside


    • Edited by Maxim ZelenskyMVP Friday, May 26, 2017 4:53 PM additions
    • Marked as answer by NicoPer Friday, May 26, 2017 8:38 PM
    Friday, May 26, 2017 4:50 PM
  • Or put another way - in a blank query:

    let Tables = {Nr1, Nr2, Nr3, Nr4}, Joined = List.Accumulate(List.Skip(Tables), Tables{0}, (accum, curr) => Table.Join(accum, "Key", curr, "Key")) in Joined

    Which is a good demonstration of how hand coding can be much more efficient than using the UI (in this specific case). Also demonstrates the versatility of List.Accumulate, which can be used in many cases instead of recursion or List.Generate. Good of you to mention this technique! However, I suspect that even this far more efficient code won't help a lot with a very large data set. Nico should try it anyway.

    • Marked as answer by NicoPer Friday, May 26, 2017 7:12 PM
    Friday, May 26, 2017 6:11 PM

All replies

  • Excel 2010/2013/2016 Power Query (aka Get & Transform)
    With Append, and Pivot in either PQ or PP.
    http://www.mediafire.com/file/8cdcnosv4oc5xdc/05_25_17.xlsx

    Thursday, May 25, 2017 9:05 PM
  • Thanks Herbert

    I´ve checked the file, but the output is completely different.

    Here´s a pic of the original sample with all the tables having the same rows.
    Note that the Key is shared along all tables and there are no empty spaces




    and this is a pic of the file you sent:



    What could it be?


    Friday, May 26, 2017 2:30 AM
  • Just successively join the tables by the key column. You can join through the Query Editor (Home-->Merge Queries) or paste the following code in a blank query:

    let
        Source = Table.NestedJoin(Nr1,{"Key"},Nr2,{"Key"},"NewColumn",JoinKind.Inner),
        MergedQueries = Table.NestedJoin(Source,{"Key"},Nr3,{"Key"},"NewColumn.1",JoinKind.Inner),
        MergedQueries1 = Table.NestedJoin(MergedQueries,{"Key"},Nr4,{"Key"},"NewColumn.2",JoinKind.Inner),
        ExpandedNewColumn = Table.ExpandTableColumn(MergedQueries1, "NewColumn", {"Nr2"}),
        ExpandedNewColumn.1 = Table.ExpandTableColumn(ExpandedNewColumn, "NewColumn.1", {"Nr3"}),
        ExpandedNewColumn.2 = Table.ExpandTableColumn(ExpandedNewColumn.1, "NewColumn.2", {"Nr4"})
    in
        ExpandedNewColumn.2

    • Marked as answer by NicoPer Friday, May 26, 2017 4:39 AM
    • Unmarked as answer by NicoPer Friday, May 26, 2017 7:12 PM
    Friday, May 26, 2017 3:51 AM
  • That did the trick.

    Thanks a million Colin, I appreciate it.

    Friday, May 26, 2017 4:40 AM
  • Quick note: It did work perfectly with the sample, but when I tried it with the big tables (+2.5 million rows each), excel completely freezes.

    I guess I´ll turn again to PowerPivot for this task..


    As a sidenote to developers, this could be a good idea for a "built in" option in the Merging part of PQ . The current merging options cannot do that.




    • Edited by NicoPer Friday, May 26, 2017 2:20 PM
    Friday, May 26, 2017 2:19 PM
  • Cannot download file anymore, but tested on my sample.

    A litle bit shorter code:

    let Source = Excel.CurrentWorkbook(), Source2 = Source[Content], Custom1 = List.Accumulate(List.Skip(Source2, 1),Source2{0},(s, c)=>Table.Join(s, "Key", c, "Key")) in Custom1


    Assuming that in this file you have only source tables with keys. But if you have another tables in the workbook, you can apply a filter (I recommend to do it) before 2nd step. The idea is to get the list of tables-to-join before last step.

    But there no limit (almost) for amount of joined tables, any number.


    Maxim Zelensky Excel Inside


    • Edited by Maxim ZelenskyMVP Friday, May 26, 2017 4:53 PM additions
    • Marked as answer by NicoPer Friday, May 26, 2017 8:38 PM
    Friday, May 26, 2017 4:50 PM
  • Or put another way - in a blank query:

    let Tables = {Nr1, Nr2, Nr3, Nr4}, Joined = List.Accumulate(List.Skip(Tables), Tables{0}, (accum, curr) => Table.Join(accum, "Key", curr, "Key")) in Joined

    Which is a good demonstration of how hand coding can be much more efficient than using the UI (in this specific case). Also demonstrates the versatility of List.Accumulate, which can be used in many cases instead of recursion or List.Generate. Good of you to mention this technique! However, I suspect that even this far more efficient code won't help a lot with a very large data set. Nico should try it anyway.

    • Marked as answer by NicoPer Friday, May 26, 2017 7:12 PM
    Friday, May 26, 2017 6:11 PM
  • Great!! Just when I lost all hope..

    Thank you both very much for the replies and code. 
    I fixed the link of the sample to make it permanent (picked the wrong type)

    So I´ve tested everything, first in the sample.

    Maxim, I got this error with your code:


    But I´m sure is just a quick fix. What could it be?
    Really interested in picking all tables automatically.


    And Colin, your code worked like a charm, almost instantly with 3 tables of +2.5 million rows.

    Again thanks a million to both.

    Friday, May 26, 2017 7:11 PM
  • Wow! I didn't expect that result. Maxim's solution works as follows:

    If you have your four tables on the worksheet, then Source = Excel.CurrentWorkbook()[Content] gives you a list of the tables, similar to {Nr1, Nr2, Nr3, Nr4}. You don't have to create a query for each individual table in this case. You can then add the join step.

    If the tables are coming from another source, you obviously can't use this technique. If you have more  tables in the worksheet than you need to work with, you have to remove the extra tables either from the worksheet or in the query editor.


    Friday, May 26, 2017 8:05 PM
  • Ah, I see. No, the sources are external TXT files. I placed them in the sample to make it simpler.
    Ok then, I´ll stick with your code. Again thank you very much.   

    And Maxim, a special thanks to you as for what I see your code was the seed to this solution.
    Friday, May 26, 2017 8:23 PM
  • Definitely, the credit goes to Maxim. Consider marking his solution also as an answer.
    Friday, May 26, 2017 8:28 PM