none
Conditional Join RRS feed

  • Question

  • Hello,

    I have two tables that need joining, but it needs to be conditional. Table1 has variables ID and REC-Date (date received). Table2 has ID and REV-Date (date reviewed). There are duplicate IDs because a case may have been received and reviewed more than once.

    If there is a duplicate ID, then I need to ensure that period 1 (REC-Date to REV-Date) is prior to period 2 (REC-Date). Said another way, the first period REV-Date needs to be earlier than the second period REC-Date.

    Thank you and let me know if you need clarifications.

    Tim

    Tuesday, May 10, 2016 6:16 PM

Answers

  • Hi All,

    I learned that I needed to create a clustered index in order to get what I was after.

    The single column solution is:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
        ChangedType = Table.TransformColumnTypes(Source,{{"ID", type number}}),
        AddedSortIndex = Table.AddIndexColumn(ChangedType, "SortIndex", 0, 1),
        GroupedRows = Table.Group(AddedSortIndex, {"ID"}, {{"FilteredTable", each _, type table}}),
        AddedClusteredIndex = Table.TransformColumns(GroupedRows, {"FilteredTable", each Table.AddIndexColumn(_,"ClusteredIndex",1,1)}),
        ExpandedTable = Table.ExpandTableColumn(AddedClusteredIndex, "FilteredTable", {"SortIndex", "ClusteredIndex"}),
        SortedRows = Table.Sort(ExpandedTable,{{"SortIndex", Order.Ascending}}),
        RemovedColumns = Table.RemoveColumns(SortedRows,{"SortIndex"})
    in
        RemovedColumns

    This is the output that I was after:

    • Marked as answer by Tim Dyeson Tuesday, May 24, 2016 2:59 PM
    Tuesday, May 24, 2016 2:59 PM

All replies

  • You could create partitioned indexes in each file (for the ID sorted by date) as here: https://social.technet.microsoft.com/Forums/en-US/c36810ae-840c-4cce-95b8-a841ab8f5fa7/implementing-rank-partitioned-index?forum=powerquery Then join the two files using ID and index as the key. Dale
    Wednesday, May 11, 2016 11:12 AM
  • Hi Tim,

    Can you provide an small example of the input tables and what your desired result would be?

    Thanks,

    Ehren

    Thursday, May 12, 2016 6:51 PM
    Owner
  • Hi All,

    I learned that I needed to create a clustered index in order to get what I was after.

    The single column solution is:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
        ChangedType = Table.TransformColumnTypes(Source,{{"ID", type number}}),
        AddedSortIndex = Table.AddIndexColumn(ChangedType, "SortIndex", 0, 1),
        GroupedRows = Table.Group(AddedSortIndex, {"ID"}, {{"FilteredTable", each _, type table}}),
        AddedClusteredIndex = Table.TransformColumns(GroupedRows, {"FilteredTable", each Table.AddIndexColumn(_,"ClusteredIndex",1,1)}),
        ExpandedTable = Table.ExpandTableColumn(AddedClusteredIndex, "FilteredTable", {"SortIndex", "ClusteredIndex"}),
        SortedRows = Table.Sort(ExpandedTable,{{"SortIndex", Order.Ascending}}),
        RemovedColumns = Table.RemoveColumns(SortedRows,{"SortIndex"})
    in
        RemovedColumns

    This is the output that I was after:

    • Marked as answer by Tim Dyeson Tuesday, May 24, 2016 2:59 PM
    Tuesday, May 24, 2016 2:59 PM