none
Power Query - If statement while merging 2 tables RRS feed

  • Question

  • Hello I am trying to merge 2 tables in Power Query and compare prices etc.
    In order to create a unique identifier I need to concatenate PO columns with one of the following columns:

    SKU, EAN or Customer Material

    PO & SKU
    PO & EAN
    PO & CUSTOMER MATERIAL

    I cannot use just one of them as non of the combinations returns 100% match between tables as some SKUs, EANs, Customer Materials are missing in the customer table, so the identifier cannot be created and the tables cannot be fully merged

    My question is : how I can use "if" statement (or any other function) so that if 1st combination fails then it should go for the next one and if the second combination fails then it should go for the third one :

    if PO & SKU does not find a match go for PO & EAN ,
    if PO & EAN  does not find a match go for  PO & CUSTOMER MATERIAL

    Below exemplary tables


    SALES TABLE
    PO Material EAN Customer Material PRICE $
    ABC123 A2093657 3026980935447    S07W7RQKLR 10,25
    ABC123 E2093651 3026980935395    S07W7RQSHJ 12,61
    ABC123 R2093457 3026980933453    S07W8W8N52 50,1
    ABC123 W2093661 3026980935494    S07W8W8N5H 10,03
    ABC123 Q2093652 3026980935402    S07WSY52R7 24,65
    XYZ000 S1868705 71700999552    S0140YQCR2 42,85
    XYZ000 1868736 71701000030    S0140YQPDS 14,14
    XYZ000 1868739 71701000061    S0140Q24VS 23,12
    XYZ000 1868737 71701000037    S0140Q2K8K 41,53
    XYZ000 1868755 71701000221    S0140Q2SSO 74,5

    CUSTOMER TABLE 
    PO SKU EAN Customer Material PRICE $
    ABC123 A2093657 3026980935447    S07W7RQKLR 10,25
    ABC123 E2093651   S07W7RQSHJ 11,64
    ABC123 R2093457 3026980933453    S07W8W8N52 50,12
    ABC123   3026980935494      10,03
    ABC123 Q2093652     24,65
    XYZ000 S1868705 71700999552    S0140YQCR2 42,14
    XYZ000     S0140YQPDS 14,14
    XYZ000 1868739 71701000061    S0140Q24VS 23,12
    XYZ000 1868737 71701000037    S0140Q2K8K 41,56
    XYZ000 1868755 71701000221    S0140Q2SSO 74,5

    Thank you in advance

    Lukasz

    Thursday, September 5, 2019 12:15 PM

Answers

  • Hi Lukasz,

    For the table named 'Customer', the code will be:

    let
        Source = Table.Buffer(Excel.CurrentWorkbook(){[Name="Customer"]}[Content]),
        changedType = Table.TransformColumnTypes(
            Source,
            {
                {"PO", type text}, 
                {"SKU", type text}, 
                {"EAN", type text}, 
                {"Customer Material", type text}, 
                {"PRICE $", type number}
            }
        ),
        addedColumn = Table.AddColumn(
            changedType, 
            "Customer Column Values", 
            each Text.Combine(List.RemoveNulls({[PO],[SKU],[EAN],[Customer Material]}), ",")
        )
    in
        addedColumn

    For the table named 'Sales', the code would be:

    let
        Source = Table.Buffer(Excel.CurrentWorkbook(){[Name="Sales"]}[Content]),
        changedType = Table.TransformColumnTypes(
            Source,
            {
                {"PO", type text}, 
                {"Material", type text}, 
                {"EAN", type text}, 
                {"Customer Material", type text}
            }
        ),
        addedCustom = Table.AddColumn(changedType, "Sales Column Values", each {[PO],[Material],[EAN],[Customer Material]}),
        customerColumnValuesList = List.Buffer(Customer[Customer Column Values]),
        splittedListText = List.Transform(customerColumnValuesList, (current)=> Text.Split(current, ",")),
        addedCustom2 = Table.AddColumn(
                          addedCustom, 
                          "Customer Column Values", 
                          each List.Select(
                                  splittedListText, 
                                  (current)=> List.ContainsAll(
                                                  [Sales Column Values],
                                                  current,
                                                  Comparer.OrdinalIgnoreCase
                                               )
                                ){0}?
                       ),
        convertedListToText = Table.ReplaceValue(
            addedCustom2, 
            each [Customer Column Values], 
            each Text.Combine([Customer Column Values], ","),
            Replacer.ReplaceValue, 
            {"Customer Column Values"}
        ),
        mergedQueries = Table.NestedJoin(
            convertedListToText, 
            {"Customer Column Values"}, 
            Customer, 
            {"Customer Column Values"}, 
            "Customer", 
            JoinKind.LeftOuter
        ),
        expandedCustomer = Table.ExpandTableColumn(mergedQueries, "Customer", {"PRICE $"}),
        removedColumns = Table.RemoveColumns(expandedCustomer,{"Sales Column Values", "Customer Column Values"})
    in
        removedColumns

    • Marked as answer by LouCash Monday, September 9, 2019 6:51 AM
    Thursday, September 5, 2019 9:57 PM

All replies

  • Hi Lukasz,

    For the table named 'Customer', the code will be:

    let
        Source = Table.Buffer(Excel.CurrentWorkbook(){[Name="Customer"]}[Content]),
        changedType = Table.TransformColumnTypes(
            Source,
            {
                {"PO", type text}, 
                {"SKU", type text}, 
                {"EAN", type text}, 
                {"Customer Material", type text}, 
                {"PRICE $", type number}
            }
        ),
        addedColumn = Table.AddColumn(
            changedType, 
            "Customer Column Values", 
            each Text.Combine(List.RemoveNulls({[PO],[SKU],[EAN],[Customer Material]}), ",")
        )
    in
        addedColumn

    For the table named 'Sales', the code would be:

    let
        Source = Table.Buffer(Excel.CurrentWorkbook(){[Name="Sales"]}[Content]),
        changedType = Table.TransformColumnTypes(
            Source,
            {
                {"PO", type text}, 
                {"Material", type text}, 
                {"EAN", type text}, 
                {"Customer Material", type text}
            }
        ),
        addedCustom = Table.AddColumn(changedType, "Sales Column Values", each {[PO],[Material],[EAN],[Customer Material]}),
        customerColumnValuesList = List.Buffer(Customer[Customer Column Values]),
        splittedListText = List.Transform(customerColumnValuesList, (current)=> Text.Split(current, ",")),
        addedCustom2 = Table.AddColumn(
                          addedCustom, 
                          "Customer Column Values", 
                          each List.Select(
                                  splittedListText, 
                                  (current)=> List.ContainsAll(
                                                  [Sales Column Values],
                                                  current,
                                                  Comparer.OrdinalIgnoreCase
                                               )
                                ){0}?
                       ),
        convertedListToText = Table.ReplaceValue(
            addedCustom2, 
            each [Customer Column Values], 
            each Text.Combine([Customer Column Values], ","),
            Replacer.ReplaceValue, 
            {"Customer Column Values"}
        ),
        mergedQueries = Table.NestedJoin(
            convertedListToText, 
            {"Customer Column Values"}, 
            Customer, 
            {"Customer Column Values"}, 
            "Customer", 
            JoinKind.LeftOuter
        ),
        expandedCustomer = Table.ExpandTableColumn(mergedQueries, "Customer", {"PRICE $"}),
        removedColumns = Table.RemoveColumns(expandedCustomer,{"Sales Column Values", "Customer Column Values"})
    in
        removedColumns

    • Marked as answer by LouCash Monday, September 9, 2019 6:51 AM
    Thursday, September 5, 2019 9:57 PM
  • Hi Colin

    why the need for Table.Buffer in your two Source ?

    Thursday, September 5, 2019 11:07 PM
  • Hi Anthony,

    The buffering may not be needed in this case (or in the other two recent posts where I provided similar solutions), and I haven't tested the performance. The habit comes from lookup scenarios where for each row in a table, a value is looked up in another table. In these cases, without buffering, the tables are evaluated multiple times (once for each row). But that sort of lookup is not happening here (I'm looking up lists), so I should really determine whether buffering actually helps (or hurts) performance. 

    Friday, September 6, 2019 12:06 AM
  • Hi Colin,

    it's a great solution!

    Thank you!

    Monday, September 9, 2019 6:51 AM
  • Hi Lukasz,

    I'm glad to know that the solution was helpful.

    Monday, September 9, 2019 2:51 PM