none
Binary Value from Duplicate RRS feed

  • Question

  • Hi, I'm hoping you can help.

    I have been looking at solutions but none of them seem to match exactly what I need.  In short, I need to create a new custom column in an Excel Power Query table to give a binary value of 1 if it detects the same value in any other row in the same column (basically checking for duplicates).

    As an example

    ID  Name Duplicate
    1    A        1
    2    B        0
    3    C        0
    1    D        1

    I guess I need to reference the original column to check the row value but I'm stuck :(

    Thank you in advance for your help

    Andy

    Thursday, December 20, 2018 10:49 AM

Answers

  • a few comments:

    • the 2 codes I suggested are alternatives, not additional
    • you can create a step that simply reference the previous step with the GUI simply by cliking on the fx button



    • I did not change your query at all, assuming it works well up to the last steps
    • Can you replace your code with this amended code in the advanced Editor and tell if it works ?
    • Can you also please confirm there are duplicates in your column "Person ID" ?
    let
        Source = Web.Page(Web.Contents("https://**************************")),
        Data0 = Source{0}[Data],
        #"Changed Type" = Table.TransformColumnTypes(Data0,{{"C_Date", type date}, {"MH_Date", type date}, {"S_Date", type date}, {"Difficulty", type text}, {"Check", type date}, {"J_Ref", type text}, {"Person ID", Int64.Type}, {"MH_Ref", Int64.Type}, {"CMH_Ref", type text}, {"First name", type text}, {"Surname", type text}, {"Hyper_Ref", type text}, {"L_Step", type text}, {"V_Title", type text}, {"M_Name", type text}, {"SLine", type text}, {"SBU", type text}, {"BU Code", type text}, {"Capability", type text}, {"G_O", type text}, {"Median", type text}, {"Percentage Against Median", type text}, {"SDATE", type date}, {"Source", type text}, {"A_Source", type text}, {"SO", Int64.Type}, {"AG_Fe", Percentage.Type}, {"PO_Fe", Percentage.Type}, {"RBP", type text}, {"RtR", type text}}),
        #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Check", "Check - Copy"),
        #"Renamed Columns" = Table.RenameColumns(#"Duplicated Column",{{"Check - Copy", "M_Accept"}}),
        #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Source", "A_Source", "SO", "AG_Fe", "PO_Fe", "RBP", "RtR"}),
        #"Extracted Month Name" = Table.TransformColumns(#"Reordered Columns", {{"M_Accept", each Date.MonthName(_), type text}}),
        #"Duplicated Column1" = Table.DuplicateColumn(#"Extracted Month Name", "SDATE", "SDATE - Copy"),
        #"Renamed Columns1" = Table.RenameColumns(#"Duplicated Column1",{{"SDATE - Copy", "Start Month"}}),
        #"Reordered Columns1" = Table.ReorderColumns(#"Renamed Columns1",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "PO_Fe", "RBP", "RtR"}),
        #"Extracted Month Name1" = Table.TransformColumns(#"Reordered Columns1", {{"Start Month", each Date.MonthName(_), type text}}),
        #"Added A_Inv" = Table.AddColumn(#"Extracted Month Name1", "Custom", each Value.Multiply([SO] as number, [AG_Fe] as number)),
        #"Renamed Columns2" = Table.RenameColumns(#"Added A_Inv",{{"Custom", "A_Inv"}}),
        #"Reordered Columns2" = Table.ReorderColumns(#"Renamed Columns2",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "A_Inv", "PO_Fe", "RBP", "RtR"}),
        #"Added P_Inv" = Table.AddColumn(#"Reordered Columns2", "P_Inv", each if [Source] = "TTP" then 650 else Value.Multiply([SO] as number, [PO_Fe] as number)),
        #"Reordered P_Inv" = Table.ReorderColumns(#"Added P_Inv",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "A_Inv", "PO_Fe", "P_Inv", "RBP", "RtR"}),
        #"Added v_Src" = Table.AddColumn(#"Reordered P_Inv", "v_Source", each if [Source] is null then 1 else null),
        #"Added v_ASrv" = Table.AddColumn(#"Added v_Src", "v_ASrc", each if [Source] = "Ag" and [A_Source] = null then 1 else null),
        #"Added v_AFee" = Table.AddColumn(#"Added v_ASrc", "v_AFee", each if [Source] = "Ag" and [AG_Fe] = null then 1 else null),
    
    Summary = #"Added v_AFee",
    Add_Duplicate = Table.AddColumn(#"Summary", "v_Duplicate", each if List.PositionOf(#"Summary"[Person ID], [Person ID], Occurrence.Last)=List.PositionOf(#"Summary"[Person ID], [Person ID], Occurrence.First) then null else 1, Int64.Type)
    
    in #"Add_Duplicate"



    • Edited by anthony34 Friday, December 21, 2018 7:19 AM
    • Marked as answer by Abrunton Friday, December 21, 2018 10:38 AM
    Friday, December 21, 2018 7:04 AM

All replies

  • Assuming Source is your table, this step will make it:

    AddDuplicate = Table.AddColumn(#"Source", "Duplicate", each 
        if List.PositionOf(#"Source"[ID], [ID], Occurrence.Last)=List.PositionOf(#"Source"[ID], [ID], Occurrence.First)
        then 0
        else 1,
        Int64.Type
    )


    Basically it simply checks whether the first occurence of the ID entry is at the same position as the last occurence, by using List.PositionOf


    • Edited by anthony34 Thursday, December 20, 2018 11:34 AM
    Thursday, December 20, 2018 11:28 AM
  • Another approach: count the number of occurences in ID Column by using List.Count

    AddDuplicate = Table.AddColumn(#"Source", "Duplicate v2", (_)=> if List.Count(List.Select(#"Source"[ID], (x)=> x=_[ID] ))=1 then 0 else 1, Int64.Type)
    You can add List.Buffer if your table is large
    • Edited by anthony34 Thursday, December 20, 2018 12:20 PM
    Thursday, December 20, 2018 11:50 AM
  • Dear Anthony.

    Thank you for taking the time to work on this, I have modified as follows but still just get null I'm afraid.  Are there any glaring errors here you can possibly see?

    Custom Column Formula

    =if List.PositionOf(Data0[Person ID], [Person ID], Occurrence.Last)=List.PositionOf(Data0[Person ID], [Person ID], Occurrence.First) then null else 1

    Using the Custom Column builder in Excel PowerQuery

    Thanks

    Andy


    • Edited by Abrunton Thursday, December 20, 2018 1:55 PM
    Thursday, December 20, 2018 1:55 PM
  • 2 suggestions:

    • You got null because you put null in your formula

    =if List.PositionOf(Data0[Person ID], [Person ID], Occurrence.Last)=List.PositionOf(Data0[Person ID], [Person ID], Occurrence.First)
    then null else 1

    replace null by 0 and then you will get what you want.

    • are you sure there are duplicates in your column Person ID ?


    Thursday, December 20, 2018 2:19 PM
  • Could you paste your full code ?

    In Power Query Editor -> Home -> Advanced Editor -> select your code and copy it with your mouse

    Thursday, December 20, 2018 2:39 PM
  • Thanks Anthony, of course:

    let
        Source = Web.Page(Web.Contents("https://**************************")),
        Data0 = Source{0}[Data],
        #"Changed Type" = Table.TransformColumnTypes(Data0,{{"C_Date", type date}, {"MH_Date", type date}, {"S_Date", type date}, {"Difficulty", type text}, {"Check", type date}, {"J_Ref", type text}, {"Person ID", Int64.Type}, {"MH_Ref", Int64.Type}, {"CMH_Ref", type text}, {"First name", type text}, {"Surname", type text}, {"Hyper_Ref", type text}, {"L_Step", type text}, {"V_Title", type text}, {"M_Name", type text}, {"SLine", type text}, {"SBU", type text}, {"BU Code", type text}, {"Capability", type text}, {"G_O", type text}, {"Median", type text}, {"Percentage Against Median", type text}, {"SDATE", type date}, {"Source", type text}, {"A_Source", type text}, {"SO", Int64.Type}, {"AG_Fe", Percentage.Type}, {"PO_Fe", Percentage.Type}, {"RBP", type text}, {"RtR", type text}}),
        #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Check", "Check - Copy"),
        #"Renamed Columns" = Table.RenameColumns(#"Duplicated Column",{{"Check - Copy", "M_Accept"}}),
        #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Source", "A_Source", "SO", "AG_Fe", "PO_Fe", "RBP", "RtR"}),
        #"Extracted Month Name" = Table.TransformColumns(#"Reordered Columns", {{"M_Accept", each Date.MonthName(_), type text}}),
        #"Duplicated Column1" = Table.DuplicateColumn(#"Extracted Month Name", "SDATE", "SDATE - Copy"),
        #"Renamed Columns1" = Table.RenameColumns(#"Duplicated Column1",{{"SDATE - Copy", "Start Month"}}),
        #"Reordered Columns1" = Table.ReorderColumns(#"Renamed Columns1",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "PO_Fe", "RBP", "RtR"}),
        #"Extracted Month Name1" = Table.TransformColumns(#"Reordered Columns1", {{"Start Month", each Date.MonthName(_), type text}}),
        #"Added A_Inv" = Table.AddColumn(#"Extracted Month Name1", "Custom", each Value.Multiply([SO] as number, [AG_Fe] as number)),
        #"Renamed Columns2" = Table.RenameColumns(#"Added A_Inv",{{"Custom", "A_Inv"}}),
        #"Reordered Columns2" = Table.ReorderColumns(#"Renamed Columns2",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "A_Inv", "PO_Fe", "RBP", "RtR"}),
        #"Added P_Inv" = Table.AddColumn(#"Reordered Columns2", "P_Inv", each if [Source] = "TTP" then 650 else Value.Multiply([SO] as number, [PO_Fe] as number)),
        #"Reordered P_Inv" = Table.ReorderColumns(#"Added P_Inv",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "A_Inv", "PO_Fe", "P_Inv", "RBP", "RtR"}),
        #"Added v_Src" = Table.AddColumn(#"Reordered P_Inv", "v_Source", each if [Source] is null then 1 else null),
        #"Added v_ASrv" = Table.AddColumn(#"Added v_Src", "v_ASrc", each if [Source] = "Ag" and [A_Source] = null then 1 else null),
        #"Added v_AFee" = Table.AddColumn(#"Added v_ASrc", "v_AFee", each if [Source] = "Ag" and [AG_Fe] = null then 1 else null),
        #"Added v_Duplicate" = Table.AddColumn(#"Added v_AFee", "v_Duplicate", each if List.PositionOf(Data0[Person ID], [Person ID], Occurrence.Last)=List.PositionOf(Data0[Person ID], [Person ID], Occurrence.First) then null else 1),
        #"Added v_Duplicate2" = Table.AddColumn(#"Added v_Duplicate", "vDuplicate2", each if List.Count(List.Select(Data0[Person ID], (x)=> x=_[Person ID] ))=1 then 1 else null)
    in
        #"Added v_Duplicate2"

    Added the two duplicates down at the end.

    Thanks


    Andy



    • Edited by Abrunton Thursday, December 20, 2018 7:04 PM
    Thursday, December 20, 2018 7:04 PM
  • a few comments:

    • the 2 codes I suggested are alternatives, not additional
    • you can create a step that simply reference the previous step with the GUI simply by cliking on the fx button



    • I did not change your query at all, assuming it works well up to the last steps
    • Can you replace your code with this amended code in the advanced Editor and tell if it works ?
    • Can you also please confirm there are duplicates in your column "Person ID" ?
    let
        Source = Web.Page(Web.Contents("https://**************************")),
        Data0 = Source{0}[Data],
        #"Changed Type" = Table.TransformColumnTypes(Data0,{{"C_Date", type date}, {"MH_Date", type date}, {"S_Date", type date}, {"Difficulty", type text}, {"Check", type date}, {"J_Ref", type text}, {"Person ID", Int64.Type}, {"MH_Ref", Int64.Type}, {"CMH_Ref", type text}, {"First name", type text}, {"Surname", type text}, {"Hyper_Ref", type text}, {"L_Step", type text}, {"V_Title", type text}, {"M_Name", type text}, {"SLine", type text}, {"SBU", type text}, {"BU Code", type text}, {"Capability", type text}, {"G_O", type text}, {"Median", type text}, {"Percentage Against Median", type text}, {"SDATE", type date}, {"Source", type text}, {"A_Source", type text}, {"SO", Int64.Type}, {"AG_Fe", Percentage.Type}, {"PO_Fe", Percentage.Type}, {"RBP", type text}, {"RtR", type text}}),
        #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Check", "Check - Copy"),
        #"Renamed Columns" = Table.RenameColumns(#"Duplicated Column",{{"Check - Copy", "M_Accept"}}),
        #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Source", "A_Source", "SO", "AG_Fe", "PO_Fe", "RBP", "RtR"}),
        #"Extracted Month Name" = Table.TransformColumns(#"Reordered Columns", {{"M_Accept", each Date.MonthName(_), type text}}),
        #"Duplicated Column1" = Table.DuplicateColumn(#"Extracted Month Name", "SDATE", "SDATE - Copy"),
        #"Renamed Columns1" = Table.RenameColumns(#"Duplicated Column1",{{"SDATE - Copy", "Start Month"}}),
        #"Reordered Columns1" = Table.ReorderColumns(#"Renamed Columns1",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "PO_Fe", "RBP", "RtR"}),
        #"Extracted Month Name1" = Table.TransformColumns(#"Reordered Columns1", {{"Start Month", each Date.MonthName(_), type text}}),
        #"Added A_Inv" = Table.AddColumn(#"Extracted Month Name1", "Custom", each Value.Multiply([SO] as number, [AG_Fe] as number)),
        #"Renamed Columns2" = Table.RenameColumns(#"Added A_Inv",{{"Custom", "A_Inv"}}),
        #"Reordered Columns2" = Table.ReorderColumns(#"Renamed Columns2",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "A_Inv", "PO_Fe", "RBP", "RtR"}),
        #"Added P_Inv" = Table.AddColumn(#"Reordered Columns2", "P_Inv", each if [Source] = "TTP" then 650 else Value.Multiply([SO] as number, [PO_Fe] as number)),
        #"Reordered P_Inv" = Table.ReorderColumns(#"Added P_Inv",{"C_Date", "MH_Date", "S_Date", "Difficulty", "Check", "M_Accept", "J_Ref", "Person ID", "MH_Ref", "CMH_Ref", "First name", "Surname", "Hyper_Ref", "L_Step", "V_Title", "M_Name", "SLine", "SBU", "BU Code", "Capability", "G_O", "Median", "Percentage Against Median", "SDATE", "Start Month", "Source", "A_Source", "SO", "AG_Fe", "A_Inv", "PO_Fe", "P_Inv", "RBP", "RtR"}),
        #"Added v_Src" = Table.AddColumn(#"Reordered P_Inv", "v_Source", each if [Source] is null then 1 else null),
        #"Added v_ASrv" = Table.AddColumn(#"Added v_Src", "v_ASrc", each if [Source] = "Ag" and [A_Source] = null then 1 else null),
        #"Added v_AFee" = Table.AddColumn(#"Added v_ASrc", "v_AFee", each if [Source] = "Ag" and [AG_Fe] = null then 1 else null),
    
    Summary = #"Added v_AFee",
    Add_Duplicate = Table.AddColumn(#"Summary", "v_Duplicate", each if List.PositionOf(#"Summary"[Person ID], [Person ID], Occurrence.Last)=List.PositionOf(#"Summary"[Person ID], [Person ID], Occurrence.First) then null else 1, Int64.Type)
    
    in #"Add_Duplicate"



    • Edited by anthony34 Friday, December 21, 2018 7:19 AM
    • Marked as answer by Abrunton Friday, December 21, 2018 10:38 AM
    Friday, December 21, 2018 7:04 AM
  • Dear Anthony.

    This worked! Thank you very much for helping with this.

    If you have a link where I can send you a coffee/beer for your time - please let me know.

    Regards

    Andy

    Friday, December 21, 2018 10:39 AM
  • thanks for the "e-coffee"
    Friday, December 21, 2018 1:46 PM