none
Flagging Duplicates without deleting RRS feed

  • Question

  • I need to flag duplicates in a column.  If there is 3 records that are the same then the first one gets flagged as unique and the others as duplicates.  I need to keep all the records and use the flag as a filter when needed. 

    CountIF in Excel does this well but I have too many records to use that function. 

    Thanks

    Pete

    Wednesday, July 12, 2017 11:17 PM

Answers

  • I would do a group by the column, and count the rows. I would then edit the count portion of the group by code and replace it with an Add Index column code and change the type to "table". Then when you expand the table, you should get a count of each row 1, 2, 3 etc. All the number 1s are either unique or the first of many. Sorry, this may be a bit hard to follow - I'm in my iPad. I covered something similar in my online training http://xbi.com.au/pqt but that is a paid course. If you post some sample data I could have a go for you when I get a moment. Or see if you can work omit out from my description.

    Self Service BI Expert using Power Pivot http://exceleratorbi.com.au

    Thursday, July 13, 2017 10:07 AM
  • Here's a similar solution to what Matt suggested, checking for uniqueness in a column called 'Name':

    let
        Source = <<YOUR DATA>>,
        #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
        #"Add Unique Flag" = Table.AddColumn(#"Added Index", "Unique?", 
            each 
            if [Index] = ((N)=>List.Min(Table.SelectRows(#"Added Index", each [Name] = N)[Index]))([Name]) 
              then "UNIQUE" 
            else "NOT UNIQUE")
    in
        #"Add Unique Flag"

    First, we added an index column. Then, we're using a function to create row context, and finding the first index that has this row's 'Name' value. If the indexes match, then obviously it's the first instance and we'll call it unique. Otherwise, we know it's not unique.




    Thursday, July 13, 2017 6:22 PM

All replies

  • I would do a group by the column, and count the rows. I would then edit the count portion of the group by code and replace it with an Add Index column code and change the type to "table". Then when you expand the table, you should get a count of each row 1, 2, 3 etc. All the number 1s are either unique or the first of many. Sorry, this may be a bit hard to follow - I'm in my iPad. I covered something similar in my online training http://xbi.com.au/pqt but that is a paid course. If you post some sample data I could have a go for you when I get a moment. Or see if you can work omit out from my description.

    Self Service BI Expert using Power Pivot http://exceleratorbi.com.au

    Thursday, July 13, 2017 10:07 AM
  • Here's a similar solution to what Matt suggested, checking for uniqueness in a column called 'Name':

    let
        Source = <<YOUR DATA>>,
        #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
        #"Add Unique Flag" = Table.AddColumn(#"Added Index", "Unique?", 
            each 
            if [Index] = ((N)=>List.Min(Table.SelectRows(#"Added Index", each [Name] = N)[Index]))([Name]) 
              then "UNIQUE" 
            else "NOT UNIQUE")
    in
        #"Add Unique Flag"

    First, we added an index column. Then, we're using a function to create row context, and finding the first index that has this row's 'Name' value. If the indexes match, then obviously it's the first instance and we'll call it unique. Otherwise, we know it's not unique.




    Thursday, July 13, 2017 6:22 PM