locked
Possible bug in Duplicates Removed RRS feed

  • Question

  • The duplicates remove method (list.distinct) appears to be case-sensitive. Is this the expected behaviour? If so, is there a way to make the method case-insensitive?

    Steps to follow:

    1) Create the following list in Excel and create a query based around it -

    Thank You
    Thank You
    Hello There
    hello there
    Excuse ME
    Excuse me

    2) In Filter & Shape, click remove duplicates

    3) Result - The following list remains

    Thank You
    Hello There
    hello there
    Excuse ME
    Excuse me

    4) Expected Result (This is also what I get when I use the remove duplicates feature from the data tab) - 

    Hello There
    Excuse Me
    Thank You

    I am running the 32-bit Excel 2013. 

    Thanks!!



    • Edited by AKhwaja Monday, September 30, 2013 10:24 PM
    Monday, September 30, 2013 10:20 PM

Answers

  • Found a workaround -- Using transform and selecting "Capitalize Each Word" before removing duplicates is functionally equivalent to the expected result from above. 
    Wednesday, October 2, 2013 9:04 PM
  • Another solution is to use the version of List.Distinct which takes in an equationCriteria. If you know the culture, you can use Comparer.FromCulture(cultureName, true) to ignore the case.  For example:

    List.Distinct(Source, Comparer.FromCulture("en-us", true))

    will give you the expected result if Source contains your original list.

    -Alejandro Lopez-Lago (MSFT)

    • Marked as answer by AKhwaja Friday, October 4, 2013 9:28 PM
    Thursday, October 3, 2013 7:40 PM

All replies

  • Found a workaround -- Using transform and selecting "Capitalize Each Word" before removing duplicates is functionally equivalent to the expected result from above. 
    Wednesday, October 2, 2013 9:04 PM
  • Another solution is to use the version of List.Distinct which takes in an equationCriteria. If you know the culture, you can use Comparer.FromCulture(cultureName, true) to ignore the case.  For example:

    List.Distinct(Source, Comparer.FromCulture("en-us", true))

    will give you the expected result if Source contains your original list.

    -Alejandro Lopez-Lago (MSFT)

    • Marked as answer by AKhwaja Friday, October 4, 2013 9:28 PM
    Thursday, October 3, 2013 7:40 PM
  • Thanks Alejandro.

    I tried your method, it works. Because it uses one fewer step, it is also probably faster and uses less memory.

    Friday, October 4, 2013 9:28 PM