Answered by:
Table.Distinct removes values completely
Question

I created a table from queries, then deleted all columns except for the one I need. Then I used the GUI to remove duplicate entries in this single column. With the Table.Distinct function, 88 distinct values remain.
However, if I use 'remove duplicates' in Excel or convert the column to a list and then use the List.Distinct function, 137 distinct values remain! Using Group By (and then removing the extra added column) also gives the desired result.
Why does Table.Distinct remove ALL entries of certain values?
If this is not a bug, can anyone tell me what the intention of the default behavior of this function is and whether I can change this default behavior in order to obtain the desired result?
Answers

That's not the complete text because it depends on the definition of "Rijen gesorteerd" which isn't included.
 Proposed as answer by Ehren  MSFTMicrosoft employee, Owner Wednesday, October 2, 2019 11:18 PM
 Marked as answer by Imke FeldmannMVP, Moderator Saturday, October 26, 2019 6:19 AM
All replies


Hello Aleksei!
Sorry, I can't post my file due to sensitivity of the data. And when I try to isolate the issue by coping this single column to a separate file and load it from there in Power Query, the Table.Distinct function works as expected.
I do have a workaround, but at this point I doubt wether the Table.Distinct function can be trusted (or at least there seems to be an unidentified factor that influences the result).
Is there any known difference in the working of Table.Distinct and List.Distinct that might cause this behavior?

The list of values returned by following formulas should be the same:
= Table.Distinct(SingleColumnTable) = Table.Distinct(Table, {"Column"}) = List.Distinct(Table[Column])
Check, that you use the same source for both your formulas (i.e. the same table, without any additional transformations).
 Edited by Aleksei Zhigulin Thursday, August 29, 2019 2:21 PM

The list of values returned by following formulas should be the same:
= Table.Distinct(SingleColumnTable) = Table.Distinct(Table, {"Column"}) = List.Distinct(Table[Column])
Check, that you use the same source for both your formulas (i.e. the same table, without any additional transformations).
Hello Aleksei, thank you for your support.
I used all three variants after creating my single column table of 794 rows as:
=Table.Distinct(#"Table Name")
=Table.Distinct(#"Table Name", {"ColumnName"})
=List.Distinct(#"Table Name"[ColumnName])
They do not give the same results: Both the Table.Distinct formulas result in 88 rows, the List.Distinct formula gives 138 rows.


The source is a CSV dump with SAP data.
All letters are capital letters, so casing is identical. To simplify, I have a list like this:
APPLE
PEAR
BANANA
PEAR
BANANA
APPLE
APPLE
APPLE
BANANA
APPLEAnd I end up with:
PEAR
BANANABut where did APPLE go?
It appears to me completely random (note that some 50 of 138 distinct values are completely removed in this way!). Moreover, why does List.Distinct give the correct result whereas Table.Distinct does not?!



That's not the complete text because it depends on the definition of "Rijen gesorteerd" which isn't included.
 Proposed as answer by Ehren  MSFTMicrosoft employee, Owner Wednesday, October 2, 2019 11:18 PM
 Marked as answer by Imke FeldmannMVP, Moderator Saturday, October 26, 2019 6:19 AM
