locked
Table.Sort custom comparison criteria on several columns RRS feed

  • Question

  • Hi All.

    There is the Table.Sort comparison criteria description on MSDN (the same as in the old PDFs):

    Comparison criteria

    Comparison criterion can be provided as either of the following values:

    • A number value to specify a sort order. See sort order in the parameter values section above.

    • To compute a key to be used for sorting, a function of 1 argument can be used.

    • To both select a key and control order, comparison criterion can be a list containing the key and order.

    • To completely control the comparison, a function of 2 arguments can be used that returns -1, 0, or 1 given the relationship between the left and right inputs. Value.Compare is a method that can be used to delegate this logic.

    But of course this:

    = Table.Sort(Source,{{Column1, Order.Ascending}, {"Column2", (a, b)=>if a = null or b=null then -1 else Value.Compare(a, b)}})
    // error

    returns an error "this sort direction is not allowed" or something like that.

    I suppose that the same question was there on the Technet, but cannot find a topic about it. 

    Is there a workaround? I only can imagine a very complex monstruoso function which will calculate some sort order over several columns.

    Thanks in advance!


    Maxim Zelensky Excel Inside

    Wednesday, April 25, 2018 5:08 PM

Answers

  • I think that custom sorting will always add some complexity to a sort expression. Your requirement is like sorting 0, 3, 1, 2 in ascending order, except that the zero should be at the end instead of at the beginning. It's not a trivial task in my opinion.

    An alternative would be to use a more "traditional" approach, i.e. add a custom sort column that can be used to sort the columns through the Query Editor. For example:

    AddedSortColumn = Table.AddColumn(Source, "Sort Order", each if [Column2] = null then List.Max(Source[Column2]) + 1 else [Column2]),
    SortedRows = Table.Sort(AddedSortColumn,{{"Column1", Order.Ascending}, {"Sort Order", Order.Ascending}})
    
    So here you reduce complexity at the expense of adding an additional step, which doesn't matter unless the additional step impacts the performance of the query negatively.
    Friday, April 27, 2018 3:45 PM
  • I think I found the right combination:

    = Table.Sort(Source, (a, b) => 
        if Value.Compare(a[Column1], b[Column1])=0 
            then if a[Column2] = null 
                then 1 
                else if b[Column2] = null 
                    then -1 
                    else Value.Compare(a[Column2], b[Column2])
            else Value.Compare(a[Column1], b[Column1])
        )

    at least for this case

    but this is weird... why it should be so complex?..


    Maxim Zelensky Excel Inside

    Friday, April 27, 2018 8:36 AM

All replies

  • Hi Maxim,

    As the third bullet indicates, only key and control order can be in a list. The control order in this case is one of Order.Descending, Order.Ascending, 0 or 1. The 2-argument function in the last bullet cannot be in a list. 

    Thus, the syntax for Value.Compare would be:

    Table.Sort(Source, (a, b) => if a[Column2] = null or b[Column2] = null then -1 else Value.Compare(a[Column2], b[Column2]))
    I do not know if or how you can combine multiple columns when using the a function for comparison. 

    Wednesday, April 25, 2018 6:50 PM
  • I believe you would have to go a route like this:

    = Table.Sort(#"Replaced Value", (a, b) => if a[Column2] = null or b[Column2] = null 
    											then -1 
    											else if Value.Compare(a[Column1], b[Column1])=0 
    													then Value.Compare(a[Column2], b[Column2]) 
    													else Value.Compare(a[Column1], b[Column1]))


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    • Proposed as answer by Colin Banfield Thursday, April 26, 2018 4:15 PM
    • Unproposed as answer by Colin Banfield Friday, April 27, 2018 3:48 PM
    Thursday, April 26, 2018 2:37 PM
  • What this code does is:

    It first sorts by Column 1 and then within the clusters of Column1 it sorts by Column2, UNLESS on of the clusters value is null, then no sorting of the Column2-part takes place.


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Thursday, April 26, 2018 4:15 PM
  • Hi Imke,

    That works, and I would remove the expression: 'if a[Column2] = null or b[Column2] = null then - 1' because I now realize that it can affect the sort order in undesirable ways (e.g. cause duplicate values in Column1 to no longer be sorted together). 

    Thursday, April 26, 2018 4:36 PM
  • Imke, I think that our posts are out of sync. :)

    Anyway, Maxim's original requirement was to sort Column1 in ascending order, and the 'offending' expression mentioned in my previous post overrides the sorting of this column when there are nulls in Column2 and duplicates in Column1. Frankly, I'm not sure what it was intending to achieve in the first place.


    Your code, minus the leading 'if' expression seems right to me.
    Thursday, April 26, 2018 5:01 PM
  • Hi Imke, Colin

    Thank you very much for your efforts to help in this question.

    Unfortunately, both your suggestions didn't work. I think I need to add more details. Let this table is the source:

    Source table

    So, you can see that it contains some duplicates it the 1st column and some nulls in the 2nd.
    My intention is to sort this table with the next order: Column1 ascending, Column2 ascending, but the nulls should be in the end as if they are the biggest values:

    Desired result

    Of course I can replace nulls with the #infinity if Column2 has the number type, or with some string like "ZZZZZZZZ" if this is a text column, but value replacement could be very consuming.

    Imke's code gives me the next result:

    You can see that the 1st column is in the wrong order, also as the 2nd column.

    When I removed the 1st "if" construct in the Imke's code, as Colin suggested, I got the usual sort order: nulls ahead:

    I think it is because of sort algorithm used in the PQ - specially what is the order of left and right values in the Value.Compare function.


    Maxim Zelensky Excel Inside

    Friday, April 27, 2018 8:14 AM
  • I think I found the right combination:

    = Table.Sort(Source, (a, b) => 
        if Value.Compare(a[Column1], b[Column1])=0 
            then if a[Column2] = null 
                then 1 
                else if b[Column2] = null 
                    then -1 
                    else Value.Compare(a[Column2], b[Column2])
            else Value.Compare(a[Column1], b[Column1])
        )

    at least for this case

    but this is weird... why it should be so complex?..


    Maxim Zelensky Excel Inside

    Friday, April 27, 2018 8:36 AM
  • I think that custom sorting will always add some complexity to a sort expression. Your requirement is like sorting 0, 3, 1, 2 in ascending order, except that the zero should be at the end instead of at the beginning. It's not a trivial task in my opinion.

    An alternative would be to use a more "traditional" approach, i.e. add a custom sort column that can be used to sort the columns through the Query Editor. For example:

    AddedSortColumn = Table.AddColumn(Source, "Sort Order", each if [Column2] = null then List.Max(Source[Column2]) + 1 else [Column2]),
    SortedRows = Table.Sort(AddedSortColumn,{{"Column1", Order.Ascending}, {"Sort Order", Order.Ascending}})
    
    So here you reduce complexity at the expense of adding an additional step, which doesn't matter unless the additional step impacts the performance of the query negatively.
    Friday, April 27, 2018 3:45 PM
  • Thanks, Colin

    When I wrote "so complex", I meaned not the sorting task itself, but a Table.Sort function resulting syntax.

    I mean, that if Table.Sort function could be written in such way as in the first message of this topic, it will be great simplification of such cases: not only list of {column, order}, but list of {column, rule}, where rule could be an order or 1-arg function or 2-arg function - like in the List.Sort

    Or, may be I do not understand table sorting principles at all and all is not so easy-peasy?


    Maxim Zelensky Excel Inside

    Saturday, April 28, 2018 3:15 PM
  • Hi Maxim,

    I mean, that if Table.Sort function could be written in such way as in the first message of this topic, it will be great simplification of such cases: not only list of {column, order}, but list of {column, rule}, where rule could be an order or 1-arg function or 2-arg function - like in the List.Sort

    Yeah, I understand where you're coming from. Expanding your table to 10,000 rows, I saw no difference in performance between Value.Compare and the sort column approach. If you really want to have some fun, add a third sort column to the 2-arg function with yet a different custom sort order. :)

    Saturday, April 28, 2018 4:58 PM