none
Implementing Rank / Partitioned Index RRS feed

  • Question

  • Has anyone implemented a partitioned RANK in Power Query? I understand I can Sort and then create an Index, but how can I define partitions for the index? Is there some way I can make it restart with each new partition value?

    For instance, I want to rank company performance in each sport:

    let
        Source = Table.FromRecords(
    	{[Company = "Company A", Event = "Swimming", Points = 8],
    	[Company = "Company A", Event = "Running", Points = 5],
    	[Company = "Company B", Event = "Swimming", Points = 10],
    	[Company = "Company B", Event = "Running", Points = 2],
    	[Company = "Company C", Event = "Swimming", Points = 2],
    	[Company = "Company C", Event = "Running", Points = 4]	})
    in
        #"Source"

    Is there a way to do this, besides manually filtering to a single sport, doing the Sort + Index, repeating this step for each sport, and then appending these steps?

    Tuesday, September 30, 2014 6:29 PM

Answers

  • The only way to do this in Power Query today is as you suggest. You would essentially do a "group by" operation on whatever your categories are, then sort and index inside each group. I don't think this can be done through the UI; you'd probably have to write some M code directly to do it. Here's what I came up with:

    let
        Source = Table.FromRecords({
            [Company = "Company A", Event = "Swimming", Points = 8],
            [Company = "Company A", Event = "Running", Points = 5],
            [Company = "Company B", Event = "Swimming", Points = 10],
            [Company = "Company B", Event = "Running", Points = 2],
            [Company = "Company C", Event = "Swimming", Points = 2],
            [Company = "Company C", Event = "Running", Points = 4]
        }),
        AddRanking = (table, column, newColumn) =>
            Table.AddIndexColumn(Table.Sort(table, {{column, Order.Descending}}), newColumn, 1, 1),
        #"Grouped Rows" = Table.Group(Source, {"Event"}, {{"Data", each _, type table}}),
        Transformed = Table.TransformColumns(#"Grouped Rows", {{"Data", each AddRanking(_, "Points", "Rank")}}),
        #"Expand Data" = Table.ExpandTableColumn(Transformed, "Data", {"Company", "Points", "Rank"}, {"Company", "Points", "Rank"})
    in
        #"Expand Data"
    

    As long as it matches your other needs, I think Colin's suggestion of loading into the data model and then using Power Pivot is a good one.

    There is an request on our backlog of work to add a ranking function, but I don't know when (or if!) it might happen.

    Wednesday, October 1, 2014 1:29 PM

All replies

  • Michael, is there a reason why you prefer to do the ranking in Power Query rather than in Power Pivot?
    Tuesday, September 30, 2014 11:49 PM
  • The only way to do this in Power Query today is as you suggest. You would essentially do a "group by" operation on whatever your categories are, then sort and index inside each group. I don't think this can be done through the UI; you'd probably have to write some M code directly to do it. Here's what I came up with:

    let
        Source = Table.FromRecords({
            [Company = "Company A", Event = "Swimming", Points = 8],
            [Company = "Company A", Event = "Running", Points = 5],
            [Company = "Company B", Event = "Swimming", Points = 10],
            [Company = "Company B", Event = "Running", Points = 2],
            [Company = "Company C", Event = "Swimming", Points = 2],
            [Company = "Company C", Event = "Running", Points = 4]
        }),
        AddRanking = (table, column, newColumn) =>
            Table.AddIndexColumn(Table.Sort(table, {{column, Order.Descending}}), newColumn, 1, 1),
        #"Grouped Rows" = Table.Group(Source, {"Event"}, {{"Data", each _, type table}}),
        Transformed = Table.TransformColumns(#"Grouped Rows", {{"Data", each AddRanking(_, "Points", "Rank")}}),
        #"Expand Data" = Table.ExpandTableColumn(Transformed, "Data", {"Company", "Points", "Rank"}, {"Company", "Points", "Rank"})
    in
        #"Expand Data"
    

    As long as it matches your other needs, I think Colin's suggestion of loading into the data model and then using Power Pivot is a good one.

    There is an request on our backlog of work to add a ranking function, but I don't know when (or if!) it might happen.

    Wednesday, October 1, 2014 1:29 PM
  • Since there is a built-in rank function in Power Pivot and you have more control over filtering etc., if one has the choice, then Power Pivot is the better option. On the other hand, Curt provides a fine solution if you need to do the equivalent in Power Query.

    It appears though, that you'd rather not have an index column and the lot. An alternative solution involves using a couple of portable custom functions.

    ListRankEqual:

    (inputValue as any, inputSeries as list, optional orderDescending as nullable logical) as number =>
    let
        SortedSeries = if not orderDescending or orderDescending = null then
                        List.Sort(inputSeries,Order.Descending)
                     else
                        List.Sort(inputSeries),
        RankEqual = List.PositionOf(SortedSeries,inputValue)+1
    in
        RankEqual

    The function has two mandatory parameters and one optional parameter. In your example, inputValue would be the current value in the Points column, specified as [Points] when passed to the function. inputSeries would be the entire Points column, or in this case, a subset of the Points column. In the body of the function, we sort inputSeries in descending order to get the rank in ascending order and vice versa.

    The second function I call GroupList. I tend to use this function when I want to group a column, but retain the original columns in a table. In your example, we want to rank Swimming and Running independently. Therefore, we need to create a Swimming points list and a Running points list for the ListRankEqual function.

    The code for GroupList:

    (keyColumnList as list, keyColumnValue as any, associatedColumnList as list)=>
    let
        FilteredKeyColumnPositions = List.PositionOf(keyColumnList, keyColumnValue, Occurrence.All),
        FilteredGroupList = List.Transform(FilteredKeyColumnPositions, each associatedColumnList{_})
    in
       FilteredGroupList

    keyColumnList is the list of values in the grouping column. In this case, it would be specified as

    Source[Event]. keyColumnValue is the current value in the Event column, specified as [Event]. associatedColumnList is the column with the values you want to rank (Source[Points]).

    GroupList first finds all the positions of the current value in the Event column. It then uses List.Transform to produce a corresponding Point list. The Swimming event will produce a list containing all of the points for Swimming, and the Running event will produce a list containing all of the points for Running.

    The complete custom formula would be:

    ListRankEqual([Points],GroupList(Source[Event],[Event],Source[Points]))

    You just toss the formula into a new custom column.

    Wednesday, October 1, 2014 4:35 PM
  • The reason I was looking to skip PowerPivot is because I'm currently using the the worksheet as loaded by Power Query for my report data source, and I would prefer to keep all my transformations in the same place. Adding in PowerPivot would introduce extra steps in the loading process (ie Power Query --> Power Pivot --> worksheet) This would also complicate the referencing of this query by other queries.. Also, last time I used Power Query and Power Pivot in conjunction with each other, I would get an error when I tried to refresh the query--something about the query being edited by Power Pivot, which was preventing the refresh. Anyway, thank you Curt, your answer worked for me.


    BI practitioner at a Microsoft Gold Certified managed services provider.


    Thursday, October 2, 2014 4:17 PM