none
SUMIFS in Power Query (Rolling 12 Months)

    Question

  • Basically, I'm trying to calculate a "Rolling 12 Months" column in Power Query.  Using the table below as an example, I would want the column to add the number of Sales for each Associate for the month represented on each row as well as the 11 months prior.  I've included a screenshot of a table with the calculation being done with SUMIFS.  Is there a way to replicate these results in Power Query?

    Associate

    Date

    Sales

    Rolling 12 Months

    Associate 1

    1/1/2015

    10

    10

    Associate 2

    1/1/2015

    5

    5

    Associate 1

    2/1/2015

    15

    25

    Associate 2

    2/1/2015

    7

    12

    Associate 1

    3/1/2015

    4

    29

    Associate 2

    3/1/2015

    1

    13

    Associate 1

    4/1/2015

    17

    46

    Associate 2

    4/1/2015

    8

    21

    Associate 1

    5/1/2015

    8

    54

    Associate 2

    5/1/2015

    20

    41

    Associate 1

    6/1/2015

    2

    56

    Associate 2

    6/1/2015

    3

    44

    Associate 1

    7/1/2015

    6

    62

    Associate 2

    7/1/2015

    15

    59

    Associate 1

    8/1/2015

    9

    71

    Associate 2

    8/1/2015

    2

    61

    Associate 1

    9/1/2015

    12

    83

    Associate 2

    9/1/2015

    20

    81

    Associate 1

    10/1/2015

    18

    101

    Associate 2

    10/1/2015

    14

    95

    Associate 1

    11/1/2015

    3

    104

    Associate 2

    11/1/2015

    6

    101

    Associate 1

    12/1/2015

    11

    115

    Associate 2

    12/1/2015

    1

    102

    Associate 1

    1/1/2016

    20

    125

    Associate 2

    1/1/2016

    10

    107

    Associate 1

    2/1/2016

    0

    110

    Associate 2

    2/1/2016

    12

    112

    Associate 1

    3/1/2016

    4

    110

    Associate 2

    3/1/2016

    10

    121

    Here is the Excel formula (with table references) that produces the results in the "Rolling 12 Months" column:

    =SUMIFS([Sales],[Associate],[@Associate],[Date],"<="&[@Date],[Date],">"&EDATE([@Date],-12))



    Monday, April 25, 2016 11:28 PM

Answers

  • Hi,

    Try this code below. This is a one of possible ways.

    let
        fxRolling12 = (tbl as table) as table =>
          let
            AddIdx = Table.AddIndexColumn(tbl, "Idx",1,1),
            SalesList = AddIdx[Sales],
            Roll12Col = Table.AddColumn(AddIdx, "Rolling 12", each if [Idx] <= 12 then List.Sum(List.FirstN(SalesList,[Idx])) else List.Sum(List.LastN(List.FirstN(SalesList,[Idx]), 12)))
          in
            Roll12Col,
    
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChType = Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type date}, {"Sales", Int64.Type}}),
        AddIndex = Table.AddIndexColumn(ChType, "Indeks", 1, 1),
        GroupByAssoc = Table.Group(AddIndex, {"Associate"}, {{"tbl", each fxRolling12(_), type table}}),
        RemColumn = Table.RemoveColumns(GroupByAssoc,{"Associate"}),
        Expand = Table.ExpandTableColumn(RemColumn, "tbl", {"Associate", "Date", "Idx", "Indeks", "Sales", "Rolling 12"}, {"Associate", "Date", "Idx", "Indeks", "Sales", "Rolling 12"}),
        ChType2 = Table.TransformColumnTypes(Expand,{{"Date", type date}}),
        SortByIndeks = Table.Sort(ChType2,{{"Indeks", Order.Ascending}}),
        RemCols = Table.RemoveColumns(SortByIndeks,{"Idx", "Indeks"})
    in
        RemCols

    Regards

    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:58 PM
    Tuesday, April 26, 2016 9:07 AM
  • Thanks Colin!

    Actually, I wanted to use a more elegant way using (_) => like described here, but couldn't get it to work (wouldn't have to write the "let..columns..in" then). Do you have an idea if/how this would work here?


    Imke Feldmann TheBIccountant.com


    Yes, good point. The script then becomes:

    let
      Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
      ChType = Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type datetime}, {"Sales", Int64.Type}}),
        Rol12= Table.AddColumn(ChType, "Custom", (ChRow) =>
                List.Sum(
                Table.SelectRows(
                  ChType, 
                  each [Date]>=Date.AddMonths(ChRow[Date],-11) and [Date]<=ChRow[Date] and [Associate]=ChRow[Associate]
                )[Sales]
              )
          )
    in
        Rol12

    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:56 PM
    • Unmarked as answer by RollingHippy Tuesday, April 26, 2016 5:57 PM
    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:58 PM
    Tuesday, April 26, 2016 2:20 PM
  • Thanks Colin!

    Learned some "basics" today (Table.AddColumns taking in a function as the 3rd parameter... - syntax sugar is really dangerous :-) )

    So we just give the parameter a name of our choice ("ChRow" or "x") - really didn't get this!

    Re the Date.AddMonths, I ended up with:  [Date]>Date.AddMonths(ChRow[Date],-12)

    Also Bill just gave me a friendly reminder that I need to tune my solution in order to keep up with the speed of his solution. Table.Buffer does that for me. So here's the new code then :-)  :

    let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    ChType = Table.Buffer(Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type date}, {"Sales", Int64.Type}})),
    Rol12= Table.AddColumn(ChType, "Custom", (x) => List.Sum(Table.SelectRows(Table.Buffer(ChType), each [Date]>Date.AddMonths(x[Date],-12) and [Date]<=x[Date] and [Associate]=x[Associate])[Sales]))
    in
    Rol12


    Imke Feldmann TheBIccountant.com

    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:55 PM
    • Unmarked as answer by RollingHippy Tuesday, April 26, 2016 5:55 PM
    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:58 PM
    Tuesday, April 26, 2016 5:06 PM
    Moderator

All replies

  • Hi,

    Try this code below. This is a one of possible ways.

    let
        fxRolling12 = (tbl as table) as table =>
          let
            AddIdx = Table.AddIndexColumn(tbl, "Idx",1,1),
            SalesList = AddIdx[Sales],
            Roll12Col = Table.AddColumn(AddIdx, "Rolling 12", each if [Idx] <= 12 then List.Sum(List.FirstN(SalesList,[Idx])) else List.Sum(List.LastN(List.FirstN(SalesList,[Idx]), 12)))
          in
            Roll12Col,
    
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChType = Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type date}, {"Sales", Int64.Type}}),
        AddIndex = Table.AddIndexColumn(ChType, "Indeks", 1, 1),
        GroupByAssoc = Table.Group(AddIndex, {"Associate"}, {{"tbl", each fxRolling12(_), type table}}),
        RemColumn = Table.RemoveColumns(GroupByAssoc,{"Associate"}),
        Expand = Table.ExpandTableColumn(RemColumn, "tbl", {"Associate", "Date", "Idx", "Indeks", "Sales", "Rolling 12"}, {"Associate", "Date", "Idx", "Indeks", "Sales", "Rolling 12"}),
        ChType2 = Table.TransformColumnTypes(Expand,{{"Date", type date}}),
        SortByIndeks = Table.Sort(ChType2,{{"Indeks", Order.Ascending}}),
        RemCols = Table.RemoveColumns(SortByIndeks,{"Idx", "Indeks"})
    in
        RemCols

    Regards

    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:58 PM
    Tuesday, April 26, 2016 9:07 AM
  • There's also a syntax that more resembles the Excel-way:

    
    let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChType = Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type date}, {"Sales", Int64.Type}}),
    Rol12= Table.AddColumn(ChType, "Custom", each let Asso=[Associate], Date=[Date] in List.Sum(Table.SelectRows(ChType, each [Date]>=Date.AddYears(Date,-1) and [Date]<=Date and [Associate]=Asso)[Sales]))
    in
    Rol12
    


    Imke Feldmann TheBIccountant.com

    Tuesday, April 26, 2016 12:00 PM
    Moderator
  • There's also a syntax that more resembles the Excel-way:

    let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChType = Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type date}, {"Sales", Int64.Type}}),
    Rol12= Table.AddColumn(ChType, "Custom", each let Asso=[Associate], Date=[Date] in List.Sum(Table.SelectRows(ChType, each [Date]>=Date.AddYears(Date,-1) and [Date]<=Date and [Associate]=Asso)[Sales]))
    in
    Rol12


    Imke Feldmann TheBIccountant.com

    Imke, there's a slight error in your very clever solution. Instead of Date.AddYears(Date, -1), you need to use Date.AddMonths(Date, -11) to get the correct sum after the year changes.
    Tuesday, April 26, 2016 12:50 PM
  • Thanks Colin!

    Actually, I wanted to use a more elegant way using (_) => like described here, but couldn't get it to work (wouldn't have to write the "let..columns..in" then). Do you have an idea if/how this would work here?


    Imke Feldmann TheBIccountant.com


    Tuesday, April 26, 2016 12:59 PM
    Moderator
  • Thanks Colin!

    Actually, I wanted to use a more elegant way using (_) => like described here, but couldn't get it to work (wouldn't have to write the "let..columns..in" then). Do you have an idea if/how this would work here?


    Imke Feldmann TheBIccountant.com


    Yes, good point. The script then becomes:

    let
      Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
      ChType = Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type datetime}, {"Sales", Int64.Type}}),
        Rol12= Table.AddColumn(ChType, "Custom", (ChRow) =>
                List.Sum(
                Table.SelectRows(
                  ChType, 
                  each [Date]>=Date.AddMonths(ChRow[Date],-11) and [Date]<=ChRow[Date] and [Associate]=ChRow[Associate]
                )[Sales]
              )
          )
    in
        Rol12

    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:56 PM
    • Unmarked as answer by RollingHippy Tuesday, April 26, 2016 5:57 PM
    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:58 PM
    Tuesday, April 26, 2016 2:20 PM
  • Thanks Colin!

    Learned some "basics" today (Table.AddColumns taking in a function as the 3rd parameter... - syntax sugar is really dangerous :-) )

    So we just give the parameter a name of our choice ("ChRow" or "x") - really didn't get this!

    Re the Date.AddMonths, I ended up with:  [Date]>Date.AddMonths(ChRow[Date],-12)

    Also Bill just gave me a friendly reminder that I need to tune my solution in order to keep up with the speed of his solution. Table.Buffer does that for me. So here's the new code then :-)  :

    let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    ChType = Table.Buffer(Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type date}, {"Sales", Int64.Type}})),
    Rol12= Table.AddColumn(ChType, "Custom", (x) => List.Sum(Table.SelectRows(Table.Buffer(ChType), each [Date]>Date.AddMonths(x[Date],-12) and [Date]<=x[Date] and [Associate]=x[Associate])[Sales]))
    in
    Rol12


    Imke Feldmann TheBIccountant.com

    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:55 PM
    • Unmarked as answer by RollingHippy Tuesday, April 26, 2016 5:55 PM
    • Marked as answer by RollingHippy Tuesday, April 26, 2016 5:58 PM
    Tuesday, April 26, 2016 5:06 PM
    Moderator
  • Thank you all so much! This is exactly what I was looking for.
    Tuesday, April 26, 2016 6:00 PM
  • Hi Imke,

    Note that in Rol12, ChType is already buffered, so buffering it a second time will make no performance difference.

    I agree that the mixed syntax sugar makes the Rol12 operation a bit harder to understand. Removing "each," for example, Rol12 can be written:

    Rol12= Table.AddColumn(
                    ChType,
                   "Custom",
                    (x) => List.Sum(
                                   Table.SelectRows(
                                       ChType,
                                       (y)=> y[Date]>Date.AddMonths(x[Date],-12) 
                                       and y[Date]<=x[Date]
                                       and y[Associate]=x[Associate])[Sales]
                                   )
                                )

    Recognizing that the x and y prefixes represents a table with one row (the current row being evaluated).

    Tuesday, April 26, 2016 7:21 PM
  • Hi Imke,

    Note that in Rol12, ChType is already buffered, so buffering it a second time will make no performance difference.p>

    That's what I thought as well, but it is actually speeding up again. Definitely strange - but confirming my doubts everytime I had to decide where to put the buffer.

    But one shouldn't forget, that buffering would prevent query-folding. So if you work on SQL-servers or other sources that would pass the commands to the server, Bill's solution would probably be faster!

    Very diplomatic introduction of (y) :-)

    "Each" definitely makes formulas look less scary, but if you have a nested iteration like in our case here, they will not serve. So for didactic purposes, this might be a better way to put it:

    Rol12= Table.AddColumn(
                    ChType,
                   "Custom",
                    (FilterTable) => List.Sum(
                                   Table.SelectRows(
                                       Table.Buffer(ChType),
                                       (ValuesTable)=> ValuesTable[Date]>Date.AddMonths(FilterTable[Date],-12) 
                                       and ValuesTable[Date]<=FilterTable[Date]
                                       and ValuesTable[Associate]=FilterTable[Associate])[Sales]
                                   )
                                )

    We only have one source-table here, which needs to play 2 roles: It delivers the values to be added (ValuesTable: [Sales]) and also the filter-arguments to be checked against (FilterTable: [Date] & [Associate])

    Thanks for your persistence here :-)


    Imke Feldmann TheBIccountant.com


    Tuesday, April 26, 2016 8:13 PM
    Moderator
  • "We only have one source-table here, which needs to play 2 roles: It delivers the values to be added (ValuesTable: [Sales]) and also the filter-arguments to be checked against (FilterTable: [Date] & [Associate])"

    I couldn't have stated that any better! :) This is how EARLIER works in Power Pivot, while hiding the details. It's also a pattern common in SQL (correlated subqueries). One can appreciate how much work SUMIFS does.


    Tuesday, April 26, 2016 9:39 PM
  • Ok...So for fun only :-)

    Here is the code without any "=>" or "fxSomething".... counting only.

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChType = Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type date}, {"Sales", Int64.Type}}),
        AddIndex1 = Table.AddIndexColumn(ChType, "Indeks", 0, 1),
        Sort1 = Table.Sort(AddIndex1,{{"Associate", Order.Ascending}, {"Date", Order.Ascending}}),
        HMMonths = List.Count(List.Distinct(Sort1[Date])),
        ListOfSales = List.Buffer(Sort1[Sales]),
        AddIndex2 = Table.AddIndexColumn(Sort1, "Idx", 0,1),
        ModForIdxCol = Table.TransformColumns(AddIndex2, {{"Idx", each Number.Mod(_, HMMonths), type number}}),
        AddIndex3 = Table.AddIndexColumn(ModForIdxCol, "IdxFromListRange", 0, 1),
        AddCustomColumn = Table.AddColumn(AddIndex3, "Custom", each if [Idx] = 0 then [IdxFromListRange] else null),
        FillDown = Table.FillDown(AddCustomColumn,{"Custom"}),
        Rolling12 = Table.AddColumn(FillDown, "Rolling12", each if [Idx] < 11 then List.Sum(List.Range(ListOfSales,[Custom],[Idx]+1)) else List.Sum(List.Range(ListOfSales,[Idx]-11+[Custom],12))),
        Sort2 = Table.Sort(Rolling12,{{"Indeks", Order.Ascending}}),
        RemoveColumns = Table.RemoveColumns(Sort2,{"Indeks", "Idx", "IdxFromListRange", "Custom"})
    in
        RemoveColumns
    Regards :-)

    Wednesday, April 27, 2016 9:29 AM
  • It turns out that there's more to this story.

    I used data from the AdventureWorks database, which amounted to about 24K columns. After returning the table to Excel, and adding a new column, it took about 9 seconds for Excel to fill down the 24K rows with the original SUMIFS formula.

    The Imke/Colin solution was a complete disaster. It took about 10 minutes (!!) for the query to complete, rendering the solution useless for any reasonably sized dataset. All attempts at table buffering slowed down the query even more.

    Bill's last solution completed in an impressive 21 seconds. However, the row order was all over the place - not in the expected running 12-Month order. I'm sure that the problem could be fixed, but I didn't try because I couldn't follow the logical flow of the script. It seemed to be quite inelegant, to be honest.

    I wondered if the Imke/Colin solution could be improved significantly be grouping the table by Product and Date(I was using Product sales over four years in this case). This grouping produces a column of tables filtered by product, eliminating an extra criterion in the rolling 12-month summation. The Imke/Colin solution would then add the rolling total to a new column in each table of the column of tables.

    To my complete surprise, the revised solution completed in 5 seconds, almost twice as fast as the native Excel formula! This is not the first time that I've had to revise a query due to performance issues. The new script is a as follows:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChangedType = Table.Buffer(Table.TransformColumnTypes(Source,{{"Associate", type text}, {"Date", type date}, {"Sales", Int64.Type}})),
        SortedRows = Table.Sort(ChangedType,{{"Date", Order.Ascending}, {"Associate", Order.Ascending}}),
        AddedSortIndex = Table.AddIndexColumn(SortedRows, "Index", 0, 1),
        GroupedRows = Table.Group(AddedSortIndex, {"Associate"}, {{"Table", each _, type table}}),
        Rolling12Month =  Table.TransformColumns(
                     GroupedRows, 
                     {"Table", 
                         (TableRow) => 
                         Table.AddColumn(
                             TableRow, 
                             "12-Month Running Total", 
                             (FilterTable) =>  
                             List.Sum(
                                 Table.SelectRows(
                                     TableRow, 
                                     (ValuesTable)=> ValuesTable[Date]>Date.AddMonths(FilterTable[Date],-12)
                                                     and ValuesTable[Date]<=FilterTable[Date]
                                 )[Sales]
                             )
                         )
                       }
                 ),
        RemovedColumns = Table.RemoveColumns(Rolling12Month,{"Associate"}),
        ExpandedTable = Table.ExpandTableColumn(RemovedColumns, "Table", {"Associate", "Date", "Sales", "Index", "12-Month Running Total"}, {"Associate", "Date", "Sales", "Index", "12-Month Running Total"}),
        SortedIndex = Table.Sort(ExpandedTable,{{"Index", Order.Ascending}}),
        RemovedIndexColumn = Table.RemoveColumns(SortedIndex,{"Index"}),
        ChangedType2 = Table.TransformColumnTypes(RemovedIndexColumn,{{"Date", type date}})
    in
        ChangedType2

    Indenting always gets screwed up after pasting code in this forum.

    Saturday, April 30, 2016 9:18 PM
  • Hi Colin,

    I am sure, that the method we use to the same problem, depends on the structure of the data. I mean that we will use a different method in case we have only a few types of "associate" and different when we have a hundreds.
    By the way, I made the table with 25 000 rows and i can't reproduce your timing :-(
    Please, tell me more about data you transform.

    Have a nice weekend :-)

    Sunday, May 01, 2016 12:47 AM
  • Hi Bill,

    The script for the data transformed is:

    let
        Source = Sql.Databases("Colin-PC\SQLSERVERTABULAR"),
        AdventureWorksDW2012 = Source{[Name="AdventureWorksDW2012"]}[Data],
        dbo_FactInternetSales = AdventureWorksDW2012{[Schema="dbo",Item="FactInternetSales"]}[Data],
        ExpandedDimProduct = Table.ExpandRecordColumn(dbo_FactInternetSales, "DimProduct", {"ProductAlternateKey"}, {"ProductAlternateKey"}),
        RemovedOtherColumns = Table.SelectColumns(ExpandedDimProduct,{"ExtendedAmount", "OrderDate", "ProductAlternateKey"}),
        ReorderedColumns = Table.ReorderColumns(RemovedOtherColumns,{"OrderDate", "ProductAlternateKey", "ExtendedAmount"}),
        ChangedType = Table.TransformColumnTypes(ReorderedColumns,{{"OrderDate", type date}}),
        GroupedRows = Table.Group(ChangedType, {"OrderDate", "ProductAlternateKey"}, {{"Sales", each List.Sum([ExtendedAmount]), type number}}),
        SortedRows = Table.Sort(GroupedRows,{{"OrderDate", Order.Ascending}}),
        AddedIndex = Table.AddIndexColumn(SortedRows, "Index", 0, 1),
        GroupedRows2 = Table.Group(AddedIndex, {"ProductAlternateKey"}, {{"Product Table", each _, type table}}),
        AddedColumn =  Table.TransformColumns(
                    GroupedRows2, 
                    {"Product Table", 
                     (ProdTable) => Table.AddColumn(
                              ProdTable, 
                              "12-Month Running Total", 
                              (OuterProdRow) =>  List.Sum(
                                        Table.SelectRows(
                                          ProdTable, 
                                          (InnerProdRow)=> InnerProdRow[OrderDate]>Date.AddMonths(OuterProdRow[OrderDate],-12)
                                                 and InnerProdRow[OrderDate]<=OuterProdRow[OrderDate]
                                        )[Sales]
                                      )
                            )
                    }
                  ),
        RemovedColumns = Table.RemoveColumns(AddedColumn,{"ProductAlternateKey"}),
        ExpandedProductTable = Table.ExpandTableColumn(RemovedColumns, "Product Table", {"OrderDate", "ProductAlternateKey", "Sales", "Index", "12-Month Running Total"}, {"OrderDate", "ProductAlternateKey", "Sales", "Index", "12-Month Running Total"}),
        SortedRows2 = Table.Sort(ExpandedProductTable,{{"Index", Order.Ascending}}),
        RemovedColumns1 = Table.RemoveColumns(SortedRows2,{"Index"}),
        ChangedType2 = Table.TransformColumnTypes(RemovedColumns1,{{"OrderDate", type date}, {"Sales", Currency.Type}, {"12-Month Running Total", Currency.Type}})
    in
        ChangedType2

    There are exactly 23,797 rows returned. The 5 second timing was the same whether I used Excel 2016 or Excel 2010 with the Power Query add-in.

    Sunday, May 01, 2016 10:10 PM
  • That's really interesting - so the query definitely folded. Would it be possible to share the SQL-"translation" from the profiler?

    Re buffering & performance: Buffering is (normally) expected to stop folding. Speculation: Maybe cache-management & buffering don't work together when addressing SQL-sources. Will give it a go later. 


    Imke Feldmann TheBIccountant.com


    Monday, May 02, 2016 5:20 AM
    Moderator
  • Gave it a try and the winner is (3 secs):

    let Quelle = Sql.Database("srvsql1\dev", "AdventureWorksDW2012",

    [Query="select [$Outer].[ExtendedAmount], [$Outer].[OrderDate], [$Inner].[ProductAlternateKey], SUM( [$Outer].[ExtendedAmount]) OVER (PARTITION BY [$Inner].[ProductAlternateKey] ORDER BY [$Outer].[OrderDate]) AS 'total' from ( select [_].[ProductKey] as [ProductKey2], [_].[ExtendedAmount] as [ExtendedAmount], [_].[OrderDate] as [OrderDate] from [dbo].[FactInternetSales] as [_] ) as [$Outer] left outer join [dbo].[DimProduct] as [$Inner] on ([$Outer].[ProductKey2] = [$Inner].[ProductKey]) ORDER BY [$Outer].[OrderDate]"]) in Quelle

    This is my personal final nail to the "Query-folding-coffin" so far.

    None of our above queries folded and there were all sorts of issues with it which I'm too sad to describe.

    Anyone know any official source from Microsoft describing how and when folding should work?

    Are we safe to assume that all native SQL-queries will fold?


    Imke Feldmann TheBIccountant.com

    Monday, May 02, 2016 10:05 AM
    Moderator
  • Hi Imke, it's not clear what you're saying. Query folding in Power Query is limited, and I never expected the query to fold beyond the SortedRows step.  I'm I missing something?

    Your native query took between 1 and 2 seconds to run on my system. There is one difference compared to the Power Query script, which sums all of the sales per product per day, prior to calculating the running total. If this transformation step is not performed, you end up with running totals rows with the same total multiple times. Perhaps that result is acceptable though.



    Monday, May 02, 2016 4:34 PM
  • Hi Colin,

    yes, sorry - was a under shock still (to obviously be the last one on the planet who recognized that native queries aren't breaking query-folding any more, although in such a desparate need for it for a client solution). So still looking out for up-to-date documentation on it.

    Re your solution: Great!

    Just one suggestion for improvement: Call Step "GroupedRows2" "Table.PartitionByDimensions" instead. (I've used this technique often as well, but it didn't cross my mind to use it for this task) (& isn't this why the SQL-statement also runs so quickly? - reducing the lines to be iterated through into a different dimension)

    Still haven't had time to get my head around Bill's solution, but this is still on the radar cause I'm expecting some serious M-whispering-secrets in it, that could be helpful in the future.


    Imke Feldmann TheBIccountant.com



    Tuesday, May 03, 2016 5:27 AM
    Moderator
  • Hi Imke,

    Just my two cents...

    You actually bring up a point that's been an irritation for me in M. A variable name is a label, and as such should always be a noun, and not an action verb. This basic principle, generally followed in every other programming language, is routinely violated in M for reasons I cannot fathom. Notice how every default variable name in M is a noun? So, for example, you might see SortedList = List.Sort(...) , but never ListSort = List.Sort(...).

    I do fully agree that we should strive to use better naming for steps, because it helps make the code self-documenting. Instead of "GroupedRows2," I prefer "GroupedProducts" - it is a noun, self-describing and to the point. But then, what do we call "GroupedRows?" "Grouped_Date_Product" perhaps? If there were many columns being grouped, I'd probably stick to "GroupedRows" :).

    Although I use the name "SortIndex," a more appropriate name would be "SortingIndex," because it better describes why this step is included. And so on...

    Notice that M motivates us to change the default names by its use of ugly, error-prone syntax like #"Changed Type." Great for readability in the UI I guess, but a PITA when modifying code.

    Finally, in the interest of clarity, I do not support using short, cryptic names for variables.

    Tuesday, May 03, 2016 2:13 PM
  • hehe... Colin, there are no variables in Power Query ;-) Steps are immutable.

    And about naming... I agree but I disagree with you :-)

    And...You forgot add an information about source of your data. It is obvious that the query based on sql sever data, should be build in a different way (because of query folding) than query based on other sources ( for example txt, csv, xlsx files or excel tables or web page etc). because these latter never folds.
    So try to use your code to 

    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content]

    as you described in some of your previous post. The timming is completly different than if you use it to sql database ;-)

    Cheers :-)

    Wednesday, May 04, 2016 3:12 PM
  • "hehe... Colin, there are no variables in Power Query ;-) Steps are immutable."

    Bill, have you ever read the Formula Language document? If you had, you would have noticed that the word variable is used throughout that document. Anyway, a variable means that on each execution of a query, the value of the "variable" can change, as opposed to a assignment that is constant (like xStep =2). In Haskell - one of the purest functional languages on the planet - the word "variable" is liberally used to refer to immutable values. 

    "And about naming... I agree but I disagree with you :-)"

    That statement makes sense.

    So try to use your code to 

    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content]

    as you described in some of your previous post. The timming is completly different than if you use it to sql database ;-)

    I actually did this very thing at the time of posting my last script, and you're right. Loading the SQL data (60K rows and 3 columns) to a table, then using the table as a new source (after unlinking from the SQL source), the query took 6 sec to complete instead of 5 sec. 

    Wednesday, May 04, 2016 4:11 PM