When and how is the function evaluated?

  • Question

  • Suppose I have the following code:

    let
        MyTable = #table(type table [Number = number], List.Zip({List.Repeat({1..10},10)})),
        MyFunction = let minimum = List.Min(MyTable[Number]) in (row) => row[Number] = minimum,
        FilteredTable = Table.SelectRows(MyTable, MyFunction)
    in
        FilteredTable

    Questions:

    1. How many times will List.Min(MyTable[Number]) be calculated here? Am I right that List.Min will be calculated only once?

    2. Is there a difference (in terms of performance and inner logic) compared with

    = Table.SelectRows(MyTable, each [Number]=List.Min(MyTable[Number]))
    and why? Am I right that in this case List.Min will be calculated for each row?


    Maxim Zelensky Excel Inside

    Wednesday, February 20, 2019 4:55 PM

Answers

  • In the first version, the list minimum is calculated only once and stored in the let member. It is referenced from the let member thereafter.

    In the second version, the list minimum is probably calculated once per row. We do attempt to do constant-folding on the predicate and so this isn't guaranteed, but the constant-folding happens within certain complexity limits which I think your example probably exceeds.
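
    A minimal sketch of one way to make the single evaluation explicit in the second version, assuming the same MyTable as in the question. The minimum is moved into its own let member (the name Minimum is illustrative, not from the original code), so the predicate only compares each row against that member:

    ```powerquery
    let
        MyTable = #table(type table [Number = number], List.Zip({List.Repeat({1..10}, 10)})),
        // Illustrative step name: a let member is evaluated on first use and its value is then reused
        Minimum = List.Min(MyTable[Number]),
        // The predicate refers to the let member instead of calling List.Min itself
        FilteredTable = Table.SelectRows(MyTable, each [Number] = Minimum)
    in
        FilteredTable
    ```

    This does not rely on constant-folding of the predicate, because List.Min no longer appears inside the function passed to Table.SelectRows.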

    Thursday, February 21, 2019 9:39 PM

All replies

  • Hi Maxim,

    That sounds about right. Please check this article: https://blog.crossjoin.co.uk/2018/11/16/function-scalarvector-optimise-power-query-m-functions/



    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Thursday, February 21, 2019 5:25 AM
    Moderator
  • Hi Imke!

    I think that for the 2nd question it is clear and expected, but for the 1st question there is no answer in Chris's post. We do not pass a column/list to the function, and there are no Function.ScalarVector calls.

    Based on my tests (1M dates in a single column, tens of runs), the first code sample runs about 25-40% faster than the second. I also wrote a version where List.Min is precalculated in a separate step and then [Date] is compared to this precalculated value. The results are almost the same as for the first code sample.

    So, if I am right, when this function is evaluated to get the function value, List.Min is also evaluated and remembered for future uses (invocations for each row). Maybe this is because its evaluation result is assigned to the immutable 'minimum'? If so, I can rewrite a lot of my functions to get a big performance gain.

    I can imagine that any part of the function body which does nothing with the function parameters is evaluated during the function value evaluation. But I am definitely not sure about it. And if so, it opens up very interesting perspectives for performance tuning in Power Query.

    Or will it be evaluated (and remembered) before the invocation in any case? If so, why doesn't this work in the 2nd case?
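
    For reference, each is shorthand for a one-parameter function whose parameter is named _, and a bare [Number] is shorthand for _[Number]. Under that reading, the second version can be sketched in expanded form as below (the step name Predicate is illustrative, not from the original code):

    ```powerquery
    let
        MyTable = #table(type table [Number = number], List.Zip({List.Repeat({1..10}, 10)})),
        // Expanded form of: each [Number] = List.Min(MyTable[Number])
        // List.Min is written inside the function body, so it belongs to what is evaluated per invocation
        Predicate = (_) => _[Number] = List.Min(MyTable[Number]),
        FilteredTable = Table.SelectRows(MyTable, Predicate)
    in
        FilteredTable
    ```

    Written this way, the difference from the first sample is visible in the source: there, List.Min sits in a let member outside the function body; here, it sits inside the body.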


    Maxim Zelensky Excel Inside

    Thursday, February 21, 2019 11:22 AM
  • Hi Maxim,

    Yes, I was too quick with my answer. As I don't have insight into how M statements are evaluated internally, I'm also very interested to hear some feedback from the dev team here.

    Thanks for raising this issue!


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Thursday, February 21, 2019 6:35 PM
    Moderator
  • Thank you, Curt!

    In the first version, am I right that "any part of the function body which does nothing with the function parameters is evaluated during the function value evaluation"?

    Or is there other logic: we have a let expression which evaluates to the function value, so when Table.SelectRows evaluates the given expression it also evaluates the let members in it, and then the function value already has a pre-calculated List.Min.

    After your explanations I think that the second is correct, but I'd like to hear confirmation from you.

    Cheers,


    Maxim Zelensky Excel Inside

    Friday, February 22, 2019 6:47 AM
  • Hi Maxim

    Since you are about to rewrite your functions for a speed gain, the following (from the excellent bengribaudo.com) is worth reading: it can save 50% of the execution time of some functions. You may already know it, so this is in case you don't, and for other readers:

    Source: Power Query M Primer (Part 5): Paradigm

    I am quoting from the "Partially Lazy" paragraph, all the following belongs to bengribaudo:


    Partially Lazy
    (...)
    This lazy evaluation behavior also holds true for lists, records and tables.

    List.Count({ ComputeValue(), ComputeAnotherValue() })

    To count the number of items in a list, the M interpreter doesn’t need to evaluate the contents of those items, so it doesn’t.
    Above, ComputeValue() and ComputeAnotherValue() are NEVER called because of lazy evaluation.


    BUT

    However, in M, ONLY

    • let expressions
    • lists
    • records
    • tables

    are lazy evaluated. Eager evaluation is used for everything else. This is why we say that M is only partly lazy.

    For example, arguments to a function call are eagerly evaluated.


    let
      Numbers = { 1, 2, 3 },
      Letters = { "A", "B", "C" },
      Choose = (chooseFirst as logical, first, second) => 
        if chooseFirst then first else second
    in
      Choose(true, Numbers, Letters)

    Since both Numbers and Letters are used as parameters when Choose is called and parameters aren’t lazy evaluated, the expressions for BOTH variables are evaluated even though only one of the two will be used by the function.



    SOLUTION TO AVOID UNNECESSARY EVALUATION

    Higher-Ordered

    Above, both lists are evaluated because they were passed as arguments to a function. What if we really wanted to avoid having both lists materialized only for one to be thrown away? Hmm…

    Wait! Remember—we can pass functions around

    Instead of passing Choose two lists of values, what if we adapt it so that we pass it functions that produce values when they are invoked?

    (I don’t mean “use functions in the parameter list to populate the function’s arguments,” I mean “pass the functions as arguments into the function itself.”) Choose can then choose to only execute one of the functions: the one that produces the output it needs. The other function is not invoked, so the list it generates will not be generated.

    let
      Numbers = () => { 1, 2, 3 },
      Letters = () => { "A", "B", "C" },
      Choose = (chooseFirst as logical, first, second) => 
        if chooseFirst then first() else second()
    in
      Choose(true, Numbers, Letters)


    Technically, the M interpreter still evaluates both Numbers and Letters. However, since these names now identify functions, evaluating them simply takes their function expressions and turns them into function values.

    The two function values are passed into function Choose which then invokes only ONE of them.
    For short, we say that we passed two functions into Choose which chose which of the two to invoke (how’s that for a tongue twister!).

    (...)

    Best, Anthony

    • Edited by anthony34 Sunday, February 24, 2019 10:59 AM
    Sunday, February 24, 2019 10:47 AM
  • Hi Anthony!

    Yes, I already saw this excellent post by Ben and reread it before coming here.

    What was not clear to me was that the expression evaluated to get the function in my 1st example is not the function body; it is just a let expression that returns a function value.

    I think that this is because Table.SelectRows evaluates this expression only once before invoking the resulting function multiple times.

    This transfer of an immutable let member into the function value is a kind of magic which can help in a lot of ways.
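
    The pattern can be sketched in a reusable form, assuming a table with a Number column as in the question. MakeIsMinimum is a hypothetical helper name, not something from the thread or the standard library. It is a function whose body is a let expression that returns another function, so the returned function carries the let member minimum with it:

    ```powerquery
    let
        MyTable = #table(type table [Number = number], List.Zip({List.Repeat({1..10}, 10)})),
        // Hypothetical helper: builds a row predicate that closes over the let member minimum
        MakeIsMinimum = (source as table) as function =>
            let
                // Belongs to the surrounding let expression, not to the body of the returned function
                minimum = List.Min(source[Number])
            in
                (row as record) as logical => row[Number] = minimum,
        // MakeIsMinimum is called once; the result is a function value
        IsMinimum = MakeIsMinimum(MyTable),
        // Table.SelectRows invokes that function value for each row
        FilteredTable = Table.SelectRows(MyTable, IsMinimum)
    in
        FilteredTable
    ```

    By contrast, moving the let inside the inner function, as in (row) => let minimum = ... in row[Number] = minimum, would make minimum part of the function body again.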

    Best wishes,

    Maxim




    Sunday, February 24, 2019 11:51 AM