none
PQ Tunnelling optimisation candidates RRS feed

  • Question

  • I wonder whether PQ Tunneling optimization takes place, and what can we expect from it.

    I. E.g. let's consider this code:

    Section E;
    
    shared Q1 = let
        r = [f1 = 1, f2 = 2],
        t = some table,
        t1 = Table.AddColumn(t, "r", each r),
        t2 = Table.ExpandRecordColumn(t1, "r", {"f1", "f2"}),
    
        a number of other complex steps, but without influence on columns f1 and f2 of t2
    in tN;
    
    shared Q2 = let
        a number of complex steps, but without influence on columns f1 and f2 of Q1
        t1 = Table.RemoveColumns(Q1, "f2")
    in t1

    No any type of Buffer anywhere.

    So the question is:

    From internal point of view, for calculation of Q2 (e.g. f only Q2 is uploaded to model and we update data in the model (not work in PQ UI)) - is this variant different in comparison with variant when f2 won't be used at all in all the queries?

    Please could you give the answer for both cases:

    1. f2 is hardcoded like in the example above.

    2. f2 is in dynamically evaluated, i.e. dynamically computed based on real data got during evaluation, like this:

    Section E;
    
    shared r = get record from some source, dynamically computing [f1 = 1, f2 = 2];
    
    shared fl = Record.FieldNames(r);
    
    shared Q1 = let
        t = some table,
        t1 = Table.AddColumn(t, "r", each r),
        t2 = Table.ExpandRecordColumn(t1, "r", fl),
    
        a number of other complex steps, but without influence on columns f1 and f2 of t2
    in tN;
    
    shared Q2 = let
        a number of complex steps, but without influence on columns f1 and f2 of Q1
        t1 = Table.RemoveColumns(Q1, fl{1})
    in t1

    II. Does using variable really optimize that expression:

    r = some record,
    fl = Record.FieldNames(r),
    t1 = Table.AddColumns(t, "blabla", each fl)
    
    versus
    
    t1 = Table.AddColumns(t, "blabla", each Record.FieldNames(r))

    Obvious answer would be yes, because in the second example Record.FieldNames(r) is called for every row of the table.

    But does it really take place, or PQ tunnel optimization algorithm substitutes internally Record.FieldNames(r) by some sort of internal variable, so both the variant are actually equal, from performance point of view? Just because in both the variants Record.FieldNames(r) will be called just once?

    Monday, July 29, 2019 8:14 PM

All replies

  • Hi Andrey. I'm not 100% sure what you're asking. (I'm also unfamiliar with what you mean by "tunnelling".) Can you clarify?

    Ehren

    Tuesday, August 6, 2019 10:11 PM
    Owner
  • Like Ehren, I'm also not sure what you mean by "tunneling". The use of abstract examples or a mix of code and pseudocode also makes it a little harder to understand the question(s).

    One thing I can tell you in this area is that in the following invocation

    Table.AddColumn(table, "Column Name", each expression())

    we do go through a phase we call "constant folding" which tries to simplify "each expression()" by doing things like inlining external functions and evaluating functions for which all the input arguments are known. You can see this most notoriously by adding a column produced by Number.Random() and observing that when you fill to a worksheet you get the same value in all cells. This is because we don't internally have a notion of a non-functional function, so the Number.Random() call gets evaluated once and inlined into a new function.

    Wednesday, August 7, 2019 7:02 PM
  • Hi Andrey,

    did Curts answer solve your question? Please mark it as answer then.

    If not please specify or close this thread.


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Friday, August 16, 2019 5:47 PM
    Moderator