What is the best way to add all the fields of a record to a table as columns?

  • Question

  • We obviously can do it like this (updated in order to keep the variants close to each other from a performance point of view):

    r = some record,
    t = some table,
    fl = Record.FieldNames(r),
    s = Table.AddColumn(t, "r", each r),
    Result = Table.ExpandRecordColumn(s, "r", fl)

    But my overall feeling is that the Expand* functions slow down the "compiling" part of the evaluation (when there is no sign of activity in the lower-left corner of the PQ UI). So what about this one:

    r = some record,
    t = some table,
    s1 = List.Accumulate(Record.FieldList(r), t, (v, i) => Table.AddColumn(v, i, Record.Field(r, i))

    Is it the same from a performance point of view, by design?
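For readers more at home outside M, the two variants can be sketched in Python, treating a record as a dict and a table as a list of dicts (all names and values here are illustrative, not part of the thread's queries):

```python
# Python analogy of the two M variants above: a "table" is a list of
# dicts and a "record" is a dict.
from functools import reduce

t = [{"A": 1}, {"A": 2}]          # some table
r = {"D": 7, "E": 8}              # some record

# Variant 1: add the record as one column, then "expand" its fields.
with_col = [dict(row, r=r) for row in t]
expanded = [{**{k: v for k, v in row.items() if k != "r"}, **row["r"]}
            for row in with_col]

# Variant 2: List.Accumulate -- fold over the field names, adding one
# column per field.
accumulated = reduce(
    lambda table, field: [dict(row, **{field: r[field]}) for row in table],
    r.keys(),
    t,
)

print(expanded == accumulated)    # both variants yield the same rows
```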



    Monday, July 29, 2019 3:16 PM

Answers

  • Hey!

    I take it this is a purely theoretical question. There are ways to "combine" records, but note that a record and a table are quite similar yet, at the same time, quite different. I usually tell people that a record is like a table with only one row.

    The situation here is that a table can have as many rows as you'd like, so how do you assign that single record to multiple rows?

    If you look at how it's done with the Merge operation, you'll see that it creates a "NewColumn" that you can later expand. At that point you need to figure out the names of the columns that you'll be expanding, which is where you either hardcode those column names or use an M function such as Table.ColumnNames to grab them. Either way, you're bound to first fetch the column names and only then pass them to one of the Expand operations.

    This is code that I created just for this thread. I've never needed it before, but it exists.

    let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIGYlOlWJ1oJSMgywSIzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [A = _t, B = _t, C = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"B", Int64.Type}, {"C", Int64.Type}}),
        MyRecord = [D=7,E=8],
        Custom2 = #"Changed Type",
        #"Added Custom" = Table.AddColumn(Custom2, "Custom", each _ &MyRecord),
        Custom = Table.FromRecords( #"Added Custom"[Custom])
    in
        Custom
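The key step above is `each _ & MyRecord`, which merges each row (itself a record) with MyRecord. A rough Python analogy of that merge, with dicts standing in for records (sample values are made up):

```python
# Rough Python model of "each _ & MyRecord": each row is a record (dict),
# and "&" merges two records, with the right operand winning on name
# clashes.
rows = [{"A": 1, "B": 4, "C": 7}, {"A": 2, "B": 5, "C": 8}]
my_record = {"D": 7, "E": 8}

merged = [{**row, **my_record} for row in rows]   # each _ & MyRecord
print(merged[0])                # {'A': 1, 'B': 4, 'C': 7, 'D': 7, 'E': 8}

# Like M's "&" on records, the right-hand side wins on a shared field:
print({**{"A": 1}, **{"A": 99}})  # {'A': 99}
```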

    Tuesday, July 30, 2019 4:28 AM
  • Based on Miguel's Query - of course, we can avoid Table.AddColumn.

    I mean something like this:

    let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIGYlOlWJ1oJSMgywSIzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [A = _t, B = _t, C = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"B", Int64.Type}, {"C", Int64.Type}}),
        MyRecord = [D=7,E=8],
        Custom1 = Table.FromRecords(List.Transform(Table.ToRecords(#"Changed Type"), each _ & MyRecord))
    in
        Custom1

    Tuesday, July 30, 2019 10:47 AM

All replies

  • Nicely done Miguel, with Andrey's definitions it summarizes as:

    Table.FromRecords(Table.AddColumn(t, "new col", each _ & r )[new col])

    Do you think this list could be created directly, without going through Table.AddColumn?

    Table.AddColumn(t, "new col", each _ & r )[new col]

    Another question: is it really faster than Table.ExpandRecordColumn?



    • Edited by anthony34 Tuesday, July 30, 2019 7:35 AM
    Tuesday, July 30, 2019 7:29 AM
  • I wouldn't be able to tell you if it's faster or not. One function that you can't really run away from is Table.AddColumn.

    As the function name states, it adds a single new column. If there were a Table.AddColumns function available and we could pass our own function to it, then that would be exactly what Andrey is looking for.

    By the way, I remember now where I first saw that "&" used: it was in one of the sample Power BI Custom Connectors.

    Something also quite interesting is that it works with lists, but at that point it's like you're implicitly doing a List.Combine.
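A hypothetical Table.AddColumns of the kind described above can be emulated with a fold, just like the List.Accumulate in the original question. A Python sketch of that idea (the `add_columns` helper and its data are invented for illustration):

```python
# Sketch of the missing "Table.AddColumns": emulate adding several
# columns in one pass with a fold, mirroring M's List.Accumulate.
from functools import reduce

def add_columns(table, new_cols):
    """Add every (name, value) pair in new_cols to each row of the table."""
    return reduce(
        lambda tbl, item: [dict(row, **{item[0]: item[1]}) for row in tbl],
        new_cols.items(),
        table,
    )

t = [{"A": 1}, {"A": 2}]
print(add_columns(t, {"D": 7, "E": 8}))

# Note: "&" on M lists behaves like Python's list concatenation:
# {1, 2} & {3}  ~  [1, 2] + [3]
```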

    Tuesday, July 30, 2019 10:29 AM
  • Thanks Bill, exactly what I had in mind.

    With Andrey's definitions:

    let
        r = #"some record",
        t = #"some table"
    in
        Table.FromRecords(List.Transform(Table.ToRecords(t), each _ & r))
    Tuesday, July 30, 2019 11:19 AM
  • You're right. Completely forgot about turning that table into a list of records and using List.Transform against it. Record.Combine would be the explicit record field combination.
    Tuesday, July 30, 2019 2:16 PM
  • Guys, hi both of you!

    First of all, thanks for your participation.

    Just a reminder that the question is what is faster - do you have any experience comparing these approaches? Thanks to your discussion, though, I understood that I should have added "at the same level of type safety".

    I have to mention that if you use the Table.FromRecords(List.Transform(Table.ToRecords(...))) approach, you lose all the column types, so all of them will be of type any. The same goes for the "_ & r" approach.

    Of course I didn't mention typing in my example, but both of my variants allow adding types with noticeably less overhead (just for the new fields). In the case of Table.ToRecords, you have to re-type the whole table, and if it has a dynamic structure you'll have to make a double call to the source in order to get the typing.

    And performance is not a theoretical question if you have a somewhat complex ETL.
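The typing point can be sketched abstractly: going through plain records forces re-declaring the whole schema, while column-wise addition only needs types for the new fields. A toy Python model of that bookkeeping (the schema dicts are hypothetical, not an M API):

```python
# Illustrative sketch of the typing argument: once a typed table is
# turned into plain records, the column types are gone and the whole
# schema must be re-declared; adding columns only requires typing the
# new fields.
schema = {"A": int, "B": int, "C": int}     # types of the source table
new_fields = {"D": int, "E": int}           # types of the added record

# Expand-style: extend the existing schema with just the new fields.
schema_expand = {**schema, **new_fields}

# ToRecords-style: the old schema is lost; every column degrades to
# "any" (object here) unless the full schema is supplied again.
schema_records = {name: object for name in list(schema) + list(new_fields)}

print(schema_expand)   # per-column types preserved
print(schema_records)  # everything degraded to "any"
```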
    Tuesday, July 30, 2019 10:40 PM
  • Have you tried doing some performance tests? I'd love to know your findings, especially if you run those tests on a Mac.

    Don't forget that in a concrete scenario, performance depends on how query folding behaves, on your data source, and on other factors directly related to that source. Not to mention that how you calculate things can affect how they get lazily evaluated and loaded - and this is quite critical when building a Custom Connector, where you want to take full advantage of lazy evaluation.

    Wednesday, July 31, 2019 2:47 AM
  • Here are some results.

    As usual, everything depends on the data and the context, but it shows a trend.
    I "measured" with tables of 1M and 3M rows, using Process Monitor.
    Note that I did not care about keeping the types.


    Table.ExpandRecordColumn
    1M rows = 1.5 ms
    3M rows = 8 ms

    let
        t = #table(
                    type table [col1=number, col2=number, col3=number],
                    List.Zip({{1..3E6},{1..1E6},{1..1E6}})
        ),
        r = [rec1="a", rec2=70, rec3={"a",70}],
        Source1 = Table.AddColumn(t, "r", each r),
        Result1 = Table.ExpandRecordColumn(Source1, "r", Record.FieldNames(r))
    in Result1


    List.Accumulate
    1M rows =  ms
    3M rows =  ms
    to be updated


    Table.FromRecords
    1M rows = 6 ms
    3M rows = 17.5 ms

    let
        t = #table(
                    type table [col1=number, col2=number, col3=number],
                    List.Zip({{1..3E6},{1..1E6},{1..1E6}})
        ),
        r = [rec1="a", rec2=70, rec3={"a",70}],
        Result3 = Table.FromRecords(List.Transform(Table.ToRecords(t), each _ & r))
    in Result3
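To sanity-check that both strategies at least agree on the output, here is a rough Python analogue of the two paths (this times a Python model, not the M engine, so the numbers prove nothing about Power Query; names and sizes are illustrative):

```python
# Python analogue of the two tested paths: (1) add the record as a
# column and expand it, vs (2) merge the record into each row directly.
import time

n = 100_000
t = [{"col1": i, "col2": i, "col3": i} for i in range(n)]
r = {"rec1": "a", "rec2": 70, "rec3": ("a", 70)}

def bench(label, fn):
    start = time.perf_counter()
    out = fn()
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return out

two_step = bench("add column + expand", lambda: [
    {**{k: v for k, v in row.items() if k != "r"}, **row["r"]}
    for row in [dict(row, r=r) for row in t]
])
direct = bench("direct merge", lambda: [{**row, **r} for row in t])
assert two_step == direct  # both paths produce identical rows
```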



    • Edited by anthony34 Wednesday, July 31, 2019 8:30 AM
    Wednesday, July 31, 2019 8:26 AM
  • Hi Andrey, what is Record.FieldList? I cannot replicate your query.

    • Edited by anthony34 Wednesday, July 31, 2019 9:16 AM
    Wednesday, July 31, 2019 8:28 AM
  • Record.FieldNames, sorry for the mistake!
    Wednesday, July 31, 2019 2:42 PM
  • Still not working:

    s1 = List.Accumulate(Record.FieldNames(r), t, (v, i) => Table.AddColumn(v, i, Record.Field(r, i)))
    Wednesday, July 31, 2019 4:11 PM
  • Anthony, sorry for the late answer.

    I meant this code (we should add "each" before "Record.Field"):

    s1 = List.Accumulate(Record.FieldNames(r), t, (v, i) => Table.AddColumn(v, i, each Record.Field(r, i)))
    Sunday, August 18, 2019 2:12 PM
  • I believe that via Table.FromRecords we somehow replicate the inner behavior of Table.ExpandRecordColumn, but in an inefficient way. I mean that Table.ExpandRecordColumn COULD be written internally not via other M functions but via lower-level code, so it could be optimised.


    Maxim Zelensky Excel Inside

    Sunday, August 25, 2019 11:19 AM