none
How to preserve the Columns Types after a Table.Group / Table.Expand RRS feed

  • Question

  • Hi
    When using Table.Group (all columns) and then Table.Expand, all the columns types are lost.
    How to preserve the columns types ?


    • Edited by anthony34 Wednesday, January 30, 2019 1:36 PM
    Wednesday, January 30, 2019 1:15 PM

Answers

  • It just occurred to me that Aleksei provided a solution for the case where the number of columns are reduced. :) 

    A function, such as the following, is useful if your issue is a frequent occurrence. I call the function Table_PreserveColumnTypes:

    (baseTable as table, processedTable as table) =>
    let
        baseTableType = Value.Type(baseTable),
        CommonColumnNames = List.Intersect({Table.ColumnNames(baseTable), Table.ColumnNames(processedTable)}),
        processedTableNewColumnTypes =
            List.Transform(
                CommonColumnNames,
                each {_} & {Type.TableColumn(baseTableType, _)}
            ),
        appliedColumnTypes = Table.TransformColumnTypes(processedTable, processedTableNewColumnTypes)
    in
        appliedColumnTypes

    The baseTable is the table before the Group and Expand steps, and processedTable is the table that results after the Expand step (with lost column types).

    Note that any new column(s) added in the group step would have to be type changed in a separate step.

    You can convert Aleksei's version into a function too.


    • Edited by Colin Banfield Wednesday, January 30, 2019 8:35 PM
    • Marked as answer by anthony34 Thursday, January 31, 2019 10:32 PM
    Wednesday, January 30, 2019 7:08 PM
  • Hi, anthony34,

    Suppose you have such a table:

    YourTable

    Then next code should work:

    let
        group = Table.Group(YourTable, {"A"}, {"temp", each _}),
        expand = Table.ExpandTableColumn(group, "temp", {"C"}),
        cols = Table.ColumnNames(expand),
        schema = Table.SelectRows(Table.Schema(YourTable), each List.Contains(cols,[Name])),
        types = Table.TransformColumnTypes(expand, List.Zip({schema[Name],List.Transform(schema[TypeName], each Expression.Evaluate(_, #shared))}))
    in
        types
    • Marked as answer by anthony34 Thursday, January 31, 2019 10:16 AM
    Wednesday, January 30, 2019 4:00 PM
  • Hi Anthony,

    Aleksei's version seems to work perfectly in my case where the number of columns increased.

    The code I posted provides exactly the same results as Aleksei's, even when the number of columns is increased.

    I tested both versions with a much larger dataset. I added the custom function after the Expand step and there was no discernable difference in refresh times (with or without the custom function step). Alexsei's version of the code took a few additional seconds to refresh (which you probably won't notice with a table having three rows and three columns). 

    However, based on the example you provided, I see no need for any special code.

    For example, using your sample table, we can keep the column types intact by using the following code:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
        ChangedType = Table.TransformColumnTypes(Source,{{"Col1", type text}, {"Col2", Int64.Type}}),
        AddedIndex = Table.AddIndexColumn(ChangedType, "Index", 1),
        GroupedRows = Table.Group(AddedIndex, {"Index"}, {{"DATA", (i)=> Table.AddColumn(i, "New Result Col", each [Index] * 2, Int64.Type), type table}}),
        CombinedTables = Table.Combine(GroupedRows[DATA]),
        RemovedIndexColumn = Table.RemoveColumns(CombinedTables,{"Index"})
    in
        RemovedIndexColumn

    The CombinedTables step, as mentioned in my earlier post, is the key to preserving the column types.

    • Marked as answer by anthony34 Thursday, January 31, 2019 10:05 PM
    Thursday, January 31, 2019 6:38 PM

All replies

  • Hi, anthony34,

    Suppose you have such a table:

    YourTable

    Then next code should work:

    let
        group = Table.Group(YourTable, {"A"}, {"temp", each _}),
        expand = Table.ExpandTableColumn(group, "temp", {"C"}),
        cols = Table.ColumnNames(expand),
        schema = Table.SelectRows(Table.Schema(YourTable), each List.Contains(cols,[Name])),
        types = Table.TransformColumnTypes(expand, List.Zip({schema[Name],List.Transform(schema[TypeName], each Expression.Evaluate(_, #shared))}))
    in
        types
    • Marked as answer by anthony34 Thursday, January 31, 2019 10:16 AM
    Wednesday, January 30, 2019 4:00 PM
  • Hi Anthony,

    The solution depends on whether or not you have the same columns (in name and number), after the table expansion.

    If you are not filtering the expand list, then Table.Combine is a better option, as it will retain the column types.

    On the other hand, if the names or number of columns or both are different from the original table, you need use other techniques to handle this situation.

    Wednesday, January 30, 2019 4:57 PM
  • It just occurred to me that Aleksei provided a solution for the case where the number of columns are reduced. :) 

    A function, such as the following, is useful if your issue is a frequent occurrence. I call the function Table_PreserveColumnTypes:

    (baseTable as table, processedTable as table) =>
    let
        baseTableType = Value.Type(baseTable),
        CommonColumnNames = List.Intersect({Table.ColumnNames(baseTable), Table.ColumnNames(processedTable)}),
        processedTableNewColumnTypes =
            List.Transform(
                CommonColumnNames,
                each {_} & {Type.TableColumn(baseTableType, _)}
            ),
        appliedColumnTypes = Table.TransformColumnTypes(processedTable, processedTableNewColumnTypes)
    in
        appliedColumnTypes

    The baseTable is the table before the Group and Expand steps, and processedTable is the table that results after the Expand step (with lost column types).

    Note that any new column(s) added in the group step would have to be type changed in a separate step.

    You can convert Aleksei's version into a function too.


    • Edited by Colin Banfield Wednesday, January 30, 2019 8:35 PM
    • Marked as answer by anthony34 Thursday, January 31, 2019 10:32 PM
    Wednesday, January 30, 2019 7:08 PM
  • Hi Aleksei and Colin,
    Thanks for your help.

    What I am doing is building a function that transforms an input Table into the same table with additional result columns.

    To calculate the result column(s), I have to group the source table, and then expand it back, which causes the loss of the original column types

    And obviously I want to preserve the column types from the input Table.

    And because the number of columns changes, I could not simply use

    Value.ReplaceType(#"Result", Value.Type(#"Source"))


    Here a simplified case:

    input:


    result without column types:

    M Code

    let
        Source = #table( type table [Col1=text, Col2=Int64.Type], {{"AAA",10},{"BBB",20},{"CCC",30}} ),
        AddIndex = Table.AddIndexColumn(Source, "Index", 1, 1),
        Group = Table.Group(AddIndex, {"Index"}, {{"DATA", each _, type table}}),
        AddResultCol = Table.AddColumn(Group, "New Result Col", each [Index] * 2, Int64.Type), //fake calculation
        HeaderCol=Table.ColumnNames(Source), 
        ExpandDATA = Table.ExpandTableColumn(AddResultCol, "DATA", #"HeaderCol", #"HeaderCol"),
        RemColIndex = Table.RemoveColumns(ExpandDATA,{"Index"}),
        ResultWithoutColType = RemColIndex


    Before I got your replies, I found a way to bypass the Value.ReplaceType() limitation:
    Since I knew what the result column(s) will be, I just created a fake result table with all the column types.
    I created from the original input table.
    Then I used this "fake" table to transfer the column types to the result table using Value.ReplaceType()

    Here the new code:

    let
        Source = #table( type table [Col1=text, Col2=Int64.Type], {{"AAA",10},{"BBB",20},{"CCC",30}} ),
        AddIndex = Table.AddIndexColumn(Source, "Index", 1, 1),
        Group = Table.Group(AddIndex, {"Index"}, {{"DATA", each _, type table}}),
        AddResultCol = Table.AddColumn(Group, "New Result Col", each [Index] * 2, Int64.Type), //fake calculation
        HeaderCol=Table.ColumnNames(Source), 
        ExpandDATA = Table.ExpandTableColumn(AddResultCol, "DATA", #"HeaderCol", #"HeaderCol"),
        RemColIndex = Table.RemoveColumns(ExpandDATA,{"Index"}),
        ResultWithoutColType = RemColIndex,
    
    //transfering the original column types
        FakeResultTable = Table.AddColumn(Table.FirstN(#"Source",1), "New Result Col", each null, Int64.Type),
        ColTypeTransfer = Value.ReplaceType(#"ResultWithoutColType", Value.Type(#"FakeResultTable")),
        ResultWithColType = ColTypeTransfer
    
    in ResultWithColType


    result with the original column types preserved:




    • Edited by anthony34 Thursday, January 31, 2019 10:18 AM
    Thursday, January 31, 2019 9:44 AM
  • Hi Aleksei,

    It works perfectly and is exactly what I was looking for.

    Table.Schema was actually my first move, but I could not manage to get to your each Expression.Evaluate(_, #shared)) that makes it works.

    Thank you



    • Edited by anthony34 Thursday, January 31, 2019 10:25 AM
    Thursday, January 31, 2019 10:08 AM
  • Here exact same result using Aleksei's code:

    let
        Source = #table( type table [Col1=text, Col2=Int64.Type], {{"AAA",10},{"BBB",20},{"CCC",30}} ),
        AddIndex = Table.AddIndexColumn(Source, "Index", 1, 1),
        Group = Table.Group(AddIndex, {"Index"}, {{"DATA", each _, type table}}),
        AddResultCol = Table.AddColumn(Group, "New Result Col", each [Index] * 2, Int64.Type), //fake calculation
        HeaderCol=Table.ColumnNames(Source), 
        ExpandDATA = Table.ExpandTableColumn(AddResultCol, "DATA", #"HeaderCol", #"HeaderCol"),
        RemColIndex = Table.RemoveColumns(ExpandDATA,{"Index"}),
        ResultWithoutColType = RemColIndex,
    
    //Aleksei solution to transfer the original column types  
        cols = Table.ColumnNames(ResultWithoutColType),
        schema = Table.SelectRows(Table.Schema(#"Source") [[Name],[TypeName]], each List.Contains(cols,[Name])),
        types = Table.TransformColumnTypes(#"ResultWithoutColType", List.Zip({schema[Name],List.Transform(schema[TypeName], each Expression.Evaluate(_, #shared))}))
    in type



    • Edited by anthony34 Thursday, January 31, 2019 10:19 AM
    Thursday, January 31, 2019 10:12 AM
  • Colin,

    Aleksei's version seems to work perfectly in my case where the number of columns increased.

    It works because of that smart Table.SelectRows that limits the Type transfer only to the Source Columns:

    = Table.SelectRows(Table.Schema(#"Source") [[Name],[TypeName]], each List.Contains(cols,[Name]))


    • Edited by anthony34 Thursday, January 31, 2019 10:23 AM
    Thursday, January 31, 2019 10:16 AM
  • Hi Anthony,

    Aleksei's version seems to work perfectly in my case where the number of columns increased.

    The code I posted provides exactly the same results as Aleksei's, even when the number of columns is increased.

    I tested both versions with a much larger dataset. I added the custom function after the Expand step and there was no discernable difference in refresh times (with or without the custom function step). Alexsei's version of the code took a few additional seconds to refresh (which you probably won't notice with a table having three rows and three columns). 

    However, based on the example you provided, I see no need for any special code.

    For example, using your sample table, we can keep the column types intact by using the following code:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
        ChangedType = Table.TransformColumnTypes(Source,{{"Col1", type text}, {"Col2", Int64.Type}}),
        AddedIndex = Table.AddIndexColumn(ChangedType, "Index", 1),
        GroupedRows = Table.Group(AddedIndex, {"Index"}, {{"DATA", (i)=> Table.AddColumn(i, "New Result Col", each [Index] * 2, Int64.Type), type table}}),
        CombinedTables = Table.Combine(GroupedRows[DATA]),
        RemovedIndexColumn = Table.RemoveColumns(CombinedTables,{"Index"})
    in
        RemovedIndexColumn

    The CombinedTables step, as mentioned in my earlier post, is the key to preserving the column types.

    • Marked as answer by anthony34 Thursday, January 31, 2019 10:05 PM
    Thursday, January 31, 2019 6:38 PM
  • Hi Colin,

    In your first reply, you mentionned  Table.Combine  without providing any code to illustrate, and I could not understand it. It might be obvious for you, but I did not figure out we could use Table.Combine to "expand" a table. Nevertheless that it was preserving the types.

    GroupedRows[DATA] is actually a list of tables which makes it a good input for Table.Combine.

    I thank you to have posted the code that makes it clear.

    Your Table_PreserveColumnTypes is also very smart. It took me some time to understand what you mean about the limitation with Aleksei's version. Your function is actually perfect for my need.

    In addition, to be didactical, I posted a simplified case that illustrate my question. The actual function is nothing that simple, and will be used many times a day. Then I will definitively appreciate any speed gain that Table.Combine can provide.



    • Edited by anthony34 Thursday, January 31, 2019 10:40 PM
    Thursday, January 31, 2019 10:18 PM
  • Hi Anthony,

    In your first reply, you mentionned  Table.Combine  without providing any code to illustrate, and I could not understand it. It might be obvious for you, but I did not figure out we could use Table.Combine to "expand" a table. Nevertheless that it was preserving the types.

    I apologize for my oversight. I consider you be a somewhat advanced M coder, and simply assumed that you were aware of the use of Table.Combine. There are many examples from other posts on this site. 

    Your Table_PreserveColumnTypes is also very smart. It took me some time to understand what you mean about the limitation with Aleksei's version.

    Well, both produce the same result, but one is a bit faster. By the way, I forgot that I modified the code about an hour and a half after I first posted it, so you must have picked up the original code, which gave you the error.

    Then I will definitively appreciate any speed gain that Table.Combine can provide.

    Yeah, if Table.Combine works for your scenarios, it's the best way to go. One scenario where it won't work is if after Grouping (and before combining), you add a new column to the table where DATA is a column. In this case, the new column(s) would be lost, because it's not a column in DATA tables. 


    Thursday, January 31, 2019 11:30 PM
  • still learning ...
    Friday, February 1, 2019 7:34 AM
  • still learning ...

    So do I... & Thanks guys
    Friday, February 1, 2019 7:38 AM
  • Still learning too. It never ends.
    Friday, February 1, 2019 4:07 PM
  • Hi Colin,
    it does not work when there is a column of type "Table" in the baseTable

    Wednesday, May 15, 2019 2:33 PM