Transform Column Types dynamically via PQ

  • Question

  • Hello,

    Is there a way to create a query that can handle dynamic source data, where the number of columns changes, column names change, the location of column names changes, and data types change? Many times my data has a mixture of data types (text columns, numerical columns, date columns, etc.). Ideally there would be a function that promotes and auto-detects each column header and detects the data type, without any hard-coded information in the syntax.

    Example:

    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),

    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"DUT receiver port name", type text}, {"minLineLimit", type number}, {"Uncert", type number}, {"US57210031", type number}, {"DUT CW frequency", type number}})

    This works for this file, but if I import a different csv file which has different column names and mixed data types, this would obviously fail with an error. I'm trying to make this part dynamic: {{"DUT receiver port name", type text}, {"minLineLimit", type number}, {"Uncert", type number}, etc...} for the Table.TransformColumnTypes function.


    Thanks for any help on this.

    Darin

    Thursday, November 30, 2017 11:33 PM

All replies

  • First create a custom function called Table_ColumnTypes. It takes a table as its only parameter:

    (table as table) as list =>
    let
        Top200Rows = Table.FirstN(table, 200), //we use up to 200 rows to establish a column type
        ColumnNameList = Table.ColumnNames(Top200Rows),
        ColumnDataLists = List.Accumulate(ColumnNameList, {}, (accumulated, i) => accumulated & {Table.Column(Top200Rows, i)}),
        ColumnTypes = List.Transform(ColumnDataLists, (i) => List.ItemType(i)),
             
            List.ItemType = (list as list) =>
            let            
                ItemTypes = List.Transform(list, each Value.Type(Value.FromText(_))),
                ListItemType = Type.Union(ItemTypes)
            in
                ListItemType
    in
        ColumnTypes

    In your main script, add the following lines after #"Promoted Headers"

    ColumnNames = Table.ColumnNames(#"Promoted Headers"),
    ColumnTypes = Table_ColumnTypes(#"Promoted Headers"),
    ChangedType = List.Accumulate(List.Positions(ColumnNames), #"Promoted Headers", (accumulated, i) => Table.TransformColumnTypes(accumulated, {ColumnNames{i}, ColumnTypes{i}}))
    
    The Table_ColumnTypes function is not as sophisticated as the logic Power Query uses to determine the column types in Table.TransformColumnTypes. Table_ColumnTypes returns primitive types only, and not subtypes such as Int64.Type, Percentage.Type, etc. In the future, I might extend the function to determine if an appropriate subtype applies to a given column.
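
    For anyone assembling the pieces, a minimal end-to-end sketch might look as follows (the file path, delimiter and sample source are assumptions; Table_ColumnTypes is the custom function defined above):

    let
        // hypothetical CSV file; replace the path and delimiter with your own
        Source = Csv.Document(File.Contents("C:\data\sample.csv"), [Delimiter = ","]),
        #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
        ColumnNames = Table.ColumnNames(#"Promoted Headers"),
        ColumnTypes = Table_ColumnTypes(#"Promoted Headers"),
        // apply the detected type of each column, one column at a time
        ChangedType = List.Accumulate(
                          List.Positions(ColumnNames),
                          #"Promoted Headers",
                          (accumulated, i) => Table.TransformColumnTypes(accumulated, {ColumnNames{i}, ColumnTypes{i}}))
    in
        ChangedType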

    • Marked as answer by LearningPQ Friday, December 1, 2017 5:09 PM
    Friday, December 1, 2017 12:54 AM
  • Hi Colin,

    Nice solution!

    My suggestions would be to use List.Zip to create a transformlist for the actual transformations.

    And why not include the transformations in the function, so the function returns a typed table?

    So after your step ColumnTypes:

        TransformList = List.Zip({ColumnNameList,ColumnTypes}),
        TypedTable = Table.TransformColumnTypes(table, TransformList),
    

    And the result of the function will then be TypedTable, so a table, not a list.

    Maybe rename the function to something like Table_TransformColumnTypes.
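
    To illustrate the shape List.Zip produces here, a tiny sketch (the column names and types are made up):

    let
        // List.Zip pairs the i-th column name with the i-th detected type,
        // which is exactly the list of {name, type} pairs Table.TransformColumnTypes expects
        TransformList = List.Zip({{"Name", "Amount"}, {type text, type number}})
        // TransformList = {{"Name", type text}, {"Amount", type number}}
    in
        TransformList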

    • Marked as answer by LearningPQ Friday, December 1, 2017 5:09 PM
    Friday, December 1, 2017 5:31 AM
  • Hi Marcel,

    "My suggestions would be to use List.Zip to create a transformlist for the actual transformations."

    Good suggestion! Much simpler than the technique I used.

    "And why not include the transformations in the function, so the function returns a typed table?"

    Good question. When I built the function some time ago, it was intended as a generic way to get a list of table types, when the source types are all "any" (e.g. Excel files), or all "text" (text files). At the time, I didn't have any particular solution in mind. My original goal was to validate lists passed to a function, when the list must be of some homogeneous type (e.g. all text, all numbers, etc.). List.ItemType was created for validating the list passed into the function. For similar purposes, I created a Record.FieldTypes function. For the sake of completeness, I decided well, what the heck, I'll create a Table.ColumnTypes function too. Table.ColumnTypes is the first custom function that I've created without knowing how it would be used. It was only later (well before this thread) that I realized that the perfect application would be for dynamic column typing.

    Anyway, that's the history. However, your point to create a Table_TransformColumnTypes function is probably valid, since I cannot think of any other scenario for the Table_ColumnTypes function. Only problem with the name Table_TransformColumnTypes is that it will clash with the native Table.TransformColumnTypes function after I recreate the function in M extensions. How about Table_ConvertColumnTypes? ;)

    The new Table_ConvertColumnTypes function would therefore be:

    (table as table) as table =>
    let
        Top200Rows = Table.FirstN(table, 200), //we use up to 200 rows to establish a column type
        ColumnNameList = Table.ColumnNames(Top200Rows),
        ColumnDataLists = List.Accumulate(ColumnNameList, {}, (accumulated, i) => accumulated & {Table.Column(Top200Rows, i)}),
        ColumnTypes = List.Transform(ColumnDataLists, (i) => List.ItemType(i)),
        TransformList = List.Zip({ColumnNameList, ColumnTypes}),
        TypedTable = Table.TransformColumnTypes(table, TransformList),
             
            List.ItemType = (list as list) =>
            let            
                ItemTypes = List.Transform(list, each Value.Type(Value.FromText(_))),
                ListItemType = Type.Union(ItemTypes)
            in
                ListItemType
    in
        TypedTable

    And the main code would be (after the #"Promoted Headers" step) :

    ChangedType = Table_ConvertColumnTypes(#"Promoted Headers")
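
    As a quick self-contained test of the function (the sample table is made up, with text values as you would get from a CSV import before typing):

    let
        Sample = #table({"Port", "Limit"}, {{"A1", "1.5"}, {"B2", "2"}}),
        Typed = Table_ConvertColumnTypes(Sample)
    in
        // "Port" should come out as text and "Limit" as number
        Typed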

    • Proposed as answer by MarcelBeug Friday, December 1, 2017 2:59 PM
    • Marked as answer by LearningPQ Friday, December 1, 2017 5:09 PM
    Friday, December 1, 2017 1:34 PM
  • This worked perfectly! Thanks for all the help and for the quick reply.
    Friday, December 1, 2017 5:10 PM
  • Just to share a potential adjustment to the function to prevent errors due to types that are not supported by Table.TransformColumnTypes.

    Additional step:
    
        ValidTypes = {type any, type number, type date, type datetime, type datetimezone, type time, type duration, type logical, type text, type binary},
    
    Adjusted step:
    
        TransformList = List.Select(List.Zip({ColumnNameList, ColumnTypes}), each List.Contains(ValidTypes,_{1})),
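
    A minimal sketch of what that filter does (the zipped list below is made up; the all-null column comes through with type null and is therefore dropped):

    let
        ValidTypes = {type any, type number, type date, type datetime, type datetimezone,
                      type time, type duration, type logical, type text, type binary},
        // one normal column and one column whose values were all null
        Zipped = {{"Port", type text}, {"Empty", type null}},
        TransformList = List.Select(Zipped, each List.Contains(ValidTypes, _{1}))
    in
        // TransformList = {{"Port", type text}}, so the all-null column is simply left untyped
        TransformList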
    
    Saturday, December 16, 2017 7:52 AM
  • Hi Marcel,

    Could you detail the scenario that you encountered that led to the extra step? Since the determination of type is from a function that returns primitive types, it's not clear how the function can return a type that is not compatible with Table.TransformColumnTypes.

    Saturday, December 16, 2017 1:53 PM
  • Hi Colin,

    After verifying, it is the situation where all column values are null.
    Table.TransformColumnTypes won't transform to type null, e.g. this won't work:

    let
        Source = #table(2,{{1,null},{null,null}}),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column2", type null}})
    in
        #"Changed Type"

    Actually I want to use this function to dynamically parse JSON input, as an answer to a question on the Power BI Community forum.

    I will take another look at excluding null values and Int64.Type (maybe also Percentage.Type).

    The strange thing about null values is that they actually conflict with primitive types: type text does not include nulls, whereas type nullable text does.
    It looks like the standard automatic type detection also excludes null values.

    I'll share my findings.

    Saturday, December 16, 2017 3:17 PM
  • Hi Marcel,

    Thank you. If a column contains all nulls, Table.TransformColumnTypes evaluates the column as type any. The percentage type isn't necessary in my opinion, since Value.FromText correctly interprets percentage text (e.g. "10%") as a number, and the data model doesn't support a percentage type.

    Updated function:

    (table as table) as table =>
    let
        Top200Rows = Table.FirstN(table, 200), //we use up to 200 rows to establish a column type
        ColumnNameList = Table.ColumnNames(Top200Rows),
        ColumnDataLists = List.Accumulate(ColumnNameList, {}, (accumulated, i) => accumulated & {Table.Column(Top200Rows, i)}),
        ColumnTypes = List.Transform(ColumnDataLists, (i) => List.ItemType(i)),
        TransformList = List.Zip({ColumnNameList, ColumnTypes}),
        TypedTable = Table.TransformColumnTypes(table, TransformList),
             
            List.ItemType = (list as list) =>
            let            
                ItemTypes = List.Transform(
                                list, 
                                each if Value.Type(Value.FromText(_)) = type null then type any 
                                     else if Value.Type(Value.FromText(_)) = type number then
                                             if Number.IntegerDivide(_, 1) = _ then Int64.Type else type number
                                     else Value.Type(Value.FromText(_))
                            ),
                ListItemType = Type.Union(ItemTypes)
            in
                ListItemType
    in
        TypedTable

    Let me know if you encounter any other issues.

    Saturday, December 16, 2017 5:58 PM
  • If there is at least one non-null value in the column, Table.TransformColumnTypes will set the type of the column to the type of the non-null value (or values - if they are all of the same type).

    If this logic is preferred over using type any, then we can change the logic of List.ItemType as follows (In addition, I moved the logic for all nulls):

    List.ItemType = (list as list) =>
            let            
                ItemTypes = List.Transform(
                                list, 
                                each if Value.Type(Value.FromText(_)) = type number then
                                         if Number.IntegerDivide(_, 1) = _ then Int64.Type else type number
                                     else Value.Type(Value.FromText(_))
                            ),
                ListItemType = if Type.Union(ItemTypes) = type null then type any 
                               else Type.Union(List.RemoveNulls(ItemTypes))
            in
                ListItemType




    Sunday, December 17, 2017 3:47 AM
  • Hi Colin,

    The line:
    if Number.IntegerDivide(_, 1) = _ then Int64.Type else type number
    should be:
    if Number.IntegerDivide(Value.FromText(_), 1) = Value.FromText(_) then Int64.Type else type number
    (at that point _ is still the raw source value, which may be text, while Value.FromText(_) is the converted number).

    Although I have another solution in this pbix file (along with the corrected version of your code).

    In that file you can also find examples, and the results from my further proceedings (in function Table_ConvertColumnTypes_MB, which I - by the way - adapted based on your previous post):

    1. I have another solution for all nulls (see the sketch after this list):
      In the List.Transform step, I replace type null by type none.
      This way, the original Type.Union can be used (without if..then..else):
      type none with 1 other type returns the other type;
      all types none returns type any.
    2. I use Int64.From to check for integer values.
      Basically it does the same as Number.IntegerDivide, but I think it's more native.
      So just a personal preference.
    3. I also account for Percentage.Type.
      It is still a type in Power Query, and values can be formatted as percentage in the data model.
    4. I still use the list with valid types to exclude columns with structured values:
      lists, tables, records, functions.
      I didn't think about this yesterday when I mentioned that type null couldn't be handled by Table.TransformColumnTypes: this also applies to structured types.
    5. I added an optional culture parameter, so the function can also be used with a culture code.
      I included an example with Dutch culture "nl-NL" (with decimal commas and dd-MM-yyyy date format). 
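
    A minimal sketch of point 1 above (illustrative only, with assumed names; not the exact code from the pbix file):

    List.ItemType = (list as list) =>
        let
            // map nulls to type none so they effectively drop out of Type.Union:
            // per point 1, none combined with one other type yields that other type,
            // and a column of only nulls ends up shown as "any"
            ItemTypes = List.Transform(list, each if _ = null then type none
                                                  else Value.Type(Value.FromText(_))),
            ListItemType = Type.Union(ItemTypes)
        in
            ListItemType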

    The result looks to me like a nice synergistic result from our joined efforts :-).
    I leave it up to you to share the final code over here (you may still want to adjust my final code).

    Sunday, December 17, 2017 6:08 AM
  • Hi Marcel,

    Great work! You've obviously put a lot of effort into improving the function.

    I have another solution for all nulls: In the List.Transform step, I replace type null by type none.
    This way, the original Type.Union can be used (without if..then..else): type none with 1 other type returns the other type; all types none returns type any.

    Brilliant!

    I use Int64.From to check for integer values. Basically it does the same as Number.IntegerDivide, but I think it's more native. So just a personal preference.

    Nope. Int64.From is better, period.

    I also account for Percentage.Type. It is still a type in Power Query, and values can be formatted as percentage in the data model.

    Well, Percentage.Type in the data model is decimal number. So the ability to format as percentage is not relevant to the argument supporting Percentage.Type. Also, Percentage.Type cannot be set from Excel data where numbers are formatted as percentage.

    One other thing. If we support Percentage.Type (which is fine, despite my reservations :)), then there is an even stronger argument for supporting Currency.Type, since it is a native type in the Excel data model. Would something like the following work (after the percent type check)?

    else if Text.Length(Text.Remove(Text.From(_, culture), {"0".."9", ".", ",", "-", "+", "e", "E", "(", ")", " "})) > 0 then Currency.Type

    I still use the list with valid types to exclude columns with structured values: lists, tables, records, functions. I didn't think about this yesterday when I mentioned that type null couldn't be handled by Table.TransformColumnTypes: this also applies to structured types.

    Fair enough. However, wouldn't it be simpler to exclude the six invalid types (list, record, table, function, type, none), than to include twelve valid types (thirteen, if we include Currency.Type), since the exclude list is much smaller?

    I added an optional culture parameter, so the function can also be used with a culture code. I included an example with Dutch culture "nl-NL" (with decimal commas and dd-MM-yyyy date format)

    That's fine.

    Sunday, December 17, 2017 5:32 PM
  • Thanks Colin,

    Currency Type:

    It looks like I have similar reservations about Currency.Type as you have about Percentage.Type.
    But you are right: Currency.Type should also be included, as automatic data typing may also convert values to Currency.Type.
    The problem with currencies is that there is no general way to convert a number to text with the currency symbol of the applicable culture, or, in other words, to isolate the currency symbol from a value.
    So a workaround is required, as outlined by you: remove all characters from the input that can make up numbers, and if anything is left, then it must be a currency.
    Hm, sounds tricky, but alright. Except 2 more characters must be added: ' and /, as these are used as separators in some cultures.
    I prefer Text.ToList instead of constructing a list of separate characters, so this is my code for detecting currency types (indeed, after the check for percentage):

    if Text.Length(Text.Remove(Text.From(_, culture), {"0".."9"}&Text.ToList("., -+eE()/'"))) > 0 and
       Currency.From(_, culture) = Value.FromText(_, culture)
    then Currency.Type
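
    A quick sketch of what the Text.Remove part does (the sample value is made up):

    let
        Sample = "$1,234.56",
        // strip everything that can legitimately appear in a plain number;
        // whatever is left over is assumed to be a currency symbol
        LeftOver = Text.Remove(Sample, {"0".."9"} & Text.ToList("., -+eE()/'"))
    in
        // LeftOver = "$", so Text.Length(LeftOver) > 0 and the value is treated as currency
        LeftOver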


    Check for valid types instead of invalid types:

    The test on structured values (tables, lists, records, functions) may be tricky, because these are custom types.

    A test like type table[Id = number] = type table  returns false
    whereas = Type.Is(type table[Id = number], type table) returns true

    Even if 2 tables are created with the exact same specifications, and the types of those tables are compared, the result is false. So this code returns false, even whilst the code at either side of the equal sign is exactly the same:

    = Value.Type(#table(type table[Id = number],{{1}})) = Value.Type(#table(type table[Id = number],{{1}}))

    Only if a table is copied, then the types of the original and the copy are equal, so this returns true:

    let
        Table1 = #table(type table[Id = number],{{1}}),
        Table2 = Table1,
        CheckType = Value.Type(Table1) = Value.Type(Table2)
    in
        CheckType

    That's why I prefer the test on valid types, even if there are more valid types than invalid types.
    Or rather, there aren't really more, considering there is an infinite number of specific table types, record types, list types and function types.

    I updated my pbix file in my previous post with the Currency.Type addition.

    Monday, December 18, 2017 10:16 AM
  • Hi Marcel,

    As luck would have it, after arguing for Currency.Type, and adding the currency type check to the code (my version at the time), it turned out that my test column (which had the currency type in PQ) was interpreted as decimal number in the Excel data model (sigh!). However, when I used the standard Table.TransformColumnTypes function, the column was correctly interpreted as currency in the data model! Go figure...

    Anyway, the currency type works perfectly in Power BI, so to hell with PQ and its sometimes flakey interactions with the Excel data model.

    Note that the following portion of your currency check:

    and Currency.From(_,culture) = Value.FromText(_,culture)

    can fail if the original values contain more than 4 decimal places (which is why I didn't include this check in my version). It's not clear why the additional check is necessary, so I'd be interested in knowing the consequences of omitting it. I have a similar question regarding the test for percentage.

    What I had in mind for "invalid" types was a simple modification to your code:

    InvalidTypes = {type list, type record, type table, type function, type type, type none},

    and

    TransformList = List.Select(List.Zip({ColumnNameList, ColumnTypes}), each not List.Contains(InvalidTypes,_{1}))

    Oddly, I couldn't get these modifications to work until I included type none in the list. I got an "unrecognized type" error in the TypedTable step, which was strange because ColumnTypes did not, and cannot, contain a "none" type.

    Monday, December 18, 2017 3:35 PM
  • Currency and Percentage type:

    You are right. I took out the checks using Currency.From and Percentage.From and replaced the pbix in the link.

    Check for invalid types:

    As I explained in my previous post, the check on the type has some issues, and that's why
    not List.Contains(InvalidTypes, _{1})
    does not work.
    See for instance the example in my previous post: type table[Id = number] = type table returns false.
    So specific table types are not detected as invalid types with List.Contains.
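
    For reference, a small sketch contrasting the two checks (the sample type is made up):

    let
        SampleType = type table [Id = number],
        ByEquality = (SampleType = type table),        // false, as in the example above
        ByTypeIs = Type.Is(SampleType, type table)     // true
    in
        [EqualityCheck = ByEquality, TypeIsCheck = ByTypeIs]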

    Tuesday, December 19, 2017 3:10 AM
  • "So all specific table types are not detected as invalid type with List.Contains."

    The odd thing is that I tested InvalidTypes with a column of type types, and a column of table types, and it worked in both cases (TransformList filtered correctly). However, InvalidTypes failed with a column of function types. Now that I think about it, I have no idea why the table type column was filtered out in TransformList.

    At any rate, if we want to use InvalidTypes, we need a different approach to filtering TransformList.  So given this list: {type list, type record, type table, type function, type type, type none}

    TransformList becomes:

    TransformList = List.Select(List.Zip({ColumnNameList, ColumnTypes}), (i) => not List.Contains(List.Transform(InvalidTypes, (j) => Type.Is(i{1}, j)), true)),
    Let me know what you think.
    Tuesday, December 19, 2017 3:51 PM
  • Another consideration if InvalidTypes is used. We can replace type none with type null in the list, so: InvalidTypes = {type list, type record, type table, type function, type type, type null}

    Then the final else in ItemTypes is reduced to: else Value.Type(Value.FromText(_, culture)), instead of:
    else Replacer.ReplaceValue(Value.Type(Value.FromText(_, culture)), type null, type none)

    Tuesday, December 19, 2017 7:10 PM
  • With regard to your latest post: any column with at least 1 null value would then be typed as any.
    That is not correct e.g. if a column only contains a mix of text values and nulls: in that case the column should be typed as text.

    With regard to filtering out invalid types instead of selecting valid types, I'm not so eager to make that switch.
    I'm afraid it will take a lot of time to test all possible scenarios and really understand why some invalid types are filtered out, and others are not filtered out.

    For now, I'm fine with the solution I shared in the link in the previous post.
    Maybe I can perform some additional tests later on, or you can share your test results?

    Tuesday, December 19, 2017 8:02 PM
  • With regard to your latest post: any column with at least 1 null value would then be typed as any.
    That is not correct e.g. if a column only contains a mix of text values and nulls: in that case the column should be typed as text.

    A union of type null and a primitive type such as text, number, or date will return a nullable version of the primitive type. So Type.Union({type null, type null, type text}) = type nullable text. In one of my test tables, I have a column with one integer and nulls for all other values. The column type after type conversion is whole number. One of the goals of using the Replacer function is to handle a column with all nulls. If null becomes an invalid type, the need for the Replacer function goes away.
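
    A quick way to verify that behaviour in the query editor (sketch):

    let
        // per the above, the null types just make the resulting primitive type nullable
        UnionType = Type.Union({type null, type null, type text})
    in
        // UnionType evaluates to type nullable text, which Table.TransformColumnTypes accepts
        UnionType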

    I'm afraid it will take a lot of time to test all possible scenario's and really understand why some invalid types are filtered out, and others are not filtered out.

    This was a problem using List.Contains alone. By using List.Contains with Type.Is for filtering, all invalid types are filtered. 

    Maybe I can perform some additional tests later on, or you can share your test results?

    Here is the current code using invalid types instead of valid types:

    (table as table, optional culture as nullable text) as table =>
    let
        InvalidTypes = {type list, type record, type table, type function, type type, type null},
        Top200Rows = Table.FirstN(table, 200), //we use up to 200 rows to establish a column type
        ColumnNameList = Table.ColumnNames(Top200Rows),
        ColumnDataLists = List.Accumulate(ColumnNameList, {}, (accumulated, i) => accumulated & {Table.Column(Top200Rows, i)}),
        ColumnTypes = List.Transform(ColumnDataLists, (i) => List.ItemType(i)),
        TransformList = List.Select(List.Zip({ColumnNameList, ColumnTypes}), (i) => 
                            not List.Contains(List.Transform(InvalidTypes, (j) => Type.Is(i{1}, j)), true)),
        TypedTable = Table.TransformColumnTypes(table, TransformList, culture),
             
            List.ItemType = (list as list) =>
            let            
                ItemTypes = List.Transform(
                                list, 
                                each if Value.Type(Value.FromText(_, culture)) = type number then
                                         if Text.Contains(Text.From(_, culture), "%") then Percentage.Type
                                         else if Text.Length(Text.Remove(Text.From(_, culture), {"0".."9"} & Text.ToList("., -+eE()/'"))) > 0 then Currency.Type
                                         else if Int64.From(_, culture) = Value.FromText(_, culture) then Int64.Type
                                         else type number
                                     else Value.Type(Value.FromText(_, culture))
                            ),
                ListItemType = Type.Union(ItemTypes)
            in
                ListItemType
    in
        TypedTable
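
    For instance, assuming the code above is saved as a query named Table_ConvertColumnTypes, it could be invoked like this (the sample table and culture are made up):

    let
        // all-text input, as delivered by a CSV import before any typing
        Sample = #table({"Port", "Amount", "Share"},
                        {{"A1", "1.5", "10%"}, {"B2", "2", "25%"}}),
        Typed = Table_ConvertColumnTypes(Sample, "en-US")
    in
        Typed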



    Tuesday, December 19, 2017 11:50 PM
  • Hi Colin,

    Okay. You are right about the union of primitive types and nulls resulting in the nullable version of the primitive type.

    However, there is a specific issue, which looks like a bug to me: a value with 5 decimals, converted to a nullable Currency.Type, still has 5 decimals instead of 4, even though the column type is currency. Otherwise there are no issues with transformation to nullable types.

    Test query to illustrate this bug:

    let
        Source = #table(type table[Nullable Currency = number, Currency = number],{{1.12345,1.12345}}),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Nullable Currency", type nullable Currency.Type}, {"Currency", Currency.Type}})
    in
        #"Changed Type"

    The next part about type values is made obsolete by the workaround for the nullable-currency bug below:

    I tested with some additional type values. These were not filtered out with your code, but this can easily be corrected by replacing type type by type nullable type in the list with invalid types, so:

    InvalidTypes = {type list, type record, type table, type function, type null, type nullable type},

    This also works if there are no nulls, so type type is regarded as a type nullable type and filtered out.

    So far the (obsolete) part about type values

    Instead of the formula with List.Contains(list, true), function List.AnyTrue can be used, so that line can be rewritten as:

    not List.AnyTrue(List.Transform(InvalidTypes, (j) => Type.Is(i{1}, j)))),

    Tip: I found a very useful Power Query (M) Functions Dashboard with explanatory video in the Data Stories gallery on the Power BI Community site ;-) .

    It looks to me like all seems to work now, except for the nullable-currency bug.

    Workaround for nullable-currency bug:

    The Type.Union line can be replaced by:

    ListItemType = Type.NonNullable(Type.Union(ItemTypes))

    so all types will be nonnullable.

    In this case, the list with invalid types becomes:

    InvalidTypes = {type list, type record, type table, type function, type none, type type},
    Compared with your list, type null is replaced by type none (as type nonnullable null is type none).



    • Edited by MarcelBeug Wednesday, December 20, 2017 9:39 AM
    Wednesday, December 20, 2017 9:36 AM
  • Thank you very much Marcel for taking the time to test all of those scenarios. :) The main motivation for invalid types vs valid types was to have a smaller list of types to deal with. Originally, I was also concerned about more valid types (really subtypes) being added to M in the future (some of the current valid subtypes were added later in the development cycle, for instance). However, upon reflection, I can't think of any new type or subtype that might be added. Thus, this particular concern is probably without merit. In the end, invalid vs valid is a personal choice, and I'll post the final versions of both solutions for any interested party that gets this far in the discussion. :)

    The nullable currency "bug" is an annoyance.

    I find the null vs none stuff confusing. Before adding type null to the invalid list, I had to use type none to avoid an "unrecognized type" error. But how is none generated? In the query editor, all of the following expressions return type any:

    = Type.NonNullable(null), although the documentation indicates that it returns type none.
    = Type.Union({type null, type null, type null})
    = Type.Union({type nullable null, type nullable null, type nullable null})

    The fact that type none must be in the invalid list suggests that the return of type any in the editor is bogus, and thus is very misleading.

    Thanks for the reminder about the dashboard. I looked at it after you posted a link in another thread. I had some rendering and scrolling issues on my laptop screen and meant to check it out on my 27" monitor. After playing around with the dashboard on my desktop PC, I have to say that this thing is absolutely fantastic!!! You need to spread the word by every means possible!

    Below are a couple of images (not reproduced here) of a custom M language that I created in Notepad++, though I haven't updated it for some time. The first image shows function auto-complete; the second shows function parameter tooltips and the function description.

    However, unlike your dashboard, creating and updating the custom language is a fully manual effort.

    Wednesday, December 20, 2017 4:41 PM
  • The fact that type none must be in the invalid list suggests that the return of type any in the editor is bogus, and thus is very misleading.

    Misleading bogus indeed.

    Type.NonNullable(null) DOES return type none.

    All following types display "any" in the editor, which is only correct for the last 2 types in the list:

    = {type null, type none, type nullable null, type nullable none, type nullable any, type nullable anynonnull}

    Thanks for your feedback on the dashboard. I will further promote it indeed.

    Wednesday, December 20, 2017 7:34 PM
  • "All following types display "any" in the editor, which is only correct for the last 2 types in the list:"

    Indeed. In the valid types version, you should consider removing the replacer function. Instead, wrap Type.NonNullable around Type.Union. They both accomplish the same thing, but the latter option is simpler.

    Wednesday, December 20, 2017 11:47 PM
  • Hi Colin,

    Just to let you know that I used this function and referred to you and to this thread from a topic in the Power BI Community forum, and also in the related video at around 3:18.

    That topic is about parsing JSON data, for which I created a function that generates a function to drill down into columns with nested (lists of) records, with automatic data typing done by the function from this thread.
    It is quite interesting, but notice that the total video length is over 31 minutes.

    Thursday, December 21, 2017 2:31 PM
  • Hello all,

    This was very useful to me, thank you. It doesn't seem to work on columns with both text and numbers, though; do you guys know of any workaround?

    Thank you.

    All the best,
    Andre Pinheiro


    • Edited by xDreekun Thursday, December 12, 2019 5:15 PM
    Thursday, December 12, 2019 5:14 PM