Homogeneous list types in M

    Question

  • From the Formula Language Specification (p.49, "5.3 List types") I can read that 

    A value conforms to a list type if the value is a list and each item in that list value conforms to the list type’s item type

    So, 

    functype = type {function}

    declares a type of list of functions.

    But I can't wrap my head around this sample:

    functype = type {function},
    ReplType = Value.ReplaceType({"abcd"}, functype),
    Compare = Type.Is(Value.Type(ReplType), functype) // TRUE


    moreover,

    Compare = Type.Is(Value.Type( { ()=>1 } ), functype)
    // FALSE, because the actual item type of the compared list value is any

    It seems that the type of 

    { ()=>1 }

    is a primitive list type, and it doesn't matter whether this list contains functions or not: the list item type is any, and we cannot use the is and as keywords with custom types.

    So the questions are:

    - what is the reason for these homogeneous list types, if we can do things like in the first code sample?

    - how can we check that an actual list is a homogeneous list of a defined item type?


    Maxim Zelensky Excel Inside

    Thursday, April 12, 2018 11:50 AM

Answers

  • Hi Maxim,

    First note that in the expression,

    ReplType = Value.ReplaceType({"abcd"}, functype)

    no conformance checking is done. The expression simply replaces the type 'any' of {"abcd"} with functype.

    To return the type of the list, you can use Type.ListItem(Value.Type(ReplType)), which will return 'function'.

    You can check if a given list is homogeneous by checking the item type of the list. The item type of a list is the union of the types of the items in the list. For example, a list with items each of type number will have an item type of number. If the item types are mixed, then the item type of the list will be type 'any'.

    If a list is derived from a table with a set column type, then the item type of the list is simply found by using the expression Type.ListItem(Value.Type(table[column])).

    However, by design, the item type of a literal list like {1, 2, 3, 4} or {() => 1} is type 'any'.
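
    A minimal sketch of this behavior, following the statement above:

    let
        // Every item is a number, yet the ascribed item type of the literal list is any
        LiteralItemType = Type.ListItem(Value.Type({1, 2, 3, 4}))
    in
        LiteralItemType // type any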

    You can use the following function to determine the item type of a literal list, or a list derived from a table with a column type of any:

    List_ItemType:

    (list as list) as type =>
    let
        selectedItems = List.FirstN(list, 200),
        itemTypes = List.Transform(selectedItems, each Value.Type(_)),
        listItemType = Type.Union(itemTypes)
    in
        listItemType
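
    For example, calling the function above should behave like this, per the union rule described earlier:

    List_ItemType({1, 2, 3})      // type number
    List_ItemType({1, "a", true}) // type any (the union of mixed item types)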


    Thursday, April 12, 2018 7:45 PM

All replies

  • Hi Colin. Sorry for the late reply.

    Nice explanation. It's not so easy to find out why there is no special function for custom type conformance, nor why, "by design, the item type of a literal list like {1, 2, 3, 4} or {() => 1} is type 'any'."

    The whole types concept is not so easy to grasp, and it is not so easy to find real use cases for it, except for data loading or function descriptions.

    PS. For quick type assertion with conformance checking, I think it is possible to use the as keyword:

    (list as list, itemtype as type) => List.Transform(list, each _ as itemtype)

    We'll get a list of values of the given type plus errors, and can then take whatever next steps a certain case requires.


    Maxim Zelensky Excel Inside


    Tuesday, April 17, 2018 10:31 AM
  • Hi Maxim,

    "It's not so easy to find out why there is no special function for custom type conformance, ..."

    Per the language spec:

    "There is no support in M for determining compatibility of a given type with a custom type. The standard library does include a collection of functions to extract the defining characteristics from a custom type, so specific compatibility tests can be implemented as M expression."

    Many of the functions in the 'Type' category are used for custom type conformance. Conformance of a custom type really means conformance to the primitive type elements of the custom type, e.g. the items of a list or the fields of a record. Section 5.9 of the language spec details these conformance functions. These functions are useful only if the elements of the custom type (list, record, table, function) are not all of type 'any'.

    If you have a table with typed columns, then lists or records derived from the table will have specific item types or field types respectively. All built-in functions, and custom functions where types are specified, will have specific parameter types and a return type. So the type conformance functions work well in the foregoing cases.
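
    As a small illustration of extracting those defining characteristics from a custom type (a sketch, with illustrative names):

    let
        customlist = type {function},
        customrec = type [a = number, b = text],
        ListElement = Type.ListItem(customlist),  // type function
        RecFields = Type.RecordFields(customrec)  // a record of [Type = ..., Optional = ...] per field
    in
        ListElement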

    "...nor why "by design, the item type of a literal list like {1, 2, 3, 4} or {() => 1} is type 'any'.""

    It's because in M, any custom type whose elements are not explicitly set has those elements defaulted to type 'any' - M doesn't make any assumptions on its own with regard to type. For instance, every table imported into Power Query without a schema defining the column types (non-SQL tables) will have all column types of any, before Power Query automatically inserts a 'Changed Type' step using the Table.TransformColumnTypes function. The same 'any' default is true for lists, records, and functions that have no explicitly typed elements.

    PS. for quick type assertion with conformation, I think, it is possible to use the as keyword:

    (list as list, itemtype as type) => List.Transform(list, each _ as itemtype)

    The expression 'each _ as itemtype' generates an "invalid type identifier" error, because to specify itemtype in the function, you have to specify the type as 'type number', 'type text', etc. Thus for 'type number', 'each _ as itemtype' becomes 'each _ as type number', which is not valid (the type name must come directly after 'as', e.g. 'as number').

    You can modify the function as follows:

    (list as list, itemtype as type) => List.AllTrue(List.Transform(list, each Value.Is(_, itemtype)))
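
    Hypothetically naming that lambda IsHomogeneous, it would be used like this:

    let
        IsHomogeneous = (list as list, itemtype as type) =>
            List.AllTrue(List.Transform(list, each Value.Is(_, itemtype))),
        Check1 = IsHomogeneous({() => 1, () => 2}, type function), // true
        Check2 = IsHomogeneous({1, "a"}, type number)              // false
    in
        Check1 and not Check2 // true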

    "The whole types concept not so easy to catch and not so easy to find real use cases for it, except of data load or function descriptions."

    Actually, I've found some useful cases for types, but most of these cases have surfaced only fairly recently.

    I would agree that "the whole types concept is not so easy to catch", as I was asking many of the same questions not that long ago. The problem, I think, is that the language spec. is a reference that states a series of facts about the language, and in the case of types, the facts are hard to connect and translate into actual solutions. In other words, the discussion of types is somewhat abstract.

    Usually, someone with "inside information" about M types would have long ago written a blog post or article discussing types in detail with many real-world examples, but such a discussion has not happened (to my knowledge).

    In his videos, Matt Masson bravely delves into types, while admitting that he doesn't understand the type system well. The problem is that some of his statements about types are incorrect, and serve to confuse rather than clarify.

    So an article that elucidates all of the sordid details about types still needs to be written. :)

    Tuesday, April 17, 2018 6:32 PM
  • Hi Colin. 

    Very insightful discussion, thanks.

    If a list is derived from a table with a set column type, then the item type of the list is simply found by using the expression Type.ListItem(Value.Type(table[column])).

    I cannot totally agree with this. It depends on how the type was assigned to the column. In the sample below, Column1 was set to type text by Table.TransformColumnTypes, which, AFAIK, uses a set of type conversion functions (like Text.From), depending on the primitive nullable type set. The Column2 type was set via Table.AddColumn, and it works like Value.ReplaceType for the list: it just sets a type without conformance checking:

    // Sample
    let
        Source = #table(1, {{1},{"a"}}),
        ChangedType = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
        AddedCustom = Table.AddColumn(ChangedType, "Column2", each "this is a number", type function),
        Column1_ItemType = Type.ListItem(Value.Type(AddedCustom[Column1])), // nullable text - correct
        Column2_ItemType = Type.ListItem(Value.Type(AddedCustom[Column2])) // function - wrong
    in
        Column2_ItemType

    That's why, I think, there is no reason to add a "New Column Type" option to the "Add Custom Column" dialog window: there is no type conversion or conformance checking in this function (of course, if you are authoring your own code, you can be sure that all the values in the new column are of the assigned column type).

    The same 'any' default is true for lists, records, and functions that have no explicitly typed elements. 

    Again, cannot totally agree:

    Value1 = 1 as number, 
    Value2 = 2 as number, 
    TypeCheck = Type.ListItem(Value.Type({Value1, Value2})), // any
    Value1n = Value.ReplaceType(Value1, type number),
    Value2n = Value.ReplaceType(Value2, type number),
    DoubleCheck = Type.ListItem(Value.Type({Value1n, Value2n})), // any
    TripleCheck = Type.ListItem(
        Value.Type(
            Value.ReplaceType(
                {Value1n, Value2n}, 
                type {number}
                )
            )
        ) // and only here is it type number

    I have seen a separate PDF reference about types in M, but it scared me: too technical.

    Returning to the initial question... Suppose I write a custom function whose argument is a list of functions. The idea was that we can check the input and, if there is a non-function element in this list, raise an error. Now I can see that it is (possibly) cheaper to check the type of each list element when it is used than to perform a type conformance check on the input.


    Maxim Zelensky Excel Inside

    Wednesday, April 18, 2018 2:22 PM
    "I cannot totally agree with this. It depends on how the type was assigned to the column. In the sample below, Column1 was set to type text by Table.TransformColumnTypes, which, AFAIK, uses a set of type conversion functions (like Text.From), depending on the primitive nullable type set. The Column2 type was set via Table.AddColumn, and it works like Value.ReplaceType for the list: it just sets a type without conformance checking:"

    "That's why, I think, there is no reason to add a "New Column Type" option to the "Add Custom Column" dialog window: there is no type conversion or conformance checking in this function (of course, if you are authoring your own code, you can be sure that all the values in the new column are of the assigned column type)."

    Assigning types to a column using an optional type parameter, or specifying the types using the #table constructor, doesn't check the type(s) of the values in the column. Matt Masson has actually used this point to suggest that M does no type checking, which is of course incorrect. So what's the point of assigning types to columns in this way? It is mostly for the purpose of the load destination.

    For an Excel table, the only values affected are dates. If the date column is not assigned a date type, serial numbers will show up in the date column of the table loaded to Excel:

    let
        Source = #table(type table[Year = Int64.Type, Month = Int64.Type, Day = Int64.Type], {{2018, 12, 5}}),
        AddedCustom = Table.AddColumn(Source, "Date", each #date([Year], [Month], [Day]))
    in
        AddedCustom

    But now if you add a date type:

    let
        Source = #table(type table[Year = Int64.Type, Month = Int64.Type, Day = Int64.Type], {{2018, 12, 5}}),
        AddedCustom = Table.AddColumn(Source, "Date", each #date([Year], [Month], [Day]), type date)
    in
        AddedCustom

    the date values in the Excel table now display the "Short Date" format.

    Without getting into a lot of detail, after loading data to the data model with column types such as Int64.Type and Currency.Type, the table columns in the data model appear with the correct type assignments. Finally, the Table.Profile function does not return correct values if the columns are untyped. Of course, for all of this to work you must be careful to assign the correct type to each column. You wouldn't assign the wrong type deliberately (but it can happen accidentally). If it is absolutely necessary to ensure that column values cannot be mixed (e.g. have both numbers and text) or that the values must be restricted in some way (e.g. only allow integers within a specific range), then you must use Table.TransformColumnTypes.
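
    The enforcement difference can be sketched like this (the incompatible value becomes an error cell):

    let
        Source = #table({"Column1"}, {{1}, {"a"}}),
        Typed = Table.TransformColumnTypes(Source, {{"Column1", type number}})
        // row 1 -> 1; row 2 -> Error (cannot convert "a" to type number)
    in
        Typed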

    Type Checking vs. Type Conformance.

    Conformance is simply a mechanism used to test whether a given type is compatible with some other type, using functions available to facilitate such testing. It is not unlike any other kind of testing you may want to do in M (e.g. text comparisons).

    On the other hand, type checking occurs when an expression is evaluated, and the checking occurs automatically within the M engine. In fact, type checking is done in M only when evaluating an expression. This is why no checking is done when you assign a type to a table column, and why checking is done when you use Table.TransformColumnTypes. The point of type checking is to raise an error if you attempt to perform an operation with incompatible types.
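
    One way to see the distinction in code (a sketch):

    let
        Conformance = Value.Is("abc", type number), // false - just a logical, no error raised
        TypeCheck = try "abc" as number             // evaluation-time check: raises an error, captured by try
    in
        TypeCheck // a record with HasError = true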

    "...which, AFAIK, uses a set of type conversion functions (like Text.From), depending on the primitive nullable type set."

    Actually, this is not necessarily the case. I created a function (with valuable contributions from Marcel Beugelsdijk) named Table_ConvertColumnTypes (to avoid having the same name as the built-in function). Table_ConvertColumnTypes works exactly like Table.TransformColumnTypes, but was designed to address the issue of table columns with changing names and a variable number of columns, which is not an uncommon occurrence in csv files. With Table.TransformColumnTypes, all of the column names are hard-coded. The code for Table_ConvertColumnTypes uses just one type conversion function. Also, it excludes types that cannot be converted in table columns: table, record, list, and function.

    Table_ConvertColumnTypes

    (table as table, optional culture as nullable text) as table =>
    let
        ValidTypes = {type any, type number, type date, type datetime, type datetimezone, 
                      type time, type duration, type logical, type text, type binary, 
                      Int64.Type, Percentage.Type, Currency.Type
                     },
        Top200Rows = Table.FirstN(table, 200), //we use up to 200 rows to establish a column type
        ColumnNameList = Table.ColumnNames(Top200Rows),
        ColumnDataLists = List.Accumulate(ColumnNameList, {}, (accumulated, i) => accumulated & {Table.Column(Top200Rows, i)}),
        ColumnTypes = List.Transform(ColumnDataLists, (i) => List.ItemType(i)),
        TransformList = List.Select(List.Zip({ColumnNameList, ColumnTypes}), each List.Contains(ValidTypes,_{1})),
        TypedTable = Table.TransformColumnTypes(table, TransformList, culture),
             
            List.ItemType = (list as list) =>
            let            
                ItemTypes = List.Transform(
                                list, 
                                each if Value.Type(Value.FromText(_, culture)) = type number
                                     then if Text.Contains(Text.From(_, culture),"%") then Percentage.Type
                                          else if Text.Length(Text.Remove(Text.From(_, culture), {"0".."9"} & Text.ToList("., -+eE()/'"))) > 0 
                                               then Currency.Type
                                               else if Int64.From(_, culture) = Value.FromText(_, culture)
                                                    then Int64.Type 
                                               else type number
                                     else Value.Type(Value.FromText(_, culture))
                            ),
                ListItemType = Type.NonNullable(Type.Union(ItemTypes))
            in
                ListItemType
    in
        TypedTable
    Another use for this function as a replacement for Table.TransformColumnTypes is when a table has a large number of columns and you don't want to see a mile-long list of column names in the function expression.
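
    A hypothetical call might look like this (the file path is illustrative only):

    let
        Source = Csv.Document(File.Contents("C:\Data\Sales.csv")),
        Promoted = Table.PromoteHeaders(Source),
        Typed = Table_ConvertColumnTypes(Promoted, "en-US")
    in
        Typed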
    Wednesday, April 18, 2018 7:45 PM
  • Colin, why don't you run a blog? :)

    Actually this is not necessarily the case.

    But you use the Table.TransformColumnTypes function in your code, so all type conversions were actually made by it. And how it converts values is hidden from us (now I am thinking of using AST analysis with Value.ResourceExpression), but if we try to convert a column like {1,"1"} to the text type, then we'll get {"1","1"}. Analyzing the errors raised by Table.TransformColumnTypes, I made the assumption that all type conversions are performed with direct type conversion functions. That's why, for example, we can convert a column of text strings like "Feb 29, 2016" to date type AND date values


    Maxim Zelensky Excel Inside

    Wednesday, April 18, 2018 9:14 PM
  • Colin, why don't you run a blog? :)

    That's a good question. I've thought about it many times, since there is a pile of fairly unique M content that I can contribute. One problem is blog fatigue. There are many very good Power BI blogs out there already (yours included), and it's not clear how to start a new blog without getting lost in the crowd. Wouldn't people just roll their collective eyes and say "Jesus, another Power BI blog. Do we really need one more? Who's this guy anyway?" In the early days of DAX I was a frequent contributor on PowerPivotPro, but my last post was a long time ago.

    But you use the Table.TransformColumnTypes function in your code - so all type conversions was actually made by it.

    No, that's not true. In Power Query, before the Changed Type step is added automatically, PQ uses the first 200 rows of the table and runs an algorithm on the columns to determine the column types. Only then does it use those types in the Table.TransformColumnTypes function (Table.TransformColumnTypes uses the types passed to it - it cannot determine the types on its own).

    Table_ConvertColumnTypes works similarly. It takes a table as an argument (and optionally a culture argument).  The function uses up to 200 rows in the table, and runs a custom List.ItemType function to determine the column type of each column. List.ItemType contains logic to determine whether a numeric column is just a number (type number), integer (Int64.Type), currency (Currency.Type), or percentage (Percentage.Type). It also determines whether a column is text, date, etc. To determine all of the possible primitive types, List.ItemType uses only one type conversion function (Value.FromText). It doesn't need to apply any other type conversion function to determine the column type.

    After all of the column types are determined, the function uses a combination of the column names and the types to construct the list for the second argument of Table.TransformColumnTypes. Because Table_ConvertColumnTypes determines the types on the fly, any column changes in the source table will be picked up by the function. With Table.TransformColumnTypes, all of the column names and types are static.

    Analyzing the errors raised by Table.TransformColumnTypes, I made the assumption that all type conversions are performed with direct type conversion functions. That's why, for example, we can convert a column of text strings like "Feb 29, 2016" to date type AND date values

    I'm not sure that I understand what you're saying here. Could you elaborate?


    Thursday, April 19, 2018 1:19 AM
  • After all of the column types are determined, the function uses a combination of the column names and the types to construct the list for the second argument of Table.TransformColumnTypes

    It is a very nice and useful function, indeed, and I think it kind of simulates the work under the hood which PQ does when importing from CSV, for example. But can you describe how Table.TransformColumnTypes then actually changes the type (provided by your other code) of the values? What is it doing?

    Analyzing errors raised with Table.TransformColumnTypes, I made suggestion that all type conversions are performed with direct type conversion functions. That's why, for example, we can convert a column of text strings like "Feb 29, 2016" to data type ANDvalues

    I'm not sure that I understand what you're saying here. Could you elaborate?


    Of course. If we pass to Table.TransformColumnTypes a column of the next values: 

    Value
    Jan 1, 2018
    2018-05-17
    15 января 2017

    How did it understand that all of them are dates? If it uses Value.FromText (for example), it can determine that all these values are dates. Then it needs to change these values to date values - not only say "this is a date type now", but convert them to the proper date type. Maybe it uses functions that depend on the determined type, or just uses Value.FromText again - I do not know.

    For the next table, the type determined by PQ will be "any", because it cannot convert "1" to a number with either Value.FromText or Number.From.

    Value   Entered as
    1       text
    2       number
    "1"     text
    1       text with '


    Maxim Zelensky Excel Inside

    Sunday, April 22, 2018 7:10 PM
    It is a very nice and useful function, indeed, and I think it kind of simulates the work under the hood which PQ does when importing from CSV, for example. But can you describe how Table.TransformColumnTypes then actually changes the type (provided by your other code) of the values? What is it doing?

    The code interprets the types of the columns (like PQ), but requires Table.TransformColumnTypes to apply the actual changes to the columns. Note that you could write out the entire Table.TransformColumnTypes expression by hand (with no automatic determination of types), specifying whatever types you want for the columns. If there are values in the column that are not compatible with the type you specify, then you will get errors for those values. I think that if the function were named Table.ConvertColumnTypes there would be less confusion, since that name is more explicit in describing what the function is actually doing.

    If the table has a fixed number and names of columns (that don't change over multiple refreshes), the Changed Type step has some potential advantages. The column types interpreted by PQ might be incorrect. PQ looks at only the first 200 rows in a column (it has to stop somewhere - looking at every value in the column could lead to severe performance issues if the table is large). It is possible that the values are truly mixed, but a different type shows up after the first 200 rows. Another scenario is where a column type is determined to be, say, a number, but it is actually text stored as numbers (e.g. 001234). In these cases, you could simply modify the step to apply a different type to the column(s).

    Of course. If we pass to Table.TransformColumnTypes a column of the next values: 

    Value
    Jan 1, 2018
    2018-05-17
    15 января 2017

    How did it understand that all of them are dates? If it uses Value.FromText (for example), it can determine that all these values are dates.

    I don't know how it's possible to determine that those values in the column are all dates. On my system, the first two values are interpreted as dates, and the last as text. I could supply an appropriate culture argument to the function to determine that the last value is a date, but the first two values won't be dates in that culture.

    Then it needs to change these values to date values - not only say "this is a date type now", but convert them to the proper date type.

    Don't know what you mean. There is only one 'type date' in M.

    For the next table, the type determined by PQ will be "any", because it cannot convert "1" to a number with either Value.FromText or Number.From.

    Notice that text values in table columns (from any source) do not normally contain quotes to be identified as text. So unquoted 'text' numbers convert to numbers, but quoted text does not. An unquoted '1' stored as text is equivalent to Value.FromText("1"), which converts to a number (1). However, a quoted '1' is equivalent to Value.FromText("""1"""), which does not convert to a number (it converts to "1", which is text).
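
    In code form, a sketch of the point above:

    let
        Unquoted = Value.FromText("1"),   // 1 - a number
        Quoted = Value.FromText("""1""")  // "1" - text; the embedded quotes block the conversion
    in
        {Unquoted, Quoted}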

    Sunday, April 22, 2018 11:43 PM
  • Colin,
    Definitely, I will be one of your many readers whenever you decide to start your own blog.

    Anthony


    • Edited by anthony34 Tuesday, April 24, 2018 12:14 PM
    Tuesday, April 24, 2018 12:13 PM
  • Hi Anthony, thx for the vote of confidence. :)
    Wednesday, April 25, 2018 7:32 PM
  • Hi Maxim,

    After some reflection, and based on your last post, I think that I'm now understanding the nature of some of your questions.

    Let's start with the following question:

    But can you describe how Table.TransformColumnTypes then actually changes the type (provided by your other code) of the values? What is it doing?

    I thought that we had this Table.TransformColumnTypes stuff nailed down based on previous discussions, but apparently this wasn't the case. I hope my last response clarified things a bit, but I want to take a step backwards.

    We've established that Power Query makes a determination about column types (for sources that don't provide a schema) based on sampling the first 200 rows of each column. As early as the Source step, we can already see text values aligned left in the column, and number values aligned right. Date values are aligned accordingly. However, all of the column types are 'type any'. Why doesn't Power Query internally convert the column types at the same time, without needing to output a separate 'Changed Type' step to do the work? I think the answer is that the conversion could have been done internally, but the development team decided to do this as an M expression instead. Why? Well, it is not guaranteed that PQ would determine the correct column types in all situations. If it didn't, then the user would have to add a 'Changed Type' step. With the existing approach, all you have to do is modify the existing step.

    The next question:

    Then it needs to change these values to date values - not only say "this is a date type now", but convert them to the proper date type.

    I'm still not 100% sure that I understand the statement, but I suspect that you're referring to type date, or type datetime, and so on.

    Indirectly, you may be also referring to Excel as a source, because I cannot think of any other source where a notion of "convert them to a proper date type" would apply. I am also assuming that you are referring to the fact that PQ imports dates from Excel as 'type datetime', and never as 'type date'. 

    In determining whether a value in an Excel table is a date, PQ has to read the format of the cell. If the format is compatible with one of PQ's recognized date formats, then PQ assumes that the value is a date (naturally, PQ has to take regional settings into account). This is the only way to know whether the underlying number stored in Excel is a serial number representing a date, or is just an ordinary number. However, PQ cannot depend on the format only to determine the date values, since formats might not include a year (e.g. Feb 1), a day (e.g. Feb, 2016), or a four-digit year (year might be 2 digits in the format). It therefore has to read the serial number. The serial number represents a date and a time. If a serial number is a whole number, then the time component is 12:00:00 AM. So PQ imports dates in Excel as a 'datetime' type. The user can modify the 'Changed Type' step if it is known that columns are only dates with no meaningful time component.

    I apologize if my assumptions in this post are incorrect. It's just my attempt to read between the lines.

    Wednesday, April 25, 2018 9:34 PM
  • Hi Colin.

    Thanks for such detailed answer.

    Actually, I see that I'm still trying to describe my thoughts in a very dim way :)

    Ok. I was not sure whether PQ reads the data formats from the Excel file - my first thought was that it does not, but then I passed it a table with 01.05.2018 (cell format is Date, dd.mm.yyyy), 43221 (General), and "May 1, 2018" (General). PQ correctly recognized the 1st as datetime, the 2nd as number, and the 3rd as text. I think it doesn't actually read cell formats, but primitive data types (for example, it does not recognize currency or other custom formats).

    If we speak about CSV, it will then apply something like your function to determine column types for the automatic "Changed Type" step (if this option isn't turned off).

    1. Then if I apply the date type to this table column, PQ (as I suppose, based on your function) will use something like Date.From to convert each value to a date (change the primitive type of the values from any to date). The result will be equal for all three values: 01.05.2018.
    2. THEN I can convert this column to number (in a separate step), for example, and PQ will make a correct conversion: all three values become 43221.
    3. If I try to convert the source data (01.05.2018, 43221, "May 1, 2018") to the number type directly, without converting to dates in a previous step, I'll obviously get an error on the third value.
    4. So, after step 1, the values in the column became correct date values.
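
    The two-step conversion can be sketched with a single value (43221 is the serial number of 2018-05-01, assuming a culture that parses ISO dates):

    let
        Source = #table({"Value"}, {{"2018-05-01"}}),
        Step1 = Table.TransformColumnTypes(Source, {{"Value", type date}}),  // text converted to a real date value
        Step2 = Table.TransformColumnTypes(Step1, {{"Value", type number}})  // date converted to its serial number, 43221
    in
        Step2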

    So, what I'm talking about is that your function simulates the type-recognition part of what PQ does when importing from CSV, for example. But the actual (value) type conversion is made by Table.TransformColumnTypes, which, based on the types passed to it, performs the actual value conversion - not only "marking" the column as some type.

    Finally, if we speak about changing a column type in PQ, what Table.TransformColumnTypes does is not only change the column type to the passed type, but attempt to convert all the values of this column to the (some internal) format of this type.

    That's why we cannot just replace a table type with another table type where some of the columns are of a different type: we need a value conversion.

    I hope that my reflections have now become clearer


    Maxim Zelensky Excel Inside

    Friday, April 27, 2018 9:59 AM
    Ok. I was not sure whether PQ reads the data formats from the Excel file - my first thought was that it does not, but then I passed it a table with 01.05.2018 (cell format is Date, dd.mm.yyyy), 43221 (General), and "May 1, 2018" (General). PQ correctly recognized the 1st as datetime, the 2nd as number, and the 3rd as text. I think it doesn't actually read cell formats, but primitive data types (for example, it does not recognize currency or other custom formats).

    Since dates are stored as numbers in Excel, the only way to distinguish between a number and a date is by the cell format. If the cell format is not a date, then the cell value is simply a number, regardless of any numeric formatting. Power Query ignores all numeric formatting (percent, currency,...) when importing data from Excel. I suspect that this behavior is a legacy of the days before PQ supported Currency.Type, Percentage.Type, etc., and the behavior has not been updated to support numeric formats.

    So, what I'm talking about is that your function simulates the type-recognition part of what PQ does when importing from CSV, for example. But the actual (value) type conversion is made by Table.TransformColumnTypes, which, based on the types passed to it, performs the actual value conversion - not only "marking" the column as some type.

    Correct. My function is equivalent to the first two steps that you get when PQ imports a file (the 'Source' step and the #"Changed Type" step).

    Finally, if we speak about changing a column type in PQ, what Table.TransformColumnTypes does is not only change the column type to the passed type, but attempt to convert all the values of this column to the (some internal) format of this type.

    Yes. That's why I named my function Table_ConvertColumnTypes, and why I think that the native function should have been named Table.ConvertColumnTypes. There is absolutely nothing in the name 'Table.TransformColumnTypes' to suggest that type conversion is actually what the function is doing.

     

    Friday, April 27, 2018 2:56 PM