Questions on type

  • Question

  • Imke's recent post on "Type Conversion Mysteries" has had me thinking about various aspects of types that are not entirely clear to me. I'm hoping to get answers to the following questions:

    1) Type assignment vs. type conversion - Are there any advantages to assigning a type to a table column (using a custom table type, or by replacing a type) versus converting a type (using Table.TransformColumnTypes)? Compared to assigning types, converting column types guarantees that the column values are compatible with the specified type, converts the column values to the specified type, and allows the data model to recognize the set column types.

    2) Why is Int64.Type considered a whole number, but Int8.Type, Int16.Type, and Int32.Type are not?

    3) Matt Masson says that M is not dynamically typed, but the documentation says otherwise. Which source is correct? Matt appears to base his assertion on type assignment (where there is no type checking), but in virtually all other operations, type checking takes place.

    4) Custom list, record, and function types. The question here is whether I am missing something fundamental. When I first read about custom list types, I foolishly assumed that they would be a way to validate whether a list parameter passed to a function was of a specific homogeneous type - not by using something like "list as type {number}" (since, unfortunately, only primitive types can be used in function type assertions), but by using the custom list type within some function. Unfortunately, this is not the case. The same goes for record or function parameters.

    According to the documentation:

    "There is no support in M for determining compatibility of a given type with a custom type. The standard library does include a collection of functions to extract the defining characteristics from a custom type, so specific compatibility tests can be implemented as M expressions."

    The problem with the foregoing statement is that there is really no practical way to use most of the referred-to functions (which are in the Type category).

    What function, for example, is there to determine whether a given list is compatible with type {number}? The only possible function is Type.ListItem. However, all this function allows you to do is pass a custom list type as an argument, e.g. Type.ListItem(type {number}). What is the point of passing a type to a function just to return the type you passed in? Moreover, the function doesn't help in determining whether a given list is compatible with type {number}. The same issue applies to the Type functions used for records and functions - you pass in custom types just to get back the types that you passed in.

    What is missing are functions that return the types of a given list's items, record fields, or function parameters, and a function that compares these types with custom types. These functions would infer a type based on list values, record field values, or function parameter types (or any if no type is provided for a parameter).

    For example, functions like List.ItemTypes (possibly returning "any" if the list is not homogeneous), Record.FieldTypes, Function.ParameterTypes, and Table.ColumnTypes. And then Value.IsCustomType(List.ItemTypes(list), type {number})...

    Currently, if I have to validate that a list parameter is homogeneous, I use a custom function called List_IsOfType:

    (list as list, requiredType as text) =>
    let
        // build the primitive type from its name, e.g. "number" -> type number
        Type = Expression.Evaluate("type " & requiredType),
        // true only if every item in the list is compatible with that type
        IsRequiredType = List.AllTrue(List.Transform(list, each Value.Is(_, Type)))
    in
        IsRequiredType
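
    For example (two quick calls to the function above; the expected results are in the comments):

    List_IsOfType({1, 2, 3}, "number")      // true
    List_IsOfType({1, "a", 3}, "number")    // false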


    Friday, October 6, 2017 6:49 PM

Answers

  • Hey Colin,

    Here's the reason why ascribing data types is pretty amazing for REST API Custom Connectors:

    https://github.com/Microsoft/DataConnectors/tree/master/samples/TripPin/7-AdvancedSchema 

    You should probably read Part 6 as well to understand more about why ascribing data types is necessary. In short, the 'any' state can cause trouble and can create multiple unnecessary requests, which is why defining the schema of the output table is necessary.
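
    Very roughly, the idea is to ascribe a concrete table type onto whatever the API hands back. This is just a stripped-down sketch with made-up column names, not the actual TripPin code:

    let
        // hypothetical raw response, parsed into a table with untyped (any) columns
        raw = Table.FromRecords({[UserName = "russellwhyte", Age = 35]}),
        // the schema we want the connector to expose
        schemaType = type table [UserName = text, Age = number],
        // ascribe the type; the values themselves are neither converted nor checked
        typed = Value.ReplaceType(raw, schemaType)
    in
        typed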

    I don't believe that MSFT promotes the formula language specification or any of the other PDFs that they had. Most of the links I had for downloading them are dead, and the main language reference nowadays is the MSDN site:

    https://msdn.microsoft.com/en-us/library/mt211003.aspx

    To top it all off, new things are being added, especially now with the Extensibility Model (PQ SDK), so new features keep being brought to the table. Even some code written in 2013 or 2014 won't work or refresh on the Power BI Service now. I wonder where you got that October 2016 version - I can't even find cached versions of that PDF anywhere.

    You could ask this same question on the GitHub repo. I don't think Matt monitors this forum (it's mostly Ehren and sometimes Curt), but Matt definitely appears to be the owner of the GitHub repo, so you can post a question over there and get a hold of him directly.

    Thursday, October 26, 2017 9:37 PM

All replies

  • From my personal experience:

    1. Ascribing a data type is only a good thing when you don't need to perform a conversion/transformation, and it is EXTREMELY useful when creating a custom connector against an API.
    2. Back in 2015 the PQ crew made a choice to keep things simple in the tool and the cloud service itself. You could read all of those types, but they'll be "showcased" on the front end as just number or integer, and it's even easier to read when you write something like "as number" or "as integer".
    3. Would you mind sharing the source that you're using? What specific documentation are you reading? If it's the 2015 PDF, there are some things that have changed and some others that are still in place, but I wouldn't rely 100% on that document. Either way, only Matt can answer this question.
    4. You say that "...not by using something like "list as type {number}" (since unfortunately, only primitive types can be used in functions), but by using the custom list type within some function. Unfortunately, this is not the case. Same deal with record, or function parameters", but you can definitely use a complex object like a list of records, a list of lists, or a list of records and tables, and use that list inside a function as a parameter or something else. Furthermore, you can use Value.Type to set the types inside each level and use some more complex solutions (there's a small made-up example after this list), but I've never had the need to use this. I'm not completely following this one, so I'll let somebody else answer it.
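
    Just to show what I mean by setting types at each level of a complex object (a made-up example, nothing official):

    let
        people = {[Name = "Ana", Age = 30], [Name = "Bo", Age = 41]},
        // a custom list type whose items are records with typed fields
        peopleType = type {[Name = text, Age = number]},
        // ascribe it; nothing is checked or converted
        typedPeople = Value.ReplaceType(people, peopleType),
        // Value.Type returns the ascribed list type
        ascribedType = Value.Type(typedPeople)
    in
        ascribedType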

    Sunday, October 8, 2017 10:58 PM
  • Hi Miguel,

    Thank you for taking the time to respond to my questions. Each of your answers leads to further questions/clarifications. Before addressing these additional questions with you directly, I'll wait to see if other responses are forthcoming.

    Monday, October 9, 2017 3:51 PM
  • Hi Miguel, since there have been no further posts on the questions I asked, I will address your responses:

    "Ascribing a data type is only a good thing when you don't need to perform a conversion/transformation ..."

    So why is ascribing a data type a good thing? What difference does it make vs. leaving the type as any? Table.Profile is one function where I've noticed that it makes a difference. And although there may be other scenarios where one might notice a difference, unless otherwise proven, I am of the firm belief that one is better off using Table.TransformColumnTypes. Consider the following scenarios (a small sketch follows them):

    1) Importing a table from a database - Power Query imports the data types, and these data types show up in the data model. If you attempt to replace a value in a column with one that is not of the correct data type, you get an error. If you now add a custom column and ascribe a data type (e.g. type number), the type in the column is not checked by the engine and the data model doesn't pick up the type. Transforming the column to the appropriate data type addresses both issues.

    2) For data sources that provide no column type metadata, Power Query adds a transform column type step. This function provides the same "quality" types as database columns. If you change a value in a column to a different type, you get an error. The data model picks up the types correctly. The designers could have decided to have Table.TransformColumnTypes simply ascribe "unchecked" types to columns, but they didn't.
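
    To make the contrast concrete, here is a minimal made-up sketch (not taken from either scenario above):

    let
        source = #table({"A"}, {{"1"}, {"x"}}),   // a text column
        // ascribed: column B is declared as number, but the text values are kept as-is - no error is raised
        ascribed = Table.AddColumn(source, "B", each [A], type number),
        // converted: "1" becomes 1, while "x" becomes an Error value in that cell
        converted = Table.TransformColumnTypes(source, {{"A", type number}})
    in
        converted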

    "...and is EXTREMELY useful when creating a custom connector against an API"

    Are you suggesting that in the custom connector you provide metadata for types which then shows up correctly in the imported table? If not, it's not clear how the statement relates to ascribing vs. converting table column types.

    "Back in 2015 the PQ crew made a choice to keep things simple on the tool and the cloud service itself. You could read all of those types, but they'll be "showcased" on the front end as either just number or integer and is even easier to read when you write something like "as number" or "as integer"."

    That's not the question I'm asking though. I can assign Int64.Type to a table column, and it appears as a whole number. The question is: Int8, Int16, Int32 are all whole numbers. So when you set a column to one of these types (Int8.Type, Int16.Type, Int32.Type), why doesn't the column appear as a whole number column?

    "Would you mind sharing the source that you're using? what specific documentation are you reading? If it's the 2015 pdf, there are some things that have changed and some others that are still in place, ..."

    The only official documentation on the topic: file:///C:/Users/Colin/AppData/Local/Packages/Microsoft.MicrosoftEdge_8wekyb3d8bbwe/TempState/Downloads/Power%20Query%20Formula%20Language%20Specification%20(October%202016).pdf

    It's dated October 2016, but like re-stamping a new date on expired food, a new date is stamped in the pdf every year or so, with no obvious changes within the document. 

    "...but I wouldn't rely 100% on that document."

    If we can't depend on this document, what more reliable source(s) are available?

    "Either way, only Matt can answer this question."

    Matt already said that M is not dynamically typed - I'm trying to understand the discrepancy between his statement and the one in the documentation. Type checking is done in most areas (function parameters, ascribing a type with Value.ReplaceType, and so on). The primitive types of custom types are checked. It seems that the one area where a type is not checked is when one is ascribed to a table column (without conversion).
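
    For instance (my own quick examples of where checks do and don't kick in):

    let
        f = (x as number) => x,
        checkedParameter = f(1),                        // fine; f("abc") would raise an error
        checkedKind = Value.ReplaceType(1, Int64.Type), // fine: same primitive kind (number)
        // Value.ReplaceType(1, type text) raises an error - the primitive kinds don't match
        uncheckedItems = Value.ReplaceType({"a"}, type {number})  // no error: item types aren't verified
    in
        uncheckedItems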

    "...but you can def use a complex object like a list of records, list of lists, list of records and tables and use that list inside of a function as a parameter or something else, furthermore you can use Value.Type to set the types inside each level and use some more complex solutions, but I've never had the need to use this. I'm not completely following this one, so I'll let somebody else answer this one"

    But you have provided an answer - except that it doesn't address the question. :) And since it appears that nobody else will be answering the question, I'm not going to bother further.

    I have created custom functions to return the item type of a given list, the types of record fields, and the types of table columns. They work for my purposes - most of the Type functions don't meet my type needs.
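
    For lists, something along these lines does the job (a simplified sketch that only considers primitive item types):

    (values as list) as type =>
    let
        Primitives = {type null, type logical, type number, type text, type date,
                      type time, type datetime, type datetimezone, type duration,
                      type binary, type list, type record, type table, type function},
        // find the primitive type that the first item conforms to
        FirstType = List.First(List.Select(Primitives, (t) => Value.Is(List.First(values), t)), type any),
        // the list is homogeneous if every item is compatible with that type
        Homogeneous = List.AllTrue(List.Transform(values, each Value.Is(_, FirstType)))
    in
        if Homogeneous then FirstType else type any

    The result can then be compared against, say, Type.ListItem(type {number}) using Type.Is.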


    Thursday, October 26, 2017 6:48 PM
  • Hola Miguel,

    Well, it turns out that I was wrong about ascribed types not being picked up by the data model. I managed to confuse myself on the issue. The main difference between ascribing and transforming types is that ascribing allows any value to be entered in a typed column without generating an error (e.g. you can put text in a numeric column). Transforming is better than ascribing if the data in a column must be validated. So, I consider the matter closed. :)

    Sorry about the incorrect link for the documentation. This one is correct:  https://msdn.microsoft.com/en-us/library/mt807488.aspx?f=255&MSPPError=-2147217396

    A Google search turns up several additional links. The document is not supported from a maintenance standpoint, but note that Matt quotes from the document in one of his latest videos, and cites it as a reference in his data connector discussions. Matt contradicts the document in some areas, so it could be that his information is more up to date.


    Friday, October 27, 2017 4:42 PM