none
Excel.Workbook() and delayTypes

    Question

  • Can anyone tell me what the delayTypes parameter of Excel.Workbook() does? It seems to be set to true by default in all recent versions of Power Query and *not* setting it to true seems to be bad for performance. I guess it is something to do with inferring data types on columns, but I'd like to know for sure.

    Thanks,

    Chris


    Check out my MS BI blog I also do SSAS, Power BI, MDX, DAX and M consultancy

    Saturday, December 15, 2018 8:17 PM

Answers

  • Can anyone tell me what the delayTypes parameter of Excel.Workbook() does? It seems to be set to true by default in all recent versions of Power Query and *not* setting it to true seems to be bad for performance. I guess it is something to do with inferring data types on columns, but I'd like to know for sure.

    Thanks,

    Chris


    Check out my MS BI blog I also do SSAS, Power BI, MDX, DAX and M consultancy

    Hi Chris,

    In the Source step of binary data files (or in a navigation step that follows the Source step) Power Query already interprets the data types for each column (you can tell by the way the data is formatted in the columns). However, at this point, by default, the column types are not converted. Conversion is done in the Changed Type step, so that the user can correct any misinterpretation made by Power Query.

    In the case of Excel.Workbook with delayTypes set to false, the conversion of column types is done within the Navigation step. This would allow for "dynamic" typing of columns, so that if the name of a column is changed for example, or if data in a column is corrected (which could influence how the type is interpreted), you don't have to alter a hard-coded Changed Type step.

    However, one limitation to the false setting (other than your observation that performance appears to be affected) is that it only converts to primitive types. For example, in a column with whole numbers, the type will be converted from any to number, whereas TransformColumnTypes will use the type facet 'Int64.Type' in the Changed Type step. This may be a limitation that one will be willing to accept in exchange for "dynamic" typing.

    Monday, December 17, 2018 5:28 PM

All replies

  • maybe you can find the answer to this forum

    This function has always had two parameters as documented for today

    Excel.Workbook (workbook as binary, useHeaders as optional logical nullable) as table
    https://msdn.microsoft.com/en-us/library/mt260859.aspx

     

    However, with the latest PQ update connector for the Excel file generates 3 parameters, it seems

    Source = Excel.Workbook (File.Contents ("TheBook.xlsx"), null, true)
    Second parameter (here is null) works as expected. Which is the third parameter I could not recognize, it does not affect anything in my short tests.

     

    Maybe it's somehow connected with modified combined binaries introduced in January / 17 update. It automatically generates "Transform ..." queries based on Excel.Wokbook (). That's just the guess.

     

    So far I have no idea what to do with the third parameter. Except to remove it - otherwise, people who do not have recent G & T updates (more accurately, Excel 2016 updates - most do not even know what's in Excel) will receive an error that some functions get 3 parameters instead of 1 or 2 ...

    Sunday, December 16, 2018 12:42 AM
  • Can anyone tell me what the delayTypes parameter of Excel.Workbook() does? It seems to be set to true by default in all recent versions of Power Query and *not* setting it to true seems to be bad for performance. I guess it is something to do with inferring data types on columns, but I'd like to know for sure.

    Thanks,

    Chris


    Check out my MS BI blog I also do SSAS, Power BI, MDX, DAX and M consultancy

    Hi Chris,

    In the Source step of binary data files (or in a navigation step that follows the Source step) Power Query already interprets the data types for each column (you can tell by the way the data is formatted in the columns). However, at this point, by default, the column types are not converted. Conversion is done in the Changed Type step, so that the user can correct any misinterpretation made by Power Query.

    In the case of Excel.Workbook with delayTypes set to false, the conversion of column types is done within the Navigation step. This would allow for "dynamic" typing of columns, so that if the name of a column is changed for example, or if data in a column is corrected (which could influence how the type is interpreted), you don't have to alter a hard-coded Changed Type step.

    However, one limitation to the false setting (other than your observation that performance appears to be affected) is that it only converts to primitive types. For example, in a column with whole numbers, the type will be converted from any to number, whereas TransformColumnTypes will use the type facet 'Int64.Type' in the Changed Type step. This may be a limitation that one will be willing to accept in exchange for "dynamic" typing.

    Monday, December 17, 2018 5:28 PM
  • Hi Colin,

    Thanks, that all makes sense - I hadn't noticed that the data types were being set at the Navigation stage with delayTypes=false. I guess the dev team found that asking whatever provider they use under the covers to set the data types was a lot slower than and less sophisticated than requesting untyped data and setting the data types inside the Power Query engine. Certainly in the limited testing I've done, on a biggish Excel file, with delayTypes=false there is a very long "Evaluating" stage when the query starts to load, but once that's done the data loads fairly quickly; with delayTypes=true there is no "Evaluating" stage, the data starts to load immediately and the query is much faster (there is also, as you might expect, a bit of an overhead to having a Changed Type step too). I need to do some more research and get this written up as a blog post...

    As an aside, I've always wondered what the "Evaluating" phase of a query does - I have my suspicions and these experiments have made them stronger, but I'd love a clear explanation from the dev team.

    Regards,

    Chris


    Check out my MS BI blog I also do SSAS, Power BI, MDX, DAX and M consultancy

    Monday, December 17, 2018 7:58 PM