none
IF in Power Query statement struggling with converting text to numbers RRS feed

  • Question

  • Hi I'm a PQ newbie and struggling to understand a slow loading issue with an if statement.

    In the raw data the field [Location] is a text field e.g. 0010. I have a parameterised query that get a Location_Value from Excel and passes it to the PQ query using

    #"Filtered Rows1" = Table.SelectRows(#"Removed Other Columns", each ([SalesMode] = 0) and ([SalesType] = 0) and ([Location] = Location_Value))

    This works fine if you chose a single location.  However I wanted to be able to select all locations and text is horrible to work with so in PQ I used the change type function to change the location column into whole numbers. I changed excel to also pass a number as Location_Value.   I was therefore surprised when the same query took 2.5 times longer to refresh????

    My PQ now looks like this

      #"Changed Type" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Location", Int64.Type}}),
        #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each ([SalesMode] = 0) and ([SalesType] = 0) and ([Location] = Location_Value))

    i'm wondering if I need to do something to the ([Location] = Location_Value) bit as maybe it still thinks [Loation] is text and it is trying to compare it to a number. I st assumed the step above meant that [Location] was now a number, but maybe you still have wrap it with some kind of VALUES or TEXT function?

    Thanks for any advice

    mike

    Thursday, November 19, 2015 8:18 AM

Answers

  • Tried that and takes forever until eventually just returning 1 blank row. Not sure why as I can see a preview of the query in PQ editor that looks fine.  I suspect that any translation of data type is taking time, but seems really odd for this to be the case.  I've always been taught that numbers are more efficient than text, but not in this case!!!!

    Any way seems the lesson is to be very careful if changing the source data type for the effect on the refresh speed. Maybe it only matter if you filter on that changed field.
    Friday, November 20, 2015 4:31 PM

All replies

  • It is not clear to me how you link the need to select multiple locations with the type of the location column. But if you need to filter a column on a list of value, you could should use a statement like:

    // the list of location values is in column 'Locations' of table ParameterTable
    ListofLocations = ParameterTable[Locations],
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each ([SalesMode] = 0) and ([SalesType] = 0) and List.Contains(ListofLocations,[Location])

    Thursday, November 19, 2015 9:30 AM
  • Great something new to try.  Can the Parametertable[Location] be an excel table that you access through power query tab/from table?

    I'm still perplexed why changing between text and numbers is making such a big difference? Really like to understand that as quite often come across data that is a number but stored as text.

    Cheers for the tip

    Mike

    Thursday, November 19, 2015 9:52 AM
  • Hello Mike

    Did you ensured that your variable Location_Value has the type "Int64.Type", but not "any"? May be, as it looks in PQ as integer (coursive digit), actually it has type any or number, and PQ performs type change on every row


    Maxim Zelensky Excel Inside

    Thursday, November 19, 2015 6:26 PM
  • How would I do that? It's coming from an excel table where I just clicked on the convert text to number.
    Thursday, November 19, 2015 8:55 PM
  • When you load your parameters table in Power Query, on "Changed Type" step check the type PQ assigned to Locations: it should be "Int64.Type", not "number" or "any"

    For example, if you see this:

    = Table.TransformColumnTypes(Source,{{"Location", type any}})

    in parameters load query, its wrong, should be

    = Table.TransformColumnTypes(Source,{{"Location", Int64.Type}})

    You can add step to the query where you load Parameters table: select Location column and apply "data type" -> "Integer"


    Maxim Zelensky Excel Inside

    Friday, November 20, 2015 6:56 AM
  • Hi Maxim

    The the PQ table i used this command to change the lcoation into a number which looks the same as yours? [Location] is the column coming from the DB as text

     #"Changed Type" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Location", Int64.Type}}),

    However are you saying I need to do the same somewhere when getting the parameter form excel? I'm using this to set up the parameter so do I need to add a transform in here as just assumed it would be a number?

        Parameter = Excel.CurrentWorkbook(){[Name="Parameter"]}[Content],

        Location_Value = Parameter{0}[Value],

    Thanks Mike

    Friday, November 20, 2015 8:23 AM
  • Hi Mike

    Try this:

        Parameter = Excel.CurrentWorkbook(){[Name="Parameter"]}[Content],
        Location_Value = Int64.From(Parameter{0}[Value])

    May be I am wrong and slow processing caused by other reasons. For example, no query folding


    Maxim Zelensky Excel Inside

    Friday, November 20, 2015 9:23 AM
  • Pretty sure it is this as if I just change the PQ data back to text and excel back to text it runs much faster.  I'll give it a try thanks
    Friday, November 20, 2015 12:39 PM
  • So pretty disappointing results.  I ran 4 time trials for various ways of doing this mm:ss.

    1. Text in PQ and text in excel  1:40
    2. Change PQ colum to type "whole number" and in excel use the "convert to number" function. When I checked the format it says "General"  7:30!!!!
    3. Format the number in excel to "number" 0 decimals   3:30
    4. Change the PQ query to also use Maxim code above Location_Value = Int64.From(Parameter{0}[Value])

              that should be forcing the value from Excel to be exactly the same format as PQ code below to change the column data

        #"Changed Type" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Location", Int64.Type}}),

         However it is still only 3:10 so slightly faster than making sure the excel number is actually a number, but still 50% slower than using a text match.

     This seems really really odd and poor behavior!!!!  I guess I'm stuck with text unless anyone else has any bright ideas?

    Friday, November 20, 2015 1:52 PM
  • Hi Mike,

    it is very interesting. If all previous steps in your query are the same, it seems that the biggest time-consuming step is converting of entire [Location] column to Int64

    But if not, you can try to convert this "bug" into "feature": try to convert both [SalesMode] and [SalesType] to text and then perform filtering as [SalesMode] = "0" and [SalesType] = "0"

    If it is really faster with text filed filtering, then you could get a big increase in performance ))


    Maxim Zelensky Excel Inside

    Friday, November 20, 2015 3:12 PM
  • Tried that and takes forever until eventually just returning 1 blank row. Not sure why as I can see a preview of the query in PQ editor that looks fine.  I suspect that any translation of data type is taking time, but seems really odd for this to be the case.  I've always been taught that numbers are more efficient than text, but not in this case!!!!

    Any way seems the lesson is to be very careful if changing the source data type for the effect on the refresh speed. Maybe it only matter if you filter on that changed field.
    Friday, November 20, 2015 4:31 PM