none
how to improve the speed of this query RRS feed

  • Question

  • Hi everyone,

    I would need some help to understand why the query below that is that slow.

    Basically, it needs to convert some allocation reference such as "101:060|108:040"
    (where 101 and 108 are ID of people and 060 and 040 are their respective percentages)
    into "Mister101:060|Mister108:040",

    using a table with all the ID and Names.


    I cannot manage to do it with a decent speed, and I do not understand what in my code is making the query so slow

    Below I coded some dummy tables "Operations" and "Names", the issue sits somewhere in the Fn_PnlwithNames function:

    let
        Operations =
            let
                References =  Table.FromColumns({{1..9999}}, type table [#"Ref"=Int64.Type]),
                Pnl_id= Table.AddColumn(#"References", "Pnl id", each 
                    if [Ref] <= 5 then Text.From([Ref]+200)
                    else Text.From([Ref]+100) & ":030|" & Text.From([Ref]+101) & ":070", type text )
            in Pnl_id,
    
    
        Fn_PnlwithNames =
        //INPUT = "101:080|102:020"                or "101" (which means "101:100")
        // OUTPUT = "Mister101:080|Mister102:020"  or "Mister101"  
    
            (myInput as text) =>
                let
                     Names =
                        let
                            Source = Table.FromColumns({{100..999}}, type table [#"id"=text]),
                            ChType = Table.TransformColumnTypes(Source, {{"id", type text}}),
                            AddNames = Table.AddColumn(ChType, "Name", each "Mister" & Text.From([id]), type text)
                        in AddNames,
    
                    SINGLE_ALLOCATION = Table.SelectRows(#"Names", each [id]=myInput) [Name] {0}, 
    
                	MULTIPLE_ALLOCATION =
    		    let
    			Split = Text.Split(myInput, "|"),
    			ListTransform = List.Transform(Split, (_) =>
    				let
    					MySalesrepID = Text.Start(_,3),
    					myTable = #table(type table [a.id=text], {{MySalesrepID}} ),
    					Merge = Table.Join(myTable, {"a.id"}, #"Names", {"id"} ) [Name] {0},
    					ExtractSales = Merge,
                                            //Alternative version
    					//    ExtractSales= Table.SelectRows(Names, each [id]=MySalesrepID) [Name] {0}, 
    					Output = ExtractSales & Text.End(_,4)
    				in Output
    			),
    			ListCombine = Text.Combine(ListTransform, "|")
    		    in ListCombine,
    
    	        myResult = 
    		    if Text.Length(myInput)=3
    		    then SINGLE_ALLOCATION 
    		    else MULTIPLE_ALLOCATION
    
                in myResult,
    
    
        PnlwithNames =
            let
                Source = Operations,
                AddPnl = Table.AddColumn(Source, "Pnl", each Fn_PnlwithNames(Text.From([Pnl id])), type text)
            in AddPnl
    
    
    in PnlwithNames

    Obviously something in the function is not optimal. I also tried with Table.Buffer but it did not help.

    Thank you for your idea and explanations

    Anthony




    • Edited by anthony34 Monday, May 14, 2018 8:12 PM
    Monday, May 14, 2018 4:32 PM

Answers

  • Got it. I think then the biggest leak is a lot of joins - for each row in the table, it is too much for PQ.

    Tried to rework the whole task with the buttons (ok, except last step):

    // Operations
    let
        Operations =
            let
                References =  Table.FromColumns({{1..9999}}, type table [#"Ref"=Int64.Type]),
                Pnl_id= Table.AddColumn(#"References", "Pnl id", each 
                    if [Ref] <= 5 then Text.From([Ref]+200)
                    else Text.From([Ref]+100) & ":030|" & Text.From([Ref]+101) & ":070", type text )
            in Pnl_id
    in
        Operations
    
    // Names
    let
                Source = Table.FromColumns({{100..999}}, type table [#"id"=text]),
                ChType = Table.TransformColumnTypes(Source, {{"id", type text}}),
                AddNames = Table.AddColumn(ChType, "Name", each "Mister" & [id], type text)
            in 
                Table.Buffer(AddNames)
    
    // Pnl with Names
    let
        Operations =
            Operations,
        #"Added Custom" = Table.AddColumn(Operations, "ToList", each Text.Split([Pnl id],"|")),
        #"Expanded {0}" = Table.ExpandListColumn(#"Added Custom", "ToList"),
        #"Split Column by Delimiter" = Table.SplitColumn(#"Expanded {0}", "ToList", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), {"ToList.1", "ToList.2"}),
        #"Merged Queries" = Table.NestedJoin(#"Split Column by Delimiter",{"ToList.1"},Names,{"id"},"Names",JoinKind.LeftOuter),
        #"Expanded {0}1" = Table.ExpandTableColumn(#"Merged Queries", "Names", {"Name"}, {"Name2"}),
        #"Inserted Merged Column" = Table.AddColumn(#"Expanded {0}1", "Name", each Text.Combine({[Name2], [ToList.2]}, ":"), type text),
        #"Removed Other Columns" = Table.SelectColumns(#"Inserted Merged Column",{"Ref", "Pnl id", "Name"}),
        #"Grouped Rows" = Table.Group(#"Removed Other Columns", {"Ref", "Pnl id"}, {{"tmp", each Text.Combine(List.Sort(_[Name]), "|"), type text}}),
        #"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Ref", Order.Ascending}})
    in
        #"Sorted Rows"
    
    

    I think this should work a little bit faster since there is the only one join and grouping.


    Maxim Zelensky Excel Inside

    • Marked as answer by anthony34 Wednesday, May 16, 2018 8:29 AM
    Tuesday, May 15, 2018 4:20 PM

All replies

  • maybe I am oversimplifying this but wouldn't concatenating "Mister" & using Text.Replace to change "|" to "|Mister" do the job?
    Tuesday, May 15, 2018 9:36 AM
  • I do not understand why this task is so complex, may be, there shouldn't be just "Mister", but your Names table possible has a different names that you need to found via id? Please elaborate the task.

    If you need to add exactly the "Mister" to the input codes then there is a much shorter solution:

    let
        Operations =
            let
                References =  Table.FromColumns({{1..9999}}, type table [#"Ref"=Int64.Type]),
                Pnl_id= Table.AddColumn(#"References", "Pnl id", each 
                    if [Ref] <= 5 then Text.From([Ref]+200)
                    else Text.From([Ref]+100) & ":030|" & Text.From([Ref]+101) & ":070", type text )
            in Pnl_id,
    
        Names =
            let
                Source = Table.FromColumns({{100..999}}, type table [#"id"=text]),
                ChType = Table.TransformColumnTypes(Source, {{"id", type text}}),
                AddNames = Table.AddColumn(ChType, "Name", each "Mister" & [id], type text)
            in 
                Table.Buffer(AddNames),
    
        Fn_PnlwithNames =
        //INPUT = "101:080|102:020"                or "101" (which means "101:100")
        // OUTPUT = "Mister101:080|Mister102:020"  or "Mister101"  
    
            (myInput as text) =>
                let
                    Split = Text.Split(myInput, "|"),
                    ListTransform = List.Transform(Split, each "Mister"&_),
                    ListCombine = Text.Combine(ListTransform, "|")
                in 
                ListCombine,
    
        PnlwithNames =
            let
                Source = Operations,
                AddPnl = Table.AddColumn(Source, "Pnl", each Fn_PnlwithNames(Text.From([Pnl id])), type text)
            in AddPnl
    
    in PnlwithNames


    For the sample code provided the main reason of slowness, I think, that you create the Names table on the each call of this function.
    Move this part of the code outside of the function and test it:

                     Names =
                        let
                            Source = Table.FromColumns({{100..999}}, type table [#"id"=text]),
                            ChType = Table.TransformColumnTypes(Source, {{"id", type text}}),
                            AddNames = Table.AddColumn(ChType, "Name", each "Mister" & Text.From([id]), type text)
                        in AddNames,

    The other corrections are small, I think. For example, it seems that on this step you don't need to transform the [id] to the text again as you already converted it to the text type on the previous step:

    AddNames = Table.AddColumn(ChType, "Name", each "Mister" & Text.From([id]), type text)

    The same on the last step:

    AddPnl = Table.AddColumn(Source, "Pnl", each Fn_PnlwithNames(Text.From([Pnl id])), type text)
    
    Pnl id is already has the  text type.



    Maxim Zelensky Excel Inside

    Tuesday, May 15, 2018 10:36 AM
  • Thank you Maxim,

    The Names and Operations table are just dummy tables I created here quickly to make the query work. The real tables are external and have many more data. The MisterXXX are just replacing the real names.

    The purpose of the function is only to replace the Ids by the Respective Names in the code Id1:Pct1|Id2:Pct2|Id3:Pct3 (there is no limit for the number of IDs), using the Table Names [[id],[Name]]

    To simplify, Id and Pct have always 3 digits

    Example:

    101:020|102:030|556:050 => Paul:020|John:030|Kevin:050

    Where table Names gives:

    [[Id=101, Name=Paul], [Id=102, Name=John], [Id=556=Kevin]]

    The code to replace could be 101:100 or 101:050|102:050 or 101:050|102:040|103:010 or 101:010|102:040|103:020|108:030 etc ...

    The function needs to replace the Ids by the names, whatever how many Ids the is in the code

    As you suggested, moving the table Names out of the function definitively improved it. But it still slow for such an simple task.

    Tuesday, May 15, 2018 3:36 PM
  • the "MisterXXX" are only for example. See my explanations to Maxim below
    Tuesday, May 15, 2018 3:38 PM
  • Got it. I think then the biggest leak is a lot of joins - for each row in the table, it is too much for PQ.

    Tried to rework the whole task with the buttons (ok, except last step):

    // Operations
    let
        Operations =
            let
                References =  Table.FromColumns({{1..9999}}, type table [#"Ref"=Int64.Type]),
                Pnl_id= Table.AddColumn(#"References", "Pnl id", each 
                    if [Ref] <= 5 then Text.From([Ref]+200)
                    else Text.From([Ref]+100) & ":030|" & Text.From([Ref]+101) & ":070", type text )
            in Pnl_id
    in
        Operations
    
    // Names
    let
                Source = Table.FromColumns({{100..999}}, type table [#"id"=text]),
                ChType = Table.TransformColumnTypes(Source, {{"id", type text}}),
                AddNames = Table.AddColumn(ChType, "Name", each "Mister" & [id], type text)
            in 
                Table.Buffer(AddNames)
    
    // Pnl with Names
    let
        Operations =
            Operations,
        #"Added Custom" = Table.AddColumn(Operations, "ToList", each Text.Split([Pnl id],"|")),
        #"Expanded {0}" = Table.ExpandListColumn(#"Added Custom", "ToList"),
        #"Split Column by Delimiter" = Table.SplitColumn(#"Expanded {0}", "ToList", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), {"ToList.1", "ToList.2"}),
        #"Merged Queries" = Table.NestedJoin(#"Split Column by Delimiter",{"ToList.1"},Names,{"id"},"Names",JoinKind.LeftOuter),
        #"Expanded {0}1" = Table.ExpandTableColumn(#"Merged Queries", "Names", {"Name"}, {"Name2"}),
        #"Inserted Merged Column" = Table.AddColumn(#"Expanded {0}1", "Name", each Text.Combine({[Name2], [ToList.2]}, ":"), type text),
        #"Removed Other Columns" = Table.SelectColumns(#"Inserted Merged Column",{"Ref", "Pnl id", "Name"}),
        #"Grouped Rows" = Table.Group(#"Removed Other Columns", {"Ref", "Pnl id"}, {{"tmp", each Text.Combine(List.Sort(_[Name]), "|"), type text}}),
        #"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Ref", Order.Ascending}})
    in
        #"Sorted Rows"
    
    

    I think this should work a little bit faster since there is the only one join and grouping.


    Maxim Zelensky Excel Inside

    • Marked as answer by anthony34 Wednesday, May 16, 2018 8:29 AM
    Tuesday, May 15, 2018 4:20 PM
  • Your solution works like a rocket (about 100x faster)

    I made it far too complex on the row level, while the Table.Join/Table.Group on the column is so efficient, still not that obvious. I could not manage to make the

    Table.Group(#"step-1", {"Ref", "Pnl id"}, {{"tmp", each Text.Combine(List.Sort(_[Name]), "|"), type text}}),

    you suggested and which makes it possible. Thanks a million Maxim



    • Edited by anthony34 Wednesday, May 16, 2018 9:17 AM
    Wednesday, May 16, 2018 8:29 AM
  • You are welcome :)

    Maxim Zelensky Excel Inside

    Wednesday, May 16, 2018 2:42 PM