locked
Recursive function very fast for 1-3 iterations, very slow (dies) at 9+ RRS feed

  • Question

  • Hi,

    I have a recursive function which maps a multicolumn general ledger table by a mapping table and outputs to a third output table showing the results (input, map results side by side).

    It is very fast for 1-3 row of mapping inputs, 4- 9 is tolerable, 9 + doesn't work.  Is there some way to clear the buffer each time the recursive function starts a new row?  All the data resides in the spreadsheet.

    Many thanks.

    The code I am using is

    let

    //Loop Through each line of the Mapping table
    //Filter the DataTable input columns
    //Add a temporary DataTable output Column
    //Combine the temporary and previous mapping update into Mapped Output table


      //Filter the DataTable based on the inputs from the Mapping Table
      TrialMapFunction = (DataTable as table, MappingTable as table, optional StartRow as number) =>

       let
         
        ActualRow= if (StartRow =null) then 0 else StartRow,

       FilteredData= FuncFilterDataFile(DataTable, MappingTable,ActualRow),

     AddCurMapCol_0 = Table.AddColumn(FilteredData, "CurLoopCo", each MappingTable{ActualRow}[newAccount]),
     AddCurMapCol_1 = Table.AddColumn(AddCurMapCol_0, "CurLoopAccount", each MappingTable{ActualRow}[newCo]), 
     AddCurMapCol_2 = Table.AddColumn(AddCurMapCol_1, "CurLoopGaapPC", each MappingTable{ActualRow}[newGaapPC]),
     AddCurMapCol_3 = Table.AddColumn(AddCurMapCol_2, "CurLoopStatPC", each MappingTable{ActualRow}[newStatPC]),
     AddCurMapCol_4 = Table.AddColumn(AddCurMapCol_3, "CurLoopPreCom", each MappingTable{ActualRow}[newPreCom]),
     AddCurMapCol_5 = Table.AddColumn(AddCurMapCol_4, "CurLoopReCat", each MappingTable{ActualRow}[newReCat]),
     AddCurMapCol_6 = Table.AddColumn(AddCurMapCol_5, "CurLoopReCo", each MappingTable{ActualRow}[newReCo]),
     AddCurMapCol_7 = Table.AddColumn(AddCurMapCol_6, "CurLoopTran", each MappingTable{ActualRow}[newTran]),
     AddCurMapCol_8 = Table.AddColumn(AddCurMapCol_7, "CurLoopGeo", each MappingTable{ActualRow}[newGeo]),
     AddCurMapCol_9 = Table.AddColumn(AddCurMapCol_8, "CurLoopCCtr", each MappingTable{ActualRow}[newCCtr]),
     AddCurMapCol_10 = Table.AddColumn(AddCurMapCol_9, "CurLoopPCtr", each MappingTable{ActualRow}[newPCtr]),
     AddCurMapCol_11 = Table.AddColumn(AddCurMapCol_10, "CurLoopLdg", each MappingTable{ActualRow}[newLdg]),
     AddCurMapCol_12 = Table.AddColumn(AddCurMapCol_11, "CurLoopAff", each MappingTable{ActualRow}[newAff]),
     //AddCurMapCol_3 = Table.AddColumn(AddCurMapCol_12, "CurLoopxxx", each MappingTable{ActualRow}[newStatPC]),
     //AddCurMapCol_3 = Table.AddColumn(AddCurMapCol_2, "CurLoopxxx", each MappingTable{ActualRow}[newStatPC]),
     //AddCurMapCol_3 = Table.AddColumn(AddCurMapCol_2, "CurLoopxxx", each MappingTable{ActualRow}[newStatPC]),
     //AddCurMapCol_3 = Table.AddColumn(AddCurMapCol_2, "CurLoopxxx", each MappingTable{ActualRow}[newStatPC]),

        RenameIndexForMerge = Table.RenameColumns(AddCurMapCol_12 ,{{"Index", "Loop.Index"}}),
        MergeMap_0 = Table.Join(DataTable,{"Index"},Table.SelectColumns(RenameIndexForMerge,{"Loop.Index","CurLoopAccount","CurLoopCo","CurLoopGaapPC", "CurLoopStatPC","CurLoopPreCom","CurLoopReCat","CurLoopReCo","CurLoopTran","CurLoopGeo","CurLoopCCtr","CurLoopPCtr","CurLoopLdg","CurLoopAff"}),{"Loop.Index"},JoinKind.FullOuter),

        CombineLoop_0 = Table.AddColumn(MergeMap_0, "LatestCo", each
               if [mCo] = "NoMapping" and [CurLoopCo] <> "" and [CurLoopCo] <> null
                 then
               [CurLoopAccount]
                 else
               [mCo]
                               ),

    //Repeats for each input Column

    // Each input can receive a different mapping combination for each subsequent row of mapping input

    // e.g  Co code 111 may be mapped to AAA, Co code 112 to ZZZ

        CombineLoop_1 = Table.AddColumn(CombineLoop_0, "LatestAccount", each
               if [mAccount] = "NoMapping" and [CurLoopAccount] <> "" and [CurLoopAccount] <> null
                 then
               [CurLoopCo]
                 else
               [mAccount]
                               ),

        CombineLoop_2 = Table.AddColumn(CombineLoop_1, "LatestGaapPC", each
               if [mGaapPC] = "NoMapping" and [CurLoopGaapPC] <> "" and [CurLoopGaapPC] <> null
                 then
               [CurLoopGaapPC]
                 else
               [mGaapPC]
                               ),

    // StatPC PreCom ReCat ReCo Tran Geo CCtr PCtr Ldg Aff 


        CombineLoop_3 = Table.AddColumn(CombineLoop_2, "LatestStatPC", each
               if [mStatPC] = "NoMapping" and [CurLoopStatPC] <> "" and [CurLoopStatPC] <> null

                 then
               [CurLoopStatPC]
                 else
               [mStatPC]
                               ),
            
        CombineLoop_4 = Table.AddColumn(CombineLoop_3, "LatestPreCom", each
               if [mPreCom] = "NoMapping" and [CurLoopPreCom] <> "" and [CurLoopPreCom] <> null
                 then
               [CurLoopPreCom]
                 else
               [mPreCom]
                               ),
            
        CombineLoop_5 = Table.AddColumn(CombineLoop_4, "LatestReCat", each
               if [mReCat] = "NoMapping" and [CurLoopReCat] <> "" and [CurLoopReCat] <> null
                 then
               [CurLoopReCat]
                 else
               [mReCat]
                               ), 
        CombineLoop_6 = Table.AddColumn(CombineLoop_5, "LatestReCo", each
               if [mReCo] = "NoMapping" and [CurLoopReCo] <> "" and [CurLoopReCo] <> null
                 then
               [CurLoopReCo]
                 else
               [mReCo]
                               ),
        CombineLoop_7 = Table.AddColumn(CombineLoop_6, "LatestTran", each
               if [mTran] = "NoMapping" and [CurLoopTran] <> "" and [CurLoopTran] <> null
                 then
               [CurLoopTran]
                 else
               [mTran]
                               ),
        CombineLoop_8 = Table.AddColumn(CombineLoop_7, "LatestGeo", each
               if [mGeo] = "NoMapping" and [CurLoopGeo] <> "" and [CurLoopGeo] <> null
                 then
               [CurLoopGeo]
                 else
               [mGeo]
                               ),
        CombineLoop_9 = Table.AddColumn(CombineLoop_8, "LatestCCtr", each
               if [mCCtr] = "NoMapping" and [CurLoopCCtr] <> "" and [CurLoopCCtr] <> null
                 then
               [CurLoopCCtr]
                 else
               [mCCtr]
                               ),
        CombineLoop_10 = Table.AddColumn(CombineLoop_9, "LatestPCtr", each
               if [mPCtr] = "NoMapping" and [CurLoopPCtr] <> "" and [CurLoopPCtr] <> null
                 then
               [CurLoopPCtr]
                 else
               [mPCtr]
                               ),
        CombineLoop_11 = Table.AddColumn(CombineLoop_10, "LatestLdg", each
               if [mLdg] = "NoMapping" and [CurLoopLdg] <> "" and [CurLoopLdg] <> null
                 then
               [CurLoopLdg]
                 else
               [mLdg]
                               ),
        CombineLoop_12 = Table.AddColumn(CombineLoop_11, "LatestAff", each
               if [mAff] = "NoMapping" and [CurLoopAff] <> "" and [CurLoopAff] <> null
                 then
               [CurLoopAff]
                 else
               [mAff]
                               ),
        //CombineLoop_13 = Table.AddColumn(CombineLoop_12, "Latestyyy", each
        //       if [myyy] = "NoMapping" and [CurLoopyyy] <> "" and [CurLoopyyy] <> null
        //         then
        //       [CurLoopyyy]
        //         else
        //       [myyy]
        //                       ),

        RemovePrevious = Table.RemoveColumns(CombineLoop_12,{"Loop.Index", "CurLoopAccount", "CurLoopCo", "CurLoopGaapPC", "CurLoopStatPC", "CurLoopPreCom", "CurLoopReCat", "CurLoopReCo", "CurLoopTran", "CurLoopGeo", "CurLoopCCtr", "CurLoopPCtr", "CurLoopLdg", "CurLoopAff", "mCo", "mAccount", "mGaapPC", "mStatPC", "mPreCom", "mReCat", "mReCo", "mTran", "mGeo", "mCCtr", "mPCtr", "mLdg", "mAff"}),
        result = Table.RenameColumns(RemovePrevious,{{"LatestCo", "mCo"}, {"LatestAccount", "mAccount"}, {"LatestGaapPC", "mGaapPC"}, {"LatestStatPC", "mStatPC"}, {"LatestPreCom", "mPreCom"}, {"LatestReCat", "mReCat"}, {"LatestReCo", "mReCo"}, {"LatestTran", "mTran"}, {"LatestGeo", "mGeo"}, {"LatestCCtr", "mCCtr"}, {"LatestPCtr", "mPCtr"}, {"LatestLdg", "mLdg"}, {"LatestAff", "mAff"}}),

          NextRow = ActualRow + 1,
      
          OutputTable =
            if NextRow > (Table.RowCount(MappingTable)-1)
              then result
            else
              @TrialMapFunction(result, MappingTable, NextRow)
       in
           OutputTable
    in
        TrialMapFunction

    Tuesday, August 25, 2015 8:19 AM

Answers

  • Hi Guys, thanks for your input, I picked up a clue from some blogs and I found that if I added Table.Buffer to line where the DataTable is passed back up to the recursive function it works a treat.

    @TrialMapFunction(Table.Buffer(result), MappingTable, NextRow).

    Speed is amazing.  5 secs for 35 mapping rows, 1550 Data Input rows - because of the <>*xx* flexible filtering, the mapping tables can be kept short and efficient and future proof. PQ is so cool.

    It feels like without Table.Buffer, M somehow keeps track of added columns and cleans up only when the final table is 'Loaded', Table.Buffer forces it to clean up the table at each loop shedding any unneeded baggage.

    I will try the append for filtered replace - might make it quicker still, probably simpler to follow too.

    Thanks for your efforts guys ;-)

    Tuesday, August 25, 2015 10:21 PM

All replies

  • Read through some of your code but it is a little difficult to understand what you are doing, can you post sample input and output data and a little more description of what you are trying to accomplish in the code above?
    Tuesday, August 25, 2015 2:18 PM
  • Hi Seth

    I have included three tables, GL to be mapped in Green, Mapping table in Brown, and the resulting output in Blue.  Hope this makes sense.

    The way it works is ,Read the input table and add the required output columns (prefixed with an 'm')  Passed to function as 'DataTable'.  The m columns start with a value of 'NoMapping' which is overwritten when a mapping is found.

    The left hand side of the mapping table shows the original GL combinations to search on, the right hand side (of the same mapping table) shows which 'm' columns to update when the matching row combination is found in the original GL table.

    Because there is no [Filter Replace] function in M language the only way to update only the rows that meet the original GL map sequence is to filter the original GL file, merge the filtered record set with the copy of the DataTable (original GL plus the target 'm' fields), add an additional 'm' equivalent populated with an results of the previous and the current mapping line iterations, deleting original 'm' field and then renaming the new 'mCurrent' to the original 'm' field.  The resultant table therefore has the result of all the mapping results to pf previous iterations.

    So for each mapping line the sequence is, Filter Original Input file, for each field with a string in the 'new' field of the mapping table - merge a column to the results table based on the common index, add an additional column to represent the combined new current results and previous results (from the 'm' of the results table), delete the original 'm' field and the merged field, then rename the combined field back to the original m field, then move on to the next mapping record and start again.

    The code is very sensitive to the number of mapping rows.  My feeling is that with all the columns being added somehow PQ tries to keep track of all the added columns until all the mapping rows are complete and then deletes the columns that are not needed at the end.  The original result table doesn't grow during the code.  The only thing that happens is that the 'NoMapping' is populated in all 'm' fields as below, the code replaces the 'NoMapping' with the values from the mapping table.

    This PQ does exactly what the user needs and replaces a difficult to use mainframe system, just need to work out why it chokes when the number of mapping combinations hits double digits - in practice there are likely to be 100- 200 mapping combinations of lines.

    Thanks for taking a look and for your help.

    Original input GL file to be mapped.

    Mapping file.  Orange is sequence to search for in the GL file, Brown (right hand side) is the value to update (replace the 'NoMapping' string) in the equivalent 'm' field of the Results file.

    Results file.  In this case every combination has been mapped all records in the input file had Co 234 and in the mapping row containing the Co = 234 filter all 'new/m' columns had a mapto string.

    Tuesday, August 25, 2015 4:35 PM
  • I am thinking that you could accomplish what you are trying to do by creating a Merge query step or a series of Merge queries steps. So, for example, you could create a Merge query step that Merged your input query with the mapping table matched up on GL_PROD_CODE. This would add the columns from the mapping table to your input table. Now, you would probably get a lot of irrelevant fields, but you could get rid of those with a Remove Columns step.

    It might be advantageous to create a query for each "type" of mapping where you pare down only the columns you want and for each you would filter out blank values in the mapping column. So, for example, create one for the mapping table for GL_PROD_CODE that filters out blank values from GL_PROD_CODE and removes all other irrelevant columns. Do the same thing in another query for GL_PROFIT_CTR.

    Then, create a series of Merge queries or Merge query steps that UNIONs all of these together to get a final table.

    This should get you what you want without the recursive code and, in my opinion, be easier and more understandable to maintain going forward without all of the code.

    Tuesday, August 25, 2015 4:50 PM
  • Would unpivot - translate - and then pivot back work here?


    Imke Feldmann TheBIccountant.com

    Tuesday, August 25, 2015 6:26 PM
  • One of the things I didn't mention is that the filters have wild cards such as

    =xx Equals xx
    <>=xx Does Not equal xx
    *xx*  Contains xx
    <>*xx* Does not Contain xx
    *xx Begins with xx
    <>*xx Does Not Begin with xx
    xx* Ends with xx
    xx*<> Does Not end with xx

    This means it is necessary to first filter the input table to identify which records satisfy the filters for each mapping line.  Maybe there is a concatenated unique key solution joining each table then merge? 

    Tuesday, August 25, 2015 6:31 PM
  • sounds good - create a concatenated key - perform your steps & then merge back on concKey to show your single cols for the Output table.

    Imke Feldmann TheBIccountant.com

    Tuesday, August 25, 2015 6:45 PM
  • Will have a go, can I just check one thing, this would be very straight forward if there was a 'filter a column' before performing a replace/update (update a column dependent on a value in a different column).  Is it right this doesn't exist? Is it planned for a future release?  Thanks for you attentions.

    Tuesday, August 25, 2015 7:26 PM
  • Not sure if I understood your request here, so to check: If you want to perform the replace/update operation just on selected rows you have basically 2 choices:

    1) like you did above: wrap your replace-Operation in an if-then-statement where the if checks the filter condition on that (other) column. So all rows will stay "visible", just the replacement will be conditional.

    2) filter on the rows to maintain for the replace-operation (include your row key/index!), perform replacements and then append (UNION) the previous step before you performed the filter: This will add the whole table before the filter/replacement steps have been performed. Then comes the trick: Remove duplicates on your index-columns. This will keep the first occurences of your index rows (the ones with the replacement) and only the not-replaced from the appdend. Sort and there you are :-)


    Imke Feldmann TheBIccountant.com

    Tuesday, August 25, 2015 7:43 PM
  • Hi Guys, thanks for your input, I picked up a clue from some blogs and I found that if I added Table.Buffer to line where the DataTable is passed back up to the recursive function it works a treat.

    @TrialMapFunction(Table.Buffer(result), MappingTable, NextRow).

    Speed is amazing.  5 secs for 35 mapping rows, 1550 Data Input rows - because of the <>*xx* flexible filtering, the mapping tables can be kept short and efficient and future proof. PQ is so cool.

    It feels like without Table.Buffer, M somehow keeps track of added columns and cleans up only when the final table is 'Loaded', Table.Buffer forces it to clean up the table at each loop shedding any unneeded baggage.

    I will try the append for filtered replace - might make it quicker still, probably simpler to follow too.

    Thanks for your efforts guys ;-)

    Tuesday, August 25, 2015 10:21 PM