none
Organizing "Or" in Conditional Logic RRS feed

  • Question

  • Hi Everyone,

    I am wondering what is the best way to structure multiple "or" statements in a conditional logic expression.  The below query (together with its duplicate twin that I have to run at the same time) is working, but it is running quite slow (6+ minutes), and I am wondering if it is due to inefficient code.

    Here is the query, if there are any suggestions on re-writing the bottom part, they would be most welcome!

    Thank you to everyone reading!

    Jake

    let
    
    //Bring in the pendingData table
        Source = pendingData,
    
    //Set the lists necessary to complete new columns on conditional logic
        ms = Table.Column(pendingLegend, "Milestone"),
        typ = Table.Column(pendingLengend, "Type"),
        fc = Table.Column(pendingLegend, "Forecast Date Column"),
        sd = Table.Column(pendingLegend, "Status Date Column"),
        s = Table.Column(pendingLegend, "Status Column"),
        st1 = Table.Column(pendingLegend, "Status Trigger 1"),
        st2 = Table.Column(pendingLegend, "Status Trigger 2"),
        st3 = Table.Column(pendingLegend, "Status Trigger 3"),
        st4 = Table.Column(pendingLegend, "Status Trigger 4"),
        st5 = Table.Column(pendingLegend, "Status Trigger 5"),
        st6 = Table.Column(pendingLegend, "Status Trigger 6"),
        st7 = Table.Column(pendingLegend, "Status Trigger 7"),
        st8 = Table.Column(pendingLegend, "Status Trigger 8"),
        st9 = Table.Column(pendingLegend, "Status Trigger 9"),
        st10 = Table.Column(pendingLegend, "Status Trigger 10"),
    
    //Set a count variable that can be referenced in future expressions
        count = List.Count(fc),
    
    //Run conditional logic
        Custom = List.Accumulate({0..count-1}, Source, (state, current) => Table.AddColumn(state, "s"&Text.From(current), each if 
            Record.Field(_, s{current}) = st1{current} or 
            Record.Field(_, s{current}) = st2{current} or 
            Record.Field(_, s{current}) = st3{current} or 
            Record.Field(_, s{current}) = st4{current} or 
            Record.Field(_, s{current}) = st5{current} or 
            Record.Field(_, s{current}) = st6{current} or 
            Record.Field(_, s{current}) = st7{current} or 
            Record.Field(_, s{current}) = st8{current} or 
            Record.Field(_, s{current}) = st9{current} or 
            Record.Field(_, s{current}) = st10{current} then
                Record.Field(_, sd{current}) else Record.Field(_, fc{current})))


    • Edited by Jake Burns Friday, January 12, 2018 4:24 PM Clean Up
    Friday, January 12, 2018 4:24 PM

Answers

  • Hi Jake,

    it looks as if List.Contains could do a better job here:

    List.Contains( {st1{current}...st10{current}, Record.Field(_, s{current}) )

    But in general, the overall setting doesn't look ideal to me. Have you considered unpivoting your "Status Trigger"-columns and then merge both tables instead?


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    • Marked as answer by Jake Burns Friday, January 12, 2018 5:41 PM
    Friday, January 12, 2018 5:23 PM
    Moderator
  • Hi Jake,

    I omitted the Source step, which should precede the code that I posted. I assume that this step is referring to the other query named "pendingData."

    Source = pendingData,
    stNames = {"Status Trigger 1", "Status Trigger 2", "Status Trigger 3", "Status Trigger 4", "Status Trigger 5",
    "Status Trigger 6", "Status Trigger 7", "Status Trigger 8", "Status Trigger 9", "Status Trigger 10"},
    AddedCustom = Table.AddColumn(
    Source,
    "s" & Text.From(List.Count(fc)),
    each if List.Contains(Record.ToList(Record.SelectFields(_, stNames)), [Status Column]) then [Status Date Column] else [Forecast Date Column]
    )



    • Edited by Colin Banfield Saturday, January 20, 2018 1:48 AM
    • Marked as answer by Jake Burns Monday, January 22, 2018 4:36 PM
    Saturday, January 20, 2018 1:45 AM

All replies

  • Hi Jake,

    it looks as if List.Contains could do a better job here:

    List.Contains( {st1{current}...st10{current}, Record.Field(_, s{current}) )

    But in general, the overall setting doesn't look ideal to me. Have you considered unpivoting your "Status Trigger"-columns and then merge both tables instead?


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    • Marked as answer by Jake Burns Friday, January 12, 2018 5:41 PM
    Friday, January 12, 2018 5:23 PM
    Moderator
  • Hi Imke,

    Thanks for the List.Contains suggestion.  Seems like it will work really well.

    I pondered an unpivot based on your suggestion, but I don't think it's plausible since the list variables that the expression is calling are often referring to headers in the table.  Thus I don't think I could create a sensible key and perform the matches on a row level.  

    Jake

    Friday, January 12, 2018 5:41 PM
  • If a merge doesn't work, then you should try buffering all items/lists that go into the List.Accumulate-function, as they will be referenced multiple times.

    I never use List.Accumulate, as it is often slow. I found List.Generate much faster (at least for my use cases)


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Friday, January 12, 2018 5:48 PM
    Moderator
  • Hmmmm.  I am testing the List.Buffers now.  The query you just assisted me with is getting pulled into the COUNT query below (where it calls "pendingDeploy" in row 3).

    I am wondering if my slowness (which comes from running COUNT query) is due to the huge stack of lists that  you are speaking of.

        Counts = Table.AddColumn(FilterCol, "Counts", each 
            List.Count(List.FindText(List.Transform(
                Table.Column(pendingDeploy,
                    "s" & Text.From(List.PositionOf(Table.Column(pendingLegend, "Milestone"), [Milestone]))), 
                        each Text.From(Date.EndOfMonth(Date.From(_)))), [EOMonth]))),

    Jake

    Friday, January 12, 2018 6:08 PM
  • Cannot tell without the context and a proper understanding of what shall be achieved. Just from my experience, it looks "too cody" and my gut feel is that with a better modelling approach it should be faster (and simpler to code).

    If you would mock up some sample data I would have a look.


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Friday, January 12, 2018 6:19 PM
    Moderator
  • Thanks for the incredible offer!  I will strive to post it today.  It will take some time to randomize it (it's all confidential client data in a pretty complex flat file).  Jake
    Friday, January 12, 2018 6:27 PM
  • Hi Jake,

    If I'm correct in guessing what you trying to do, your code in indeed quite inefficient. Unless I'm mistaken, I think that what you need is something like the following:

    stNames = {"Status Trigger 1", "Status Trigger 2", "Status Trigger 3", "Status Trigger 4", "Status Trigger 5",
    	  "Status Trigger 6", "Status Trigger 7", "Status Trigger 8", "Status Trigger 9", "Status Trigger 10"},
    AddedCustom = Table.AddColumn(
                      Source, 
    		  "s" & Text.From(List.Count(fc)),
    		  each if List.Contains(Record.ToList(Record.SelectFields(_, stNames)), [Status Column]) then [Status Date Column] else [Forecast Date Column]
    	      )

    Friday, January 12, 2018 9:57 PM
  • "I never use List.Accumulate, as it is often slow. I found List.Generate much faster (at least for my use cases)"

    Interesting. In another thread, Marcel mentioned that he never uses List.Generate, unless an iteration cannot be done using List.Accumulate. For "for-each" type looping, List.Accumulate is more natural, since (unlike List.Generate), you a not creating a bunch of partial lists, only to select the final list result. When I switched from using List.Generate to using List.Accumulate, all of my scenarios showed no performance difference between the two functions. I'd love to see use cases where List.Generate is actually faster than List.Accumulate by a perceptible amount. Of course I still use List.Generate for "do/while" type looping.

    Friday, January 12, 2018 10:17 PM
  • Hi Colin,

    maybe your lists weren't long enough :-)

    Check out these codes for running totals:

    // ListAcumFullTest
    let
        Source = {1..5000},
        Custom1 = List.Buffer(Source),
        #"Added Custom" = List.Skip(List.Accumulate(Custom1,{0}, (ResultSoFar, Current) => ResultSoFar & {List.Last(ResultSoFar) + Current}),1)
    in
        #"Added Custom"

    // ListGenerateFullTest
    let
        Source = {1..5000},
        Custom1 = List.Buffer(Source),
        #"Added Custom" = List.Skip(List.Generate( ()=> [List=Custom1{0}, Counter=0], 
                                                    each [Counter] <= List.Count(Custom1),
                                                    each [List = Custom1{[Counter]}+[List],
                                                          Counter = [Counter]+1],
                                                    each [List]
                                                    ),1)
    in
        #"Added Custom"


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Saturday, January 13, 2018 8:27 AM
    Moderator
  • "maybe your lists weren't long enough :-)"

    Actually, none of the List.Accumulate solutions that I've created require using or returning long lists (most don't return a list at all). Check any solution that I've posted in this forum using List.Accumulate for examples...

    Now your very specific example is interesting, since it very clearly demonstrates the advantage of using List.Generate (more on this later).

    The way that I've used List.Accumulate to do running totals is similar to the way most people do running totals on tables, i.e. using an index with List.FirstN:

    let
        Source = List.Buffer({1..5000}),
        RunningTotal = List.Accumulate(List.Positions(Source), {}, (accumulated, current) => accumulated & {List.Sum(List.FirstN(Source, current + 1))})
    in
        RunningTotal

    With 5,000 list elements, I found no discernable difference compared to the List.Generate function. As the list grows, the List.Accumulate solution becomes slower, but with 40,000 elements, the processing time was less than two seconds on my system. So actually List.Accumulate isn't too bad if you don't expect to work with very large lists. The main advantage of List.Accumulate over List.Generate is that the code is far more readable.

    Having said the foregoing, the List.Accumulate solution is unusable with larger lists because you have to first create a list to hold the running totals, and each calculated running total is added to the stack (which inevitably slows the process down as the list becomes larger). On my system, the stack space was exhausted somewhere between 43,000 and 44,000 list elements.

    On the other hand, List.Generate simply calculates running totals for the list elements and adds each total to the list being generated. The operation does not use any stack space and thus the resultant list is generated blazingly fast. There is no discernable difference between calculating the running total for 5,000 elements or for 100,000 elements! It's quite an instructive use of List.Generate, and something everyone should be aware of. 

    By the way, there is a slight error in your List.Generate function. The initial value for "List" should be 0, and not Custom1{0}.


    Monday, January 15, 2018 4:47 PM
  • Thanks Colin,

    good insights and impulse: Will blog about it soon.


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Tuesday, January 16, 2018 8:59 AM
    Moderator
  • Hi Imke!

    Thank you again for the offer to look at a sample.  After talking to my boss, I'm not able to post example, even randomized.  However, your fix helped me get it done and delivered.  There is one odd kink I never figured out:

    • This tool imports a flat file from my desktop.
    • That flat file has multiple tabs (with the same basic structure and size)
    • The tools works FAST with data from tab #1 (like 10 seconds).
    • The tool seems to choke (i.e. take 6 minutes to run) (but it works) with data from tab #2.
    • After racking my brain for days, studying columns, data types, etc. etc. etc. I have NO explanation.

    I am expecting a shrug as an answer, but just thought I'd throw it out there in case you have experienced before.  Everything works good, at similar speed, until the final COUNT function.  Very strange.

    No response needed if you don't have immediate idea.  THANK YOU AGAIN.

    Jake


    • Edited by Jake Burns Saturday, January 20, 2018 1:06 AM Clean
    Saturday, January 20, 2018 1:05 AM
  • Hi Sir!

    Sorry for the late response!  I am not sure how your stNames step would work above, since the "Status Trigger 1", etc. variables are actually columns from a table in another query.  Does that make sense?

    Jake

    Saturday, January 20, 2018 1:07 AM
  • Hi Jake,

    I omitted the Source step, which should precede the code that I posted. I assume that this step is referring to the other query named "pendingData."

    Source = pendingData,
    stNames = {"Status Trigger 1", "Status Trigger 2", "Status Trigger 3", "Status Trigger 4", "Status Trigger 5",
    "Status Trigger 6", "Status Trigger 7", "Status Trigger 8", "Status Trigger 9", "Status Trigger 10"},
    AddedCustom = Table.AddColumn(
    Source,
    "s" & Text.From(List.Count(fc)),
    each if List.Contains(Record.ToList(Record.SelectFields(_, stNames)), [Status Column]) then [Status Date Column] else [Forecast Date Column]
    )



    • Edited by Colin Banfield Saturday, January 20, 2018 1:48 AM
    • Marked as answer by Jake Burns Monday, January 22, 2018 4:36 PM
    Saturday, January 20, 2018 1:45 AM
  • OK, now I see what you did.  Definitely shortens the code up, much more sensible.

    Thank you sir!

    Jake

    Monday, January 22, 2018 4:36 PM