none
Is there some way to use VBA to capture the finish of refreshing a query? RRS feed

  • Question

  • Hi,

    I am using Power Query to do some complex mapping process. The mapping rule table (T1) contains thousands of rules. The data set (D1) needs to go through each line of T1.

    Because of the complexity of T1, the main query (Q1, which contains the function that iterates through the mapping rules) runs forever.

    Then I remembered using 

    Source=Excel.CurrentWorkbook(){[Name="Table1"]}[Content] //Method1

    is faster than

    Source=Table1 //Method2

    So I redesign Q1. In Q1, I use Source=Excel.CurrentWorkbook(){[Name="Q1"]}[Content]

    In this case, I only need to refresh Q1 for multiple times to iterate.

    Then I split the mapping rule table into 15 tranches. D1 will go through the first tranche first, refresh Q1, and load the result (R1) to Excel workbook. Using R1 as the input source of the second tranche, refresh the Q1, and load the result (R2) to Excel workbook. Repeating this process for 15 times (total around 25 mins), I finally got my result. (Cannot imagine how long it needs if I run the mapping process only once. Not to mention how easily Excel can crash when refreshing the query.)

    Here comes the question. Manually refreshing the same query for 15 times is not a good solution for a user. Is there some way to use VBA to capture the finish of refreshing a query? If yes, then we can use VBA to do the iteration.

    Thanks

    Wednesday, May 11, 2016 8:55 PM

Answers

  • I'd really like to take a look at the details of what these queries are doing. Could you send a Frown that includes both methods, indicating which query represents which method, and referencing my name in the Frown email?

    Thanks,

    Ehren

    Thursday, May 12, 2016 11:16 PM
    Owner

All replies

  • Why are you referencing the results of other queries using "From Table" instead of directly referencing the queries themselves?

    If you just use direct query references, the user should only have to refresh once.

    Ehren

    Thursday, May 12, 2016 7:02 PM
    Owner
  • Because 

    Source=Excel.CurrentWorkbook(){[Name="Table1"]}[Content] //Method1 

    is much faster than 

    Source=Table1 //Method2

    If Table1 (the query) is complex, method 1 may be 10 times faster than method 2.

    Thursday, May 12, 2016 8:10 PM
  • But doing Method1 requires that a refresh has run to populate Table1. I would expect that the time of running the two refreshes would be roughly equal to using Method2. Is that not the case?

    Ehren

    Thursday, May 12, 2016 8:12 PM
    Owner
  • Hi Ehren,

    No. For example, Table 1 takes 30s to refresh. After Table 1 is refreshed, I use Method1 to refresh the next query, which takes just 5s. If I use Method2, meaning refreshing only one query. This query will take more than 2 mins, sometimes even up to 4 mins, depends on the complexity of the query.

    Thursday, May 12, 2016 8:22 PM
  • That's really strange. Would you mind posting your formulas here?

    Ehren

    Thursday, May 12, 2016 8:24 PM
    Owner
  • I can say some basic logic. And I just tested it using a small sample of data.

    Q1 is the one retrieving data from some csv files and do some ETL work locally. Q1 takes 6s to load to Excel Workbook. Q2 takes a local table from Excel, and joins with Q1 (Excel.CurrentWorkbook function), and loads to Excel Workbook. Q2 takes 7s to refresh. Q3 uses Q2 as input (Excel.CurrentWorkbook function), and runs through a function (loop), and loads to Excel Workbook. Q3 takes 26s to refresh. Q4 joins Q2 and Q3 (both from Excel.CurrentWorkbook function). Q4 takes 5s. There are a lot of ETL work (Renaming, Reordering, Adding new columns) going on in those 4 queries.

    However, if I remove all the Excel.CurrentWorkbook functions, and refresh Q4, it takes around 2mins.

    BTW, Q2, Q3 are loaded to Excel because of user's request. User wants to see these intermediate steps.

    Thursday, May 12, 2016 9:09 PM
  • Here is another example. I found out a way to use VBA to capture the finish of the refreshment of a query.

    As I mentioned in the question, I split the mapping rule table into 15 tranches. Then I use the below table to control. The "x" in column check will move down every time the refreshment is done. The StartRow and NumofRows include the parameters for Table.Range(MappingTable,StartRow ,NumofRows). 

    StartRow NumofRows Check
    0 17 x
    17 249
    266 248
    514 43
    557 43
    600 43
    643 43
    686 43
    729 43
    772 43
    815 43
    858 43
    901 43
    944 197
    1141 1176

    Using VBA to create a for loop to refresh the query 15 times. Below are the results. The second column includes the timestamp each time the refreshment is done. The third column is the time needed for each refreshment. We can see that the whole process takes 65s. However, if I just run through the mapping table once as a whole, it takes 4 mins and 35s.

    Beginning Time 16:59:52  
    Finish time 1 16:59:59 0:00:07
    Finish time 2 17:00:04 0:00:05
    Finish time 3 17:00:08 0:00:04
    Finish time 4 17:00:13 0:00:05
    Finish time 5 17:00:18 0:00:05
    Finish time 6 17:00:23 0:00:05
    Finish time 7 17:00:27 0:00:04
    Finish time 8 17:00:32 0:00:05
    Finish time 9 17:00:37 0:00:05
    Finish time 10 17:00:42 0:00:05
    Finish time 11 17:00:45 0:00:03
    Finish time 12 17:00:48 0:00:03
    Finish time 13 17:00:52 0:00:04
    Finish time 14 17:00:54 0:00:02
    Finish time 15 17:00:57 0:00:03

    I think this is a good indication.

    Thursday, May 12, 2016 10:13 PM
  • I'd really like to take a look at the details of what these queries are doing. Could you send a Frown that includes both methods, indicating which query represents which method, and referencing my name in the Frown email?

    Thanks,

    Ehren

    Thursday, May 12, 2016 11:16 PM
    Owner
  • Hi Ehren,

    Just sent you the email.

    Thanks

    Friday, May 13, 2016 6:47 PM
  • Can confirm these finding. Happens so often - named it "No way to break the refresh-chain". Also buffering doesn't help.

    Here is another example where the time difference factor actually grows (> x10), the more rows you use (so exponentially): https://social.technet.microsoft.com/Forums/en-US/0192d0da-8e2a-4a39-afed-3aae2fa31812/rsi-relative-strength-index-formula-to-powerpivot?forum=sqlkjpowerpivotforexcel


    Imke Feldmann TheBIccountant.com

    Saturday, May 14, 2016 5:43 AM
    Moderator
  • When querying SQL-source, this might be seen as a feature, but it might return unexpected results if you're not aware of them. In another example I created an Index-column in the first Source-query. When referenced by different subsequent queries, those Indizes where completely different in these referencing queries. This made it obvious to me, that the results haven't been cached but the source query ran repeatedly for every different referencing sub-query.

    https://social.technet.microsoft.com/Forums/en-US/30e9ab27-23a5-465b-a0c4-36e4e48ce2db/bug-or-feature-when-querying-sqlserver-data?forum=powerquery


    Imke Feldmann TheBIccountant.com

    Saturday, May 14, 2016 6:05 AM
    Moderator