none
Power Query is sooooooo slow!!!! RRS feed

  • Question

  • When I work in the preview - I am expecting the next step to apper instantly

    Instead it is reading ALL THE FILES' DATA OVER AGAIN

    Why?! Soo slowwww and not efficient.

    Please fix - only when I click "Refresh data" - should it do all the process over...

    Please fix, please!

    Thank you

    Michael


    Michael

    Wednesday, August 22, 2018 7:38 PM

Answers

All replies

  • Hi Michael. What data source(s) are you pulling from? CSV files on SharePoint? Excel files on a network folder? Something else?

    Thanks,

    Ehren

    Wednesday, August 22, 2018 8:13 PM
    Owner
  • Excel files

    Very few (5-8) and very small (up to 1 Mb)

    What bothers me is that every time I add a step, just a simple step like "change a column name" or multiply a column by a number - it starts to load all the files AGAIN (it shows this at the bottom).

    I already have all the preview I need. I just want to rename one column, not to load everything over again...

    I find myself WAITING on PowerQuery 80% of my time!!! 80% of my time!!! 80% WAITING!!!

    Don't get me wrong - I LOVE you, guys and I admire all the innovation you make with Power BI, Query, etc...

    But my User Experience of WAITING for Power Query is AWFUL.

    One day I will become a Program Manger of all Microsoft's Data Platform products and I we will definitely focus on the UX.

    Anyways - can you help, please?

    Thank you!

    Michael


    Michael

    Wednesday, August 22, 2018 8:37 PM
  • Hi Michael. Thanks for the additional detail. Are these files .xlsx, .xls, or .xlsb?

    Ehren

    Wednesday, August 22, 2018 9:22 PM
    Owner
  • Ehren, thank you for your prompt response!

    I've doubled checked - there are 6 .xlsx files in network folder, each file is 4.3 Mb and there are 2 more .xlsx files: one is tiny 70 Kb and the other one is 10Mb .xlsx in other network folder

    I am running Excel 2013 with Power Query Add-on Version 2.55.5010.1402 64-bit

    Thank you for help!!!

    I'll buy you a beer if you solve this!

    Michael


    Michael

    Wednesday, August 22, 2018 9:51 PM
  • Hi Michael. Can you try downloading the latest version of the PQ plugin? If that doesn't improve things, can you record a video of what you're seeing and share a link to it here? For example, renaming a column shouldn't cause PQ to re-pull the underlying data. If you're seeing something different, we'd like to get to the bottom of it.

    Also, if you could share the M formula text of your queries that would also be helpful. You can copy this from the Advanced Editor in the Power Query Editor.

    Ehren

    Thursday, August 23, 2018 5:13 PM
    Owner
  • Hi Ehren,

    Thank you very much for trying to help.

    Because the files contain sensitive data, I cannot sent them as is. But since you are really want to get to the bottom of it - I will gladly invest two-three hours to change the numbers to random and sent the folder to you so you can improve the Power Query product. I'll get to it this week

    Thanks again,

    Michael


    Michael

    Sunday, August 26, 2018 5:23 AM
  • Hi Michael. Were you able to come up with an anonymized version of your scenario?

    Thanks,

    Ehren

    Thursday, August 30, 2018 11:20 PM
    Owner
  • Not yet - I am working to finish the project for my client and then I can change the actual data for anonymized.

    I had to build all from the scratch and to separate the queries to different files :(

    It worked OK at the beginning but then - stuck again... waiting and waiting for each step for several minutes...

    Nightmare!

    I've noticed though that the slowdown began when I "Duplicated" two of my queries.

    Maybe a DUPLICATION of a query creates some sort of dependency bug? 

    Michael


    Michael

    Friday, September 14, 2018 4:20 PM
  • One thing to try would be to Reference your queries instead of duplicating them. Duplicating them may end up causing PQ to do duplicate work.

    Ehren

    Friday, September 14, 2018 4:55 PM
    Owner
  • Hi,

    I'm experiencing the exact same scenario as OP. Like the OP, my files are private.

    I have a query which loads files from CSV (about 5 of them). They are large but not exceptionsally so - about 100,000 rows and 100 columns.  Call this query  "CSVExtracts".

    My main quesy is  starting with = CSV Extracts.

    Then does a bunch of steps, deleting columns, filtering rows, deriving some new columns.

    Bizarrely, this used to work OK, now it seems to grind to a halt.

    Just like OP says, making aby change even renaming a column, or even renaming a step in the visual editor just make thes whole thing bog down My memory (maybe unreliable, it used to work fine, now it just doesn't).

    PQ has now become unusable.

    I will make another thread if needed, but the symptom is exactly like those described so it seems sensible to keep it here.

    Friday, October 19, 2018 9:37 AM
  • Should have added ;

    Excel 2016 v 16.0.4738 (which looks like a recent update to me ? )  64 bit version

    Running on Win 7 64-bit on regular i5 laptop with 8GB RAM.

     

    Friday, October 19, 2018 10:01 AM
  • Hi there. Can you share the version of PQ you're running? To find it, do the following:

    • Click the Data tab of the ribbon
    • Click Get Data (or New Query, if you are running Excel build prior to 8067.2115) > Query Options
    • Click Diagnostics. You should see the Power Query version printed there.

    Thanks,
    Ehren


    Friday, October 19, 2018 4:15 PM
    Owner
  • The optionNew Query/Query Options/DIagnostics/Version returns :

    2.61.5192.461 64 bit

    By the way this is a corporate licence of Office Pro Plus, it is running on locked-down laptop, and the security patches etc are out of my control.

    Friday, October 19, 2018 10:11 PM
  • I went back and looked at some other examples, including the original version of this same query (which only operated on data loaded from 1 file ). I'm finding the same thing - excruciatingly slow performance in the Preview within Power Query Editor.

    Looking some more, I found a process MIcrosoft.Mashup.Container.NetFX40.exe which is dominating CPU when the PQE is open.

    Some more investigations reveal

    https://community.powerbi.com/t5/Desktop/PowerBI-Desktop-CPU-Memory-Hog/m-p/13942/highlight/true

    which suggests to turn off the option

    'Allow data prrview to download in the background'.

    I've now done that and the file I am interested in seems to be performing much better. Still not great though, given I'm not really doing anything very intensive (once I load the data I am filtering to just a few 100 or 1000 rows, and adding new columns etc). e.g. takes about 10 seconds if I change the parameter of a integer divide step

    Also not clear why this behaviour wasn't happening and then started to happen. Could this be because a new version of Excel 2016 somehow changed default settings.

    Sunday, October 21, 2018 9:41 PM
  • I have noticed the same behavior.  And it gets slower with each action.  Since PQ is a recorder/interpreter and must be running from some sort of stack, perhaps the stack gets copied onto the stack and then the change is added to the top.  In the olden days we might have called that memory leakage.

    Right now I am hung in Excel 365 with a "Document Recovery" screen showing over my sheet.  The Query Editor is running but it is greyed-out.  Task Manager says 2 to 10% cpu and 48% Memory.  

    I suspect I will have to kill excel to get out.


    elc

    Monday, October 22, 2018 9:25 PM
  • I am having a terrible time with the refreshes.  I have buffered as many tables as possible, broken down queries into small steps, Turned off Background Refresh.  Tried Fast Data Load, didn't do much.  Turned off GPU Hardware Acceleration.  Only have 1 Excel file open.  Outlook and SFB are running too. 

    Even a simple change, like a field name, causes the preview to be regenerated.  I agree on the 80% waiting time, CPU = 100%, Memory typically 90%...this particular project has cost me 2 full weekends plus every evening and I'm still not done.

    Using Office 365, Semi-Annual Channel, version 1803 (Build 9126.2315 Click to Run), Lenovo T470, 2 cores, 4 processors, 3.49 GHz, 8 Gb RAM, HDD: 239GB SSD.   

    Update:  By using the SplitCamelCase (see (https://datachant.com/2018/02/14/split-camelcase-headers-m/) function wrapping technique I was able to get some performance improvements.
    • Edited by JVD_123 Wednesday, December 12, 2018 12:45 AM Updated information
    Tuesday, December 11, 2018 10:30 PM
  • Hi JVD_123. Sorry to hear about your refresh trouble. Can you clarify what you're seeing when you change a field name? Is it when the field name changes in the data source, or when you rename a column in the PQ Editor? If it's the latter, the change should be nearly instantaneous.

    If you'd like to share more details of your scenario privately, please PM me on the Power BI Community.

    Ehren

    Wednesday, December 12, 2018 6:40 PM
    Owner
  • I'm so glad to see that I'm not the only one experiencing this. I have also spent many hours just waiting. It's very frustrating. I'm not sure what the issue is because it wasn't so bad at first, but after reading through some comments I wonder if it had something to do with duplicating a query. I have 2 folders with date/country/product data where I need 3 columns from one type of file and one column from the other. After setting up a fantastic query to get the data just right from the first folder, I duplicated the query and changed the source and one step in the query (remove other columns). I feel like this might be when it started becoming ridiculously slow. Changing anything in the PQ editor is now EXCRUCIATING. Refreshing or even opening the PQ editor, I can see it loading and it seems to be going through all of the files multiple times. I did just do a Windows update, I don't know if that might have something to do with it. Anyhow, glad that it's not just me because I'm new to this and still learning the ins and outs.
    Tuesday, January 8, 2019 4:23 PM
  • Hi EmmaBDub. What kind of files do these folders contain? Are you combining the data from these two folders (via Merge or Append), or just pulling them in separately? How many files does each folder contain, and how large are the files in total?

    Thanks,
    Ehren

    Tuesday, January 8, 2019 5:41 PM
    Owner
  • Hi,

    Just want to follow this one up as I have the same issue with my excel/query editor... my query was running quite well without any problems, but has become so slow from one point to another... i even cannot open the query editor correctly, the window only shows the ribbon, the rest is white and i cannot do anything...

    It also seems to be the Microsoft.Mashup.Container.NetFX40.exe which completely dominates my CPU ...

    Connected databases are Access (OLEDB) and some excel files. I don't use duplicate queries, but have a few merging steps included... I disabled "allow data preview to download in the background", enabled "fast data load", and i don't load data to the data model.

    I feel like this has changed since an update (Excel or PBI), but i am not sure... My current version is:

    Excel 2016 (16.0.10730.20280) 64-bit
    Query Editor 2.61.5192.1301 64-bit

    Any kind of help would be much appreciated!!

    Cheers,

    Matt

    Tuesday, March 19, 2019 3:48 PM
  • Hi Matt. If you feel comfortable sharing traces, please capture them and ping me. I can try analyzing them to see what's taking so long.
    Friday, March 22, 2019 11:25 PM
  • Thanks...

    In the meantime I have found the issue which was actually one of my add-ins in Excel: Microsoft Power BI Publisher for Excel. I disabled that and now it works much better again.

    In general, when it comes to performance of Power Query I feel like the following things are worth to consider (as far as I know, please feel free to comment or add...)

    - query options: disable "allow data preview to download in the background", enable "fast data load", and do not "load data to the data model" if not necessary (note that this is set default in case of multiple queries)... automatic type detection could be disabled too if you don't need all columns...

    - watch your code and the connections between tables... clean code, avoid duplicate steps, try to reduce data and do steps only for rows or columns needed for your purpose, for instance change type, merging data...

    - consider office 64-bit

    - and do not use the PBI publisher add-in ;-)

    Cheers!

    Matt

    Tuesday, April 16, 2019 10:56 AM