none
Power Query - extremely long load times after simple filters applied RRS feed

  • Question

  • Hi all,

    Having a really tough time with Power Query. Does anyone know why I would be experiencing extremely slow filtering? I've connected to an oracle database with 7,000+ tables.

    I'm only concerned with one of these tables (50,000 rows) but when I try to apply simple filters to the columns, It's taking 20 minutes to an hour for each filter. The more filters I apply, the slower it gets. Also, loading to the workbook takes a crazy long time.

    Is this normal? I don't see anyone else complaining about this so I'm sure I'm doing something wrong. Is PQ trying to filter through all those 7,000+ tables?

    let
    Source = Oracle.Database("ORCL"),
    #"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Name], "Database.Customer_Table")),
    Database.Customer_Table = #"Filtered Rows"{[Schema="Database",Item="Customer_Table"]}[Data],
    #"Filtered Rows1" = Table.SelectRows(Database.Customer_Table, each not Text.Contains([SO_NUMBER], "S-54398") or not Text.Contains([SO_NUMBER], "S-50208")),
    #"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each not Text.Contains([SO_NUMBER], "S-50204") or not Text.Contains([SO_NUMBER], "S-50205"))
    in
    #"Filtered Rows2"

    Thanks for any advance. This is driving me nuts.
    Saturday, November 7, 2015 12:13 AM

Answers

  • Does this query already perform badly, or does performance only get worse with a lot more filters? This SQL is exactly what I would expect to see if Power Query is pushing the filter back to the database; if this query performs badly there's nothing much that Power Query will be able to do.

    As an alternative, I suggest you create two tables in Excel: one containing all of the customer names you want to exclude, and one containing all the SO Numbers you want to exclude. Next, create two Power Query queries to load all the data from these tables. Finally, create a query that gets all the data (with none of the filters) from the table in the query above and do a merge (https://support.office.com/en-us/article/Merge-queries-Power-Query-fd157620-5470-4c0f-b132-7ca2616d17f9?ui=en-US&rs=en-US&ad=US) between each of the "exclude" queries and the third query, using an anti-join (you'll need a recent update of Power Query to see this option: http://www.radacad.com/how-to-change-joining-types-in-power-bi-and-power-query) to remove all of the customer names and SO Numbers you don't want to see. This *may* be more efficient; it will probably be more convenient than creating lots of filter steps.

    Chris


    Check out my MS BI blog I also do SSAS, PowerPivot, MDX and DAX consultancy and run public SQL Server and BI training courses in the UK

    Wednesday, November 11, 2015 8:15 AM

All replies

  • No, it's not normal but there could be a valid explanation. My advice would be to talk to your DBA and ask to see the SQL that Power Query is generating when you run your query: that way you'll be able to see whether Power Query is querying data that it shouldn't be querying, or generating some inefficient SQL somewhere.

    Chrsi


    Check out my MS BI blog I also do SSAS, PowerPivot, MDX and DAX consultancy and run public SQL Server and BI training courses in the UK

    Monday, November 9, 2015 11:21 AM
  • I connected the oracle table I'm working with to Access in an attempt to clean up the data there before I import to PQ. I'm running into the same problem however. I can remove as much data as I want in Access and the query will load fine but, the table will load extremely slow or not at all in Power Query.

    PQ will load the query from Access without a problem if there are one or maybe two filters but will load extremly slowly or not at all if I apply more. SQL from Access below. Please help this is very frustrating as it's making Power Query useless. I need clean data that is refreshable and don't understand why Power Query would bog down like this.

    SELECT GUIDE_CUSTOMER_VIEW.CUST_NAME, GUIDE_CUSTOMER_VIEW.CUSTOMER_FULL_NAME, GUIDE_CUSTOMER_VIEW.ORIGIN, GUIDE_CUSTOMER_VIEW.PRODUCT, GUIDE_CUSTOMER_VIEW.SALES_PRODUCT_SPEC, GUIDE_CUSTOMER_VIEW.SALES_CONTRACT_STATUS, GUIDE_CUSTOMER_VIEW.INVOICED_QTY, GUIDE_CUSTOMER_VIEW.PO_NUMBER, GUIDE_CUSTOMER_VIEW.SO_NUMBER, GUIDE_CUSTOMER_VIEW.SO_INVOICED_QTY, GUIDE_CUSTOMER_VIEW.INV_DATE, GUIDE_CUSTOMER_VIEW.INVOICE_NO, GUIDE_CUSTOMER_VIEW.INV_PRICE, GUIDE_CUSTOMER_VIEW.INV_AMOUNT, GUIDE_CUSTOMER_VIEW.SO_INCO_LOCATION, GUIDE_CUSTOMER_VIEW.WH_REF

    FROM GUIDE_CUSTOMER_VIEW WHERE CUST_NAME <> “TOM” AND CUST_NAME <> “TOM-B” AND SO_NUMBER <> “S-59864” AND SO_NUMBER <> “S-50902”;

    Wednesday, November 11, 2015 3:21 AM
  • Does this query already perform badly, or does performance only get worse with a lot more filters? This SQL is exactly what I would expect to see if Power Query is pushing the filter back to the database; if this query performs badly there's nothing much that Power Query will be able to do.

    As an alternative, I suggest you create two tables in Excel: one containing all of the customer names you want to exclude, and one containing all the SO Numbers you want to exclude. Next, create two Power Query queries to load all the data from these tables. Finally, create a query that gets all the data (with none of the filters) from the table in the query above and do a merge (https://support.office.com/en-us/article/Merge-queries-Power-Query-fd157620-5470-4c0f-b132-7ca2616d17f9?ui=en-US&rs=en-US&ad=US) between each of the "exclude" queries and the third query, using an anti-join (you'll need a recent update of Power Query to see this option: http://www.radacad.com/how-to-change-joining-types-in-power-bi-and-power-query) to remove all of the customer names and SO Numbers you don't want to see. This *may* be more efficient; it will probably be more convenient than creating lots of filter steps.

    Chris


    Check out my MS BI blog I also do SSAS, PowerPivot, MDX and DAX consultancy and run public SQL Server and BI training courses in the UK

    Wednesday, November 11, 2015 8:15 AM