none
PowerBI - Help with a nested join RRS feed

  • Question

  • Hello:

    I am relatively new to PowerBI and am trying to optimize my queries. Can you please advise on one or both questions below?

    1. I have two external data sources, both SQL Server. I created two queries - one gets all students with a lastname starting with a letter and the second gets all students living in a certain state. I want to join the results of both queries which are pulling from the same table. The result of my query is as below:

    let
        Source = Sql.Database("server", "database", [Query="select * from students where state = 'ca'"]),
        Merge = Table.NestedJoin(Source,{"LastName"},NameQuery,{"LastName"},"NewColumn",JoinKind.Inner)
    in
        Merge

    The net result of this according to SQL profiler is that it pulls all 20,000 records for the name query and all 50,000 records for the state query and then appears to merge them in PowerQuery. I was expecting that the merge / join would pass this query down to the database and do the join there instead. Perhaps I missed a step in setting up the Query.

    2. Assuming #1 is possible, what if I wanted to achieve the same behavior with OData? Although I am very familiar with the OData filter syntax, how setting up the relationship in PowerQuery and pre-generating the query seems like it would not be possible but that's why I am looking for answers!

    Thanks.

    Thursday, February 12, 2015 10:59 PM

Answers

  • You can do a self join in a single query ...

    let
        Source = Sql.Database("server", "database"),
        dbo_Enrollments  = Source{[Schema="dbo",Item="Enrollments"]}[Data],
        dbo_Enrollments2 = Source{[Schema="dbo",Item="Enrollments"]}[Data],
    
        #"Filtered Rows" = Table.SelectRows(dbo_Enrollments, each ([campusname] = "Campus A"))
    
        Merge = Table.NestedJoin(#"Filtered Rows",{"rowid"},dbo_Enrollments2,{"rowid"},"NewColumn",JoinKind.Inner)
    in
        Merge

    Tuesday, February 17, 2015 2:47 PM
  • We never do joins against native query. You'd need to have defined the source entirely inside PQ without SQL text and let PQ do the query generation.

    With OData, we use the metadata document to drive query generation.

    Thursday, February 12, 2015 11:08 PM
  • Instead of merging 2 queries against the same SQL Server table, why not just modify your Table.SelectRows function to include both conditions ... somewhat like what is shown below:

     #"Filter Rows" = Table.SelectRows( dbo_Enrollments, each [CampusName] = "Campus A" and Text.StartsWith( [LastName], "B" ))

    Note: I'm not quite certain of the exact names of your table column names.


    Monday, February 16, 2015 10:47 PM
  • In addition to Curt's excellent commentary ...

    Power Query never attempts to "understand" or interpret your native SQL code ... so, it can't produce and send a single, unified query to your database management system.

    Thursday, February 12, 2015 11:50 PM

All replies

  • We never do joins against native query. You'd need to have defined the source entirely inside PQ without SQL text and let PQ do the query generation.

    With OData, we use the metadata document to drive query generation.

    Thursday, February 12, 2015 11:08 PM
  • In addition to Curt's excellent commentary ...

    Power Query never attempts to "understand" or interpret your native SQL code ... so, it can't produce and send a single, unified query to your database management system.

    Thursday, February 12, 2015 11:50 PM
  • Thank you for the response(s).

    Can we dive just a bit deeper? Instead of specifying a query, I added just the connection and database. In the navigator I chose a view and created a connection (only) from it. I filtered this view (in PQ) to only return columns that matched a given value. I then created another query using the same connection as above and did a merge against my first dataset on the rowkey column in order to get the filtered result. I did not specify any queries. The result was still two queries, one filtered one not versus a singular query joined at the keys.

    For OData I am gathering that you are saying that my service has to include the columns through something like the expand method. Is there any documentation you can point me to regarding query generation of OData endpoints in Power BI?

    Friday, February 13, 2015 2:22 AM
  • Regarding your Power Query against SQL Server ... would you please provide the Power Query Language code so that we might review it?
    Friday, February 13, 2015 4:08 PM
  • Thank you for your time; I apologize it took so long to respond:

    Query1:

    let
        Source = Sql.Database("server", "database"),
        dbo_Enrollments = Source{[Schema="dbo",Item="Enrollments"]}[Data],
        #"Filtered Rows" = Table.SelectRows(dbo_Enrollments, each ([campusname] = "Campus A"))
    in
        #"Filtered Rows"

    Query 2:

    let
        Source = Sql.Database("server", "database"),
        dbo_Enrollments = Source{[Schema="dbo",Item="Enrollments"]}[Data],
        Merge = Table.NestedJoin(dbo_Enrollments,{"rowid"},#"Enrollments for Campus A",{"rowid"},"NewColumn",JoinKind.Inner)
    in
        Merge

    Monday, February 16, 2015 10:01 PM
  • Instead of merging 2 queries against the same SQL Server table, why not just modify your Table.SelectRows function to include both conditions ... somewhat like what is shown below:

     #"Filter Rows" = Table.SelectRows( dbo_Enrollments, each [CampusName] = "Campus A" and Text.StartsWith( [LastName], "B" ))

    Note: I'm not quite certain of the exact names of your table column names.


    Monday, February 16, 2015 10:47 PM
  • Thanks Mark - ordinarily I would do this but am providing this as an example of merging two datasets.
    Monday, February 16, 2015 11:03 PM
  • You can do a self join in a single query ...

    let
        Source = Sql.Database("server", "database"),
        dbo_Enrollments  = Source{[Schema="dbo",Item="Enrollments"]}[Data],
        dbo_Enrollments2 = Source{[Schema="dbo",Item="Enrollments"]}[Data],
    
        #"Filtered Rows" = Table.SelectRows(dbo_Enrollments, each ([campusname] = "Campus A"))
    
        Merge = Table.NestedJoin(#"Filtered Rows",{"rowid"},dbo_Enrollments2,{"rowid"},"NewColumn",JoinKind.Inner)
    in
        Merge

    Tuesday, February 17, 2015 2:47 PM