locked
Pivot Help Please - Getting Unwanted Cascading Pattern RRS feed

  • Question

  • XLS here: http://1drv.ms/1EYDY6h

    This XLS is like a PowerQuery I have that pulls Search results from Office365 SharePoint's search API. That works fine.

    But, once i drill to the table, i get back 49 rows of data, that repeat for as many results as i have per my search result.

    To Pivot into a proper table, i've done this:

    * Create an Index column (as you can't pivot on data that isn't unique)

    * Pivot on Key column (new header) for the Values column.

    * Problem:  I get this cascading, waterfall, pattern of my data.  I can't figure out how to pull all the data up so that i end up with (in this example) only 3 rows and 49 columns.

    Hope someone can help.

    Thanks.


    Thursday, May 7, 2015 1:55 AM

Answers

  • I've only looked at the .xlsx and not the posted query (which looks quite a bit longer). What's missing is a way to group all of those values together. As it appears that each record consists of 47 values, I simply divide the added index by 47 to create a grouping:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
        Grouped = Table.TransformColumns(#"Added Index", {"Index", each Number.RoundDown(_ / 47)}),
        #"Pivoted Column" = Table.Pivot(Grouped, List.Distinct(#"Added Index"[Key]), "Key", "Value")
    in
        #"Pivoted Column"

    This approach only works if you can guarantee that the sets of data all have the same number of values. Otherwise, a more complicated approach is required.

    Thursday, May 7, 2015 1:20 PM
  • Yes, but maybe instead of dealing with the no of colums, we simply apply the index at a different stage:

    let
        Source = OData.Feed("<a href="http://.sharepoint.com/_api/search/query?querytext='Something'">http://<xxxxx>.sharepoint.com/_api/search/query?querytext='Something'"),
        PrimaryQueryResult = Source[PrimaryQueryResult],
        RelevantResults = PrimaryQueryResult[RelevantResults],
        Table = RelevantResults[Table],
        Rows = Table[Rows],
        #"Table from List" = Table.FromList(Rows, Splitter.SplitByNothing(), null, null),
        #"Expand Column1" = Table.ExpandRecordColumn(#"Table from List", "Column1", {"Cells"}, {"Column1.Cells"}),
            //Here lies the trick
        #"Added Index1" = Table.AddIndexColumn(#"Expand Column1", "Index", 0, 1),
        #"Expand Column1.Cells" = Table.ExpandListColumn(#"Added Index1", "Column1.Cells"),
        #"Expand Column1.Cells1" = Table.ExpandRecordColumn(#"Expand Column1.Cells", "Column1.Cells", {"Key", "Value"}, {"Column1.Cells.Key", "Column1.Cells.Value"}),
        #"Replaced Value" = Table.ReplaceValue(#"Expand Column1.Cells1",null,"N/a",Replacer.ReplaceValue,{"Column1.Cells.Value"}),
        #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","","N/a",Replacer.ReplaceValue,{"Column1.Cells.Value"}),
        #"Renamed Columns" = Table.RenameColumns(#"Replaced Value1",{{"Column1.Cells.Key", "Item"}, {"Column1.Cells.Value", "Value"}}),
        #"Pivoted Column1" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Item]), "Item", "Value")
    in
        #"Pivoted Column1"


    Imke

    Thursday, May 7, 2015 4:57 PM

All replies

  • BTW:  Instead of using the sample XLS above, here's the M-code with my tenant name removed.  If you have Office365 SharePoint, this should work for you as-is.  It searches for "something".

    You'll note the waterfall pattern that I'd like to eliminate.

    let
        Source = OData.Feed("http://<TenantName>.sharepoint.com/_api/search/query?querytext='Something'"),
        PrimaryQueryResult = Source[PrimaryQueryResult],
        RelevantResults = PrimaryQueryResult[RelevantResults],
        Table = RelevantResults[Table],
        Rows = Table[Rows],
        #"Table from List" = Table.FromList(Rows, Splitter.SplitByNothing(), null, null),
        #"Expand Column1" = Table.ExpandRecordColumn(#"Table from List", "Column1", {"Cells"}, {"Column1.Cells"}),
        #"Expand Column1.Cells" = Table.ExpandListColumn(#"Expand Column1", "Column1.Cells"),
        #"Expand Column1.Cells1" = Table.ExpandRecordColumn(#"Expand Column1.Cells", "Column1.Cells", {"Key", "Value"}, {"Column1.Cells.Key", "Column1.Cells.Value"}),
        #"Sorted Rows" = Table.Sort(#"Expand Column1.Cells1",{{"Column1.Cells.Key", Order.Ascending}}),
        #"Replaced Value" = Table.ReplaceValue(#"Sorted Rows",null,"N/a",Replacer.ReplaceValue,{"Column1.Cells.Value"}),
        #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","","N/a",Replacer.ReplaceValue,{"Column1.Cells.Value"}),
        #"Renamed Columns" = Table.RenameColumns(#"Replaced Value1",{{"Column1.Cells.Key", "Item"}, {"Column1.Cells.Value", "Value"}}),
        #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 0, 1),
        #"Pivoted Column" = Table.Pivot(#"Added Index", List.Distinct(#"Added Index"[Item]), "Item", "Value")
    in
        #"Pivoted Column"


    Thursday, May 7, 2015 2:09 AM
  • I've only looked at the .xlsx and not the posted query (which looks quite a bit longer). What's missing is a way to group all of those values together. As it appears that each record consists of 47 values, I simply divide the added index by 47 to create a grouping:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
        Grouped = Table.TransformColumns(#"Added Index", {"Index", each Number.RoundDown(_ / 47)}),
        #"Pivoted Column" = Table.Pivot(Grouped, List.Distinct(#"Added Index"[Key]), "Key", "Value")
    in
        #"Pivoted Column"

    This approach only works if you can guarantee that the sets of data all have the same number of values. Otherwise, a more complicated approach is required.

    Thursday, May 7, 2015 1:20 PM
  • Thanks.  I used your /47 trick and got it all in one query.

    Curt:  Instead of /47 (mines now 42, but...), why doesn't "/Table.ColumnCount(Table) work?  Then, it wouldn't matter how many elements are in the result, it'll repivot back out. ???  (I tried but got an error)

    BTW:

    Here's the finished code if someone wants it.

    let
        Source = OData.Feed("http://<sharepoint host>.com/_api/search/query?querytext='Something'&rowlimit=500"),
        PrimaryQueryResult = Source[PrimaryQueryResult],
        RelevantResults = PrimaryQueryResult[RelevantResults],
        Table = RelevantResults[Table],
        Rows = Table[Rows],
        #"Table from List" = Table.FromList(Rows, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Expand Column1" = Table.ExpandRecordColumn(#"Table from List", "Column1", {"Cells"}, {"Column1.Cells"}),
        #"Expand Column1.Cells" = Table.ExpandListColumn(#"Expand Column1", "Column1.Cells"),
        #"Expand Column1.Cells1" = Table.ExpandRecordColumn(#"Expand Column1.Cells", "Column1.Cells", {"Key", "Value"}, {"Column1.Cells.Key", "Column1.Cells.Value"}),
        #"Renamed Columns" = Table.RenameColumns(#"Expand Column1.Cells1",{{"Column1.Cells.Key", "Key"}, {"Column1.Cells.Value", "Value"}}),
        #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 0, 1),
        #"Pivoted Column" = Table.Pivot(#"Added Index", List.Distinct(#"Added Index"[Key]), "Key", "Value"),
        #"ReIndexed" = Table.TransformColumns(#"Added Index", {"Index", each Number.RoundDown(_ / 42)}),
        #"RePivoted" = Table.Pivot(#"ReIndexed", List.Distinct(#"Added Index"[Key]), "Key", "Value")
    in
        #"RePivoted"

    Thursday, May 7, 2015 2:38 PM
  • In your spreadsheet example, the source table has two columns and not 47.
    Thursday, May 7, 2015 3:29 PM
  • The XLS has a tab called SourceTable (which mimics what the M code above retrieves) - a 2 column stacked result set.  It has a tab called From PowerQuery to show the results of pivoting with that unwanted pattern.

    I included the XLS so that others didn't need to run the SharePoint query themselves to reproduce - though I provided the M code if you wanted to.  Just replace <tenant> with your office365 or sharepoint2013 server URL and it should work..

    From that 2 column result set, i continue to add to my PQ the steps to add an index so i can Pivot that stacked column into 47 (40-something, not sure the exact number) columns and X rows (the xls should have 3 as you fixed).

    Your solution worked great.  I was able to incorporate it into my main sharepoint search odata.feed query.

    However, per your comments about having to make sure you know the exact number of values to perform the /47 re-index trick to group the table; my question was, can you exchange the "\47" with "\Table.ColumnCount(table)" so that you don't have to know that exact number of values - let it be dynamic?

    Thursday, May 7, 2015 3:41 PM
  • Right, and I'm trying to explain that the answer is "no" by illustrating it with the example that your sample data had a column count of 2 -- and yet we needed to divide by 47.

    There are multiple ways to do it dynamically, but none is straightforward or efficient. One is to count the total number of rows and divide by the total number of occurrences of the value in the first row.

    Thursday, May 7, 2015 3:55 PM
  • Yes, but maybe instead of dealing with the no of colums, we simply apply the index at a different stage:

    let
        Source = OData.Feed("<a href="http://.sharepoint.com/_api/search/query?querytext='Something'">http://<xxxxx>.sharepoint.com/_api/search/query?querytext='Something'"),
        PrimaryQueryResult = Source[PrimaryQueryResult],
        RelevantResults = PrimaryQueryResult[RelevantResults],
        Table = RelevantResults[Table],
        Rows = Table[Rows],
        #"Table from List" = Table.FromList(Rows, Splitter.SplitByNothing(), null, null),
        #"Expand Column1" = Table.ExpandRecordColumn(#"Table from List", "Column1", {"Cells"}, {"Column1.Cells"}),
            //Here lies the trick
        #"Added Index1" = Table.AddIndexColumn(#"Expand Column1", "Index", 0, 1),
        #"Expand Column1.Cells" = Table.ExpandListColumn(#"Added Index1", "Column1.Cells"),
        #"Expand Column1.Cells1" = Table.ExpandRecordColumn(#"Expand Column1.Cells", "Column1.Cells", {"Key", "Value"}, {"Column1.Cells.Key", "Column1.Cells.Value"}),
        #"Replaced Value" = Table.ReplaceValue(#"Expand Column1.Cells1",null,"N/a",Replacer.ReplaceValue,{"Column1.Cells.Value"}),
        #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","","N/a",Replacer.ReplaceValue,{"Column1.Cells.Value"}),
        #"Renamed Columns" = Table.RenameColumns(#"Replaced Value1",{{"Column1.Cells.Key", "Item"}, {"Column1.Cells.Value", "Value"}}),
        #"Pivoted Column1" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Item]), "Item", "Value")
    in
        #"Pivoted Column1"


    Imke

    Thursday, May 7, 2015 4:57 PM
  • Most excellent.  Thanks all.

    This is potentially a huge time saver for our SharePoint team.  They get req's from folks asking for a "list of all content X in SharePoint, date, type, author, title, etc."...  They have to drop what their doing PowerShell something for each request.

    We're testing PQ to see if we can replace that labor, self service.

    This is working well - all up to the part that the REST API for SharePoint only gives you a max parameter of &rowlimit=500.

    My next trick is to try to figure out if I can get PQ to get a list of result pages in one query, then pass that to another query to run a search and append all the pages worth of results into a final, huge set.

    Given my M abilities, I know it's futile, but I'm going to give it a shot anyway!  (another post).


    Thursday, May 7, 2015 7:14 PM