none
Expanding Merged Table is Changing Values RRS feed

  • Question

  • Hi,

    I have a merge of two tables. For some reason when I expand the data in my second table, the data in my first table is changing. Screenshots attached. The 'fund code' is the same in both tables. 

    Any ideas what could be causing this? I've tried merging in the reverse order and it's causing the same issue. 

    Step 1 (Table 1):

    Step 2 (Merge Table):

    Step 3 (Expand table) - Values are changing to 1/2/2018:

    Tuesday, January 2, 2018 9:01 PM

Answers

  • I feel your pain, it is still unclear to me, why we still have to buffer our sort-commands: This must fool so many people, wasting lifetime and nerves:

    Power Query doesn't keep it's sort-order reliably. So you have to buffer it to make it stick like this i.e.:

    let
        Source = #"Step 1 - Raw Data",
        #"Removed Columns1" = Table.RemoveColumns(Source,{"Data"}),
        #"Sorted Rows" = Table.Buffer(Table.Sort(#"Removed Columns1",{{"Fund Code", Order.Ascending}, {"Date Created", Order.Ascending}})),
        #"GroupedRows" = Table.Group(
                          #"Sorted Rows", 
                          {"Fund Code"}, {{"GroupedTables", each _, type table}}
                      ),
        #"AddedClusteredIndex" = Table.TransformColumns(
                                  #"GroupedRows", 
                                  {
                                    "GroupedTables", 
                                    each Table.AddIndexColumn(_,"ClusteredIndex",1,1)
                                  }
                              ),
        #"ExpandedTable" = Table.ExpandTableColumn(
                            #"AddedClusteredIndex", 
                            "GroupedTables", 
                            {"Date Created", "ClusteredIndex"}
                        ),
        #"Filtered Rows" = Table.SelectRows(ExpandedTable, each ([ClusteredIndex] = 2)),
        #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"ClusteredIndex"}),
        #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Date Created", "Earlier Date"}})
    in
        #"Renamed Columns"


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    • Marked as answer by Justin927 Wednesday, January 3, 2018 7:41 PM
    Wednesday, January 3, 2018 3:28 PM
    Moderator

All replies

  • Hi Justin,

    This case is really interesting.

    I tried a few times and I do not receive this kind of error.

    Is that ok for you to post the workbook you have without confidential information so that I can have a better look at it?

    Kind regards,

    Tom Sun

    Wednesday, January 3, 2018 5:24 AM
  • A possible explanation could be that the Fund Code appears twice in your data.

    With some transformations in Power Query the sort order of the original table gets lost and in this case it would have been sorted by Earliest Date instead.


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    • Proposed as answer by Tom J Sun Wednesday, January 3, 2018 9:24 AM
    Wednesday, January 3, 2018 8:34 AM
    Moderator
  • In an earlier step I used a clustered index to sort the order of the data. Fund codes repeated with different dates and it was sorted so that I could find the date immediately prior to the lastest date. The subsequent step removed the other dates so fund codes are unique.



    • Edited by Justin927 Wednesday, January 3, 2018 7:42 PM
    Wednesday, January 3, 2018 3:07 PM
  • I feel your pain, it is still unclear to me, why we still have to buffer our sort-commands: This must fool so many people, wasting lifetime and nerves:

    Power Query doesn't keep it's sort-order reliably. So you have to buffer it to make it stick like this i.e.:

    let
        Source = #"Step 1 - Raw Data",
        #"Removed Columns1" = Table.RemoveColumns(Source,{"Data"}),
        #"Sorted Rows" = Table.Buffer(Table.Sort(#"Removed Columns1",{{"Fund Code", Order.Ascending}, {"Date Created", Order.Ascending}})),
        #"GroupedRows" = Table.Group(
                          #"Sorted Rows", 
                          {"Fund Code"}, {{"GroupedTables", each _, type table}}
                      ),
        #"AddedClusteredIndex" = Table.TransformColumns(
                                  #"GroupedRows", 
                                  {
                                    "GroupedTables", 
                                    each Table.AddIndexColumn(_,"ClusteredIndex",1,1)
                                  }
                              ),
        #"ExpandedTable" = Table.ExpandTableColumn(
                            #"AddedClusteredIndex", 
                            "GroupedTables", 
                            {"Date Created", "ClusteredIndex"}
                        ),
        #"Filtered Rows" = Table.SelectRows(ExpandedTable, each ([ClusteredIndex] = 2)),
        #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"ClusteredIndex"}),
        #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Date Created", "Earlier Date"}})
    in
        #"Renamed Columns"


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    • Marked as answer by Justin927 Wednesday, January 3, 2018 7:41 PM
    Wednesday, January 3, 2018 3:28 PM
    Moderator
  • Hello and thank you in advance if you can help,

    I am using Excel 2019 Power Query, and all my files and lookups are loaded as connection and loaded to the data model.  I am very new to PQ, so please understand that.  Aside from it being EXTREMELY slow for me (trying to work that out on another thread), I am having a very similar problem as described here, I think, in that I have a data source, and then reference that query to trim, filter and modify it for further queries I need to do.  From that query, I create other reference queries to produce different summaries.  I have a query that I put together that would allow me to group locations and produce a column of the merged row contents separated by a comma.  The first time I did this, the data that was merged from the rows was not in alphabetical order, so I sorted it the way I needed it, and then reran everything and it came out with the contents in alphabetical order; however, now when I try to merge this data into another query, the sort order gets lost again.  Can you tell me if the Table.Buffer will help eliminate this issue, or is there a way to copy that column over and make it not link to the source data sorting?  As well, at what point should I be putting in the Table.Buffer, if that is what I should be doing (Query details below).

    Thank you for your time,

    Maggie

    let

        Source = eBird_WINTER_WORKING,

        #"Removed Columns" = Table.RemoveColumns(Source,{"GLOBAL_UNIQUE_IDENTIFIER", "CATEGORY", "Maine_Species_List_Lookup.Species_Type", "BREEDING_BIRD_ATLAS_CODE", "OBSERVATION_DATE", "OBS_Day-Month", "DAY-MONTH_VALUE", "WINTER_SEASON", "WINTER_TIME_PERIOD", "MODIFIED_WINTER_BLOCK_TYPE", "Winter_BlockType"}),

        #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Winter_CoordRegio", "Winter_CLOBlockNa", "COMMON_NAME", "OBSERVER_ID", "SAMPLING_EVENT IDENTIFIER", "GROUP_IDENTIFIER", "FINAL_EVENT_ID", "PROTOCOL_TYPE", "Modified_DURATION_MINUTES", "DURATION_MINUTES", "APPROVED", "REASON", "PROJECT_CODE", "COUNTY", "LOCALITY", "TRIP_COMMENTS", "SPECIES_COMMENTS"}),

        #"Removed Columns1" = Table.RemoveColumns(#"Reordered Columns",{"OBSERVER_ID", "SAMPLING_EVENT IDENTIFIER", "GROUP_IDENTIFIER", "FINAL_EVENT_ID", "PROTOCOL_TYPE", "Modified_DURATION_MINUTES", "DURATION_MINUTES", "APPROVED", "REASON", "PROJECT_CODE", "COUNTY", "LOCALITY", "TRIP_COMMENTS", "SPECIES_COMMENTS"}),

        #"Sorted Rows" = Table.Sort,(#"Removed Columns1",{{"Winter_CoordRegio", Order.Ascending}, {"Winter_CLOBlockNa", Order.Ascending}, {"COMMON_NAME", Order.Ascending}}),

        #"Grouped Rows" = Table.Group(#"Sorted Rows", {"Winter_CoordRegio", "Winter_CLOBlockNa"}, {{"Count_Records", each Table.RowCount(_), type number}, {"Species_List", each _, type table}}),

        // This is hte step/code for expanding the species list resulting from the ALL ROWS command in the group feature.

        #"Added Species_List_FULL_Custom" = Table.AddColumn(#"Grouped Rows", "Species_List_FULL", each Table.ToList(

    Table.Transpose(

    Table.Distinct(

    Table.FromList(

    Table.Column([Species_List],"COMMON_NAME")

    )

    )

    ),

    Combiner.CombineTextByDelimiter(", ")

    )),

        // This extracts the table in the column to get the text output for the species list.

        #"Extracted Species_List_FULL_Values" = Table.TransformColumns(#"Added Species_List_FULL_Custom", {"Species_List_FULL", each Text.Combine(List.Transform(_, Text.From), ","), type text}),

        #"Changed Type" = Table.TransformColumnTypes(#"Extracted Species_List_FULL_Values",{{"Count_Records", Int64.Type}})

    in

        #"Changed Type"

    Wednesday, February 6, 2019 10:17 PM
  • Hi MFBirdy,

    I'm not sure if this will help with your problem, but while I was searching this thread for a solution to a similar issue of my data becoming unsorted during merge/expansion, I figured one out.  I noticed that my grouped data kept its order (item code with the earliest to latest ETA) before and after the merge, but lost it once the newly merged data was expanded.  I added a step before the merge to insert an index column, which recorded the date sorted order of my data.  Then when I expanded the merged data I was able to resort afterwards to my original order.  This fixed the problem of my merged order ETA's column showing up in chronological order (ex. "4/30, 5/2, 6/8" rather than "5/2, 6/8, 4/30").  I wanted to attach images that show the before/after result of adding the index step, but my account has not yet been confirmed so permission was denied.

    Tuesday, April 30, 2019 7:16 PM