none
Merge and Stack Multiple Sets of Columns dynamically | Power Query RRS feed

  • Question

  • I have a single csv dataset that looks like below. I have multiple csv files like this which i want to combine once i figure out the correct transformation for a single file. Here is the Sample Excel File and Sample CSV file containing this data :

    into this expected Output : 

    Output table

    The Input data has broadly 4 sections : 

    1] The 1st 3 columns are used to create Questions, Answers & Label columns.

    2] a section of Percentage columns.

    3] a section of Stat-testing columns.

    4] a section of 1 columns.

    I want to Merge each set of columns R.1....R10 and then Unpivot (Stack) them one above the other as shown in the output. The R1...R10 column counts may vary each time, but they will be the same for each section i.e. for Percentage columns, Stat testing columns & 1 columns.

    While googling, i am came across an Imke Feldmann post on Merging and unpivoting columns dynamically using a custom function. However, i am unable to select and merge the different sections of columns dynamically. Also, i found Bill Szysz's post which is somewhat similar and trying to figure out a way to modify his 2nd solution i.e. Combine Method on a similar post. Here is Bill Szysz's Combine Method solution.

    let Source = Table.PromoteHeaders(Excel.CurrentWorkbook(){[Name="UglyData"]}[Content], [PromoteAllScalars=true]), Lists = Table.FromColumns({List.Transform({0..((List.Count(Table.ColumnNames(Source))-6)/5)-1}, each List.Range(Table.ColumnNames(Source), _*5+6, 5)) }), AlmostReady = Table.Combine(Table.AddColumn(Lists, "Columns", each Table.FromColumns({Table.Column(Source, Table.ColumnNames(Source){0})} & {List.Repeat({[Column1]{0}}, Table.RowCount(Source))} & List.Transform(List.Skip([Column1], 1), each Table.Column(Source, _)), {Table.ColumnNames(Source){0}, "Name"} & List.Transform(List.Skip([Column1], 1), each Text.BeforeDelimiter(_, "_")) ) )[Columns]), Ready = Table.AddColumn(AlmostReady, "Total", each List.Sum(List.Skip(Record.ToList(_), 2))) in Ready
    The problem is, Bill uses fixed column and row counts ( 5 & 6) whereas my columns and rows count could change. for e.g. today there are 10 sets of columns per section, tomorrow there may be 5 or 20. But the columns count will always be the same across each section i.e. for Percentage section, Stats section and Number section. Also, the sequence of the Columns may change but maintain the same sequence across the sections:
    e.g:

    Percentage section   |     | Stats section             |     | Numbers section      |

    R1 | R3 | R2 | R5 | R4 |     | R1 | R3 | R2 | R5 | R4 |     | R1 | R3 | R2 | R5 | R4 |

    This data is tricky and beyond my amateur skills. Is this possible to do?


    Regards. ~~SiFaR~~

    Friday, November 6, 2020 1:13 PM

Answers

  • let
        Source = Csv.Document(File.Contents("<PathToCsvFile>"),
            [Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]
        ),
        PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        RColumns = List.Select(Table.ColumnNames(PromotedHeaders), each Text.StartsWith(_,"R.")),
            Section1 = List.Buffer(List.FirstN(RColumns, List.Count(RColumns)/3)),
            Section2 = List.Buffer(List.FirstN(List.Skip(RColumns, List.Count(RColumns)/3),List.Count(RColumns)/3)),
            Section3 = List.LastN(RColumns, List.Count(RColumns)/3),
        SelectedColumns = Table.SelectColumns(PromotedHeaders,
            List.FirstN(Table.ColumnNames(PromotedHeaders), 3) & RColumns
        ),
        TypedQue = Table.TransformColumnTypes(SelectedColumns,{{"Que", Int64.Type}}),
        FilledDownQue = Table.FillDown(TypedQue,{"Que"}),
        NoBlank = Table.SelectRows(FilledDownQue, each ([Que Ans Lbl] <> "")),
        GroupedQue = Table.Group(NoBlank, {"Que"},
            {"GRP_Que", each
                let
                    QA = Table.FirstN(_, 2)[Que Ans Lbl],
                    Question = Table.AddColumn(Table.Skip(_, 2), "Question", each QA{0}, type text)
                in
                    Table.AddColumn(Question, "Answer", each QA{1}, type text),
                type table
            }
        ),
        CombinedNested = Table.Combine(GroupedQue[GRP_Que]),
        TypedNumbers = Table.TransformColumnTypes(CombinedNested,
            List.Transform({"Agg"} & Section1 & Section3, each {_, type number})
        ),
        RoundedAgg = Table.TransformColumns(TypedNumbers,
            {"Agg", each Number.Round(_,2), type number}
        ),
        RoundedSection1 = Table.TransformColumns(RoundedAgg,
            List.Transform(Section1, each {_, each Number.Round(_,5), type number})
        ),
        RenamedAsLabel = Table.RenameColumns(RoundedSection1,{{"Que Ans Lbl", "Label"}}),
        UnpivotedR = Table.UnpivotOtherColumns(RenamedAsLabel, {"Que", "Question", "Answer", "Label", "Agg"}, "RN", "Value"),
        Section = Table.AddColumn(UnpivotedR, "Section", each
            if List.Contains(Section1,[RN]) then 1
            else if List.Contains(Section2,[RN]) then 2
            else 3,
            Int64.Type
        ),
        NormalizedR = Table.TransformColumns(Section,
            {"RN", each "R." & Text.PadStart(Text.AfterDelimiter(Text.BeforeDelimiter(_,"_"),"."),2,"0"), type text}
        ),
        PivotedR = Table.Pivot(NormalizedR, List.Distinct(NormalizedR[RN]), "RN", "Value"),
        Reordered = Table.SelectColumns(PivotedR,
            {"Que","Question","Answer","Label","Agg"}
            & List.Sort(List.Select(Table.ColumnNames(PivotedR), each Text.StartsWith(_,"R.")))
        )
    in
        Reordered

    Sample (with a varation) available here

    • Marked as answer by sifar786 Monday, November 9, 2020 9:36 AM
    Monday, November 9, 2020 8:55 AM

All replies

  • Excel 365 Pro Plus with PowerPivot and Power Query
    Added PP conversion for readability.
    http://www.mediafire.com/file/7c13uqtw0dnv26t/11_06_20a.xlsm/file
    http://www.mediafire.com/file/op24925gdh9k7kq/11_06_20a.pdf/file

    Friday, November 6, 2020 7:02 PM
  • https://stackoverflow.com/questions/64710348/merge-and-stack-multiple-sets-of-columns-dynamically-power-query

    hassansayedissa

    Friday, November 6, 2020 8:07 PM
  • let
        Source = Csv.Document(File.Contents("<PathToCsvFile>"),
            [Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]
        ),
        PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        RColumns = List.Select(Table.ColumnNames(PromotedHeaders), each Text.StartsWith(_,"R.")),
            Section1 = List.Buffer(List.FirstN(RColumns, List.Count(RColumns)/3)),
            Section2 = List.Buffer(List.FirstN(List.Skip(RColumns, List.Count(RColumns)/3),List.Count(RColumns)/3)),
            Section3 = List.LastN(RColumns, List.Count(RColumns)/3),
        SelectedColumns = Table.SelectColumns(PromotedHeaders,
            List.FirstN(Table.ColumnNames(PromotedHeaders), 3) & RColumns
        ),
        TypedQue = Table.TransformColumnTypes(SelectedColumns,{{"Que", Int64.Type}}),
        FilledDownQue = Table.FillDown(TypedQue,{"Que"}),
        NoBlank = Table.SelectRows(FilledDownQue, each ([Que Ans Lbl] <> "")),
        GroupedQue = Table.Group(NoBlank, {"Que"},
            {"GRP_Que", each
                let
                    QA = Table.FirstN(_, 2)[Que Ans Lbl],
                    Question = Table.AddColumn(Table.Skip(_, 2), "Question", each QA{0}, type text)
                in
                    Table.AddColumn(Question, "Answer", each QA{1}, type text),
                type table
            }
        ),
        CombinedNested = Table.Combine(GroupedQue[GRP_Que]),
        TypedNumbers = Table.TransformColumnTypes(CombinedNested,
            List.Transform({"Agg"} & Section1 & Section3, each {_, type number})
        ),
        RoundedAgg = Table.TransformColumns(TypedNumbers,
            {"Agg", each Number.Round(_,2), type number}
        ),
        RoundedSection1 = Table.TransformColumns(RoundedAgg,
            List.Transform(Section1, each {_, each Number.Round(_,5), type number})
        ),
        RenamedAsLabel = Table.RenameColumns(RoundedSection1,{{"Que Ans Lbl", "Label"}}),
        UnpivotedR = Table.UnpivotOtherColumns(RenamedAsLabel, {"Que", "Question", "Answer", "Label", "Agg"}, "RN", "Value"),
        Section = Table.AddColumn(UnpivotedR, "Section", each
            if List.Contains(Section1,[RN]) then 1
            else if List.Contains(Section2,[RN]) then 2
            else 3,
            Int64.Type
        ),
        NormalizedR = Table.TransformColumns(Section,
            {"RN", each "R." & Text.PadStart(Text.AfterDelimiter(Text.BeforeDelimiter(_,"_"),"."),2,"0"), type text}
        ),
        PivotedR = Table.Pivot(NormalizedR, List.Distinct(NormalizedR[RN]), "RN", "Value"),
        Reordered = Table.SelectColumns(PivotedR,
            {"Que","Question","Answer","Label","Agg"}
            & List.Sort(List.Select(Table.ColumnNames(PivotedR), each Text.StartsWith(_,"R.")))
        )
    in
        Reordered

    Sample (with a varation) available here

    • Marked as answer by sifar786 Monday, November 9, 2020 9:36 AM
    Monday, November 9, 2020 8:55 AM
  • Just Awesome!

    Thanks a lot.


    Regards. ~~SiFaR~~

    Monday, November 9, 2020 9:37 AM
  • An alternative to Pivoting

    let
        Source = Csv.Document(File.Contents(<PathToCsvFile),
            [Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]
        ),
        PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        RColumns = List.Select(Table.ColumnNames(PromotedHeaders), each Text.StartsWith(_,"R.")),
            Section1 = List.Buffer(List.FirstN(RColumns, List.Count(RColumns)/3)),
            Section2 = List.Buffer(List.FirstN(List.Skip(RColumns, List.Count(RColumns)/3),List.Count(RColumns)/3)),
            Section3 = List.Buffer(List.LastN(RColumns, List.Count(RColumns)/3)),
        SelectedColumns = Table.SelectColumns(PromotedHeaders,
            List.FirstN(Table.ColumnNames(PromotedHeaders), 3) & RColumns
        ),
        TypedQue = Table.TransformColumnTypes(SelectedColumns,{{"Que", Int64.Type}}),
        FilledDownQue = Table.FillDown(TypedQue,{"Que"}),
        NoBlank = Table.SelectRows(FilledDownQue, each ([Que Ans Lbl] <> "")),
        GroupedQue = Table.Group(NoBlank, {"Que"},
            {"GRP_Que", each
                let
                    QA = Table.FirstN(_, 2)[Que Ans Lbl],
                    Question = Table.AddColumn(Table.Skip(_, 2), "Question", each QA{0}, type text)
                in
                    Table.AddColumn(Question, "Answer", each QA{1}, type text),
                type table
            }
        ),
        CombinedNested = Table.Combine(GroupedQue[GRP_Que]),
        TypedNumbers = Table.TransformColumnTypes(CombinedNested,
            List.Transform({"Agg"} & Section1 & Section3, each {_, type number})
        ),
        RoundedAgg = Table.TransformColumns(TypedNumbers,
            {"Agg", each Number.Round(_,2), type number}
        ),
        RoundedSection1 = Table.TransformColumns(RoundedAgg,
            List.Transform(Section1, each {_, each Number.Round(_,5), type number})
        ),
        RenamedAsLabel = Table.RenameColumns(RoundedSection1,{{"Que Ans Lbl", "Label"}}),
        OutColNames = List.Buffer(List.Transform(Section1, each "R." & Text.PadStart(Text.AfterDelimiter(_,"."),2,"0"))),
        TableOfSections = Table.AddColumn(RenamedAsLabel, "SectionsData", each
            Table.FromRecords(
                {
                    Record.RenameFields(Record.SelectFields(_,Section1), List.Zip({Section1, OutColNames})),
                    Record.RenameFields(Record.SelectFields(_,Section2), List.Zip({Section2, OutColNames})),
                    Record.RenameFields(Record.SelectFields(_,Section3), List.Zip({Section3, OutColNames}))
                }
            ),
            type table
        ),
        RemovedSectionColumns = Table.RemoveColumns(TableOfSections, Section1 & Section2 & Section3),
        Expanded = Table.ExpandTableColumn(RemovedSectionColumns, "SectionsData", OutColNames),
        Reordered = Table.SelectColumns(Expanded,
            {"Que","Question","Answer","Label","Agg"}
            & List.Sort(OutColNames)
        )
    in
        Reordered
    Sample updated

    Monday, November 9, 2020 11:46 AM
  • not sure why but can add edit previous reply to say you can replace above step TableOfSections with:

    TableOfSections = Table.AddColumn(RenamedAsLabel, "SectionsData",
        (rec)=> Table.FromRecords(
            List.Transform({Section1,Section2,Section3},
                (sName)=> Record.RenameFields(Record.SelectFields(rec, sName), List.Zip({sName, OutColNames}))
            )
        ),
        type table
    ),

    Monday, November 9, 2020 1:28 PM
  • Brilliant! Thanks for making it more concise.

    Regards. ~~SiFaR~~

    Monday, November 9, 2020 7:36 PM
  • not sure why but can add edit previous reply to say you can replace above step TableOfSections with:

    TableOfSections = Table.AddColumn(RenamedAsLabel, "SectionsData",
        (rec)=> Table.FromRecords(
            List.Transform({Section1,Section2,Section3},
                (sName)=> Record.RenameFields(Record.SelectFields(rec, sName), List.Zip({sName, OutColNames}))
            )
        ),
        type table
    ),

    @lz._ are you on gmail or hotmail? How can i reach you offline?

    Regards. ~~SiFaR~~

    Tuesday, November 10, 2020 8:55 AM
  • @SiFar

    Sure I have an email @. However, sharing personal information isn't allowed on most forums inc. this one AFAIK

    Tuesday, November 10, 2020 9:24 AM
  • @SiFar

    Sure I have an email @. However, sharing personal information isn't allowed on most forums inc. this one AFAIK

    oh, ok.

    Regards. ~~SiFaR~~

    Tuesday, November 10, 2020 10:04 AM