none
Creating Flags to Compare Demographics in PQ RRS feed

  • Question

  • Hi all,

    I'm attempting to set up a few flags that will identify whether demographic variables such as Age, Date of Birth and Gender match within the same case.

    As far as how the data are set up go, I have joined two forms from two surveys into a second query (for simplicity's sake, I just created and attached a sample workbook with just one query, meant to be representative of the joined query). 

    Age1 is the Age on the first form, Age2 is the Age on the second...The same rule applies to DOB and Gender.

    Some participants will only have done one survey thus far, and I would like to code ones who don't have two forms filled in (i.e., no duplicate ID to compare against) as NA, so they can't be matching or mismatching yet.

    As always, any help/advice is greatly appreciated!

    Darko 

    Workbook:

    TestWorkBook

    Summary Screenshot:

    Wednesday, February 6, 2019 12:16 AM

Answers

  • Hi Darko

    As Lz also stands for Lazy I assumed your Table1 is as below, with Age1 instead of Age (actually to avoid a couple of column rename steps)

    One way:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChangedType = Table.TransformColumnTypes(Source,
            {{"ID", Int64.Type},{"Age1", Int64.Type},{"Age2", Int64.Type},{"DOB1", type date},
             {"DOB2", type date},{"Gender1", type text},{"Gender2", type text}
            }
        ),    
        GroupedID = Table.Group(ChangedType, {"ID"}, {{"GroupID", each _, type table}}),
        AlteredGroup = Table.TransformColumns(GroupedID,
            {"GroupID", (groupTable) =>
                let
                    AddedIsUnique = Table.AddColumn(groupTable, "Is_Unique", each
                        List.Count(List.Buffer(groupTable[ID])) = 1,
                        type logical
                    ),
                    ColumnsToCompare = {"Age","DOB","Gender"},
                    AddedFlags = List.Accumulate({0..2}, AddedIsUnique,
                        (accum, current) =>
                            Table.AddColumn(accum, (ColumnsToCompare{current} & "Flag"),
                                Expression.Evaluate("each
    	                        if [Is_Unique] then ""NA""
    	                        else if [" & ColumnsToCompare{current}&"1" & "] = [" & ColumnsToCompare{current}&"2" & "] then ""Match""
    	                        else ""Mismatch"""
                                ),
                                type text
                            )
                    ),
                    RemovedIsUnique = Table.RemoveColumns(AddedFlags, "Is_Unique"),
                    FlagColumnNames = {"AgeFlag", "DOBFlag", "GenderFlag"},
                    FixedFlags = List.Accumulate({0..2}, RemovedIsUnique,
                        (accum, current) =>
                            let
                                MistmatchFound = List.Contains(Table.Column(accum,FlagColumnNames{current}), "Mismatch"),
                                ReplacedValues = if MistmatchFound
                                                 then Table.ReplaceValue(accum, "Match","Mismatch", Replacer.ReplaceValue,{FlagColumnNames{current}})
                                                 else accum
                            in
                                ReplacedValues
                    )
                in
                    FixedFlags,
                type table
            }
        ),
    
        ColumnsToExpand = List.Skip(Table.ColumnNames(AlteredGroup[GroupID]{0})),
        GroupIDShema = Table.Schema(AlteredGroup[GroupID]{0}),
        GroupIDExpanded = Table.ExpandTableColumn(AlteredGroup, "GroupID", ColumnsToExpand),
    
        // Restore column types (A. Zhigulin version)
        RestoredTypes = Table.TransformColumnTypes(GroupIDExpanded,
            List.Zip({GroupIDShema[Name],
                      List.Transform(GroupIDShema[TypeName], each Expression.Evaluate(_, #shared))
                     }
            )
        )
    in
        RestoredTypes

    To restore the column types (at the end) I used an approach posted by A. Zhigulin on How to preserve the Columns Types after a Table.Group / Table.Expand. Check that post where C. Banfield offers a function that does the same thing

    Updated workboook with above code avail. here

    PS: Suggestion for next times => Provide a rough estimation of the number of rows/records involved - "large" is not a good estimation ;-). Hope you understand why & it makes sense
    • Edited by Lz._ Wednesday, February 6, 2019 1:20 PM added suggestion
    • Marked as answer by Darko Giacomini Wednesday, February 6, 2019 2:59 PM
    Wednesday, February 6, 2019 10:12 AM

All replies

  • Hi Darko

    As Lz also stands for Lazy I assumed your Table1 is as below, with Age1 instead of Age (actually to avoid a couple of column rename steps)

    One way:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        ChangedType = Table.TransformColumnTypes(Source,
            {{"ID", Int64.Type},{"Age1", Int64.Type},{"Age2", Int64.Type},{"DOB1", type date},
             {"DOB2", type date},{"Gender1", type text},{"Gender2", type text}
            }
        ),    
        GroupedID = Table.Group(ChangedType, {"ID"}, {{"GroupID", each _, type table}}),
        AlteredGroup = Table.TransformColumns(GroupedID,
            {"GroupID", (groupTable) =>
                let
                    AddedIsUnique = Table.AddColumn(groupTable, "Is_Unique", each
                        List.Count(List.Buffer(groupTable[ID])) = 1,
                        type logical
                    ),
                    ColumnsToCompare = {"Age","DOB","Gender"},
                    AddedFlags = List.Accumulate({0..2}, AddedIsUnique,
                        (accum, current) =>
                            Table.AddColumn(accum, (ColumnsToCompare{current} & "Flag"),
                                Expression.Evaluate("each
    	                        if [Is_Unique] then ""NA""
    	                        else if [" & ColumnsToCompare{current}&"1" & "] = [" & ColumnsToCompare{current}&"2" & "] then ""Match""
    	                        else ""Mismatch"""
                                ),
                                type text
                            )
                    ),
                    RemovedIsUnique = Table.RemoveColumns(AddedFlags, "Is_Unique"),
                    FlagColumnNames = {"AgeFlag", "DOBFlag", "GenderFlag"},
                    FixedFlags = List.Accumulate({0..2}, RemovedIsUnique,
                        (accum, current) =>
                            let
                                MistmatchFound = List.Contains(Table.Column(accum,FlagColumnNames{current}), "Mismatch"),
                                ReplacedValues = if MistmatchFound
                                                 then Table.ReplaceValue(accum, "Match","Mismatch", Replacer.ReplaceValue,{FlagColumnNames{current}})
                                                 else accum
                            in
                                ReplacedValues
                    )
                in
                    FixedFlags,
                type table
            }
        ),
    
        ColumnsToExpand = List.Skip(Table.ColumnNames(AlteredGroup[GroupID]{0})),
        GroupIDShema = Table.Schema(AlteredGroup[GroupID]{0}),
        GroupIDExpanded = Table.ExpandTableColumn(AlteredGroup, "GroupID", ColumnsToExpand),
    
        // Restore column types (A. Zhigulin version)
        RestoredTypes = Table.TransformColumnTypes(GroupIDExpanded,
            List.Zip({GroupIDShema[Name],
                      List.Transform(GroupIDShema[TypeName], each Expression.Evaluate(_, #shared))
                     }
            )
        )
    in
        RestoredTypes

    To restore the column types (at the end) I used an approach posted by A. Zhigulin on How to preserve the Columns Types after a Table.Group / Table.Expand. Check that post where C. Banfield offers a function that does the same thing

    Updated workboook with above code avail. here

    PS: Suggestion for next times => Provide a rough estimation of the number of rows/records involved - "large" is not a good estimation ;-). Hope you understand why & it makes sense
    • Edited by Lz._ Wednesday, February 6, 2019 1:20 PM added suggestion
    • Marked as answer by Darko Giacomini Wednesday, February 6, 2019 2:59 PM
    Wednesday, February 6, 2019 10:12 AM
  • This is definitely the farthest thing from lazy, thanks so much! This did exactly what I was hoping it would.

    Apologies for the ambiguity, still getting a hang of power query and its respective forums. Suggestion well taken!

    Darko

    Wednesday, February 6, 2019 2:59 PM
  • Hey Darko

    Glad I could help. Note that I detailed a bit for easier understanding and

    FixedFlags = List.Accumulate({0..2}, RemovedIsUnique,
        (accum, current) =>
            let
                MistmatchFound = List.Contains(Table.Column(accum,FlagColumnNames{current}), "Mismatch"),
                ReplacedValues = if MistmatchFound
                 then Table.ReplaceValue(accum, "Match","Mismatch", Replacer.ReplaceValue,{FlagColumnNames{current}})
                 else accum
            in
                ReplacedValues
    )

    can be shortened as:

    FixedFlags = List.Accumulate({0..2}, RemovedIsUnique,
        (accum, current) =>
                if List.Contains(Table.Column(accum,FlagColumnNames{current}), "Mismatch")
                then Table.ReplaceValue(accum, "Match","Mismatch", Replacer.ReplaceValue,{FlagColumnNames{current}})
                else accum
    )
    Hope this helps & Thanks for providing feedback

    Wednesday, February 6, 2019 3:17 PM