none
Seek help to select two most updated records using Power Query RRS feed

  • Question

  • Hi all!

    I'm a newbie to Power Query, I need to select two most updated records using Power Query but not yet know how to do, please help me.

    I have data base like below:

     
    PatientID Testid TestDate Value
    0001 03 3/8/2014 40
    0001 03 9/11/2014 22
    0002 03 3/26/2014 77
    0002 03 9/11/2014 101
    0004 03 3/26/2014 9
    0004 03 9/11/2014 8
    0005 03 3/26/2014 82
    0005 03 9/11/2014 68
    0008 03 1/8/2014 17
    0008 03 7/11/2014 44
    0010 03 4/17/2014 58
    0010 03 10/10/2014 30
    0011 03 4/17/2014 11
    0011 03 10/10/2014 12
    0012 03 4/17/2014 60
    0012 03 10/10/2014 97
    0012 03 11/10/2014 83
    0013 03 3/28/2014 18
    0013 03 10/28/2014 19

    For this example table, I want to get rid of ID 0012 at the date 4/17/2014.

    Please help,

    Thank you very much,

    Best regards,

     
    Tuesday, February 24, 2015 4:41 AM

Answers

  • First you will need to create a query from the table, that finds the two most recent dates per Patient and Test IDs.

    Here is the M formula (assuming that your data is in "Table1" - generated with Ctrl-T):

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Removed Columns" = Table.RemoveColumns(Source,{"Value"}),
        #"To Date" = Table.TransformColumns(#"Removed Columns",{{"TestDate", DateTime.Date}}),
        #"Grouped Rows" = Table.Group(#"To Date", {"PatientID", "Testid"}, {{"MaxN", each List.MaxN([TestDate],2), type list}}),
        #"Expand MaxN" = Table.ExpandListColumn(#"Grouped Rows", "MaxN")
    in
        #"Expand MaxN"

    You can generate this query by clicking Power Query-->From Other Sources-->Blank Query, and paste the formula above in the Query Editor (which is available in the View tab of Query Editor dialog).

    Name the new query as "MaxN", and create a new query from Table1 that is a merge with the "MaxN" query.

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"TestDate", type date}}),
        Merge = Table.NestedJoin(#"Changed Type",{"PatientID", "Testid", "TestDate"},Max2,{"PatientID", "Testid", "MaxN"},"NewColumn",JoinKind.Inner),
        #"Removed Columns" = Table.RemoveColumns(Merge,{"NewColumn"})
    in
        #"Removed Columns"

    I can also share the step-by-step to create the two queries from the UI if needed.

    • Marked as answer by hoangbachdao Tuesday, February 24, 2015 10:00 AM
    Tuesday, February 24, 2015 9:04 AM

All replies

  • Hi hoangbachdao,

    you can choose "remove duplicates" on the Patient ID after you have resorted the TestDate Column in descending order.

    "Remove duplicates" removes all duplicates after the first occurence of the value, so for your request it's important that you reorder first.

    https://support.office.com/en-in/article/Remove-duplicates-d9cffc69-dc5d-4d94-8b66-72779688874d


    hth, Imke

    Tuesday, February 24, 2015 8:14 AM
    Moderator
  • Thanks for your advice, however, I want to keep two most updated, one patientID with their two last results, just want to get rid of the third and so on. 

    Group by is also provide similar function as you suggested, but it works for the most updated one.

    Please correct me if I am wrong.

    Best regards,

    Tuesday, February 24, 2015 8:41 AM
  • First you will need to create a query from the table, that finds the two most recent dates per Patient and Test IDs.

    Here is the M formula (assuming that your data is in "Table1" - generated with Ctrl-T):

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Removed Columns" = Table.RemoveColumns(Source,{"Value"}),
        #"To Date" = Table.TransformColumns(#"Removed Columns",{{"TestDate", DateTime.Date}}),
        #"Grouped Rows" = Table.Group(#"To Date", {"PatientID", "Testid"}, {{"MaxN", each List.MaxN([TestDate],2), type list}}),
        #"Expand MaxN" = Table.ExpandListColumn(#"Grouped Rows", "MaxN")
    in
        #"Expand MaxN"

    You can generate this query by clicking Power Query-->From Other Sources-->Blank Query, and paste the formula above in the Query Editor (which is available in the View tab of Query Editor dialog).

    Name the new query as "MaxN", and create a new query from Table1 that is a merge with the "MaxN" query.

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"TestDate", type date}}),
        Merge = Table.NestedJoin(#"Changed Type",{"PatientID", "Testid", "TestDate"},Max2,{"PatientID", "Testid", "MaxN"},"NewColumn",JoinKind.Inner),
        #"Removed Columns" = Table.RemoveColumns(Merge,{"NewColumn"})
    in
        #"Removed Columns"

    I can also share the step-by-step to create the two queries from the UI if needed.

    • Marked as answer by hoangbachdao Tuesday, February 24, 2015 10:00 AM
    Tuesday, February 24, 2015 9:04 AM
  • Please share the step-by-step for me, I tried it and it said expression.Error.

    Thank you so much,

    Best 

    Tuesday, February 24, 2015 9:51 AM
  • I made it,

    thank you so much for your support

    Best

    Tuesday, February 24, 2015 10:00 AM