none
Just keep the newest files in a folder RRS feed

  • Question

  • Hi,

    I have several files with the same filename in a Folder. Is there any way to just keep the newest file with PQ using import from File Folder method?


    Lars Wärvik

    Monday, May 4, 2015 4:02 PM

Answers

  • Assuming the files with the same name are in subfolders of the main folder, you could do something like this to get the file name and folder path of the most recent copy of each file:

    let
        Source = Folder.Files("C:\My Files"),
        #"Grouped Rows" = Table.Group(Source, {"Name"}, {{"Date created", each List.Max([Date created]), type datetime}}),
        #"Merged Queries" = Table.NestedJoin(#"Grouped Rows",{"Name", "Date created"},Source,{"Name", "Date created"},"NewColumn"),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})
    in
        #"Expanded NewColumn"

    You could then feed these results into the formula that accesses the file(s).

    Ehren

    Monday, May 4, 2015 9:37 PM
    Owner

All replies

  • Assuming the files with the same name are in subfolders of the main folder, you could do something like this to get the file name and folder path of the most recent copy of each file:

    let
        Source = Folder.Files("C:\My Files"),
        #"Grouped Rows" = Table.Group(Source, {"Name"}, {{"Date created", each List.Max([Date created]), type datetime}}),
        #"Merged Queries" = Table.NestedJoin(#"Grouped Rows",{"Name", "Date created"},Source,{"Name", "Date created"},"NewColumn"),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})
    in
        #"Expanded NewColumn"

    You could then feed these results into the formula that accesses the file(s).

    Ehren

    Monday, May 4, 2015 9:37 PM
    Owner
  • Hi ,

    This is working perfect!

    If I need to filter out newest records in a file based upon CustID and DateCreated, how do I accomplish this?


    Lars Wärvik

    Wednesday, May 6, 2015 6:13 PM
  • I have a similar problem of loading only the newest version of files. I have my basic query already completed. But way too many unneeded files. I used the suggested code provided above but get an error at #"Grouped Rows". Here's my code.

     

    let
        Source = Folder.Files("C:\Users\THIBE\Desktop\F2000 Power Query Practice\All Files"),
        #"Reordered Columns" = Table.ReorderColumns(Source,{"Content", "Name", "Extension", "Date created", "Date accessed", "Date modified", "Attributes", "Folder Path"})

    in
        #"Reordered Columns"

        #"Grouped Rows" = Table.Group(Source, {"Folder.Files("C:\Users\THIBE\Desktop\F2000 Power Query Practice\All Files"}, {{"Date created", each List.Max([Date created]), type datetime}}),
        #"Merged Queries" = Table.NestedJoin(#"Grouped Rows",{"Name", "Date created"},Source,{"Name", "Date created"},"NewColumn"),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})

    in
        #"Expanded NewColumn"

    ===============================

    I have files with names like these.

    APM Form for F2X DN 000 SN 001 Rev A

    APM Form for F2X DN 000 SN 001 Rev B

    APM Form for F2X DN 028 SN 028 Rev A

    APM Form for F2X DN 028 SN 028 Rev B

    APM Form for F2X DN 028 SN 028 Rev C

    So in these sample files all I need is SN 001 Rev B and SN 028 Rev C

    I'm new to PQ and don't know M coding. Just completed reading and doing exercises in M is for DATA Monkeys but didn't find anything specifically related to what I need.

    Any help is much appreciated,

    ET


    ET

    Tuesday, May 14, 2019 3:26 PM
  • Hi ET. The first "in" before the Grouped Rows step is not valid syntax. Also, the Grouped Rows step has an aggregation which is a text version of the Source query, which is odd.

    Try this:

    let
        Source = Folder.Files("C:\Users\THIBE\Desktop\F2000 Power Query Practice\All Files"),
        #"Grouped Rows" = Table.Group(Source, {"Name"}, {{"Date created", each List.Max([Date created]), type datetime}}),
        #"Merged Queries" = Table.NestedJoin(#"Grouped Rows",{"Name", "Date created"},Source,{"Name", "Date created"},"NewColumn"),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})
    in
        #"Expanded NewColumn"

    Ehren

    Tuesday, May 14, 2019 5:05 PM
    Owner
  • Ehren,

    Many thanks, this worked perfectly. I do have one problem. The initial issue of our engineering reports use IR as the rev letter(s) for initial Release.  We do use I as a rev level also. So I need to filter out IR if there is a rev A or higher letter.


    ET

    Wednesday, May 15, 2019 2:35 PM
  • Hi there. I'm not sure I understand. Can you elaborate and provide some examples of what you're currently seeing, and what result you would like to achieve?

    Ehren

    Wednesday, May 15, 2019 4:59 PM
    Owner
  • Initial file is called Rev IR

    APM Form for F2X DN198 SN 195 Rev IR

    2nd file would be 

    APM Form for F2X DN198 SN 195 Rev A, 3rd file same name Rev B

    Since I in IR is greater than B it returns IR - the original file rather than Rev B

    Need to filter out IR if there is a Rev A or higher. 


    ET

    Wednesday, May 15, 2019 6:43 PM
  • Try this:

    let
        Source = Folder.Files("C:\Users\THIBE\Desktop\F2000 Power Query Practice\All Files"),
        #"Inserted Text Before Delimiter" = Table.AddColumn(Source, "NameWithoutRev", each Text.BeforeDelimiter([Name], " Rev "), type text),
        #"Grouped Rows" = Table.Group(#"Inserted Text Before Delimiter", {"NameWithoutRev"}, {{"Most Recent", each List.Max(_, null, (x, y) => Comparer.Ordinal(x[Date created], y[Date created])), type record}}),
        #"Expanded Most Recent" = Table.ExpandRecordColumn(#"Grouped Rows", "Most Recent", {"Name", "Date created"}, {"Name", "Date created"}),
        #"Merged Queries" = Table.NestedJoin(#"Expanded Most Recent",{"Name", "Date created"},Source,{"Name", "Date created"},"NewColumn"),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})
    in
        #"Expanded NewColumn"

    It uses the part of the file name prior to " Rev " to do the grouping, and then takes the file from each group that has the most recent creation date.

    Ehren


    Wednesday, May 15, 2019 7:22 PM
    Owner
  • Here's the code I have now. From your 2nd row: #"Inserted Text Before Delimiter" I get an error of Token Comma expected.

    let
        Source = Folder.Files("C:\Users\THIBE\Desktop\F2000 Power Query Practice\All Files"),
        #"Removed Columns" = Table.RemoveColumns(Source,{"Extension", "Date accessed", "Date modified", "Attributes"})
        #"Inserted Text Before Delimiter" = Table.AddColumn(Source, "NameWithoutRev", each Text.BeforeDelimiter([Name], " Rev "), type text),
        #"Grouped Rows" = Table.Group(#"Inserted Text Before Delimiter", {"NameWithoutRev"}, {{"Most Recent", each List.Max(_, null, (x, y) => Comparer.Ordinal(x[Date created], y[Date created])), type record}}),
        #"Expanded Most Recent" = Table.ExpandRecordColumn(#"Grouped Rows", "Most Recent", {"Name", "Date created"}, {"Name", "Date created"}),
        #"Merged Queries" = Table.NestedJoin(#"Expanded Most Recent",{"Name", "Date created"},Source,{"Name", "Date created"},"NewColumn"),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})
    in
        #"Expanded NewColumn"


    ET

    Wednesday, May 15, 2019 8:16 PM
  • Never mind. Found I left out comma on previous row. 

    ET

    Wednesday, May 15, 2019 8:16 PM
  • So once I added code I got another error after selecting DONE. 

    = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})

    Explanation is  Expression.Error: We cannot convert a value of type Table to type list.

    Details:

      Value=Table

           Type=Type

    All entries in Most Recent Column show ERROR. Selecting left/right arrows shows No columns were found.


    ET

    Wednesday, May 15, 2019 8:30 PM
  • Ehren,

    Still can't seem to get things to work. How about if  after I first bring files from folder into PQ I split the column NAME so that the Revision letter was in a column by itself. Then IR could be filtered out in your code if there was any other letter A, B, C ..... for any given SN. 


    ET

    Thursday, May 16, 2019 4:48 PM
  • My code above is already doing a split and taking the most recent file among those with the same prefix (prior to " Rev "). I would recommend starting with my code. Tweaking it by hand in the Advanced Editor is likely to lead to errors unless you're familiar with writing M. What more do you need to do beyond the code I posted above? Can you do it via the UI instead of the Advanced Editor?

    Ehren

    Friday, May 17, 2019 11:32 PM
    Owner
  • Ehren,

    When I use your code exactly as you wrote it I get multiple errors at different points. When I click on Show Error it takes me to Grouped Rows step and I see  table showing Column NameWithoutRev just fine. But column for Most Recent shows ERROR  in each of all cells for every file.

    When I click on Error in any cell I Get yellow bar with triangle caution symbol. It says

    "Expression.Error: We cannot convert a value of type Table to type List.

    Details:

         Value = Table

              Type = Type "

    Right clicking on Column Most Recent does not provide me with Change Type option. Clicking on bent arrows in Most Recent column gives a small window that says no columns found.

    Applied steps that I see are:

    Source, Filtered Row, Inserted Text Before delimiter, Grouped Rows, Expanded Most Recent, Merged Queries, Expanded NewColumn


    ET

    Monday, May 20, 2019 7:53 PM
  • The code as I wrote it doesn't have a "Most Recent" column in the Grouped Rows step, and also does not have a Filtered Row step. It sounds like you've modified it somehow, which is causing the errors you're seeing. Can you share the M code that is generating the errors you're seeing?

    Ehren

    Monday, May 20, 2019 8:02 PM
    Owner
  • Most Recent column is appearing when I select Grouped Rows in Applied Steps. It shows 2 columns: NameWithoutRev and Most Recent. I see Most Recent in your code 

    APPLIED STEPS are: Source, Removed Columns, Inserted Text Before Delimiter, Grouped Rows, Expanded Most Recent, Merged Queries, and Expanded NewColumn. I have tried several versions of what you wrote. Here is the initial version which should be exactly as you wrote it. Below is screen shot of query result after selecting DONE.Query Result

    let

        Source = Folder.Files("C:\Users\THIBE\Desktop\F2000 Power Query Practice\All Files"),
        #"Inserted Text Before Delimiter" = Table.AddColumn(Source, "NameWithoutRev", each Text.BeforeDelimiter([Name], " Rev "), type text),
        #"Grouped Rows" = Table.Group(#"Inserted Text Before Delimiter", {"NameWithoutRev"}, {{"Most Recent", each List.Max(_, null, (x, y) => Comparer.Ordinal(x[Date created], y[Date created])), type record}}),
        #"Expanded Most Recent" = Table.ExpandRecordColumn(#"Grouped Rows", "Most Recent", {"Name", "Date created"}, {"Name", "Date created"}),
        #"Merged Queries" = Table.NestedJoin(#"Expanded Most Recent",{"Name", "Date created"},Source,{"Name", "Date created"},"NewColumn"),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})
    in
        #"Expanded NewColumn"

    I am putting this in query as soon as files are selected and PQ is in Edit. Once I hit  Done I get the Expression.Error I mentioned before;

    "Expression.Error: We cannot convert a value of type Table to type List.

    Details:

         Value = Table

              Type = Type "


    ET

    Tuesday, May 21, 2019 3:33 PM
  • Here's screen shot of actual code in PQ Editor

    ET

    Tuesday, May 21, 2019 3:39 PM
  • Ah, my mistake. There is a "Most Recent" column in my Grouped Rows step.

    However, I was correct about the M code being modified. You've added a Removed Colums step, which is causing the error you're seeing. Use the code as I shared it above.

    Ehren

    Tuesday, May 21, 2019 5:38 PM
    Owner
  • Sorry, don't know how that added step got in there. Obviously I must have deleted some columns. This time for sure I used just your basic code. Still get error at Grouped Rows and every step after that one. I'm attaching screenshot of each step. Last pic shows your code in Editor.

    ET

    Tuesday, May 21, 2019 8:15 PM
  • Ok, thanks. Sorry for the confusion. It looks like my code was depending on a behavior that has changed between your version of Power Query and mine. Try changing the following code:

    each List.Max(_, 

    to this:

    each List.Max(Table.ToRecords(_)

    So the full code would read:

    let
        Source = Folder.Files("C:\Users\THIBE\Desktop\F2000 Power Query Practice\All Files"),
        #"Inserted Text Before Delimiter" = Table.AddColumn(Source, "NameWithoutRev", each Text.BeforeDelimiter([Name], " Rev "), type text),
        #"Grouped Rows" = Table.Group(#"Inserted Text Before Delimiter", {"NameWithoutRev"}, {{"Most Recent", each List.Max(Table.ToRecords(_), null, (x, y) => Comparer.Ordinal(x[Date created], y[Date created])), type record}}),
        #"Expanded Most Recent" = Table.ExpandRecordColumn(#"Grouped Rows", "Most Recent", {"Name", "Date created"}, {"Name", "Date created"}),
        #"Merged Queries" = Table.NestedJoin(#"Expanded Most Recent",{"Name", "Date created"},Source,{"Name", "Date created"},"NewColumn"),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Folder Path"}, {"Folder Path"})
    in
        #"Expanded NewColumn"

    Let me know if this works.
    Ehren

    • Proposed as answer by Old Judoka Wednesday, May 22, 2019 3:31 PM
    Tuesday, May 21, 2019 10:17 PM
    Owner
  • !!!! Worked Perfectly.

    Only issue now is that it selects IR version, which is our Initial Release, instead of Rev A,B,C ......, as I is greater than A, B ....

    I need it to select IR if that is only version released. But don't want IR selected if there is a version with higher Rev letter. 


    ET

    Wednesday, May 22, 2019 3:14 PM
  • Never mind previous reply. I had a practice folder with a few files in it. So Created date was the same for all files as it took date i copied files to practice folder. 

    Works perfectly when i run query in normal file save location. 

    Thanks for all your help and patience Erhen. 


    ET

    Wednesday, May 22, 2019 3:31 PM
  • Ehren,

    In all the work we've done on this issue I forgot to mention one important item. I need the Column titled Content which has the binary files to remain in. those are actually all the files I need to be able to put in a pivot table. 


    ET

    Wednesday, May 22, 2019 6:22 PM
  • Not to worry. I figured out how to add it. 

    ET

    Wednesday, May 22, 2019 6:46 PM