Get the latest file access date for a directory structure.

  • Question

  • Hi guys,

    I know my title is not very clear.

    Here is my problem. I'm sure I will figure it out at some point (I've only been thinking about it for 5 to 10 minutes), but maybe one of you has had to solve this in the past.

    I need to do some cleaning on a file structure (multiple millions of files and directories). The idea is to get the list of files within a folder, find the most recently accessed file in that folder, and create an array that includes the path of that folder and the timestamp found.

    Example:

    Folder1
    --> file1 accesstime:2018-06-28-15:00 *most recently accessed
    --> file2 accesstime:2018-06-28-14:00
    --> file3 accesstime:2012-06-28-15:00
    ------> folder2
    ---------> file1 accesstime:2016-06-28-18:00 *most recently accessed
    ---------> file2 accesstime:2014-06-28-14:00
    ---------> file3 accesstime:2013-06-28-15:00
    etc ...

    The result:

    Folder1 - 2018-06-28-15:00
    Folder1/Folder2 - 2016-06-28-18:00

    If a folder is empty, leave the timestamp blank.

    I thought about Get-ChildItem -Recurse, but I'm not sure it will really help, as it is going to spit out a list of multiple millions of files, and I would have to walk through that list again anyway.

    At this point, I'm thinking about writing my own recursive function to crawl the tree, adding an array element for each folder and assigning the timestamp after sorting by timestamp within each folder.

    Thanks in advance for your help

    Personne

    Thursday, June 28, 2018 11:06 PM

Answers

  • I think you overcomplicated something that is simple to do:

    param(
        [Parameter(Position=0,Mandatory=$false)]$sourcePath ="D:\Cloud\OneDrive\Contributions"
    )
    
    # Check that $sourcePath exists; if not, write an error and exit.
    if(Test-Path $sourcePath){
        # Get all folders recursively below $sourcePath (the root folder itself is not included).
        $Allfolders = Get-ChildItem -Path $sourcePath -Directory -Recurse
    }
    else{
        Write-Error "The source path ""$sourcePath"" does not exist"
        exit -1;
    }
    
    # Result array of objects; pipe it to Export-Csv to get a CSV file.
    $FinalArray =@()
    
    
    # For each folder in $Allfolders
    foreach($folder in $Allfolders){
        # Get the files directly in this folder. @() forces an array, so .Count is the number
        # of files even when only one is returned (a single FileInfo's .Length is its size in bytes).
        $first = $null
        $FilesInFolder = @(Get-ChildItem -Path $($folder.FullName) -File)
        
        if($FilesInFolder.Count -gt 0){
            # Keep LastAccessTime, Name and Directory, then sort by LastAccessTime, newest first
            $sorted = $FilesInFolder | Select-Object LastAccessTime,Name,Directory | Sort-Object LastAccessTime -Descending 
            # Select the first one (the most recently accessed file)
            $first = $sorted | Select-Object -First 1
            # Add it to the final array
            $FinalArray+= $first
        }
    }
    
    $FinalArray # show it on screen
    # and export it to CSV
    #$FinalArray | Export-Csv -NoTypeInformation "Export.csv"
    

    The code is commented.

    PS: Just change the default path on line 2, or provide it as a parameter:

    .\scriptname.ps1 -SourcePath "C:\windows\system32"



    • Proposed as answer by j0rt3g4 Tuesday, July 3, 2018 8:30 AM
    • Edited by j0rt3g4 Tuesday, July 3, 2018 8:32 AM
    • Marked as answer by Cyreli Friday, August 10, 2018 6:33 PM
    Tuesday, July 3, 2018 8:30 AM

All replies

  • Hint: Get all folders then get the latest files in each folder.

    $folders = Get-ChildItem -Directory -Recurse

    $folders | ForEach-Object{ $_.GetFiles() | Sort-Object LastAccessTime -Descending | Select-Object -First 1 }


    \_(ツ)_/

    Thursday, June 28, 2018 11:52 PM
  • Thanks for your help, but I got that part very quickly; the problem I'm trying to solve is a bit more complex.

    What I'm trying to get is the latest access time of any file within an entire structure, and put the result in an array that includes the folder name and the latest access time of any file under that folder tree.

    See below:

    # Create a test file structure
    mkdir c:\test\level1\level2\level3a\level4\level5a
    mkdir c:\test\level1\level2\level3a\level4\level5b
    mkdir c:\test\level1\level2\level3a\level4\level5c
    mkdir c:\test\level1\level2\level3b\level4\level5a
    mkdir c:\test\level1\level2\level3b\level4\level5b
    mkdir c:\test\level1\level2\level3b\level4\level5c

    # Create some test file, and set the LastAccessTime 3a
    New-Item c:\test\level1\level2\level3a\level4\level5a\file1.txt
    New-Item c:\test\level1\level2\level3a\level4\level5a\file2.txt
    New-Item c:\test\level1\level2\level3a\level4\level5a\file3.txt
    (get-childitem c:\test\level1\level2\level3a\level4\level5a\file1.txt).LastAccessTime = (get-date).adddays(-1)
    (get-childitem c:\test\level1\level2\level3a\level4\level5a\file2.txt).LastAccessTime = (get-date).adddays(-2)
    (get-childitem c:\test\level1\level2\level3a\level4\level5a\file3.txt).LastAccessTime = (get-date).adddays(-3)
    New-Item c:\test\level1\level2\level3a\level4\level5b\file1.txt
    New-Item c:\test\level1\level2\level3a\level4\level5b\file2.txt
    New-Item c:\test\level1\level2\level3a\level4\level5b\file3.txt
    (get-childitem c:\test\level1\level2\level3a\level4\level5b\file1.txt).LastAccessTime = (get-date).adddays(-4)
    (get-childitem c:\test\level1\level2\level3a\level4\level5b\file2.txt).LastAccessTime = (get-date).adddays(-5)
    (get-childitem c:\test\level1\level2\level3a\level4\level5b\file3.txt).LastAccessTime = (get-date).adddays(-6)
    New-Item c:\test\level1\level2\level3a\level4\level5c\file1.txt
    New-Item c:\test\level1\level2\level3a\level4\level5c\file2.txt
    New-Item c:\test\level1\level2\level3a\level4\level5c\file3.txt
    (get-childitem c:\test\level1\level2\level3a\level4\level5c\file1.txt).LastAccessTime = (get-date).adddays(-7)
    (get-childitem c:\test\level1\level2\level3a\level4\level5c\file2.txt).LastAccessTime = (get-date).adddays(-8)
    (get-childitem c:\test\level1\level2\level3a\level4\level5c\file3.txt).LastAccessTime = (get-date).adddays(-9)

    # Create some test file, and set the LastAccessTime 3b
    New-Item c:\test\level1\level2\level3b\level4\level5a\file1.txt
    New-Item c:\test\level1\level2\level3b\level4\level5a\file2.txt
    New-Item c:\test\level1\level2\level3b\level4\level5a\file3.txt
    (get-childitem c:\test\level1\level2\level3b\level4\level5a\file1.txt).LastAccessTime = (get-date).adddays(-10)
    (get-childitem c:\test\level1\level2\level3b\level4\level5a\file2.txt).LastAccessTime = (get-date).adddays(-11)
    (get-childitem c:\test\level1\level2\level3b\level4\level5a\file3.txt).LastAccessTime = (get-date).adddays(-12)
    New-Item c:\test\level1\level2\level3b\level4\level5b\file1.txt
    New-Item c:\test\level1\level2\level3b\level4\level5b\file2.txt
    New-Item c:\test\level1\level2\level3b\level4\level5b\file3.txt
    (get-childitem c:\test\level1\level2\level3b\level4\level5b\file1.txt).LastAccessTime = (get-date).adddays(-13)
    (get-childitem c:\test\level1\level2\level3b\level4\level5b\file2.txt).LastAccessTime = (get-date).adddays(-14)
    (get-childitem c:\test\level1\level2\level3b\level4\level5b\file3.txt).LastAccessTime = (get-date).adddays(-15)
    New-Item c:\test\level1\level2\level3b\level4\level5c\file1.txt
    New-Item c:\test\level1\level2\level3b\level4\level5c\file2.txt
    New-Item c:\test\level1\level2\level3b\level4\level5c\file3.txt
    (get-childitem c:\test\level1\level2\level3b\level4\level5c\file1.txt).LastAccessTime = (get-date).adddays(-16)
    (get-childitem c:\test\level1\level2\level3b\level4\level5c\file2.txt).LastAccessTime = (get-date).adddays(-17)
    (get-childitem c:\test\level1\level2\level3b\level4\level5c\file3.txt).LastAccessTime = (get-date).adddays(-18)

    New-Item c:\test\level1\level2\file1.txt
    (get-childitem c:\test\level1\level2\file1.txt).LastAccessTime = (get-date).adddays(-1)

    function GetFiles($path = $pwd)
    {
        foreach ($item in Get-ChildItem $path -Directory)
        {
            #$item.Fullname
            $fi = $item.GetFiles()
            if ($fi.count -eq 0)
            {
                $d = "Empty"
            }
            else
            {
                $d = ($fi | sort lastaccesstime | select -last 1).lastaccesstime
            }
            $re = [pscustomobject] @{
                folderpath = $item.Fullname
                date = $d
            }
            GetFiles $item.FullName
        }
        return $re
    }

    $global:re = @()
    $re=getfiles C:\test
    $re

    I'm getting the result below, which is not too bad but not perfect. I'll explain:

    folderpath                                   date
    ----------                                   ----
    C:\test\level1\level2\level3a\level4\level5a 6/27/2018 11:24:31 AM
    C:\test\level1\level2\level3a\level4\level5b 6/24/2018 11:24:31 AM
    C:\test\level1\level2\level3a\level4\level5c 6/21/2018 11:24:31 AM
    C:\test\level1\level2\level3a\level4\level5c 6/21/2018 11:24:31 AM
    C:\test\level1\level2\level3a\level4         Empty
    C:\test\level1\level2\level3b\level4\level5a 6/18/2018 11:24:31 AM
    C:\test\level1\level2\level3b\level4\level5b 6/15/2018 11:24:32 AM
    C:\test\level1\level2\level3b\level4\level5c 6/12/2018 11:24:32 AM
    C:\test\level1\level2\level3b\level4\level5c 6/12/2018 11:24:32 AM
    C:\test\level1\level2\level3b\level4         Empty
    C:\test\level1\level2\level3b                Empty
    C:\test\level1\level2                        6/28/2018 11:24:32 AM
    C:\test\level1                               Empty

    I'm not sure I understand why I get a double entry for 5c (surely a recursion issue).

    What I'd like to get is:

    folderpath                                   date
    ----------                                   ----
    C:\test\level1\level2\level3a\level4\level5a 6/27/2018 11:24:31 AM
    C:\test\level1\level2\level3a\level4\level5b 6/24/2018 11:24:31 AM
    C:\test\level1\level2\level3a\level4\level5c 6/21/2018 11:24:31 AM
    C:\test\level1\level2\level3a\level4         6/27/2018 11:24:31 AM <- most recent from the subfolders 5a/5b/5c
    
    C:\test\level1\level2\level3b\level4\level5a 6/18/2018 11:24:31 AM
    C:\test\level1\level2\level3b\level4\level5b 6/15/2018 11:24:32 AM
    C:\test\level1\level2\level3b\level4\level5c 6/12/2018 11:24:32 AM
    C:\test\level1\level2\level3b\level4         6/18/2018 11:24:31 AM <- most recent from the subfolders 5a/5b/5c
    
    C:\test\level1\level2\level3b                6/18/2018 11:24:31 AM <- most recent from the subfolders 5a/5b/5c (3b)
    C:\test\level1\level2\level3a                6/27/2018 11:24:31 AM <- most recent from the subfolders 5a/5b/5c (3a)
    
    C:\test\level1\level2                        6/28/2018 11:24:32 AM <- this is the date of the file in c:\test\level1\level2
    
    C:\test\level1                               6/28/2018 11:24:32 AM <- most recent from level2

    It seems that I have a recursion issue because I'm getting 5c twice, and for some reason I'm not getting level3a.
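
As far as I can tell, both symptoms come from how `$re` is handled in the function above. Each call only returns the row of the last subfolder it visited, so `level3a` is overwritten by `level3b` before it is ever output. Leaf folders have no loop iterations, so their `return $re` picks up the caller's `$re` through PowerShell's dynamic scoping, which is why `level5c` appears once from the leaf call and once more from its parent's `return`. Below is a minimal sketch of one way to roll the newest access time up the tree instead. The function name `Add-FolderAccessRow` and the `$Rows` list are hypothetical, and empty branches get a blank (`$null`) date rather than "Empty".

```powershell
# Hypothetical helper: appends one row per folder to $Rows and returns the
# newest LastAccessTime found anywhere under $Dir ($null if there are no files).
function Add-FolderAccessRow {
    param(
        [System.IO.DirectoryInfo]$Dir,
        [System.Collections.Generic.List[object]]$Rows
    )

    # Newest access time among the files directly in this folder.
    $latest = $null
    foreach ($file in $Dir.GetFiles()) {
        if ($null -eq $latest -or $file.LastAccessTime -gt $latest) {
            $latest = $file.LastAccessTime
        }
    }

    # Recurse into each subfolder and keep the newest time reported by any of them.
    foreach ($sub in $Dir.GetDirectories()) {
        $childLatest = Add-FolderAccessRow -Dir $sub -Rows $Rows
        if ($null -ne $childLatest -and ($null -eq $latest -or $childLatest -gt $latest)) {
            $latest = $childLatest
        }
    }

    # Exactly one row per folder, added after its whole subtree has been visited.
    $Rows.Add([pscustomobject]@{ FolderPath = $Dir.FullName; Date = $latest })
    return $latest
}
```

A possible call against the test tree from this post would be `$rows = [System.Collections.Generic.List[object]]::new()` followed by `$null = Add-FolderAccessRow -Dir (Get-Item C:\test\level1) -Rows $rows`, after which `$rows` holds one row per folder. Every file is visited only once, which should matter on a tree with millions of files.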


    • Edited by Cyreli Friday, June 29, 2018 6:38 PM
    Friday, June 29, 2018 6:33 PM
  • This will get all files grouped by folder into an array.  You can then just filter the files in the group as needed

    Get-ChildItem -file -Recurse |
         select Name, @{n='Folder';e={$_.DirectoryName}},LastAccessTime|Group Folder*
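
To make the "filter the files in the group" step concrete, here is a minimal sketch that reduces each group to its most recently accessed file. The function name `Get-NewestAccessPerFolder` is hypothetical. Note two limitations: folders that contain no files directly never appear in the output, and `Group-Object` holds every file object in memory before it emits anything, which may be a concern with millions of files.

```powershell
# Hypothetical helper: one row per folder that directly contains at least one file.
function Get-NewestAccessPerFolder {
    param([string]$Path = '.')

    Get-ChildItem -LiteralPath $Path -File -Recurse |
        Group-Object DirectoryName |
        ForEach-Object {
            # Newest first, then keep only the top entry of the group.
            $newest = $_.Group |
                Sort-Object LastAccessTime -Descending |
                Select-Object -First 1

            [pscustomobject]@{
                Folder = $_.Name               # the group key is the DirectoryName
                Date   = $newest.LastAccessTime
            }
        }
}
```

For example, `Get-NewestAccessPerFolder -Path C:\test` would list each folder under `C:\test` that holds files, together with the newest access time among the files directly in it.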


    \_(ツ)_/

    Friday, June 29, 2018 6:55 PM
  • Hi,

    Thanks for your replies. I've run multiple performance tests; I'm dealing with multiple millions of files and a tree structure 11 sub-levels deep, and the best solution for my needs is below.

    Note:

    • $_.GetFiles() is much faster than Get-ChildItem, and follows symlinks on Linux
    • $_.GetFiles(): does not recurse
    • $_.GetFiles('*','AllDirectories'): recurses into subfolders
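
The recursive overload in the notes above can be sketched as a small helper that returns the newest access time of a whole subtree. This is only an illustration: the name `Get-SubtreeLatestAccess` is hypothetical, and it uses `EnumerateFiles` (the streaming counterpart of `GetFiles`) with a running maximum instead of sorting. Keep in mind that the `AllDirectories` search stops with an exception if it reaches a folder it cannot read.

```powershell
# Hypothetical helper: newest LastAccessTime of any file below $Dir,
# or $null when the subtree contains no files.
function Get-SubtreeLatestAccess {
    param([System.IO.DirectoryInfo]$Dir)

    $latest = $null
    # AllDirectories walks the whole subtree; TopDirectoryOnly would match
    # what the parameterless GetFiles() returns.
    $option = [System.IO.SearchOption]::AllDirectories
    foreach ($file in $Dir.EnumerateFiles('*', $option)) {
        if ($null -eq $latest -or $file.LastAccessTime -gt $latest) {
            $latest = $file.LastAccessTime
        }
    }
    $latest
}
```

Usage would look like `Get-SubtreeLatestAccess -Dir (Get-Item 'C:\Program Files')`.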

    $sourcePath  = "C:\Program Files"
    $FinalArray = @()
    
    $Allfolders = Get-ChildItem -Path $sourcePath -Directory -Recurse #DOES NOT INCLUDE SYMLINK
    
    $Allfolders | ForEach-Object{
            # Write-Host -fore Cyan $_.fullname;
            $t1 = Measure-Command {
                $a1 = $_.GetFiles('*','AllDirectories');
                # Sort newest first by LastAccessTime so that $first is the most recently accessed file
                $first = $a1 | Sort-Object LastAccessTime -Descending | Select-Object -First 1
                } # will follow linux symlink and recurse folders, much faster than Get-Childitem
    
            #$t2 = measure-command  {$a2 = Get-ChildItem -Path $($_.FullName) -File -Recurse; $first = $a2 | Sort-Object LastAccessTime -Descending | Select-Object -First 1} # does not follow symlink
            #"Getfiles: $($a1.count) : $($t1.ticks)"
            #"ChildItem: $($a2.count) : $($t2.ticks)"
    
            # One row per folder; Date stays blank when the subtree contains no files
            $FinalArray += [pscustomobject][ordered]@{
                            Folder = $_.FullName
                            Date   = $first.LastAccessTime
                            }
        }
    
    # Build the CSV name from the source path ('/' becomes '_' for Linux-style paths)
    $dest       = $($sourcePath -replace ('/','_'))+ ".csv"
    $FinalArray | Export-Csv $dest -notypeinformation

    • Edited by Cyreli Friday, August 10, 2018 6:34 PM
    Friday, August 10, 2018 6:32 PM