PowerShell Get-Content alternative

  • Question

  • Hi, I am trying to parse some rather large IIS logs through the use of

    $IISLogFileRaw = [System.IO.File]::ReadAllLines($log.fullname)

    However, I get the error that the file cannot be opened as it is currently in use, which I believe it is by the logging process.

    I no longer want to use 'Get-Content' as this is too slow.

    What can I use with $IISLogFileRaw = [System.IO.File]::ReadAllLines($log.fullname) to force PowerShell to parse the contents of the log file? I have looked at IO.FileAccess.Read, IO.FileMode.Open, etc., but am not having much luck. I have even tried embedding some C# into my PS script, but this didn't work either. Any ideas?

    Saturday, December 27, 2014 6:37 AM

Answers

  • Hi, I am trying to parse some rather large IIS logs through the use of

    $IISLogFileRaw=[System.IO.File]::ReadAllLines($log.fullname)

    However, I get the error that the file cannot be opened as it is currently in use, which I believe it is by the logging process.

    I no longer want to use 'Get-Content' as this is too slow.

    What can I use with $IISLogFileRaw=[System.IO.File]::ReadAllLines($log.fullname) to force PowerShell to parse the contents of the log file? I have looked at IO.FileAccess.Read, IO.FileMode.Open, etc., but am not having much luck. I have even tried embedding some C# into my PS script, but this didn't work either. Any ideas?

    With files that size, Get-Content is going to be slow using the default ReadCount of 1.

    If you want to read all the lines in at once, use Get-Content with a ReadCount of 0.  

    With large files this can also cause performance issues because you're reading the entire file into memory and causing a lot of system overhead in memory management.

    The best solution is usually to use a ReadCount of somewhere between 1000 and 5000, depending on the average size of the lines in the file.  For IIS logs, I've found 1500-2000 works pretty well.
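    The best value depends on the file and the machine, so it may be worth measuring. The following is only a rough sketch: it builds a throwaway sample file in the temp folder (the file name and line contents are made up for illustration) and times Get-Content with a few ReadCount values. Point $path at a real log to get meaningful numbers.

```powershell
# Hypothetical sample file; replace $path with a real IIS log to measure properly.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.log'
1..20000 | ForEach-Object { "2014-12-27 06:37:00 GET /page$_ 200" } | Set-Content -Path $path

$timings = foreach ($readCount in 1, 1000, 2000, 5000, 0) {
    $lineCount = 0
    $elapsed = Measure-Command {
        Get-Content -Path $path -ReadCount $readCount | ForEach-Object {
            # With ReadCount 1 each item is a single string; otherwise it is an array of lines.
            $lineCount += @($_).Count
        }
    }
    [pscustomobject]@{
        ReadCount    = $readCount
        Lines        = $lineCount
        Milliseconds = [int]$elapsed.TotalMilliseconds
    }
}

Remove-Item -Path $path
$timings | Format-Table -AutoSize
```

    Every row should report the same number of lines; only the elapsed time should differ.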

    The log will be returned in arrays of however many lines are specified in ReadCount, and you'll need to write your script to deal with it as arrays, but once you get that sorted out you should be able to process the file very quickly without causing memory-related performance problems.
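    As an illustration of dealing with the batches as arrays, here is a minimal sketch. It assumes a W3C-style log where sc-status happens to be the fifth space-separated field; the sample file, its field layout, and the $statusCounts name are invented for the example, so adjust the index to match the #Fields line of a real log.

```powershell
# Hypothetical sample log so the sketch runs on its own.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'iis-batch-demo.log'
@(
    '#Fields: date time cs-method cs-uri-stem sc-status'
    '2014-12-27 06:37:00 GET /default.aspx 200'
    '2014-12-27 06:37:01 GET /missing.aspx 404'
    '2014-12-27 06:37:02 POST /api/orders 200'
) | Set-Content -Path $path

$statusCounts = @{}
Get-Content -Path $path -ReadCount 2000 | ForEach-Object {
    # $_ is an array of up to 2000 lines, so loop over it explicitly.
    foreach ($line in $_) {
        if ($line.StartsWith('#')) { continue }   # skip W3C directive lines
        $status = ($line -split ' ')[4]           # assumed position of sc-status
        $statusCounts[$status] = 1 + $statusCounts[$status]
    }
}

Remove-Item -Path $path
$statusCounts
```

    The inner foreach is the part that changes compared with a ReadCount of 1, where $_ would be one line at a time.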


    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "

    • Marked as answer by JimAll Wednesday, December 31, 2014 8:18 AM
    Saturday, December 27, 2014 3:00 PM
    Moderator

All replies

  • It doesn't matter what you use, if the file is locked you cannot read it with anything.

    ReadAllLines is no different from Get-Content.  It will not be more efficient for large files.
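    Whether a file that is in use can be read at all depends on the sharing mode the writing process asked for. As far as I know, ReadAllLines opens the file with FileShare.Read, which is refused while another process has it open for writing. If the writer permits shared reads, opening the file yourself with FileShare.ReadWrite may get past the error. The sketch below simulates that situation with a temp file; the file name and the simulated writer are assumptions for illustration, not a statement about how IIS opens its logs.

```powershell
# Hypothetical sample file standing in for a log that is still being written.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'shared-read-demo.log'
Set-Content -Path $path -Value 'line one', 'line two'

# Simulated logging process: holds the file open for writing but allows other readers.
$writer = [System.IO.File]::Open($path, [System.IO.FileMode]::Append,
    [System.IO.FileAccess]::Write, [System.IO.FileShare]::Read)

$lines = New-Object 'System.Collections.Generic.List[string]'
$reader = $null
try {
    # FileShare.ReadWrite states that we tolerate the existing writer.
    $stream = New-Object System.IO.FileStream($path, [System.IO.FileMode]::Open,
        [System.IO.FileAccess]::Read, [System.IO.FileShare]::ReadWrite)
    $reader = New-Object System.IO.StreamReader($stream)
    while ($null -ne ($line = $reader.ReadLine())) {
        $lines.Add($line)
    }
}
finally {
    if ($reader) { $reader.Dispose() }   # also closes the underlying stream
    $writer.Dispose()
}

Remove-Item -Path $path
$lines.Count
```

    If the writer opened the file with no sharing at all, this will still fail, and then nothing will read it until the writer lets go.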


    ¯\_(ツ)_/¯

    Saturday, December 27, 2014 9:27 AM
  • Get-Content works fine (PS4):

    # Pick the most recently written log in the site's log folder
    $Logs = Get-ChildItem 'D:\IISLogs\W3SVC2'
    $CurrentLog = $Logs | Sort-Object LastWriteTime -Descending | Select-Object -First 1

    # Time a plain Get-Content read using its default settings
    $Duration = Measure-Command { $IISLogFileRaw = Get-Content $CurrentLog.FullName }
    "Read $($IISLogFileRaw.Count) lines $($Duration.Minutes):$($Duration.Seconds) in mm:ss"
    # Read 204578 lines 0:28 in mm:ss

    Took under 30 sec to read over 200k lines from a 57 MB file. 

    How big is the file you're trying to read?


    Sam Boutros, Senior Consultant, Software Logic, KOP, PA http://superwidgets.wordpress.com

    Saturday, December 27, 2014 12:01 PM
  • Thank you for the replies.
    I have moved away from the ReadAllLines option, as suggested by jrv, and will be using Get-Content with a -ReadCount of 1000.

    This bit of code will form part of a larger BizTalk system health check script (which is complete now) that we didn't want to take too long. 30 seconds per log file times 6 servers would have been too long, which is why we were looking for the most efficient way to parse a log file.

    Looking good now though. Thanks again.

    Wednesday, December 31, 2014 8:22 AM