Different result when using -ReadCount with Get-Content

  • Question

  • I'm processing a large text file and getting different results when I use -ReadCount.  I believe that -ReadCount is supposed to be a performance option only 
    (and it certainly does improve performance!) but shouldn't change results. 

    My command is something like this:

    gc somefile.txt -ReadCount 1000 | where-object {$_ -like "*somestring*"}|set-content anotherfile.txt

    When I use ReadCount greater than 1, I get lines of output that don't satisfy the condition of the where-object filter in the middle.

    It may be the same as this issue but no resolution or workaround is offered there. 

    help?


    Leo Tohill

    Wednesday, June 20, 2012 5:18 PM

Answers

  • When you pass an integer greater than 1 to Get-Content's ReadCount parameter, the cmdlet sends the lines down the pipeline in chunks, each chunk being an array of up to that many lines. That is why performance improves, and it is also what trips up scripters. In Where-Object's FilterScript, $_ is then the whole array of lines, not a single line. When -like is applied to an array it returns the elements that match, and Where-Object converts that result to a Boolean: if at least one line in the chunk matches, the result is True and the entire chunk passes through, non-matching lines included. To get only the expected lines, pipe Get-Content's output to ForEach-Object instead. ForEach-Object outputs the result of the comparison itself, so only the lines that meet the criteria go through.

    gc somefile.txt -ReadCount 1000 | foreach-object {$_ -like "*somestring*"}|set-content anotherfile.txt
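
    If you would rather keep Where-Object, a variation that should also work is to unroll each chunk first so that Where-Object sees one line at a time. The sketch below compares the three pipelines; the temp file and its sample lines are made up for the demo and are not from the original question.

```powershell
# Hypothetical sample data, written to a temp file for the demo
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value 'has somestring', 'no match', 'somestring again', 'nothing here'

# Chunked read + Where-Object: $_ is the whole chunk, so if any line matches,
# every line of that chunk passes. The trailing ForEach-Object only flattens
# the chunk so the lines can be counted.
$wrong = @(Get-Content $path -ReadCount 1000 |
    Where-Object { $_ -like '*somestring*' } |
    ForEach-Object { $_ })

# Chunked read + ForEach-Object: -like filters inside each chunk
$fixed = @(Get-Content $path -ReadCount 1000 |
    ForEach-Object { $_ -like '*somestring*' })

# Alternative: unroll each chunk, then filter line by line with Where-Object
$unrolled = @(Get-Content $path -ReadCount 1000 |
    ForEach-Object { $_ } |
    Where-Object { $_ -like '*somestring*' })

"Where-Object on chunks:      $($wrong.Count) line(s)"
"ForEach-Object on chunks:    $($fixed.Count) line(s)"
"Unrolled, then Where-Object: $($unrolled.Count) line(s)"

Remove-Item $path
```

    The unrolled form pushes every line through the pipeline again, so it probably gives back some of the speed that -ReadCount bought; the ForEach-Object form above keeps the comparison at the chunk level.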


    Run this one line at a time to see the point:

    # define a String array
    [String[]]$array = 'White', 'Purple', 'Green', 'Blue'

    # $_ is a String; works as expected
    $array | Where-Object {$_ -like '*u*'}

    # $_ is a String array, note the Comma operator*,
    # comparison filters matching Strings but this is
    # converted to Boolean True; entire array goes through
    ,$array | Where-Object {$_ -like '*u*'}

    # $_ is a String array, comparison filters matching Strings…
    ,$array | ForEach-Object {$_ -like '*u*'}

    # …which is similar to:
    $array -like '*u*'


    * The Comma operator wraps the String array in an Object array; the Object array is piped and its only element, the String array, becomes the current pipeline object $_
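
    The Boolean conversion described above can also be looked at directly. This is only a rough sketch: the [bool] cast stands in for what Where-Object does with the script block's result, and the variable names are made up for the example.

```powershell
[String[]]$array = 'White', 'Purple', 'Green', 'Blue'

# -like against an array returns the matching elements, not True/False
$withU = @($array -like '*u*')   # some elements match
$withZ = @($array -like '*z*')   # no element matches

# Casting to Boolean: a non-empty set of matches becomes True,
# an empty one becomes False
$passesU = [bool]($array -like '*u*')
$passesZ = [bool]($array -like '*z*')

"Matches for *u*: $($withU.Count), chunk passes: $passesU"
"Matches for *z*: $($withZ.Count), chunk passes: $passesZ"
```

    So with Where-Object a chunk is kept or dropped as a whole, depending on whether it contains at least one matching line.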
    • Edited by zx38 Wednesday, June 20, 2012 6:30 PM
    • Marked as answer by Yan Li_Moderator Friday, June 29, 2012 9:32 AM
    Wednesday, June 20, 2012 6:29 PM

All replies

  • Thanks zx, that makes it clear. 

    Regarding "...but confuses new scripters...", I've been scripting in PS for years and still things like this catch me.  Someday I'll know it all.

    Regards,

    Leo


    Leo Tohill

    Wednesday, June 20, 2012 7:34 PM
  • True, I know what you mean.

    You're welcome.

    Thursday, June 21, 2012 12:07 AM