Massive memory usage in a pipeline

  • General discussion

  • Here's a simple script: concatenate 100 text files, making a small modification to some text on each line:

    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {
                        ("{0,11:D}" -f (([int32] $_.SubString(0,11))+40000*$r))+$_.SubString(11)
                    }
            } | Set-Content run_all.f81

    Each file is around 100 MB or so. I did this with a pipeline expecting that it would use little memory. Instead, the script proceeded to chew through all my RAM. It was writing the output file, but I never let it finish since the machine became unusable. I traced it with ParameterBinding, and it was behaving as expected: one line bound to the Foreach-Object pipeline, immediately followed by the modified line bound to the Set-Content pipeline. Nonetheless, memory usage got out of control, so I clearly am not understanding how PowerShell pipelining works and how to do this properly.

    I would appreciate it if someone would explain to me why the memory usage got out of control and how I should have done the pipeline (if there is such a way to avoid the memory usage problem). I'm not looking for a radically different way of doing the task, I did it another way (C++); I'm just trying to understand PowerShell here. Thanks.
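
    To make the per-line edit concrete, here is a small worked sketch of the arithmetic. The helper name Convert-Fort81Line, the sample path, and the sample line are hypothetical and chosen only for illustration; the expressions themselves are the ones used in the script above.

```powershell
# Hypothetical helper wrapping the per-line edit from the script above.
function Convert-Fort81Line {
    param(
        [string] $Line,   # one line of a fort.81 file
        [int32]  $Run     # run number parsed from the directory name
    )
    # The first 11 characters hold an integer; offset it by 40000 per run
    # and write it back right-aligned in an 11-character field.
    $id = [int32] $Line.Substring(0, 11)
    ("{0,11:D}" -f ($id + 40000 * $Run)) + $Line.Substring(11)
}

# Run number: the three digits following 'run_' in the path (made-up path).
$path = 'C:\data\run_002\fort.81'
[int32] $r = $path.Substring($path.IndexOf('run_') + 4, 3)   # "002" becomes 2

Convert-Fort81Line -Line '          1      9      5.516' -Run $r
# '      80001      9      5.516'
```

    The rest of the line is copied unchanged, so each output line has the same length as its input line.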

    • Changed type Bill_Stewart Tuesday, February 24, 2015 10:43 PM Discussion
    Tuesday, December 23, 2014 3:35 PM

All replies

  • I suspect the problem is with the string concatenation. 

    Did you try using a stringbuilder instead?


    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "

    Tuesday, December 23, 2014 3:44 PM
  • I doubt the problem lies with the string concatenation: the lines are only about 100 characters long, and the trace shows a single line going to Foreach-Object followed by a single line going to Set-Content. Just to check, I tried writing this using StringBuilder, with the same result of growing memory usage:

    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {
                        $l = New-Object System.Text.StringBuilder;
                        $e = ([int32] $_.SubString(0,11))+40000*$r;
                        $n = $l.AppendFormat("{0,11:D}",$e);
                        $n = $l.Append($_.SubString(11));
                        $l.ToString();
                        Remove-Variable n;
                        Remove-Variable e;
                        Remove-Variable l;
                    }
            } | Set-Content run_sb.f81

    Thanks; any other clue what might be going on?

    Tuesday, December 23, 2014 4:37 PM
  • I suspect you're still eating up memory in temporary allocations.

    Try creating one StringBuilder object, and then keep re-using it:

    $l = New-Object System.Text.StringBuilder
    
    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {     
         $e = ([int32] $_.SubString(0,11))+40000*$r;
         [void]$l.AppendFormat("{0,11:D}",$e);
         [viod]$l.Append($_.SubString(11));
         $l.ToString();
         $l.clear()
           }
            } | Set-Content run_sb.f81


    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "


    • Edited by mjolinor Tuesday, December 23, 2014 4:50 PM
    Tuesday, December 23, 2014 4:47 PM
  • Unfortunately, that didn't help either. I tried exactly what you wrote (modulo typos and adding a [void] cast to the clear), and it still chewed up memory. Thanks.
    Tuesday, December 23, 2014 5:00 PM
  • OK.  I went back and had another look at the original code.  

    You only have one line at a time in the pipeline, but you're collecting it all before you write the file.

    See if this works better:

    Clear-Content run_all.f81
    
    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {
                        ("{0,11:D}" -f (([int32] $_.SubString(0,11))+40000*$r))+$_.SubString(11) |
                        Add-Content run_all.f81
                    }
            }


    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "

    Tuesday, December 23, 2014 5:08 PM
  • You are still accumulating lines before the set.


    ¯\_(ツ)_/¯

    Tuesday, December 23, 2014 5:11 PM
  • I tried this, and I'm not sure exactly how bad the memory usage would get, since it is incredibly slow in this form (it writes at a couple of percent of the speed of the original version, at least before memory usage becomes a problem). So let's get at the underlying problem: both you and jrv suggest that the lines are accumulated before getting to Set-Content. I'm not yet convinced that they are, based on running Trace-Command with ParameterBinding. I've appended the beginning of the output below, going up through the first couple of lines that go through the pipeline. What you see there is one line at a time binding to the Value parameter of Set-Content. Am I misunderstanding what the trace means? Thanks.

    ParameterBinding Information: 0 : BIND NAMED cmd line args [Get-ChildItem]
    ParameterBinding Information: 0 : BIND POSITIONAL cmd line args [Get-ChildItem]
    ParameterBinding Information: 0 :     BIND arg [run_*\fort.81] to parameter [Path]
    ParameterBinding Information: 0 :         Binding collection parameter Path: argument type [String], parameter type [System.String[]], collection type Array, element type [System.String], no coerceElementType
    ParameterBinding Information: 0 :         Creating array with element type [System.String] and 1 elements
    ParameterBinding Information: 0 :         Argument type String is not IList, treating this as scalar
    ParameterBinding Information: 0 :         Adding scalar element of type String to array position 0
    ParameterBinding Information: 0 :         BIND arg [System.String[]] to param [Path] SUCCESSFUL
    ParameterBinding Information: 0 : BIND cmd line args to DYNAMIC parameters.
    ParameterBinding Information: 0 :     DYNAMIC parameter object: [Microsoft.PowerShell.Commands.GetChildDynamicParameters]
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [Get-ChildItem]
    ParameterBinding Information: 0 : BIND NAMED cmd line args [Select-Object]
    ParameterBinding Information: 0 :     BIND arg [FullName] to parameter [ExpandProperty]
    ParameterBinding Information: 0 :         COERCE arg to [System.String]
    ParameterBinding Information: 0 :             Parameter and arg types the same, no coercion is needed.
    ParameterBinding Information: 0 :         BIND arg [FullName] to param [ExpandProperty] SUCCESSFUL
    ParameterBinding Information: 0 : BIND POSITIONAL cmd line args [Select-Object]
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [Select-Object]
    ParameterBinding Information: 0 : BIND NAMED cmd line args [ForEach-Object]
    ParameterBinding Information: 0 : BIND POSITIONAL cmd line args [ForEach-Object]
    ParameterBinding Information: 0 :     BIND arg [
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {
                        ("{0,11:D}" -f (([int32] $_.SubString(0,11))+40000*$r))+$_.SubString(11)
                    }
            ] to parameter [Process]
    ParameterBinding Information: 0 :         Binding collection parameter Process: argument type [ScriptBlock], parameter type [System.Management.Automation.ScriptBlock[]], collection type Array, element type [System.Management.Automation.ScriptBlock], no coerceElementType
    ParameterBinding Information: 0 :         Creating array with element type [System.Management.Automation.ScriptBlock] and 1 elements
    ParameterBinding Information: 0 :         Argument type ScriptBlock is not IList, treating this as scalar
    ParameterBinding Information: 0 :         Adding scalar element of type ScriptBlock to array position 0
    ParameterBinding Information: 0 :         BIND arg [System.Management.Automation.ScriptBlock[]] to param [Process] SUCCESSFUL
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [ForEach-Object]
    ParameterBinding Information: 0 : BIND NAMED cmd line args [Set-Content]
    ParameterBinding Information: 0 : BIND POSITIONAL cmd line args [Set-Content]
    ParameterBinding Information: 0 :     BIND arg [run_all.f81] to parameter [Path]
    ParameterBinding Information: 0 :         Binding collection parameter Path: argument type [String], parameter type [System.String[]], collection type Array, element type [System.String], no coerceElementType
    ParameterBinding Information: 0 :         Creating array with element type [System.String] and 1 elements
    ParameterBinding Information: 0 :         Argument type String is not IList, treating this as scalar
    ParameterBinding Information: 0 :         Adding scalar element of type String to array position 0
    ParameterBinding Information: 0 :         BIND arg [System.String[]] to param [Path] SUCCESSFUL
    ParameterBinding Information: 0 : BIND cmd line args to DYNAMIC parameters.
    ParameterBinding Information: 0 :     DYNAMIC parameter object: [Microsoft.PowerShell.Commands.FileSystemContentWriterDynamicParameters]
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [Set-Content]
    ParameterBinding Information: 0 : CALLING BeginProcessing
    ParameterBinding Information: 0 : CALLING BeginProcessing
    ParameterBinding Information: 0 : CALLING BeginProcessing
    ParameterBinding Information: 0 : CALLING BeginProcessing
    ParameterBinding Information: 0 : BIND PIPELINE object to parameters: [Select-Object]
    ParameterBinding Information: 0 :     PIPELINE object TYPE = [System.IO.FileInfo]
    ParameterBinding Information: 0 :     RESTORING pipeline parameter's original values
    ParameterBinding Information: 0 :     Parameter [InputObject] PIPELINE INPUT ValueFromPipeline NO COERCION
    ParameterBinding Information: 0 :     BIND arg [C:\Users\jsberg\Documents\map\Target\XDing-130323\run_000\fort.81] to parameter [InputObject]
    ParameterBinding Information: 0 :         BIND arg [C:\Users\jsberg\Documents\map\Target\XDing-130323\run_000\fort.81] to param [InputObject] SUCCESSFUL
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [Select-Object]
    ParameterBinding Information: 0 : BIND PIPELINE object to parameters: [ForEach-Object]
    ParameterBinding Information: 0 :     PIPELINE object TYPE = [System.String]
    ParameterBinding Information: 0 :     RESTORING pipeline parameter's original values
    ParameterBinding Information: 0 :     Parameter [InputObject] PIPELINE INPUT ValueFromPipeline NO COERCION
    ParameterBinding Information: 0 :     BIND arg [C:\Users\jsberg\Documents\map\Target\XDing-130323\run_000\fort.81] to parameter [InputObject]
    ParameterBinding Information: 0 :         BIND arg [C:\Users\jsberg\Documents\map\Target\XDing-130323\run_000\fort.81] to param [InputObject] SUCCESSFUL
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [ForEach-Object]
    ParameterBinding Information: 0 : BIND NAMED cmd line args [Get-Content]
    ParameterBinding Information: 0 : BIND POSITIONAL cmd line args [Get-Content]
    ParameterBinding Information: 0 :     BIND arg [C:\Users\jsberg\Documents\map\Target\XDing-130323\run_000\fort.81] to parameter [Path]
    ParameterBinding Information: 0 :         Binding collection parameter Path: argument type [String], parameter type [System.String[]], collection type Array, element type [System.String], no coerceElementType
    ParameterBinding Information: 0 :         Creating array with element type [System.String] and 1 elements
    ParameterBinding Information: 0 :         Argument type String is not IList, treating this as scalar
    ParameterBinding Information: 0 :         Adding scalar element of type String to array position 0
    ParameterBinding Information: 0 :         BIND arg [System.String[]] to param [Path] SUCCESSFUL
    ParameterBinding Information: 0 : BIND cmd line args to DYNAMIC parameters.
    ParameterBinding Information: 0 :     DYNAMIC parameter object: [Microsoft.PowerShell.Commands.FileSystemContentReaderDynamicParameters]
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [Get-Content]
    ParameterBinding Information: 0 : BIND NAMED cmd line args [ForEach-Object]
    ParameterBinding Information: 0 : BIND POSITIONAL cmd line args [ForEach-Object]
    ParameterBinding Information: 0 :     BIND arg [
                        ("{0,11:D}" -f (([int32] $_.SubString(0,11))+40000*$r))+$_.SubString(11)
                    ] to parameter [Process]
    ParameterBinding Information: 0 :         Binding collection parameter Process: argument type [ScriptBlock], parameter type [System.Management.Automation.ScriptBlock[]], collection type Array, element type [System.Management.Automation.ScriptBlock], no coerceElementType
    ParameterBinding Information: 0 :         Creating array with element type [System.Management.Automation.ScriptBlock] and 1 elements
    ParameterBinding Information: 0 :         Argument type ScriptBlock is not IList, treating this as scalar
    ParameterBinding Information: 0 :         Adding scalar element of type ScriptBlock to array position 0
    ParameterBinding Information: 0 :         BIND arg [System.Management.Automation.ScriptBlock[]] to param [Process] SUCCESSFUL
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [ForEach-Object]
    ParameterBinding Information: 0 : CALLING BeginProcessing
    ParameterBinding Information: 0 : CALLING BeginProcessing
    ParameterBinding Information: 0 : BIND PIPELINE object to parameters: [ForEach-Object]
    ParameterBinding Information: 0 :     PIPELINE object TYPE = [System.String]
    ParameterBinding Information: 0 :     RESTORING pipeline parameter's original values
    ParameterBinding Information: 0 :     Parameter [InputObject] PIPELINE INPUT ValueFromPipeline NO COERCION
    ParameterBinding Information: 0 :     BIND arg [          1      9      5.516      2.427      0.000  2.144199E-04  3.177027E-05  1.909944E-03  1.922205E-03  2.021630E+02  4.879061E-01] to parameter [InputObject]
    ParameterBinding Information: 0 :         BIND arg [          1      9      5.516      2.427      0.000  2.144199E-04  3.177027E-05  1.909944E-03  1.922205E-03  2.021630E+02  4.879061E-01] to param [InputObject] SUCCESSFUL
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [ForEach-Object]
    ParameterBinding Information: 0 : BIND PIPELINE object to parameters: [Set-Content]
    ParameterBinding Information: 0 :     PIPELINE object TYPE = [System.String]
    ParameterBinding Information: 0 :     RESTORING pipeline parameter's original values
    ParameterBinding Information: 0 :     Parameter [Value] PIPELINE INPUT ValueFromPipeline NO COERCION
    ParameterBinding Information: 0 :     BIND arg [          1      9      5.516      2.427      0.000  2.144199E-04  3.177027E-05  1.909944E-03  1.922205E-03  2.021630E+02  4.879061E-01] to parameter [Value]
    ParameterBinding Information: 0 :         Binding collection parameter Value: argument type [String], parameter type [System.Object[]], collection type Array, element type [System.Object], no coerceElementType
    ParameterBinding Information: 0 :         Creating array with element type [System.Object] and 1 elements
    ParameterBinding Information: 0 :         Argument type String is not IList, treating this as scalar
    ParameterBinding Information: 0 :         Adding scalar element of type String to array position 0
    ParameterBinding Information: 0 :         BIND arg [System.Object[]] to param [Value] SUCCESSFUL
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName NO COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName NO COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName WITH COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName WITH COERCION
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [Set-Content]
    ParameterBinding Information: 0 : BIND PIPELINE object to parameters: [ForEach-Object]
    ParameterBinding Information: 0 :     PIPELINE object TYPE = [System.String]
    ParameterBinding Information: 0 :     RESTORING pipeline parameter's original values
    ParameterBinding Information: 0 :     Parameter [InputObject] PIPELINE INPUT ValueFromPipeline NO COERCION
    ParameterBinding Information: 0 :     BIND arg [          1      9      4.188      3.454      0.000  5.779805E-05  2.465617E-05  6.789177E-04  6.818195E-04  2.020625E+02  4.879061E-01] to parameter [InputObject]
    ParameterBinding Information: 0 :         BIND arg [          1      9      4.188      3.454      0.000  5.779805E-05  2.465617E-05  6.789177E-04  6.818195E-04  2.020625E+02  4.879061E-01] to param [InputObject] SUCCESSFUL
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [ForEach-Object]
    ParameterBinding Information: 0 : BIND PIPELINE object to parameters: [Set-Content]
    ParameterBinding Information: 0 :     PIPELINE object TYPE = [System.String]
    ParameterBinding Information: 0 :     RESTORING pipeline parameter's original values
    ParameterBinding Information: 0 :     Parameter [Value] PIPELINE INPUT ValueFromPipeline NO COERCION
    ParameterBinding Information: 0 :     BIND arg [          1      9      4.188      3.454      0.000  5.779805E-05  2.465617E-05  6.789177E-04  6.818195E-04  2.020625E+02  4.879061E-01] to parameter [Value]
    ParameterBinding Information: 0 :         Binding collection parameter Value: argument type [String], parameter type [System.Object[]], collection type Array, element type [System.Object], no coerceElementType
    ParameterBinding Information: 0 :         Creating array with element type [System.Object] and 1 elements
    ParameterBinding Information: 0 :         Argument type String is not IList, treating this as scalar
    ParameterBinding Information: 0 :         Adding scalar element of type String to array position 0
    ParameterBinding Information: 0 :         BIND arg [System.Object[]] to param [Value] SUCCESSFUL
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName NO COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName NO COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName WITH COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName WITH COERCION
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [Set-Content]
    ParameterBinding Information: 0 : BIND PIPELINE object to parameters: [ForEach-Object]
    ParameterBinding Information: 0 :     PIPELINE object TYPE = [System.String]
    ParameterBinding Information: 0 :     RESTORING pipeline parameter's original values
    ParameterBinding Information: 0 :     Parameter [InputObject] PIPELINE INPUT ValueFromPipeline NO COERCION
    ParameterBinding Information: 0 :     BIND arg [          1      9     -3.902      4.123      0.000 -4.136246E-04  2.724050E-04  5.269902E-03  5.293124E-03  2.020827E+02  4.879061E-01] to parameter [InputObject]
    ParameterBinding Information: 0 :         BIND arg [          1      9     -3.902      4.123      0.000 -4.136246E-04  2.724050E-04  5.269902E-03  5.293124E-03  2.020827E+02  4.879061E-01] to param [InputObject] SUCCESSFUL
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [ForEach-Object]
    ParameterBinding Information: 0 : BIND PIPELINE object to parameters: [Set-Content]
    ParameterBinding Information: 0 :     PIPELINE object TYPE = [System.String]
    ParameterBinding Information: 0 :     RESTORING pipeline parameter's original values
    ParameterBinding Information: 0 :     Parameter [Value] PIPELINE INPUT ValueFromPipeline NO COERCION
    ParameterBinding Information: 0 :     BIND arg [          1      9     -3.902      4.123      0.000 -4.136246E-04  2.724050E-04  5.269902E-03  5.293124E-03  2.020827E+02  4.879061E-01] to parameter [Value]
    ParameterBinding Information: 0 :         Binding collection parameter Value: argument type [String], parameter type [System.Object[]], collection type Array, element type [System.Object], no coerceElementType
    ParameterBinding Information: 0 :         Creating array with element type [System.Object] and 1 elements
    ParameterBinding Information: 0 :         Argument type String is not IList, treating this as scalar
    ParameterBinding Information: 0 :         Adding scalar element of type String to array position 0
    ParameterBinding Information: 0 :         BIND arg [System.Object[]] to param [Value] SUCCESSFUL
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName NO COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName NO COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName WITH COERCION
    ParameterBinding Information: 0 :     Parameter [Credential] PIPELINE INPUT ValueFromPipelineByPropertyName WITH COERCION
    ParameterBinding Information: 0 : MANDATORY PARAMETER CHECK on cmdlet [Set-Content]
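
    For reference, a trace like the one above can be produced with Trace-Command. This is a minimal sketch on a stand-in pipeline; the log file name is hypothetical. The ParameterBinding source only records how arguments are bound to parameters, so it can show that objects arrive one at a time, but it says nothing about whether a cmdlet holds on to memory afterwards.

```powershell
# Hypothetical log location; any writable path would do.
$log = Join-Path ([System.IO.Path]::GetTempPath()) 'binding-trace.log'
Remove-Item $log -ErrorAction SilentlyContinue

# Trace parameter binding for a small stand-in pipeline and write it to a file.
# Use -PSHost instead of -FilePath to see the trace in the console.
Trace-Command -Name ParameterBinding -FilePath $log -Expression {
    'a', 'b' | ForEach-Object { $_.ToUpper() } | Out-Null
}

# Show the first couple of pipeline-binding events from the log.
Select-String -Path $log -Pattern 'BIND PIPELINE object' |
    Select-Object -First 2 -ExpandProperty Line
```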
    

    Tuesday, December 23, 2014 6:01 PM
  • Start with a simple test to see what is happening.

    Just loop through the file and set the output. Don't edit the lines. You should see the same bad behavior.

    Now test using Out-File -Append. It behaves differently.

    Finally, you can use IO primitives to read and write the files. Pick the method that gives you the best performance.

    Reading batches of lines can also speed things up as it avoids round trips to the disk layer.
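
    As a rough sketch of the batching idea, Get-Content has a -ReadCount parameter that sends arrays of lines down the pipeline instead of single lines. The sample file name and the batch size of 1000 are assumptions made for illustration, not tuned values.

```powershell
# Build a small sample input so the sketch runs on its own (hypothetical path).
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.txt'
1..2500 | ForEach-Object { 'line {0}' -f $_ } | Set-Content $sample

# With -ReadCount 1000, each pipeline object is an array of up to 1000 lines,
# so the outer script block runs 3 times instead of 2500.
$batches = 0
$lines   = 0
Get-Content $sample -ReadCount 1000 | ForEach-Object {
    $batches++
    foreach ($line in $_) {
        $lines++          # per-line work would go here
    }
}
"$batches batches, $lines lines"   # 3 batches, 2500 lines
```

    Fewer, larger pipeline objects reduce the per-object pipeline overhead; whether that also changes the memory behavior of the cmdlet at the end of the pipeline would need to be measured.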


    ¯\_(ツ)_/¯

    Tuesday, December 23, 2014 6:15 PM
  • As to simplifying the test, I tried two things: first, I replaced the string calculation with a simple $_. The second thing I tried was eliminating completely the inner Foreach-Object. Both tests showed the same results as the original, a rapidly increasing memory usage.

    I also tried the Out-File -append instead of the Add-Content, which had the same very slow write behavior (and slow memory usage growth). In fact, the Out-File version seems to be worse in terms of memory usage!

    But in fact my real question here is not about how to optimize the file I/O (in fact, once the solution gets much more complex it no longer makes sense to do this in PowerShell), but about the behavior of the PowerShell pipeline and memory usage. From the trace above, one line at a time goes to Set-Content? Is memory for objects that go through the pipeline not released, or am I misunderstanding how to use pipelines? I just did another experiment: I replaced the Set-Content in the original with an Out-Null: the result was that there was no massive increase in memory usage; it seems to be well-behaved. So is there something about how Set-Content interacts with the pipeline? Does a single Set-Content call not free resources for the objects it has received and written until after the completion of Set-Content? After all, the lines are read by Set-Content one at a time, not as a big group or entire file, and they are continually being written (I can see the file size growing and the content in the file).

    I tried another experiment along these lines: I replaced Set-Content in the original with Out-File. Memory usage still grows quickly, but not as quickly as with Set-Content. It was the same if I removed the string processing (even when removing the inner Foreach-Object). So the behavior seems to be a function of the cmdlet receiving the data from the pipeline.

    Considering all this, maybe my question is this: is this behavior just a matter of cmdlets not cleaning up objects coming from the pipeline as they are done with them? It seems to be possible, since Out-Null seems to do so.

    Thanks.

    Tuesday, December 23, 2014 7:05 PM
  • You may have two problems - one being the accumulation of lines and the other being memory usage due to temporary allocations.  I believe the stringbuilder solution should help with the temporary allocations, since it keeps re-using the same memory for each line.  Using Add-Content inside the loop, instead of Set-Content at the end should take care of the line accumulation:

    $l = New-Object System.Text.StringBuilder
    Clear-Content run_sb.f81
    
    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32]$r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {
                        $e = ([int32] $_.SubString(0,11))+40000*$r;
                        [void]$l.AppendFormat("{0,11:D}",$e);
                        [void]$l.Append($_.SubString(11));
                        $l.ToString() | Add-Content run_sb.f81
                        [void]$l.Clear()
                    }
            }


    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "


    • Edited by mjolinor Tuesday, December 23, 2014 7:25 PM
    Tuesday, December 23, 2014 7:23 PM
  • Using StringBuilder really won't change anything. Let me expand on my previous post. Appended below are three versions of the script with no processing, and a fourth that does processing. The first three cause a similar rapid increase in memory usage. What is interesting is that memory usage in the third case does not increase as quickly as the first two, apparently due to the use of Out-File instead of Set-Content.  The fourth case has the processing, but pipes the results to Out-Null rather than Set-Content or Out-File; it does not exhibit the growth in memory usage (the powershell process limits its usage to a comparatively svelte 60 MB, vs. unbounded growth into multiple GB for the others).

    Using StreamWriter, for instance, solves the problem just fine (Add-Content is slow for any solution I've tried). But at this point, with the debugging we've done here, I'm really trying to understand the reason for the memory usage of Set-Content/Out-File. Is it really that they are just not cleaning up objects from the pipeline (or temporaries they've created from them) once they're done with them? This doesn't seem to be a problem intrinsic to cmdlets since Out-Null doesn't seem to have the same problem. Is it really just a case that Set-Content and Out-File are badly written (which would seem somewhat surprising)? Or is there a smarter way to pipe a large number of objects to a single cmdlet?

    Thanks...

    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {
                        $_
                    }
            } | Set-Content run_all.f81

    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_
            } | Set-Content run_all.f81

    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {
                        $_
                    }
            } | Out-File -Encoding ASCII run_all.f81

    Get-ChildItem run_*\fort.81 |
        Select-Object -ExpandProperty FullName |
            ForEach-Object {
                [int32] $r = $_.SubString($_.IndexOf('run_')+4,3);
                Get-Content $_ |
                    ForEach-Object {
                        ("{0,11:D}" -f (([int32] $_.SubString(0,11))+40000*$r))+$_.SubString(11)
                    }
            } | Out-Null
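
    For comparison, a minimal sketch of the StreamWriter approach mentioned above, which keeps the pipeline for reading but writes each line directly instead of handing it to Set-Content or Out-File. The file names, the two sample lines, and the fixed run number are hypothetical stand-ins so the sketch runs on its own.

```powershell
# Hypothetical sample input and output paths (full paths, since .NET does not
# follow the PowerShell current location).
$tmp = [System.IO.Path]::GetTempPath()
$in  = Join-Path $tmp 'sw-demo-in.txt'
$out = Join-Path $tmp 'sw-demo-out.txt'
'          1 first', '          2 second' | Set-Content $in
[int32] $r = 1   # stand-in for the run number parsed from the directory name

# StreamWriter(path) overwrites the file and writes UTF-8 without a BOM.
$writer = New-Object System.IO.StreamWriter $out
try {
    Get-Content $in | ForEach-Object {
        $line = ("{0,11:D}" -f (([int32] $_.Substring(0,11)) + 40000 * $r)) + $_.Substring(11)
        $writer.WriteLine($line)   # written as it arrives; nothing is emitted downstream
    }
}
finally {
    $writer.Dispose()              # flushes and closes the file even if an error occurs
}

Get-Content $out
# '      40001 first'
# '      40002 second'
```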

    Tuesday, December 23, 2014 8:29 PM
  • I cannot reproduce your results.

    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "

    Tuesday, December 23, 2014 10:46 PM