read and split a text file

  • Question

  • Hi ,

    I have a text file (results.txt) which I would like to split using PowerShell. The file is about 5 MB and is formatted as below:

     

    Host:somehostname001

    data

    data

    data

    Host:somehostname002

    data
    data

    data

    Host:somehostname005

     

    Ideally I would like to create a separate file named after the Host: value, i.e. somehostname001.txt, and have it contain all the data until the next Host: section.

    so something like

    $f = get-content 'results.txt'
    foreach ($line in $f)
    {
    #check each line and write to a file based on host name
    }

    So you've guessed by now I'm pretty new to scripting and need a hand with the "check each line" bit. All ideas welcomed.

     

    Thursday, October 28, 2010 1:38 PM

All replies

  • Not tested.

     

    $f = get-content 'results.txt'
    $temp = @()
    foreach ($line in $f){
        if ($line -match "Host:(\S+)" -and $temp){
            $temp | out-file $host.txt
            $host = $matches[1]
            $temp = @()
            }
           
         else {if ($line){$temp += $line}}
      } 


    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "
    Thursday, October 28, 2010 1:53 PM
  • Ack.  Ignore that first one. 

    $f = get-content 'results.txt'
    $temp = @()
    foreach ($line in $f){
        if ($line -match "Host:(\S+)" -and $temp){
            $temp | out-file ($host + '.txt')
            $host = $matches[1]
            $temp = @()
            }
           
         else {if ($line){$temp += $line}}
      } 



    Thursday, October 28, 2010 1:57 PM
  • Beat me to it...

    This is what I used:

    $file = (GC .\Test.txt)
    ForEach ($line in $file) {
      If ($line -match "^Host:\w+") {
        $newfile = "$($line.Split(':')[1]).txt"
        }
      Else {
        $line | Out-File -Append $newfile
        }    
      }
    
    • Marked as answer by ecco_ecco Thursday, October 28, 2010 2:44 PM
    Thursday, October 28, 2010 2:01 PM
  • With a 5MB input file, doing an Out-File -Append on each data line is going to be a lot of disk I/O, IMHO.
    Thursday, October 28, 2010 2:11 PM
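To illustrate the disk I/O point, here is a minimal sketch that collects the lines per host in memory and writes each file once. Split-HostSections, the sample lines and $outDir are hypothetical names made up for this example, and [ordered] assumes PowerShell 3.0 or later. For the real file you would pass (Get-Content 'results.txt') instead of the sample array.

```powershell
# Hypothetical helper: groups data lines under the most recent Host: line.
function Split-HostSections {
    param([string[]]$Lines)
    $sections = [ordered]@{}   # host name -> list of data lines
    $current = $null
    foreach ($line in $Lines) {
        if ($line -match '^Host:\s*(\S+)') {
            $current = $Matches[1]
            if (-not $sections.Contains($current)) {
                $sections[$current] = New-Object 'System.Collections.Generic.List[string]'
            }
        }
        elseif ($current -and $line.Trim()) {
            # skip blank lines; a List avoids copying an array on every +=
            $sections[$current].Add($line)
        }
    }
    $sections
}

# Made-up sample standing in for the contents of results.txt.
$lines = 'Host:somehostname001', 'data1', '', 'data2', 'Host: somehostname002', 'data3'
$sections = Split-HostSections -Lines $lines

# One write per host instead of one append per data line.
# $outDir is an assumption; point it wherever the files should go.
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'split-demo'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null
foreach ($name in $sections.Keys) {
    Set-Content -Path (Join-Path $outDir "$name.txt") -Value $sections[$name]
}
```

The trade-off is memory: everything is held until the end, which should be fine for a 5MB file but is worth reconsidering for much larger inputs.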
  • Good point.
    Thursday, October 28, 2010 2:29 PM
  • Thanks, both of you, for your quick posts. mjolinor, when I run yours it just seems to hang with no output or error.

    Boe Prox, when I run yours it responds with

    Out-File : Cannot bind argument to parameter 'FilePath' because it is null

    so I played around with the match string and got it working when I changed it to

     If ($line -match "Host:") {

    I can follow most of what it is doing, but can you explain what "^Host:\w+" does, specifically the ^ and the \w+?

     

    Thanks again, as splitting the 200-odd entries would have taken me ages. :-)

     

    Thursday, October 28, 2010 2:41 PM
  • If there's a space in your data after the "Host:" (eg Host: hostname.com) it will do what you describe.

    If that's the case, this should fix it:

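    $f = get-content 'results.txt'
    # $host is a read-only automatic variable, so the name is kept in $name instead
    $name = $null
    $temp = @()
    foreach ($line in $f){
        if ($line -match "^Host:\s*(\S+)"){
            # write the previous host's data before starting a new section
            if ($name -and $temp){$temp | out-file ($name + '.txt')}
            $name = $matches[1]
            $temp = @()
            }
        elseif ($line.Trim()){$temp += $line}
      }
    # write the data collected for the last host
    if ($name -and $temp){$temp | out-file ($name + '.txt')}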




    • Marked as answer by ecco_ecco Thursday, October 28, 2010 2:54 PM
    Thursday, October 28, 2010 2:49 PM
  • The "^" anchors the match to the start of the line, and \w+ matches one or more "word" characters (letters, digits or underscore) immediately after Host:.

    This link will give you some more information and probably explain it better than me.

    http://www.regular-expressions.info/

    Thursday, October 28, 2010 2:55 PM
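A quick way to see the difference is to run both patterns against a few sample lines. This is only a sketch: the sample strings are made up, and [pscustomobject] assumes PowerShell 3.0 or later.

```powershell
# Sample lines are made up for illustration; they are not from results.txt.
$samples = 'Host:somehostname001', 'Host: somehostname002', 'data for Host:x'

$results = foreach ($s in $samples) {
    # ^ anchors at the start; \w+ needs a word character right after the colon
    $strict = $s -match '^Host:\w+'
    # \s* tolerates optional whitespace; (\S+) captures the host name
    $loose = $s -match '^Host:\s*(\S+)'
    # $Matches is only refreshed on a successful match, so guard the lookup
    $name = if ($loose) { $Matches[1] } else { $null }
    [pscustomobject]@{ Line = $s; Strict = $strict; Loose = $loose; Name = $name }
}
$results | Format-Table -AutoSize
```

The second sample is the interesting one: the space after the colon makes the strict pattern fail, while the pattern with \s* still captures the host name.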
  • Storing large file content in a variable is not recommended. 5MB is not that bad, but PowerShell's switch statement can read directly from disk if you pass the file's path to its -File parameter, and it is also faster than the pipeline when dealing with large files. Try this:

    switch -regex -file results.txt {
     '^Host:\s*(.+)$' {
      # if a file name is already set, add the collected data to it before getting the new name
      if ($local:file) {Add-Content $file $data -Encoding UTF8}
      # reset the data variable to an empty string array
      $data = New-Object String[] 0
      # get file name
      $file = $Matches[1] + '.txt'
     }
     default {
      # collect data, skip blank lines
      $data += if ($_.Trim()) {$_}
     }
    }
    # add the final content to a file
    Add-Content $file $data -Encoding UTF8
    # cleanup
    Remove-Variable data, file


      Robert Robelo  
    Thursday, October 28, 2010 6:18 PM
  • [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "

    rob_campbell@centraltechnology.net

    good!


    Wilton Soken

    Friday, August 17, 2012 7:05 PM
  • You realize I do that instead of just posting it on my sig so the spam spiders don't slurp it up, right?


    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "

    Friday, August 17, 2012 7:08 PM
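For the curious, the signature one-liner reads its digit string two characters at a time, adds 46 to each number and casts the result to a character. A rough round-trip sketch of the same idea follows. The address user@example.com is made up, and the sketch uses -join where the original uses a [string] cast followed by -replace " ".

```powershell
# Hypothetical address, for illustration only.
$plain = 'user@example.com'

# Encode: each character becomes its code point minus 46, written as two digits.
$encoded = -join ($plain.ToCharArray() | ForEach-Object { '{0:d2}' -f ([int]$_ - 46) })

# Decode: read two digits at a time, add 46 back, cast to a character.
$count = $encoded.Length / 2
$decoded = -join (0..($count - 1) | ForEach-Object {
    [char](46 + [int]$encoded.Substring($_ * 2, 2))
})

$encoded
$decoded
```

The two-digit scheme only works for characters with code points from 46 to 145, which covers the letters, digits, "@" and "." of a typical address.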