Answered Parse text from a text file

  • Friday, March 30, 2012 8:44 PM
     
     

    I've looked at the other topics, which were unhelpful.  I need to parse a limited amount of text from one text file and write it to another text file.

    EXAMPLE:
    1: 0 2: 0 3: 0 4: 0 5: 0
    In the fifth inning Jones hit a home run.
    In the fifth inning Smith committed an error.
    In the fifth inning two RBIs came in.
    Total RBIs: 22
    Team 1 score: 1

    There would be 20 of these blocks in the text file.  I need PowerShell to copy only three lines from each block and write them to another text file:

    1: 0 2: 0 3: 0 4: 0 5: 0
    Total RBIs: 22
    Team 1 score: 1

    Everything else can be skipped.

All Replies

  • Friday, March 30, 2012 11:02 PM
     
     Answered

    Something like this should work based on the info you provided...

    $Outputlog = 'newfile.txt'
    Switch -regex -File ("<path to text file>") {
        "^(?:\d:\s\d\s){4}\d:\s\d$" {
            $_ | Out-File -Append $outputlog
        }
        "^Total\sRBIs:\s\d{1,2}$" {
            $_ | Out-File -Append $outputlog
        }
        "^Team\s\d?\sscore:\s\d?$" {
            $_ | Out-File -Append $outputlog
        }
        Default {}
    }
    


    Boe Prox

    Please remember to mark the best solution as the answer using Mark as Answer. If you find a solution to be helpful, please use Vote as Helpful.

    Looking for a script? Check out the Script Repository
    Need a script written for you? Submit a request at the Script Request Page

  • Sunday, April 01, 2012 12:53 AM
     
     Answered
    gc input.txt | ? {$_ -notmatch '^In'} | out-file output.txt
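
    Spelled out with full cmdlet names, the one-liner above can be sketched roughly as follows. The sample lines come from the question; the temporary file paths are assumptions made only so the sketch is self-contained. Note that -notmatch is case-insensitive by default, so any line starting with "In" or "in" is dropped.

```powershell
# Build a small sample input from the block shown in the question.
# The temp-file paths are hypothetical; substitute your own files.
$inputPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'innings-input.txt'
$outputPath = Join-Path ([System.IO.Path]::GetTempPath()) 'innings-output.txt'

@(
    '1: 0 2: 0 3: 0 4: 0 5: 0'
    'In the fifth inning Jones hit a home run.'
    'In the fifth inning Smith committed an error.'
    'In the fifth inning two RBIs came in.'
    'Total RBIs: 22'
    'Team 1 score: 1'
) | Set-Content -Path $inputPath

# gc = Get-Content, ? = Where-Object.
# Keep every line that does NOT start with "In", then write the result.
Get-Content -Path $inputPath |
    Where-Object { $_ -notmatch '^In' } |
    Set-Content -Path $outputPath

$kept = @(Get-Content -Path $outputPath)
$kept
```

    Set-Content is used here instead of Out-File only as a design choice for the sketch: in Windows PowerShell, Out-File writes UTF-16 by default, which may matter if another tool reads the result.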
     
  • Sunday, April 01, 2012 8:07 PM
     
    Some minor questions regarding your regex pattern:

    What does the ?: mean in this context?

    ^(?:\d:\s\d\s){4}\d:\s\d$

    Which values does the pattern \d? allow? Just single-digit numbers, or also longer ones?

    "^Team\s\d?\sscore:\s\d?$" 

    Thank you and regards,

    alex


  • Sunday, April 01, 2012 8:45 PM
     

    The \d? will match zero or one instance of a digit. In hindsight, I probably would have gone with "\d{1,}" (the same as "\d+") to match one or more digits. For the single-digit values in your sample, it would still match and process the text.

    The ?: makes the group non-capturing: the parentheses still group the pattern for the quantifier, but the matched text is not stored in $Matches. Not really a big deal in this case, but something that I have been using lately for matches.

    Without the ?:

    "123.124.125" -match "(\d{1,3}\.){2}\d{1,3}"
    $Matches

    Name                           Value
    ----                           -----
    1                              124.
    0                              123.124.125

    With the ?:

    "123.124.125" -match "(?:\d{1,3}\.){2}\d{1,3}"
    $matches

    Name                           Value
    ----                           -----
    0                              123.124.125

    You can find out more about these using Get-Help about_regular_expressions
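
    To make the difference between \d? and \d+ concrete, here is a minimal sketch. The one-digit line is from the question; the two-digit line and the variable names are assumptions added for illustration.

```powershell
# \d? matches zero or one digit, so a two-digit score fails the anchored pattern.
$strict = '^Team\s\d?\sscore:\s\d?$'
# \d+ (equivalent to \d{1,}) matches one or more digits.
$loose  = '^Team\s\d+\sscore:\s\d+$'

$oneDigit = 'Team 1 score: 1'    # line from the question
$twoDigit = 'Team 1 score: 12'   # hypothetical two-digit score

$results = @(
    ($oneDigit -match $strict)   # True
    ($twoDigit -match $strict)   # False
    ($oneDigit -match $loose)    # True
    ($twoDigit -match $loose)    # True
)
$results
```

    With the \d? version, a line such as the hypothetical two-digit one would be silently skipped by the switch rather than raising an error.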

    Hope this helps...


    Boe Prox


  • Monday, April 02, 2012 12:19 AM
     

    Tried this script for parsing text only out of a massive document, but I received the error message below:

    Missing condition in switch statement clause.
    At C:\Users\Sean\parsefile.ps1:12 char:2
    + } <<<< 
        + CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
        + FullyQualifiedErrorId : MissingSwitchConditionExpression

  • Monday, April 02, 2012 12:48 AM
     
    Can you post the full code that you are using?

    Boe Prox


  • Monday, April 02, 2012 2:54 PM
     
    Thank you for your really helpful answer, but I still don't quite understand why this ends up with a single match of (124.). To me it seems a bit arbitrary that it captures only (124.) and not also (123.). What did I misinterpret? :)

    "123.124.125" -match "(\d{1,3}\.){2}\d{1,3}"
    $Matches

    Name                           Value
    ----                           -----
    1                              124.
    0                              123.124.125

  • Monday, April 02, 2012 3:22 PM
     

    Yeah, thanks.  My apologies; I had forgotten a bracket.  This is the code I ran. It completed without errors, but I am unable to locate the output file (I used Windows Search because it didn't land in the location specified):

    $Outputlog = 'C:\thenewfile.txt'
    Switch -regex -File ("C:\results.txt") {
        "^Program\s\d?$" {
            $_ | Out-File -Append $outputlog
        }
        "^User\s\d?$" {
            $_ | Out-File -Append $outputlog
        }
        "^Last\s\d?$" {
            $_ | Out-File -Append $outputlog
        }
        Default {}
    }

  • Monday, April 02, 2012 5:46 PM
     
     
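    That's because none of the patterns matched any line, so Out-File was never called and the output file was never created.
    Substitute
    Default {$_ | Out-File -Append $outputlog}
    and it will create the output file, probably as a copy of the input file.
     
    Then you can change it back and debug your matching logic.
     
    Also, I'd use single quotes around the strings, especially the regexes, so that PowerShell does not try to expand anything after a $ as a variable.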
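
    One way to debug the matching logic is to test a pattern against a few lines directly with -match. This is only a sketch: the real contents of C:\results.txt are not shown in the thread, so the sample lines below are hypothetical.

```powershell
# Hypothetical sample lines; the real contents of C:\results.txt are not shown in the thread.
$samples = 'Program 7', 'Program 42', 'User 3'
# Single quotes keep PowerShell from expanding anything after a $ in the pattern.
$pattern = '^Program\s\d?$'

$report = foreach ($line in $samples) {
    # Emit one object per line so mismatches are easy to spot.
    New-Object PSObject -Property @{ Line = $line; Matched = ($line -match $pattern) }
}
$report | Format-Table Line, Matched -AutoSize
```

    Here only 'Program 7' matches: 'Program 42' has two digits, which \d? does not allow, and 'User 3' does not start with "Program". Any trailing text after the digit would also make the anchored pattern fail.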
     
  • Monday, April 02, 2012 11:13 PM
     
     Answered


    The reason it only grabs the 124. is that a repeated capture group keeps only its last iteration: (\d{1,3}\.) first matches 123. and then 124., and the second match overwrites the first. In the same way, the following example captures the last item (125.) in the capture group (which is why I use the ?: to disregard it completely in my solution):

    "123.124.125.126." -match "(\d{1,3}\.){3}"
    $matches

    Name                           Value
    ----                           -----
    1                              125.
    0                              123.124.125.
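
    If every iteration of the repeated group is needed rather than just the last, the .NET Match object keeps them in Groups[1].Captures. This is a small sketch of that alternative, not part of the solution above; the variable names are made up for illustration.

```powershell
# -match keeps only the last iteration of a repeated group in $Matches[1].
# The .NET Match object also records every iteration in Groups[1].Captures.
$m = [regex]::Match('123.124.125', '(\d{1,3}\.){2}\d{1,3}')

$lastCapture = $m.Groups[1].Value
$allCaptures = @($m.Groups[1].Captures | ForEach-Object { $_.Value })

$lastCapture    # 124.
$allCaptures    # 123. and 124.
```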



    Boe Prox


  • Tuesday, April 03, 2012 7:38 PM
     
    Thank you sir, really helpful explanation as usual .. :-)