String Searching and Export Results

  • Question

  • Hello.  This is my first post regarding PowerShell.  I know very, very little about PowerShell, and am hoping that someone can point me in the right direction.  I am trying to write a script that will go into a directory (and subdirectories) full of .txt files and look for a specific phrase.  The phrase can appear multiple times.  Here's an example:

    ABC*00**00*ZZ*CLP*1234567*2868.88*36.28**WX*NON-PARTICIPATING~JKLMNOP**BCD*AS*UYGUYG*HGVHGVHGV*CLP*1234567*2868.88*36.28**WX*NON-PARTICIPATING~

    So in this code example, the phrase appears twice.  The phrase begins with CLP and ends with NON-PARTICIPATING.  My code (below) will not work.  I am using regex and when I put in my wildcard for searching it treats the two CLP phrases above as one and returns everything in the middle, like this:

    CLP*1234567*2868.88*36.28**WX*NON-PARTICIPATING~JKLMNOP**BCD*AS*UYGUYG*HGVHGVHGV*CLP*1234567*2868.88*36.28**WX*NON-PARTICIPATING~

    instead of this:

    CLP*1234567*2868.88*36.28**WX*NON-PARTICIPATING

    CLP*1234567*2868.88*36.28**WX*NON-PARTICIPATING

    Also, I am trying to export all of the found phrases, along with the name of the file they were found in, to a .csv file (or text if I have to).  Here's my code so far:

    $searchDir = "C:\\Users\me\"
    $outputDir = "C:\Users\me\results\RESULT.csv" 
    $files = Get-ChildItem $searchDir | Where-Object { $_.FullName -like "*.txt"}
    Foreach ($file in $files) {
       $FilteredContent = Get-Content $file.FullName
       $regex = [regex]'CLP.*NON-PARTICIPATING'
                                    #$matches = ([regex]'CLP.*NON-PARTICIPATING').Matches($FilteredContent)
       $matches = $regex.Matches($FilteredContent)
       $Output = $matches
       $Output | Out-File $outputDir -Append

    Whoever reads this, I truly do appreciate your help.  This has been a real challenge for me (multiple days).  Thank you.


    Robert

    Monday, February 17, 2020 6:36 PM

All replies

  • This is a common issue when you try to use PowerShell before you have learned PowerShell.  You also have not read the help for the cmdlets.

    That said here is the way to do this in PowerShell.

    Get-ChildItem C:\\Users\me\* -include *.txt |
        Get-Content |
        Select-String -Pattern 'CLP.*NON-PARTICIPATING' |
        Select-Object -Expand Line |
        Out-File C:\Users\me\results\RESULT.csv
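
    Note that Select-Object -Expand Line returns each matching line whole, without the file name.  If the goal is one CSV row per phrase together with the file it came from, something along these lines may be closer.  This is only a sketch under a few assumptions: the function name Find-PhraseInFiles and the column names File, LineNumber and Phrase are made up for illustration, the non-greedy pattern 'CLP.+?NON-PARTICIPATING' is used so that each phrase on a line is matched separately, and -Recurse is added because the question mentions subdirectories.

    ```powershell
    # Hypothetical helper: emits one object per phrase found, with its source file.
    function Find-PhraseInFiles {
        param(
            [Parameter(Mandatory)] [string] $Path,
            [string] $Pattern = 'CLP.+?NON-PARTICIPATING',   # non-greedy: one match per phrase
            [string] $Filter  = '*.txt'
        )

        # Passing the files themselves (not their content) to Select-String
        # keeps the file path available on every result.
        Get-ChildItem -Path $Path -Filter $Filter -File -Recurse |
            Select-String -Pattern $Pattern -AllMatches |
            ForEach-Object {
                $hit = $_
                # -AllMatches can yield several matches on the same line.
                foreach ($m in $hit.Matches) {
                    [pscustomobject]@{
                        File       = $hit.Path
                        LineNumber = $hit.LineNumber
                        Phrase     = $m.Value
                    }
                }
            }
    }
    ```

    With the paths from the question, it could be used like this:

        Find-PhraseInFiles -Path 'C:\Users\me' | Export-Csv -Path 'C:\Users\me\results\RESULT.csv' -NoTypeInformation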


    \_(ツ)_/



    • Edited by jrv Monday, February 17, 2020 7:39 PM
    Monday, February 17, 2020 7:37 PM
  • Your regex is incorrect. Your use of a "greedy" pattern (".*") causes the regex engine to match everything up to the end of the string, and then "backtrack" from there, looking for "NON-PARTICIPATING". It finds the last instance of that string (here, the second), not the first. The "*" quantifier also allows zero characters between the two markers, which may not be your intention (unless "CLPNON-PARTICIPATING" is a valid match).

    Try this regex instead:

        $regex = [regex]'CLP.+?NON-PARTICIPATING'

    The ".+" matches one or more characters, and the trailing "?" makes that quantifier non-greedy, so each match stops at the first "NON-PARTICIPATING" that follows a "CLP".

    Your $matches should now hold as many "phrases" as you expect.
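
    To see the difference between the two patterns, they can be run side by side against the sample line from the question.  This is just an illustrative sketch; the variable names $sample, $greedy and $lazy are made up here and are not part of the original script.

    ```powershell
    # Sample line from the question (single quotes, so nothing is expanded).
    $sample = 'ABC*00**00*ZZ*CLP*1234567*2868.88*36.28**WX*NON-PARTICIPATING~JKLMNOP**BCD*AS*UYGUYG*HGVHGVHGV*CLP*1234567*2868.88*36.28**WX*NON-PARTICIPATING~'

    # Greedy: runs from the first CLP to the last NON-PARTICIPATING.
    $greedy = [regex]::Matches($sample, 'CLP.*NON-PARTICIPATING')

    # Non-greedy: stops at the first NON-PARTICIPATING after each CLP.
    $lazy = [regex]::Matches($sample, 'CLP.+?NON-PARTICIPATING')

    $greedy.Count                          # 1
    $lazy.Count                            # 2
    $lazy | ForEach-Object { $_.Value }    # one phrase per line
    ```

    A variable name other than $matches is used on purpose: $Matches is an automatic variable that PowerShell fills in after the -match operator, so assigning to it in a script can be confusing.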


    --- Rich Matheisen MCSE&I, Exchange Ex-MVP (16 years)

    Monday, February 17, 2020 8:49 PM
  • Thank you for your help.  I will try your suggestion.

    Robert

    Tuesday, February 18, 2020 1:30 PM
  • Thank you for your help.  I will try your suggestion as well.

    Robert

    Tuesday, February 18, 2020 1:30 PM
  • Your suggestion worked like a charm, and I thank you.

    Robert

    Tuesday, February 18, 2020 2:48 PM