PS to search for multiple strings and exact matches in a Word document

  • Question

  • Thanks to @albert ling for the below script

    #Stop-Process -Name 'WINWORD'
    $word = New-Object -ComObject Word.Application
    $source = 'D:\source'
    $destination = 'D:\destination'
    $docs = Get-ChildItem -Path $source | Where-Object {$_.Name -match '.docx'}
    foreach ($doc in $docs)
    {
        if ($word.Documents.Open($doc.FullName).Content.Find.Execute('Security'))
        {
            $word.Application.ActiveDocument.Close()
            Write-Host "$doc contains 'Security'"
            Move-Item -Path $doc.FullName -Destination $destination -Verbose
        }
        else
        {
            $word.Application.ActiveDocument.Close()
        }
    }

    I am trying to search .doc and .docx files for three different words: "Security", "OT" and "Resource". If a document contains any of these as an exact word, I want to move it to a network location. Searching for "Security" works fine, but when I replace it with "OT" the search also matches words such as "other" and "total". I want it to match only when "OT" appears as a word on its own. I also tried to check all three words in one line with "if($word.Documents.Open($doc.FullName).Range().text -like  'Security' -or 'OT' -or 'Resource')",

    but this did not work either. Any suggestions, please?


    Mohammed...

    Monday, November 6, 2017 4:58 PM

Answers

  • Hi Mohammed,

    Based on your situation, I modified the script to fit your needs more closely. First, I changed the filter to $_.Name -like '*.doc*' so that it picks up both .doc and .docx files. Second, the two $true arguments passed to Find.Execute are MatchCase and MatchWholeWord, which make the search case sensitive and restrict it to whole words. The script and a reference link follow; I hope they are helpful to you.
    # Note: this closes every running Word instance, including documents a user has open.
    Stop-Process -Name 'WINWORD' -ErrorAction SilentlyContinue
    $word = New-Object -ComObject Word.Application
    $source = 'D:\source'
    $destination = 'D:\destination'
    $keyword1 = 'Security'
    $keyword2 = 'OT'
    $keyword3 = 'Resource'
    $docs = Get-ChildItem -Path $source | Where-Object {$_.Name -like '*.doc*'}
    foreach ($doc in $docs)
    {
        # Open the document once and reuse it for all three searches.
        $document = $word.Documents.Open($doc.FullName)
        # Arguments: FindText, MatchCase, MatchWholeWord
        $condition1 = $document.Content.Find.Execute($keyword1,$true,$true)
        $condition2 = $document.Content.Find.Execute($keyword2,$true,$true)
        $condition3 = $document.Content.Find.Execute($keyword3,$true,$true)
        # Close before moving so the file is no longer in use.
        $document.Close()
        if ($condition1 -or $condition2 -or $condition3)
        {
            Write-Host "$($doc.Name) contains $keyword1 or $keyword2 or $keyword3"
            Move-Item -Path $doc.FullName -Destination $destination
        }
        else
        {
            Write-Host "$($doc.Name) does not contain $keyword1 or $keyword2 or $keyword3"
        }
    }
    # Shut down the Word instance started by this script.
    $word.Quit()
    Find.Execute Method (Word)
    https://msdn.microsoft.com/en-us/vba/word-vba/articles/find-execute-method-word
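
    On the -like 'Security' -or 'OT' -or 'Resource' attempt from the question: as far as I can tell it fails for two reasons. -like without wildcards is an exact comparison, and each -or 'OT' is evaluated as a separate operand; a non-empty string always counts as true, so the condition is true for every document. If the document text is already in a string, one case-sensitive regex with \b word boundaries can test all keywords at once. A minimal sketch only, separate from the script above; Test-WholeWord and the sample sentences are hypothetical names and data of my own:

    ```powershell
    function Test-WholeWord {
        param(
            [string]$Text,
            [string[]]$Keywords
        )
        # Escape each keyword, join with '|', and wrap in \b so 'OT' does not match 'other' or 'total'.
        $alternatives = ($Keywords | ForEach-Object { [regex]::Escape($_) }) -join '|'
        $pattern = '\b(?:' + $alternatives + ')\b'
        # -cmatch is the case-sensitive form of -match.
        return $Text -cmatch $pattern
    }

    $keywords = 'Security', 'OT', 'Resource'
    $hit  = Test-WholeWord -Text 'Claim OT for the weekend shift' -Keywords $keywords
    $miss = Test-WholeWord -Text 'The other total is unchanged'   -Keywords $keywords

    # Precedence pitfall: parsed as ('abc' -like 'Security') -or 'OT' -or 'Resource'
    $alwaysTrue = 'abc' -like 'Security' -or 'OT' -or 'Resource'
    ```

    Here $hit should be true, $miss false, and $alwaysTrue true even though 'abc' contains none of the keywords.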

    Please let us know if you would like further assistance.

    Best Regards,
    Albert Ling

    Wednesday, November 8, 2017 7:06 AM

All replies

  • Use Select-String (it is very fast):

    https://social.technet.microsoft.com/Forums/ie/en-US/535ac157-bb82-4442-a24b-0a59751395a5/find-multiple-strings-in-text-files-powershell?forum=winserverpowershell

    or

    gci C:\Logs| select-string -pattern '(VendorEnquiry.*Failed)|(Failed.*VendorEnquiry)'


    Chris

    Monday, November 6, 2017 5:05 PM
  • DOCX files are binary files and cannot be manipulated using text methods.

    A DOCX file is an archive of XML files.  Change the extension to ZIP and double-click to see the contents.


    \_(ツ)_/


    • Edited by jrv Monday, November 6, 2017 5:16 PM
    Monday, November 6, 2017 5:13 PM
  • @jrv I did not understand your point about changing the extension to ZIP and double-clicking to see the contents. Is it possible to check a .docx file for an exact word such as 'OT' and then move the file?

    I tried the 'Security' string and it worked perfectly, but when I do the same with a two-letter search it also picks up words that contain those letters, such as "total" and "other".


    Mohammed...

    Tuesday, November 7, 2017 12:53 AM
  • No. The file is a compressed collection of XML files and folders.  It cannot be searched as plain text.  The ZIP extension lets you use the system zip service to open the file in uncompressed form, so you can see that it contains XML files.  When compressed as a DOCX file, it is just compressed binary sequences that are unreadable.

    Start by reading up on what a binary file is.  Next learn how computers use binary files.  After that read up on MS Office file formats to see how they are not readable.

    To read a Word file you must use the Word automation objects and the object methods and properties to find strings within the file.
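
    That said, for .docx files only (not the older binary .doc format), the main text is stored in word/document.xml inside the archive, so it can also be read without starting Word. A rough sketch, assuming .NET's System.IO.Compression classes are available; Get-DocxText is a hypothetical helper name, XML entities are not decoded, and words separated only by tabs or line breaks may run together:

    ```powershell
    Add-Type -AssemblyName System.IO.Compression.FileSystem

    function Get-DocxText {
        param([string]$Path)   # use a full path; .NET does not follow the PowerShell location
        $zip = [System.IO.Compression.ZipFile]::OpenRead($Path)
        try {
            $entry = $zip.GetEntry('word/document.xml')
            if ($null -eq $entry) { return '' }
            $reader = New-Object System.IO.StreamReader($entry.Open())
            try { $xml = $reader.ReadToEnd() } finally { $reader.Dispose() }
        }
        finally { $zip.Dispose() }
        # Turn paragraph ends into spaces, then drop all remaining tags so only the text is left.
        return ($xml -replace '</w:p>', ' ' -replace '<[^>]+>', '')
    }

    # Usage (hypothetical path):
    # (Get-DocxText -Path 'D:\source\report.docx') -cmatch '\bOT\b'
    ```

    Word automation remains the safer route when the documents include .doc files or heavily formatted text.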


    \_(ツ)_/

    Tuesday, November 7, 2017 12:59 AM
  • Thanks for all your help, friends.

    The script below is working fine for me. I just need one more piece of help: is it possible to search for 3 or 4 strings in the same script? I tried adding -or and passing 'security' -or 'OT' -or 'Resource', but it did not work. Any suggestions, please?

    Also, I plan to use this in a production environment and run the script when users log in. I would like to add a Result.csv output file listing the files that were moved from source to destination, along with the filter details.

    $Path = "D:\PS"
    $word = New-Object -ComObject word.application
    $word.visible =$false
    $out =  "F:\New folder"
    $wordExts = '.doc','.docx'
    $findtext = 'Apprentice'
    $docs = Get-ChildItem -Path "$Path" -Recurse |Where-Object {$_.Extension -in $wordExts}
    foreach ($doc in $docs)
    {
        if($word.Documents.Open($doc.FullName).content.find.execute($findtext,$false,$true,$false,$false,$false,$true,1))
        {
            $word.Application.ActiveDocument.close()
            Write-Host "$doc contains 'mentioned wording'"
            Move-Item -Path $doc.FullName -Destination $out -Verbose
        }
        else
        {
            $word.Application.ActiveDocument.close()
        }
    }

    Thank you, @jrv, but I need help with two more things on the same topic, please.
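
    For the Result.csv request, one possible approach is to collect one record per moved file and write them all out with Export-Csv at the end. A minimal sketch under assumptions: New-MoveRecord, the column names, the sample values and the temp-folder location of Result.csv are all hypothetical, and in the real loop the record would be created right after Move-Item succeeds:

    ```powershell
    function New-MoveRecord {
        param(
            [string]$FileName,
            [string]$Source,
            [string]$Destination,
            [string]$Keyword
        )
        # One row of the report: which file was moved, from where to where, and why.
        [pscustomobject]@{
            FileName    = $FileName
            Source      = $Source
            Destination = $Destination
            Keyword     = $Keyword
            MovedAt     = (Get-Date).ToString('s')
        }
    }

    $results = @()
    # Inside the foreach loop, after Move-Item (sample values shown here):
    $results += New-MoveRecord -FileName 'report.docx' -Source 'D:\PS' -Destination 'F:\New folder' -Keyword 'OT'

    # Write the report once all documents have been processed.
    $csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'Result.csv'
    $results | Export-Csv -Path $csvPath -NoTypeInformation
    ```

    Writing the file once at the end avoids reopening the CSV for every document.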


    Mohammed...

    Tuesday, November 7, 2017 8:33 AM
  • @Albert Ling, thank you very much, Sir!

    Thank you sincerely for taking the time to look into my query and for your valuable answer. I was making my script more elaborate than it needed to be; this clearer script makes more sense.

    I have now tried it in my virtual machine and it is working great. I will run the same in my production environment tomorrow.

    I started PowerShell with zero knowledge; now I can at least understand the commands and am eager to write small scripts. Thank you once again to everyone for taking the time to read and reply to my query.


    Mohammed...

    Wednesday, November 8, 2017 3:29 PM