Looking to replace text between two words - powershell 1.0

  • Question

  • An example of what I have is:

    <br>
    <br>
    <center><p><b>Hello out there!</b></p></center>
    I want to use the -replace operator to replace everything between <center> and </center>. Is there a way to do that without writing out the text that sits between the two tags?



    Oregon State

    Tuesday, May 27, 2014 6:04 PM

Answers

  • Technically you can, but using regex to parse HTML is notoriously error-prone.

    Here's one pattern that could work with your sample text, as written:

    $test = @'
    <br>
    <br>
    <center><p><b>Hello out there!</b></p></center>
    '@
    
    $test -replace '(?<=<center>).*?(?=</center>)', 'This is my new text.'

    Edit:  I should point out that I didn't test this on PowerShell 1.0, as I don't even have a copy of that anymore.  You should upgrade to PowerShell 2.0 at a minimum these days, for any kind of support.
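
    One caveat worth illustrating: in .NET regular expressions, "." does not match a line feed by default, so the pattern above only matches when both tags are on the same line. A minimal sketch, assuming a made-up sample in which the <center> block spans several lines; the inline (?s) option lets "." cross line breaks:

    ```powershell
    # Hypothetical sample: the opening and closing tags are on different lines.
    $html = "<center>`n<p><b>Hello out there!</b></p>`n</center>"

    # Without (?s), '.' stops at the line feed, so nothing matches and the text is unchanged.
    $unchanged = $html -replace '(?<=<center>).*?(?=</center>)', 'New text'

    # With (?s), '.' also matches line feeds, so the whole inner block is replaced.
    $replaced = $html -replace '(?s)(?<=<center>).*?(?=</center>)', 'New text'

    $unchanged -ceq $html   # True
    $replaced               # <center>New text</center>
    ```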

    Tuesday, May 27, 2014 6:09 PM

All replies

  • What about finding the positions of the center tags and replacing it that way?  

    Oregon State

    Tuesday, May 27, 2014 6:52 PM
  • David's code does that already (?).


    PS C:\> '<center><p><b>Hello out there!</b></p></center>' -replace '(?<=<center>).*?(?=</center>)', 'This is some text'
    <center>This is some text</center>
    


    -- Bill Stewart [Bill_Stewart]

    Tuesday, May 27, 2014 6:57 PM
    Moderator
  • OK, however, it is not working. It gets rid of the start and end tags and that's it.

    Here's what I'm trying to do:

    $remove1 = '(?<=<!--Start-->).*?(?=<!--End-->)'
    $remove2 = '<success>'
    
    Get-ChildItem 'C:\Documents and Settings\Desktop\Inetpub\wwwroot\*\login.aspx' | ForEach {
        
        (Get-Content $_.FullName) | ForEach {
        
            $_ -replace $remove1,$remove2
            
        } | Set-Content $_.FullName
    
    }
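
    One possible reason a per-line loop like this never matches, assuming the <!--Start--> and <!--End--> markers sit on different lines of login.aspx: Get-Content emits the file one line at a time, so no single line contains both markers. A rough sketch that reads the file as one string instead. Replace-BetweenMarkers is a hypothetical helper name, and the temp file only stands in for the real page:

    ```powershell
    # Hypothetical helper: replaces everything between the two comment markers.
    function Replace-BetweenMarkers([string]$Text, [string]$NewText) {
        # (?s) lets '.' cross line breaks; the lookarounds leave the markers in place.
        $Text -replace '(?s)(?<=<!--Start-->).*?(?=<!--End-->)', $NewText
    }

    # Stand-in for login.aspx: a temp file whose markers are on separate lines.
    $path = [System.IO.Path]::GetTempFileName()
    [System.IO.File]::WriteAllText($path, "<!--Start-->`n<p>old</p>`n<!--End-->")

    # ReadAllText returns the whole file as a single string, so a match can span lines.
    $text = [System.IO.File]::ReadAllText($path)
    [System.IO.File]::WriteAllText($path, (Replace-BetweenMarkers $text '<success>'))

    $result = [System.IO.File]::ReadAllText($path)
    Remove-Item $path
    $result   # <!--Start--><success><!--End-->
    ```

    In the original loop, the same ReadAllText / WriteAllText pair would go inside the outer ForEach, using $_.FullName as the path. Note that WriteAllText writes UTF-8 without a byte order mark unless an encoding is passed, which may differ from how the pages were saved.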

    Oregon State



    Tuesday, May 27, 2014 7:04 PM
  • Not sure what results you're expecting?


    PS C:\> '<!--Start-->text<!--End-->' -replace '(?<=<!--Start-->).*?(?=<!--End-->)','<success>'
    <!--Start--><success><!--End-->
    



    -- Bill Stewart [Bill_Stewart]

    Tuesday, May 27, 2014 7:14 PM
    Moderator
  • Yes, I tested it that way, and you're right, it does work. However, the script from my "Here's what I'm trying to do" post is still not working; it behaves the way I described.

    Sorry, you're right. It works.

    The error is in the parsing of the HTML.

     


    Oregon State


    Tuesday, May 27, 2014 7:18 PM
  • I forgot to ask, but what do the question marks mean?

    (?<=<!--Start-->).*?(?=<!--End-->)



    Oregon State

    Tuesday, May 27, 2014 7:25 PM
  • (?<=regex) and (?=regex) are the lookbehind and lookahead assertion operators.  They match a position in the text where the text immediately preceding or following the position matches the "regex" pattern that's within the parentheses, but the text matched by that pattern doesn't become part of the overall match.  This allows you to do the replace operation without having to inject the start / end patterns again yourself.
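
    To make that concrete, a small sketch (the sample strings are made up for illustration). It compares what a match contains with and without the lookarounds. It also covers the remaining question mark, the one in ".*?", which makes the quantifier lazy so it stops at the first closing marker instead of the last:

    ```powershell
    $text = '<!--Start-->old text<!--End-->'

    # With lookarounds, only the inner text is part of the match.
    $inner = [regex]::Match($text, '(?<=<!--Start-->).*?(?=<!--End-->)').Value

    # Without them, the markers are consumed as part of the match.
    $whole = [regex]::Match($text, '<!--Start-->.*?<!--End-->').Value

    $inner   # old text
    $whole   # <!--Start-->old text<!--End-->

    # Lazy (.*?) versus greedy (.*) when the tags occur twice on one line.
    $two    = '<center>a</center><center>b</center>'
    $lazy   = $two -replace '(?<=<center>).*?(?=</center>)', 'X'
    $greedy = $two -replace '(?<=<center>).*(?=</center>)', 'X'

    $lazy     # <center>X</center><center>X</center>
    $greedy   # <center>X</center>
    ```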

    An alternative would be to do it this way:

    $text = 'Start1 Inner Text End1'
    
    $text -replace '(Start\d*).*?(End\d*)', '$1 New Inner Text $2'

    In that pattern, I didn't use the lookahead / lookbehind assertions, and instead just allowed the "Start1" and "End1" text to become part of the match that's being replaced.  As a result, I had to include $1 and $2 in the replacement text to make sure those strings were still part of the output.
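
    Two details of the capture-group form are easy to trip over, so here is a short sketch of both, reusing the sample string from above (the "0" suffix is only there for illustration). The replacement string should be single-quoted so PowerShell does not try to expand $1 as a variable, and a group reference followed directly by a digit needs the braced form ${1}:

    ```powershell
    $text = 'Start1 Inner Text End1'

    # Single quotes pass $1 and $2 through to the regex engine untouched.
    $kept = $text -replace '(Start\d*).*?(End\d*)', '$1 New Inner Text $2'

    # ${1} marks where the group number ends, so the following 0 is literal text.
    $braced = $text -replace '(Start\d*).*?(End\d*)', '${1}0 New Inner Text $2'

    $kept     # Start1 New Inner Text End1
    $braced   # Start10 New Inner Text End1
    ```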

    Tuesday, May 27, 2014 7:36 PM
  • Documentation:

    http://msdn.microsoft.com/en-us/library/az24scfc.aspx

    You're looking for "Grouping Constructs".


    -- Bill Stewart [Bill_Stewart]

    Tuesday, May 27, 2014 7:37 PM
    Moderator