# Selecting from string

### Question

• I would need to select all characters to the end of string starting from a given pattern. How can this be done?

yaro

Tuesday, May 15, 2018 9:02 AM

• You have certainly omitted a lot in your question

    # Text file with your information
    $file = Get-Content .\text.txt
    # Pattern to match everything after a given word (SERVEPTP1 in this case)
    $pattern ="^.*(?<=(SERVEPTP1))(?s)(.*$)"
    # Regex object
    [System.Text.RegularExpressions.Regex]$regex = New-object System.Text.RegularExpressions.Regex -ArgumentList ($pattern, [System.Text.RegularExpressions.RegexOptions]::IgnoreCase)
    # Do the matches
    [System.Text.RegularExpressions.MatchCollection]$collection = $regex.Matches($file)
    # Define an empty string for the output
    [string]$result=[System.String]::Empty
    # Just get groups 1 and 2 (1 is the word, 2 is the rest of the text)
    $result = "$($collection.Groups.GetValue(1))$($collection.Groups.GetValue(2))"

    # Print it
    $result
    # Remove all variables but result
    Remove-Variable file,pattern,regex,collection

• Edited by Thursday, May 17, 2018 12:17 AM
• Proposed as answer by Thursday, May 17, 2018 12:17 AM
• Marked as answer by Monday, May 21, 2018 8:21 AM
Thursday, May 17, 2018 12:16 AM

### All replies

•

    'this is an abc string' -match '(abc.*)'
    $matches[1]

\_(ツ)_/

• Proposed as answer by Tuesday, May 15, 2018 10:00 AM
Tuesday, May 15, 2018 9:08 AM
• Sorry forgot to mention it's a multiline string i'm talking about.

yaro

Tuesday, May 15, 2018 11:05 AM
•

    $lines | ForEach-Object{
        'this is an abc string' -match '(abc.*)'
        $matches[1]
    }


\_(ツ)_/

Tuesday, May 15, 2018 11:20 AM
• Sorry jrv but I'm not sure if we're on the same page. Let's assume my string is a whole page from a book and what I'd like to do is to find a combination of characters in the text that I know occurs only once on that page. When the word is found I'd like to select all the rest of the page including the word and put it in a new variable. I was hoping there is some way to easily define end of string including new lines so the code goes right to the last dot.

yaro

Wednesday, May 16, 2018 9:06 AM
• You are making a lot of assumptions about a page of text that just don't happen to be true.  A page does not have a "last dot" that is at the end of the page.  A file of text has no pages.  Formatted text files like Word are not text.  Word and other structured documents may have pages that are defined by the program that creates the document.

You can learn Regular Expressions and use them to extract text in almost any way you want with some effort.

The code I posted will select a string of characters to the end of the line or the end of the page, depending on the options you set in the RegEx.

\_(ツ)_/

Wednesday, May 16, 2018 9:10 AM
• It's nothing to do with formatted text. It's a config file in this case I need to parse. OK, so in your example $lines is a string consisting of lines of text. In that case what's 'this is an abc string'? Shouldn't it just be $_ ?

yaro

Wednesday, May 16, 2018 9:21 AM
• Parsing lines in a config file is done based on the format of the config file.

Post an example of the config file so I can see the format.

\_(ツ)_/

Wednesday, May 16, 2018 9:26 AM
• Oh it's a simple multiline text nothing special to it.

e.g. where I would need to select everything following "SERVERPTP1 (", but the number of lines following this pattern may vary, which is why I'm looking to somehow tell PS to go to the end of the string rather than to select, say, $lines[8-12]

     1 SOURCE0 () { PTPDOMAIN=0; PTPCLIENTVERSION=2; IFACE=eth0; }
     2 SOURCE1 () { NTPSERVER=10.5.3.45; }
     3 SERVEPTP0 () {
     4 PTPSERVERVERSION=2
     5 PTPSERVERDOMAIN=0
     6 PTPSERVERSYNCRATE=0.9
     7 IFACE=eth1
     8 }
     9 SERVEPTP1 () {
    10 PTPSERVERVERSION=2
    11 PTPSERVERDOMAIN=0
    12 PTPSERVERSYNCRATE=0.9
    13 IFACE=eth2
    14 }

yaro

Wednesday, May 16, 2018 9:57 AM
• Just read in a loop until SERVERPTP1 is detected, then output all lines to the end. This has nothing to do with documents and pages. I don't think the file has line numbers. Loop until SERVERPTP1, then output it plus all subsequent lines.

\_(ツ)_/

Wednesday, May 16, 2018 10:02 AM
• If you use the -Raw parameter of Get-Content:

    Get-Content $File -Raw

the text is not stored in an array but in a single string. Line breaks are stored as characters. Now you can use multiline regex patterns to select content. If you still want to use single-line patterns, you can replace the line breaks with a placeholder (one that does not otherwise occur in the text, e.g. "##newline##"). After finding your text, you can replace the placeholder with line breaks again.
Wednesday, May 16, 2018 10:35 AM
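The two suggestions above (loop over the lines until the marker appears, or read the file as one string and use a regex) can be sketched roughly as follows. This is only an illustration: the here-string stands in for the real config file and its content is a shortened, assumed sample, and the variable names `$fromLoop` and `$fromRegex` are made up for the example.

```powershell
# Sample text standing in for the config file (hypothetical, shortened content).
$text = @'
SERVEPTP0 () {
IFACE=eth1
}
SERVEPTP1 () {
IFACE=eth2
}
'@

# Approach 1: walk the lines and start emitting once the marker has been seen.
$lines = $text -split '\r?\n'
$found = $false
$fromLoop = foreach ($line in $lines) {
    if (-not $found -and $line -match 'SERVEPTP1') { $found = $true }
    if ($found) { $line }
}

# Approach 2: treat the text as one string (which is what Get-Content -Raw returns).
# (?s) lets '.' match line breaks, so '.*' runs from the marker to the end of the string.
$fromRegex = if ($text -match '(?s)SERVEPTP1.*') { $Matches[0] }
```

With a real file, `$text` would come from `Get-Content` with `-Raw` instead of the here-string. The loop variant returns an array of lines, while the regex variant returns a single string with the original line breaks preserved.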
• j0rt3g4, could you please explain the symbols in $pattern in plain English :)? Would ^.* be everything from the start and .*$ everything to the end? Then ?<=(SERVEPTP1) would be looking for an occurrence of SERVEPTP1, including that occurrence? Not sure about ?s. BTW the stuff in square brackets always puts me off, as it doesn't look very powershelly or legible ;) I know it's just my poor knowledge of PS, but am I right that in this case it's a way of compressing something that would otherwise require much more code?

yaro

Thursday, May 17, 2018 9:21 AM
• You can get it on the web: regex101.com, if you put the regular expression there.

• Edited by Friday, May 18, 2018 5:23 AM
Friday, May 18, 2018 5:19 AM
• How about this? The variable "$s" will contain CRLF (carriage return plus line feed) line terminators.

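The "(?sm)" regex modifier turns on both single-line and multi-line mode: "s" lets "." match line breaks, and "m" lets "^" and "$" match at the start and end of each line.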

    # returns all characters FOLLOWING the last 'hello'
    $p = "(?sm)^.+hello(.+)$"
    # returns the last 'hello' and all characters FOLLOWING the last 'hello'
    $p1 = "(?sm)^.+(hello)(.+)$"

    $s = Get-Content c:\temp\Lines.txt -raw
    $r = $s -replace "$p", '$1'
    $r1 = $s -replace "$p1", '$1$2'

--- Rich Matheisen MCSE&I, Exchange Ex-MVP (16 years)

Sunday, May 20, 2018 3:46 AM
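For anyone unsure what the inline modifiers used in this thread actually change, a small check on an in-memory string may help. This is a minimal sketch; the sample string and the variable names (`$lineOnly`, `$toEnd`, `$firstOnly`, `$perLine`) are invented for the illustration and do not come from the posts above.

```powershell
# Three lines in one string, separated by line feeds (made-up sample).
$s = "one`nhello two`nthree"

# Without (?s), '.' stops at the line break, so the match ends with the current line.
$null = $s -match 'hello.*'
$lineOnly = $Matches[0]      # 'hello two'

# With (?s), '.' also matches line breaks, so the match runs to the end of the string.
$null = $s -match '(?s)hello.*'
$toEnd = $Matches[0]         # 'hello two' + line feed + 'three'

# Without (?m), '^' matches only at the start of the whole string.
$firstOnly = [regex]::Matches($s, '^\w+').Count      # 1

# With (?m), '^' matches at the start of every line.
$perLine = [regex]::Matches($s, '(?m)^\w+').Count    # 3
```

So for "everything from the pattern to the end of the string", `(?s)` is the modifier that matters; `(?m)` only changes where `^` and `$` are allowed to match.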