I have a requirement to remove all the text that precedes a particular XML tag.
Sample XML file, sample.xml:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE ichicsr SYSTEM "http://eudravigilance.ema.europa.eu/dtd/icsr21XML.dtd">
<ichicsr lang="en">
<tag1><tag1/>
<tag2><tag2/>
<tag3><tag3/>
</ichicsr>
As shown in sample.xml, I want to create a new XML file in a different path from sample.xml, with all the text before the tag <ichicsr lang="en"> deleted. The target XML would look like this:
<ichicsr lang="en">
<tag1><tag1/>
<tag2><tag2/>
<tag3><tag3/>
</ichicsr>
In other words, I want the target XML file to contain everything from the <ichicsr lang="en"> tag through the closing </ichicsr> tag, inclusive.
Below is the code I am using, which is not working. Apparently the .Where() method and its 'SkipUntil' mode are not available in PowerShell 2.0. I would appreciate any workaround for this code. Please keep in mind that we cannot do any manipulation based on line numbers, because the line numbering varies from file to file.
# set your directory
$file_temp = "C:\DTD_R2_RAW"
# grab your files
$xml_files = Get-ChildItem $file_temp *.XML -Recurse
# designate your keyword
$keyword = "my keyword"
# create your new 'keep' folder
New-Item -ItemType Directory C:\DTD_R2_RAW\Keep
# if there are files, do something...
if ($xml_files) {
    # for each file, skip all lines until you find the keyword, then output everything from that point
    ForEach ($x in $xml_files) {
        $file = Get-Content -Path ($file_temp + '\' + $x.Name)
        $keep = $file.Where({$_ -match $keyword}, 'SkipUntil') | Out-File C:\DTD_R2_RAW\keep\$($x.name)
    }
}
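One possible workaround for PowerShell 2.0, as a minimal sketch rather than a definitive fix: replace the .Where(..., 'SkipUntil') call with a plain loop that starts emitting lines once the keyword has been seen. The function names Select-FromKeyword and Copy-XmlFromKeyword are hypothetical, and the sketch assumes the opening tag starts its own line, as in sample.xml. It also drops -Recurse, on the assumption that the Keep folder sits inside the source folder and should not be re-processed.

```powershell
# Hypothetical helper: returns every line from the first one matching $Keyword onward.
function Select-FromKeyword {
    param([string[]]$Lines, [string]$Keyword)
    $found = $false
    foreach ($line in $Lines) {
        # -match treats $Keyword as a regular expression
        if (-not $found -and $line -match $Keyword) { $found = $true }
        # once the keyword has been seen, emit this line and every later one
        if ($found) { $line }
    }
}

# Hypothetical wrapper: trims each *.xml file in $SourceDir and writes the result to $TargetDir.
function Copy-XmlFromKeyword {
    param([string]$SourceDir, [string]$TargetDir, [string]$Keyword)
    # create the target folder only if it does not exist yet
    if (-not (Test-Path $TargetDir)) {
        New-Item -ItemType Directory -Path $TargetDir | Out-Null
    }
    foreach ($x in (Get-ChildItem -Path $SourceDir -Filter *.xml)) {
        if ($x.PSIsContainer) { continue }   # skip directories
        $kept = Select-FromKeyword -Lines (Get-Content -Path $x.FullName) -Keyword $Keyword
        $kept | Out-File -FilePath (Join-Path $TargetDir $x.Name)
    }
}
```

With the paths from the script above, this could be called as Copy-XmlFromKeyword -SourceDir 'C:\DTD_R2_RAW' -TargetDir 'C:\DTD_R2_RAW\Keep' -Keyword '<ichicsr'. Because the match is by content and not by position, it does not depend on line numbers. Note that Out-File in Windows PowerShell writes UTF-16 by default, so an -Encoding argument may be needed if the target files must keep another encoding.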