Parse a file to extract Distinguished name

  • Question

  • I have an XML file (let's call it test.xml), part of which looks like this:

    <export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">

    There are many lines similar to this, and the rest of the XML is content I don't care about.

    How can I get the output to be just this:

    "CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"

    There are actually several distinguished names I want out of that file. Basically, grab everything between dn= and the > symbol.

    I tried select-string ./test.xml -pattern "dn", but it splits the output after the space onto another line, and I only want the DN. I also tried regular expressions, but I can't write regular expressions to save my butt.

    This is what I have so far:

    $input_path = './test.xml'
    $output_file = './out.txt'
    $regex = '[\dn=]{1}([A-Za-z.]*)[\>"]{1}'
    select-string -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file

    Can someone help me with the proper $regex line?

    Saturday, February 14, 2015 7:41 PM

Answers

  • If you want to use regex for this, the [regex]::Matches static method is actually easier than Select-String, IMHO.

    @'
    <export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">
    <export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">
    
    '@ | Set-Content test.xml
    
    $input_path = './test.xml'
    $output_file = './out.txt'
    
    $text = Get-Content $input_path -Raw
    
    [regex]::Matches($text,'dn="(.+?)"') |
    foreach {$_.groups[1].value} |
    Set-Content $output_file
    
    Get-Content $output_file
    
    # Output:
    # CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local
    # CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local

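    The pattern dn="(.+?)" relies on the lazy quantifier +? to stop at the first closing quote. The sketch below compares it with a greedy version and with a negated character class. The sample line and its extra type attribute are made up for illustration and are not from the original file.

    ```powershell
    # Hypothetical sample line: another quoted attribute follows dn on the same line.
    $line = '<export-error dn="CN=Doe\, John,OU=Billing,DC=XYZ,DC=local" type="sample">'

    # Lazy quantifier: the capture ends at the first quote after dn=".
    $lazy = [regex]::Match($line, 'dn="(.+?)"').Groups[1].Value

    # Greedy quantifier: the capture runs on to the last quote on the line.
    $greedy = [regex]::Match($line, 'dn="(.+)"').Groups[1].Value

    # Negated character class: the capture can never cross a quote.
    $negated = [regex]::Match($line, 'dn="([^"]+)"').Groups[1].Value

    "lazy:    $lazy"
    "greedy:  $greedy"
    "negated: $negated"
    ```

    In this sketch the lazy and negated forms both return only the DN, while the greedy form also swallows the following attribute.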

    [string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "

    Saturday, February 14, 2015 8:08 PM
    Moderator
  • The easiest and most reliable way to extract attribute values from XML is to parse it as XML rather than match it as text:

    $txt=@'
    <someroot>
    <export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"/>
    <export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"/>
    </someroot>
    '@
    $xml=[xml]$txt
    $xml.SelectNodes('//export-error')|select -expand dn
    

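    The same approach can be pointed at a file on disk instead of an inline string. This is a minimal sketch, assuming the file is well-formed XML (a single root element and closed tags, which the fragment in the question does not show). The temp file name dn-sample.xml and the variable names are made up for illustration. Select-Xml with an XPath that targets the dn attribute returns one result per attribute, and the text is in .Node.Value.

    ```powershell
    # Hypothetical sample file; a real export would already exist on disk.
    $path = Join-Path ([System.IO.Path]::GetTempPath()) 'dn-sample.xml'
    $lines = @(
        '<someroot>',
        '<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"/>',
        '<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"/>',
        '</someroot>'
    )
    Set-Content -Path $path -Value $lines

    # Select every dn attribute on export-error elements and keep its value.
    $dns = Select-Xml -Path $path -XPath '//export-error/@dn' |
        ForEach-Object { $_.Node.Value }

    $dns
    ```

    If the real file is not well-formed, the XML parser will throw, and a text-based regex approach is the fallback.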

    ¯\_(ツ)_/¯

    Saturday, February 14, 2015 9:31 PM

All replies

  • I got this to work:

    $regex = '(dn=\")(.+)(\">)'
    select-string -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value.SubString(4).TrimEnd('">') } > $output_file
    

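    A possible variation on the Select-String pipeline is to read the capture group instead of trimming the matched text. This is only a sketch: the temp file name dn-sample.txt, the variable names, and the pattern dn="([^"]+)" are assumptions for illustration, not part of the reply above.

    ```powershell
    # Hypothetical sample file mirroring the lines shown in the question.
    $path = Join-Path ([System.IO.Path]::GetTempPath()) 'dn-sample.txt'
    $lines = @(
        '<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">',
        '<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">'
    )
    Set-Content -Path $path -Value $lines

    # Group 1 holds everything between dn=" and the next quote.
    $pattern = 'dn="([^"]+)"'

    $dns = Select-String -Path $path -Pattern $pattern -AllMatches |
        ForEach-Object { $_.Matches } |
        ForEach-Object { $_.Groups[1].Value }

    $dns
    ```

    Reading Groups[1] avoids depending on the fixed offset in SubString(4) and on the characters passed to TrimEnd.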

    Richard Mueller - MVP Directory Services

    Saturday, February 14, 2015 8:24 PM
    Moderator