PowerShell - XML file query

  • Question

  • Hello, I need to query an XML-based file to extract info.  I have used the line below in a script with success:

    $Xml.category.machine | Select-Object -Property IP | format-table

    based on querying an XML file formatted as below

    <Category name="System Summary">
     <!--1. machine details-->
     <Machine test="22">
      <Name>Server1</Name>
      <IP>31.248.95.170</IP>

    However, the data I need to query is in the format below.  Can anyone help with how I can reference this to get Display_Name, State and Start_Mode as output?  (The 'Category' element is not unique in the 2nd file, so ideally I would be able to search for Category name="System Summary", but this doesn't work for me.)  Thanks!

    <Category name="Services">
    <Data>
    <Display_Name><![CDATA[Adobe Flash Player Update Service]]></Display_Name>
    <Name><![CDATA[AdobeFlashPlayerUpdateSvc]]></Name>
    <State><![CDATA[Stopped]]></State>
    <Start_Mode><![CDATA[Manual]]></Start_Mode>
    <Service_Type><![CDATA[Own Process]]></Service_Type>
    <Path><![CDATA[c:\windows\system32\macromed\flash\flashplayerupdateservice.exe]]></Path>
    <Error_Control><![CDATA[Normal]]></Error_Control>
    <Start_Name><![CDATA[LocalSystem]]></Start_Name>
    <Tag_ID><![CDATA[0]]></Tag_ID>
    </Data>
    <Data>

    Friday, January 10, 2014 8:00 PM

All replies

  • You can't; it is not legal XML.


    ¯\_(ツ)_/¯

    Friday, January 10, 2014 8:06 PM
  • After fixing the XML errors we have this:

    $txt=@'
    <Category name="Services">
     <Data>
     <Display_Name>
    <![CDATA[Adobe Flash Player Update Service]]>
    </Display_Name>
     <Name>
    <![CDATA[AdobeFlashPlayerUpdateSvc]]>
    </Name>
     <State>
    <![CDATA[Stopped]]>
    </State>
     <Start_Mode>
    <![CDATA[Manual]]>
    </Start_Mode>
    <Service_Type>
    <![CDATA[Own Process]]>
    </Service_Type>
     <Path><![CDATA[c:\windows\system32\macromed\flash\flashplayerupdateservice.exe]]></Path>
     <Error_Control><![CDATA[Normal]]></Error_Control>
     <Start_Name><![CDATA[LocalSystem]]></Start_Name>
     <Tag_ID><![CDATA[0]]></Tag_ID>
     </Data>
    </Category>
    '@
    
    $xml=[xml]$txt
    $xml.Category.Data | ft


    ¯\_(ツ)_/¯

    Friday, January 10, 2014 8:10 PM
  • You can also do this, which will unwrap the CDATA section:

    $xml.Category.data.Path

    Normally I would write a transform (XSLT) for this, but that is probably beyond your knowledge of XML.

    Use this pattern to extract the nodes:

    $nodes=$xml.SelectNodes('//Data')
    $nodes|%{'{0}|{1}|{2}' -f $_.ChildNodes[0].'#cdata-section',$_.ChildNodes[1].'#cdata-section',$_.ChildNodes[2].'#cdata-section'}


    ¯\_(ツ)_/¯


    • Edited by jrv Friday, January 10, 2014 8:30 PM
    Friday, January 10, 2014 8:25 PM
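
A minimal sketch of the same extraction using element names instead of child positions, which is closer to what the question asks for (Display_Name, State, Start_Mode). The inline XML is a shortened copy of the sample from the question, and the variable names are only illustrative.

```powershell
# Shortened sample modeled on the Services data in the question (one service only).
$xml = [xml]@'
<Category name="Services">
  <Data>
    <Display_Name><![CDATA[Adobe Flash Player Update Service]]></Display_Name>
    <Name><![CDATA[AdobeFlashPlayerUpdateSvc]]></Name>
    <State><![CDATA[Stopped]]></State>
    <Start_Mode><![CDATA[Manual]]></Start_Mode>
  </Data>
</Category>
'@

# Address each child element by name and read its CDATA payload.
$rows = foreach ($data in $xml.SelectNodes('//Data')) {
    '{0}|{1}|{2}' -f $data.Display_Name.'#cdata-section',
                     $data.State.'#cdata-section',
                     $data.Start_Mode.'#cdata-section'
}
$rows
```

Addressing elements by name keeps working if the order of the child elements changes, which positional ChildNodes indexes do not.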
  • This might be more intuitive:

    $nodes|%{'{0}|{1}|{2}' -f $_.Name.'#cdata-section',$_.Path.'#cdata-section',$_.Display_Name.'#cdata-section'}


    ¯\_(ツ)_/¯

    Friday, January 10, 2014 8:33 PM
  • Thanks for the quick reply jrv.  The file I'm trying to read is an nfo file (which I understand is XML format).  Your clean up and example code gets me much further, but I notice two problems:

    1) When I run $xml.Category.Data | ft the output is

    Display_Name
    Display_Name
    Display_Name etc

    How do I capture the CDATA text such as AdobeFlashPlayerUpdateSvc?

    Ultimately I am looking to extract service name, state etc.

    2) The nfo file itself contains several sections that contain Category and Data elements.  How is it possible to distinguish between <Category name="Hardware Resources"> and <Category name="Conflicts/Sharing">?

    Thanks again

    Friday, January 10, 2014 8:36 PM
  • This is another way to simplify this:

    $nodes|%{'{0}|{1}|{2}' -f $_.Name.InnerText,$_.Path.InnerText,$_.Display_Name.InnerText}

    'InnerText' is not specific to CDATA: it returns the text content of any node, so it unwraps the CDATA sections here.


    ¯\_(ツ)_/¯

    Friday, January 10, 2014 8:39 PM
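
A small sketch of how InnerText behaves, assuming a made-up two-element fragment (one CDATA value, one plain text value). SelectSingleNode is used so that the result is always an XML node, whichever kind of content the element holds.

```powershell
# Made-up fragment: Name holds a CDATA section, State holds plain text.
$xml = [xml]@'
<Data>
  <Name><![CDATA[AdobeFlashPlayerUpdateSvc]]></Name>
  <State>Stopped</State>
</Data>
'@

$data = $xml.DocumentElement

# InnerText is defined on every XmlNode, so both kinds of content are readable.
$cdataText = $data.SelectSingleNode('Name').InnerText
$plainText = $data.SelectSingleNode('State').InnerText

# On a parent node, InnerText concatenates the text of all descendants.
$allText = $data.InnerText

$cdataText
$plainText
$allText
```

The concatenated form is rarely useful for reporting, which is why the examples in this thread read InnerText one child element at a time.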
  • The XML you gave me and the XML you are trying to report on are not at all alike.  You cannot just arbitrarily cut XML out of a file and call it a representation.


    ¯\_(ツ)_/¯

    Friday, January 10, 2014 9:27 PM
  • I provided the best examples I could without wishing to start an opening post with hundreds of lines and put anyone off from replying.  I tried to give an example of an XML file that I could query with PowerShell and the problems I have had with querying a different type of XML-based file (nfo file).  Basically I just want to extract the services part out of the nfo, which it would seem is not as easy as I hoped.
    Friday, January 10, 2014 9:47 PM
  • NFO files have a very inconsistent schema and are not meant to be easily used.  If you are not good at XML they will drive you nuts.

    The schema is a simplified linked list.  You have to start at the root and build down. Each layer allows you to build the query to get the next layer.

    If you look at the schema it can look like this:

    MSInfo.Category.Category.Category.Category.Data.Resource

    It works like a path:

    \System\Hardware\Disk\....

    I have not found it documented anywhere in many years.


    ¯\_(ツ)_/¯

    Friday, January 10, 2014 10:16 PM
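
Since the nesting works like a path, one way to find out what a given file contains is to print the path of every Category element. This is only a sketch: Get-NfoCategoryPath is a hypothetical helper name, and the inline XML is a made-up miniature rather than a real nfo export.

```powershell
# Made-up miniature of the nested Category layout described above.
$nfo = [xml]@'
<MsInfo>
  <Category name="System Summary">
    <Category name="Software Environment">
      <Category name="Services">
        <Data><Name><![CDATA[Svc1]]></Name></Data>
      </Category>
    </Category>
  </Category>
</MsInfo>
'@

# Hypothetical helper: emit one backslash-separated path per Category element.
function Get-NfoCategoryPath {
    param([System.Xml.XmlDocument]$Nfo)
    foreach ($category in $Nfo.SelectNodes('//Category')) {
        $names = @()
        $node = $category
        # Walk up through the enclosing Category elements, collecting names.
        while ($null -ne $node -and $node.LocalName -eq 'Category') {
            $names = @($node.GetAttribute('name')) + $names
            $node = $node.ParentNode
        }
        '\' + ($names -join '\')
    }
}

$paths = @(Get-NfoCategoryPath -Nfo $nfo)
$paths
```

Each printed path lists the name attributes you would need, in order, to reach that category with a query.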
  • Thanks for confirming.  Yes, I would have to say I am not good at XML!  Unfortunately I need to get some data out of nfo files produced by external customers.  If I export the NFO to txt manually I can then extract the data I need, but I can't see a way to script this conversion to a txt file, so I started to look at whether PowerShell could viably read the nfo as XML.  Possible dead-end I fear now though, but if you know of a way to export the nfo to txt that would be viable too!  Thanks for all help :-)
    Friday, January 10, 2014 10:39 PM
  • Just open the nfo on any system and choose export from the file menu.


    ¯\_(ツ)_/¯

    Saturday, January 11, 2014 12:14 AM
  • You can also learn XPath and XQuery then do this:

    $nfo=[xml](cat \temp\system.nfo)
    $nfo.SelectNodes('//Category[@name="Software Environment"]/Category[@name="Services"]/Data')


    ¯\_(ツ)_/¯

    Saturday, January 11, 2014 12:23 AM
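
The XPath above is one Category[@name="..."] step per level, so it can be assembled from a list of category names. A sketch under stated assumptions: ConvertTo-NfoXPath is a hypothetical helper, the inline XML is made up, and names containing a double quote are not handled.

```powershell
# Hypothetical helper: turn a list of category names into an XPath for their Data rows.
function ConvertTo-NfoXPath {
    param([string[]]$CategoryPath)
    $steps = foreach ($name in $CategoryPath) {
        'Category[@name="{0}"]' -f $name
    }
    '//' + ($steps -join '/') + '/Data'
}

# Made-up miniature file with two Services rows.
$nfo = [xml]@'
<MsInfo>
  <Category name="System Summary">
    <Category name="Software Environment">
      <Category name="Services">
        <Data><Name><![CDATA[Svc1]]></Name></Data>
        <Data><Name><![CDATA[Svc2]]></Name></Data>
      </Category>
    </Category>
  </Category>
</MsInfo>
'@

$xpath = ConvertTo-NfoXPath -CategoryPath 'Software Environment', 'Services'
$xpath
$found = $nfo.SelectNodes($xpath)
$found.Count
```

Naming the parent category as well as the target is what separates same-named categories that appear in different branches of the file.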
  • Here is a more comprehensive decode example.

    $nfo=[xml](cat \temp\system.nfo)
    $nfo.SelectNodes('//Category[@name="Software Environment"]/Category[@name="Services"]/Data') |
         ForEach-Object{
              $p=[ordered]@{
                   Name=$_.Name.innerText
                   DisplayName=$_.Display_Name.innerText
                   Path=$_.Path.innerText
                   State=$_.State.innerText
                   StartMode=$_.Start_Mode.innerText
              }
              New-Object PsCustomObject -Property $p
         } |ft -auto

    You have to create each query carefully and then build the decode block to unwrap the CDATA.


    ¯\_(ツ)_/¯

    Saturday, January 11, 2014 12:36 AM
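
When every child element of a Data row is wanted, the decode block does not have to list them by hand. The sketch below copies each child element into a property of the same name; ConvertFrom-NfoData is a hypothetical helper, the inline XML is made up, and it assumes no Data row repeats a child element name.

```powershell
# Hypothetical helper: one object per Data node, one property per child element.
function ConvertFrom-NfoData {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [System.Xml.XmlNode]$Data
    )
    process {
        $row = New-Object PSObject
        foreach ($child in $Data.ChildNodes) {
            # Skip comments and other non-element nodes.
            if ($child.NodeType -eq [System.Xml.XmlNodeType]::Element) {
                Add-Member -InputObject $row -MemberType NoteProperty `
                    -Name $child.LocalName -Value $child.InnerText
            }
        }
        $row
    }
}

# Made-up miniature Services category with two rows.
$nfo = [xml]@'
<Category name="Services">
  <Data>
    <Display_Name><![CDATA[Adobe Flash Player Update Service]]></Display_Name>
    <State><![CDATA[Stopped]]></State>
    <Start_Mode><![CDATA[Manual]]></Start_Mode>
  </Data>
  <Data>
    <Display_Name><![CDATA[Print Spooler]]></Display_Name>
    <State><![CDATA[Running]]></State>
    <Start_Mode><![CDATA[Auto]]></Start_Mode>
  </Data>
</Category>
'@

$services = @($nfo.SelectNodes('//Category[@name="Services"]/Data') | ConvertFrom-NfoData)
$services | Format-Table -AutoSize
```

The property names come straight from the file (Display_Name, Start_Mode), so a hand-written block like the one above is still the better choice when friendlier column names matter.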
  • PS C:\scripts> $nfo.selectNodes("//Data[Item='Processor']/Value").innerText
    Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 3401 Mhz, 4 Core(s), 8 Logical Processor(s)


    ¯\_(ツ)_/¯

    Saturday, January 11, 2014 12:54 AM
  • Thanks jrv.  When I ran

    $nfo=[xml](cat c:\main.nfo)
    $nfo.SelectNodes('//Category[@name="Software Environment"]/Category[@name="Services"]/Data')

    I got several lines like the output below, so it seems to be accessing the right part of the nfo file, which is great.

    Display_Name  : Display_Name
    Name          : Name
    State         : State
    Start_Mode    : Start_Mode
    Service_Type  : Service_Type
    Path          : Path
    Error_Control : Error_Control
    Start_Name    : Start_Name
    Tag_ID        : Tag_ID

    When I ran

    $nfo=[xml](cat \temp\system.nfo)
    $nfo.SelectNodes('//Category[@name="Software Environment"]/Category[@name="Services"]/Data') |
         ForEach-Object{
              $p=[ordered]@{
                   Name=$_.Name.innerText
                   DisplayName=$_.Display_Name.innerText
                   Path=$_.Path.innerText
                   State=$_.State.innerText
                   StartMode=$_.Start_Mode.innerText
              }
              New-Object PsCustomObject -Property $p
         } |ft -auto

    I got lots of messages like below:

    New-Object : Cannot validate argument on parameter 'Property'. The argument is null or empty. Supply an argument that is not null or empty and then try the command again.
    At line:11 char:46
    +           New-Object PsCustomObject -Property <<<<  $p
        + CategoryInfo          : InvalidData: (:) [New-Object], ParameterBindingValidationException
        + FullyQualifiedErrorId : ParameterArgumentValidationError,Microsoft.PowerShell.Commands.NewObjectCommand
     
    Unable to find type [ordered]: make sure that the assembly containing this type is loaded.
    At line:4 char:23
    +           $p=[ordered] <<<< @{
        + CategoryInfo          : InvalidOperation: (ordered:String) [], RuntimeException
        + FullyQualifiedErrorId : TypeNotFound

    Any ideas?

    Saturday, January 11, 2014 9:50 AM
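  • The [ordered] version only works with PowerShell version 3 or later.  On version 2, use a plain hashtable and PsObject (the column order is then not guaranteed):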

    $nfo=[xml](cat \temp\system.nfo)
    $nfo.SelectNodes('//Category[@name="Software Environment"]/Category[@name="Services"]/Data') |
         ForEach-Object{
              $p=@{
                   Name=$_.Name.innerText
                   DisplayName=$_.Display_Name.innerText
                   Path=$_.Path.innerText
                   State=$_.State.innerText
                   StartMode=$_.Start_Mode.innerText
              }
              New-Object PsObject -Property $p
         } |ft -auto


    ¯\_(ツ)_/¯

    • Marked as answer by Thumper71 Saturday, January 11, 2014 10:03 PM
    Saturday, January 11, 2014 12:51 PM
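
If a fixed column order matters and [ordered] is not available, calculated properties with Select-Object are one alternative, since the output keeps the order in which the properties are listed. A sketch only, with made-up inline XML standing in for the Services section of the nfo file:

```powershell
# Made-up miniature Services category with a single row.
$nfo = [xml]@'
<Category name="Services">
  <Data>
    <Display_Name><![CDATA[Adobe Flash Player Update Service]]></Display_Name>
    <Name><![CDATA[AdobeFlashPlayerUpdateSvc]]></Name>
    <State><![CDATA[Stopped]]></State>
    <Start_Mode><![CDATA[Manual]]></Start_Mode>
  </Data>
</Category>
'@

# Each calculated property reads one child element; the listed order is the output order.
$rows = @(
    $nfo.SelectNodes('//Category[@name="Services"]/Data') |
        Select-Object @{Name = 'Name';        Expression = { $_.SelectSingleNode('Name').InnerText }},
                      @{Name = 'DisplayName'; Expression = { $_.SelectSingleNode('Display_Name').InnerText }},
                      @{Name = 'State';       Expression = { $_.SelectSingleNode('State').InnerText }},
                      @{Name = 'StartMode';   Expression = { $_.SelectSingleNode('Start_Mode').InnerText }}
)
$rows | Format-Table -AutoSize
```

SelectSingleNode is used inside the expressions so that each lookup returns an XML node whether the element holds CDATA or plain text.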
  • Many Thanks for your help with this
    Saturday, January 11, 2014 10:04 PM
  • Many Thanks for your help with this

    You will find that each different element will require a new and possibly more complex query.


    ¯\_(ツ)_/¯

    Saturday, January 11, 2014 10:06 PM