locked
Convert CSV to XML RRS feed

  • Question

  • Im having a formatting issue with the following code, for some reason the source CSV file format is not being maintained, for example if i have 10 columns labeled "name1, name2, ...., name10" and i run the following code the output XML files is only 4 columns "Type, Property, Name, Type2"

    I need to be able to maintain the column names so that the xml is correct.

    My code:

    $csv = Get-Content "c:\temp\327326.csv"
    $obj = $csv | convertfrom-csv -Delimiter " "
    $xml = $obj | convertto-xml
    $xml.InnerXml | Out-File -Width 80 "c:\temp\products.xml"

    Monday, January 13, 2014 9:35 PM

Answers

  • For completion here is how to build a raw XML document which is the safe way to build the document declaration.  For simple things this is not usually used but when you are automating it can be necessary to control the language and other elements of the document.

    #build raw document and set sml declaration
    $xml=New-Object System.Xml.XmlDocument
    $xmlDecl=$xml.CreateXmlDeclaration('1.0','en-US','yes')
    [void]$xml.AppendChild($xmlDecl)
    
    # add the document element
    $root=$xml.CreateElement('root')
    [void]$xml.AppendChild($xmlDecl)
    
    # add nodes
    # $root.AppendChild($el)
    


    ¯\_(ツ)_/¯

    • Marked as answer by Ahmad Alkaysey Wednesday, January 15, 2014 1:59 PM
    Tuesday, January 14, 2014 8:44 PM
  • #Author: Ahmad Alkaysey
    #Convert CSV FILES TO XML
    #Test if directory has and csv files, if yes then proceed
    if((test-path c:\temp\*csv) -eq 1){
        $csvpath= get-childitem c:\temp\*.csv| foreach-object { $_.FullName }
        $csv=Import-Csv $csvpath
        $xml=[xml]'<csvobjects/>'
        $p=$csv|gm -MemberType NoteProperty|%{$_.Name}
        $csv|%{
             $row=$xml.CreateElement('row')
            $r=$_;
            $p|%{
                 $el=$xml.CreateElement($_)
                 $el.InnerText=$r.($_)
                 [void]$row.appendChild($el)
             }
             [void]$xml.DocumentElement.AppendChild($row)
        }
        $currdate=get-date -f MM_dd_yyyy
        $f= "$($currdate).xml"
        $xml.Save($f)
    
    #######################################################
    
    #move files when done
        Get-ChildItem C:\temp -Filter *.csv | Where-Object {!$_.PSIsContainer} | Foreach-Object{
        $currdate=get-date -f MM_dd_yyyy
        $dir=md -Path "C:\temp\$currdate" -Force
        $_ | Move-Item -Destination $dir -Force
            }
        Get-ChildItem $dir -Filter *.csv | Where-Object {!$_.PSIsContainer} | Foreach-Object{
        $dest = Join-Path $_.DirectoryName (($_.BaseName -split '_')[2])
                if(!(Test-Path -Path $dest -PathType Container)){
                    $null = md $dest
                }
                $_ | Move-Item -Destination $dest -Force
             }
        }
    #######################################################
    #if the directory does not have any .csv files then break
    else {
       break
    }
    


    • Marked as answer by Ahmad Alkaysey Wednesday, January 15, 2014 10:13 PM
    Wednesday, January 15, 2014 10:13 PM

All replies

  • You are way over complicating a very easy conversion.

    $xml=import-csv users.csv  |convertto-xml
    $xml.Save('myfile.xml')


    ¯\_(ツ)_/¯

    • Proposed as answer by jrv Monday, January 13, 2014 9:41 PM
    • Unproposed as answer by Ahmad Alkaysey Monday, January 13, 2014 10:16 PM
    Monday, January 13, 2014 9:41 PM
  • Ok but that still does not recognized the "," delimiter AND does not preserve columns in the xml version, it leaves me with 2 columns "property, name" not the 10 i need.
    Monday, January 13, 2014 9:55 PM
  • It absolutely preserves the columns.  What do you mean it does not recognize the coma.  There are no commas in XML.

    CSV-

    Firstname;LastName;EmailAddress;Department;Manager;Password;Account
    Mary;Jones;mjones@home.net;depatment1;John boy;Pass12Word34;maryjones

    XML of CSV

    <Objects>
      <Object Type="System.Management.Automation.PSCustomObject">
        <Property Name="Firstname" Type="System.String">Mary</Property>
        <Property Name="LastName" Type="System.String">Jones</Property>
        <Property Name="EmailAddress" Type="System.String">mjones@home.net</Property>
        <Property Name="Department" Type="System.String">depatment1</Property>
        <Property Name="Manager" Type="System.String">John boy</Property>
        <Property Name="Password" Type="System.String">Pass12Word34</Property>
        <Property Name="Account" Type="System.String">maryjones</Property>
      </Object>
    </Objects>

    Notice all columns are accounted for.


    ¯\_(ツ)_/¯

    Monday, January 13, 2014 10:04 PM
  • i see, here is an example of what im getting:

    <Property Name="name1">Bob</Property> <Property Name="name2">Audrey</Property> <Property Name="name3">Joe</Property> I need it to look like this:

    <Name1>Bob</Name1> <name2>Audrey</name2> <name3>Joe</name3>


    You see the difference? 

    PS: the comma delimeter is in the CSV, if you open in notepad you will see the column separator is a comma.


    Monday, January 13, 2014 10:16 PM
  • The XML is what CSV converted to XML looks like.  What would you expect?


    ¯\_(ツ)_/¯

    Monday, January 13, 2014 11:11 PM
  • ok, see what i mean, below are two dropbox links, one for the original CSV and the other for the XML. Open the XML in excel and you will see what im talking about. 

     just copy paste links to your browser

    Original CSV

    https://www.dropbox.com/s/32fwx3zivqaykyk/327326.csv

    output XML 

    https://www.dropbox.com/s/4b40taboozz5evt/products.xml

    i need to maintain the same layout so to speak, so if the original CSV has 7 columns, the output xml has to also have 7 columns.



    Tuesday, January 14, 2014 2:07 PM
  • If convertto-xml doesn't do what you want, you can construct your own output. For example:


    $rows = import-csv "test1.csv"
    $output = @('<xml version="1.0"?>','<objects>')
    $columnNames = $rows.PSObject.Properties | select-object -expandproperty Name
    foreach ( $columnName in $columnNames ) {
      foreach ( $row in $rows ) {
        $output += "  <{0}>{1}</{0}>" -f $columnName,$row.$columnName
      }
    }
    $output += '</objects>'
    $output | out-file "test1.xml"
    

    Bill

    Tuesday, January 14, 2014 3:26 PM
  • that code does not produce anything, i just get a blank XML, i made the necessary adjustments to the import-csv path and same for the output, but it gives a blank XML.

    FYI the columns will not always be the same they are always changing, sometimes i may have 10 columns, other times i may have 20 etc... so i cant set them to static variables.

    Tuesday, January 14, 2014 3:39 PM
  • Well, then you will need to experiment to get the output you need.

    Remember, this forum is a peer-to-peer support forum and there is no service-level agreement that guarantees someone will write the exact code you need. You will have to take a look at the examples provided and make an effort at your own solution.

    If you need someone to write code for you to spec, then you will need to hire a consultant.

    Bill

    Tuesday, January 14, 2014 3:49 PM
  • Of course i understand that. This forum is great and has helped me NUMEROUS times in the past, did you look at the output that convertto-xml produced (in one of my above posts w/ the dropbox links)?

    I just need a way to maintain the original formatting, thats my main problem right now...

    Tuesday, January 14, 2014 4:01 PM
  • ok, see what i mean, below are two dropbox links, one for the original CSV and the other for the XML. Open the XML in excel and you will see what im talking about. 

     just copy paste links to your browser

    Original CSV

    https://www.dropbox.com/s/32fwx3zivqaykyk/327326.csv

    output XML 

    https://www.dropbox.com/s/4b40taboozz5evt/products.xml

    i need to maintain the same layout so to speak, so if the original CSV has 7 columns, the output xml has to also have 7 columns.



    This is exactly what I posted for you. It just has no line breaks in it.

    Here is a formatted version of your dropbox XML:

    <?xml version="1.0"?>
    <Objects>
    <Object>
    <Property Name="col1">bob</Property>
    <Property Name="col2">nancy</Property>
    <Property Name="col3">Joe</Property>
    <Property Name="col4">richard</Property>
    <Property Name="col5">anthony</Property>
    <Property Name="col6">natalie</Property>
    <Property Name="col7">jack</Property>
    </Object>
    </Objects>
    

    Look at what I posted earlier.  It is exactly the same.


    ¯\_(ツ)_/¯

    Tuesday, January 14, 2014 4:26 PM
  • Of course i understand that. This forum is great and has helped me NUMEROUS times in the past, did you look at the output that convertto-xml produced (in one of my above posts w/ the dropbox links)?

    I just need a way to maintain the original formatting, thats my main problem right now...


    CML is not formatted.  It is just a special text representation of data.  Excel does not deal well with XML without a schema.  XML doe not use commas in its data.  The ML example file you linked does not have an commas in it an it is the same format as the output of convertto-xml produces.

    ¯\_(ツ)_/¯

    Tuesday, January 14, 2014 4:37 PM
  • OK. The example I posted earlier was flawed. This example might work better. Suppose test1.csv:


    "Name1","Name2","Name3"
    "Bob","Audrey","Joe"
    "Frank","Fred","Tom"

    Script to output as XML data:


    $rows = import-csv "test1.csv"
    $output = @('<?xml version="1.0"?>','<objects>')
    $columnNames = ($rows | select-object -first 1).PSObject.Properties | select-object -expandproperty Name
    for ( $i = 0; $i -lt $rows.Count; $i++ ) {
      $output += "  <row{0}>" -f ($i + 1)
      foreach ( $columnName in $columnNames ) {
        $output += "    <{0}>{1}</{0}>" -f $columnName,$rows[$i].$columnName
      }
      $output += "  </row{0}>" -f ($i + 1)
    }
    $output += '</objects>'
    $output

    It produces output like this:


    <?xml version="1.0"?>
    <objects>
      <row1>
        <Name1>Bob</Name1>
        <Name2>Audrey</Name2>
        <Name3>Joe</Name3>
      </row1>
      <row2>
        <Name1>Frank</Name1>
        <Name2>Fred</Name2>
        <Name3>Tom</Name3>
      </row2>
    </objects>

    Bill

    Tuesday, January 14, 2014 5:52 PM
  • jrv, is there anyway i can set "Property Name" to something else ? 

    perhaps instead of this:

    <Property Name="col1">bob</Property>


    it should look like this:

    <col1>bob</col1>


    and what that will do it will give the xml if opened in excel its own column with header col1. So from what im seeing here is that i need to re-map Property name to the header on the csv for the top of each column, do you see what i mean ?

    Tuesday, January 14, 2014 5:58 PM
  • Suppose test1.csv:
    "Name1","Name2","Name3"
    "Bob","Audrey","Joe"
    "Frank","Fred","Tom"

    Script to output as XML data:


    $rows = import-csv "test1.csv"
    $output = @('<?xml version="1.0"?>','<objects>')
    $columnNames = ($rows | select-object -first 1).PSObject.Properties | select-object -expandproperty Name
    for ( $i = 0; $i -lt $rows.Count; $i++ ) {
      $output += "  <row{0}>" -f ($i + 1)
      foreach ( $columnName in $columnNames ) {
        $output += "    <{0}>{1}</{0}>" -f $columnName,$rows[$i].$columnName
      }
      $output += "  </row{0}>" -f ($i + 1)
    }
    $output += '</objects>'
    $output

    It produces output like this:


    <?xml version="1.0"?>
    <objects>
      <row1>
        <Name1>Bob</Name1>
        <Name2>Audrey</Name2>
        <Name3>Joe</Name3>
      </row1>
      <row2>
        <Name1>Frank</Name1>
        <Name2>Fred</Name2>
        <Name3>Tom</Name3>
      </row2>
    </objects>

    Bill


    Tuesday, January 14, 2014 6:03 PM
  • Bills method works but I prefer using the XML object to generate XML.  It is safer since all tags are guaranteed to be set correctly.

    $csvpath='c:\scripts\testcsv.csv'
    $csv=Import-Csv $csvpath
    $xml=[xml]'<csvobjects/>'
    $p=$csv|gm -MemberType NoteProperty|%{$_.Name}
    $csv|%{
         $row=$xml.CreateElement('row')
         $r=$_;
         $p|%{
             $el=$xml.CreateElement($_)
             $el.InnerText=$r.($_)
             [void]$row.appendChild($el)
         }
         [void]$xml.DocumentElement.AppendChild($row)
    }
    $f="$csvpath.xml"
    $xml.Save($f)
    # open XML in default program (usuallly IE)
    &$f
    


    ¯\_(ツ)_/¯

    Tuesday, January 14, 2014 6:59 PM
  • I agree with jrv, using the XML object should be preferred. Here's another example based on that:


    $rows = import-csv "test1.csv"
    $columnNames = $rows | get-member -membertype NoteProperty |
      select-object -expandproperty Name
    $xml = [Xml] '<?xml version="1.0"?><objects />'
    foreach ( $row in $rows ) {
      $xmlRow = $xml.CreateElement("row")
      foreach ( $columnName in $columnNames ) {
        $xmlElement = $xml.CreateElement($columnName)
        $xmlElement.InnerText = $row.$columnName
        [Void] $xmlRow.AppendChild($xmlElement)
      }
      [Void] $xml.DocumentElement.AppendChild($xmlRow)
    }
    $xml
    

    This example includes the XML header and uses foreach instead of % (which is really ForEach-Object).

    Bill



    • Edited by Bill_Stewart Tuesday, January 14, 2014 7:42 PM Typo
    Tuesday, January 14, 2014 7:38 PM
  • guys i cannot thank you enough, that worked out perfectly. I will be using jrv's method, however i have one final question.

    Current code:

    $csvpath= 'c:\temp\*.csv'
    $csv=Import-Csv $csvpath
    $xml=[xml]'<csvobjects/>'
    $p=$csv|gm -MemberType NoteProperty|%{$_.Name}
    $csv|%{
         $row=$xml.CreateElement('row')
         $r=$_;
         $p|%{
             $el=$xml.CreateElement($_)
             $el.InnerText=$r.($_)
             [void]$row.appendChild($el)
         }
         [void]$xml.DocumentElement.AppendChild($row)
    }
    $f="c:\temp\test1.xml"
    $xml.Save($f)
    # open XML in default program (usuallly IE)
    &$f

    as you can see my path is c:\temp\*.csv so basically the script will take any file with .csv extension and add it to the xml. What i want to know now is how to create a separate xml for each csv file.

    Tuesday, January 14, 2014 8:32 PM
  • You cannot use a wildcard in this script.

    I suggest that you first learn how the file system works then read up on how to use PowerShell with the file system.

    There are hundreds of examples to help out if you go to the learning link.


    ¯\_(ツ)_/¯

    Tuesday, January 14, 2014 8:42 PM
  • For completion here is how to build a raw XML document which is the safe way to build the document declaration.  For simple things this is not usually used but when you are automating it can be necessary to control the language and other elements of the document.

    #build raw document and set sml declaration
    $xml=New-Object System.Xml.XmlDocument
    $xmlDecl=$xml.CreateXmlDeclaration('1.0','en-US','yes')
    [void]$xml.AppendChild($xmlDecl)
    
    # add the document element
    $root=$xml.CreateElement('root')
    [void]$xml.AppendChild($xmlDecl)
    
    # add nodes
    # $root.AppendChild($el)
    


    ¯\_(ツ)_/¯

    • Marked as answer by Ahmad Alkaysey Wednesday, January 15, 2014 1:59 PM
    Tuesday, January 14, 2014 8:44 PM
  • I see, so there is no workaround for this >?

    perhaps "foreach-object{ } for the import-csv call ?

    Tuesday, January 14, 2014 9:39 PM
  • Study this example:


    get-childitem *.csv | foreach-object { $_.FullName }

    Bill

    Tuesday, January 14, 2014 9:42 PM
  • Beat me to it Bill, i figured it out since i had another script i wrote a while ago. Anyway, since thats figured out i have 1 last thing that just came to mind. Suppose i want the output xml file to maintain the same filename is the source csv file, what is a good way to achieve this ?

    I was thinking $f="c:\temp\$csvpath.xml" but that wont work im sure...


    Tuesday, January 14, 2014 10:56 PM
  • I just added .xml to the end of the original file name if you look closely at my original code.


    ¯\_(ツ)_/¯

    Tuesday, January 14, 2014 11:47 PM
  • No longer works since $csvpath=get-childitem c:\temp\*.csv | foreach-object { $_.FullName } If you notice in my last comment I mentioned that I tried $f="c:\temp\$csvpath.xml" and that did not work.
    Wednesday, January 15, 2014 1:32 AM
  • I repeat.  You need to spend some time learning the basics.  We can help you but only if you help yourself.  Having us incrementally write each line of code for you is not the purpose of this forum.  We are here to assist technicians who are actively using script or who are actively learning script.

    If you had spent some time learning the very basic bits of PowerShel you would know how to add an extension to a file name otf the format of $_.Fullname.

    "$($_.Fullname).xml"

    This is PowerShell 101. It is your job to learn this.  This site has ample training materials and there are many excellent books on how to use PowerShell and how the Windows file system works.


    ¯\_(ツ)_/¯

    Wednesday, January 15, 2014 2:16 AM
  • Sorry for the lack of updates, 10 minutes after posting my last comment i figured out the issue. Thanks for all the help guys :)
    Wednesday, January 15, 2014 1:59 PM
  • #Author: Ahmad Alkaysey
    #Convert CSV FILES TO XML
    #Test if directory has and csv files, if yes then proceed
    if((test-path c:\temp\*csv) -eq 1){
        $csvpath= get-childitem c:\temp\*.csv| foreach-object { $_.FullName }
        $csv=Import-Csv $csvpath
        $xml=[xml]'<csvobjects/>'
        $p=$csv|gm -MemberType NoteProperty|%{$_.Name}
        $csv|%{
             $row=$xml.CreateElement('row')
            $r=$_;
            $p|%{
                 $el=$xml.CreateElement($_)
                 $el.InnerText=$r.($_)
                 [void]$row.appendChild($el)
             }
             [void]$xml.DocumentElement.AppendChild($row)
        }
        $currdate=get-date -f MM_dd_yyyy
        $f= "$($currdate).xml"
        $xml.Save($f)
    
    #######################################################
    
    #move files when done
        Get-ChildItem C:\temp -Filter *.csv | Where-Object {!$_.PSIsContainer} | Foreach-Object{
        $currdate=get-date -f MM_dd_yyyy
        $dir=md -Path "C:\temp\$currdate" -Force
        $_ | Move-Item -Destination $dir -Force
            }
        Get-ChildItem $dir -Filter *.csv | Where-Object {!$_.PSIsContainer} | Foreach-Object{
        $dest = Join-Path $_.DirectoryName (($_.BaseName -split '_')[2])
                if(!(Test-Path -Path $dest -PathType Container)){
                    $null = md $dest
                }
                $_ | Move-Item -Destination $dest -Force
             }
        }
    #######################################################
    #if the directory does not have any .csv files then break
    else {
       break
    }
    


    • Marked as answer by Ahmad Alkaysey Wednesday, January 15, 2014 10:13 PM
    Wednesday, January 15, 2014 10:13 PM
  • Unfortunately that is not going to work as you expect.  Look at what is happening very carefully.

    When we test we do not need to use equivalnces

    The results of "Test" CmdLets is always a Boolean by design.

    if((test-path c:\temp){...

    Test-Path cannot test a wild card correctly.

    The following:

    $csvpath= get-childitem c:\temp\*.csv| foreach-object {$_.FullName }
        $csv
    =Import-Csv$csvpath

    Will attempt to merge all files into one file.  This will lose data as only the first file columns will be valid.

    I recommend that you spnd some time learning thebasics of PowerShell as all of your mistakes are due to lack of basic knowledge of scrioting and PowerShell.


    ¯\_(ツ)_/¯

    Wednesday, January 15, 2014 10:22 PM
  • There was some miscommunication between the team im dealing with and myself. Apparently it doesnt matter, this script will run once a day at 11:59pm daily. It will process all the files with .csv extension in the given directory and convert them all as well as merge them into the xml with filename "current date", then it will move all the CSV files to a directory with name "current date". so it will work out for what i need it for
    Wednesday, January 15, 2014 10:35 PM
  • Your originally said that the files were not all the same.  If they are all the same then this whole exercise was a bit of a waste of time.

    If you really do have a bunch of identical CSV files they can be merged and converted very easily.

    I still suggest that you learn the basics of PowerShell.  This will all be much easier if you learn the tool.


    ¯\_(ツ)_/¯

    Wednesday, January 15, 2014 10:49 PM
  • jrv,  I am... and i really appreciate all the help. Originally I was under the impression that all the files were to be separate but after some clarification it didnt matter because one of the fields in the xml will be the identifier for where the data goes so it wouldnt matter if they were all separate or all combined into one :)
    Thursday, January 16, 2014 12:53 AM
  • jrv,  I am... and i really appreciate all the help. Originally I was under the impression that all the files were to be separate but after some clarification it didnt matter because one of the fields in the xml will be the identifier for where the data goes so it wouldnt matter if they were all separate or all combined into one :)

    Well good luck.  Get some books.  You will find PowerShell very helpful once you nail the basics.  It is not hard and takes only a few hours a week for a month to get it nailed.

    ¯\_(ツ)_/¯

    Thursday, January 16, 2014 1:08 AM