locked
Import-csv line by line RRS feed

  • Question

  • I need to read a CSV file in and act upon the data line by line.

    The code I have (in msdos) reads a CSV file in one line at a time and then depending upon a (text) date value in one field creates a new file and outputs the line to that. So from one CSV file you end up with several CSV files depending upon this date.

    import-csv seems to read it all in one one go so should I use something else for this problem?

     

    Monday, September 12, 2011 10:28 AM

Answers

  • Firsly, please excuse the undescriptive variable names. I wrote this in a hurry! But I've tested it on your data, and with more data similar to yours, and it works very well. (Revised as per Larry's suggestions).

     

    $t = Get-Content test.txt
    $l =@($t | Select-String -Pattern '^HDR,' | select -ExpandProperty LineNumber)
    
    for ($i = 0; $i -lt $l.count; $i++){
          if ($l[$i] -ne $l[$l.count-1]) {
            $a1 = ($l[$i]-1)..($l[$i+1]-2)
            $t1 = $t[$a1]
            $name = $t1[0] | Select-String '\d\d/\d\d/\d\d\d\d'
            $name = 'Order' + ($name.matches | select -ExpandProperty value)+ '.csv' -replace '/'
            Out-File -InputObject $t1 -FilePath $name
            } # end if
        else {
            $a1 =($l[$i]-1)..($t.count-1)
            $t1 = $t[$a1]
            $name = $t1[0] | Select-String '\d\d/\d\d/\d\d\d\d'
            $name = 'Order' + ($name.matches | select -ExpandProperty value)+ '.csv' -replace '/'
            Out-File -InputObject $t1 -FilePath $name
            } # end else
    } # end for
    
    

    1. It reads the csv file in as a text file, and establishes which lines have the string "HDR" in them. It store these line numbers in an array called $l.
     
    2. The next bit is difficult to explain. The for loop iterates for each item in the $l array. It checks to see if it is the last item in the array. If it is it executes the else statement.
     
    This code creates an array of integers for each number in the $l array. In your example, the arrays were 1..6 and 7..10. It then uses these as line numbers to create subsets of your text file (stored in variable $t).
     
    3. The other tricky thing was extracting a file name from the first line of the 'set'. For this I used Regular Expressions to search for a date format.

     


    [string](0..9|%{[char][int](32+("39826578840055658268").substring(($_*2),2))})-replace "\s{1}\b"
    • Edited by Bigteddy Tuesday, September 13, 2011 5:47 AM
    • Marked as answer by Richard G Kavanagh Tuesday, September 13, 2011 7:42 AM
    Tuesday, September 13, 2011 5:46 AM

All replies

  • Try this:

    $CSV = Import-CSV somefile.csv
    
    $CSV | %{
    if ($_.date -match "some date value"){output another field to file}
    else{do something else}
    }
    



    If you found this post helpful, please "Vote as Helpful". If it answered your question, remember to "Mark as Answer".

    Rich Prescott | MCITP, MCTS, MCP

    [Blog] Engineering Efficiency | [Twitter] @Rich_Prescott | [Powershell GUI] Client System Administration toolkit
    Monday, September 12, 2011 11:01 AM
  • Import-CSV will read your csv text file, and present it as an array of records.  Each record can be accessed individually, and each column can be accessed by name if the csv file has headings.

    This is best explained by a simple example:  I have a small csv file called 'batters.csv', which looks like this:

    Name,AtBats,Hits
    Ken Myer,43,13
    Pilar Ackerman,28,11
    Jonathan Haas,37,17
    Syed Abbas,41,20
    Luisa Cazzaniga,22,6
    Andrew Cencini,35,11
    Baris Cetinok,19,4

    Notice the column headers, Name, AtBats, Hits.

    Now I run this command:

    $csv = Import-CSV batters.csv

    $csv.count

    #returns the number of records in the file, exluding the header row (7)

    $csv[0]

    #returns the entire first record of the file:

    Name                                                 AtBats                                               Hits                                              
    ----                                                 ------                                               ----                                              
    Ken Myer                                             43                                                   13                                                

    PS C:\scripts> $csv[2].Name
    Jonathan Haas

    #returns the name of the third batter.

    I think you get the idea!  Import-CSV is very useful.


    [string](0..9|%{[char][int](32+("39826578840055658268").substring(($_*2),2))})-replace "\s{1}\b"


    • Edited by Bigteddy Monday, September 12, 2011 11:13 AM
    Monday, September 12, 2011 11:07 AM
  • I've got this so far (I couldn't get your method to work. It just output the lines of code within the brackets following the $csv |% ).

    Firstly it only outputs the first line of the first file, and secondly it's adding "" around everything so 1,2,3,4 becomes "1","2","3","4" in the output file.

                $csvfile = Import-Csv $pathCSV$item -header("c1","c2","c3","c4","c5","c6","c7","c8","c9","c10","c11","c12","c13","c14","OrdDate","PickDate","DelDate")
            
                foreach ($line in $csvfile)
                {
                    $line | Export-Csv $pathCSV"test.csv" -NoTypeInformation
                }

    If someone could point me in the right direction I'd be grateful!

    Monday, September 12, 2011 11:43 AM
  • You need to debug your script.  Comment out the whole 'foreach' block, and analyse the contents of $csvfile.
    [string](0..9|%{[char][int](32+("39826578840055658268").substring(($_*2),2))})-replace "\s{1}\b"
    Monday, September 12, 2011 11:53 AM
  • If your only reason for importing the file and exporting it again it to change the column headers, then this will work:

     

    $csvfile = Import-Csv $pathCSV$item -header("c1","c2","c3","c4","c5","c6","c7","c8","c9","c10","c11","c12","c13","c14","OrdDate","PickDate","DelDate")
    $csvfile = $csvfile[1..$csvfile.count] # This assumes the original file had column headers. 
    Export-Csv test.csv -NoTypeInformation -InputObject $csvfile
    

     


    [string](0..9|%{[char][int](32+("39826578840055658268").substring(($_*2),2))})-replace "\s{1}\b"
    • Edited by Bigteddy Monday, September 12, 2011 12:25 PM
    Monday, September 12, 2011 12:23 PM
  • When I try to run this

                Import-Csv $pathCSV$item -header("c1","c2","c3","c4","c5","c6","c7","c8","c9","c10","c11","c12","c13","c14","OrdDate","PickDate","DelDate") |

                foreach
                {
                      # do something
                }

     

    I get two pop-up windows (I'm assuming these relate to the two files in my directory) asking to supply values.

    Then it displays

           # do something

     

    Monday, September 12, 2011 12:44 PM
  • That was just an example of the structure, not real code.  If you can explain what you are trying to achieve, I think we could give you more specific answers.
    [string](0..9|%{[char][int](32+("39826578840055658268").substring(($_*2),2))})-replace "\s{1}\b"
    Monday, September 12, 2011 1:03 PM
  • I think I may be getting a bit confused here and also confusing everyone else! I'm going to post the original msdos script that I wrote and try to explain what it does. Maybe import-csv isn't the way to go and someone will know better than me!

    The files I'm reading in are CSV files that contain lines like this. The xx's aren't important at the moment! As you can see the HDR lines contain the date and I use this to output that line, and following lines (until the date changes) to a file named the date.

    HDR,xx,xx,xx,xx,xx,xx,xx,01/01/2000

    LNE,xx,xx,xx,xx,xx,xx,xx,xx

    LNE,xx,xx,xx,xx,xx,xx,xx,xx

    HDR,xx,xx,xx,xx,xx,xx,xx,01/02/2000

    LNE,xx,xx,xx,xx,xx,xx,xx,xx

    LNE,xx,xx,xx,xx,xx,xx,xx,xx

    From that file you'd end up with two separate files named thus

    Order01012000.csv

    HDR,xx,xx,xx,xx,xx,xx,xx,01/01/2000

    LNE,xx,xx,xx,xx,xx,xx,xx,xx

    LNE,xx,xx,xx,xx,xx,xx,xx,xx

    Order01022000.csv

    HDR,xx,xx,xx,xx,xx,xx,xx,01/02/2000

    LNE,xx,xx,xx,xx,xx,xx,xx,xx

    LNE,xx,xx,xx,xx,xx,xx,xx,xx

     

    Also for some reason allocating points isn't working for me at the moment so apologies so far, I'll correct this when it works again.

    -----

    My original msdos code

    rem based upon the pick date field.
    rem Output files are named OrderDDMMYYY.txt

    setlocal enabledelayedexpansion

    set csvPath=C:\Progra~1\AtlasM~1\CSV\
    set csvNew=C:\Progra~1\AtlasM~1\CSV\New\
    set csvArchive=C:\Progra~1\AtlasM~1\CSV\Archive\

    echo.

    rem Parse through each file in the folder (dir /b /a-d) lists files and ignores folders.
    for /f %%a in ('dir /b /a-d %csvPath%') do (

        echo Processing %%a

        rem Read in each line from file %%a and place in %%r
        for /f "tokens=* delims=," %%r in (%csvPath%%%a) do (

            rem Split the line into fields. %%i (field 1), %%j (field 13), %%k (field 14)
            rem %%j = order date, %%k = pick date,
            for /f "tokens=1,13,14 delims=," %%i in ("%%r") do (

                rem If this is a header then change the output file name to be that of the pick
                rem date in the HDR line.
                if %%i==HDR (
                
                    rem create filename based on pick date (%%k)
                    for /f "tokens=1,2,3 delims=/" %%x in ("%%k") do (
                        set ganame=Order%%x%%y%%z
                    )
                )
            )
            rem %%r>> is correct!   %%r >> won't work as it adds a space at the end of the line!
            echo %%r>> %csvNew%!ganame!.csv
        )

        rem Finally move the processed file into the archive folder
        move %csvPath%%%a %csvArchive%
    )

    Monday, September 12, 2011 1:05 PM
  • I started playing with this code example and have a few comments, some
    substantive, and some stylistic.
     
    $t = gc test.txt
    $l = @()
    $s = Select-String -Path test.txt -Pattern "HDR"
    $l += $s | select -ExpandProperty LineNumber
     
    can be coded as
     
    $t = gc test.txt
    $l = @($t | Select-String -Pattern '^HDR,' | select -ExpandProperty
    LineNumber)
     
    to isolate the external dependence to only the input file reference on
    that first line and eliminate the need for the $s variable.
     
    The @(...) wrapper is needed in case there is only one HDR line.
    It insures that $l will be an array.
     
    Also I used the match RE
    ^HDR,
    to avoid matching lines that had HDR embedded in them by coincidence,
    and used singlequotes since we are not using any string expansions.
     
     
    Monday, September 12, 2011 9:31 PM
  • I started playing with this code example and have a few comments, some
    substantive, and some stylistic.
     
    $t = gc test.txt
    $l = @()
    $s = Select-String -Path test.txt -Pattern "HDR"
    $l += $s | select -ExpandProperty LineNumber
     
    can be coded as
     
    $t = gc test.txt
    $l = @($t | Select-String -Pattern '^HDR,' | select -ExpandProperty
    LineNumber)
     
    Also I used the match RE
    ^HDR,
    to avoid matching lines that had HDR embedded in them by coincidence,
    and used singlequotes since we are not using any string expansions.
     
     


    Two good points, Larry.  I like the way you have written directly into the $l variable.  Also, good idea with the beginning-of-line marker on the RE.

    I will amend my code accordingly.

    I assume that if that's all you found lacking, that it works ok for you?

    I really do appreciate.  Thanks.


    [string](0..9|%{[char][int](32+("39826578840055658268").substring(($_*2),2))})-replace "\s{1}\b"
    • Edited by Bigteddy Tuesday, September 13, 2011 3:40 AM
    Tuesday, September 13, 2011 3:21 AM
  •  
    The @(...) wrapper is needed in case there is only one HDR line.
    It insures that $l will be an array.
     
    Not quite true.  The @() wrapper is there to ensure an array of numbers instead of simply adding them up arithmetically.  In other words, in the example given, the two lines of interest were 1 and 7.  Without the use of an array, the += operator would give the result 8, when what we want is 1, 7.

    [string](0..9|%{[char][int](32+("39826578840055658268").substring(($_*2),2))})-replace "\s{1}\b"
    Tuesday, September 13, 2011 3:37 AM
  • Firsly, please excuse the undescriptive variable names. I wrote this in a hurry! But I've tested it on your data, and with more data similar to yours, and it works very well. (Revised as per Larry's suggestions).

     

    $t = Get-Content test.txt
    $l =@($t | Select-String -Pattern '^HDR,' | select -ExpandProperty LineNumber)
    
    for ($i = 0; $i -lt $l.count; $i++){
          if ($l[$i] -ne $l[$l.count-1]) {
            $a1 = ($l[$i]-1)..($l[$i+1]-2)
            $t1 = $t[$a1]
            $name = $t1[0] | Select-String '\d\d/\d\d/\d\d\d\d'
            $name = 'Order' + ($name.matches | select -ExpandProperty value)+ '.csv' -replace '/'
            Out-File -InputObject $t1 -FilePath $name
            } # end if
        else {
            $a1 =($l[$i]-1)..($t.count-1)
            $t1 = $t[$a1]
            $name = $t1[0] | Select-String '\d\d/\d\d/\d\d\d\d'
            $name = 'Order' + ($name.matches | select -ExpandProperty value)+ '.csv' -replace '/'
            Out-File -InputObject $t1 -FilePath $name
            } # end else
    } # end for
    
    

    1. It reads the csv file in as a text file, and establishes which lines have the string "HDR" in them. It store these line numbers in an array called $l.
     
    2. The next bit is difficult to explain. The for loop iterates for each item in the $l array. It checks to see if it is the last item in the array. If it is it executes the else statement.
     
    This code creates an array of integers for each number in the $l array. In your example, the arrays were 1..6 and 7..10. It then uses these as line numbers to create subsets of your text file (stored in variable $t).
     
    3. The other tricky thing was extracting a file name from the first line of the 'set'. For this I used Regular Expressions to search for a date format.

     


    [string](0..9|%{[char][int](32+("39826578840055658268").substring(($_*2),2))})-replace "\s{1}\b"
    • Edited by Bigteddy Tuesday, September 13, 2011 5:47 AM
    • Marked as answer by Richard G Kavanagh Tuesday, September 13, 2011 7:42 AM
    Tuesday, September 13, 2011 5:46 AM
  • Hi If you want I have created a tool in powershell in the aim to export computer configuration (service - hotfix - drivers - process - software) to several format (csv - xml - html - xls). It also allows to compare 2 csv or 2 xml previously exported with the tool, and display results and difference in a html report and xls workbook.

    http://systanddeploy.blogspot.fr/2014/08/powershell-tool-list-compare-my-master.html

    Sunday, August 24, 2014 10:51 AM