none
Import CSV and clean up each column. RRS feed

  • Question

  • Hi, I have a messy column in a CSV file it looks like this

    column1,column2,column3
    2013-01-01,Servername,dnsfsdf=""4324"" tcpsdsd=""334455"" DataTransferred=""43556677""

    how do I clean up column3 and remove EVERYTHING except the datatransferred numeric amount?  If it was in single quotes I can do, I can't figure out the double quote weirdness.  please help

    jmh

    Saturday, April 12, 2014 8:28 PM

Answers

  • Like this:

    import-csv col3.csv|%{if($_.column3 -match 'DataTransferred=""(\d+)'){$matches[1]}}


    ¯\_(ツ)_/¯

    • Marked as answer by jh hj Sunday, April 13, 2014 8:27 PM
    Saturday, April 12, 2014 8:44 PM
  • this will work but only if you are correct about the format.

    import-csv col3.csv|%{if($_.column3 -match 'incremental.*DataTransferred=""(\d+)'){$matches[1]}}


    ¯\_(ツ)_/¯

    • Marked as answer by jh hj Sunday, April 13, 2014 8:27 PM
    Sunday, April 13, 2014 2:40 PM

All replies

  • Like this:

    import-csv col3.csv|%{if($_.column3 -match 'DataTransferred=""(\d+)'){$matches[1]}}


    ¯\_(ツ)_/¯

    • Marked as answer by jh hj Sunday, April 13, 2014 8:27 PM
    Saturday, April 12, 2014 8:44 PM
  • What scripting tool did you intend to use? Batch? VBScript? PowerShell? Other? Here is one solution:

    @echo off
    for /F "delims=" %%a in (test.txt) do call :sub %%a
    goto :eof

    :Sub
    set x=%*
    echo %x:""=%

    Saturday, April 12, 2014 8:45 PM
  • Sorry I neglected to mention I was using Powershell. thank you for the replies.
    Sunday, April 13, 2014 1:34 AM
  • I tried what you said.  It pulls 1 digit from the bunch, it pulled the number 7 for some reason which is the second number in my test file. 
    Sunday, April 13, 2014 2:25 AM
  • I tried what you said.  It pulls 1 digit from the bunch, it pulled the number 7 for some reason which is the second number in my test file. 

    Impossible if you gave me the exact same file.  I put your data in a file and it worked exactly like you asked.

    Either your file is different or you copied it wrong.

    POst at least 4 or 5 lines of the exact file.  Copy it exactly from your file.


    ¯\_(ツ)_/¯


    • Edited by jrv Sunday, April 13, 2014 2:37 AM
    Sunday, April 13, 2014 2:36 AM
  • i'll check it in the morning to see.  maybe I'm doing something wrong.... if it still doesn't work i'll paste an exact copy of the data I'm trying to clean.  thanks again for the help.
    Sunday, April 13, 2014 4:58 AM
  • ok, I found what the issue is.  it's a backup log which I'm parsing and turns out there's two Datatransferred= in it.  it's the second one I want not the first. sorry for the half-explanation, I didn't see that.  my csv is like this :

    column1,column2,column3
    2013-01-01,Servername,dnsfsdf=""4324"" bootdrive=""1"" DataTransferred=""7"" accesspath=""C:"" incremental=""yes"" DataTransferred=""4738596987""

    that's why I was getting '7' as my answer.  what I want is the data amount after C: not the bootdrive, sorry about that, and thanks for the help.

    Sunday, April 13, 2014 2:17 PM
  • Have you really analyzed this?  Is this really a CSV or do you just think it looks like one?

    If it is a ragged, tagged log this could go on repeatedly.


    ¯\_(ツ)_/¯

    Sunday, April 13, 2014 2:32 PM
  • this will work but only if you are correct about the format.

    import-csv col3.csv|%{if($_.column3 -match 'incremental.*DataTransferred=""(\d+)'){$matches[1]}}


    ¯\_(ツ)_/¯

    • Marked as answer by jh hj Sunday, April 13, 2014 8:27 PM
    Sunday, April 13, 2014 2:40 PM
  • this seems to have worked beautifully.  I changed 'incremental.'  with 'c:.'  and it seems to have worked too. will that be ok do you think?  ultimately I want the data amount from c:  not the boot drive.  thanks again.

    And to answer your question about the csv file, it IS a csv for sure.  it originated from a Windows Event log, parsed into XML and then dumped into CSV.  it's not pretty but it gets me my daily report to show successful backups.  There's tons of people asking how to do this on the net, but no one has a good method.  with your help on the last part, I think I got it.  if you're curious in the whole script I've done let me know as I'm sure it's quite ugly.... anyway, Thanks Again.   

    jmh

    Sunday, April 13, 2014 8:27 PM
  • Like I said.  The data is not predictable.  I cannot say what will happen without knowing how the data is being generated and doing an analysis of the files.

    ¯\_(ツ)_/¯

    Sunday, April 13, 2014 10:07 PM