locked
import-csv remove carriage return in column RRS feed

  • Question

  • Hi! I'm fairly new to powershell. I've been having a bit of trouble with import-csv and I hope you guys can help me. I have a list of csv files(mostly winevent log reports). Some of them have a column named "Message" and some don't. What I would like to do is if the csv file contains the column "Message", remove the \r\n in the values.

    I tried to do it this way but it doesn't work...what am I doing wrong?

    (this is inside a foreach loop, looping through csv files in my csv file list)

    import-csv $SourceFile | select * -ExcludeProperty "__mv_*","_mkv_*"| Foreach-Object {
                                        foreach ($property in $_.PSObject.Properties){
                                            if($property.Name -match "Message"){
                                               $_.Message = $_.Message -replace "`r`n",' '
                                            }
                                        }
                                    }| export-csv -path $TempFileUTF8 -Encoding UTF8 NoTypeInformation

    this noob needs your expertise!


    Wednesday, June 22, 2016 12:52 AM

Answers

  • I will repeat. The line break is not a return/newline.  It is just  newline.  By using a pair nothing will be replaced.

    If you just replace then nothing will be saved;

    Simple method:

    import-csv $SourceFile |
    	ForEach-Object{
    		$_.Message = $_.Message -replace "\n", ' '
    		$_  # output the record
    	} | 
    	Export-Csv $TempFileUTF8 -Encoding UTF8 -NoTypeInformation
    This will update the record and create a new file.


    \_(ツ)_/

    • Proposed as answer by Hello_2018Moderator Thursday, June 23, 2016 9:14 AM
    • Marked as answer by jh_tokyo Friday, June 24, 2016 12:01 AM
    Wednesday, June 22, 2016 11:20 AM
    Moderator

All replies

  • A true CSV file does not require this.  The line breaks are allowed in a CSV. There is never a need to remove them.

    \_(ツ)_/

    Wednesday, June 22, 2016 1:23 AM
    Moderator
  • Actually, I want to replace the \r\n because some files are fairly large and it takes forever to open them...

    Wednesday, June 22, 2016 2:17 AM
  • Don't display them on the console.  It doesn't make sense to do this.  It will take longer to fix the files.Also yuse Foramt-Table at the console as it will truncate the field and speed up the output.


    \_(ツ)_/

    Wednesday, June 22, 2016 2:22 AM
    Moderator
  • You can actually just eliminate "Message" from the output as all of the data is there anyway.  "Message" is just redundant.


    \_(ツ)_/

    Wednesday, June 22, 2016 2:23 AM
    Moderator
  • A true CSV file does not require this.  The line breaks are allowed in a CSV. There is never a need to remove them.


    \_(ツ)_/

    This is ridiculous. Just more rubbishell concept!... from rubbishell coder #3.

    If a field has a carriage-return/line-feed characters in it, then, a single line of data is SPLITTED in 2 lines of malformed data.

    Read the definition and rules of a CSV. All characters are allowed in quotes.  Databases frequently have quotes and other control characters in a text field.  We use this to move data to and from Excel and databases.  It never causes an issue.

    Read the post.  The issue is slowness. This is a side effect of console throttling in PowerShell which is well documented.


    \_(ツ)_/

    Wednesday, June 22, 2016 2:57 AM
    Moderator
  • Maybe I wasn't very clear with my question.. 

    I export the file to a remote destination for my user to be able to view it. The problem is, because this particular 'message' column has \r\n in its values, it takes 10mins+ to open a single file... 

    Wednesday, June 22, 2016 3:37 AM
  • Maybe I wasn't very clear with my question.. 

    I export the file to a remote destination for my user to be able to view it. The problem is, because this particular 'message' column has \r\n in its values, it takes 10mins+ to open a single file... 

    How are they opening it?


    \_(ツ)_/

    Wednesday, June 22, 2016 4:05 AM
    Moderator
  • they open it with excel...
    Wednesday, June 22, 2016 4:31 AM
  • So the Excel is opened by PowerShell?  If you turn of recalc in Excel it will open faster.

    Excel always tries to format the screen a cell at a time.  This will be very slow as it has to reformat the screen for every cell.  Turn of autocalc and auto format and it will open faster.

    The best thing is to not use Excel at all just use HTML or a Windows form with a DataGridView control.


    \_(ツ)_/

    Wednesday, June 22, 2016 4:37 AM
    Moderator
  • I don't think that the Message has Return and NewLine.  I think  it has only NewLine.


    \_(ツ)_/

    Wednesday, June 22, 2016 4:39 AM
    Moderator
  • I tried doing this 

     import-csv $SourceFile | select * -ExcludeProperty "__mv_*","_mkv_*" |% {
                                      foreach ($property in $_.PSObject.Properties)
                                        {
                                             if($property.Name -eq "Message"){
                                                $property.Value = $property.Value -replace "\r\n",' '
                                             }
                                        }                                      
                                    }| export-csv -path $TempFileUTF8 -Encoding UTF8 -NoTypeInformation -Append

    but I get blank csv files (0kb)... Should I breakdown the pipeline? I'm not sure how to do it..

    Please help..

    Wednesday, June 22, 2016 5:38 AM
  • Your pipeline result must be empty. If you're setting variables in a pipeline (what you're doing with "="), the pipeline stream is broken, since it is redirected into the variable. You can avoid this by sending the actual element again. Try both, the first will deliver nothing, the last will:

    1,2,3 | foreach{$a = $_}
    
    1,2,3 | foreach{$a = $_; $_}
    For simplicity's sake, you should do this in a foreach($a in $b){...} loop and not in foreach object.

    Wednesday, June 22, 2016 7:49 AM
  • For your original problem: if the file is large, I would read it line by line with a StreamReader. Only work with the actual line as a string. You can get values by 

    $value = ($line -split ";")[$index]
    when $index is the position of your message property. This can be found by analyzing the header. Then you can do this:
    $NewValue = $Value -replace "\n\r"
    $NewLine = $Line -replace $Value, $NewValue
    Write this to a new output file with a StreamWriter or with Add-Content.
    Wednesday, June 22, 2016 7:54 AM
  • I will repeat. The line break is not a return/newline.  It is just  newline.  By using a pair nothing will be replaced.

    If you just replace then nothing will be saved;

    Simple method:

    import-csv $SourceFile |
    	ForEach-Object{
    		$_.Message = $_.Message -replace "\n", ' '
    		$_  # output the record
    	} | 
    	Export-Csv $TempFileUTF8 -Encoding UTF8 -NoTypeInformation
    This will update the record and create a new file.


    \_(ツ)_/

    • Proposed as answer by Hello_2018Moderator Thursday, June 23, 2016 9:14 AM
    • Marked as answer by jh_tokyo Friday, June 24, 2016 12:01 AM
    Wednesday, June 22, 2016 11:20 AM
    Moderator
  • But if is is a large file (since it takes a long time to open in excel, it will be large, I assume), your way will need a lot of performance - both memory and time.

    But it can be better to use the following, to replace all combinations of newline and carriage return:

    $NewValue = $Value -replace "[\n\r]"
    Wednesday, June 22, 2016 11:30 AM
  • But if is is a large file (since it takes a long time to open in excel, it will be large, I assume), your way will need a lot of performance - both memory and time.

    But it can be better to use the following, to replace all combinations of newline and carriage return:

    $NewValue = $Value -replace "[\n\r]"

    Even relatively small event log exports will take a long time in Excel if the Excel is reformatting the display for every cell.  It is very bad in Excel 2010 and earlier and better in 2013 and later.

    As I keep pointing out. The messages do not contain `r`n which is part of the issue and the user is not outputting the results.  Your suggestion will work as it removes by character for all matched characters.

    $_.Message = $_.Message -replace "\'\n', ' '  # this works.  I have done it

    $_.Message = $_.Message -replace '[\r\n']'

    This and outputting by adding a single line with $_ fixes this.


    \_(ツ)_/

    Wednesday, June 22, 2016 12:12 PM
    Moderator
  • I tried removing the \r\n in ReadLine but it didn't work...

    my code is goes soemthing like this:

    if (!$inputReaderSrc.EndOfStream)
    {
        import-csv $SourceFile | select * -ExcludeProperty "__mv_*","_mkv_*"| export-csv -path $TempFileUTF8 -Encoding UTF8 -NoTypeInformation
    
        # Convert CSV file to SJIS encoding
        $inputReader = New-Object System.IO.StreamReader ($TempFileUTF8, [Text.Encoding]::GetEncoding("UTF-8"))
        $outputReader = New-Object System.IO.StreamWriter ($TempFileSJIS, $false, [Text.Encoding]::GetEncoding("Shift-JIS"))
    
        while( !$inputReader.EndOfStream ){
        	$line = $inputReader.ReadLine().Replace('\r\n',' ')
            $outputReader.WriteLine($line)
        }
    
        #flush,close file here
    
        $SourceFile = $TempFileSJIS
    }

    I tried both suggestions but I still don't get my desired output..Am I just bad at following directions or am I missing something here... 


    • Edited by jh_tokyo Thursday, June 23, 2016 1:16 AM corrected typos
    Thursday, June 23, 2016 1:15 AM
  • I gave you an answer that won't fail..  You chose something that you can't possibly understand.  Why?

    \_(ツ)_/

    Thursday, June 23, 2016 2:13 AM
    Moderator
  • Actually, I tried your answer. After tinkering with it a little bit, it finally worked. thanks a lot for your help!! (^_^)y

                                                     
    import-csv $SourceFile | select * -ExcludeProperty "__mv_*","_mkv_*"|%{
        foreach ($property in $_.PSObject.Properties){
            if(($property.Name -match 'Message') -OR ($property.Name -match 'TextData')){
               $property.Value = $property.Value -replace "`r`n",' '
            }
              
        }
        $_
    }| export-csv -path $TempFileUTF8 -Encoding UTF8 -NoTypeInformation

    Thursday, June 23, 2016 4:02 AM
  • You might not need to replace the \r\n in the column, just wrap the column with quotation marks like this "this column contains \r\n values".
    Friday, October 18, 2019 12:39 PM