locked
Remove leading and trailing space from CSV but exclude other spaces RRS feed

  • Question

  • I have a CSV file where every cell contains leading spaces and trailing spaces, and some cells have spaces in the middle.  I want to remove all leading and trailing spaces without affecting the spaces in the middle.  How can I do this with PowerShell?

    For example, the CSV looks like this (I'm using quotes to show where there are spaces, but there are not quotes in the file):

    "   Hostname    "                                      "   IP Address     "

    "   server_A_words with spaces      "          "   1.1.1.1         "

    "   server_B_words with spaces      "          "   2.2.2.2         "

    I want to import that file, remove leading and trailing space, and export it back to the *same* CSV file but have it look like this (again, the quotes are only used to show the spaces, or lack thereof in this case):

    "Hostname"                                     "IP Address"

    "server_A_words with spaces"           "1.1.1.1"

    "server_B_words with spaces"           "2.2.2.2"

    So something like this

    Import-Csv .\file.csv | foreach {"work some magic"} | Export-Csv .\file.csv


    Any help is greatly appreciated!

    Thursday, March 29, 2018 6:13 PM

Answers

  • So, in case anyone else needs this, until I can figure out a way to do it using PowerShell or I can get the original report in a better format, I figured out how to remove leading and trailing spaces (without affecting internal spaces) from a CSV using this VBA macro:

    Sub CleanUp()
      Dim r As Range
      For Each r In ActiveSheet.UsedRange
        v = r.Value
        If v <> "" Then
          If Not r.HasFormula Then
            r.Value = Trim(v)
          End If
        End If
      Next r
    End Sub
    

    It's a manual process, but at least it lets me get my data in the format I need in order to do other more important stuff with it.


    <cite style="margin:0in;font-family:Calibri;font-size:9.0pt;color:#595959;">From <https://stackoverflow.com/questions/32181405/how-to-remove-leading-and-trailing-spaces-from-all-cells-of-a-excel-sheet-at-onc> </cite>

    • Marked as answer by stringsurfer Friday, March 30, 2018 4:36 PM
    Friday, March 30, 2018 4:36 PM

All replies

  • and were does this magical csv file get generated from?
    Thursday, March 29, 2018 6:21 PM
  • It's a .xlsx file that I receive as an automated report, which I convert to a CSV using this script:

    # Convert Excel to CSV
    Function ExcelCSV ($File)
    {
        $excelFile = "$PSScriptRoot\" + $File + ".xlsx"
        $Excel = New-Object -ComObject Excel.Application
        $Excel.Visible = $false
        $Excel.DisplayAlerts = $false
        $wb = $Excel.Workbooks.Open($excelFile)
        
        foreach ($ws in $wb.Worksheets) {
            $ws.SaveAs("$PSScriptRoot\" + $File + ".csv", 6)
        }
        $Excel.Quit()
    }
    
    $xlsxFilename = "file"
    ExcelCSV -File "$xlsxFilename"

    The .xlsx file has all the unwanted spaces in it, which carries over to the CSV.  If there's an easier way to do it I'm open to suggestion.

    Thursday, March 29, 2018 6:25 PM
  • crucify the person giving you the automated REPORT in xlsx format.

    that's my advice.

    Thursday, March 29, 2018 6:29 PM
  • I am not sure if this would help, but you can try.

    import the CSV into Excel,

    go to data > Text to columns

    Then you can check the trim command

    https://www.pryor.com/blog/remove-extra-spaces-from-excel-data/

    or the maybe the format cell command using custom settings

    Once you have trimmed the excess, then save the file back out as a MS-DOS CSV file. 

    Thursday, March 29, 2018 6:32 PM
  • If there was anything I could do about that, believe me I would.  But the fact remains, I have to work with what I have... which is a garbage file that I need to reformat so I can parse for important data.  If you know how to do what I'm trying to do, then please share :)
    • Proposed as answer by jrv Thursday, March 29, 2018 11:58 PM
    Thursday, March 29, 2018 6:52 PM
  • Thanks, frankxk!  Unfortunately, those options don't work, and I'm trying to avoid having to do anything manually in Excel at all costs.  I have a lot of additional automation that will take place based on the data in this file, but it relies on the data being properly formatted.  So I need to be able to drop the xlsx file in a directory, run my script, and have it do everything (i.e. convert xlsx to csv, reformat correctly, then pump that data into other scripts).
    Thursday, March 29, 2018 6:55 PM
  • are the spaces before and after static? like always 5 spaces before and 7 after? are we sure it's not a 'tab' value in there somewhere?
    Thursday, March 29, 2018 7:31 PM
  • So, there are two more columns I didn't mention before, which contain dates.  There's always a single leading space, but the trailing spaces appear to be random based on the character length of the data in that cell, except for the IP and date columns which look like they all have one leading and two trailing spaces.

    Here's a screenshot of what it looks like when I copy the text from Excel and paste it into Notepad++.  The only tabs are where it tabs to the next cell in Excel, but within each cell there are a random number of spaces at the end except for the last 3 columns (minus the header).


    Thursday, March 29, 2018 8:14 PM
  • Might want to take a look into OCR
    • Edited by The Grim Thursday, March 29, 2018 8:45 PM
    Thursday, March 29, 2018 8:42 PM
  • Remove leading and trailing spaces from a CSV:

    (Get-Content .\file.csv) | 
        foreach {
            #"work some magic"
            $_ -replace '"\s+|\s+"','"'
        } | Out-File .\file.csv


    \_(ツ)_/

    Friday, March 30, 2018 12:01 AM
  • @jrv, when I try that it doesn't remove any leading or trailing spaces, and it combines all 4 columns into column A, with a comma preceding each subsequent entry.

    So if the source data looks like this in for separate columns (using quotes to show the spaces):

    " Hostname             "      " IP Address          "       " Created         "       " Modified        "

    The output looks like this with everything in one column:

    " Hostname             "      ", IP Address          "       ", Created         "       ", Modified        "

    Friday, March 30, 2018 3:08 PM
  • You original file is NOT a Csv file and appears to have unprintable characters embedded.  You will have to analyze the file correctly to determine how to parse it.

    Your original line has no commas and the second line has commas.  Where did the commas come from?


    \_(ツ)_/


    • Edited by jrv Friday, March 30, 2018 3:17 PM
    Friday, March 30, 2018 3:14 PM
  • The following method will convert your first line to a CSV line:

    $line = '" Hostname             "      " IP Address          "       " Created         "       " Modified        "'
    $l2 = $line  -replace '"\s+"','","'
    $l2 -replace '"\s+|\s+"','"'


    \_(ツ)_/

    Friday, March 30, 2018 3:22 PM
  • The commas appeared after I ran your script...

    (Get-Content .\file.csv) | 
        foreach {
            #"work some magic"
            $_ -replace '"\s+|\s+"','"'
        } | Out-File .\file.csv

    Once I ran that on the source CSV (which had no commas) the resulting CSV had all four columns combined into one column with the commas.

    Friday, March 30, 2018 3:42 PM
  • So, in case anyone else needs this, until I can figure out a way to do it using PowerShell or I can get the original report in a better format, I figured out how to remove leading and trailing spaces (without affecting internal spaces) from a CSV using this VBA macro:

    Sub CleanUp()
      Dim r As Range
      For Each r In ActiveSheet.UsedRange
        v = r.Value
        If v <> "" Then
          If Not r.HasFormula Then
            r.Value = Trim(v)
          End If
        End If
      Next r
    End Sub
    

    It's a manual process, but at least it lets me get my data in the format I need in order to do other more important stuff with it.


    <cite style="margin:0in;font-family:Calibri;font-size:9.0pt;color:#595959;">From <https://stackoverflow.com/questions/32181405/how-to-remove-leading-and-trailing-spaces-from-all-cells-of-a-excel-sheet-at-onc> </cite>

    • Marked as answer by stringsurfer Friday, March 30, 2018 4:36 PM
    Friday, March 30, 2018 4:36 PM