locked
Replacing a string of data within double qoutes and remove the doubel qoute RRS feed

  • Question

  • Hi,

    I have a string of data as below.

    1.1011.12.1,1508625580000,FB_1139,1,1011,109474892350579,1,5,df9d7c14a4e2957ecfa9c30baee745cf,5000001,"11+400,15+10,6+1201+1,6+2201+1",2017-10-22

    I would like to replace the commas within the double quotes only "11+400,15+10,6+1201+1,6+2201+1" with a underscore and than remove the double quotes.

    The script that I have is as below which was able to remove the double quote and retain the comma within the double quotes, I had tried to modified the script below to replace the comma with a underscore with the regexp but the results are negative.

    I would like to ask the help from the experts here how could I modify the below expression to suit my needs.

     
    Import-Csv -Path TST_Dummy.csv `
      | ConvertTo-CSV -NoTypeInformation `
      | % {$_ -replace  `
      '\G(?<start>^|,)(("(?<output>[^,"]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
      ,'${start}${output}'} `
      | Out-File TST_Dummy_C.csv -Encoding utf8 -Force

    Desire outcome:

    1.1011.12.1,1508625580000,FB_1139,1,1011,109474892350579,1,5,df9d7c14a4e2957ecfa9c30baee745cf,5000001,11+400_15+10_6+1201+1_6+2201+1,2017-10-22

    Thank you in advance.

    Thursday, December 14, 2017 9:03 AM

Answers

  • The following should do both of what you are trying to do.

    Get-Content input.csv |
        ForEach-Object{
            $_ -match '"(.*)"'
            $new = $matches[0] -replace ',', '_'
            $_ -replace '".*"', $new
        } |
        Out-File output.csv

    based on this:

    $line = '1.1011.12.1,1508625580000,FB_1139,1,1011,109474892350579,1,5,df9d7c14a4e2957ecfa9c30baee745cf,5000001,"11+400,15+10,6+1201+1,6+2201+1",2017-10-22'
    $line -match '"(.*)"'
    $new = $matches[0] -replace ',', '_'
    $line -replace '".*"', $new
    $line

    Any field enclosed in quotes will have any commas replaced and the field, including quotes will be replaced.


    \_(ツ)_/


    Thursday, December 14, 2017 12:10 PM

All replies

  • The following should do both of what you are trying to do.

    Get-Content input.csv |
        ForEach-Object{
            $_ -match '"(.*)"'
            $new = $matches[0] -replace ',', '_'
            $_ -replace '".*"', $new
        } |
        Out-File output.csv

    based on this:

    $line = '1.1011.12.1,1508625580000,FB_1139,1,1011,109474892350579,1,5,df9d7c14a4e2957ecfa9c30baee745cf,5000001,"11+400,15+10,6+1201+1,6+2201+1",2017-10-22'
    $line -match '"(.*)"'
    $new = $matches[0] -replace ',', '_'
    $line -replace '".*"', $new
    $line

    Any field enclosed in quotes will have any commas replaced and the field, including quotes will be replaced.


    \_(ツ)_/


    Thursday, December 14, 2017 12:10 PM
  • Thank you for your reply. I had tried but it is not removing the double quote. I had actually used two different script to get my desire output.
    Monday, December 18, 2017 7:59 AM
  • Hi,

    I'm checking how the issue is going, was your issue resolved?

    And if the replies as above are helpful, we would appreciate you to mark them as answers, and if you resolve it using your own solution, please share your experience and solution here. It will be greatly helpful to others who have the same question.

    Appreciate for your feedback.

    Best Regards,
    Albert

    Please remember to mark the replies as an answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com

    Tuesday, December 19, 2017 2:57 AM
  • I broke it into two parts, it is not very tidy and due to urgency I leave it at it is for now and will improve or combine it to work as one.

    Part one - Replace the commas within double quotes with underscore.

    #Set-ExecutionPolicy RemoteSigned
    cd C:\Users\TestFiles
    $Cnts = @()
    $RawData = Import-Csv -Path .\TST_Dummy.csv -Header xxx,yyy,zzzz
    foreach ($Cnt in $RawData[1..$RawData.Count])
    {
    $Cnt.RewardStr = $Cnt.RewardStr.Replace(",", "_")
    $Cnts += $Cnt
    }
    $Cnts | Export-Csv -Delimiter (",") -Encoding UTF8 -NoTypeInformation -Path .\CLeansed_1.csv
    Part two - Remove double quotes

    Set-ExecutionPolicy RemoteSigned
    cd C:\Users\TestFiles
    $RawData = "TST_Dummy_C1.csv"
    $CleanseData = "TST_Dummy_C2.csv"
    
    (Get-Content $RawData) `
    | % {$_ -replace '"', ""} `
    | out-file -FilePath $CleanseData -Force -Encoding utf8
    

    Hope that helps.

    Tuesday, December 19, 2017 8:03 AM