none
Remove Commas Between Quotes RRS feed

  • Question

  • Hi guys,
    I have been recently playing with a nasty file where delimiters (commas) and quotes are totally messed up.

    I cannot change the data source.

    Here is an example of two lines:

    Line 1: column1,column2,"column 3", "column4,something",
    Line 2:"column1","column2,xlfsj",column3,column4,
    and so on....

    The "main" delimiter is comma. But comma is sometimes used between the quotes which totally messes up "Split By Delimiter" function.

    I need to find all the commas between " " and replace them with something else. Then I would replace all the ". Then I would split by delimier which should be the correct commas.

    Any tips on how to achieve that?

    DZ

    Thursday, May 19, 2016 7:58 AM

Answers

  • Hi Daniel. The default "Split By Delimiter" options seem to work just fine on the sample lines you provided above. Just ensure the Quote Style of the split is CSV, not None.

    Unless the issue is that you want all the commas to be treated as delimiters. If that's the case, try switching the Quote Style to None.

    Ehren


    Thursday, May 19, 2016 5:52 PM
    Owner

All replies

  • Why not remove all the commas? replace " with empty. Then use the comma as delimiter. Generally, all columns default to text. You can then reassign the types manually.
    Thursday, May 19, 2016 4:07 PM
  • Hi Daniel

    Try something like this below

    let
        fxReplace = (txt as text, sep as text) as text =>
          let
            List1 = Text.Split(txt, """"),
            Tbl = Table.FromColumns({List1, List.Transform(List.Positions(List1), each Number.IsOdd(_))}, {"Lst", "IsOdd"}),
            AddCol = Table.AddColumn( Tbl, "Change", each if [IsOdd] then Text.Replace([Lst], ",", sep) else [Lst]),
            Combine = Text.Combine( AddCol[Change], """")
          in
            Combine,
    
        Source = Table.FromColumns({Lines.FromBinary(File.Contents("PathToYourFile\YourFile.txt"),null,null,1250)}),
        ColWithfxReplace = Table.AddColumn(Source, "Custom", each fxReplace([Column1], "|"))
    in
        ColWithfxReplace

    where txt, it is a text from the line of your one column file and sep is a delimiter you want to use instead of comma ( between quotation marks of course). Delimiter may contain more than one character.

    Regards

    Thursday, May 19, 2016 4:25 PM
  • Hi Daniel. The default "Split By Delimiter" options seem to work just fine on the sample lines you provided above. Just ensure the Quote Style of the split is CSV, not None.

    Unless the issue is that you want all the commas to be treated as delimiters. If that's the case, try switching the Quote Style to None.

    Ehren


    Thursday, May 19, 2016 5:52 PM
    Owner
  • Thx, this is the most elegant option :)

    DZ

    Friday, May 20, 2016 7:16 AM