none
Remove Line Breaks not Preceded by Comma from a List RRS feed

  • Question

  • Hi guys,

    I am have multiple lists list which are consisting from lots of lines. These lines are to be converted to table and then split by "comma", which is the delimited. Example: "Column1","Column2", "Column3"...

    The problem is that some columns between the "commas" contain line breaks which I do not want to have. These line breaks are unnecessary and actually create redundant rows when .

    I am reading the data from web URLs via Lines.FromBinary(Web.Contents([URL]))). There are 6 URLs = 6 lists stacked on each other. When I expand the content of those stacked lists, the list are broken into lines by all the line breaks, even those redundant ones.

    What I need is a function which would remove the line breaks not preceded by " + comma before I expand those lists.

    I am thinking first to replace all instances where the line break is on this position:  "sometext",[linebreak]"sometext"... I would replace it with e.g. XXX. Then I would run second replace which would replace all line breaks. Then third replace which woudl replace XXX for line breaks again.

    But I am not sure if powerquery can actuall "replace in binary" becuase it seems to me once the content is read as list it is already broken by line breaks and I cannot really do the replacing tricks.

    (no I cannot change the data source)

    Any tips?


    Wednesday, May 18, 2016 3:32 PM

Answers

  • It's possible your data is using just linefeed instead of carriage-return+linefeed.

    Try this:

    = Text.Replace(Text.FromBinary(File.Contents("C:\MyFile.txt")), """,#(lf)", """,ZZZ")

    Ehren

    Wednesday, May 18, 2016 8:30 PM
    Owner

All replies

  • Do you need to preserve the linebreaks between commas? Or can you just remove them completely?

    Here's a formula that will clean up the source file and replace the offending linebreaks with ZZZ. (This is assuming the file is not too large to load into memory...if you need to stream it we'd have to come up with a different solution.)

    = Text.Replace(Text.FromBinary(File.Contents("C:\MyFile.txt")), """,#(cr)#(lf)", """,ZZZ")

    You can then call Lines.FromText on the above result.

    The above M converts the binary file content to a text value, then replaces quote+comma+carriage-return+linefeed with quote+comma+ZZZ.

    Ehren

    Wednesday, May 18, 2016 6:18 PM
    Owner
  • I tried the formula but it does not remove my unwanted line breaks. Are there any special invisible characters which can break lines?

    (The formula itself works with the regular strings, e.g. when replace "XXX" for "#(cr)#(lf)" then line break is inserted indeed)

    Wednesday, May 18, 2016 8:26 PM
  • It's possible your data is using just linefeed instead of carriage-return+linefeed.

    Try this:

    = Text.Replace(Text.FromBinary(File.Contents("C:\MyFile.txt")), """,#(lf)", """,ZZZ")

    Ehren

    Wednesday, May 18, 2016 8:30 PM
    Owner
  • BINGO!!!

    Thx a lot. You just saved some nasty indexing and extensive use of Table.AlternateRows function :)

    Wednesday, May 18, 2016 8:34 PM
  • Great!
    Wednesday, May 18, 2016 8:40 PM
    Owner