none
Multiple Text.AfterDelimiter up to Line Break RRS feed

  • Question

  • Hi I am looking to parse out the contents of an email body which has a structure as per the sample below. I would like to take each value after the ": " up to the Line Break. Is there a way to do this for all the items in the email body putting each value in a different column? Thank you

    Sample

    Name: Test A1

    Country: United States

    City: New York

    ID Number: 1234567


    Wednesday, June 20, 2018 1:13 AM

Answers

  • Revision of earlier solution (deleted) using the Power Query UI:

    let
        Source = ..., //your source where you get the column with a text to split
    SplitColumnByDelimiters = Table.SplitColumn(Source, "SomeTextWithLineFeeds", Splitter.SplitTextByAnyDelimiter({": ", "#(lf)"}), {"Col1", "Col2"}), GroupedRows = Table.Group(SplitColumnByDelimiters, {"Col1"}, {{"Table", each Table.AddIndexColumn(_, "PivotKey"), type table}}), CombinedTables = Table.Combine(GroupedRows[Table]), PivotedColumn = Table.Pivot(CombinedTables, List.Distinct(CombinedTables[Col1]), "Col1", "Col2"), RemovedColumn = Table.RemoveColumns(PivotedColumn,{"PivotKey"}) in RemovedColumn

    Sunday, June 24, 2018 3:52 AM

All replies

  • How does your table structure look like? Does all the text of the email sit in one field (incl. the line breaks you mentioned) or are they already broken up into separate rows in the query?

    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Wednesday, June 20, 2018 8:13 PM
    Moderator
  • Thanks Imke. I worked it out but I suspect there is a better way to do this. Some of the data is within a text block and some is broken out in separate lines. I used the Text.BetweenDelimiters to parse out the data. One issue I could not work out was that sometime the item name (which I use as the first delimiter) changes slightly from email to email with one having two spaces between "Account  Number" and the other just one space "Account Number".

    Is there a way to incorporate this into an if statement? Thanks again.

    let
        Mail = let
        Source = Exchange.Contents("***@***.com"),
        Mail1 = Source{[Name="Mail"]}[Data],
        #"Filtered Rows" = Table.SelectRows(Mail1, each ([Folder Path] = "\Inbox\***\***\")),
        #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Sender", "DisplayTo", "DisplayCc", "ToRecipients", "CcRecipients", "BccRecipients", "DateTimeReceived", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Attributes", "Id"}),
        #"Expanded Body" = Table.ExpandRecordColumn(#"Removed Columns", "Body", {"TextBody"}, {"Body.TextBody"}),
        #"Added Custom" = Table.AddColumn(#"Expanded Body", "Account Legal Name", each [Body.TextBody]),
        #"Extracted Text Between Delimiters1" = Table.TransformColumns(#"Added Custom", {{"Account Legal Name", each Text.BetweenDelimiters(_, "Account Legal Name: ", "#(lf)"), type text}}),
        #"Added Custom1" = Table.AddColumn(#"Extracted Text Between Delimiters1", "Account Number", each [Body.TextBody]),
        #"Extracted Text Between Delimiters2" = Table.TransformColumns(#"Added Custom1", {{"Account Number", each Text.BetweenDelimiters(_, "Account Number: ", "#(lf)"), type text}}),
        #"Added Custom2" = Table.AddColumn(#"Extracted Text Between Delimiters2", "Address", each [Body.TextBody]),
        #"Extracted Text Between Delimiters3" = Table.TransformColumns(#"Added Custom2", {{"Address", each Text.BetweenDelimiters(_, "Address: ", "#(lf)"), type text}}),
        #"Added Custom3" = Table.AddColumn(#"Extracted Text Between Delimiters3", "Type", each [Body.TextBody]),
        #"Extracted Text Between Delimiters4" = Table.TransformColumns(#"Added Custom3", {{"Type", each Text.BetweenDelimiters(_, "Type: ", "#(lf)"), type text}}),
        #"Added Custom4" = Table.AddColumn(#"Extracted Text Between Delimiters4", "Domicile", each [Body.TextBody]),
        #"Extracted Text Between Delimiters5" = Table.TransformColumns(#"Added Custom4", {{"Domicile", each Text.BetweenDelimiters(_, "Domicile: ", "#(lf)"), type text}}),
        #"Added Custom5" = Table.AddColumn(#"Extracted Text Between Delimiters5", "AUM", each [Body.TextBody]),
        #"Extracted Text Between Delimiters6" = Table.TransformColumns(#"Added Custom5", {{"AUM", each Text.BetweenDelimiters(_, "AUM: ", "#(lf)"), type text}}),
        #"Added Custom6" = Table.AddColumn(#"Extracted Text Between Delimiters6", "LEI", each [Body.TextBody]),
        #"Extracted Text Between Delimiters7" = Table.TransformColumns(#"Added Custom6", {{"LEI", each Text.BetweenDelimiters(_, "LEI: ", "#(lf)"), type text}}),
        #"Added Custom7" = Table.AddColumn(#"Extracted Text Between Delimiters7", "Sub Accounts", each [Body.TextBody]),
        #"Extracted Text Between Delimiters8" = Table.TransformColumns(#"Added Custom7", {{"Sub Accounts", each Text.BetweenDelimiters(_, "Sub Accounts: ", "#(lf)"), type text}}),
        #"Added Custom8" = Table.AddColumn(#"Extracted Text Between Delimiters8", "Account Number", each [Body.TextBody]),
        #"Extracted Text Between Delimiters9" = Table.TransformColumns(#"Added Custom8", {{"Account Number", each Text.BetweenDelimiters(_, "Account Number: ", "#(lf)"), type text}}),
        #"Added Custom9" = Table.AddColumn(#"Extracted Text Between Delimiters9", "Products", each [Body.TextBody]),
        #"Extracted Text Between Delimiters10" = Table.TransformColumns(#"Added Custom9", {{"Products", each Text.BetweenDelimiters(_, "that would like to begin trading ", " upon funding"), type text}}),
        #"Added Custom10" = Table.AddColumn(#"Extracted Text Between Delimiters10", "Amendemnt Date", each [Body.TextBody]),
        #"Extracted Text Between Delimiters11" = Table.TransformColumns(#"Added Custom10", {{"Amendemnt Date", each Text.BetweenDelimiters(_, "listed on the amendments dated ", "."), type text}})
    in
       #"Extracted Text Between Delimiters11",
        #"Split Column by Delimiter" = Table.SplitColumn(Mail, "AUM", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"CCY", "Amount.2", "Amount.3"}),
        #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"CCY", type text}, {"Amount.2", type number}, {"Amount.3", type text}})
    in
        #"Changed Type"

    Thursday, June 21, 2018 8:41 PM
  • Hi Doug,

    how about replacing "  " by " " as one of the first steps before starting with the other transformations?


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Thursday, June 21, 2018 8:47 PM
    Moderator
  • Hi Doug.

    If I got it right:

    let
    
        Source = ..., //you source where you get the column with a text to split
    
        ToTable = Table.AddColumn(Source, "Fields", each 
            let 
                Splitted = Splitter.SplitTextByAnyDelimiter({": ", "#(lf)"})([SomeTextWithLineFeeds]) 
            in 
                #table(
                    List.Alternate(Splitted,1,1,1), 
                    {List.Alternate(Splitted,1,1)})
            ),
        AllColumns = List.Union(List.Transform(ToTable[Fields], Table.ColumnNames)),
        Expanded = Table.ExpandTableColumn(ToTable, "Fields", AllColumns, AllColumns)
    in
        Expanded

    This takes a value from the table column named SomeTextWithLineFeeds, which contains the text from your sample, then creates a table from this text, splitted by ": " and #(LF), and expand it to the new columns


    Maxim Zelensky Excel Inside

    Friday, June 22, 2018 1:58 PM
  • Revision of earlier solution (deleted) using the Power Query UI:

    let
        Source = ..., //your source where you get the column with a text to split
    SplitColumnByDelimiters = Table.SplitColumn(Source, "SomeTextWithLineFeeds", Splitter.SplitTextByAnyDelimiter({": ", "#(lf)"}), {"Col1", "Col2"}), GroupedRows = Table.Group(SplitColumnByDelimiters, {"Col1"}, {{"Table", each Table.AddIndexColumn(_, "PivotKey"), type table}}), CombinedTables = Table.Combine(GroupedRows[Table]), PivotedColumn = Table.Pivot(CombinedTables, List.Distinct(CombinedTables[Col1]), "Col1", "Col2"), RemovedColumn = Table.RemoveColumns(PivotedColumn,{"PivotKey"}) in RemovedColumn

    Sunday, June 24, 2018 3:52 AM
  • Hi NY_Doug_2018,

    did any of the provided solutions solve your problem? Then please mark it as answer to help others with the same problem.


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Sunday, July 8, 2018 8:59 AM
    Moderator