none
Rows into Columns, and Combine Rows with Tricky Structure RRS feed

  • Question

  • Hi,

    I have text that was output from a database at some point that I would like to get into tabular format using powerquery. I have two problems that I can't seem to get around:

    (1) the text is in linear format with the field name, a tab, and then the field value---how do I get that into columns?

    (2) Here's the tricky part, the "post_text" tag is strangely mixed up in it's own field values. Fortunately the field value is contained by special characters "Œ".  What makes this especially challenging, as you will see in the sample below, the text strangely breaks rows idiosyncratically.  In other words, the values for the "post_text" column for the three records below would be:

    "There is an issue with the back side fuse post_text box that is misbehaving"
    "Port side copling no long tight around back side H joint, must change post_text out looped bolt"
    "Back side power relay also failing based on meter readings and power flow indicator"

    Any ideas?

    Œticket_doc	 738473
    created_date	 8/20/2011
    post_id	 235051 
    post_date	 8/20/2011
    problem_type	 Electrical
    ticket_owner_user	 John James 
    ticket_ owner_ contact	 j@jay.com
    Œ	
    There is an issue with the back side fuse
    post_text box that is misbehaving
    Œticket_doc	 738422
    created_date	 8/20/2011
    post_id	 235051 
    post_date	 8/20/2011
    problem_type	 Mechanical
    ticket_owner_user	 John James 
    ticket_ owner_ contact	 j@jay.com
    Œ	
    Port side copling no long tight around back side H joint, must change
    post_text out looped bolt
    Œticket_doc	 738413
    created_date	 2/20/2013
    post_id	 235051 
    post_date	 2/20/2012
    problem_type	 Electrical
    ticket_owner_user	 John James 
    ticket_ owner_ contact	 j@jay.com
    Œ	
    Back side power relay also failing
    based on meter readings and 
    power flow indicator
    

    Saturday, March 23, 2019 12:58 AM

Answers

  • Not optimal but works:

    // test
    let
        Source = Text.FromBinary(File.Contents("E:\Users\User\Desktop\test.txt")),
        #"Split Text" = Text.Split(Source, "Œ"),
        Custom1 = Table.FromRows(List.Split(List.Skip(#"Split Text", 1), 2)),
        #"Renamed Columns" = Table.RenameColumns(Custom1,{{"Column2", "post_text"}}),
        SplitToRecords = Table.TransformColumns(#"Renamed Columns",{{"Column1", each Record.FromTable(Csv.Document(_, [Delimiter="#(tab)", Columns = {"Name","Value"}]))}}),
        #"Expanded {0}" = Table.ExpandRecordColumn(SplitToRecords, "Column1", {"ticket_doc", "created_date", "post_id", "post_date", "problem_type", "ticket_owner_user", "ticket_ owner_ contact"}, {"ticket_doc", "created_date", "post_id", "post_date", "problem_type", "ticket_owner_user", "ticket_ owner_ contact"}),
        #"Replaced Value" = Table.ReplaceValue(#"Expanded {0}","#(cr)#(lf)"," ",Replacer.ReplaceText,{"post_text"}),
        #"Cleaned Text" = Table.TransformColumns(#"Replaced Value",{{"ticket_doc", Text.Clean, type text}, {"created_date", Text.Clean, type text}, {"post_id", Text.Clean, type text}, {"post_date", Text.Clean, type text}, {"problem_type", Text.Clean, type text}, {"ticket_owner_user", Text.Clean, type text}, {"ticket_ owner_ contact", Text.Clean, type text}, {"post_text", Text.Clean, type text}}),
        #"Trimmed Text" = Table.TransformColumns(#"Cleaned Text",{{"ticket_doc", Text.Trim, type text}, {"created_date", Text.Trim, type text}, {"post_id", Text.Trim, type text}, {"post_date", Text.Trim, type text}, {"problem_type", Text.Trim, type text}, {"ticket_owner_user", Text.Trim, type text}, {"ticket_ owner_ contact", Text.Trim, type text}, {"post_text", Text.Trim, type text}}),
        #"Replaced Value1" = Table.ReplaceValue(#"Trimmed Text","post_text ","",Replacer.ReplaceText,{"post_text"})
    in
        #"Replaced Value1"
    Just change path for your text file


    Maxim Zelensky Excel Inside

    Saturday, March 23, 2019 12:17 PM
  • Hi Mr.Idaho,

    you must have done something wrong with Maxims code, as it works. I've replaced his reference to a local file with the string you've pasted and everything works fine. So please re-check your code.

    Copy and paste this code into the advanced editor (replace everything there) and follow the steps:

    let
        
        Source = "Œticket_doc#(tab) 738473#(cr)#(lf)created_date#(tab) 8/20/2011#(cr)#(lf)post_id#(tab) 235051 #(cr)#(lf)post_date#(tab) 8/20/2011#(cr)#(lf)problem_type#(tab) Electrical#(cr)#(lf)ticket_owner_user#(tab) John James #(cr)#(lf)ticket_ owner_ contact#(tab) j@jay.com#(cr)#(lf)Œ#(tab)#(cr)#(lf)There is an issue with the back side fuse#(cr)#(lf)post_text box that is misbehaving#(cr)#(lf)Œticket_doc#(tab) 738422#(cr)#(lf)created_date#(tab) 8/20/2011#(cr)#(lf)post_id#(tab) 235051 #(cr)#(lf)post_date#(tab) 8/20/2011#(cr)#(lf)problem_type#(tab) Mechanical#(cr)#(lf)ticket_owner_user#(tab) John James #(cr)#(lf)ticket_ owner_ contact#(tab) j@jay.com#(cr)#(lf)Œ#(tab)#(cr)#(lf)Port side copling no long tight around back side H joint, must change#(cr)#(lf)post_text out looped bolt#(cr)#(lf)Œticket_doc#(tab) 738413#(cr)#(lf)created_date#(tab) 2/20/2013#(cr)#(lf)post_id#(tab) 235051 #(cr)#(lf)post_date#(tab) 2/20/2012#(cr)#(lf)problem_type#(tab) Electrical#(cr)#(lf)ticket_owner_user#(tab) John James #(cr)#(lf)ticket_ owner_ contact#(tab) j@jay.com#(cr)#(lf)Œ#(tab)#(cr)#(lf)Back side power relay also failing#(cr)#(lf)based on meter readings and #(cr)#(lf)power flow indicator",
    
    
        #"Split Text" = Text.Split(Source, "Œ"),
        Custom1 = Table.FromRows(List.Split(List.Skip(#"Split Text", 1), 2)),
        #"Renamed Columns" = Table.RenameColumns(Custom1,{{"Column2", "post_text"}}),
        SplitToRecords = Table.TransformColumns(#"Renamed Columns",{{"Column1", each Record.FromTable(Csv.Document(_, [Delimiter="#(tab)", Columns = {"Name","Value"}]))}}),
        #"Expanded {0}" = Table.ExpandRecordColumn(SplitToRecords, "Column1", {"ticket_doc", "created_date", "post_id", "post_date", "problem_type", "ticket_owner_user", "ticket_ owner_ contact"}, {"ticket_doc", "created_date", "post_id", "post_date", "problem_type", "ticket_owner_user", "ticket_ owner_ contact"}),
        #"Replaced Value" = Table.ReplaceValue(#"Expanded {0}","#(cr)#(lf)"," ",Replacer.ReplaceText,{"post_text"}),
        #"Cleaned Text" = Table.TransformColumns(#"Replaced Value",{{"ticket_doc", Text.Clean, type text}, {"created_date", Text.Clean, type text}, {"post_id", Text.Clean, type text}, {"post_date", Text.Clean, type text}, {"problem_type", Text.Clean, type text}, {"ticket_owner_user", Text.Clean, type text}, {"ticket_ owner_ contact", Text.Clean, type text}, {"post_text", Text.Clean, type text}}),
        #"Trimmed Text" = Table.TransformColumns(#"Cleaned Text",{{"ticket_doc", Text.Trim, type text}, {"created_date", Text.Trim, type text}, {"post_id", Text.Trim, type text}, {"post_date", Text.Trim, type text}, {"problem_type", Text.Trim, type text}, {"ticket_owner_user", Text.Trim, type text}, {"ticket_ owner_ contact", Text.Trim, type text}, {"post_text", Text.Trim, type text}}),
        #"Replaced Value1" = Table.ReplaceValue(#"Trimmed Text","post_text ","",Replacer.ReplaceText,{"post_text"})
    in
        #"Replaced Value1"


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Saturday, April 13, 2019 6:39 AM
    Moderator

All replies

  • Not optimal but works:

    // test
    let
        Source = Text.FromBinary(File.Contents("E:\Users\User\Desktop\test.txt")),
        #"Split Text" = Text.Split(Source, "Œ"),
        Custom1 = Table.FromRows(List.Split(List.Skip(#"Split Text", 1), 2)),
        #"Renamed Columns" = Table.RenameColumns(Custom1,{{"Column2", "post_text"}}),
        SplitToRecords = Table.TransformColumns(#"Renamed Columns",{{"Column1", each Record.FromTable(Csv.Document(_, [Delimiter="#(tab)", Columns = {"Name","Value"}]))}}),
        #"Expanded {0}" = Table.ExpandRecordColumn(SplitToRecords, "Column1", {"ticket_doc", "created_date", "post_id", "post_date", "problem_type", "ticket_owner_user", "ticket_ owner_ contact"}, {"ticket_doc", "created_date", "post_id", "post_date", "problem_type", "ticket_owner_user", "ticket_ owner_ contact"}),
        #"Replaced Value" = Table.ReplaceValue(#"Expanded {0}","#(cr)#(lf)"," ",Replacer.ReplaceText,{"post_text"}),
        #"Cleaned Text" = Table.TransformColumns(#"Replaced Value",{{"ticket_doc", Text.Clean, type text}, {"created_date", Text.Clean, type text}, {"post_id", Text.Clean, type text}, {"post_date", Text.Clean, type text}, {"problem_type", Text.Clean, type text}, {"ticket_owner_user", Text.Clean, type text}, {"ticket_ owner_ contact", Text.Clean, type text}, {"post_text", Text.Clean, type text}}),
        #"Trimmed Text" = Table.TransformColumns(#"Cleaned Text",{{"ticket_doc", Text.Trim, type text}, {"created_date", Text.Trim, type text}, {"post_id", Text.Trim, type text}, {"post_date", Text.Trim, type text}, {"problem_type", Text.Trim, type text}, {"ticket_owner_user", Text.Trim, type text}, {"ticket_ owner_ contact", Text.Trim, type text}, {"post_text", Text.Trim, type text}}),
        #"Replaced Value1" = Table.ReplaceValue(#"Trimmed Text","post_text ","",Replacer.ReplaceText,{"post_text"})
    in
        #"Replaced Value1"
    Just change path for your text file


    Maxim Zelensky Excel Inside

    Saturday, March 23, 2019 12:17 PM
  • Thank you so much for your help!  

    I keep getting this error:

    Expression.Error: The column 'Column2' of the table wasn't found.Details:    Column2
    Wednesday, March 27, 2019 3:45 AM
  • Please elaborate. 

    I just copied your sample to a plain text file and thats all. Everything works. At what step this error raised?


    Maxim Zelensky Excel Inside

    Wednesday, March 27, 2019 7:42 AM
  • Showing up on the  fourth "Rename" step. I might be missing something fundamental here.  I'm just wholesale replacing the text in the Advanced Editor box, right?


    And thank you again for your generous help.

    Wednesday, March 27, 2019 4:25 PM
  • Hi Mr.Idaho,

    you must have done something wrong with Maxims code, as it works. I've replaced his reference to a local file with the string you've pasted and everything works fine. So please re-check your code.

    Copy and paste this code into the advanced editor (replace everything there) and follow the steps:

    let
        
        Source = "Œticket_doc#(tab) 738473#(cr)#(lf)created_date#(tab) 8/20/2011#(cr)#(lf)post_id#(tab) 235051 #(cr)#(lf)post_date#(tab) 8/20/2011#(cr)#(lf)problem_type#(tab) Electrical#(cr)#(lf)ticket_owner_user#(tab) John James #(cr)#(lf)ticket_ owner_ contact#(tab) j@jay.com#(cr)#(lf)Œ#(tab)#(cr)#(lf)There is an issue with the back side fuse#(cr)#(lf)post_text box that is misbehaving#(cr)#(lf)Œticket_doc#(tab) 738422#(cr)#(lf)created_date#(tab) 8/20/2011#(cr)#(lf)post_id#(tab) 235051 #(cr)#(lf)post_date#(tab) 8/20/2011#(cr)#(lf)problem_type#(tab) Mechanical#(cr)#(lf)ticket_owner_user#(tab) John James #(cr)#(lf)ticket_ owner_ contact#(tab) j@jay.com#(cr)#(lf)Œ#(tab)#(cr)#(lf)Port side copling no long tight around back side H joint, must change#(cr)#(lf)post_text out looped bolt#(cr)#(lf)Œticket_doc#(tab) 738413#(cr)#(lf)created_date#(tab) 2/20/2013#(cr)#(lf)post_id#(tab) 235051 #(cr)#(lf)post_date#(tab) 2/20/2012#(cr)#(lf)problem_type#(tab) Electrical#(cr)#(lf)ticket_owner_user#(tab) John James #(cr)#(lf)ticket_ owner_ contact#(tab) j@jay.com#(cr)#(lf)Œ#(tab)#(cr)#(lf)Back side power relay also failing#(cr)#(lf)based on meter readings and #(cr)#(lf)power flow indicator",
    
    
        #"Split Text" = Text.Split(Source, "Œ"),
        Custom1 = Table.FromRows(List.Split(List.Skip(#"Split Text", 1), 2)),
        #"Renamed Columns" = Table.RenameColumns(Custom1,{{"Column2", "post_text"}}),
        SplitToRecords = Table.TransformColumns(#"Renamed Columns",{{"Column1", each Record.FromTable(Csv.Document(_, [Delimiter="#(tab)", Columns = {"Name","Value"}]))}}),
        #"Expanded {0}" = Table.ExpandRecordColumn(SplitToRecords, "Column1", {"ticket_doc", "created_date", "post_id", "post_date", "problem_type", "ticket_owner_user", "ticket_ owner_ contact"}, {"ticket_doc", "created_date", "post_id", "post_date", "problem_type", "ticket_owner_user", "ticket_ owner_ contact"}),
        #"Replaced Value" = Table.ReplaceValue(#"Expanded {0}","#(cr)#(lf)"," ",Replacer.ReplaceText,{"post_text"}),
        #"Cleaned Text" = Table.TransformColumns(#"Replaced Value",{{"ticket_doc", Text.Clean, type text}, {"created_date", Text.Clean, type text}, {"post_id", Text.Clean, type text}, {"post_date", Text.Clean, type text}, {"problem_type", Text.Clean, type text}, {"ticket_owner_user", Text.Clean, type text}, {"ticket_ owner_ contact", Text.Clean, type text}, {"post_text", Text.Clean, type text}}),
        #"Trimmed Text" = Table.TransformColumns(#"Cleaned Text",{{"ticket_doc", Text.Trim, type text}, {"created_date", Text.Trim, type text}, {"post_id", Text.Trim, type text}, {"post_date", Text.Trim, type text}, {"problem_type", Text.Trim, type text}, {"ticket_owner_user", Text.Trim, type text}, {"ticket_ owner_ contact", Text.Trim, type text}, {"post_text", Text.Trim, type text}}),
        #"Replaced Value1" = Table.ReplaceValue(#"Trimmed Text","post_text ","",Replacer.ReplaceText,{"post_text"})
    in
        #"Replaced Value1"


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Saturday, April 13, 2019 6:39 AM
    Moderator