Data parsing help, please

  • Question

  • I have imported data from a web scraper that I want to split out for further analysis in Excel, but there is a large amount of repeating text within one cell. I am trying to parse it out, but my scripting knowledge is rudimentary at best. The data comes in like so, where each INFO is a value I want to split into its own column...

    <span title=""INFO1&#10;INFO2 INFO3&#10;INFO4."" class=""example"" style=""nbsp;INFO5% nbsp;INFO6%; display: block;"" example=""2"" example_id=""INFO7"" type=""INFO8""><span title=""INFO1&#10;INFO2 INFO3&#10;INFO4."" class=""example"" style=""nbsp;INFO5% nbsp;INFO6%; display: block;"" example=""2"" example_id=""INFO7"" type=""INFO8"">

    This pattern repeats around 200 times within the same cell, and the number of repeats differs from cell to cell. On the other hand, the structure of each <span title= /span> line is always the same (and the data I am trying to extract is not underlined), so although the text for each INFO could be 2 characters or 30 characters, the delimiters are always exactly the same. The end result would then be around 1,600 columns (about 200 repeats × 8 values), where the column pattern INFO1 / INFO2 / INFO3 / INFO4 / INFO5 / INFO6 / INFO7 / INFO8 just repeats as many times as necessary. I also have around 100 rows of these to parse, so the process not only needs to repeat by adding columns, it also needs to apply to every row below.

    I can take one line of this repeating code and split it by delimiters just fine using the tools on the ribbon in Power Query, but there has to be a way to tell Power Query to repeat this process on the one cell every time <span title= appears and keep splitting the data out into columns all the way across the spreadsheet to the right. I have watched hours of Lynda videos and searched the M code examples, but I haven't seen anything about manually coding a step to repeat a number of times. Any help on this would be greatly appreciated.
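
    To make the request concrete, here is a minimal sketch of the "repeat the split for every <span title=" idea. It is written in Python rather than M, purely to illustrate the logic, and it rests on assumptions: the delimiters are exactly those in the sample above (doubled quotes, &#10; between title parts, "nbsp;" before INFO5 and INFO6), INFO2 contains no spaces, and the names SPAN, parse_cell and parse_rows are hypothetical.

```python
import re

# Assumed layout of one repeat, taken from the sample in the question.
# Each (.*?) captures one INFO value; the doubled quotes are literal.
SPAN = re.compile(
    r'<span title=""(.*?)&#10;(.*?) (.*?)&#10;(.*?)\."" class=""example"" '
    r'style=""nbsp;(.*?)% nbsp;(.*?)%; display: block;"" '
    r'example=""[^"]*"" example_id=""(.*?)"" type=""(.*?)"">'
)


def parse_cell(cell):
    """Return INFO1..INFO8 for every span in the cell, flattened into one list."""
    return [value for match in SPAN.finditer(cell) for value in match.groups()]


def parse_rows(cells):
    """Parse every cell and pad shorter rows with empty strings so all rows share a width."""
    rows = [parse_cell(cell) for cell in cells]
    width = max((len(row) for row in rows), default=0)
    return [row + [""] * (width - len(row)) for row in rows]
```

    Each cell becomes one output row whose length is 8 times the number of spans found, and rows with fewer repeats are padded on the right. The same shape (find every repeat, extract the eight values, flatten, then pad) is what the Power Query solution would need to reproduce.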




    • Edited by zharkins Tuesday, June 4, 2019 6:44 PM clarifying request
    Tuesday, June 4, 2019 6:34 PM
