none
Parsing values in quotes with embedded delimeters RRS feed

  • Question

  • I've been trying several different approaches to this one, but can't quite figure it out. I have a list that has name/value pairs like the following:

    "CreationTime":"2017-12-09T06:41:15"

    "DeletionTime":"2017-13-09T06:41:15"

    I want to run a routine to parse this into 2 columns CreationTime and DeletionTime with their associated values. 

    Suggestions?

    Thursday, January 18, 2018 10:18 PM

Answers

  • Hi Henry. Try this. It seems to work for the sample content you pasted above. The trick is to just use Json.Document against the third column (since it appears to be JSON), then expand the nested fields.

    let
        Source = Lines.FromBinary(File.Contents("C:\YOUR_FILE_HERE.txt")),
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Split Column by Delimiter" = Table.SplitColumn(#"Converted to Table", "Column1", Splitter.SplitTextByEachDelimiter({","}, QuoteStyle.None, false), {"Column1.1", "Column1.2"}),
        #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}}),
        #"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type", "Column1.2", Splitter.SplitTextByEachDelimiter({","}, QuoteStyle.None, false), {"Column1.2.1", "Column1.2.2"}),
        #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Column1.2.1", type text}, {"Column1.2.2", type text}}),
        #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type1", [PromoteAllScalars=true]),
        #"Changed Type2" = Table.TransformColumnTypes(#"Promoted Headers",{{"CreationDate", type datetime}, {"UserIds", type text}, {"Operations,AuditData", type text}}),
        #"Added Custom" = Table.AddColumn(#"Changed Type2", "Custom", each Json.Document([#"Operations,AuditData"])),
        #"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom", "Custom", {"CreationTime", "Operation", "ObjectId"}, {"CreationTime", "Operation", "ObjectId"})
    in
        #"Expanded Custom"

    Ehren

    Wednesday, January 31, 2018 5:36 PM
    Owner

All replies

  • Hi Henry,

    Split by delimiter, pick ":" as the delimiter, and make sure the Quote Character is set to " (which it should be by default). Then use Pivot Column to turn the rows into columns (making sure that the Advanced option is set to Don't Aggregate).

    Ehren

    Friday, January 19, 2018 12:31 AM
    Owner
  • This was very helpful.  Thanks!   So, now I have an added twist.  I have a column that contains a list of values lke that.  I'd like to break out the list into multiple columns.  The logic above would need to run on each element in the list.  So, instead of having individual columns that have that name /value pair, I have a single column called Audit Data and after transforming, the values in that column are a list.  Each row in the list is something like this

    "CreationTime":"2017-12-09T06:41:15"

    "DeletionTime":"2017-13-09T06:41:15"

    So, I'm trying to embed the delimiter and pivot code into a "for each" loop across each item in the list and create a bunch of columns instead of the list. 

    Make sense?

    Friday, January 26, 2018 5:35 PM
  • Hi Henry,

    When you say "list of values", do you mean a single Text value containing more than one name/value pair, or a nested M list?

    Ehren

    Friday, January 26, 2018 5:54 PM
    Owner
  • Nested M List
    Friday, January 26, 2018 6:01 PM
  • Ok. Have you tried expanding the list (into either rows or columns)? You should be able to do that via the little icon to the right of the column name.

    Ehren

    Friday, January 26, 2018 6:43 PM
    Owner
  • Thanks, Ehren.  I never noticed those.  They are helpful, but I still can't seem to get to where I want to be.  I'm thinking maybe I should start from scratch.  My goal is to parse a CSV that looks like the following (header and 2 rows pasted).

    The third column, "AuditData" is the one I'm trying to parse out into multiple columns "CreationTime", "Operation", "ObjectID".  There are actually quite a few more embedded columns, but I stripped them out for the example.  For what it's worth, this comes from the Audit Log search in O365.  I'm trying to get some actual information from that export. 

    CreationDate,UserIds,Operations,AuditData
    2017-12-09T06:41:15.0000000Z,user1@company.com,{"CreationTime":"2017-12-09T06:41:15","Operation":"FileAccessed","ObjectId":"https:\/\/xyz.sharepoint.com\/Corp\/facts\/Reports\/Test File Name.xlsm"}
    2017-12-09T05:33:23.0000000Z,user2@company.com,{"CreationTime":"2017-12-10T05:33:23","Operation":"FileAccessed","ObjectId":"https:\/\/xyz.sharepoint.com\/Corp\/facts\/Reports\/Test File Name2.xlsm"}

    Wednesday, January 31, 2018 3:20 PM
  • Hi Henry. Try this. It seems to work for the sample content you pasted above. The trick is to just use Json.Document against the third column (since it appears to be JSON), then expand the nested fields.

    let
        Source = Lines.FromBinary(File.Contents("C:\YOUR_FILE_HERE.txt")),
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Split Column by Delimiter" = Table.SplitColumn(#"Converted to Table", "Column1", Splitter.SplitTextByEachDelimiter({","}, QuoteStyle.None, false), {"Column1.1", "Column1.2"}),
        #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}}),
        #"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type", "Column1.2", Splitter.SplitTextByEachDelimiter({","}, QuoteStyle.None, false), {"Column1.2.1", "Column1.2.2"}),
        #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Column1.2.1", type text}, {"Column1.2.2", type text}}),
        #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type1", [PromoteAllScalars=true]),
        #"Changed Type2" = Table.TransformColumnTypes(#"Promoted Headers",{{"CreationDate", type datetime}, {"UserIds", type text}, {"Operations,AuditData", type text}}),
        #"Added Custom" = Table.AddColumn(#"Changed Type2", "Custom", each Json.Document([#"Operations,AuditData"])),
        #"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom", "Custom", {"CreationTime", "Operation", "ObjectId"}, {"CreationTime", "Operation", "ObjectId"})
    in
        #"Expanded Custom"

    Ehren

    Wednesday, January 31, 2018 5:36 PM
    Owner