none
How to replace text string in a column by identifying a substring using as search/replacement value another list RRS feed

  • Question

  • Hi

    I wonder if someone could help me with the answer to this question. I have a database that I want to clean based on the records of each cell if each cell contains certain text (I'll call this database report), and I have another table where I have those values of the certain text I'm looking for (I'll call this database directory) associated with a clean description.

    So, something like this:

    Report table sample

    Description

    75y65importedhorsetoy456

    86856nationalballerinadoll48568

    568509importedballerinadoll09342

    8346toycomputersoft49568

    Directory

    String to look for         Clean type

    horsetoy                     Horse toy

    ballerinadoll                Ballerina doll

    computer                    Computer

    I have so many options in the directory table that if I want to do a conditional column, it won't run.

    Is there any way I can code something so that my report table has this final result:

    Report table sample

    Description                                             Clean type

    75y65importedhorsetoy456                     Horse toy

    86856nationalballerinadoll48568              Ballerina doll

    568509importedballerinadoll09342           Ballerina doll

    Thanks!!

    Barbara

    Thursday, September 5, 2019 4:44 PM

All replies

  • Hi Barbara,

    Assume an Excel table named 'Report_table', and another table named 'Directory'. Import the tables into Power Query and try the following code:

    let
       Source = Excel.CurrentWorkbook(){[Name="Report_Table"]}[Content],
       stringToLookForList = List.Buffer(Directory[String to look for]),
       addedCustom = Table.AddColumn(
          Source,
          "String Match",
          each List.Select(
             stringToLookForList,
             (current)=> Text.Contains(
                [Description],
                current,
                Comparer.OrdinalIgnoreCase
             )
          ){0}?
       ),
       mergedQueries = Table.NestedJoin(addedCustom, {"String Match"}, Directory, {"String to look for"}, "Directory", JoinKind.LeftOuter),
       expandedDirectory = Table.ExpandTableColumn(mergedQueries, "Directory", {"Clean type"}),
       removedColumn = Table.RemoveColumns(expandedDirectory,{"String Match"})
    in
       removedColumn

    Thursday, September 5, 2019 6:15 PM
  • Hi Colin!

    Thank you for your answer!

    The code did make the lookup :), however I encountered 2 issues that I had not foreseen.

    1. I think I need it to discriminate the directory by description length as some substrings are similiar (for example), for the descriptions below:

    Description                                             Clean type (desired)                   Clean type (current result)

    75y65importedhorsetoy456                     Horse toy                                             Horse toy

    86856importedhorsetoyandrider48568   Horse toy and rider                              Horse toy

    2. Apparently the left outer join is duplicating some values, but I don't know why. 

    Is there a way to prevent this 2 issues?

    Thank you so much!

    Thursday, September 5, 2019 8:19 PM
  • Hi Barb,

    The nature of the technique requires that text with the same starting text must be placed in descending length order in the dictionary. So the description for 'Horse toy and rider' should appear before the description for 'Horse Toy' and so on. 

    Hope this explanation helps.

    Thursday, September 5, 2019 10:20 PM
  • Hi Barbss10,

    did Colin's suggestion work for you?

    Btw: to achieve the descending order, you can sort by text length (make sure to buffer your sort, otherwise it might not stick: https://www.excelando.co.il/powerquery-remove-duplicates-bug-workaround/59/ )


    Imke Feldmann - MVP Data Platform (PowerBI) - http://www.TheBIccountant.com 

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful - Thanks!

    Sunday, September 29, 2019 7:24 AM
    Moderator