none
Power Query Find/Replace Values using wildcard string RRS feed

  • Question

  • I'm using Power Query with MS Office 2013.  For a given column, I'd like to replace a string containing certain characters like *ABC* with a string such as "ABC" such that the entire contents of each cell is replaced with "ABC". The built-in Replace Values function doesn't seem to work with wildcard strings as an input option.

    Example:

    Initial String value = "ABCXYZ test123"

    Desired Output = ABC

    Initial String value = "123ABCXYZ"

    Desired Output = ABC

    etc.

    Thanks.

    Wednesday, March 4, 2015 6:59 PM

Answers

  • Sure.

    Create a named range "InputString" which contains one cell and has the value "ABC".

    Now update the M expression according to this following one:

    let inputString = Excel.CurrentWorkbook(){[Name="InputString"]}[Content][Column1]{0}, Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], #"Added Custom" = Table.AddColumn(Source, "Modified Values", each [Values]), replaceTextIfContains = (inputText, old, new) => if (inputText is text and Text.Contains(inputText, old)) then new else inputText, #"Replaced Value" = Table.ReplaceValue(#"Added Custom",inputString ,inputString, replaceTextIfContains,{"Modified Values"}) in #"Replaced Value



    Thursday, March 5, 2015 1:56 PM

All replies

  • Wednesday, March 4, 2015 8:59 PM
    Moderator
  • Following Imke Feldmann's proposal, below is an example of an M expression that imports Table1 where "Values" column contains the values you wish to transform.

    So we start with Table1 and import it to Power Query Editor.

    Then we modify the M expression to the following one:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Added Custom" = Table.AddColumn(Source, "Modified Values", each [Values]),
        replaceTextIfContainsABC = (inputText, old, new) =>
            
    	if (inputText is text and Text.Contains(inputText, old)) then
                new
            else
                inputText,
    
        #"Replaced Value" = Table.ReplaceValue(#"Added Custom","ABC","ABC",  replaceTextIfContainsABC,{"Modified Values"})
    in
        #"Replaced Value"

    The output is a table with another column with the desired text transformation (I preferred to show you the two columns side-by-side. You can modify the expression to perform the modification on the original column).

    Perhaps there are easier ways to implement it, but it works :)

    • Proposed as answer by Gil RavivMVP Thursday, March 5, 2015 9:35 AM
    Thursday, March 5, 2015 9:35 AM
  • Hi Gil,

    any idea if/how one could feed a table/range with lookup-values instead of hardcode "ABC"?


    Imke

    Thursday, March 5, 2015 1:47 PM
    Moderator
  • Sure.

    Create a named range "InputString" which contains one cell and has the value "ABC".

    Now update the M expression according to this following one:

    let inputString = Excel.CurrentWorkbook(){[Name="InputString"]}[Content][Column1]{0}, Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], #"Added Custom" = Table.AddColumn(Source, "Modified Values", each [Values]), replaceTextIfContains = (inputText, old, new) => if (inputText is text and Text.Contains(inputText, old)) then new else inputText, #"Replaced Value" = Table.ReplaceValue(#"Added Custom",inputString ,inputString, replaceTextIfContains,{"Modified Values"}) in #"Replaced Value



    Thursday, March 5, 2015 1:56 PM
  • Thanks!

    Imke

    Friday, March 6, 2015 7:06 AM
    Moderator
  • Hello,

    I have found this code useful but have a question.

    I would like to create a list of values that need replacement and to have users maintain that list in a table. I am trying to adapt your code to read the individual values of the parameter table and to make the necessary find and replace; however, I find that with the code you posted above only reads the first value from the parameter table.

    Here is what I am working with:

    let
        Source = Excel.Workbook(File.Contents("FILELOCATION"), null, true),
        #"LIQUID COFFEE RANK(2)_Sheet" = Source{[Item="LIQUID COFFEE RANK(2)",Kind="Sheet"]}[Data],
        #"Promoted Headers" = Table.PromoteHeaders(#"LIQUID COFFEE RANK(2)_Sheet"),
        #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"PROD", type text}, {"UNIVERSAL PROD CODE", Int64.Type}, {"BRAND DESCRIPTION", type text}, {"CONTAINER", type text}, {"PER", type text}, {"MKT", type text}, {"$ Vol", type number}, {"$ % Chg vs YAG", type number}, {"Unit Vol", Int64.Type}, {"Unit % Chg vs YAG", type number}, {"%ACV", type number}, {"%ACV Pt Chg vs YAG", type number}, {"UPC", type text}}),
    
        inputString = Query1[Original],
    
        Source2 = Excel.Workbook(File.Contents("FILELOCATION"), null, true){[Name="Table1"]}[Data],
        #"Added Custom" = Table.AddColumn(Source2, "Modified Values", each [MKT]),
        replaceTextIfContains = (inputText, old, new) =>
            
    	if (inputText is text and Text.Contains(inputText, old)) then
                new
            else
                inputText,
    
        #"Replaced Value" = Table.ReplaceValue(#"Added Custom",each inputString , inputString, replaceTextIfContains,{"Modified Values"})
    in
        #"Replaced Value"

    Is there a way to read in all of the replacement values and then compare each record against the multiple values?

    Thank you for any help you can provide!

    Friday, January 22, 2016 6:21 AM
  • Okay, 45 minutes later I think I have a solution. Essentially this solves two problems, it acts as a wildcard find and replace using the code above, and it uses multiple inputs from a table that has two columns "Original" and "New" from code that I saw on Chris Webb's blog here http://blog.crossjoin.co.uk/2014/06/25/using-list-generate-to-make-multiple-replacements-of-words-in-text-in-power-query/

    let
        Source = Excel.Workbook(), null, true),
        #"LIQUID COFFEE RANK(2)_Sheet" = Source{[Item="LIQUID COFFEE RANK(2)",Kind="Sheet"]}[Data],
        #"Promoted Headers" = Table.PromoteHeaders(#"LIQUID COFFEE RANK(2)_Sheet"),
        #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"PROD", type text}, {"UNIVERSAL PROD CODE", Int64.Type}, {"BRAND DESCRIPTION", type text}, {"CONTAINER", type text}, {"PER", type text}, {"MKT", type text}, {"$ Vol", type number}, {"$ % Chg vs YAG", type number}, {"Unit Vol", Int64.Type}, {"Unit % Chg vs YAG", type number}, {"%ACV", type number}, {"%ACV Pt Chg vs YAG", type number}, {"UPC", type text}}),
    
        WordsToReplace = Query1[Original],
    
        WordsToReplaceWith = Query1[New],
    
        Source2 = Excel.Workbook(), null, true){[Name="Table1"]}[Data],
        #"Added Custom" = Table.AddColumn(Source2, "Modified Values", each [MKT]),
    
        ReplacementFunction = (InputText, Position) => 
        let
            cond_test = if (InputText is text and Text.Contains(InputText, WordsToReplace{Position})) then
                InputText
            else
                WordsToReplace{Position},
    
         ReplaceText = Text.Replace(
                InputText, 
                cond_test,
                WordsToReplaceWith{Position})
        in
    
         if Position=List.Count(WordsToReplace)-1 
          then 
    
          ReplaceText 
          else 
    
          @ReplacementFunction(ReplaceText, Position+1),
    
     Output = Table.AddColumn(Source2, "Modified Values", each ReplacementFunction([MKT], 0))
    in
        Output

    What is great is that all you need to do is update the table and the values change automatically! This will allow my users to create standard data cleanup mechanisms by cleaning up product descriptions, market names without having to code or use the Power Query editor.

    Friday, January 22, 2016 7:13 AM
  • Does anyone know how you could use a wild card within the Text.Contains?

    For example

    [myfield]

    4400

    NVidia 4400

    4200

    if(Text.Contains([myfield],"4*00)) then "4000 Series"

    Thursday, February 25, 2016 6:58 PM
  • For this specific example your could construct it like this:

    if Text.Start([myfield],1)= "4" and Text.End([myfield],2)="00" and Text.Length([myfield])=4 then "4000 Series" else ...


    Imke Feldmann TheBIccountant.com

    Thursday, February 25, 2016 7:03 PM
    Moderator
  • Hi Gil,

    Just a quick question.

    In the function you defined "replaceTextIfContainsABC ", there are 3 parameters.

    However, in the code :

    #"Replaced Value" = Table.ReplaceValue(#"Added Custom","ABC","ABC",  replaceTextIfContainsABC,{"Modified Values"})

    there are no parameters passed into the code.

    Could you please tell me how that works?

    Cheers,

    Tom Sun

    Monday, December 19, 2016 11:26 PM
  • In general, when a function is passed as an argument to a surrounding function, the surrounding function supplies the arguments. In this case, replaceTextIfContainsABC is a Replacer function; the surrounding function, Table.ReplaceValue "knows" that 3 arguments must be passed to a Replacer function: value, old, new.
    So the internal logic of Table.ReplaceValue takes care of passing those arguments.
    Tuesday, December 20, 2016 7:33 AM
  • Hello, 

    I have a column of dates, some include a time. I want to remove all the time values. 

    I was trying to use #"Replace Value" but that requires a line for every instance of time. After view above I don't see anything that might help, or don't understand M enough to make it help. I thought one could use wild card for numbers like ##:## for 09:45 . See small sample below of some that have a time value.

    I see this is old, but I hope someone can guide me. Thanks, Robert

    Date
    29 Jul. 2017 09:30
    25 Jul. 2017 19:30
    25 Jul. 2017 14:15
    20 Jul. 2017 16:45
    22 Jul. 2017 08:30
    22 Jul. 2017 08:30
    20 Jul. 2017 16:30
    20 Jul. 2017 12:45
    
    


    Monday, October 16, 2017 7:28 PM
  • let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        RemovedTimeComponent = Table.TransformColumns(Source, {"Date", each Text.BeforeDelimiter(_, " ", {0, RelativePosition.FromEnd})}),
        ChangedType = Table.TransformColumnTypes(RemovedTimeComponent,{{"Date", type date}})
    in
        ChangedType

    Monday, October 16, 2017 7:53 PM
  • Hello World!

    Thanks for the great expertise above. 

    I'm just looking for a solution to the following problem:
    A table column may contain entries with HTML content with multiple HTML tags.

    Do you have a tip/suution how to remove all HTML tags so that only the real text-content is displayed in a new column? 

    For example:
    <P>TEST DESCRIPTION </P><P><STRONG>TEST DESCRIPTION </STRONG></P><P><EM>TEST DESCRIPTION </EM></P><P><U>TEST DESCRIPTION </U></P>

    should result in:
    TEST DESCRIPTION TEST DESCRIPTION TEST DESCRIPTION TEST DESCRIPTION

    Thursday, February 13, 2020 9:40 AM
  • Hi freiundwild

    Until someone suggests something more straightforward/efficient. Assuming your HMTL string is in [Column1]:

    AddedTextOnly = Table.AddColumn(PreviousStepName, "TextOnly", each
        Text.Combine(
            List.Transform(
                List.Select(
                    Text.Split([Column1],">"),
                    each not Text.StartsWith(_,"<")
                ),
            each Text.BeforeDelimiter(_,"<")
            )
        ), type text
    )

    If you're in the PowerBI environment you might have - not sure - an option with Hmtl.Table - see Chris Webb's Removing HTML Tags From Text In Power Query/Power BI


    • Edited by Lz._ Thursday, February 13, 2020 11:35 AM
    Thursday, February 13, 2020 11:35 AM
  • That works great for me! Thanks so much. If someone else need to replace HTML strings here you are.

    let
        Source = Content_With_HTML,
        #"Replace NULL" = Table.ReplaceValue(Source,null,"",Replacer.ReplaceValue,{"Description"}),
        #"Replace &AUml;" = Table.ReplaceValue(#"Replace NULL","&Auml;","Ä",Replacer.ReplaceText,{"Description"}),
        #"Replace &UUml;" = Table.ReplaceValue(#"Replace &AUml;","&Uuml;","Ü",Replacer.ReplaceText,{"Description"}),
        #"Replace &OUml;" = Table.ReplaceValue(#"Replace &UUml;","&Ouml;","Ö",Replacer.ReplaceText,{"Description"}),
        #"Replace &auml;" = Table.ReplaceValue(#"Replace &OUml;","&auml;","ä",Replacer.ReplaceText,{"Description"}),
        #"Replace &uuml;" = Table.ReplaceValue(#"Replace &auml;","&uuml;","ü",Replacer.ReplaceText,{"Description"}),
        #"Replace &ouml;" = Table.ReplaceValue(#"Replace &uuml;","&ouml;","ö",Replacer.ReplaceText,{"Description"}),
        #"Replace &nbsp;" = Table.ReplaceValue(#"Replace &ouml;","&nbsp;"," ",Replacer.ReplaceText,{"Description"}),
        #"Replace &amp;" = Table.ReplaceValue(#"Replace &nbsp;","&amp;","&",Replacer.ReplaceText,{"Description"}),
        #"Replace &quot;" = Table.ReplaceValue(#"Replace &amp;","&quot;","""",Replacer.ReplaceText,{"Description"}),
        #"Replace &apos;" = Table.ReplaceValue(#"Replace &quot;","&apos;","'",Replacer.ReplaceText,{"Description"}),
        #"Replace </p>" = Table.ReplaceValue(#"Replace &apos;","</p>","#(cr)",Replacer.ReplaceText,{"Description"}),
        #"Replace <br>" = Table.ReplaceValue(#"Replace </p>","<br>","#(cr)",Replacer.ReplaceText,{"Description"}),
        #"Replace <li>" = Table.ReplaceValue(#"Replace <br>","<li>","-#(00A0)",Replacer.ReplaceText,{"Description"}),
        #"Replace </li>" = Table.ReplaceValue(#"Replace <li>","</li>","#(cr)",Replacer.ReplaceText,{"Description"}),
        #"Replace HTML Tags" = Table.AddColumn(#"Replace </li>", "DescriptionTextOnly", each Text.Combine(
            List.Transform(
                List.Select(
                    Text.Split([Description],">"),
                    each not Text.StartsWith(_,"<")
                ),
            each Text.BeforeDelimiter(_,"<")
            )
        )),
        #"Replace &lt;" = Table.ReplaceValue(#"Replace HTML Tags","&lt;","<",Replacer.ReplaceText,{"DescriptionTextOnly"}),
        #"Replace &gt;" = Table.ReplaceValue(#"Replace &lt;","&gt;",">",Replacer.ReplaceText,{"DescriptionTextOnly"}),
        #"Replace Linefeed" = Table.ReplaceValue(#"Replace &gt;","#(lf)","",Replacer.ReplaceText,{"DescriptionTextOnly"})
    in
        #"Replace Linefeed"

    Thursday, February 13, 2020 5:47 PM
  • You're welcome & Thanks for upvoting - I guess it's you...

    Thanks for posting your code. I didn't scrutanize it but you don't seem to only remove HTML tags there, correct?

    Thursday, February 13, 2020 6:51 PM
  • ;)

    I'm just replacing a bit html-code to get cleared content readable, i.e. breaks, quotes, list elements and for the german keyboard layout ä, ü, ö letters. That was the easy part ;)

    Thursday, February 13, 2020 8:10 PM
  • and for the german keyboard layout ä, ü, ö letters. That was the easy part ;)

    I see ;-) Thanks
    Thursday, February 13, 2020 8:55 PM
  • Hello Germany :)

    Suggestion to replace :) at least your 1st set of Table.ReplaceValue (not checked further down) + will make your code more readable... Only the first 4 German chars are covered in the below code. You just have to add the others to the OldChars and NewChars lists:

        OldChars = List.Buffer({"&Auml;","&Uuml;","&Ouml;","&auml;"}),
        NewChars = List.Buffer({"Ä","Ü","Ö","ä"}),
        ReplacedChars = Table.ReplaceValue(PreviousStepName, each [Description], each
            List.Accumulate(OldChars, [Description],
                (stateString,curOldChar)=> Text.Replace(stateString,curOldChar,NewChars{List.PositionOf(OldChars,curOldChar)})
            ),
            Replacer.ReplaceText,{"Description"}
        )

    + Question: is the following really required - question more for you:
    Table.ReplaceValue(Source,null,"",Replacer.ReplaceValue,{"Description"})

     

    Friday, February 14, 2020 2:10 PM