none
Regular Expressions in Power Query RRS feed

  • Question

  • Hi, I already know that regular expressions are not available yet to be used in Power Query so I would like to get some ideas about how can I run a similar process. I am making an query that gets text from a file that was previously scanned and therefore, it's normal to get some wrong characters. As a example, I can use text.contains to check if the text has the word "Declaramos"(pt), but I also would like to return true if the text contained "DECLARAMOS"or "Declaranos". I will have to do the same with numbers to get invoice numbers in the text. Any idea about how doing it?

    Thank you very much!

    Wednesday, July 29, 2015 2:08 PM

Answers

  • @ Colin,

    I'd like to support this feature. Do you think this is the right one to vote for?: https://support.powerbi.com/forums/282523-bi-in-sql-vnext/suggestions/7260696-add-regular-expression-support-to-as-native-databa

    @ Renan,

    here are some mockup-ideas for the meantime :-)

    let
    // In table "Reference" sits your string to compare against "Declaramos"
        Ref = Excel.CurrentWorkbook(){[Name="Reference"]}[Content]{0}[Reference],
        RefCap = Text.Upper(Ref),
    // In table "Values" are sitting your values that you check to match
        Source = Excel.CurrentWorkbook(){[Name="Values"]}[Content],
        ChgType = Table.TransformColumnTypes(Source,{{"Values", type text}}),
        AddCapValue = Table.AddColumn(ChgType, "CapValues", each Text.Upper([Values])),
        CapitalMatch = Table.AddColumn(AddCapValue, "Capitals", each if [CapValues] = RefCap then "FullMatch" else ""),
        TextLength = Table.AddColumn(CapitalMatch, "TextLength", each if (Text.Length([Values])-Text.Length(RefCap))=0 then 1 else 0),
        ContainsCharacters = Table.AddColumn(TextLength, "ContainsCharacters", each List.Count(Text.PositionOfAny([CapValues],Text.ToList(RefCap),Occurrence.All))/Text.Length([CapValues])),
        StartsWith = Table.AddColumn(ContainsCharacters, "StartsWith", each if Text.Start([CapValues],2) = Text.Start(RefCap,2) then 1 else 0),
        EndsWith = Table.AddColumn(StartsWith, "EndsWith", each if Text.End([CapValues],2) = Text.End(RefCap,2) then 1 else 0),
        Match = Table.AddColumn(EndsWith, "MatchPerc", each if [Capitals] = "FullMatch" then 1 else ([TextLength]+[ContainsCharacters]+[StartsWith]+[EndsWith])/4),
        Return = Table.AddColumn(Match, "MatchCategory", each if [MatchPerc] > 0.6 then Ref else "")
    in
        Return

    With regards to your invoice numbers: If they contain some specific text-string you should include a match on this criteria as well.

    Link to file


    Imke


    Thursday, July 30, 2015 6:42 PM
    Moderator
  • Can't be done out of the box. Chris Webb created a "like" custom function. http://blog.crossjoin.co.uk/2014/05/27/implementing-a-basic-likewildcard-search-function-in-power-query/

    However, it's not designed to substitute any single character, or a specific character, if that's what you're after.

    Regex functions and fuzzy lookup join functions are at the top of my list for new functionality in Power Query. However, these functions are not at the top of the popularity request list, so I don't expect them to be implemented any time soon.

    Wednesday, July 29, 2015 5:02 PM

All replies

  • Can't be done out of the box. Chris Webb created a "like" custom function. http://blog.crossjoin.co.uk/2014/05/27/implementing-a-basic-likewildcard-search-function-in-power-query/

    However, it's not designed to substitute any single character, or a specific character, if that's what you're after.

    Regex functions and fuzzy lookup join functions are at the top of my list for new functionality in Power Query. However, these functions are not at the top of the popularity request list, so I don't expect them to be implemented any time soon.

    Wednesday, July 29, 2015 5:02 PM
  • @ Colin,

    I'd like to support this feature. Do you think this is the right one to vote for?: https://support.powerbi.com/forums/282523-bi-in-sql-vnext/suggestions/7260696-add-regular-expression-support-to-as-native-databa

    @ Renan,

    here are some mockup-ideas for the meantime :-)

    let
    // In table "Reference" sits your string to compare against "Declaramos"
        Ref = Excel.CurrentWorkbook(){[Name="Reference"]}[Content]{0}[Reference],
        RefCap = Text.Upper(Ref),
    // In table "Values" are sitting your values that you check to match
        Source = Excel.CurrentWorkbook(){[Name="Values"]}[Content],
        ChgType = Table.TransformColumnTypes(Source,{{"Values", type text}}),
        AddCapValue = Table.AddColumn(ChgType, "CapValues", each Text.Upper([Values])),
        CapitalMatch = Table.AddColumn(AddCapValue, "Capitals", each if [CapValues] = RefCap then "FullMatch" else ""),
        TextLength = Table.AddColumn(CapitalMatch, "TextLength", each if (Text.Length([Values])-Text.Length(RefCap))=0 then 1 else 0),
        ContainsCharacters = Table.AddColumn(TextLength, "ContainsCharacters", each List.Count(Text.PositionOfAny([CapValues],Text.ToList(RefCap),Occurrence.All))/Text.Length([CapValues])),
        StartsWith = Table.AddColumn(ContainsCharacters, "StartsWith", each if Text.Start([CapValues],2) = Text.Start(RefCap,2) then 1 else 0),
        EndsWith = Table.AddColumn(StartsWith, "EndsWith", each if Text.End([CapValues],2) = Text.End(RefCap,2) then 1 else 0),
        Match = Table.AddColumn(EndsWith, "MatchPerc", each if [Capitals] = "FullMatch" then 1 else ([TextLength]+[ContainsCharacters]+[StartsWith]+[EndsWith])/4),
        Return = Table.AddColumn(Match, "MatchCategory", each if [MatchPerc] > 0.6 then Ref else "")
    in
        Return

    With regards to your invoice numbers: If they contain some specific text-string you should include a match on this criteria as well.

    Link to file


    Imke


    Thursday, July 30, 2015 6:42 PM
    Moderator
  • Hi Imke, that's correct. Remember to give it three votes. :)
    Thursday, July 30, 2015 11:50 PM