none
Can I use regular expression in Power Query ? RRS feed

  • Question

  • Can I use regular expression in Power Query ?

    I want to use regular expression to get value from text.
    like, "<title>([a-z0-9 ]+)</title>", for get title value from HTML text.
    like, "<a href=\"(.+)\"", for get link url from HTML text.

    Regards,
    Yoshihiro Kawabata 

    Monday, October 14, 2013 7:58 PM

Answers

  • Hi Yoshihiro, we don't offer that functionality today but it's a popular request. Usually a regex step can be replaced with steps that our tool does offer though it's not nearly as quick as a regex would be. If you need specific help getting your data parsed, please let us know. Thanks for the feedback!

    Monday, October 14, 2013 8:34 PM
    Moderator

All replies

  • Hi Yoshihiro, we don't offer that functionality today but it's a popular request. Usually a regex step can be replaced with steps that our tool does offer though it's not nearly as quick as a regex would be. If you need specific help getting your data parsed, please let us know. Thanks for the feedback!

    Monday, October 14, 2013 8:34 PM
    Moderator
  • Thank you , Ben.

    Now, I use a lot of steps filter/replace/filter/spllite/etc.. instead of regex.
    like making lists of specific url pattern from HTML text.

    Regards,
    Yoshihiro Kawabata

    Monday, October 14, 2013 8:42 PM
  • 3 years later, with the role of Power Query expanding, I need an update on this.  As this is integral in Power BI, we would like to use this feature.  There are MANY open source offerings that will get us this capability.  Where can we escalate this feature in the product backlog?
    Wednesday, August 3, 2016 6:55 PM
  • 3 years later, with the role of Power Query expanding, I need an update on this.  As this is integral in Power BI, we would like to use this feature.  There are MANY open source offerings that will get us this capability.  Where can we escalate this feature in the product backlog?

    Well, you can add your two cents to one or both of these forums:

    https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/8892295-regular-expressions-support-in-power-query

    https://ideas.powerbi.com/forums/282523-bi-in-sql-vnext/suggestions/7260696-add-regular-expression-support-to-as-native-databa

    Don't expect much though. Some requests have statuses like "UNDER REVIEW," "PLANNED," or "STARTED."

    My optimistic interpretation of these statuses:
    1) "UNDER REVIEW" - We've had a round-table discussion about this feature. We've filed it away for now - but if there's enough demand, we'll put it at the bottom of the non-priority list.
    2) "PLANNED" - We've put this request in the non-priority list. We're planning to implement the feature within the next five years - hopefully.
    3) "STARTED" - We've started work on this feature - the functional spec is complete, but we still have to write the detailed design spec. Development will begin within the next year or so - with any luck.
    4) No status at all (Regex falls into this category!) - What's your expected lifespan? If you're within a decade of retiring, dont even bother asking for this feature. :)


    Wednesday, August 3, 2016 9:41 PM
  • 4 years later (7 years total), and nothing whatsoever has happened with this. Is there a Microsoft partner that offers a paid app to do this? At this point I have to conclude that there is a concrete reason Microsoft does not want to integrate this (super-duper-basic on all other platforms) functionality. Don't get me wrong, I LOVE Microsoft. I have since Windows 3.1 (yes, I'm old), but things like this drive me batty. If it's that much bother then open source PowerQuery and I'll add the regex code myself!
    Monday, February 10, 2020 9:44 PM
  • Expect nothing, and you won't be disappointed. Well, you still might, but at least you'll be mentally prepared for the inertia.
    Monday, February 10, 2020 9:45 PM
  • There's a workaround I'm using to use Regular Expressions, but its not very efficient (so this may be worse than not using a regex at all):

    let
        Match = (input, pattern, flags) => Json.Document(Web.Page("
            <script>
            var dq = String.fromCharCode(34);
            var input = '" & Text.Replace(Text.Replace(input, "\", "\\"), "'", "\'") & "';
            var pattern = '" & Text.Replace(Text.Replace(pattern, "\", "\\"), "'", "\'") & "';
            var flags = '" & Text.Replace(Text.Replace(flags, "\", "\\"), "'", "\'") & "';
            var re = new RegExp(pattern, flags);
            var match = re.exec(input);
            try {
                document.write(stringify(match))
            }
            catch (e) {
                document.write(e.message);
            }
            function stringify(value) {
                if (value === null) return 'null';
                if (typeof value === 'undefined') return 'undefined';
                if (typeof value === 'number') return value.toString();
                if (typeof value === 'boolean') return value.toString();
                if (typeof value === 'string') return dq + value.replace(new RegExp(dq, 'g'), '\\' + dq) + dq;
                var s = '';
                if (value instanceof Array) {
                    for (var i = 0; i < value.length; i++) {
                        if (s) s += ', ';
                        s += stringify(value[i]);
                    }
                    return '[' + s + ']';
                }
                else {
                    for (var p in value) {
                        if (value.hasOwnProperty(p)) {
                            if (s) s += ', ';
                            s += stringify(p) + ': ' + stringify(value[p]);
                        }
                    }
                    return '{' + s + '}';
                }
            }
            </script>
        ")[Data]{0}[Children]{0}[Children]{1}[Text]{0}),
    
        m = Match("input", "pattern", "")
    in
        m


    Monday, March 2, 2020 1:40 AM
  • Still nothing that I can see.  I have order numbers that look like xORD-2020-99999 and I'd REALLY like to peel these out of entity ID's that have this as a prefix.  I agree, that this is SUCH a basic functionality.  There are SO many RegEx pieces already built into .Net Core or Framework.  Does MS not even eat their own dogfood?  What gives?
    Saturday, September 19, 2020 3:57 PM