Question about PowerShell and regex with variables.

  • Question

  • We are indexing emails and we receive the following subject:

    Bericht over zaaknr., 1409, 21-11-2018 11:33 test email

    With the following script we try to retrieve the number 1409 from the subject.

    $object.IndexValue["dossier_id"] = $object.Object.Subject
    $regex = [regex]"<(?!\s*$)[0-9]\d{1,}/>"
    $result = $regex.Match($dossier_id)
    if($result.Success){
     $object.IndexValue["dossier_id"] = $result.Captures[0].Groups[1].Value
    }

    When we export the values, we see that the index value dossier_id still has the value:

    Bericht over zaaknr., 1409, 21-11-2018 11:33 test email

    It seems the regex isn't working. Can someone help me out?

    Wednesday, November 21, 2018 1:01 PM

All replies

  • Like this in PowerShell:

    $dossier_id = 'Bericht over zaaknr., 1409, 21-11-2018 11:33 test email'
    if($dossier_id -match '\.,\s+(\d{4,4}),'){
        $object.IndexValue["dossier_id"] = $matches[1]
    }


    \_(ツ)_/

    Wednesday, November 21, 2018 1:13 PM
  • The current regex you're using is too complicated (and probably not even correct) for what you're trying to achieve. I suggest another approach.

    ($object.IndexValue["dossier_id"] -split ',\s?')[1]
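
A minimal sketch of what the split does, using the sample subject from the question in a standalone variable (`$subject` and `$parts` are illustrative names, not part of the original script). It assumes the case number is always the second comma-separated field, so a subject with an extra comma before the number would return something else:

```powershell
$subject = 'Bericht over zaaknr., 1409, 21-11-2018 11:33 test email'

# Split on a comma followed by an optional whitespace character.
$parts = $subject -split ',\s?'

# Index 0 is the text before the first comma; index 1 is the case number.
$parts[1]   # 1409
```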



    Wednesday, November 21, 2018 1:43 PM
    Try plugging the regex into http://regex101.com. For one thing, it reports an error about the unescaped forward slash at the end. '\d{1,}' matches any run of one or more digits.
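
The same check can be run directly in PowerShell. The sketch below uses the sample subject in a standalone variable (`$subject`, `$original`, `$simple` and `$m` are illustrative names). It suggests the original pattern fails mainly because it requires a literal `<` before the digits and `/>` after them, and because it has no capture group for `Groups[1]` to return. The posted script also calls `Match` on `$dossier_id`, which the snippet never assigns. In the .NET regex engine the forward slash itself needs no escaping; regex101 flags it because its default flavor treats `/` as a delimiter.

```powershell
$subject = 'Bericht over zaaknr., 1409, 21-11-2018 11:33 test email'

# The pattern from the question: the literal '<' and '/>' never occur in the subject.
$original = [regex]'<(?!\s*$)[0-9]\d{1,}/>'
$original.Match($subject).Success      # False

# A plain digit run inside a capture group finds the first number in the subject.
$simple = [regex]'(\d+)'
$m = $simple.Match($subject)
$m.Success                             # True
$m.Groups[1].Value                     # 1409
```

Note that `(\d+)` simply takes the first run of digits, so it relies on the case number appearing before the date.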

    • Edited by JS2010 Wednesday, November 21, 2018 3:54 PM
    Wednesday, November 21, 2018 3:45 PM
  • The Regex I posted does exactly what was asked.  I think the OP just fell asleep.


    \_(ツ)_/

    Wednesday, November 21, 2018 3:57 PM
    More a matter of style in this case, but if you're using a quantifier to match an exact number of "something" (in this case, digits), you should use "\d{4}" instead of the range "\d{4,4}". If you're running the regex against a LOT of data, it's even faster to not ask the regex engine to do any counting at all and just use "\d\d\d\d".
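
As a quick illustration, the three spellings can be compared on the sample subject. This sketch only shows that they capture the same value; it does not measure speed, and `$subject`, `$patterns` and `$results` are illustrative names:

```powershell
$subject = 'Bericht over zaaknr., 1409, 21-11-2018 11:33 test email'

# Three ways to ask for exactly four digits after '., '.
$patterns = '\.,\s+(\d{4,4}),', '\.,\s+(\d{4}),', '\.,\s+(\d\d\d\d),'

# Collect the first capture group for each pattern that matches.
$results = foreach ($p in $patterns) {
    if ($subject -match $p) { $Matches[1] }
}

$results   # 1409, once per pattern
```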

    I spent a lot of time running complex regexes against massive amounts of data in Perl, and the time spent simplifying regexes (and avoiding backtracking!) is time well spent for anything except one-off scripts.

    Not a critique, just an FYI. :-)


    --- Rich Matheisen MCSE&I, Exchange Ex-MVP (16 years)

    Wednesday, November 21, 2018 7:36 PM
  • Hi,

    Was your issue resolved?

    If you resolved it using our solution, please "mark it as answer" to help other community members find the helpful reply quickly.

    If you resolved it using your own solution, please share your experience and solution here. It will be very beneficial for other community members who have similar questions.

    If not, please reply and tell us the current situation so that we can provide further help.

    Best Regards,

    LEE


    Just do it.

    Tuesday, November 27, 2018 2:56 AM