locked
Split function -split a word with dot into two strings RRS feed

  • Question

  • Hello Team,

    I have facing some issue with strings and split strings on dot.

    I need to import a column from excel sheet which looks like this 601,601,607, 601.02,602.01.601.20, 701,702, 701.01" and so on.

    There are two possible values either the input number has a dot or it does not have a dot.

    My algorithm is as below,

    1. Read excel cell, compare with regex

    2 If regular expression match split the string on dot else color the cell background to orange,

    goto step 4

    3. check if the first part of the string is part of the hash table which has approved list if match\contains the value color corresponding cell to green , if no color column blue

    4. if more cells goto step 1 else goto step 5

    5. exit function, close excel.


    My code snippet is as follows

    if($worksheet.cells.Item($i, $RoomNumberColumn).value2 -match "^[0-9]{3}([a-c]{1})?([.]{0,1})?([0]{1})?([1-20{1})?$")
                          
                      {
                       
                      $option = [System.StringSplitOptions]::RemoveEmptyEntries
    
                      [String]$RoomNumber= $worksheet.cells.Item($i, $RoomNumberColumn).value2
                      [string]$RoomNumberParts=$RoomNumber.split(".",2,$option)
                      Write-Host "Value of Number--> $RoomNumberParts and" $RoomNumberParts[0]
                      if($hashRoomNumber.ContainsKey($RoomNumberParts[0]))
    
                      {$worksheet.cells.Item($i, $RoomNumberColumn).Interior.ColorIndex = 35;}
                      Else {$worksheet.cells.Item($i, $RoomNumberColumn).Interior.ColorIndex = 37;}
                      }
     
            
            Else
            {
            $worksheet.cells.Item($i, $RoomNumberColumn).Interior.ColorIndex = 45;
            }
     


    Question: Why is my string not breaking up into the two parts eg. 607 and 02  and instead when I try to access $RoomNumberParts[0] I am getting a 6 and not value 607. Kindly assist me to split the string into two parts only  601 and 02

    thanks,

    RJ



    Monday, May 6, 2019 2:38 AM

All replies

  • There seems to be a typo in your regex and it's invalid.  But I think you want to use "\." instead of "." in your regex.  "." means any character.  "\." means a literal period.

    • Edited by JS2010 Monday, May 6, 2019 3:08 AM
    Monday, May 6, 2019 3:08 AM
  • Hi,

    I checked through a regular expression checker tool and it said [.] and [\.] are equivalent and its not giving me any different results.

    Monday, May 6, 2019 8:35 AM
  • "601,601,607, 601.02,602.01.601.20, 701,702, 701.01" -split ',' -replace '\s'

    If this is what you column cell looks like.  First split on the commas.

    If you just want to detect a "." then dot this.

    if("601,601,607, 601.02,602.01.601.20, 701,702, 701.01" -match '\.'){ 'dot found' }

    Start by writing less code.  You are way overcomplicating this and it is causing much of your confusion.


    \_(ツ)_/

    Monday, May 6, 2019 10:54 AM
  • Hi,

    I checked through a regular expression checker tool and it said [.] and [\.] are equivalent and its not giving me any different results.

    When specifically looking for a "." you must use the "\." or it will always be true. "." matches any non-special character. "\." matches only the dot character.

    "Split()" is not RegEx and will match the dot correctly.  "-match" and "-replace" are full RegEx and will require correct patterns.


    \_(ツ)_/

    Monday, May 6, 2019 10:58 AM
  • You're right about "[.]".  That matches a period literally.

    The regex expression posted isn't valid.  I assume "[1-20" should have a closing "]".

    "hi" -match "^[0-9]{3}([a-c]{1})?([.]{0,1})?([0]{1})?([1-20{1})?$"

    parsing '^[0-9]{3}([a-c]{1})?([.]{0,1})?([0]{1})?([1-20{1})?$' - Unterminated [] set.
    At line:1 char:1
    + "hi" -match "^[0-9]{3}([a-c]{1})?([.]{0,1})?([0]{1})?([1-20{1})?$"
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationStopped: (:) [], ArgumentException
    + FullyQualifiedErrorId : System.ArgumentException


    Another problem is you're casting what should be an array into a single string. 

    Remove "[string]".


    [string]$RoomNumberParts=$RoomNumber.split(".",2,$option)

    Monday, May 6, 2019 12:28 PM
  • That's sorta true. You can use "." in the same way you use "\." if you use it in a class match such as "[.]" It's a bit more typing, but in a regex with lots of escapes (i.e. one that looks like a picket fence) it makes the regex somewhat easier to decipher.

    Here's a trivial example:

    $x = "1.2.3"
    $y = "12345"
    $x -match "^\d[.]\d[.]\d$" #TRUE
    $x -match "^\d\.\d\.\d$"   #TRUE
    $y -match "^\d[.]\d[.]\d$" # FALSE
    $y -match "^\d\.\d\.\d$"   # FALSE


    --- Rich Matheisen MCSE&I, Exchange Ex-MVP (16 years)

    Monday, May 6, 2019 2:45 PM
  • Yes.  Using it in a character class match works and can be useful.  Matching an IP can make this technique less readable.

    '^\d{1,3}[.]\d{1,3}[.]\d{1,3}$'
    '^\d{1,3}\.\d{1,3}\.\d{1,3}$'

    It is still personal choice but the second one reads better for me.


    \_(ツ)_/


    Monday, May 6, 2019 3:00 PM
  • Hi,

    Thanks for the input, I didt some searching and found another way to split my modified code now looks like this,it has worked on some test data, but unsure if I am making some mistake which would fail on large dataset,

     $worksheet.cells.Item($i, $RoomNumberColumn).value2 -match "^[0-9]{3}([a-c]{1})?([\.]{0,1})?([0]{1})?([1-20]{1,2})?$")
                          
                      {
                       
                      [String]$RoomNumber= $worksheet.cells.Item($i, $RoomNumberColumn).value2
                      [string]$RoomNumberParts1=$RoomNumber.split(".")|Select -Index 0
                      [string]$RoomNumberParts2=$RoomNumber.split(".")|Select -Index 1
    
                      #Write-Host "Value of Number--> $RoomNumberParts1 and $RoomNumberParts2"
                      if($hashRoomNumber.ContainsKey($RoomNumberParts1))
    
                      {$worksheet.cells.Item($i, $RoomNumberColumn).Interior.ColorIndex = 35;}
                      Else {$worksheet.cells.Item($i, $RoomNumberColumn).Interior.ColorIndex = 37;}
                      }
     
            
            Else
            {
            $worksheet.cells.Item($i, $RoomNumberColumn).Interior.ColorIndex = 45;
            }

    Wednesday, May 15, 2019 4:01 AM
  • That is the longest attempt at splitting a string I have ever seen.

    $part1,$part2 = $worksheet.cells.Item($i,$RoomNumberColumn).value2 -split '\.'


    \_(ツ)_/

    Wednesday, May 15, 2019 4:14 AM
  • Yes agree. But that's a side-effect, when we learn programming by referring to snippets of code from forums other people's works.

    Also read somewhere that for large volume of data direct referencing the value like 

    $worksheet.cells.Item($i,$RoomNumberColumn).value2 slows down the code but assigning to a variable [String]$RoomNumber= $worksheet.cells.Item($i, $RoomNumberColumn).value2 speeds up the code for large values to be compared.

    I didn't do a measure to actually find out if there is a significant difference between the two methods.

    Thanks for your tips very helpful.



    • Edited by c_programer Wednesday, May 15, 2019 4:40 AM
    Wednesday, May 15, 2019 4:36 AM
  • No.  That does not slow down the code.  

    You cannot learn programming from a forum.  You can only accumulate numerous bad habits and gather misinformation.   Take from someone with more than 30 years of technical programming experience.

    Also, in PowerShell it is not necessary to declare a type when assigning a variable. PowerShell does this for you.

    "select -index 0" is unnecessary and slow.  Use subscripts.

    ($worksheet.cells.Item($i,$RoomNumberColumn).value2 -split '\.')[0]

    Also using the double assignment I used reduces the number of round trips to the server to one.  Your method takes two.

    You cannot easily measure the relative performance of these things because it is automation.  The system will always insert a random amount of delay that is unpredictable. 

    So the lesson is don't believe what the hackers tell you about how this works.  Most are wrong. 


    \_(ツ)_/

    Wednesday, May 15, 2019 4:58 AM
  • Hi,

    Was your issue resolved?

    If you resolved it using our solution, please "mark it as answer" to help other community members find the helpful reply quickly.

    If you resolve it using your own solution, please share your experience and solution here. It will be very beneficial for other community members who have similar questions.

    If no, please reply and tell us the current situation in order to provide further help.

    Best Regards,

    Lee


    Just do it.

    Friday, May 31, 2019 6:43 AM