none
PatIndex Function

    Question

  • Hello, I am trying to utilize the PatIndex Function, and need a little assitance.

    I have a string like the following:

    74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5

    What we need to do is to search for any quantity of numbers at the begining of the string, followed by ";#" and remove those characters altogether, then search for any pattern of ";#" and any quantity of numbers and ";#" and replace with a ";" and a space.

    The completed string should look like:

    XYZ Awards Portal Members; Awards Contribute No Delete DRM1; Awards Contribute No Delete DBS5

    I have tried several different variations of the function below.  My current function is as follows, and it's not providing the results I am expecting:

    create FUNCTION [dbo].[udfSplit]
    (
    @string VARCHAR(8000)
    )
    RETURNS VARCHAR(8000)
    AS
    BEGIN
    DECLARE @IncorrectCharLoc SMALLINT
    SET @IncorrectCharLoc = PATINDEX('%[0-9][;#]%' , @string)
    WHILE @IncorrectCharLoc > 0
    BEGIN
    SET @string = STUFF(@string, @IncorrectCharLoc, 1, '')
    SET @IncorrectCharLoc = PATINDEX('%[0-9][;#]%', @string)
    END
    SET @string = @string
    RETURN @string
    END


    • Edited by jmcpsd Thursday, October 25, 2012 8:20 PM
    Thursday, October 25, 2012 7:57 PM

Answers

  • Sorry. Untested. Also, assumed that the largest numeric value is 7 digits long. But Try:

    ALTER FUNCTION [dbo].[udfSplit]
    (
    @string VARCHAR(8000)
    )
    RETURNS VARCHAR(8000)
    AS
    BEGIN
    DECLARE @IncorrectCharLoc INT
    DECLARE @LenOfCharToRemove INT
    
    SET @IncorrectCharLoc =	CASE
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string)	
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][;#]%' , @string)
    								ELSE 0				
    								END
    
    WHILE @IncorrectCharLoc > 0
    BEGIN
    SET @LenOfCharToRemove = CHARINDEX('#', @string,@IncorrectCharLoc)
    
    SET @IncorrectCharLoc = CASE WHEN @IncorrectCharLoc <> 1 THEN @IncorrectCharLoc - 1 ELSE @IncorrectCharLoc END
    
    SET @string = STUFF(@string, @IncorrectCharLoc , @LenOfCharToRemove-@IncorrectCharLoc, ';')
    
    
    
    
    
    SET @IncorrectCharLoc =	CASE
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string)	
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][;#]%' , @string)	
    								ELSE 0			
    								END
    END
    
    
    
    
    
    SET @string = REPLACE(@string,';;',';')
    RETURN @string
    END
    
    
    
    GO 
    
    CREATE TABLE PatIndx
    (
     Col1 VARCHAR(4000)
    ) 
    
    
    
    INSERT PatIndx
    SELECT '74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5'
    
    
    
    
    SELECT REPLACE((dbo.udfSplit(Col1)),';#', ' ;')AfterTheSplitFunction, '74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5' BeforeTheSplitFunction
    FROM PatIndx
    
    
    DROP TABLE PatIndx



    Please mark as answer if this answers your question. Please mark as helpful if you found this post was helpful.




    Thursday, October 25, 2012 9:35 PM
  • declare @s varchar(2000)='74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5'
    ;with mycte as
    (SELECT 
    d.id.value('(/H/r)[1]', 'NVARCHAR(50)') c1,
    d.id.value('(/H/r)[2]', 'NVARCHAR(50)') c2,
    d.id.value('(/H/r)[3]', 'NVARCHAR(50)') c3,
    d.id.value('(/H/r)[4]', 'NVARCHAR(50)') c4,
    d.id.value('(/H/r)[5]', 'NVARCHAR(50)') c5,
    d.id.value('(/H/r)[6]', 'NVARCHAR(50)') c6,
    d.id.value('(/H/r)[7]', 'NVARCHAR(50)') c7,
    d.id.value('(/H/r)[8]', 'NVARCHAR(50)') c8
    FROM 
    (SELECT CAST ('<H><r>' + REPLACE(@s, ';#', '</r><r>') + '</r></H>' as XML ) as id
     ) d
    )
    SELECT
     ISNULL(CASE WHEN ISNUMERIC(c1)=1  THEN '' ELSE c1  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c2)=1   THEN ';' ELSE c2  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c3)=1   THEN ';' ELSE c3  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c4)=1   THEN ';' ELSE c4  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c5)=1   THEN ';' ELSE c5  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c6)=1   THEN ';' ELSE c6  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c7)=1   THEN ';' ELSE c7  END,'') 
    FROM mycte

    Thursday, October 25, 2012 9:50 PM

All replies

  • declare @s varchar(2000)='74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5'
    SELECT  replace( [dbo].[RemoveNumeric](@s),';#',' ')  
    CREATE FUNCTION [dbo].[RemoveNumeric]
    (
    	@string varchar(2000)
    )
    RETURNS varchar(2000)
    AS
    BEGIN
    RETURN ( select cast(cast((select substring(@string,number,1)
    from (select number from master..spt_values 
    where type='p' and number between 1 and len(@string) ) t
    where number <= len(@string)
    and substring(@string,number,1) like '[^0-9]' for xml path('')) as xml)as varchar(max))
     )
    END

    Thursday, October 25, 2012 8:26 PM
  • I'm not looking to remove all numerics from the string. Please see above in my inital question that we need to keep DRM1 (one) and DBS5 (five).

    Just want to replace on the start of the string any quantity of numbers which are directly followed by a ;# with nothing (null value or just remove the pattern)

    And anywhere after the start of the string any quantity of numbers which are directly preceded by ;# and followed by ;# to replace with just a semicolon ;

    Thursday, October 25, 2012 9:16 PM
  • Sorry. Untested. Also, assumed that the largest numeric value is 7 digits long. But Try:

    ALTER FUNCTION [dbo].[udfSplit]
    (
    @string VARCHAR(8000)
    )
    RETURNS VARCHAR(8000)
    AS
    BEGIN
    DECLARE @IncorrectCharLoc INT
    DECLARE @LenOfCharToRemove INT
    
    SET @IncorrectCharLoc =	CASE
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string)	
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][;#]%' , @string)
    								ELSE 0				
    								END
    
    WHILE @IncorrectCharLoc > 0
    BEGIN
    SET @LenOfCharToRemove = CHARINDEX('#', @string,@IncorrectCharLoc)
    
    SET @IncorrectCharLoc = CASE WHEN @IncorrectCharLoc <> 1 THEN @IncorrectCharLoc - 1 ELSE @IncorrectCharLoc END
    
    SET @string = STUFF(@string, @IncorrectCharLoc , @LenOfCharToRemove-@IncorrectCharLoc, ';')
    
    
    
    
    
    SET @IncorrectCharLoc =	CASE
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string)	
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][0-9][;#]%' , @string)
    								WHEN PATINDEX('%[0-9][;#]%' , @string) <> 0 THEN PATINDEX('%[0-9][;#]%' , @string)	
    								ELSE 0			
    								END
    END
    
    
    
    
    
    SET @string = REPLACE(@string,';;',';')
    RETURN @string
    END
    
    
    
    GO 
    
    CREATE TABLE PatIndx
    (
     Col1 VARCHAR(4000)
    ) 
    
    
    
    INSERT PatIndx
    SELECT '74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5'
    
    
    
    
    SELECT REPLACE((dbo.udfSplit(Col1)),';#', ' ;')AfterTheSplitFunction, '74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5' BeforeTheSplitFunction
    FROM PatIndx
    
    
    DROP TABLE PatIndx



    Please mark as answer if this answers your question. Please mark as helpful if you found this post was helpful.




    Thursday, October 25, 2012 9:35 PM
  • declare @s varchar(2000)='74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5'
    ;with mycte as
    (SELECT 
    d.id.value('(/H/r)[1]', 'NVARCHAR(50)') c1,
    d.id.value('(/H/r)[2]', 'NVARCHAR(50)') c2,
    d.id.value('(/H/r)[3]', 'NVARCHAR(50)') c3,
    d.id.value('(/H/r)[4]', 'NVARCHAR(50)') c4,
    d.id.value('(/H/r)[5]', 'NVARCHAR(50)') c5,
    d.id.value('(/H/r)[6]', 'NVARCHAR(50)') c6,
    d.id.value('(/H/r)[7]', 'NVARCHAR(50)') c7,
    d.id.value('(/H/r)[8]', 'NVARCHAR(50)') c8
    FROM 
    (SELECT CAST ('<H><r>' + REPLACE(@s, ';#', '</r><r>') + '</r></H>' as XML ) as id
     ) d
    )
    SELECT
     ISNULL(CASE WHEN ISNUMERIC(c1)=1  THEN '' ELSE c1  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c2)=1   THEN ';' ELSE c2  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c3)=1   THEN ';' ELSE c3  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c4)=1   THEN ';' ELSE c4  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c5)=1   THEN ';' ELSE c5  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c6)=1   THEN ';' ELSE c6  END,'') 
    +ISNULL(CASE WHEN ISNUMERIC(c7)=1   THEN ';' ELSE c7  END,'') 
    FROM mycte

    Thursday, October 25, 2012 9:50 PM
  • The function I'd use to split the string

     create function dbo.SplitString 
    -- string function from http://stackoverflow.com/a/2703/573261
        (
            @str nvarchar(4000), 
            @separator char(2)
        )
        returns table
        AS
        return (
            with tokens(p, a, b) AS (
                select 
                    1, 
                    1, 
                    charindex(@separator, @str)
                union all
                select
                    p + 1, 
                    b + 2, 
                    charindex(@separator, @str, b + 2)
                from tokens
                where b > 0
            )
            select
                p-1 zeroBasedOccurance,
                substring(
                    @str, 
                    a, 
                    case when b > 0 then b-a ELSE 4000 end) 
                AS s
            from tokens
          )
    GO

    A sample table and some data

    create table tbl (
      string varchar(max));
    insert tbl select '74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5'
    union all select ';#XYZ Awards Portal Members;#4740;#Awards Contri'
    union all select ''
    union all select null
    union all select ';#XYZ Awards Portal Members;#4740;#Awards ;#;#;#Contri'
    union all select ';#XYZ Awards Portal Members;#4740;#123;#Awards ;#;ssss;#2222;#Contri'

    The query

    select string, stuff((
      select ';#' + s
      from dbo.SplitString(tbl.string, ';#') x
      where isnumeric(s)=0 or s like '%[^0-9]%'
      order by zerobasedoccurance
      for xml path(''), type).value('.','nvarchar(max)'),1,2,'') converted
    from tbl;

    Results:

    |                                                                                                           STRING |                                                                                     CONVERTED |
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    | 74;#XYZ Awards Portal Members;#4740;#Awards Contribute No Delete DRM1;#9987710;#Awards Contribute No Delete DBS5 | XYZ Awards Portal Members;#Awards Contribute No Delete DRM1;#Awards Contribute No Delete DBS5 |
    |                                                                 ;#XYZ Awards Portal Members;#4740;#Awards Contri |                                                    ;#XYZ Awards Portal Members;#Awards Contri |
    |                                                                                                                  |                                                                                               |
    |                                                                                                           (null) |                                                                                        (null) |
    |                                                           ;#XYZ Awards Portal Members;#4740;#Awards ;#;#;#Contri |                                              ;#XYZ Awards Portal Members;#Awards ;#;#;#Contri |
    |                                             ;#XYZ Awards Portal Members;#4740;#123;#Awards ;#;ssss;#2222;#Contri |                                           ;#XYZ Awards Portal Members;#Awards ;#;ssss;#Contri |



    Friday, October 26, 2012 7:09 AM
  • You're seeing that there are several ways to do this.

    If you're married to using a function (which is fine if you use it once for one string, but will be slower if you have to apply the same logic to many thousands of rows), here's some code that'll work (minus the start and end of the function setup). 

    Set @String = @String + ';#9999'
    Declare @NewString Varchar(999) = ''
    
    DECLARE @IncorrectCharLoc SMALLINT, @EndPos SmallInt
    SET @IncorrectCharLoc = PATINDEX('%[0-9];#%' , @string)
    WHILE @IncorrectCharLoc between 1 and len(@String) - 6
    	BEGIN
    		Set @Endpos = CharIndex(';#', @String, @IncorrectCharLoc + 2)
    		If @NewString <> '' Set @NewString = @NewString + '; '
    		Set @NewString = @NewString + SubString(@String, @IncorrectCharLoc + 3, @EndPos - @IncorrectCharLoc - 3) 
    		SET @string = STUFF(@string, 1, @EndPos + 1, 'X')
    		SET @IncorrectCharLoc = PATINDEX('%[0-9][;#]%', @string)
    	END
    Do you later on split into rows, or is that a non issue for you?  Do you have to process lots of rows?


    Friday, October 26, 2012 1:59 PM