locked
Remove all non numberic characters from a column RRS feed

  • Question

  • Hello, I need a way to remove ALL non numeric characters from a column ( space, comma, - , & , etc)

    there are about 2 million entries in a column, are there any known fast way to do this?

    I can't set up an over night job to do this, because this part of the project was an unexpected necessity because our
    external vendor started to give us garbage data


    JT
    Thursday, May 19, 2011 6:33 PM

Answers

  • Here is one solution:

    create table #t (my_col varchar(50))
    insert into #t values ('123')
    insert into #t values ('12a3')
    insert into #t values ('12a3bc')
    insert into #t values ('q3 5')
    insert into #t values ('4%6')
    
    SET NOCOUNT ON
    Declare @char char(1)
    SELECT TOP 1 @char=SUBSTRING(my_col,PATINDEX('%[^0-9]%', my_col),1)
    FROM #t
    WHERE PATINDEX('%[^0-9]%', my_col) > 0
    While @@rowcount > 0
    Begin
     UPDATE #t
     SET my_col = REPLACE(my_col, @char, '')
     WHERE my_col LIKE '%\'+@char+'%' ESCAPE '\'
     SELECT @char+': '+CAST(@@rowcount AS varchar(12))
     
     SELECT TOP 1 @char=SUBSTRING(my_col,PATINDEX('%[^0-9]%', my_col),1)
     FROM #t
     WHERE PATINDEX('%[^0-9]%', my_col) > 0
    End
    SET NOCOUNT OFF
    
    select * from #t
    
    drop table #t
    

     


    Gert-Jan
    • Marked as answer by mr JT Friday, May 20, 2011 1:48 PM
    Thursday, May 19, 2011 8:25 PM

All replies

  • Take a look at this blog post showing how to do it with pure T-SQL (which may be not the best solution for the job):

    Extracting numbers with SQL Server


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Thursday, May 19, 2011 6:35 PM
  • This definitely would be helpful in other applications, but not in my current case unfortunately...
    I need to remove "." as well and not just extract the number.

     

    I My server connects to SQL 2000 server, and I was wondering if it was possible to do anything like  RegExReplace ?

     


    JT
    Thursday, May 19, 2011 6:58 PM
  • Here is one solution:

    create table #t (my_col varchar(50))
    insert into #t values ('123')
    insert into #t values ('12a3')
    insert into #t values ('12a3bc')
    insert into #t values ('q3 5')
    insert into #t values ('4%6')
    
    SET NOCOUNT ON
    Declare @char char(1)
    SELECT TOP 1 @char=SUBSTRING(my_col,PATINDEX('%[^0-9]%', my_col),1)
    FROM #t
    WHERE PATINDEX('%[^0-9]%', my_col) > 0
    While @@rowcount > 0
    Begin
     UPDATE #t
     SET my_col = REPLACE(my_col, @char, '')
     WHERE my_col LIKE '%\'+@char+'%' ESCAPE '\'
     SELECT @char+': '+CAST(@@rowcount AS varchar(12))
     
     SELECT TOP 1 @char=SUBSTRING(my_col,PATINDEX('%[^0-9]%', my_col),1)
     FROM #t
     WHERE PATINDEX('%[^0-9]%', my_col) > 0
    End
    SET NOCOUNT OFF
    
    select * from #t
    
    drop table #t
    

     


    Gert-Jan
    • Marked as answer by mr JT Friday, May 20, 2011 1:48 PM
    Thursday, May 19, 2011 8:25 PM
  • It's easy to fix the original blog code to not include .

    There is no native RegEx replace in SQL Server at present, unfortunately. In SQL 2005+ you may try using CLR function.

     


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Thursday, May 19, 2011 9:13 PM