none
sql query

    Question

  • Hi ,,

    I have a table with some columns , but am concerned about this column.

    If you see the values for this column below, the 1st and 2nd, 3rd and 4th ,5th and 6th are actually same..just a number is appended in front of them.

    example lets take first two values: 20208296 and 000000208296. These two are same , some how 2 is appended in front and its making those two disitnct values.

    How to find how many of those are like that in table. There are million rows in the table.

    This column also has other kind of values , such as the last three values.I am not concerned about those kind.

    ColumnA

    20208296

    000000208296

    50822404

    000000822404

    60984072

    000000984072

    144233434

    234354355

    345346676

    Monday, February 03, 2014 6:56 PM

Answers

  • CREATE TABLE [dbo].[Data](
    	ColumnA [varchar](20) NOT NULL 
    ) ON [PRIMARY]
    
    GO
    -- Jan
    INSERT INTO [dbo].[Data]  VALUES ('20208296')
    ,('000000208296')
    ,('50822404')
    ,('000000822404')
    ,('60984072')
    ,('000000984072')
    ,('144233434')
    ,('234354355')
    ,('345346676')
    
    ;with mycte as
    (
    select ColumnA, Cast(Right(ColumnA,6) as int) right6, 
    ROW_NUMBER() Over(Partition by Cast(right(ColumnA,6) as int)  Order by  Cast(right(ColumnA,6) as int)   ) rn from data
    )
    Select * from mycte  
    Where right6 IN 
    (Select right6 from mycte 
    Where rn=2)
    
    
    drop table data

    Monday, February 03, 2014 8:57 PM
    Moderator

All replies

  • SELECT t.ColumnA,t1.ColumnA
    FROM table t
    INNER JOIN table t1
    ON t1.ColumnA LIKE '%' + CAST(CAST(t.ColumnA AS int) AS varchar(10))
    WHERE t.ColumnA LIKE '000000%'
    AND t1.ColumnA NOT LIKE '000000%'


    Please Mark This As Answer if it helps to solve the issue Visakh ---------------------------- http://visakhm.blogspot.com/ https://www.facebook.com/VmBlogs

    Monday, February 03, 2014 7:04 PM
  • there is no guarantee that the values will have fixed 6 leading zeros.

    It may have >= or < 6 leading zeros.

    Monday, February 03, 2014 7:20 PM
  • Better suggestion is Redo the ETL than trying to fix up the erraneous data..

    Second option : do you have a table that atleast maps to the values in this table? if so we can do a join and find it over.. if you don't have that you have to go back to option 1.

    See the below example.

    DECLARE @Input1 VARCHAR(20) = '111222', @Input2 VARCHAR(20) = '00010111222', Declare @input3 VARCHAR(20) = '00051111222'
    

    You cannot really differentiate between @input1, @input2, @input3 using like operator until or unless you know a specific pattern. I hope you understood.

    Good luck :) ... visit www.sqlsaga.com for more t-sql code snippets and BI related how to articles.

    Monday, February 03, 2014 8:02 PM
  • Hi Leo,

    From the small sample provided:

    All the ones without 0's appended to the front are 8 digits in length and the ones with 0's in front are 6 digits in length.

    Why do some have 0's and some not have 0s?

    How do you know that the 20208296 is the same as the 000000208296? What logic was used to figure this out?

    Do these IDs get loaded into this table from different sources? Do these errors occur as a result of simple input error?

    Please provide a bit more information so we can assist you with finding a resolution. Thanks!

    Monday, February 03, 2014 8:05 PM
  • CREATE TABLE [dbo].[Data](
    	ColumnA [varchar](20) NOT NULL 
    ) ON [PRIMARY]
    
    GO
    -- Jan
    INSERT INTO [dbo].[Data]  VALUES ('20208296')
    ,('000000208296')
    ,('50822404')
    ,('000000822404')
    ,('60984072')
    ,('000000984072')
    ,('144233434')
    ,('234354355')
    ,('345346676')
    
    ;with mycte as
    (
    select ColumnA, Cast(Right(ColumnA,6) as int) right6, 
    ROW_NUMBER() Over(Partition by Cast(right(ColumnA,6) as int)  Order by  Cast(right(ColumnA,6) as int)   ) rn from data
    )
    Select * from mycte  
    Where right6 IN 
    (Select right6 from mycte 
    Where rn=2)
    
    
    drop table data

    Monday, February 03, 2014 8:57 PM
    Moderator
  • hi

    Try this

     create table #temp(columnA varchar(25) )
    go
    Insert into #temp
    Values('20208296')
    Insert into #temp
    Values('000000208296')

    Insert into #temp
    Values('50822404')

    Insert into #temp
    Values('000000822404')

    Insert into #temp
    Values('60984072')

    Insert into #temp
    Values('000000984072')

    Insert into #temp
    Values('144233434')

    Insert into #temp
    Values('234354355')
    Insert into #temp
    Values('345346676')
    go
    Select cast(rtrim(ltrim(columnA)) as int) as test ,* from #temp
    Select distinct cast(rtrim(ltrim(columnA)) as int) as test  from #temp
    go
    Drop table #temp
    go

    Mark as Answer if you find it useful

    Shridhar J Joshi


    Thanks alot

    Tuesday, February 04, 2014 12:38 PM
  • See if this helps, I am assuming that you have a unique ID in your table.

    /*create table #temp(Id int identity(1,1),col varchar(25))
    go
    Insert into #temp
    Values('20208296'),
    ('000000208296'),
    ('50822404'),
    ('000000822404'),
    ('60984072'),
    ('000000984072'),
    ('144233434'),
    ('234354355'),
    ('345346676');
    */
    select t1.id,t1.col,t2.id from #temp t1 join #temp t2 
    on convert(varchar(100),t1.col)
    like '%'+convert(varchar(100),convert(bigint,t2.col))+'%' --drop table #temp

    You can remove the ending percentage '%'+convert(varchar(100),convert(bigint,t2.col))+'%'

    which will avoid matching the row 10 & 11( ID 7 & 8)


    Satheesh
    My Blog



    Tuesday, February 04, 2014 1:34 PM