none
identify repetition of the same word twice successively in sql

    Question

  • Hi All,

    A cell value contains the text as below. In the below text we can see the word sakat repeated twice successively.

    my query, i want to retrive the words that are repeated twice successively.

    I want to write a sql query to retrive such repetitive words stored in different cells.

    Can someone please help?

    Text:

    "My name is sakat sakat kumar, from AP."

    "Life is beautiful beautiful, enjoy it."

    Thanks,

    Sakat


    • Edited by Sakath Monday, March 20, 2017 9:10 AM
    Monday, March 20, 2017 9:04 AM

All replies

  • You would have to split your strings on many separators (whitespace, punctuation, begin of string, end of string) and then identify the strings that contain adjacent identical tokens.

    While definitely possible in T-SQL, it is not the best language or toolset to do that. Would you consider doing this in your application layer?

    Monday, March 20, 2017 9:26 AM
  • Is that always such string? Could it be 

    Life is beautiful , enjoy it beautiful.? Is that considered  as duplicated as well

             

     CREATE FUNCTION [dbo].[SplitString]
         (
             @str VARCHAR(MAX)
         )
        RETURNS @ret TABLE (token VARCHAR(MAX))
         AS
         BEGIN

        DECLARE @x XML 
        SET @x = '<t>' + REPLACE(@str, ' ', '</t><t>') + '</t>'

        INSERT INTO @ret
            SELECT x.i.value('.', 'VARCHAR(MAX)') AS token
            FROM @x.nodes('//t') x(i)
        RETURN
       END

    CREATE TABLE #t (st VARCHAR(50))

    INSERT INTO #t VALUES ('Life is beautiful beautiful enjoy it.')
    INSERT INTO #t VALUES ('Life is beautiful enjoy it.')
    INSERT INTO #t VALUES ('My name is sakat sakat kumar from AP.')

    WITH cte
    AS
    (
    SELECT *,ROW_NUMBER() OVER (PARTITION BY st,token ORDER BY (SELECT 0)) rn
     FROM  #t CROSS APPLY(SELECT * FROM [SplitString] (st)) AS DER
    ) SELECT *
     FROM cte WHERE rn>1



    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Monday, March 20, 2017 9:39 AM
    Moderator
  • CREATE TABLE #data (text VARCHAR(1000)) INSERT INTO #data VALUES ('My name is sakat sakat kumar from AP') INSERT INTO #data VALUES ('Life is beautiful beautiful enjoy it') declare @temp table(text varchar(1000)) declare @result table(text varchar(1000), word varchar(100)) insert into @temp select * from #data while exists(select * from @temp) begin declare @text VARCHAR(100) declare @temptext VARCHAR(100) declare @word VARCHAR(100)

    select top 1 @text = text from @temp set @temptext = @text while charindex(' ', @temptext, 0) <> 0 begin set @word = substring(@temptext,0,charindex(' ',@temptext,0)) set @temptext = substring(@temptext,charindex(' ',@temptext,0) + 1,len(@temptext)) insert into @result values(@text, @word) end insert into @result values(@text, @temptext) delete top (1) from @temp end select text, word, count(*) from @result group by text, word having count(*) > 1





    • Edited by SQLNeophyte Monday, March 20, 2017 12:00 PM
    Monday, March 20, 2017 11:58 AM
  • Hi,

    Thank you for your reply. I only have SQL access to retrive the data through query.

    I have a huge database of texts. From which i need to identify the above mentioned examples.

    Thanks,

    Sakat

    Tuesday, March 21, 2017 1:48 AM
  • Hi SQLNeophyte Team,

    Thanks for your response.

    I used the above query for the text

    text to be inserted: 

    XYZ Corp announced sales results for the fourth quarter and full year ended January 3, 2016. For the fourth quarter of 2015, company-owned comparable store sales decreased 8.5%, Franchise comparable store sales decreased 6.2%, and System-wide comparable store sales decreased 2.5%. Full year, company-owned comparable store sales increased 0.8%, Franchise comparable store sales decreased 0.3%, and System-wide comparable store sales decreased 0.2%. For the year 2015, the company expects to report total revenue to meet or exceed $78 million compared to prior guidance of approximately $8 million. Non-GAAP adjusted EBITDA to meet or exceed $1.5 million compared to prior guidance of approximately $1.5 million. For the year 2017, the company expects total revenue to be in the range of $5 million to $7 million compared to preliminary guidance of $7 million to $8 million. Non-GAAP adjusted EBITDA to be in the range of $3 million to $15 million compared to preliminary guidance of $1 million to $6 million.

    The result that is shown is as below

    text                                                                                                                                 word (No column name)
    XYZ Corp announced sales results for the fourth quarter and full year ended January 3, 2016. For the   For  2
    XYZ Corp announced sales results for the fourth quarter and full year ended January 3, 2016. For the  the 2

    For the above text, the query should give zero results, but the results are shown as above. Please help.

    The Requirement is if the text is as below, the query should give the below highlighted underlined words

    XYZ Corp announced sales sales results for the fourth quarter and full full year ended January 3, 2016. For the fourth quarter of 2015, company-owned comparable  store store sales decreased 8.5%, Franchise comparable store sales decreased 6.2%, and System-wide comparable store sales decreased 2.5%. Full year, company-owned comparable store sales increased 0.8%, Franchise comparable store sales decreased 0.3%, and System-wide comparable store sales decreased 0.2%. For the year 2015, the company expects to report total revenue to meet or exceed $78 million compared to prior guidance of approximately $8 million. Non-GAAP adjusted EBITDA to meet or exceed $1.5 million compared to prior guidance of approximately $1.5 million. For the year 2017, the company expects total revenue to be in the range of $5 million to $7 million compared to preliminary guidance of $7 million to $8 million. Non-GAAP adjusted EBITDA to be in the range of $3 million to $15 million compared to preliminary guidance of $1 million to $6 million.

    Thanks,

    Sakat

    Wednesday, March 22, 2017 2:00 AM