locked
How to build 7 day weeks based on start and end date? RRS feed

  • Question

  • Hi All,

    I have a interesting challenge. I have data lets say RowID, UserID, Date in a table. If UserID appears twice within a 7 day period, i need to delete the one with the lower RowID. The data is for one month only and each month min and max dates are different. i have to use those to define "one month", i cannot use standard months. So lets say its Jan 2 - Jan 30. if at the end of the month its not a complete 7 day period, use whatever days are left to create the last week.

    the key is to build those 7 day time slots in a programmatic way which i am short-handed with. I am thinking CTEs or maybe a table variable but the logic is escaping me. Any gurus can shed some pointers?

    Thanks!

    Friday, April 9, 2010 12:11 AM

Answers

  • Here is one solution. The first query is just to demonstrate how to create 7 day groups, and the second does the actual delete.

    DECLARE @month_start DATETIME;
    
    SET @month_start = '20100102';
    
    CREATE TABLE Foo (
     keycol INT NOT NULL PRIMARY KEY,
     datecol DATETIME);
     
    INSERT INTO Foo VALUES(1, '20100102');
    INSERT INTO Foo VALUES(2, '20100103');
    INSERT INTO Foo VALUES(3, '20100116');
    INSERT INTO Foo VALUES(4, '20100117');
    INSERT INTO Foo VALUES(5, '20100129');
    INSERT INTO Foo VALUES(6, '20100130');
    
    SELECT grp, MAX(datecol) AS max_date_in_group
    FROM (
    SELECT DATEADD(DAY, (DATEDIFF(DAY, @month_start, datecol) / 7) * 7, @month_start) AS grp,
           datecol
    FROM Foo) AS F
    GROUP BY grp;
    
    /*
    
    grp                     max_date_in_group
    ----------------------- -----------------------
    2010-01-02 00:00:00.000 2010-01-03 00:00:00.000
    2010-01-16 00:00:00.000 2010-01-17 00:00:00.000
    2010-01-23 00:00:00.000 2010-01-29 00:00:00.000
    2010-01-30 00:00:00.000 2010-01-30 00:00:00.000
    
    */
    
    WITH Groups AS (
    SELECT ROW_NUMBER() 
           OVER(PARTITION BY ((DATEDIFF(DAY, @month_start, datecol) / 7) * 7) ORDER BY datecol DESC) AS rk
    FROM Foo)
    DELETE FROM Groups
    WHERE rk > 1;
    
    SELECT keycol, datecol FROM Foo;
    
    /*
    
    keycol      datecol
    ----------- -----------------------
    2           2010-01-03 00:00:00.000
    4           2010-01-17 00:00:00.000
    5           2010-01-29 00:00:00.000
    6           2010-01-30 00:00:00.000
    
    */

    Plamen Ratchev
    • Proposed as answer by Deepak Biswal Friday, April 9, 2010 11:26 AM
    • Marked as answer by KJian_ Wednesday, April 14, 2010 9:50 AM
    Friday, April 9, 2010 2:29 AM

All replies

  • declare @StartDate datetime = '20100102'

    ;with cte_Days as (select @StartDate as DateFld

                               union all

                               select dateadd(day,1,DateFld) as DateFld from cte_Days

                              where DateFld <= dateadd(day,7, @StartDay))

    select * from cte_Days

    ---------------------------

    from the top of my head.


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    • Edited by Naomi N Friday, April 9, 2010 12:30 AM
    Friday, April 9, 2010 12:18 AM
  • You can also build all dates (for the month) and set end week with some marker inside this loop.
    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Friday, April 9, 2010 12:19 AM
  • -- Test query
    select dateadd(day,-30,datediff(DAY,0,GETDATE()))
    
    declare @StartDate datetime = '20100102'
    
    declare @EndDate datetime = '20100131'
    
    ;with cte_Days as (select @StartDate as DateFld, 1 as GroupID
    
                               union all
    
                               select dateadd(day,1,DateFld) as DateFld, 
                               GroupID + case when 
                               datediff(day,@StartDate, dateadd(day,1,DateFld))  % 7 > 0 then 0 else 1 end as GroupID from cte_Days
    
                              where DateFld <= @EndDate)
    
    select * from cte_Days
    


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog

    Friday, April 9, 2010 12:22 AM
  • Here is one solution. The first query is just to demonstrate how to create 7 day groups, and the second does the actual delete.

    DECLARE @month_start DATETIME;
    
    SET @month_start = '20100102';
    
    CREATE TABLE Foo (
     keycol INT NOT NULL PRIMARY KEY,
     datecol DATETIME);
     
    INSERT INTO Foo VALUES(1, '20100102');
    INSERT INTO Foo VALUES(2, '20100103');
    INSERT INTO Foo VALUES(3, '20100116');
    INSERT INTO Foo VALUES(4, '20100117');
    INSERT INTO Foo VALUES(5, '20100129');
    INSERT INTO Foo VALUES(6, '20100130');
    
    SELECT grp, MAX(datecol) AS max_date_in_group
    FROM (
    SELECT DATEADD(DAY, (DATEDIFF(DAY, @month_start, datecol) / 7) * 7, @month_start) AS grp,
           datecol
    FROM Foo) AS F
    GROUP BY grp;
    
    /*
    
    grp                     max_date_in_group
    ----------------------- -----------------------
    2010-01-02 00:00:00.000 2010-01-03 00:00:00.000
    2010-01-16 00:00:00.000 2010-01-17 00:00:00.000
    2010-01-23 00:00:00.000 2010-01-29 00:00:00.000
    2010-01-30 00:00:00.000 2010-01-30 00:00:00.000
    
    */
    
    WITH Groups AS (
    SELECT ROW_NUMBER() 
           OVER(PARTITION BY ((DATEDIFF(DAY, @month_start, datecol) / 7) * 7) ORDER BY datecol DESC) AS rk
    FROM Foo)
    DELETE FROM Groups
    WHERE rk > 1;
    
    SELECT keycol, datecol FROM Foo;
    
    /*
    
    keycol      datecol
    ----------- -----------------------
    2           2010-01-03 00:00:00.000
    4           2010-01-17 00:00:00.000
    5           2010-01-29 00:00:00.000
    6           2010-01-30 00:00:00.000
    
    */

    Plamen Ratchev
    • Proposed as answer by Deepak Biswal Friday, April 9, 2010 11:26 AM
    • Marked as answer by KJian_ Wednesday, April 14, 2010 9:50 AM
    Friday, April 9, 2010 2:29 AM
  • I think I misunderstood your problem. I thought you wanted to create all dates within your period.
    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Friday, April 9, 2010 2:47 AM
  • You could do this one of two ways.  You can ignore the duplicates when you query the real data, or you could delete the actual dups:

    declare @s_date datetime, @e_date datetime
    
    set @s_date = '1/2/2010'
    set @e_date = '1/30/2010'
    
    create table #temp (id int, date datetime)
    
    insert into #temp values (0, '1/1/2010') -- before start
    insert into #temp values (1, '1/4/2010') -- dup week 0
    insert into #temp values (2, '1/6/2010') -- dup week 0
    insert into #temp values (3, '1/7/2010') -- max week 0
    insert into #temp values (4, '1/12/2010') -- dup week 1
    insert into #temp values (5, '1/15/2010') -- max week 1
    insert into #temp values (6, '1/20/2010') -- max week 2
    insert into #temp values (7, '1/31/2010') -- after end
    insert into #temp values (8, '2/2/2010') -- after end
    
    -- only eliminate them when you query
    select datediff(dd, @s_date, date) / 7 as week_num, max(date) as max_date 
    from   #temp 
    where  date >= @s_date and date <= @e_date
    group by datediff(dd, @s_date, date) / 7
    
    -- delete them from the actual table
    delete #temp
    from (
      select datediff(dd, @s_date, date) / 7 as week_num, max(date) as max_date 
      from   #temp 
      where  date >= @s_date and date <= @e_date 
      group by datediff(dd, @s_date, date) / 7 
      having count(*) > 1
    ) as dups
    where datediff(dd, @s_date, #temp.date) / 7 = dups.week_num and #temp.date < dups.max_date and #temp.date >= @s_date and #temp.date <= @e_date
    
    -- show the records remaining
    select *
    from   #temp
    
    drop table #temp
    

    Results from just eliminating them when you query the real data:

     

    week_num    max_date
    0    2010-01-07 00:00:00.000
    1    2010-01-15 00:00:00.000
    2    2010-01-20 00:00:00.000

     

    Records in #temp after "deleting" the older duplicate records:

    id	date
    0	2010-01-01 00:00:00.000
    3	2010-01-07 00:00:00.000
    5	2010-01-15 00:00:00.000
    6	2010-01-20 00:00:00.000
    7	2010-01-31 00:00:00.000
    8	2010-02-02 00:00:00.000

    Friday, April 9, 2010 8:02 AM