none
Group Islands of Continuous Data Ranges

    Question

  • Hi guys,

    I have a case of the famous "Islands of Date Ranges" problem and it seems like I can't get my head around it. Let's create the test data first:

    CREATE TABLE [dbo].[IslandsTestDataException]( [Client_ID] [int] NULL, [Store_ID] [int] NULL, [Product_ID] [int] NULL, [Discount_Percent] [decimal](5, 3) NULL, [EffectiveFrom] [date] NULL, [EffectiveTo] [date] NULL, [Exclude] [bit] NULL ) ON [PRIMARY] GO

    INSERT [dbo].[IslandsTestDataException] ([Client_ID], [Store_ID], [Product_ID], [Discount_Percent], [EffectiveFrom], [EffectiveTo], [Exclude]) VALUES (2, 57, 119, NULL, CAST(0xB9330B00 AS Date), CAST(0xB2360B00 AS Date), 1)
    INSERT [dbo].[IslandsTestDataException] ([Client_ID], [Store_ID], [Product_ID], [Discount_Percent], [EffectiveFrom], [EffectiveTo], [Exclude]) VALUES (2, 444, 119, CAST(50.000 AS Decimal(5, 3)), CAST(0xB9330B00 AS Date), CAST(0xB3B20B00 AS Date), 0)

    /****** Object:  Table [dbo].[IslandsTestData]    Script Date: 07/03/2013 11:52:00 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE TABLE [dbo].[IslandsTestData](
    	[Client_ID] [int] NULL,
    	[Store_ID] [int] NULL,
    	[Product_ID] [int] NULL,
    	[Discount_Percent] [decimal](5, 3) NULL,
    	[StoreEffectiveFrom] [date] NULL,
    	[StoreEffectiveTo] [date] NULL,
    	[ProductEffectiveFrom] [date] NULL,
    	[ProductEffectiveTo] [date] NULL
    ) ON [PRIMARY]
    GO
    INSERT [dbo].[IslandsTestData] ([Client_ID], [Store_ID], [Product_ID], [Discount_Percent], [StoreEffectiveFrom], [StoreEffectiveTo], [ProductEffectiveFrom], [ProductEffectiveTo]) VALUES (2, 57, 118, CAST(15.000 AS Decimal(5, 3)), CAST(0xB9330B00 AS Date), CAST(0x93360B00 AS Date), CAST(0xB9330B00 AS Date), CAST(0x26350B00 AS Date))
    INSERT [dbo].[IslandsTestData] ([Client_ID], [Store_ID], [Product_ID], [Discount_Percent], [StoreEffectiveFrom], [StoreEffectiveTo], [ProductEffectiveFrom], [ProductEffectiveTo]) VALUES (2, 57, 119, CAST(27.000 AS Decimal(5, 3)), CAST(0xB9330B00 AS Date), CAST(0x74360B00 AS Date), CAST(0xB9330B00 AS Date), CAST(0x26350B00 AS Date))
    INSERT [dbo].[IslandsTestData] ([Client_ID], [Store_ID], [Product_ID], [Discount_Percent], [StoreEffectiveFrom], [StoreEffectiveTo], [ProductEffectiveFrom], [ProductEffectiveTo]) VALUES (2, 57, 119, CAST(26.000 AS Decimal(5, 3)), CAST(0x75360B00 AS Date), CAST(0x48370B00 AS Date), CAST(0x94360B00 AS Date), CAST(0x67370B00 AS Date))
    INSERT [dbo].[IslandsTestData] ([Client_ID], [Store_ID], [Product_ID], [Discount_Percent], [StoreEffectiveFrom], [StoreEffectiveTo], [ProductEffectiveFrom], [ProductEffectiveTo]) VALUES (2, 108, 119, CAST(26.000 AS Decimal(5, 3)), CAST(0x75360B00 AS Date), CAST(0x48370B00 AS Date), CAST(0xB9330B00 AS Date), CAST(0xB3B20B00 AS Date))

    If you query the second table, [IslandsTestData], you'll see that it has 2 sets of EffectiveFrom, EffectiveTo (SCD-2) columns. What I need to get out of that table is the set of (islands) date ranges covered by those 2 sets of columns together with the amount of Discount_Percent for each set of (Client,Store,Product). For example, what I need for ClientID = 2, Store = 57, product = 119 is:


    The result of the above described operation should be merged with the Exception table. The main piece of logic here is that the Exception table can overwrite what has been produced in the first step:

      - If the Exclude flag is equal to 1 for any rows with a matching key (Client_ID, Store_ID, Product_ID) in the result of the first step for the overlapping period in 2 tables , there should be a record in final result for that period with Discount_Percent = NULL and Exclude = 1;

      - The Discount_Percent in the Exception table overwrites the Discount_Percent from the result of first step.

    So, the final result I want to get for ClientID = 2, Store = 57, product = 119 is:


    Your help is highly appreciated.

    Cheers.


    http://thebipalace.wordpress.com





    • Edited by SaeedB Wednesday, July 03, 2013 3:52 AM
    Wednesday, July 03, 2013 2:40 AM

Answers

  • I wasn't sure from your description what you were doing with the ProductEffective dates, so I ignored them.  Hopefully, the following will at least get you started.

    The following code uses a calendar table (named dbo.Calendar).  If you don't already have one, they are very useful.  See

    http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-calendar-table.html

    for what a calendar table is, how to create one, and some of their many uses.

    What the following does is it creates a temp table that will hold one row for every client, store, product, and effectivedate in your two tables.  (Essentially, it's taking your two tables which are in Islands and Gaps form and flattening them - that makes them easier to work with).  When we're done, we will put it back into Islands and Gaps form for the final result.

    So, step 1 is create the temp table.  In step 2, fill the temp table with the data from regular table.  In step 3, we update the discount percent and the Exclude value for any rows in the exception table hat match rows from the regular table.  In step 4 we insert data for any rows in the exception data that have no matches in the regular data (I wasn't sure whether you needed this, but since you had data in the exception table for product 444, but not in your regular table, I included this step).  Now we have built the temp table with the correct result for every client, store, product, and date, so we just output it in Island form and then drop the temp table.

    -- Step 1 Create temp table to hold result by data
    CREATE TABLE #DiscountByDate( [Client_ID] [int] NOT NULL, [Store_ID] [int] NOT NULL, [Product_ID] [int] NOT NULL, dt date,
     [Discount_Percent] [decimal](5, 3) NULL, Exclude int NOT NULL, Primary Key (Client_ID, Store_ID, Product_ID, dt));
    
    -- Step 2 Insert Regular data
    Insert #DiscountByDate(Client_ID, Store_ID, Product_ID, dt, Discount_Percent, Exclude)
    Select i.Client_ID, i.Store_ID, i.Product_ID, c.dt, i.Discount_Percent, 0
    From dbo.IslandsTestData i
    Inner Join dbo.Calendar c On c.dt Between i.StoreEffectiveFrom And i.StoreEffectiveTo
    
    -- Step 3 Override Regular data with any exception rows
    Update d
    Set Discount_Percent = Case When e.Exclude = 0 Then e.Discount_Percent Else NULL End,
       Exclude = e.Exclude 
    From #DiscountByDate d
    Inner Join dbo.IslandsTestDataException e On e.Client_ID = d.Client_ID And e.Store_ID = d.Store_ID
      And e.Product_ID = d.Product_ID And d.dt Between e.EffectiveFrom And e.EffectiveTo;
    
    -- Step 4 Insert any exception rows where no matching row in regular data
    Insert #DiscountByDate(Client_ID, Store_ID, Product_ID, dt, Discount_Percent, Exclude)
    Select e.Client_ID, e.Store_ID, e.Product_ID, c.dt, 
      Case When e.Exclude = 0 Then e.Discount_Percent Else NULL End, e.Exclude 
    From dbo.IslandsTestDataException e
    Inner Join dbo.Calendar c On c.dt Between e.EffectiveFrom And e.EffectiveTo
    Where e.Exclude = 0 And Not Exists(Select * From #DiscountByDate d
      Where e.Client_ID = d.Client_ID And e.Store_ID = d.Store_ID And e.Product_ID = d.Product_ID And c.dt = d.dt);
    
    -- Step 5 Now do islands
    ;With cte As
    (Select Client_ID, Store_ID, Product_ID, dt, Discount_Percent, Exclude,
      Row_Number() Over(Order By Client_ID, Store_ID, Product_ID, Discount_Percent, Exclude, dt) -
        Row_Number() Over(Partition By Client_ID, Store_ID, Product_ID, Discount_Percent, Exclude Order By dt) As Island
    From #DiscountByDate)
    Select Client_ID, Store_ID, Product_ID, Discount_Percent, 
       MIN(dt) As EffectiveFrom, MAX(dt) As EffectiveTo, Exclude
    From cte
    Group By Client_ID, Store_ID, Product_ID, Discount_Percent, Exclude, Island
    Order By Client_ID, Store_ID, Product_ID, Min(dt);
    go
    Drop Table #DiscountByDate;
    Tom

    Wednesday, July 03, 2013 4:19 AM

All replies

  • I wasn't sure from your description what you were doing with the ProductEffective dates, so I ignored them.  Hopefully, the following will at least get you started.

    The following code uses a calendar table (named dbo.Calendar).  If you don't already have one, they are very useful.  See

    http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-calendar-table.html

    for what a calendar table is, how to create one, and some of their many uses.

    What the following does is it creates a temp table that will hold one row for every client, store, product, and effectivedate in your two tables.  (Essentially, it's taking your two tables which are in Islands and Gaps form and flattening them - that makes them easier to work with).  When we're done, we will put it back into Islands and Gaps form for the final result.

    So, step 1 is create the temp table.  In step 2, fill the temp table with the data from regular table.  In step 3, we update the discount percent and the Exclude value for any rows in the exception table hat match rows from the regular table.  In step 4 we insert data for any rows in the exception data that have no matches in the regular data (I wasn't sure whether you needed this, but since you had data in the exception table for product 444, but not in your regular table, I included this step).  Now we have built the temp table with the correct result for every client, store, product, and date, so we just output it in Island form and then drop the temp table.

    -- Step 1 Create temp table to hold result by data
    CREATE TABLE #DiscountByDate( [Client_ID] [int] NOT NULL, [Store_ID] [int] NOT NULL, [Product_ID] [int] NOT NULL, dt date,
     [Discount_Percent] [decimal](5, 3) NULL, Exclude int NOT NULL, Primary Key (Client_ID, Store_ID, Product_ID, dt));
    
    -- Step 2 Insert Regular data
    Insert #DiscountByDate(Client_ID, Store_ID, Product_ID, dt, Discount_Percent, Exclude)
    Select i.Client_ID, i.Store_ID, i.Product_ID, c.dt, i.Discount_Percent, 0
    From dbo.IslandsTestData i
    Inner Join dbo.Calendar c On c.dt Between i.StoreEffectiveFrom And i.StoreEffectiveTo
    
    -- Step 3 Override Regular data with any exception rows
    Update d
    Set Discount_Percent = Case When e.Exclude = 0 Then e.Discount_Percent Else NULL End,
       Exclude = e.Exclude 
    From #DiscountByDate d
    Inner Join dbo.IslandsTestDataException e On e.Client_ID = d.Client_ID And e.Store_ID = d.Store_ID
      And e.Product_ID = d.Product_ID And d.dt Between e.EffectiveFrom And e.EffectiveTo;
    
    -- Step 4 Insert any exception rows where no matching row in regular data
    Insert #DiscountByDate(Client_ID, Store_ID, Product_ID, dt, Discount_Percent, Exclude)
    Select e.Client_ID, e.Store_ID, e.Product_ID, c.dt, 
      Case When e.Exclude = 0 Then e.Discount_Percent Else NULL End, e.Exclude 
    From dbo.IslandsTestDataException e
    Inner Join dbo.Calendar c On c.dt Between e.EffectiveFrom And e.EffectiveTo
    Where e.Exclude = 0 And Not Exists(Select * From #DiscountByDate d
      Where e.Client_ID = d.Client_ID And e.Store_ID = d.Store_ID And e.Product_ID = d.Product_ID And c.dt = d.dt);
    
    -- Step 5 Now do islands
    ;With cte As
    (Select Client_ID, Store_ID, Product_ID, dt, Discount_Percent, Exclude,
      Row_Number() Over(Order By Client_ID, Store_ID, Product_ID, Discount_Percent, Exclude, dt) -
        Row_Number() Over(Partition By Client_ID, Store_ID, Product_ID, Discount_Percent, Exclude Order By dt) As Island
    From #DiscountByDate)
    Select Client_ID, Store_ID, Product_ID, Discount_Percent, 
       MIN(dt) As EffectiveFrom, MAX(dt) As EffectiveTo, Exclude
    From cte
    Group By Client_ID, Store_ID, Product_ID, Discount_Percent, Exclude, Island
    Order By Client_ID, Store_ID, Product_ID, Min(dt);
    go
    Drop Table #DiscountByDate;
    Tom

    Wednesday, July 03, 2013 4:19 AM
  • Hi Tom, Thanks for the quick response.

    I need the Product EffectiveFrom and ProductEffectiveTo columns to be incorporated in the solution. If you read the 3 lines under the code for creating the tables and populating them, I described that the first result I need is to be able to tell the date ranges at which each set of (Client_ID, Store_ID, MasterProduct_ID) is "Effective", by looking into Product and Store Effective date ranges. 

    For example, Client_ID=2, Store_ID=57,MasterProduct_ID=118 is valid from 2011-01-01 to 2011-12-31, because it's ProductEffectiveFrom&To isndicated that. Without that, the result is incomplete.

    Cheers.


    http://thebipalace.wordpress.com

    Wednesday, July 03, 2013 5:06 AM
  • Actually I figured out how to take care of the first step by getting some idea about your solution Tom. I need to play around with it and test it for all the possible situations, and make sure it works in all of them. I'll let you know if I need more help.

    Thanks a lot.


    http://thebipalace.wordpress.com

    Wednesday, July 03, 2013 8:31 AM