none
Query to return the FIRST and LAST records

    Question

  • Given the following as a sample table and sample data:

    CREATE TABLE [dbo].[FirstLastTest](
    	[RecordedDate] [datetime] NOT NULL,
    	[DeviceSerialNumber] [bigint] NOT NULL,
    	[MessageData] [varchar](50) NOT NULL,
     CONSTRAINT [PK_FirstLastTest] PRIMARY KEY CLUSTERED 
    (
    	[RecordedDate] ASC,
    	[DeviceSerialNumber] ASC
    ) ON [PRIMARY]
    ) ON [PRIMARY]
    
    GO
    
    DECLARE @GMTNow DATETIME
    SET @GMTNow = GETUTCDATE()
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000001, 'MESSAGE 1')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000002, 'MESSAGE 1')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000003, 'MESSAGE 1')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000001, 'MESSAGE 2')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000002, 'MESSAGE 2')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000003, 'MESSAGE 2')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000001, 'MESSAGE 3')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000002, 'MESSAGE 3')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000003, 'MESSAGE 3')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000001, 'MESSAGE 4')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000002, 'MESSAGE 4')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000003, 'MESSAGE 4')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000001, 'MESSAGE 5')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000002, 'MESSAGE 5')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000003, 'MESSAGE 5')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000001, 'MESSAGE 6')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000002, 'MESSAGE 6')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000003, 'MESSAGE 6')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000001, 'MESSAGE 7')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000002, 'MESSAGE 7')
    SET @GMTNow = DATEADD(n, 1, @GMTNow)
    INSERT INTO [FirstLastTest] VALUES (@GMTNow, 1000003, 'MESSAGE 7')
    
    SELECT * FROM FirstLastTest

    I would like to know if there is any way to retrieve, for all DeviceSerialNumbers, the FIRST and LAST rows in the table.

    For example, given the above data, I would like the output to have exactly (6) rows, two for each DeviceSerialNumber, one with the "MESSAGE 1" row for each, and one with the "MESSAGE 7" row for each.

    Is there a way to do this?  I am using SQL Server 2008.

    I tried this, but it returns all rows (didn't think it would work, but it was worth trying anyway):

    SELECT RecordedDate, DeviceSerialNumber, MessageData FROM FirstLastTest
    GROUP BY DeviceSerialNumber, RecordedDate, MessageData
    Having RecordedDate= MIN(RecordedDate) or RecordedDate=MAX(RecordedDate)

    Thanks

    Jim Kern

    Monday, July 01, 2013 1:56 PM

Answers

  • One way could be using a ranking function.

    with R as (
    select
        RecordedDate, DeviceSerialNumber, MessageData,
        row_number() over(partition by DeviceSerialNumber order by RecordedDate) as rn1,
        row_number() over(partition by DeviceSerialNumber order by RecordedDate DESC) as rn2
    from
        FirstLastTest
    )
    select RecordedDate, DeviceSerialNumber, MessageData
    from R
    where 1 in (rn1, rn2)
    order by 
    DeviceSerialNumber, RecordedDate;

    The other method is using a derived table where we group by device and pull minimum and maximum recorded date, then using this derived table to join to the original table by device nad by min_recordeddate to the recordeddate for the first and max_recordeddate for the last.


    AMB

    Some guidelines for posting questions...



    Monday, July 01, 2013 2:04 PM

All replies

  • One way could be using a ranking function.

    with R as (
    select
        RecordedDate, DeviceSerialNumber, MessageData,
        row_number() over(partition by DeviceSerialNumber order by RecordedDate) as rn1,
        row_number() over(partition by DeviceSerialNumber order by RecordedDate DESC) as rn2
    from
        FirstLastTest
    )
    select RecordedDate, DeviceSerialNumber, MessageData
    from R
    where 1 in (rn1, rn2)
    order by 
    DeviceSerialNumber, RecordedDate;

    The other method is using a derived table where we group by device and pull minimum and maximum recorded date, then using this derived table to join to the original table by device nad by min_recordeddate to the recordeddate for the first and max_recordeddate for the last.


    AMB

    Some guidelines for posting questions...



    Monday, July 01, 2013 2:04 PM
  • This works:

    ;with cteFirst As (
    	Select	RecordedDate
    			,ROW_NUMBER() OVER(Partition By DeviceSerialNumber Order By RecordedDate) rn
    	From	FirstLastTest
    )
    ,cteLast As (
    	Select	RecordedDate
    			,ROW_NUMBER() OVER(Partition By DeviceSerialNumber Order By RecordedDate Desc) rn
    	From	FirstLastTest
    )
    Select	flt.RecordedDate
    		,DeviceSerialNumber
    		,MessageData
    From	FirstLastTest flt
    join	cteFirst cf
    	on	flt.RecordedDate = cf.RecordedDate
    join	cteLast cl
    	on	flt.RecordedDate = cl.RecordedDate
    Where	cf.rn = 1
    	Or	cl.rn = 1
    Order By RecordedDate, DeviceSerialNumber

    Monday, July 01, 2013 2:07 PM
  • ;With mycte as
    (SELECT *, row_number() Over(partition by DeviceSerialNumber Order by MessageData  ) rn1
    , row_number() Over(partition by DeviceSerialNumber Order by MessageData DESC ) rn2 FROM FirstLastTest)
    Select [RecordedDate] ,	[DeviceSerialNumber] , [MessageData] from mycte 
    WHERE rn1=1 or rn2=1
    Order by [DeviceSerialNumber], [MessageData]

    Thanks for your DDL and sample data.
    Monday, July 01, 2013 2:07 PM
  • That works beautifully.

    Now I'll have to optimize it for my "real" data (50+ million rows, 1+ million distinct devices), but this gives me a GREAT start.  Thank you so much...and thanks to the others who also contributed here!

    It is well appreciated!  One question asked, one fabulous solution received!

    Jim Kern

    Monday, July 01, 2013 2:34 PM
  • You can add a couple of covering indexes to help with the performance:

    - (DeviceSerialNumber, RecordedDate) include (MessageData)
    -
    (DeviceSerialNumber, RecordedDate DESC) include (MessageData)

    I also recommend this great book for futher reading on Window Functions.

    Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions
    http://shop.oreilly.com/product/0790145323088.do


    AMB

    Some guidelines for posting questions...

    Monday, July 01, 2013 5:02 PM