Combining rows from two tables using unique identifier

Answered Combining rows from two tables using unique identifier

  • Saturday, December 29, 2012 5:24 PM
     
     
    Hi all I have two temporary tables that have the same column names and they are in the same order. Here are the table names and their matching structure. #ThisWeek #PreviousWeek LocationID | BenchmarkType | WeekNum | YearNum | VisitorCount | I want to eliminate rows from the two tables where the LocationID does not exist in both tables, and then either append the rows in #PreviousWeek to the rows in #ThisWeek, or even better, insert the rows from both tables (where the LocationID exists in both) in to a new temporary table. The way I am currently thinking is to use the WHERE EXISTS clause when using SELECT INTO a new temporary table, however this would mean using a sub query. Is there a more efficient way? Many thanks. James

All Replies

  • Saturday, December 29, 2012 5:27 PM
    Moderator
     
     

    Post your query and DDL. Thanks.


    Kalman Toth SQL 2008 GRAND SLAM
    New Book: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012

  • Saturday, December 29, 2012 7:05 PM
     
      Has Code

    Hi,

    See below example

    DELETE dbo.Raw_Data_Eucerin
     FROM dbo.Raw_Data_Eucerin
     LEFT JOIN Pharmacy ON Raw_Data_Eucerin.Pharmacy_Code=Pharmacy.Pharmacy_Code AND Raw_Data_Eucerin.Product_code=Pharmacy.Product_code
     WHERE Pharmacy.Pharmacy_Code IS NULL
    

    Do accordingly ..


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

  • Saturday, December 29, 2012 8:40 PM
     
     Proposed Answer

    WHERE EXISTS sounds the normal thing to me. There is nothing wrong with subqueries per se.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
  • Saturday, December 29, 2012 9:48 PM
     
     Answered Has Code

    Sub-queries are not bad on their own if they serve our purpose with reasonable resource consumption. MERGE syntax can not be used in your example.

    4 ways to do it using JOINS and Sub-queries -

    drop table #ThisWeek
    drop table #PreviousWeek
    drop table #BothWeeks
    create table #ThisWeek     (LocationID int, BenchmarkType varchar(1), WeekNum varchar(1), YearNum varchar(1), VisitorCount varchar(1))
    create table #PreviousWeek (LocationID int, BenchmarkType varchar(1), WeekNum varchar(1), YearNum varchar(1), VisitorCount varchar(1))
    create table #BothWeeks    (LocationID int, BenchmarkType varchar(1), WeekNum varchar(1), YearNum varchar(1), VisitorCount varchar(1))
    
    insert into #ThisWeek (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount) values
    (1,'t','t','t','t'),(2,'t','t','t','t'),(5,'t','t','t','t')
    insert into #PreviousWeek (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount) values
    (1,'p','p','p','p'),(2,'p','p','p','p'),(3,'p','p','p','p'),(4,'p','p','p','p')
    
    select * from #PreviousWeek
    select * from #ThisWeek
    select * from #BothWeeks
    
    
    -- METHOD 1: using DELETE on OUTER JOIN
    -- dump data from both weeks into the new table
    insert into #BothWeeks (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount)
    select LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount from #ThisWeek
    
    insert into #BothWeeks (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount)
    select LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount from #PreviousWeek
    
    -- delete those rows that are not in both source tables
    delete from #BothWeeks
    from #BothWeeks as b
    left outer join #PreviousWeek as p
    	on b.LocationID = p.LocationID
    left outer join #ThisWeek as t
    	on b.LocationID = t.LocationID
    where p.LocationID is null or t.LocationID is null
    
    
    -- METHOD 2: using INNER JOIN once each from t and p
    insert into #BothWeeks (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount)
    select t.LocationID, t.BenchmarkType, t.WeekNum, t.YearNum, t.VisitorCount from #ThisWeek as t
    inner join #PreviousWeek as p
    	on t.LocationID = p.LocationID
    
    insert into #BothWeeks (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount)
    select p.LocationID, p.BenchmarkType, p.WeekNum, p.YearNum, p.VisitorCount from #ThisWeek as t
    inner join #PreviousWeek as p
    	on t.LocationID = p.LocationID
    
    
    -- METHOD 3: using WHERE EXISTS
    insert into #BothWeeks (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount)
    select LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount from #ThisWeek as t
    where exists (select 1 from #PreviousWeek as p where p.LocationID = t.LocationID)
    
    insert into #BothWeeks (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount)
    select LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount from #PreviousWeek as p
    where exists (select 1 from #ThisWeek as t where p.LocationID = t.LocationID)
    
    
    -- METHOD 4: using IN
    insert into #BothWeeks (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount)
    select LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount from #ThisWeek as t
    where LocationID in (select LocationID from #PreviousWeek as p)
    
    insert into #BothWeeks (LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount)
    select LocationID, BenchmarkType, WeekNum, YearNum, VisitorCount from #PreviousWeek as p
    where LocationID in (select LocationID from #ThisWeek as t)


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers.
    Thanks!
    Aalam | Blog (http://aalamrangi.wordpress.com)


    • Edited by Aalam Rangi Saturday, December 29, 2012 9:52 PM
    • Marked As Answer by SpringboardJG Sunday, December 30, 2012 2:11 PM
    •  
  • Sunday, December 30, 2012 2:15 PM
     
     

    Thanks to all for your responses... useful to know that subqueries do have their place!

    Aalam - thank you for your response, I have decided to use METHOD 1 by deleting rows from the temporary tables before I dump the data in to another table.  This seems to be the most efficient solution as there are not many rows that actually need deleting each time the routine runs.