none
Best Method to update FK from IDENTITY(1,1) PK table after insert

    Question

  • I have a PersonMapping Table that has PersonSourceId INT NOT NULL, SourceId INT NOT NULL, PersonId INT NULL FK, where PersonSourceId is the person original number on their server, SourceId is the server and PersonId is my DB ID. This tables holds all persons across all linked servers.  Then i have a Person table that has PersonId INT IDENTITY(1,1) PK, FirstName, LastName...etc. What i am doing is inserting a row into the Person table when i find a person match across linked servers and then i need to update PersonId in PersonMapping table using the PersonSourceId and SourceId to find the appropriate person. Right now my idea is:

    1. WITH CTE AS(Query To Find Person Matches that has PesonSourceId and SourceId)

    2. INSERT INTO Persons() SELECT FROM CTE  

    3. UPDATE PesonMapping somehow. I tried exploring OUTPUT and MERGE but could not get them to work.


    Monday, July 29, 2013 11:04 PM

Answers

  • would it be bad programming, and hurt my DB in the long run, to add a temporary column in my person table so the PersonSystemId goes with my insert and then i can update my PersonMapping table and then drop the temporary column in the Person table?

    I would not recommend that.  Just create a temp table or table variable, that has the same columns as your PersonMapping table.  Then use MERGE with an output clause to insert the rows into the Person table and the values into the temp table or table variable.  In the following code I hard code the values into the cte, you, of course, would write whatever cte that built the row to be inserted (it needs to have all the columns you are inserting into Person, and the PersonSourceId and the SourceID.  So, for example

    Create Table FooPerson(PersonId INT IDENTITY(1,1) primary key, FirstName varchar(20), LastName varchar(20));
    
    Declare @TempMapping Table(PersonSourceId int Not Null, SourceId int Not Null, PersonId int Null);
    
    With cte As
    (Select 273 As PersonSourceId, 2 As SourceId, 'Bill' As FirstName, 'Smith' As LastName
    Union All
    Select 648 As PersonSourceId, 1 As SourceId, 'Sam' As FirstName, 'Jones' As LastName)
    Merge Into FooPerson
    Using cte c
    On 1 = 0
    When Not Matched Then Insert(FirstName, LastName) Values(c.FirstName, c.LastName)
    Output c.PersonSourceId, c.SourceId, inserted.PersonId Into @TempMapping(PersonSourceId, SourceId, PersonId);
    
    Select * From FooPerson;
    Select * From @TempMapping;
    
    go
    Drop Table FooPerson;

    That puts the information on the mapping between SourceId, PersonSourceId, and PersionId into the table variable.  Your would then be able to use that information to update the PersonMapping table.

    Tom

    • Marked as answer by Jay_Michael Wednesday, July 31, 2013 11:09 AM
    Tuesday, July 30, 2013 2:53 AM

All replies

  • OUTPUT is the best way to capture IDENTITY insert values:

    http://www.sqlusa.com/bestpractices2005/outputidentitycapture/


    Kalman Toth Database & OLAP Architect sqlusa.com
    New Book / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012


    Monday, July 29, 2013 11:14 PM
    Moderator
  • so do i insert the OUTPUT into a temp table and then update my Person table? The problem is i am not inserting the PersonSourceId or the SourceId into my Person table because those are stored in my PersonMapping table. All i am inserting is the persons name with some other misc. data.
    Monday, July 29, 2013 11:21 PM
  • >so do i insert the OUTPUT into a temp table and then update my Person table?

    You can insert it into a @tablevariable, #temptable, ##globaltemptable or  work table. You need to choose which one is appropriate.


    Kalman Toth Database & OLAP Architect sqlusa.com
    New Book / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012


    Tuesday, July 30, 2013 12:27 AM
    Moderator
  • would it be bad programming, and hurt my DB in the long run, to add a temporary column in my person table so the PersonSystemId goes with my insert and then i can update my PersonMapping table and then drop the temporary column in the Person table?
    Tuesday, July 30, 2013 12:34 AM
  • would it be bad programming, and hurt my DB in the long run, to add a temporary column in my person table so the PersonSystemId goes with my insert and then i can update my PersonMapping table and then drop the temporary column in the Person table?

    I would not recommend that.  Just create a temp table or table variable, that has the same columns as your PersonMapping table.  Then use MERGE with an output clause to insert the rows into the Person table and the values into the temp table or table variable.  In the following code I hard code the values into the cte, you, of course, would write whatever cte that built the row to be inserted (it needs to have all the columns you are inserting into Person, and the PersonSourceId and the SourceID.  So, for example

    Create Table FooPerson(PersonId INT IDENTITY(1,1) primary key, FirstName varchar(20), LastName varchar(20));
    
    Declare @TempMapping Table(PersonSourceId int Not Null, SourceId int Not Null, PersonId int Null);
    
    With cte As
    (Select 273 As PersonSourceId, 2 As SourceId, 'Bill' As FirstName, 'Smith' As LastName
    Union All
    Select 648 As PersonSourceId, 1 As SourceId, 'Sam' As FirstName, 'Jones' As LastName)
    Merge Into FooPerson
    Using cte c
    On 1 = 0
    When Not Matched Then Insert(FirstName, LastName) Values(c.FirstName, c.LastName)
    Output c.PersonSourceId, c.SourceId, inserted.PersonId Into @TempMapping(PersonSourceId, SourceId, PersonId);
    
    Select * From FooPerson;
    Select * From @TempMapping;
    
    go
    Drop Table FooPerson;

    That puts the information on the mapping between SourceId, PersonSourceId, and PersionId into the table variable.  Your would then be able to use that information to update the PersonMapping table.

    Tom

    • Marked as answer by Jay_Michael Wednesday, July 31, 2013 11:09 AM
    Tuesday, July 30, 2013 2:53 AM
  • Yes - it would be bad programming for a number of reasons.  First, your proposal only REALLY needs to add a new column for "new" rows, but altering the table adds the column for all rows.  That is inefficient on a number of levels - the space must be allocated, pages must be adjusted, rows moved around for splits, etc.  Altering the table requires special privileges not normally granted to production applications/users. In the event that an error occurs during this process, you will find it more difficult to handle such errors in your code (for presentation to the user in an informative manner) and you will find it more difficult to diagnose due to the schema-altering (and ultimately destructive) design.  Finally, what column name do you propose to use for this temporary column?  Think long and hard about that trivial matter - a bad decision may collide with some schema change that occurs long after your code is active. 
    Tuesday, July 30, 2013 1:39 PM
  • Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect. 

    This is minimal polite behavior on SQL forums. Does your boss make you work from narratives? 

    >> I have a PersonMapping Table .. <<

    That makes no sense in RDBMS. A table models either a specific set of entities of the same kind or it models a relationship among entities. This means that a table is a collective or plural noun. Are you trying to use SQL for some OO programming? It does not work. 

    Since the IDENTITY table property is not a column nor it is relational, SQL programmers never use it. What you are really modeling is server users, not a generic person. That is a silly OO class thing, and not RDBMS. And those users are not identified by a count of physical insertion attempts to a disk drive like you are doing. Since we do not do math on any of these identifiers why would anyone use integers for them? 

    Please think about “Person_Source_Id” as a data element name. Since it is singular, ISO-11179 says that you have only one of them. There is no verb in the name, so it is not a relationship. 

    In short, you have no idea how to do a basic data model. The idiom in SQL for a n:m relationship is three tables, like this skeleton. 

    CREATE TABLE Users
    (user_id CHAR(10) NOT NULL PRIMARY KEY,
     first_name VARCHAR(15) NOT NULL,
     last_name VARCHAR(15) NOT NULL,
     ..);

    CREATE TABLE Servers
    (server_id CHAR(10) NOT NULL PRIMARY KEY,
     ..);

    CREATE TABLE Server_Assignments
    (user_id CHAR(10) NOT NULL 
     REFERENCES Users (user_id)
     ON UPDATE CASCADE
     ON DELETE CASCADE,
     server_id CHAR(10) NOT NULL 
     REFERENCES Servers (server_id)
     ON UPDATE CASCADE
     ON DELETE CASCADE,
     PRIMARY KEY(user_id, server_id),
     ..);

    Throw out everything, find a data modeler and start over. Asking for the best way to capture IDENTITY is like asking the best kind of rocks to pound screws into fine furniture. 



     

    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Tuesday, July 30, 2013 4:37 PM
  • this is a dev environment, when i posted that i realized it would lead to fragmentation and is bad programming. I just wanted to visit every option i could think of before i altered my code.
    Wednesday, July 31, 2013 10:31 AM
  • wow i did not realize i could get scrutinized for putting a bare bones example. Those are not my real table or my column names, i was just pointing out the issue without diving into the specifics of my database structure . Your example is exactly my data model i just did not include the third table because it is not involved in my insert or my update (I still use INT IDENTITYT(1,1) as my PK, i do not really see the issue if i can maintain it).

    Tom/Kalman, thank you for undestanding my issue and capturing the IDENTITY value for me. I implemented your solution and it is working.


    • Edited by Jay_Michael Wednesday, July 31, 2013 11:41 AM
    Wednesday, July 31, 2013 11:09 AM
  • Don't let Joe get to you.  He often does this sort of reply to many people.  You can go to

    http://social.msdn.microsoft.com/Profile/--celko--/activity

    and click on some of his recent replies.  You will see that you are not alone.  And decide for yourself how much you want to pay attention to this sort of reply from Joe.

    Tom

    Wednesday, July 31, 2013 1:44 PM
  • >> Your example is exactly my data model I just did not include the third table because it is not involved in my insert or my update (I still use INTEGER IDENTITY(1,1) as my PK, I do not really see the issue if I can maintain it). <<

    Yes, the third table is important; it holds the relationship that you are trying to model. What you are doing is mimicking a pointer chain with a proprietary physical locator. RDBMS uses references, not pointers (first week of your RDBMS class in school? Remember?). It is local to one and only one machine that is using one and only one SQL product. Worse, it has no validation or verification possible. 

    Consider your automobile. It has a VIN (http://en.wikipedia.org/wiki/VIN ) that you use in your insurance forms, licenses, warranties, etc. This industry standard value can be validated (i.e. has the correct format) with a regular expression. It can be verified with CarFax, the DVM, the manufacturer, or read it off the engine block.  

    Consider using your INTEGER IDENTITY(1,1) as the PK for your automobile. How do you convince your insurance company that your car is properly identified by (physical insertion count on my machine = 42)? The Department of Motor Vehicles will be less kind :) You have no validation (any integer will do) and no verification. You have no data integrity. 

    In fact the lack of verification is worse. When you insert a set of rows, the sequence of IDENTITY values is unpredictable. Change and index, and you get a different ordering. Share the table with another session, and you get a different ordering depending on the locking. Or my favorite; delete a row, then re-insert the same row as a single transaction. Did the VIN stay the same? Yes. Did your IDENTITY value stay the same? No. Did the automobile change? Nope; therefore its attributes sodu not change. In relational algebra and logic, this would be a no-op.  

    Just like pointer chains in the old 1970's network databases! Lazy, non-RDBMS programmers use these fake pointer chains (IDENTITY, GUIDs, uniqueidentifier, etc.) because they are familiar. We know pointers!  They are quick and dirty – no need to be a professional programmer who researches ANSI, ISO, or other industry standards.  Just write IDENTITY (1,1), build pointer chains with kludges you got here and worry about data integrity after you have left the project!~ 



    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Wednesday, July 31, 2013 4:15 PM