Deadlock issue in SQL Server 2008 R2 (.Net 2.0 Application)

Unanswered Deadlock issue in SQL Server 2008 R2 (.Net 2.0 Application)

  • Thursday, September 16, 2010 11:20 AM
     
      Has Code

    The Sql Server 2008 R2 instance in question is a heavy load OLTP production server. The deadlock issue came up a few days back and is still unresolved. We received the Xml deadlock report that listed the stored procedures involved in the deadlock and some other details. I'll try to list down the facts from this xml first:

    Two stored procedures are involved in the deadlock, say SP1 and SP2. According to the report SP1 was running in Isolation level "Serializable" and SP2 was running in "ReadCommitted" .

    We have investigated the following:

    • Are we setting IsolationLevel of SP1 to "Serializable" inside SP or in Code? - No.

    • Is any other SP whose IsolationLevel is "Serializable" calling SP1? - No.

    • Are the table used by SP1 called by any other SP that has Isolation Level as "Serializable"? - Yes. There are SPs that have Isolation Level set to "Serializable" and access the same tables as SP1, but we don't know whether they were running at the time of deadlock or not as the deadlock
      report only showed SP1 and SP2.

    Lines of thought:
    We have considered the following possible causes:

    • Deadlock is occurring because SP1 is running as "Serializable". - Why is this SP running in Serializable when I haven't set it? Is the Isolation level escalating (like locks do)? If we figure this out and make it run as ReadCommitted, will the issue be resolved?

    • Any other SP is running, locking the table used by SP1 and causes a deadlock between SP1 and SP2. - Wouldn't this SP be listed in the deadlock report? Can the deadlock report miss such a dependency? If yes then we might only be getting partial information. This still doesn't resolve how SP1 is running in Serializable, though.

    Suggestions:

    • If this information is not sufficient in resolving the problem, how can I obtain more information from SQL Server for my purpose and what information should I try to collect?

    • Any other Line of Thought that you'd pursue in solving this issue?


    Update: 
    This is the trace log information for the deadlock. I've changed the names of SPs etc. but have checked and verified that the changes don't miss out any relevant information. Check the notes succeeding the code for more info on tables etc.
    ?<EVENT_INSTANCE>
     <EventType>DEADLOCK_GRAPH</EventType>
     <PostTime>2010-09-07T11:27:47.870</PostTime>
     <SPID>16</SPID>
     <TextData>
      <deadlock-list>
       <deadlock victim="process5827708">
        <process-list>
         <process id="process5827708" taskpriority="0" logused="0" waitresource="KEY: 7:72057594228441088 (8d008a861f4f)"
              waittime="5190" ownerId="1661518243" transactionname="SELECT" lasttranstarted="2010-09-07T11:27:42.657"
              XDES="0x80bf3b50" lockMode="RangeS-S" schedulerid="4" kpid="2228" status="suspended" spid="76" sbid="0"
              ecid="0" priority="0" trancount="0" lastbatchstarted="2010-09-07T11:27:42.657"
              lastbatchcompleted="2010-09-07T11:27:42.657" clientapp=".Net SqlClient Data Provider"
              hostname="xxx" hostpid="5988" loginname="xxx" isolationlevel="serializable (4)"
              xactid="1661518243" currentdb="7" lockTimeout="4294967295" clientoption1="673185824" clientoption2="128056">
          <executionStack>
           <frame procname="SP1" line="12" stmtstart="450" stmtend="6536"
               sqlhandle="0x0300070090cbdc7742720c00e99d00000100000000000000">
            Select ... from Table1, Table2, Table4, Table5
           </frame>
          </executionStack>
          <inputbuf>
           Proc [Database Id = 7 Object Id = 2010958736]
          </inputbuf>
         </process>
         <process id="process5844bc8" taskpriority="0" logused="1873648" waitresource="KEY: 7:72057594228441088 (0e00ce038ed0)"
              waittime="4514" ownerId="1661509575" transactionname="user_transaction" lasttranstarted="2010-09-07T11:27:40.423"
              XDES="0x37979ae90" lockMode="X" schedulerid="7" kpid="3260" status="suspended" spid="104" sbid="0" ecid="0"
              priority="0" trancount="2" lastbatchstarted="2010-09-07T11:27:43.350" lastbatchcompleted="2010-09-07T11:27:43.350"
              clientapp=".Net SqlClient Data Provider" hostname="xxx" hostpid="5988" loginname="xxx"
              isolationlevel="read committed (2)" xactid="1661509575" currentdb="7" lockTimeout="4294967295"
              clientoption1="673185824" clientoption2="128056">
          <executionStack>
           <frame procname="SP2" line="68" stmtstart="5272" stmtend="5598"
               sqlhandle="0x030007003432350f109a0c00e99d00000100000000000000">
            UPDATE Table1 ...
           </frame>
          </executionStack>
          <inputbuf>
           Proc [Database Id = 7 Object Id = 255144500]
          </inputbuf>
         </process>
        </process-list>
        <resource-list>
         <keylock hobtid="72057594228441088" dbid="7" objectname="Table1" indexname="Index1"
              id="lock448e2c580" mode="X" associatedObjectId="72057594228441088">
          <owner-list>
           <owner id="process5844bc8" mode="X" />
          </owner-list>
          <waiter-list>
           <waiter id="process5827708" mode="RangeS-S" requestType="wait" />
          </waiter-list>
         </keylock>
         <keylock hobtid="72057594228441088" dbid="7" objectname="Table1" indexname="Index1"
              id="lock2ba335880" mode="RangeS-S" associatedObjectId="72057594228441088">
          <owner-list>
           <owner id="process5827708" mode="RangeS-S" />
          </owner-list>
          <waiter-list>
           <waiter id="process5844bc8" mode="X" requestType="wait" />
          </waiter-list>
         </keylock>
        </resource-list>
       </deadlock>
      </deadlock-list>
     </TextData>
     <TransactionID />
     <LoginName>xx</LoginName>
     <StartTime>2010-09-07T11:27:47.867</StartTime>
     <ServerName>xxx</ServerName>
     <LoginSid>xxx</LoginSid>
     <EventSequence>116538375</EventSequence>
     <IsSystem>1</IsSystem>
     <SessionLoginName />
    </EVENT_INSTANCE>



    SP1 is performing a select that takes data from 5 different tables (Table1 to Table5) (uses inner query etc.) SP2 performs an update on Table1. An interesting thing is one of the columns that SP2 updates is a foreign key field in Table1 and primary key of Table2 while both Table1 and Table2 are part of the select statement of SP1, not sure this is relevant but didn't want to miss out anything.

    NOTE: indexname="Index1" (in deadlock graph above) -- Index1 is on the same column that is foreign key in Table1 and primary key of Table2.

All Replies

  • Thursday, September 16, 2010 1:43 PM
     
     

    Siddhart ,

    Serializable is the most strict Isolation level .You cant do anything with the table even if there is a select under this isolation level on that table.You cant even do an insert even if the rows you are inserting does not come in the range of select under Serializable ...

    So the chances of deadlock not happening are very high if we can manager to bring this transaction Isolation level to Read committed ...

    I reallu found it strane that both the SPs are User SPs and neither the application nor the user connection is marking that transaction as serializable .You have mentioned that there are SPs that run under Isolation level serializable ..How its set on them ....

    Please set trace flag 1222 to get the complete deadlock graph in the errorlog ...You can also get the deadlock graph using X-Events or WMI ....

    If you wish , you can send me a mail and I can send you the scrips and we can pick it up from there onwards .my email is abhay_c@hotmail.com .I might be a bit late in replying you as I am a bit occupied .But will try my best to reply you soon ....

    Else you can enable the trace flag and either copy the grraph here OR upload it somewhere fo everyone to have a look OR mail me ..

    Regards

     


    Abhay Chaudhary OCP 9i, MCTS/MCITP (SQL Server 2005, 2008, 2005 BI) ms-abhay.blogspot.com/
  • Thursday, September 16, 2010 2:05 PM
     
     

    As per BOL

    Transaction isolation levels can be set using Transact-SQL or through a database API.

    Transact-SQL
    Transact-SQL scripts use the SET TRANSACTION ISOLATION LEVEL statement.

    ADO
    ADO applications set the IsolationLevel property of the Connection object to adXactReadUncommitted, adXactReadCommitted, adXactRepeatableRead, or adXactReadSerializable.

    ADO.NET
    ADO.NET applications using the System.Data.SqlClient managed namespace can call the SqlConnection.BeginTransaction method and set the IsolationLevel option to Unspecified, Chaos, ReadUncommitted, ReadCommitted, RepeatableRead, Serializable, and Snapshot.

    OLE DB
    When starting a transaction, applications using OLE DB call ITransactionLocal::StartTransaction with isoLevel set to ISOLATIONLEVEL_READUNCOMMITTED, ISOLATIONLEVEL_READCOMMITTED, ISOLATIONLEVEL_REPEATABLEREAD, ISOLATIONLEVEL_SNAPSHOT, or ISOLATIONLEVEL_SERIALIZABLE.

    When specifying the transaction isolation level in autocommit mode, OLE DB applications can set the DBPROPSET_SESSION property DBPROP_SESS_AUTOCOMMITISOLEVELS to DBPROPVAL_TI_CHAOS, DBPROPVAL_TI_READUNCOMMITTED, DBPROPVAL_TI_BROWSE, DBPROPVAL_TI_CURSORSTABILITY, DBPROPVAL_TI_READCOMMITTED, DBPROPVAL_TI_REPEATABLEREAD, DBPROPVAL_TI_SERIALIZABLE, DBPROPVAL_TI_ISOLATED, or DBPROPVAL_TI_SNAPSHOT.

    ODBC
    ODBC applications call SQLSetConnectAttr with Attribute set to SQL_ATTR_TXN_ISOLATION and ValuePtr set to SQL_TXN_READ_UNCOMMITTED, SQL_TXN_READ_COMMITTED, SQL_TXN_REPEATABLE_READ, or SQL_TXN_SERIALIZABLE.

    For snapshot transactions, applications call SQLSetConnectAttr with Attribute set to SQL_COPT_SS_TXN_ISOLATION and ValuePtr set to SQL_TXN_SS_SNAPSHOT. A snapshot transaction can be retrieved using either SQL_COPT_SS_TXN_ISOLATION or SQL_ATTR_TXN_ISOLATION.

    Please see if any of these fits in your situation ...


    Abhay Chaudhary OCP 9i, MCTS/MCITP (SQL Server 2005, 2008, 2005 BI) ms-abhay.blogspot.com/
  • Thursday, September 16, 2010 2:34 PM
    Moderator
     
     
    Post the complete deadlock graph into this thread so we can look at all of the information contained in it.  Without that all we can do is make blind guesses about what may or may not be the problem, and how to actually go about fixing it.
    Jonathan Kehayias
    http://sqlblog.com/blogs/jonathan_kehayias/
    http://www.twitter.com/SQLSarg
    http://www.sqlclr.net/
    Please click the Mark as Answer button if a post solves your problem!
  • Thursday, September 16, 2010 5:42 PM
     
     
    Have attached the deadlock graph. Let me know if you require any more information. I have modified the table names etc. due to obvious reasons but if needed I'll try to provide detailed schema information.
  • Friday, September 17, 2010 6:22 AM
     
     

    <owner id="process5844bc8" mode="X" />
    <waiter id="process5827708" mode="RangeS-S" requestType="wait" />

    <owner id="process5827708" mode="RangeS-S" />
    <waiter id="process5844bc8" mode="X" requestType="wait" />


    process5827708
    taskpriority="0" logused="0" waitresource="KEY: 7:72057594228441088 (8d008a861f4f)
    transactionname="SELECT"
    lasttranstarted="2010-09-07T11:27:42.657"
    lockMode="RangeS-S
    spid="76"
    ecid="0"
    priority="0"
    trancount="0"
    lastbatchstarted="2010-09-07T11:27:42.657"
    lastbatchcompleted="2010-09-07T11:27:42.657"
    isolationlevel="serializable (4)
    clientapp=".Net SqlClient Data Provider"
    Select ... from Table1, Table2, Table4,
    [Database Id = 7 Object Id = 2010958736]


    process5844bc8
    taskpriority="0" logused="1873648" waitresource="KEY: 7:72057594228441088 (0e00ce038ed0)
    lasttranstarted="2010-09-07T11:27:40.423
    lockMode="X"
    spid="104"
    ecid="0"
    trancount="2"
    lastbatchstarted="2010-09-07T11:27:43.350"
    lastbatchcompleted="2010-09-07T11:27:43.350
    clientapp=".Net SqlClient Data Provider"
    isolationlevel="read committed (2)
    currentdb="7"
    UPDATE Table1 ...
    Database Id = 7 Object Id = 255144500


    conclusion :

    -> Has this graph been produced after enabling traceflag 1222.
    -> looks like this graph is incomplete as I can see only victim information .But a fair deduction can be made since waiters and owners are avaliable.
     process5844bc8 took X lock on objectname="Table1" indexname="Index1" associatedObjectId="72057594228441088"> key is (0e00ce038ed0)
     process5827708 was waiting for mode="RangeS-S" lock not sure which key (thats a question and looks like this graph is incomplete)

     similarly process5827708 took RangeS-S lock on objectname="Table1" indexname="Index1 associatedObjectId="72057594228441088" key is 8d008a861f4f
     process5844bc8 was waiting on mode="X" but not sure which key (thats a question and looks like this graph is incomplete)

     

    -> while first batch is running 1 transaction the second SPID is running 2 .Not sure if both the transactions are going to be committed together or have some dependence on each other.
    -> Its the application which is setting the isolation levels
    -> The select is running under serializable.In read committed ,if the select this over even if the transaction has not committed the shared lock will go away and someone else can take a lock.
    -> out of victim graph I am deducing that locks are on different index keys .And if its actually true then converting the select to read committed Isolation level can solve this deadlock issue.
    -> I am hoping that the Index mentioned is least fragmented and select query is already optimised.

    OTHERS CAN GUIDE IN CASE I HAVE MISSED SOMETHING ...


    Abhay Chaudhary OCP 9i, MCTS/MCITP (SQL Server 2005, 2008, 2005 BI) ms-abhay.blogspot.com/
  • Friday, September 17, 2010 11:01 AM
     
     

    Thanks for the updates Abhay. Here's some more information on the issue:

    conclusion :

    -> Has this graph been produced after enabling traceflag 1222. -- Yes.
    -> looks like this graph is incomplete as I can see only victim information .But a fair deduction can be made since waiters and owners are avaliable.
     process5844bc8 took X lock on objectname="Table1" indexname="Index1" associatedObjectId="72057594228441088"> key is (0e00ce038ed0)
     process5827708 was waiting for mode="RangeS-S" lock not sure which key (thats a question and looks like this graph is incomplete)

     similarly process5827708 took RangeS-S lock on objectname="Table1" indexname="Index1 associatedObjectId="72057594228441088" key is 8d008a861f4f
     process5844bc8 was waiting on mode="X" but not sure which key (thats a question and looks like this graph is incomplete) -- I'll need to check from the DB team whether they've chopped off any of the contents from the trace flag (though, I don't think so). Unfortunately, right now this is the information I've to work on.

    -> while first batch is running 1 transaction the second SPID is running 2 .Not sure if both the transactions are going to be committed together or have some dependence on each other. -- I excluded the t-sql statements which had the dependency. The dependency is -- SP1 (running in Serializable fires a Select on Table1 (among other tables) and in the process obtains a RangeS-S lock on the index of its field, say OrganizationID. SP2 is attempting to update the OrganizationID in the same table and hence there's a deadlock. The resource causing the lock is Index1 in Table1 which is a bit confusing. According to the trace, it seems that SP1 has a chunk of this index and wants more that SP2 has an X lock on and SP2 in turn wants the chunk that is locked by SP1.


    -> Its the application which is setting the isolation levels -- Are you 100%??? This is the BIGGEST of my worries right now. I've checked everywhere in code and we're not setting SP1 to Serializable. From code this is coming as ReadCommitted. Is there any way SQL Server may influence the lock? I also found that we are pooling and reusing connections. Is it possible that an SP (say SP3) runs and sets the Isolation level to Serializable (inside itself); the connection this SP used is used by SP1 and hence it's running in Serializable??? I tried a mock up but the Isolation Level set inside an SP stayed inside that SP only and didn't affect other SPs running on that connection, but I want to be sure this is not what's happening. -- Again. This is the MAJOR WORRY that I have. If we can solve this then other things will fall into place.


    -> The select is running under serializable.In read committed ,if the select this over even if the transaction has not committed the shared lock will go away and someone else can take a lock. -- Yeah. Then if SP1 runs in ReadCommitted, it's problem solved. I can forcefully declare a Set Isolation Level statement but I'm worried how is SP1 running in Serializable in the first place.


    -> out of victim graph I am deducing that locks are on different index keys .And if its actually true then converting the select to read committed Isolation level can solve this deadlock issue. -- Affirmative.


    -> I am hoping that the Index mentioned is least fragmented and select query is already optimised. -- Select query has a lot of joins, if you want I'll put it on the board too but I doubt we'll go anywhere important with that because it'll still be running in Serializable which causes locks on so many tables and won't release until SP1 finishes.


    Let me know if you need more/specific information.

  • Friday, September 17, 2010 11:31 AM
     
     

    I have never heard of Isolation level escalation .If thats not true , then its coming from application ..can you check one of the comments I gave from BOL ...ITS THE SECOND COMMENT  and cross check once again if your application is not setting it to serializable.....you also mentioned initially that there are few queries where Serilazible is set .So I am thinking (sorry for repeatedly saying this ) there is somewhere in code or might be you are setting one transaction to Serializable but by mistake its being used by another transacton as well (someting like that.I am not able to express it correctly as I do not know about how its set at application leel ....something like if in query analyzer we dont give GO everything is considered as one batch)..


    Abhay Chaudhary OCP 9i, MCTS/MCITP (SQL Server 2005, 2008, 2005 BI) ms-abhay.blogspot.com/
  • Friday, September 17, 2010 11:49 AM
     
     

    I have read that in DB2 When a requested isolation level is not supported by an application server, the isolation level is escalated to the next highest supported isolation level ...I am still seeing if that is true in SQL Server or if there are other reasons to do that ...

    I also found this for 2000 : http://support.microsoft.com/kb/215520 but I think its not applicable ..


    Abhay Chaudhary OCP 9i, MCTS/MCITP (SQL Server 2005, 2008, 2005 BI) ms-abhay.blogspot.com/
  • Friday, September 17, 2010 11:52 AM
     
     

    @Abhay

    I'm checking that very thing right now. The code internally uses a Helper class that sets the IsolationLevel to ReadCommitted and this SP is using that Helper class. So far, I've taken it on faith that the Helper class is doing its job, because the application has a huge number of SPs and we've not faced any deadlocks anywhere.

    A couple of possible leads that I'm following are:

    1. A different SP (not SP1 or SP2 we talked about) is running in Serializable mode. The connection used by this SP goes to the connection pool. The Helper class picks this connection (with Serializable) and calls SP1 (which has no Isolation Level set) with Serializable mode. This is my main lead right now.
    2. SP3 (the third SP which is running as Serializable) works on the same resource that SP1 and SP2 fight for. I've no idea how to correlate to the main problem but still don't want to discount this.
    3. There are other SPs that are using the Helper class but in DB they're running as Serializable, which is weird . This is similar to what is happening to SP1. One common thing among these SPs and SP1 is that all of them are using SqlDataReader. I've a sneaky feeling that somehow under the hood SQL Server might be tweaking something to accommodate the data reader and changing the Isolation level. This might be the missing piece in the jigsaw and after exhausting 1. I'll try to investigate this.
    I'll try to share if I have any new leads.
  • Sunday, October 03, 2010 4:32 PM
     
     

    Sidharth,

     

    Another possibility would be that developers are using in certain cases ADO.NET TransactionScope class for working with the transactions? TransactionScope default isolation level is serializable. I'm as you puzzled these days with a very similar massive locking problem of our app, until in traces I began to notice a set trans isolation level serializable..developers think they only use the read commited default, but there seem to be a need to research a bit more. I found informative the link at http://www.devx.com/codemag/Article/31666/0/page/4

    Thanks for the tip (btw) to double check the xml report of the deadlock graphs, I've fogot the isolation level is a field of the rep which is way useful.

     

  • Thursday, May 03, 2012 1:58 AM
     
     

    Hello Sidharth,

    Do you solve the problem?

    I have the similar issue.

    Thanks,

    Abby


    Get reply from social.microsoft

  • Thursday, May 03, 2012 3:58 AM
    Moderator
     
     

    Hello Sidharth,

    Do you solve the problem?

    I have the similar issue.

    Thanks,

    Abby


    Get reply from social.microsoft

    If you are having a deadlock issue, it is best to start a new thread and post your deadlock graph into that thread so that it can be reviewed. There are so many different scenarios that lead to deadlocks that it is unlikely that you are hitting this exact same problem.  Specifically the deadlock in this case is being caused by a Serializable transaction using Range shared locks during a select which is a specific type of deadlock with specific types of concurrency issues that don't generally occur unless you are changing the isolation level to serializable.

    Jonathan Kehayias | Principal Consultant, SQLSkills.com
    SQL Server MVP | Microsoft Certified Master: SQL Server 2008
    Author of Troubleshooting SQL Server: A Guide for Accidental DBAs
    Feel free to contact me through My Blog or Twitter. Become a SQLskills Insider!
    Please click the Mark as Answer button if a post solves your problem!