2003 SBS Locks Up: Timeout transaction response NtFrs service 7011 Issue

    Question

  • Hi all, I am hoping someone can shed some light on this strange issue I have been having.

    I have restored a backup to almost identical hardware (a test server) with 2 GB less RAM, one fewer network card, and different hard drives. Since the restore, the server keeps locking up, and the error logged is:

    Event Type: Error
    Event Source: Service Control Manager
    Event Category: None
    Event ID: 7011
    Date: 11/01/2011
    Time: 5:25:50 AM
    User: N/A
    Computer: SBS
    Description:
    Timeout (30000 milliseconds) waiting for a transaction response from
    the NtFrs service.

    I have researched this issue, and most people seem to get a lockup of only about 60 seconds. I have to power cycle the server, so my issue seems very different.

    Production Server:
    HP ML115 G1
    4gb ram
    Adaptec 1220SA RAID card
    2 Network Cards
    Samsung 500 GB drive (originally two in RAID1, but the secondary died and the replacement caused no end of issues; that is a different story, all I need to say is software RAID).

    Test Server:
    HP ML115 G1
    2gb ram
    Adaptec 1220SA RAID Card
    1 Network Card
    2x Western Digital 750 GB drives in RAID1, partitioned exactly as the 500 GB drive, with the remainder left unallocated.

    I restored from backup and left the RAID verifying overnight. This morning the RAID1 had finished verifying and the system locked up sometime afterwards. This could be purely a coincidence, but I remember seeing a few lockups before when restoring to the same test server.

    I have nothing connected to the network card and have tried stopping the File Replication Service, but it still locks up. I have yet to try changing the Autodisconnect value to 0xfffff, but I’m sceptical this will solve it, as my server never recovers from the lockup, unlike in other people’s experiences.
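Before changing anything, the current setting can be read without modifying it. The following is a minimal read-only sketch in VBScript. It assumes the value meant here is `autodisconnect` under the LanmanServer `parameters` key, which is an assumption and not something stated in the post.

```vbscript
Option Explicit
Const HKEY_LOCAL_MACHINE = &H80000002
' Assumed location of the Autodisconnect value (not confirmed by the post)
Const PARAM_KEY = "SYSTEM\CurrentControlSet\Services\lanmanserver\parameters"

Dim objReg, intValue, rc

' Bind to the WMI registry provider on the local machine
Set objReg = GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\default:StdRegProv")

' Read only; nothing is written to the registry
rc = objReg.GetDWORDValue(HKEY_LOCAL_MACHINE, PARAM_KEY, "autodisconnect", intValue)

If rc = 0 And Not IsNull(intValue) Then
  WScript.Echo "autodisconnect = " & intValue & " (as returned by WMI)"
Else
  WScript.Echo "autodisconnect not set or unreadable (rc=" & rc & ")"
End If
```

Run it with `cscript //nologo` so the output goes to the console instead of a message box.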

    I was wondering whether anyone knows of possible causes, or has experience of this problem, given the scenario?

    My thoughts at this time are:

    Are the different hard disks causing an issue? Is a different RAID card BIOS version causing it? The missing network card or the 2 GB less memory? NOD32 AV?

    I will do a clean boot in the meantime to see the results.

    Perhaps there are tools that could help get to the bottom of this.

    Gary

    Wednesday, January 12, 2011 2:21 PM

All replies

  • I've been fighting this on two different servers over the last couple of weeks, including one a couple of days ago where it shut down work on the server altogether. I have a Visual Basic script from Microsoft that addresses a problem the NtFrs service was having with the SYSVOL share. Running the script essentially recreated the Active Directory values for the SYSVOL share, which stopped the server freezes. On one of the servers I am still getting the 7011 timeout message, but at least the freezing problem is gone.

    Check the File Replication Service log in the Event Viewer to see if you are getting errors there as well (Event IDs 13552 and 13555). If so, I can post the text of the .vbs script here for you to copy and save to a file. The Microsoft tech I spoke with said that I could use it on another SBS 2003 server, as it is an in-house script that they use to do what used to take them over an hour with the ADSI Editor tool (I still took a backup of the server system state, as I tend to be paranoid that way). The MS tech simply made a "Sysvol backup" folder on my server desktop and copied the "policies" and "scripts" folders into it from the \\ds-server\SYSVOL\"InternalDomainName" folder, then ran the script.
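The check for Event IDs 13552 and 13555 can also be done from a script rather than by scrolling through the Event Viewer. This is a rough sketch using the WMI `Win32_NTLogEvent` class. The log name `File Replication Service` is an assumption based on how the log appears in the Event Viewer, so adjust it if the query returns nothing.

```vbscript
Option Explicit
Dim objWMI, colEvents, objEvent, strQuery

' Connect to WMI on the local machine
Set objWMI = GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\cimv2")

' Log name is an assumption; the event IDs are the ones named in the post
strQuery = "Select * From Win32_NTLogEvent " & _
           "Where Logfile = 'File Replication Service' " & _
           "And (EventCode = 13552 Or EventCode = 13555)"
Set colEvents = objWMI.ExecQuery(strQuery)

If colEvents.Count = 0 Then
  WScript.Echo "No 13552/13555 events found"
Else
  For Each objEvent In colEvents
    ' TimeGenerated is in WMI datetime format (yyyymmddHHMMSS...)
    WScript.Echo objEvent.TimeGenerated & "  ID " & objEvent.EventCode & "  " & objEvent.SourceName
  Next
End If
```

The query only reads the log, so it should be safe to run on a production server.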

    The server freezing stopped as soon as the script finished (on both servers). I'd also noticed that the freezing had made the backups take hours longer. On one server, the backup job that started at 8:00 PM and had been taking until around 5 AM now completes around 1:30 AM. That tells me how much the overall server performance was being affected. The other server's backup dropped from almost 11 hours to 3.5 hours, but this is the server that had the more severe freezing problem earlier this week.
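The "Sysvol backup" step described earlier in this reply could be scripted roughly as follows. This is only a sketch: `\\ds-server` and `InternalDomainName` are the placeholders from the post and must be replaced with the real server and internal domain names, and the folder is created on the desktop of whoever runs the script.

```vbscript
Option Explicit
' Placeholder path taken from the post; replace both parts before running
Const SYSVOL_DOMAIN = "\\ds-server\SYSVOL\InternalDomainName"

Dim objFSO, objShell, strBackup, strName
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objShell = CreateObject("WScript.Shell")

' Create the "Sysvol backup" folder on the current user's desktop
strBackup = objShell.SpecialFolders("Desktop") & "\Sysvol backup"
If Not objFSO.FolderExists(strBackup) Then objFSO.CreateFolder strBackup

' Copy the two folders named in the post, overwriting earlier copies
For Each strName In Array("policies", "scripts")
  If objFSO.FolderExists(SYSVOL_DOMAIN & "\" & strName) Then
    objFSO.CopyFolder SYSVOL_DOMAIN & "\" & strName, strBackup & "\" & strName, True
    WScript.Echo "Copied " & strName
  Else
    WScript.Echo "Not found: " & SYSVOL_DOMAIN & "\" & strName
  End If
Next
```

This copies files only. It does not replace the system state backup mentioned above.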

    Let me know if you have the event errors above in addition to the Event 7011 messages.

    Friday, January 14, 2011 5:14 PM
  • Hi Tom, thanks for the assistance.

    An update so far: I ran a clean boot and all seemed OK; I left the server for about 1.5 hours without lockups. I then disabled a few startup services and rebooted just to test, and this too seemed OK; I left it running for 1 hour without any lockups. I re-enabled all the startup services as they were originally and it seemed fine; I left it running for 2.5 hours without lockups. I started the server the following morning and it locked up after just 20 minutes. Completely stumped.

    Since then I have formatted the boot partition (C), left the data partition (E) intact, reinstalled the OS from the first disk, and performed a full restore with NTBackup. It took 10 hours to restore, but as it's a test server, downtime isn't an issue (until I do it on the production server at the weekend, that is). I had the server running for about 4 hours yesterday after the restore and it didn't lock up at all.

    Unfortunately, I had already performed the restore before reading your post, so I cannot be certain there were errors in the File Replication Service log. If possible, I would be very grateful if you could post the VBS script. This could well be the key to the intermittent issue I have been having.

    I will run the server again today and see if I have any lockups.

    Tuesday, January 18, 2011 1:58 PM
  • Hi ThomasE2E

    I know this is an old post I am responding to, but could you post the script please?

    I am having a problem whereby an SBS 2003 server freezes up and we have to do a hard reset to get it going again.

    We have tried everything we can think of but nothing yet.

    I did find an error in the event log (shown above) and it seems like my situation.

    Please can you post the text of the vb script.

    Thanks

    Tuesday, February 14, 2012 8:36 AM
  • I don't know if you ever got this fixed, but here is the text of the Visual Basic script.

    "recreateSysvolADobjsPri.vbs" is the name of the script.
    __________________________________________________________________________________________________________
    Const HKEY_LOCAL_MACHINE = &H80000002
    On Error Resume Next

    WScript.Echo "Stopping the File Replication Service"
    Set objWMINameSpace = GetObject("winmgmts:{impersonationLevel=impersonate}//LOCALHOST/root/CIMV2")
    Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
    If LCase(srvcInstance.state) = "running" Then Set srvcOutParam = srvcInstance.ExecMethod_("stopservice")
    If srvcOutParam.ReturnValue <> 0 Then Wscript.Echo "The NTFRS service failed to stop with a return value of: " & srvcOutParam.ReturnValue

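    ' Wait for the service to settle. Note: Win32_Service.Status reports health
    ' (usually "OK"), not run state, so this loop may exit on its first pass.
    Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
    For k = 1 to 150
      Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
      If LCase(srvcInstance.status) = "ok" Then Exit For
      wscript.sleep (k * 100)
    Next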

    Set WSHShell = CreateObject("WScript.Shell")

    Set objRegistry = GetObject("winmgmts:{impersonationLevel=impersonate}//LOCALHOST/root/default:StdRegProv")
    Set objMethod = objRegistry.Methods_("ENUMKEY")
    Set objInParam = objMethod.inParameters.SpawnInstance_()
    objInParam.hDefKey = HKEY_LOCAL_MACHINE
    objInParam.sSubKeyName = "SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Replica Sets"
    Set objOutParam = objRegistry.ExecMethod_("ENUMKEY", objInParam)
    Set objSubKeys = objOutParam.Properties_("sNames")

    For i = 0 To UBound(objSubKeys)
      intTomb = 1
      strSetName = ""
      intTomb = WSHShell.RegRead("HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Replica Sets\" & objSubKeys.Value(i) & "\Replica Set Tombstoned")
      strSetName = WSHShell.RegRead("HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Replica Sets\" & objSubKeys.Value(i) & "\Replica Set Name")
      If intTomb = 0 AND strSetName = "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)" Then
        DelADobj(objSubKeys.Value(i))
        Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
        If LCase(srvcInstance.state) = "stopped" Then Set srvcOutParam = srvcInstance.ExecMethod_("startservice")
        If srvcOutParam.ReturnValue <> 0 Then Wscript.Echo "The NTFRS service failed to start with a return value of: " & srvcOutParam.ReturnValue
        Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
        For k = 1 to 150
          Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
          If LCase(srvcInstance.state) = "running" Then Exit For
          wscript.sleep (k * 100)
        Next
        For j = 1 to 60
          If WSHShell.RegRead("HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Replica Sets\" & objSubKeys.Value(i) & "\Replica Set Tombstoned") = 1 Then Exit For
          wscript.sleep (j * 500)
        Next
        Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
        If LCase(srvcInstance.state) = "running" Then Set srvcOutParam = srvcInstance.ExecMethod_("stopservice")
        If srvcOutParam.ReturnValue <> 0 Then Wscript.Echo "The NTFRS service failed to stop with a return value of: " & srvcOutParam.ReturnValue
        Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
        For k = 1 to 150
          Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
          If LCase(srvcInstance.status) = "ok" Then Exit For
          wscript.sleep (k * 100)
        Next
      End If
    Next

    Set ADInfo = CreateObject("AdSystemInfo")
    strDNSDomain = ADInfo.DomainDNSName
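    strSysvol = WshShell.RegRead("HKLM\System\CurrentControlSet\Services\Netlogon\Parameters\Sysvol")
    ' The Netlogon Sysvol value normally ends in "\sysvol"; drop those 7 characters to get the SYSVOL root
    strSysvol = Left(strSysvol, Len(strSysvol)-7)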
    strSysvolRoot = strSysvol & "\domain"
    strSysvolStage = strSysvol & "\staging\domain"
    intPrimary = 1

    WshShell.RegWrite "HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\SysVol\" & strDNSDomain & "\Replica Set Command", "Create", "REG_SZ"
    WshShell.RegWrite "HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\SysVol\" & strDNSDomain & "\Replica Set Name", strDNSDomain, "REG_SZ"
    WshShell.RegWrite "HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\SysVol\" & strDNSDomain & "\Replica Set Type", "Domain", "REG_SZ"
    WshShell.RegWrite "HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\SysVol\" & strDNSDomain & "\Replica Set Primary", intPrimary, "REG_DWORD"
    WshShell.RegWrite "HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\SysVol\" & strDNSDomain & "\Replica Set Root", strSysvolRoot, "REG_SZ"
    WshShell.RegWrite "HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\SysVol\" & strDNSDomain & "\Replica Set Stage", strSysvolStage, "REG_SZ"

    WshShell.RegWrite "HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\SysVol\SysVol Information is Committed", 1, "REG_DWORD"

    wscript.echo "Starting FRS service"
    Set srvcInstance = objWMINameSpace.get("Win32_Service='NTFRS'")
    Set srvcOutParam = srvcInstance.ExecMethod_("startservice")
    If srvcOutParam.ReturnValue <> 0 Then Wscript.Echo "The NTFRS service failed to start with a return value of: " & srvcOutParam.ReturnValue
    wscript.echo "Script has completed"

    Sub DelADobj(strGUID)
      If Err.Number <> 0 Then Err.Clear
      Set objADRepSet = GetObject("LDAP://LOCALHOST/<GUID=" & Left(strGUID,23) & "-" & Right(strGUID, 12) & ">")
      If Err.Number <> 0 Then
        WScript.Echo "Could not bind to GUID=" & Left(strGUID,23) & "-" & Right(strGUID, 12)
        WScript.Echo "Failed with error: " & Err.Number
        WScript.Echo "Not deleting any AD Objects"
        Err.Clear
        Exit Sub
      End If
      Set objADRepSet = GetObject("LDAP://LOCALHOST/" & objADRepSet.distinguishedName)
      If Err.Number <> 0 Then
        WScript.Echo "Could not bind to nTFRSMember object with DN= " & objADRepSet.distinguishedName
        WScript.Echo "Failed with error: " & Err.Number
        WScript.Echo "Not deleting any AD Objects"
        Err.Clear
        Exit Sub
      End If
      If objADRepSet.fRSMemberReferenceBL <> "" Then
        Set objADSubscriber = GetObject("LDAP://LOCALHOST/" & objADRepSet.fRSMemberReferenceBL)
        If Err.Number = 0 Then
          objADSubscriber.DeleteObject(0)
        Else
          WScript.Echo "Could not bind to nTFRSSubscriber object with DN= " & objADSubscriber.distinguishedName
          WScript.Echo "Failed with error: " & Err.Number
          WScript.Echo "Not deleting nTFRSSubscriber object from AD"
          Err.Clear
        End If
      Else
        WScript.Echo "No member reference to nTFRSSubscriber object"
      End If
      If Err.Number <> 0 Then
        WScript.Echo "Failed to delete nTFRSSubscriber object with error: " & Err.Number
        Err.Clear
      End If
      objADRepSet.DeleteObject(0)
      IF Err.Number <> 0 Then
        WScript.Echo "Failed to delete nTFRSMember object with error: " & Err.Number
        Err.Clear
      End If

    End Sub
    __________________________________________________________________________________________________________

    Thursday, April 26, 2012 4:11 PM
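Because the script above deletes Active Directory objects for any replica set that is named "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)" and not tombstoned, it may be worth listing what it would act on first. The following is a read-only sketch that walks the same registry key and prints the same two values the script inspects. It makes no changes, and it is an illustration, not part of the Microsoft script.

```vbscript
Option Explicit
Const HKEY_LOCAL_MACHINE = &H80000002
' Same key the posted script enumerates
Const REPLICA_KEY = "SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Replica Sets"

Dim objReg, arrSubKeys, strSubKey, strName, intTomb, rc

' Bind to the WMI registry provider on the local machine
Set objReg = GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\default:StdRegProv")

rc = objReg.EnumKey(HKEY_LOCAL_MACHINE, REPLICA_KEY, arrSubKeys)
If rc <> 0 Or IsNull(arrSubKeys) Then
  WScript.Echo "No replica sets found (EnumKey returned " & rc & ")"
  WScript.Quit 1
End If

' Print GUID, set name, and tombstone flag for each replica set
For Each strSubKey In arrSubKeys
  strName = "" : intTomb = Null
  objReg.GetStringValue HKEY_LOCAL_MACHINE, REPLICA_KEY & "\" & strSubKey, "Replica Set Name", strName
  objReg.GetDWORDValue HKEY_LOCAL_MACHINE, REPLICA_KEY & "\" & strSubKey, "Replica Set Tombstoned", intTomb
  WScript.Echo strSubKey & " | " & strName & " | tombstoned=" & intTomb
Next
```

A line showing the SYSVOL set name with `tombstoned=0` is the case in which the posted script would call `DelADobj`.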