Server crash when accessed by DPM client

    Question

  • The server froze (didn't respond to Ctrl-Alt-Del) when I tried accessing the Restore facility from an XP SP3 client, and also from a Win7 client.

    I clicked on the recovery point, it said Opening. . . and the server froze. This has happened twice.

     

    Server: DPM 2010 on W2008R2 x64, client XP Pro SP3 & Win 7 Prof, backups are reported to have been done OK.

    Nothing in the event logs on the server, apart from

    Timeout (30000 milliseconds) waiting for a transaction response from the ShellHWDetection service.

    Timeout (30000 milliseconds) waiting for a transaction response from the LanmanServer service.

    both of which I suspect may be side-effects of the freeze.

    • Moved by Praveen D [MSFT]Owner Tuesday, January 04, 2011 12:27 PM Moving to DPM laptop Protection (From:DPM Setup and Agent management)
    Monday, November 29, 2010 12:15 AM

All replies

  • Okay, as I understand it, the problem you are having is this:

    DPM 2010 installed on Win2008 R2

    Client protection group with (at least) two members, 1) XP Pro 2) Win7 Pro

    On the clients, you can open the DPM tool, connect to the server fine, and get a list of recovery points.  When you click on a recovery point to expand it, it says "Opening", at which point the server freezes.

    Assuming that is correct, then can you answer the following questions:

    1. How long did you let the DPM server sit before forcing it to reboot?

    2. Did you try to ping it?  Could you connect to any shares?  Did the keyboard respond to CapsLock or NumLock (assuming it has a light indicator when they are on)?

    3. Are you able to do a recovery of any of these recovery points from the DPM console?


    Thanks, Chris Bu - MSFT This posting is provided "AS IS" with no warranties, and confers no rights
    Wednesday, December 15, 2010 4:44 PM
  • 1. About 5-10 minutes

    2. No. Didn't try. Didn't notice.

    3. Didn't try, but a recovery of a file went fine.

    I'll try what you've suggested, but should I try something else to help debug?

    I should note that the server freezes hard: no response to Ctrl-Alt-Del, and I have to hit the power button to get it working again.

    Many thanks!

    Wednesday, December 15, 2010 7:15 PM
  • Okay, I think we should test the following then:

    This is a little painful as we have to put it in the "hung" state to do some testing.

    First, you should test different users on different client machines to determine if this is specific to a user or to the computers you tested on.  If that testing shows that the server still hangs, then while it is hung, try to ping and possibly connect to a share on it.
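While the server is in the hung state, the "ping and possibly connect to a share" check can be complemented by a quick TCP probe from another machine. The following is only a minimal sketch: the port list (445 for SMB, 139 for NetBIOS session, 3389 for Remote Desktop) and the host name in the usage note are assumptions, not details from this thread.

```python
import socket
from typing import Dict, Iterable


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def probe_server(host: str,
                 ports: Iterable[int] = (445, 139, 3389),
                 timeout: float = 3.0) -> Dict[int, bool]:
    """Probe several TCP ports and report which ones accept connections."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

For example, `probe_server("dpm-server.example.local")` (a hypothetical name) returns a dictionary of port to reachability. A server that still answers ICMP ping but accepts no TCP connections would suggest the network stack is alive at the driver level while user-mode services are stuck.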

    Finally, open the DPM installation directory\temp folder on the DPM server and sort the logs by date.  Look for the logs that were updated at the date/time right around the time of the hang.  Open each log in Excel and search for the name of the client machine.  When you find it, filter on the TaskID for that entry and then see if it reflects any errors in that TaskID thread.
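As an alternative to filtering in Excel, the log search can be sketched in Python. The field layout below (process ID, thread ID, date, time, component number, source file and line, optional bracketed context, level, message) is inferred only from the log lines pasted later in this thread, so treat it as an assumption; the pasted lines show no explicit TaskID column, so this sketch filters on client name, time window, and level instead.

```python
import re
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple, Optional

# Layout inferred from the pasted lines, e.g.
# "137C 0390 12/19 23:35:01.065 33 exposeshadowcopy.cpp(44) WARNING Failed: ..."
LINE_RE = re.compile(
    r"^(?P<pid>[0-9A-F]{4}) (?P<tid>[0-9A-F]{4}) "
    r"(?P<stamp>\d\d/\d\d \d\d:\d\d:\d\d\.\d{3}) "
    r"(?P<component>\d\d) (?P<source>\S+\(\d+\)) "
    r"(?:\[(?P<context>[0-9A-F]+)\] )?"
    r"(?P<level>[A-Z]+) (?P<message>.*)$"
)
STAMP_FORMAT = "%m/%d %H:%M:%S.%f"  # the logs carry no year


class Entry(NamedTuple):
    pid: str
    tid: str
    when: datetime
    source: str
    level: str
    message: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse one log line; return None if it does not fit the assumed layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m["stamp"], STAMP_FORMAT)
    return Entry(m["pid"], m["tid"], when, m["source"], m["level"], m["message"])


def find_entries(lines: Iterable[str], client: str,
                 start: str, end: str) -> Iterator[Entry]:
    """Yield entries inside [start, end] that mention the client
    or have a level other than NORMAL/ACTIVITY."""
    lo = datetime.strptime(start, STAMP_FORMAT)
    hi = datetime.strptime(end, STAMP_FORMAT)
    for line in lines:
        entry = parse_line(line)
        if entry is None or not (lo <= entry.when <= hi):
            continue
        mentions_client = client.lower() in entry.message.lower()
        if mentions_client or entry.level not in ("NORMAL", "ACTIVITY"):
            yield entry
```

A typical call would pass the lines of one log file together with the client name and a window of a minute or two around the hang, for example `find_entries(open(path), "OWEN-WATSON", "12/19 23:34:00.000", "12/19 23:36:00.000")`, where `path` is whichever log file you are inspecting.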

    The other log files we can look at will be found on the client machine at: %USERPROFILE%\AppData\Roaming\Microsoft\Microsoft System Center Data Protection Manager 2010\

    You can post it here if it doesn't have any helpful information.


    Thanks, Chris Bu - MSFT This posting is provided "AS IS" with no warranties, and confers no rights
    Friday, December 17, 2010 12:09 PM
  • Hard freeze. Pinged OK, no shares mounting.
    Here's the logs. Win XP SP3 client.
    From MSDPMCurr.ErrLog
     
    137C 0390 12/19 23:34:21.778 07 agentmanagerservices.cpp(532) NORMAL CAgentManagerServices::SetClientDetails(). bstrClientName=OWEN-WATSON.rsnz.local, bstrClientOSVersion=5.1.2600, clientOSType=1, bstrClientIP=, bstrClientAgentVersion=3.0.7696.0
    137C 0390 12/19 23:34:21.778 07 AgentCommunication.cs(1796) NORMAL clientMachineName=OWEN-WATSON.rsnz.local, clientOSVersion=5.1.2600, clientOSType=1, clientIP=, clientAgentVersion=3.0.7696.0
    137C 0390 12/19 23:34:21.780 07 Deployment.cs(3579) NORMAL Checking if protection is setup for [OWEN-WATSON.rsnz.local] and found bProtectionFound as [True]
    137C 0390 12/19 23:34:21.780 07 Deployment.cs(2732) NORMAL Updating the agent record of a given type[RA] for server[OWEN-WATSON.rsnz.local] from table InstalledAgent.
    137C 0390 12/19 23:34:21.783 07 Deployment.cs(3579) NORMAL Checking if protection is setup for [OWEN-WATSON.rsnz.local] and found bProtectionFound as [True]
    137C 0390 12/19 23:34:21.783 07 Deployment.cs(3518) NORMAL updating client OS details for machine[OWEN-WATSON.rsnz.local] in table InstalledAgent
    137C 0390 12/19 23:35:00.116 07 agentmanagerservices.cpp(532) NORMAL CAgentManagerServices::SetClientDetails(). bstrClientName=OWEN-WATSON.rsnz.local, bstrClientOSVersion=5.1.2600, clientOSType=1, bstrClientIP=, bstrClientAgentVersion=3.0.7696.0
    137C 0390 12/19 23:35:00.116 07 AgentCommunication.cs(1796) NORMAL clientMachineName=OWEN-WATSON.rsnz.local, clientOSVersion=5.1.2600, clientOSType=1, clientIP=, clientAgentVersion=3.0.7696.0
    137C 0390 12/19 23:35:00.117 07 Deployment.cs(3579) NORMAL Checking if protection is setup for [OWEN-WATSON.rsnz.local] and found bProtectionFound as [True]
    137C 0390 12/19 23:35:00.117 07 Deployment.cs(2732) NORMAL Updating the agent record of a given type[RA] for server[OWEN-WATSON.rsnz.local] from table InstalledAgent.
    137C 0390 12/19 23:35:00.119 07 Deployment.cs(3579) NORMAL Checking if protection is setup for [OWEN-WATSON.rsnz.local] and found bProtectionFound as [True]
    137C 0390 12/19 23:35:00.119 07 Deployment.cs(3518) NORMAL updating client OS details for machine[OWEN-WATSON.rsnz.local] in table InstalledAgent
    137C 0390 12/19 23:35:00.982 17 intentservices.cpp(522) NORMAL CIntentServices::MountAndShareShadowCopy(). shadowCopyId={68CA125B-BDD0-4853-8766-4A01E69D3731}, datasourceId={DFBCB0E5-3211-4238-A7C0-606514A73067}
    137C 0390 12/19 23:35:01.065 33 exposeshadowcopy.cpp(44) WARNING Failed: Hr: = [0x80042308] : {: lVal : pVssBackupComponentEx2->UnexposeSnapshot(snapshotId)
     
     
     
    In DPMRACurr.errlog
    0410 12C4 12/19 23:35:07.960 03 runtime.cpp(1652) [000000000032C580] NORMAL CDLSRuntime::ProcessIdleTimeout
    0410 12C4 12/19 23:35:07.960 03 runtime.cpp(646) [000000000032C580] NORMAL CDLSRuntime::Uninitialize, bForce: 0
    0410 12C4 12/19 23:35:07.960 05 genericagent.cpp(264) [0000000000329230] NORMAL Agent Can Shutdown if there is only default wokitem active[1]
    0410 12C4 12/19 23:35:07.960 29 dpmra.cpp(360) [0000000000329230] NORMAL CDPMRA::Shutting down dpmra, force-shutdown :yes
    0410 12C4 12/19 23:35:07.961 03 workitem.cpp(391) NORMAL Timing out WI [00000000025322C0], WI GUID = {B71B4544-7067-4A30-B5FB-BA320B10D82A}, ..last DM activity happened 79381373msec back, WI Idle Timeout = 390000msec
    0410 12C4 12/19 23:35:07.961 22 genericthreadpool.cpp(664) [00000000003BF7E0] NORMAL CGenericThreadPool: Waiting for threads to exit
    0410 13EC 12/19 23:35:09.961 20 cc_base.cpp(1226) [00000000003A1E20] NORMAL DM: Aborting http state machine for reason 0
    0410 13EC 12/19 23:35:09.962 20 cc_base.cpp(1226) [000000000039FB30] NORMAL DM: Aborting http state machine for reason 0
    0410 0D9C 12/19 23:35:09.962 20 cc_base.cpp(1226) [00000000003A4190] NORMAL DM: Aborting http state machine for reason 0
    0410 0D78 12/19 23:35:09.962 20 cc_base.cpp(1226) [00000000003A6500] NORMAL DM: Aborting http state machine for reason 0
    0410 10C8 12/19 23:35:09.963 20 cc_base.cpp(1226) [00000000003A9830] NORMAL DM: Aborting http state machine for reason 0
    0410 10C8 12/19 23:35:09.963 20 cc_base.cpp(1226) [00000000003B46E0] NORMAL DM: Aborting http state machine for reason 0
    0410 10C8 12/19 23:35:09.963 20 cc_base.cpp(1226) [00000000003ABB20] NORMAL DM: Aborting http state machine for reason 0
    0410 0D78 12/19 23:35:09.963 20 cc_base.cpp(1226) [00000000003B23F0] NORMAL DM: Aborting http state machine for reason 0
    0410 0D9C 12/19 23:35:09.963 20 cc_base.cpp(1226) [00000000003B0100] NORMAL DM: Aborting http state machine for reason 0
    0410 13EC 12/19 23:35:09.963 20 cc_base.cpp(1226) [00000000003ADE10] NORMAL DM: Aborting http state machine for reason 0
    0410 12C4 12/19 23:35:09.967 22 genericthreadpool.cpp(664) [000000000032C580] NORMAL CGenericThreadPool: Waiting for threads to exit
    0410 129C 12/19 23:35:11.968 03 timer.cpp(513) [00000000003B74F8] ACTIVITY Shutting down timer thread.
    0410 12C4 12/19 23:35:11.968 03 service.cpp(81) ACTIVITY CService::StopThisService
    0410 12C4 12/19 23:35:11.968 03 service.cpp(281) [00000000000EFB00] ACTIVITY CService::StopService()
    0410 0F20 12/19 23:35:11.969 03 service.cpp(298) [00000000000EFB00] ACTIVITY CService::AnnounceServiceStatus
     
    In DPMAccessManagerCurr.errlog:
    0C3C 16F0 12/19 23:34:26.961 65 ACM.cs(318) NORMAL Number of scopes returned : 3 for pass : 1
    0C3C 16F0 12/19 23:34:26.990 65 ACM.cs(318) NORMAL Number of scopes returned : 0 for pass : 2
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:d04a9d8b-1727-4c62-a07b-ab2a14e85493 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:45566ca9-9328-4fc1-95d5-061cce393218 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:ab581255-26cc-417d-be43-373f6e2aebc9 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:b01ecb19-3596-4cba-ae78-19dd8ae3ed5a operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:c0a79c35-58bd-4345-9f91-1e1fc066f507 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:df962460-33e7-4d8b-a150-95779123d7e9 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:1c5946a2-d3b3-499e-a682-310edb901a53 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:b729a237-6acf-4d2a-b044-5f61e9c93655 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:63e8bd4d-0dab-4888-8d1d-36fa4e487db5 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:52671f22-4d6a-4519-89de-2ac5949b08c9 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:0a46690d-5667-44e6-b329-10c00fe74258 operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:ebfa0e9b-4458-47b6-8fb0-e518ef9d142b operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:e3538dc0-5086-44ec-859e-b6e6770307aa operation:Add
    0C3C 16F0 12/19 23:34:27.547 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:09e235bb-ae7b-4062-9bdd-2adfa8cfe267 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:482ddc5c-854f-4eef-82fd-91759d8e7e6e operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:d4b585bb-00e3-4c2a-87e4-74c1f253b276 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:2024f54f-bbb6-47c9-bd69-9459a9418196 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:b1e67081-c824-4a98-83c3-8baf205326b4 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:52d892e6-de80-4a17-a8d0-7903bd39b50a operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:cd1b92c1-80de-48f6-9a7f-1556bc57dcb3 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:6cbe6868-3f77-4ad4-9724-e515f5d001b4 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:c2522f00-cab9-4cb9-a67f-852f65709a88 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:c338e94f-0172-40a9-90e3-e60ed9d0cae4 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:f7bcde51-8ddf-480b-a63f-b47aee182132 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:76a3f4bf-9837-4f0d-9a73-5a0b6fdbab0b operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:fb3fd954-b2e4-4dc0-b2b9-b36c9d6a51ec operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:ea3b3d23-9d63-4615-b225-18a4a2141a89 operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:6d08a22e-50ce-471d-b6c7-4bb7839618eb operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:6d8f7499-843b-4437-9244-ff2324813cbe operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:dba510dc-62fa-4875-b475-57c62505f2db operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:d551919c-152b-4290-80e1-da132e2f130b operation:Add
    0C3C 16F0 12/19 23:34:27.548 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:f3018209-165b-43b9-b53a-122208c8afda operation:Add
    0C3C 16F0 12/19 23:34:27.623 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:2af4d774-f231-42a3-8822-8a5a57f8fb3f operation:Add
    0C3C 16F0 12/19 23:34:27.623 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:4485f970-b32c-4d35-a8d7-bd6d77d42ed8 operation:Add
    0C3C 16F0 12/19 23:34:27.623 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:98164a7b-535e-4616-aefd-2b01ecd8154f operation:Add
    0C3C 16F0 12/19 23:34:27.623 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:9f45aad2-1e6e-43b8-96b0-79242ec3cd54 operation:Add
    0C3C 16F0 12/19 23:34:27.623 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:ace98191-c378-4e86-bb4c-348620769633 operation:Add
    0C3C 16F0 12/19 23:34:27.623 62 Datasource.cs(5947) NORMAL Updating the recoverysourcecollection datasetId:ef15c949-196d-49a6-
    05D4 0650 12/19 23:43:50.419 64 AMServiceHost.cs(44) NORMAL Initializing the AMService
     
     
    From local client:
    0C48 05CC 12/19 23:34:21.715 33 clientserviceproxy.cpp(201) NORMAL CClientServiceProxy::TriggerBackupAdhoc - Client FQDN: OWEN-WATSON.rsnz.local, TriggerBackup:False
    • Proposed as answer by vrmurthy[MSFT] Thursday, February 10, 2011 7:06 AM
    • Unproposed as answer by OwenWatson Thursday, February 10, 2011 9:02 AM
    Monday, December 20, 2010 12:35 AM
  • Sorry for unintentionally proposing the above as an answer. It was a wrong press of a button.

    From the DPM log pasted, I observed a VSS-related failure when exposing recovery points for the client OWEN-WATSON.rsnz.local:

    137C 0390 12/19 23:35:00.982 17 intentservices.cpp(522) NORMAL CIntentServices::MountAndShareShadowCopy(). shadowCopyId={68CA125B-BDD0-4853-8766-4A01E69D3731}, datasourceId={DFBCB0E5-3211-4238-A7C0-606514A73067}
    137C 0390 12/19 23:35:01.065 33 exposeshadowcopy.cpp(44) WARNING Failed: Hr: = [0x80042308] : {: lVal : pVssBackupComponentEx2->UnexposeSnapshot(snapshotId)
    This seems to be an issue addressed by a hotfix from the VSS team: http://kb.backupassist.com/articles.php?aid=2967
    Go through the KB article to check whether you see similar events on your DPM server. If applicable, apply the hotfix, reboot the server, and see if it helps.
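For reading the failure code in that warning, here is a small sketch that splits an HRESULT into its standard fields. The bit layout is the documented HRESULT layout; the lookup table is deliberately tiny and lists only VSS codes I am confident of, so anything else is reported as "unknown".

```python
from typing import NamedTuple

# Deliberately small table; extend it from the VSS headers as needed.
KNOWN_VSS_CODES = {
    0x80042301: "VSS_E_BAD_STATE",
    0x80042308: "VSS_E_OBJECT_NOT_FOUND",
}


class HResult(NamedTuple):
    failed: bool    # severity bit (bit 31)
    facility: int   # 11-bit facility field (bits 16-26)
    code: int       # low 16 bits
    name: str       # symbolic name if known, else "unknown"


def decode_hresult(value: int) -> HResult:
    """Split a 32-bit HRESULT into severity, facility, and code."""
    value &= 0xFFFFFFFF  # normalise negative (signed) inputs
    return HResult(
        failed=bool(value >> 31),
        facility=(value >> 16) & 0x7FF,
        code=value & 0xFFFF,
        name=KNOWN_VSS_CODES.get(value, "unknown"),
    )
```

`decode_hresult(0x80042308)` reports a failure in facility 4 (FACILITY_ITF, interface-specific errors) with code 0x2308, which the VSS headers name VSS_E_OBJECT_NOT_FOUND. That would be consistent with UnexposeSnapshot being asked about a snapshot that is not currently exposed, though whether it relates to the freeze is not established by the log.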

    Thanks VRMURHTY [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, February 10, 2011 9:57 AM
  • The BackupAssist article seems to be aimed at 32-bit (x86) systems, AFAIK. I did as many of its steps as seemed to work, and running the vssadmin list writers command at the end showed all writers running OK. The KB referenced in the article refers to cluster configs (mine isn't one).

    Also installed W2008R2 SP1, and the latest DPM patch.

    Still the same server crashes. . .

    Friday, March 11, 2011 4:13 AM
  • Hi Owen,

    If you are still experiencing this issue could you please open up a support case with PSS so we can do some more troubleshooting and get a crashdump?

    Thanks,

    Marc

     

    Friday, April 01, 2011 7:00 PM
  • Where do I go to open a PSS support case? I'm in NZ.
    Monday, April 04, 2011 11:27 PM
  • Hi Owen,

    You can open a support case by completing the web based wizard linked below and it will route you to a resource in your region.

    http://support.microsoft.com/common/international.aspx?RDPATH=dm;en-us;select&target=assistance

    Thanks,

    Marc

    Tuesday, April 05, 2011 12:32 PM
  • Ta! I spoke to the MS rep on the professional support team and they were wanting a good amount of $ to go any further, so I passed. So count this as unclosed.
    Wednesday, April 06, 2011 1:10 AM
  • housekeeping - closing old post.  Open a new post if you still have a need.
    Wednesday, November 30, 2011 1:30 PM