DPM 2012 ground to a halt

  • Question

  • We have a client who has been using DPM 2012 without problems for a month or so. Recently (this may be a coincidence) we added a tape library for long-term storage. This takes a daily copy of the existing disc snapshots.

    The single protection group is backing up:

    1) The DPM server including its DBs
    2) A HyperV server with all guests
    3) Specific content of one of the guests (SQL DB and some files)

    This had been working fine, but suddenly a lot of the content has stopped backing up: synchronisation or recovery point creation just hangs. No data appears to be moved at all; the jobs just stop. The DPM server backup is OK and the backup of the HyperV host system drive is fine, but all of the guests now hang. The backup of the system drive on the selected guest is OK, but the SQL DBs now hang.

    Also, if I go to any element and select "Modify Disk Allocation", the window hangs completely and never displays anything. I can see no VSS errors on the HyperV server, for example; it's as if something just makes the system hang completely on these specific elements. However, it was working fine for quite some time.

    Restarting the HyperV server and all guests did allow one consistency check to complete across everything, and then it simply started to hang again.

    With no errors logged anywhere, it's very difficult to diagnose.

    Wednesday, November 21, 2012 4:51 PM

Answers

All replies

  • Hi,

    The first step is to remove the tape library to check whether the problem comes from it.

    Wednesday, November 21, 2012 8:33 PM
  • Sorry, I realise that my explanation wasn't clear. The drive was always connected but unused due to the unavailability of proper labels. So, what actually changed was to modify the protection group to add long-term protection to it. However, the failure occurs on the short-term backup to disc; but only for the selected elements.

    This morning I can see jobs which have been running for 13, 14 and 17 hours with 0 MB recorded and just sitting "In progress".

    Thursday, November 22, 2012 9:55 AM
  • We're still seeing some serious problems with this system after reverting the Protection Group to its previous settings. The latest symptom is the following:

    The agents are reporting "OK" in the Management tab with no problem.
    However, if I try to modify the protection group and expand one of the servers that has the agent installed, it does not respond. The selection simply hangs and I cannot access the servers to select any components.

    This seems to match exactly the symptoms with the backups themselves. The agents say that they're responding but they won't communicate. Sometimes if I restart the remote machines they will then respond for a while, but this then stops completely shortly afterwards.

    I have applied Rollup 3 and updated the agents with no problem, but the fault persists. We're using DPM 2010 at other sites in precisely this scenario and not seeing any problems. Our client is not impressed, as this is their first move to DPM from BackupExec.

    Tuesday, December 4, 2012 4:12 PM
  • I've now reposted this as a different thread with specific details and test results.
    • Marked as answer by Nigel Ayres Monday, January 14, 2013 10:11 AM
    • Edited by Nigel Ayres Monday, January 14, 2013 10:12 AM
    Monday, January 14, 2013 10:11 AM