Long VSS generation time with many SQL Databases

  • Question

  • Server: DPM 2010 on Windows 2008 R2
    Client: Server 2008 SP2 x64 running SQL 2008 SP3 x64

    I'm troubleshooting an issue protecting a SQL 2008 SP3 instance with many DBs (172). Total space is about 284GB for data files and 44GB for log files across two volumes (one for data - R5 - E:, one for logs - R1 - F:). The disks are busy, but not overly busy (avg 70 IOPS on the log volume, 100 IOPS on the data volume). I can't back up every DB within a 24-hour period, mainly because VSS snapshot generation appears to take ~40 minutes per DB. In the logs, I see something like:

    19:55:27.980 NORMAL CVssSnapshotRequestor::AddComponentForSnapshot [0000000001CE56E0]
    19:55:27.980 NORMAL CVssSnapshotRequestor::AddVolumeForSnapshot [0000000001CE56E0]
    19:55:27.980 NORMAL CVssSnapshotRequestor: AddVolumeForSnapshot - Marked volume E:\ to be snapshot
    19:55:27.981 NORMAL CVssSnapshotRequestor: AddVolumeForSnapshot - Marked volume F:\ to be snapshot
    19:55:27.981 NORMAL CVssSnapshotRequestor::StartPrepareForBackup [0000000001CE56E0]
    19:55:29.020 NORMAL QueryStatus returned 0x4230a, Releasing VssAsync [0000000000C610C0]
    19:55:29.020 NORMAL CVssBaseRequestor::StartGatherWriterStatus [0000000001CE56E0]
    19:55:30.018 NORMAL QueryStatus returned 0x4230a, Releasing VssAsync [0000000000C6B400]
    19:55:30.018 NORMAL CVssBaseRequestor::CheckWriterStatus [0000000001CE56E0]
    19:55:30.020 NORMAL CreateBitmap: {7AAB46E9-09AF-4604-A61C-33EF53415560}
    19:55:30.022 NORMAL CreateBitmap: {7AAB46E9-09AF-4604-A61C-33EF53415560}
    19:55:30.023 NORMAL CVssSnapshotRequestor::StartDoSnapshot [0000000001CE56E0]
    20:36:21.666 NORMAL QueryStatus returned 0x4230a, Releasing VssAsync [0000000000C6B400]
    20:36:21.666 NORMAL CVssBaseRequestor::StartGatherWriterStatus [0000000001CE56E0]
    20:36:22.668 NORMAL QueryStatus returned 0x4230a, Releasing VssAsync [0000000000C6B400]
    20:36:22.668 NORMAL CVssSnapshotRequestor::MapVolumesToSnapshotPaths [0000000001CE56E0]
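
To pin down where the time goes, it can help to measure the gap between consecutive log entries. The following is a minimal Python sketch, assuming the log format matches the excerpt above (an `HH:MM:SS.mmm` timestamp followed by the message). The helper names `parse_entries` and `largest_gap` are hypothetical and not part of DPM or VSS.

```python
from datetime import datetime, timedelta

# A few lines copied from the log excerpt above.
LOG = """\
19:55:27.981 NORMAL CVssSnapshotRequestor::StartPrepareForBackup [0000000001CE56E0]
19:55:30.018 NORMAL CVssBaseRequestor::CheckWriterStatus [0000000001CE56E0]
19:55:30.023 NORMAL CVssSnapshotRequestor::StartDoSnapshot [0000000001CE56E0]
20:36:21.666 NORMAL QueryStatus returned 0x4230a, Releasing VssAsync [0000000000C6B400]
20:36:22.668 NORMAL CVssSnapshotRequestor::MapVolumesToSnapshotPaths [0000000001CE56E0]
"""


def parse_entries(text):
    """Return (timestamp, message) pairs for lines that start with HH:MM:SS.mmm."""
    entries = []
    for line in text.splitlines():
        stamp, _, message = line.strip().partition(" ")
        try:
            ts = datetime.strptime(stamp, "%H:%M:%S.%f")
        except ValueError:
            continue  # skip blank lines and lines without a leading timestamp
        entries.append((ts, message))
    return entries


def largest_gap(entries):
    """Return (gap, message_before, message_after) for the longest pause."""
    best = None
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        gap = t1 - t0
        if gap < timedelta(0):
            gap += timedelta(days=1)  # the log carries no date; assume one midnight rollover
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best


gap, before, after = largest_gap(parse_entries(LOG))
print(gap, "|", before)
```

On the excerpt, the longest pause falls between `StartDoSnapshot` and the next `QueryStatus` line, which is consistent with the snapshot creation step itself accounting for the ~40 minutes.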

    During this time, any other DB jobs that try to run in parallel fail with a 0x80042316 error (VSS_E_SNAPSHOT_SET_IN_PROGRESS), indicating that a snapshot is already in progress. Furthermore, it seems to perform this same snapshot operation for each DB: 172 DBs times 40 minutes each = 114.67 hours just to create the snapshots so they can be transferred to the DPM server.
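
The arithmetic can be checked with a short Python sketch, assuming the snapshots run strictly one at a time and each takes the ~40 minutes observed in the log. The function name `serial_snapshot_hours` is hypothetical.

```python
def serial_snapshot_hours(db_count, minutes_per_snapshot):
    """Total hours needed if every DB gets its own snapshot, run back to back."""
    return db_count * minutes_per_snapshot / 60


DB_COUNT = 172             # databases on the instance
MINUTES_PER_SNAPSHOT = 40  # approximate duration seen in the log

hours = serial_snapshot_hours(DB_COUNT, MINUTES_PER_SNAPSHOT)

# How many DBs fit into a 24-hour window at that rate?
max_dbs_per_day = (24 * 60) // MINUTES_PER_SNAPSHOT

print(round(hours, 2), max_dbs_per_day)
```

Under those assumptions, only 36 of the 172 DBs can be snapshotted in a day, which matches the symptom of not being able to back up every DB within 24 hours.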

    The current backup policy is to perform daily recovery points with a synchronization just before the recovery point creation and 14-day retention.

    How can I go about finding out why the snapshot generation takes 40 minutes and what might I be able to do to speed up that process?

    Jeff Graves, ORCS Web, Inc.

    Thursday, February 23, 2012 8:59 PM

All replies

  • I've opened a PSS case for this. Will report back with findings.

    Jeff Graves, ORCS Web, Inc.

    Sunday, February 26, 2012 4:34 PM
  • The case is still ongoing, but PSS believes this is due to volume fragmentation. Unfortunately, both the in-box defrag utilities and Contig are unable to defragment the volume online, and given the size of the volume, an offline defrag is not an option. We're looking at spinning up another box, mirroring the DBs, and then rebuilding this box.

    Jeff Graves, ORCS Web, Inc.

    Thursday, April 12, 2012 1:34 PM