DPM 2007 reports inconsistency, volsnap event on server reports the volume is offline for shadow copy protection.

  • Question

  • I don't know how this happened, or of any event that would have precipitated this.  I received errors in the event log related to this volume for the replica of a SQL database. 

    The application log shows a VSS event 12293: "Volume Shadow Copy Service error: Error calling a routine on a Shadow Copy Provider {b5946137-7b9f-4925-af80-51abd60b20d5}. Routine details GetSnapshotProperties({f5923d2a-3be4-4d9f-8507-378f00b6b14b}) [hr = 0x80042308].

    Operation:"

    The next related app log entry is VSS 12305: "Volume Shadow Copy Service error: Volume/disk not connected or not found. Error context: DeviceIoControl(\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy3328 - 0000000000000178,0x00530190,0000000000000000,0,0000000000678000,4096,[0]). "

    The system log has a volsnap event 82 and says the replica volume "is offline for shadow copy protection.  A shadow copy meta data corruption was detected.  Please try clearing the protection fault or restart the computer.  If all else fails, revert out of shadow copy protection mode to reclaim the use of the volume while losing the shadow copies."

    I also found a volmgr event 57: "The system failed to flush data to the transaction log. Corruption may occur."

    I opened the volume in the disk manager and tried running chkdsk on it, but it failed, saying the disk type is RAW.  I mounted the volume with a drive letter and used the GUI to check the disk, and it said it found no errors.  It may not have been the affected volume, because it's hard to match up the volume shadow copy ID with the device GUID.

    I have restarted the server (Windows Server 2008 x64) twice to no avail.  This is the only replica having this problem, and we have many replica and recovery point volumes on this same drive array, running on a Dell PE 2950 hooked up to a Dell MD1000.  Any ideas?

    EDIT: One more note - I tried removing the affected database from protection (without deleting recovery points) and adding it back to the protection group, but that didn't fix it either.

    Wednesday, June 23, 2010 6:37 PM

Answers

  • Hello Robert,

    This event tells all: volsnap event 82 - The replica volume "is offline for shadow copy protection.  A shadow copy meta data corruption was detected."

    This means that VSS can no longer track changes to the replica volume, so it took the volume offline to prevent any further changes. Unfortunately, the recovery point volume that holds the shadow copies is corrupted, so you will need to remove the data source from protection, but this time delete the replica as well so that both volumes get deleted.  Yes, you will lose all RPs; when you add it back into protection, new volumes will be created and you can continue to get new RPs.

     

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Wednesday, June 23, 2010 11:30 PM
    Moderator

All replies

  • Oh, and just for grins: the following Windows 2008 hotfix contains the latest volsnap.sys driver, so installing it makes sure you're on the latest volsnap.sys code.

     

    977001 Stop error code when you enable shadow copies on a volume that is larger than 64 terabytes on a computer that is running Windows Server 2008 or Windows Vista: "0x0000007E"
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;977001


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Wednesday, June 23, 2010 11:36 PM
    Moderator
  • Thanks for your reply.

    How would this happen?  The server was last shut down cleanly.

    Thursday, June 24, 2010 12:14 AM
  • Hello Robert,

     

    Unfortunately, there is no way of telling what caused the corruption.  In this case DPM is the victim; only the VSS infrastructure and the storage stack, including hardware, can cause the metadata corruption.  In the 3 years I've supported DPM, I think I've heard perhaps three customers report this.  If it's recurring, I would be concerned and would start by updating all storage-related firmware and drivers.


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, June 24, 2010 4:00 PM
    Moderator
  • Hello Mike,

    We are experiencing the same problem with DPM 2010 on different volumes (iSCSI attached).
    All of them are protecting SQL databases.
    The problems occurred after scheduled maintenance on the storage network.
    The DPM 2010 server had a clean shutdown before the maintenance.
    It worked again immediately after the maintenance. The problems started a few days later with 2 volumes.
    A day later, 4 other volumes (from another protection group and protected server) had the same issue.
    We have 7 different DPM servers (all DPM 2010) protecting a lot of important data, and they do not have a problem... yet?

    Summary of the events:

    Windows events\system on the DPM server:

    -Event 82 - volsnap - ....volume is offline for shadow copy protection
    -Event 57 - volmgr - ...the system failed to flush data to the transaction log. Corruption may occur.
    -Event 27 - Application Popup - Delayed write failed

    Disks are all online in DPM and Windows Disk Management, and all report Healthy!

    DPM 2010 Alerts

    -Jobs are failing with the message "Replica inconsistent" and the following recommended actions:

    1) Rerun the consistency check and see if it resolves this issue.
    2) Check to see if your disks are connected properly to the DPM server. If there are any disk cabling issues, rectify them and then run consistency check.
    3) If you are certain that you do not have any disk cabling issues, clear the VSS fault. Next, run chkdsk on \\?\Volume{83126570-e2bf-11df-9257-001a4ba6c1de}\ and finally run consistency check.
     Synchronize with consistency check.
     Clear the VSS fault...
     Run a synchronization job with consistency check...

    My questions:

    - Option 3: chkdsk does not work, because it says the disk is RAW and RAW disks are not supported by chkdsk. Are there any other options?

    - Clear the VSS fault: how do I know this task ran successfully? Is there a status we can check?

     

    Monday, February 14, 2011 4:47 PM
  • Hi Marti,

    The devil is in the details of why Windows put the replica volume in VSS protection mode.  Perhaps the volsnap event has more details.  I've seen a page file that is too small cause this, so check your page file size.

    Hardware Requirements http://technet.microsoft.com/en-us/library/ff399280.aspx

    Page file: 0.2 percent of the combined size of all recovery point volumes, in addition to the recommended size (generally, 1.5 times the amount of RAM on the computer).
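
As a rough illustration of that sizing rule, the arithmetic can be sketched in Python. The function name and the example figures (8 GB of RAM, 2 TB of recovery point volumes) are hypothetical and only show how the two terms add up; they are not taken from any server in this thread.

```python
def recommended_pagefile_gb(ram_gb: float, recovery_point_volumes_gb: float) -> float:
    """Page file size implied by the quoted rule.

    1.5 x RAM (the generally recommended size) plus 0.2 percent of the
    combined size of all recovery point volumes.
    """
    base = 1.5 * ram_gb
    extra = 0.002 * recovery_point_volumes_gb
    return base + extra


if __name__ == "__main__":
    # Hypothetical example: 8 GB RAM and 2048 GB of recovery point volumes in total.
    size = recommended_pagefile_gb(8, 2048)
    print(f"Recommended page file: {size:.1f} GB")
```

To apply it, sum the sizes of all recovery point volumes on the DPM server and compare the result with the configured page file size.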

    Under normal circumstances, if this is just a resource issue, a reboot usually clears it.  Also, you should be running chkdsk on the DIFFAREA volume; that is the volume that has the problem that caused the replica to be taken offline and now reports RAW.


    Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, February 14, 2011 6:49 PM
    Moderator
  • Mike,

    First, thanks for the quick reply!
    About the page file requirements: we meet these specs (DPM server: 8 GB memory, 20 GB page file).

    Reboots of the DPM server and the underlying storage systems (HP MSA 2012i iSCSI target) have already been tried.
    They did not solve it.

    Can you point me in a direction explaining "..you should be running chkdsk on the Diffarea volume.."?
    For your information, an example:

    The replication path for ServerA - Database001

    D:\ on ServerA.domain.lan C:\Program Files\Microsoft DPM\DPM\Volumes\Replica\ServerA.domain.lan\MSDEWriter\Database001\023b4871-4d3b-45b0-bb30-0e2594c606b5\Full\D-Vol\

    This volume is located on dynamic Disk2 (4,8TB - 14% free space - GPT - status online) on the DPM server. This disk is iSCSI attached using a dedicated multi-path gigabit storage network.

    At what level should I run chkdsk?

     

    Tuesday, February 15, 2011 2:04 PM
  • Hi

    Run mountvol.exe and look for the volume associated with this mountpoint path:

    C:\Program Files\Microsoft DPM\DPM\Volumes\Diffarea\ServerA.domain.lan\MSDEWriter\Database001

    Then run chkdsk against the volume {GUID}:

    chkdsk /f \\?\Volume{0ddf2e7e-d6f1-11df-805c-0013722810cd}
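
Matching a mount point path to its volume GUID by eye is error-prone when there are many DPM volumes. Below is a minimal Python sketch that parses text shaped like mountvol output (a volume name line followed by indented mount point lines) and looks up a path. The sample text is illustrative: the second GUID is made up, and the layout is an assumption about how mountvol lists volumes, so check it against the real output on your server.

```python
from typing import Dict, List, Optional

# Illustrative sample only; the second volume GUID is hypothetical.
SAMPLE = r"""
    \\?\Volume{0ddf2e7e-d6f1-11df-805c-0013722810cd}\
        C:\Program Files\Microsoft DPM\DPM\Volumes\Diffarea\ServerA.domain.lan\MSDEWriter\Database001\

    \\?\Volume{11111111-2222-3333-4444-555555555555}\
        C:\
"""


def parse_mountvol(output: str) -> Dict[str, List[str]]:
    """Map each volume name (without trailing backslash) to its mount points."""
    volumes: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current volume block
        elif line.startswith("\\\\?\\Volume{"):
            current = line.rstrip("\\")
            volumes[current] = []
        elif current is not None and not line.startswith("***"):
            # Skip markers such as "*** NO MOUNT POINTS ***".
            volumes[current].append(line)
    return volumes


def find_volume(volumes: Dict[str, List[str]], mount_path: str) -> Optional[str]:
    """Return the volume name mounted at mount_path, ignoring case and trailing backslashes."""
    wanted = mount_path.rstrip("\\").casefold()
    for name, mounts in volumes.items():
        if any(m.rstrip("\\").casefold() == wanted for m in mounts):
            return name
    return None


if __name__ == "__main__":
    path = r"C:\Program Files\Microsoft DPM\DPM\Volumes\Diffarea\ServerA.domain.lan\MSDEWriter\Database001"
    print(find_volume(parse_mountvol(SAMPLE), path))
```

On the DPM server itself, the sample text could be replaced with the real output, for example from `subprocess.run(["mountvol"], capture_output=True, text=True).stdout`, and the returned volume name passed to chkdsk.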


    Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Tuesday, February 15, 2011 4:35 PM
    Moderator
  • Thanks.

    Running chkdsk /f <volume guid> on the failing volumes returns the message: cannot open volume for direct access.

    On "healthy" volumes this works.

    Any other options?

    Tuesday, February 15, 2011 4:50 PM
  • Hi,

    Try these commands and see what gets returned.

    fsutil dirty set  \\?\Volume{0ddf2e7e-d6f1-11df-805c-0013722810cd} (where the GUID is for the diffarea volume)

    If it is set correctly, reboot the server and chkdsk should run.

    Try running diskpart.exe, then: list volume
    From the list, locate the volume associated with the diffarea, then select that volume and run a detail on it:
       Select Volume #
       Detail volume
    See if there is anything weird about that volume or if we get an error.

     

     

     


    Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Tuesday, February 15, 2011 5:06 PM
    Moderator
  • Mike,

    This will not work because the disk is attached using a software iSCSI client.
    During the boot process this disk will not be available to chkdsk.

    If you have more options, I'd be glad to hear them, and I think so would other people who run into this problem.

    As for our problem, we removed the corrupted RP volumes and re-assigned the SQL databases to the protection group,
    assuming this was a one-time experience.

    They are all working fine now.


    • Edited by MartijnH Monday, May 21, 2012 10:41 AM
    Thursday, February 17, 2011 11:22 AM
  • Hi Mike, the same problem occurred for me using "DPM 2012 SP1":

    "Volume \\?\Volume{95636ff3-2016-11e3-9415-001ddab72887} is offline for shadow copy protection.  A shadow copy meta data corruption was detected.  Please try clearing the protection fault or restart the computer.  If all else fails, revert out of shadow copy protection mode to reclaim the use of the volume while losing the shadow copies."

    When i try to access the

    I have removed one of the protected VMs and added it back into the protection group. This works, BUT

    I can still see the volume with data on it; it is just not accessible.

    Just wanted you to know this still happens even when you have the latest and greatest.


    ANNCEX

    Friday, October 25, 2013 2:41 PM