DPM 2012: impossible to recover SharePoint items if you have a clustered SQL instance for the farm?

  • Question

  • Hi,

    We are running the first tests on our DPM 2012 platform protecting a SharePoint 2010 farm. The protection works perfectly. However, when trying to recover a SharePoint item (a Word document), we are facing some problems. Here is a sample recovery session:

    • start the recovery wizard on the document
    • specify the SQL cluster instance as the staging SQL server
    • specify a temporary folder on the SQL cluster, which is already included in the cluster dependencies
    • specify a temporary folder on the WFE

    The recovery process ends with failure, displaying this warning on the DPM console: "DPM was unable to attach the database to the SQL Instance SERVER\INSTANCE. Cannot use file \\DPMSERVER\WSSILRShare_{guid}\local-replica-path-to-mdf for clustered server. Only formatted files on which the cluster resource of the server has a dependency can be used. Either the disk resource containing the file is not present in the cluster group or the cluster resource of the sql server does not have a dependency on it (ID 32000 Details :Unknown)"

    Analyzing this message, and guessing what DPM does under the hood to restore SharePoint content (restore the DB files, mount the database, Export-SPWeb, transfer the files to the WFE, Import-SPWeb), this is as far as I got:

    • DPM restores the .mdf / .ldf locally (and not to the temporary folder specified). In fact, DPM just exposes the backup replica locally (hence the \\dpmserver\local-replica-path...).
    • DPM (or the agent on the SQL node) tries to mount the files as a staging database on the cluster, and this is where the problems start. Judging by the error message text, SQL tries to attach the database without copying the .mdf/.ldf files to the target server, using the DPM local share (\\dpmserver\....) directly. The cluster then complains that this location is not valid for a clustered SQL instance.
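To illustrate the rule the error message describes, here is a minimal Python sketch. It assumes that a clustered instance only accepts database files on disks its cluster resource depends on, and so rejects a UNC share path. The function names, the `S:` drive and the sample paths are hypothetical and are not part of DPM or SQL Server.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Return True if the path points at a network share (UNC path)."""
    # For a UNC path, PureWindowsPath reports the server and share as the drive.
    return PureWindowsPath(path).drive.startswith("\\\\")


def usable_by_clustered_instance(path: str, dependent_drives) -> bool:
    """Hypothetical check: the file must live on a cluster-dependent disk."""
    if is_unc_path(path):
        return False  # a share on another server is never a dependent disk
    drive = PureWindowsPath(path).drive.upper()
    return drive in {d.upper() for d in dependent_drives}


# Illustrative values only: "S:" stands for a disk the SQL resource depends on.
replica = r"\\DPMSERVER\WSSILRShare_guid\replica\content.mdf"
staged = r"S:\Staging\content.mdf"
print(usable_by_clustered_instance(replica, ["S:"]))
print(usable_by_clustered_instance(staged, ["S:"]))
```

Under this assumption the replica share fails the check while a copy on the dependent disk passes, which matches the behaviour seen in the failed recovery.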

    Some thoughts:

    • DPM 2010 copied the files to the SQL server instead of exposing the replica (this would explain why MS states that no long transfers happen in DPM 2012 for item-level recovery, and claims much shorter recovery times). If this is true, then it would be impossible to attach a database on a clustered server, as the files must reside on folders or drives that the cluster resource depends on.
    • Here MS says that the target sql instance can be a clustered instance (a Microsoft Cluster Server).
    • Here MS says that the target sql instance can NOT be a clustered instance.

    So I'm lost in the DPM forest. I've tried to use the DPM server instance as the temporary SQL staging area, with the same result. To make things worse, we're in a DMZ environment and there is no SQL server in the farm directly accessible by both the WFE and the DPM server (the first link hints that any SQL Server instance that can be accessed by DPM and by the front-end Web server of the protected SharePoint farm can be used). This is not listed as a prerequisite for DPM installation, so the DPM server has no direct SQL access to the SQL cluster, nor does the WFE have direct SQL access to the DPM server database.

    Well, er... any hint, suggestion or workaround? Could a DPM specialist at least confirm that the database mount is done by exposing the replica instead of transferring the files to the SQL server?

    Thank you very much in advance



    • Edited by Roberto MD Thursday, May 10, 2012 12:46 PM
    Thursday, May 10, 2012 12:43 PM

Answers

  • Hi,

    There is a known limitation when using optimized item-level recovery (ILR) for SharePoint with clustered SQL Servers.

    The workaround is to perform a non-optimized recovery. Please add the following registry key and try a new recovery.

    To disable Optimized ILR completely, use the following registry key on the DPM 2012 server:

    Key: HKLM\Software\Microsoft\Microsoft Data Protection Manager\Configuration\SharePoint
    Value: DisableOptimizedILR
    Data: 0x1
    Type: DWORD
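As a convenience, the entry above could be written to a .reg file and imported on the DPM server. The following Python sketch only builds the file text in the standard Registry Editor export format. The helper name `build_reg_file` is hypothetical, and the result should be reviewed before importing it.

```python
def build_reg_file(key: str, value_name: str, dword: int) -> str:
    """Build the text of a .reg file that sets one DWORD value."""
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    # .reg files use the full hive name rather than the HKLM abbreviation.
    if key.startswith("HKLM\\"):
        key = "HKEY_LOCAL_MACHINE\\" + key[len("HKLM\\"):]
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{key}]\n"
        f'"{value_name}"=dword:{dword:08x}\n'
    )


# Key, value name and data are taken from the post above.
KEY = r"HKLM\Software\Microsoft\Microsoft Data Protection Manager\Configuration\SharePoint"
REG_TEXT = build_reg_file(KEY, "DisableOptimizedILR", 1)
print(REG_TEXT)
```

Saving the printed text as a .reg file and importing it on the DPM 2012 server should create the same entry as adding it by hand in Registry Editor.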


    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread. Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.

    • Marked as answer by Roberto MD Wednesday, May 16, 2012 5:49 AM
    Tuesday, May 15, 2012 11:30 PM
    Moderator

All replies

  • Thanks Mike,

    Many thanks for that information. For my own information: I searched a lot before posting on this forum. Is this documented anywhere?

    Regards,

    Roberto

    Wednesday, May 16, 2012 5:51 AM
  • Hi,

    I tested with the registry entry and it works! Thanks again, Mike.

    However, I still find it a bit strange that the main advantage for SharePoint users of upgrading from DPM 2010 to 2012 (and it really is a good advantage) is not available if you have a clustered instance. I mean, if you use DPM to protect the farm, your farm's health is quite probably important, and the SQL instance is quite probably clustered...

    Regards

    Roberto

    Wednesday, May 16, 2012 9:41 AM
  • Hi Roberto,

    Thanks for confirming the solution. I will have the TechNet link updated to include the workaround. I'm also looking to see if we can get this fixed in DPM 2012 SP1.


    Regards, Mike J. [MSFT]

    Wednesday, May 16, 2012 1:53 PM
    Moderator
  • Hi Roberto,

    In DPM 2012, if optimized recovery fails for any reason, DPM should automatically re-run the failed recovery job in non-optimized mode. This means that there should have been a successful recovery before you added the DisableOptimizedILR entry and ran another manual recovery. Can you check the job history to confirm that two recovery attempts were made before you did the manual recovery?

    Another workaround would be to use a non-clustered staging SQL server, which should succeed using optimized recovery.

    Can you test those scenarios ?
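The fallback described above can be pictured with a small Python sketch. It only models the expected order of attempts, so that it is clear what the job history should contain. The function and the two callables are hypothetical stand-ins, not DPM APIs.

```python
from typing import Callable, List, Tuple


def run_recovery(
    optimized: Callable[[], bool],
    non_optimized: Callable[[], bool],
) -> List[Tuple[str, bool]]:
    """Try optimized recovery first and fall back once if it fails."""
    attempts = []
    succeeded = optimized()
    attempts.append(("optimized", succeeded))
    if not succeeded:
        # Expected behaviour per the post: one automatic non-optimized re-run.
        attempts.append(("non-optimized", non_optimized()))
    return attempts


# Clustered staging instance: the optimized attempt fails, the re-run succeeds.
history = run_recovery(lambda: False, lambda: True)
print(history)
```

If the fallback works as described, a failed optimized job should be followed in the job history by a second, non-optimized job for the same item.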


    Regards, Mike J. [MSFT]

    Thursday, May 17, 2012 10:01 PM
    Moderator
  • Hi Mike,

    Thanks again for this valuable information. Regarding your first remark, I don't remember the jobs retrying automatically (I tried a given recovery three times without success, with some time between attempts). However, I cannot fully confirm this, as the job history has already been pruned.

    Being on a production platform, we have limited room to manoeuvre, but I'll add this to the list of tests and run it as soon as we can.

    Regarding your second workaround, our platform is within a paranoid DMZ, so staging and production are not directly connected and, unfortunately, the test cannot be done. Another possibility we have thought of is to use the DPM instance and open some ports on a WFE so that there is a direct connection to this instance.

    But, at least for now, we will keep using non-optimized ILR.

    Regards

    Friday, May 18, 2012 5:52 AM