Why would my DPM replicas be inconsistent on SQL databases that I have just protected?

  • Question

  • Hello,

    I have a new SQL cluster running Server 2019 and SQL 2016.  I also have a second file server cluster running Server 2019 that I've set up a number of shares on.  Finally, I have a new DPM server running Server 2019 and DPM 2019.  DPM has been linked to an Azure Vault.  All of the above are VMs.

    I have created a Protection Group that backs up to a local disk and the Azure Vault, and I tagged the local DPMDB to be backed up in this group, and that seems to be "OK".

    Now I've installed the DPM client on both SQL servers in the cluster and both servers in the file server cluster.  I added the SQL cluster name and the file server cluster name to DPM.  I am backing up SQL databases via the SQL cluster name, and I am backing up the file server cluster via the cluster name also, backing up the G: and U: drives.  I'm getting errors that "Replica is inconsistent" and "Online recovery point creation failed".  Based on past experience with our old DPM server, I'm guessing if I clear up the first error, the second one might also disappear.

    Usually, I just run the sync job with consistency check and that fixes it.  If it doesn't, I delete the item from the protection group and re-add it.  Neither of those is working, and as I said, these are brand new servers.  The error is: "Failure occurred while adding one or more of the volumes involved in backup operation to snapshot set. Please check the event log on SQL Server (MSSQLSERVER).MYSQLSERVER.domain.net to troubleshoot the issue. (ID 30290 Details: Internal error code: 0x809909E2)".  I went to the SQL server and, unless I was looking in the wrong log, there was nothing in the Event Log for that time period.

    I am a jack-of-all-trades System Administrator, and I don't have a lot of experience with either SQL Server or DPM.  Google hasn't been incredibly helpful as far as things to try or logs to look in for info.  I'm hoping somebody can give me a little guidance.

    Thanks,

    Zachary Hamilton


    Sunday, October 25, 2020 8:33 PM

All replies

  • Hi Zachary,

    Could you provide us with some more information about the SQL clusters you are backing up? 

    What kind of clusters are they?

    Are the clusters virtualized? If yes, give more information of the infrastructure.

    What kind of storage are they on? Normal cluster disks or Cluster Shared Volumes (CSVs)?

    -------------------------------------------------------------------------------------------------------

    Make sure the SQL clusters are running on supported configuration, more info here:

    What's supported and what isn't for DPM?

    What can DPM back up?

    I would also suggest checking the MSDPMCurr.errlog located under %ProgramFiles%\Microsoft System Center\DPM\DPM\Temp\.
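One quick way to triage that log is to tally severity keywords per line. The following is a minimal Python sketch, not a DPM tool: the function names are hypothetical, the keywords `WARNING` and `ERROR` are assumed to appear literally in the log lines, and the file encoding is an assumption that may need changing on your server.

```python
from collections import Counter
import os

# Path taken from the suggestion above; %ProgramFiles% only expands on Windows.
DEFAULT_LOG = os.path.expandvars(
    r"%ProgramFiles%\Microsoft System Center\DPM\DPM\Temp\MSDPMCurr.errlog"
)


def tally_severities(lines, keywords=("WARNING", "ERROR")):
    """Count how many lines contain each keyword (plain substring match)."""
    counts = Counter({keyword: 0 for keyword in keywords})
    for line in lines:
        for keyword in keywords:
            if keyword in line:
                counts[keyword] += 1
    return counts


def tally_log(path=DEFAULT_LOG, encoding="utf-8"):
    """Tally keywords in a log file.

    The encoding is an assumption; if every count comes back zero on a
    non-empty log, try encoding="utf-16" instead.
    """
    with open(path, encoding=encoding, errors="replace") as handle:
        return tally_severities(handle)
```

Running `tally_log()` on the DPM server returns a `Counter` keyed by keyword. A zero count for `ERROR` only means that the literal word is absent, so the `WARNING` lines around the failure time are still worth reading.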

    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com



    • Edited by Leon Laude Sunday, October 25, 2020 9:37 PM added links
    Sunday, October 25, 2020 8:58 PM
  • Hi Leon,

    I'm not sure what you mean by what kind of clusters.  In each case, the two virtual machines were built separately and then put into a cluster with Failover Cluster Manager.  To be honest, I'm not 100% sure what I'm looking at here, as I haven't used clusters at any previous employers.  Here's basically what I have:

    SQL Cluster:

    • SQL201 (Node 1, currently owner)
    • SQL202 (Node 2)
    • SQL200 (cluster name)
    • SQL300 (not exactly sure what this is, but this is what I RDP to)
    • Has one role: SQL Server (MSSQLSERVER), Type: Other

    File Server Cluster:

    • FS201 (Node 1, currently owner)
    • FS202 (Node 2)
    • FS200 (cluster name)
    • FS300 (not exactly sure what this is, but this is what I RDP to)
    • Has one role: FS300, Type: File Server

    All of these machines are virtual.  There are 3 hosts, all running Server 2019.  They are connected to a SAN by 10 GB twinax using iSCSI.

    According to Failover Cluster Manager, the SQL servers' disks say they are "Assigned To" Cluster Shared Volumes (except for the Disk Witness in Quorum).  On the file servers, it says "Assigned To" FS300 (except for the Disk Witness in Quorum).

    In MSDPMCurr.errlog there are a lot of WARNINGs but no ERRORs.

    As a couple of examples of how I configure the job, they look like this:

    SQL Job:

    • Affected area: SQL300\DATABASE1
    • Computer: sql server (mssqlserver).sql200.domain.net

    File Server:

    • Affected area: G:\
    • Computer: fs300.fs200.domain.net

    Thanks,

    Zach

    Sunday, October 25, 2020 10:12 PM

  • According to Failover Cluster Manager, the SQL servers' disks say they are "Assigned To" Cluster Shared Volumes (except for the Disk Witness in Quorum).  On the file servers, it says "Assigned To" FS300 (except for the Disk Witness in Quorum).

    The problem for the SQL cluster is that DPM 2019 RTM does not support SQL clusters hosted on Cluster Shared Volumes (CSVs); support was added in UR2:

    https://docs.microsoft.com/en-us/system-center/dpm/dpm-protection-matrix?view=sc-dpm-2019

    Is the File Server cluster a "normal" cluster or a Scale-Out File Server cluster?


    Monday, October 26, 2020 7:03 AM
  • Hi,

    DPM 2019 UR2 supports SQL Server Failover Cluster Instance (FCI) using Cluster Shared Volume (CSV). Please install UR2 if not installed already. 

    https://docs.microsoft.com/en-us/system-center/dpm/back-up-sql-server?view=sc-dpm-2019

    https://docs.microsoft.com/en-us/system-center/dpm/what-s-new-in-dpm?view=sc-dpm-2019#new-features-in-dpm-2019

    However DPM still does not support file server backup over CSV. If you are protecting a File cluster on CSV, it is unsupported.
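The CSV rules stated in the replies above can be written down as a small check. This is a hedged Python sketch that encodes only what this thread says about DPM 2019 (it is not the full protection matrix); the function name and the workload labels `sql_fci` and `file_server` are made up for illustration.

```python
def csv_rules_allow(workload, on_csv, update_rollup):
    """Return True if the CSV rules quoted in this thread do not rule out protection.

    workload      -- "sql_fci" or "file_server" (hypothetical labels)
    on_csv        -- True if the data sits on Cluster Shared Volumes
    update_rollup -- installed DPM 2019 Update Rollup number, 0 for RTM
    """
    if workload not in ("sql_fci", "file_server"):
        raise ValueError(f"unknown workload: {workload!r}")
    if not on_csv:
        # The CSV restrictions do not apply to ordinary cluster disks.
        return True
    if workload == "sql_fci":
        # SQL Server FCI on CSV needs DPM 2019 UR2 or later.
        return update_rollup >= 2
    # File server backup over CSV is stated as unsupported.
    return False
```

For the setup described earlier, `csv_rules_allow("sql_fci", True, 0)` is `False` (RTM with SQL on CSV), and it becomes `True` once `update_rollup` is 2.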


    Monday, October 26, 2020 7:58 AM
    Assuming Zachary is running RTM (no UR was mentioned), it's not supported. With UR2 it is supported, but UR2 has to be installed first.


    Monday, October 26, 2020 8:02 AM
  • Hi Leon,

    Once again, thank you.  I installed DPM 2019 UR2 and updated the protection agents.  All five databases are now showing "OK". One was a little stubborn.  While it was synchronizing, I noticed under "Details of Replica Path" that it showed a destination of:

    C:\Program Files\Microsoft System Center\DPM\DPM\Volumes\Replica\d7bde93d-9457-4fb4-bde4-724282600d26\d4625377-177a-4389-b3ca-ffe9c4cc0c22\Full\

    Is this a configurable parameter?  It concerns me because I have an E: drive configured under Management -> Disk Storage.  The E: drive is 15 TB.  The C: drive is only 100 GB.  Is that an issue?

    I *think* my file servers are "normal".  How can I tell?

    Also, the way I configured the file servers was that I installed the protection agent on FS201 and FS202.  I then attached FS200.  FS300 is not protected.  Should I attach FS300?  I'm a bit confused on how to install the client and protect clusters like this.

    Thanks again,

    Zachary Hamilton

    Wednesday, October 28, 2020 2:31 PM
  • The path is actually a volume mount point pointing to the replica on the DPM Storage pool disk.  DPM stores these volume mount points over here: 
    C:\Program Files\Microsoft System Center\DPM\DPM\Volumes\Replica\.....

    No space will be consumed on the C: drive.
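To confirm that a replica path really lives on a mounted volume rather than on C:, you can walk up the path until you hit a mount point and then look at the free space reported for it. This is a minimal Python sketch using only the standard library; the helper names are hypothetical, and it assumes Python is available on the DPM server and that Windows reports the mounted volume's usage for paths beneath the mount point.

```python
import os
import shutil


def nearest_mount_point(path):
    """Walk up from path until a mount point (or the filesystem root) is reached."""
    current = os.path.abspath(path)
    while not os.path.ismount(current):
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the top without finding a mount point; stop here.
            break
        current = parent
    return current


def free_bytes(path):
    """Free space, in bytes, on the volume that holds path."""
    return shutil.disk_usage(path).free
```

If `nearest_mount_point` returns a folder under `...\DPM\Volumes\Replica\` instead of `C:\`, the replica data is on the mounted storage volume, and `free_bytes` on that folder should reflect the storage disk rather than the 100 GB system drive.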

    If you have no backup errors with your file servers then I’d say all is OK.


    Wednesday, October 28, 2020 2:35 PM
  • Leon,

    Sorry I wasn't specific.  The file server cluster continues to show "Replica is inconsistent" for both G:\ and U:\, which are the two drives I'm backing up.  If I "Run a synchronization job with consistency check..." it runs for several hours (there's a lot of data) and then fails.

    Thanks,

    Zachary Hamilton

    Wednesday, October 28, 2020 3:12 PM
  • For DPM to start protecting your file clusters, it is important to deploy a DPM agent to all the file cluster nodes, and when that is done, you will be able to start protecting your file cluster.

    So you have the "File Server" cluster role on your file cluster?

    Is it a "File Server for general use" or "Scale-Out File Server for application data" configured file cluster?
    https://docs.microsoft.com/en-us/windows-server/failover-clustering/deploy-two-node-clustered-file-server

    What kind of cluster disks are the G: and U: drives? Simple cluster disks, or Cluster Shared Volumes (CSVs)?


    Wednesday, October 28, 2020 3:54 PM
  • I deployed the agent to FS201 and FS202, and I then attached the cluster name FS200. I did not attach FS300, which if I understand the link you provided, is the Client Access Point.

    Yes, the "File server" role is there, and IIRC, I configured it as "File server for general use".

    I think G: and U: are CSVs, but I'm not really sure how to tell.  I created them according to the link you provided (in fact, that link looks familiar so I think it was the exact same link).  I can't find a whole lot of info on the Internet regarding "simple cluster disks" to provide a frame of reference.

    Thanks.

    Thursday, October 29, 2020 1:08 PM
  • You can tell if G: and U: are CSVs by checking the Disks within the Failover Cluster Manager tool.

    DPM does not yet support backing up the File Server over CSVs, so if your File Server role is located on CSVs then it is not supported.


    Thursday, October 29, 2020 1:46 PM
  • They do not appear to be CSVs. The "Assigned to" column shows FS300. Should I "attach" FS300 in DPM?

    Thanks.

    Thursday, October 29, 2020 2:37 PM
  • Could you provide a screenshot? (Hide any sensitive information).

    Thursday, October 29, 2020 2:39 PM
  • Here ya go!  Thanks.

    Thursday, October 29, 2020 5:20 PM
    Looks OK. DPM can protect the volumes or the file shares on the cluster, since only one node owns the resource at any given time.  So install the DPM agent on all nodes in the cluster, and in the protection group wizard, expand the cluster and protect the volumes / shares.

    Can you show the exact error and the error description that you're receiving when attempting to back up the drives?


    Thursday, October 29, 2020 5:34 PM
  • Here's a screenshot. Needless to say, clicking "Run a synchronization job with consistency check..." doesn't work.

    Thursday, October 29, 2020 5:51 PM
  • This has been a common issue with DPM 2019. Please try the workarounds mentioned here:
    https://feedback.azure.com/forums/914488-system-center-data-protection-manager/suggestions/37713133-dpm-2019-error-40002-the-vhd-containing-the-rep

    This can also be a "generic" error that applies to many scenarios and can therefore require deeper troubleshooting; it has been a long-standing issue for many DPM customers. If the fixes in the link above don't help, I would strongly suggest creating a support request directly with Microsoft: https://support.serviceshub.microsoft.com/supportforbusiness


    Thursday, October 29, 2020 5:57 PM
  • I will try the things mentioned in the link. If none work, would rolling back to DPM 2016 be an option? It would still be a step up for us since we were upgrading from DPM 2012. Is DPM 2016 a more stable product?

    Thanks,

    Zachary Hamilton

    Friday, October 30, 2020 2:23 PM
  • I'd say DPM 2016 is about the same as DPM 2019, but there are definitely more improvements in the latter. If you're still having issues, I'd suggest creating a support request.


    Friday, October 30, 2020 2:26 PM
  • I marked the answer above for SQL. Getting a new error now, so I'll probably be opening a support ticket unless this one is easy. Thanks for all the help.

    Friday, October 30, 2020 3:04 PM