DPM 2010 Not Backing Up SQL Database

  • Question

  • Hello all,

    I have a problem with DPM not backing up certain SQL databases. The server it is trying to back up is running Server 2008 R2 and SQL Server 2005, and is due to be replaced early in the New Year. DPM is protecting multiple SQL databases on it. Some do seem to be working, as I can see recovery points, but most (including the major ones) haven't backed up in a while.

    The error is:

    DPM failed to communicate with the protection agent because the agent is not responding (ID43 Details: Internal error code 0X8099090E)

    The error points to the agent being the issue (e.g. a firewall, or the agent needing a re-install), but if it was the agent, why are other databases backing up?

    I wanted to try rebooting the server but was told this should be done as a last resort.

    I have tried running consistency checks but, to be honest, they moved on to the next day and then the next, and now I can't even see where to look for the outcome. I have just tried running one again on the master database and will monitor it.

    The databases that are failing are 200GB - 300GB. I did read that it could be because there isn't enough space. How can I check exactly which volume on the DPM server these are being backed up to? If I view Storage Management, I'm seeing hundreds of volumes and indeed, some are low on space. Could this be the problem? (screenshot attached)

    Any help would be much appreciated. I'm not familiar with DPM, as I have come to it from ArcServe, so I am finding it difficult to decipher.

    Many Thanks in advance.

    Sunday, December 9, 2018 11:15 AM

All replies

  • Hello Gary!

    What errors are you receiving in the Monitoring tab when the backup fails? There should be more detailed errors there.

    You might also want to investigate the following logs:

    • System & Application logs of the SQL server
    • DPMRACurr.errlog log on the SQL server (C:\Program Files\Microsoft Data Protection Manager\DPM\Temp)
    • MSDPMCurr.errlog on the DPM server (C:\Program Files\Microsoft Data Protection Manager\DPM\DPM\Temp)

    The paths may be slightly different; I'm not sure I remember the DPM 2010 paths correctly.
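    As a rough aid for reading those errlog files, here is a minimal Python sketch that keeps only the lines carrying a FATAL or WARNING level. It assumes the level appears as a whitespace-separated token on each line, as in typical DPM errlog output; the function name filter_errlog is hypothetical and not part of DPM.

```python
LEVELS = ("FATAL", "WARNING")


def filter_errlog(lines, levels=LEVELS):
    """Return (level, line) pairs for lines that carry one of the given levels.

    Assumption: the level is a standalone whitespace-separated token,
    e.g. '... 139DB9DC-... FATAL Subtask failure ...'.
    """
    hits = []
    for line in lines:
        tokens = line.split()
        for level in levels:
            if level in tokens:
                hits.append((level, line.rstrip()))
                break  # report each line once, at the first matching level
    return hits
```

    To use it, read the log file into a list of lines first (the file encoding may need adjusting for your system) and pass that list to filter_errlog.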

    To check the replica path of a protected item in DPM:

    1. Go to the Protection tab, then select the protection group where you have your SQL databases.
    2. Select your specific SQL database and at the bottom you should see some Details.
    3. Click the "Click to view details" link; there you should be able to see the path.


    You mentioned that you are going to replace the server. Which server are you replacing: the SQL server running on Windows Server 2008 R2?
    I would strongly recommend upgrading both your server operating system(s) and DPM.

    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com

    Sunday, December 9, 2018 2:07 PM
  • Hi Leon,

    Thanks for replying, much appreciated.

    The consistency check and recovery point creation for all databases are failing with these errors:

    DPM failed to communicate with the protection agent because the agent is not responding. (ID 43 Details: Internal error code: 0x8099090E)

    The replica of SQL Server 2005 database is not consistent with the protected data source. (ID 91)

    Thanks for showing me where the volumes reside. They are not out of space, but would they dynamically expand anyway? They are nowhere near as big as I thought they would be.

    I'm going to examine the other Event Logs you have suggested and will post back.

    A new server to replace the ageing Server 2008 and SQL Server 2005 has been set-up by another engineer. I have already raised that DPM 2010 urgently requires updating. I'm hoping to learn more about this myself. I've found some good MS online classes and plan on watching these soon.

    Many Thanks again for your help.

    Sunday, December 9, 2018 6:55 PM
    Let's start by looking at the logs for a better understanding; please share them here.

    If other SQL databases are being backed up, then the connectivity should be OK to the protected server (Windows Server 2008 R2 with SQL Server 2005).

    It could be that something is causing the DPM agent to crash, but let's see what the logs tell us first.



    Sunday, December 9, 2018 8:04 PM
  • Hi Leon,

    There is only 1 error but this is showing constantly in the Application Logs of the SQL Server:

    (Event ID 85) A DPM agent failed to communicate with the DPM service on (DPM Server) because of a communication error. Make sure that (DPM Server) is remotely accessible from the computer running the DPM agent. If a firewall is enabled on (DPM Server), make sure that it is not blocking requests from the computer running the DPM agent (Error code: 0x800706ba, full name: (DPM Server))

    There is also this warning regularly coming up:

    Certificate for local system with Thumbprint 11 71 0b 86 51 57 ca 03 c5 ff ab 78 4c 93 81 e5 8f 21 ef 7d is about to expire or already expired.

    Under the System log, this is constantly showing:
    (Event ID 1009) DCOM was unable to communicate with the computer (DPM Server) using any of the configured protocols.

    And occasionally, this is showing:
    (Event ID 1053) The processing of Group Policy failed. Windows could not resolve the user name. This could be caused by one or more of the following: 
    a) Name Resolution failure on the current domain controller. 
    b) Active Directory Replication Latency (an account created on another domain controller has not replicated to the current domain controller).

    Here are some fatal errors seen in the DPM Logs from the SQL Server:

    10D0 0F78 12/08 20:25:09.547 03 runtime.cpp(1599) [0000000000148650] 139DB9DC-BA87-4880-8778-2415D5234967 FATAL Subtask failure, sending status response XML=[<?xml version="1.0"?>
    10D0 0F78 12/08 20:25:09.547 03 runtime.cpp(1599) [0000000000148650] 139DB9DC-BA87-4880-8778-2415D5234967 FATAL <Status xmlns="http://schemas.microsoft.com/2003/dls/StatusMessages.xsd" StatusCode="-2137454160" Reason="Error" CommandID="RAGetWorkItemInfo" CommandInstanceID="47e8588f-adb8-4e31-9cb8-693f7a34ed9c" GuidWorkItem="3416bba6-a517-463f-91a3-4beadf00e0c5" TETaskInstanceID="139db9dc-ba87-4880-8778-2415d5234967"><ErrorInfo xmlns="http://schemas.microsoft.com/2003/dls/GenericAgentStatus.xsd" ErrorCode="2001" DetailedCode="-2137454160" DetailedSource="2"><Parameter Name="AgentTargetServer" Value="SQL Server"/></ErrorInfo></Status>
    10D0 0F78 12/08 20:25:09.547 03 runtime.cpp(1599) [0000000000148650] 139DB9DC-BA87-4880-8778-2415D5234967 FATAL ]
    10D0 0F78 12/08 20:25:09.547 05 defaultsubtask.cpp(680) [0000000003A56FE0] 139DB9DC-BA87-4880-8778-2415D5234967 WARNING Failed: Hr: = [0x809909b0] : F: lVal : CommandReceived(pAgentOvl)

    And these are fatal errors from the DPM Server:

    07 AMUtil_expanded.cs(3474) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING CheckTimeoutMessage: code[0x00000102], detailedCode[0x8099090e], errMgs[Internal error code: 0x8099090E]
    0BC4 30C4 12/10 16:05:45.261 07 AMUtil_expanded.cs(3474) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING [<?xml version="1.0" encoding="utf-16"?>
    0BC4 30C4 12/10 16:05:45.261 07 AMUtil_expanded.cs(3474) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING <Status xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" StatusCode="-2137454322" Reason="Timeout" CommandInstanceID="eb0200ec-6ad2-4025-a581-c3a1e40fb1ad" CommandID="GetProperties" GuidWorkItem="7b5da4a2-24ea-415f-9c1e-9a7e7b0be284" TETaskInstanceID="7b5da4a2-24ea-415f-9c1e-9a7e7b0be284" xmlns="http://schemas.microsoft.com/2003/dls/StatusMessages.xsd">
    0BC4 30C4 12/10 16:05:45.261 07 AMUtil_expanded.cs(3474) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   <ErrorInfo ErrorCode="258" DetailedCode="-2137454322" DetailedSource="2" ExceptionDetails="" xmlns="http://schemas.microsoft.com/2003/dls/GenericAgentStatus.xsd">
    0BC4 30C4 12/10 16:05:45.261 07 AMUtil_expanded.cs(3474) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING     <Parameter Name="servername" Value="SQL Server" />
    0BC4 30C4 12/10 16:05:45.261 07 AMUtil_expanded.cs(3474) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   </ErrorInfo>
    0BC4 30C4 12/10 16:05:45.261 07 AMUtil_expanded.cs(3474) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING </Status>].
    0BC4 30C4 12/10 16:05:45.262 07 InstallListRefresh.cs(484) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 NORMAL FailTask[InstallListRefresh]
    0BC4 30C4 12/10 16:05:45.262 07 AMUtil_expanded.cs(79) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 NORMAL RaiseAgentUnreachableAlert serverName -> SQL Server
    0BC4 30C4 12/10 16:05:45.263 07 Deployment.cs(2147) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 NORMAL Checking whether reboot required for [SQL Server].
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING Task Diagnostic Information - <?xml version="1.0" encoding="utf-16"?>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING <TaskExecutionContext>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   <AMServerName>SQL Server</AMServerName>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   <AMOsType>None</AMOsType>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   <AMClusterName>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   </AMClusterName>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   <TEVerb>InstallListRefresh</TEVerb>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   <TEErrorState>InstallListRefresh.AgentInstallStatusInquiring</TEErrorState>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   <TEErrorDetails>&lt;?xml version="1.0" encoding="utf-16"?&gt;
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING &lt;q1:ErrorInfo ErrorCode="316" DetailedCode="-2137454322" DetailedSource="2" ExceptionDetails="" xmlns:q1="http://schemas.microsoft.com/2003/dls/GenericAgentStatus.xsd"&gt;
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING   &lt;q1:Parameter Name="servername" Value="SQL Server" /&gt;
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING &lt;/q1:ErrorInfo&gt;</TEErrorDetails>
    0BC4 30C4 12/10 16:05:45.266 01 TaskInstance.cs(800) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 WARNING </TaskExecutionContext>
    0BC4 30C4 12/10 16:05:45.266 02 EventManager.cs(95) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 NORMAL Publishing event from TaskInstance.cs(825): TaskStop, [TaskID=7b5da4a2-24ea-415f-9c1e-9a7e7b0be284]
    0BC4 30C4 12/10 16:05:45.269 01 TaskExecutor.cs(845) 7B5DA4A2-24EA-415F-9C1E-9A7E7B0BE284 FATAL Task stopped (state=Failed, error=AMAgentNotResponding; -2137454322; WindowsHResult), search "Task Diagnostic Information" for details.
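
    A note for anyone matching these log entries against the console errors: the StatusCode and DetailedCode values in the XML are signed 32-bit integers, while the console shows the same codes as unsigned hex. The Python sketch below converts between the two; the function names are my own and nothing here is DPM-specific.

```python
def to_hresult(signed_code):
    """Format a signed 32-bit status code as an unsigned hex string."""
    return "0x{:08X}".format(signed_code & 0xFFFFFFFF)


def from_hresult(hex_text):
    """Convert a hex code such as '0x8099090E' back to its signed 32-bit value."""
    value = int(hex_text, 16)
    # Values with the top bit set are negative when read as signed 32-bit.
    return value - 0x100000000 if value & 0x80000000 else value


print(to_hresult(-2137454322))  # DetailedCode from the DPM server log
print(to_hresult(-2137454160))  # StatusCode from the agent log
```

    With this, -2137454322 maps to 0x8099090E (the code in the ID 43 alert) and -2137454160 maps to 0x809909B0 (the Hr value in the agent log).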

    Many thanks for your help with this.
    Gary


    Monday, December 10, 2018 7:46 PM
    So you are backing up multiple SQL databases on this Windows Server 2008 R2 server, and "some" of the databases are failing. Is that right?

    The error you are receiving on the protected server indicates that the agent cannot make an RPC connection to the DPM Server.

    Can you verify the protected server can communicate with the DPM Server on the following port: TCP 135?

    You can test this with the telnet command, telnet <dpm server name> 135 (requires the Telnet Client Windows feature to be installed), or download and use the PortQry tool.
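
    If neither tool is at hand, the same check can be sketched in a few lines of Python. This is only an illustration: probe_port and its parameters are hypothetical names, and the host you pass in is your own DPM server name. It tries the connection several times, because a single attempt can hide an intermittent failure. A successful connect to TCP 135 only shows that the RPC endpoint mapper is reachable, not that the whole RPC conversation works.

```python
import socket
import time


def probe_port(host, port=135, attempts=5, timeout=3.0, pause=1.0):
    """Attempt a TCP connection several times; return one boolean per attempt."""
    results = []
    for i in range(attempts):
        try:
            # create_connection resolves the name and performs the TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                results.append(True)
        except OSError:
            # covers refused connections, timeouts and name-resolution failures
            results.append(False)
        if i < attempts - 1:
            time.sleep(pause)
    return results
```

    For example, probe_port("<dpm server name>") returns a list such as [True, False, True, ...]; a mix of True and False suggests the port is only intermittently reachable.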

    Is there anything else installed on the Windows Server 2008 R2 server besides SQL Server, for example roles, features, or software that may interfere?



    Monday, December 10, 2018 8:17 PM
  • Hi Leon,

    Yes, multiple SQL databases that are being backed up reside on the same SQL server. Most are failing every day and have been since the 19th November, but there are a couple which have worked on and off, some as recently as the 3rd and 6th of December. On the days they don't work, these couple have error messages about not being consistent, whereas all the others fail because DPM failed to communicate with the protection agent.

    I ran the Telnet test and indeed, it couldn't communicate. Really strange how it works sometimes? If I telnet to another DPM server, it works so it would seem the issue is with the DPM server.

    The firewall isn't turned on for Domain networks on either the SQL or the DPM server.

    There isn't really much else running on the server. I did notice Microsoft System Center 2012 SP1 DPM Agent Coordinator has an install date of 07/12/2018? Other than that, just a few utilities which have been there years.

    On checking (and I've learnt to do this first now), this system has been up for 535 days. In light of the fact that it's running SQL Server 2005, which isn't supported, and an old version of DPM, and has been on for such a long time, I can see this being a minefield to fix. No updates have been installed since June 2017 and, on doing a check, nothing is available. Really, it could do with being replaced ASAP.

    If you have any other suggestions, I would be very interested. Your tips have already helped me gain a better understanding of DPM, for which I am very grateful!

    Tuesday, December 11, 2018 10:57 AM
  • I'm wondering, could it be something to do with the service account or which users are permitted on each server? 

    It's using the local system account to start the service.

    I did try using a Domain Admin account as well but that made no difference.

    I have also now got the DPM server fully up to date, restarted it, and tried disabling anti-virus.


    Tuesday, December 11, 2018 2:21 PM
  • Sorry for the late reply!

    By default the DPM agent runs with the Local System account so that shouldn't be the issue.

    It is indeed an old server with an older SQL Server, and the server itself has not been updated. Like you said, it is a minefield, and pinpointing the issue could become very time consuming.


    Do you see any errors when running the command below on the Windows Server 2008 R2 (database server)?

    vssadmin list writers
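
    If the output is long, a small Python sketch like the one below can pick out writers that are not healthy. It assumes the usual English output layout, where each writer block has "Writer name:", "State:" and "Last error:" lines; unhealthy_writers is a hypothetical helper name, and the text it parses is whatever you captured from the command above.

```python
def unhealthy_writers(vssadmin_output):
    """Return writers whose state is not Stable or whose last error is set.

    Assumes English 'vssadmin list writers' output with lines such as:
        Writer name: 'SqlServerWriter'
           State: [1] Stable
           Last error: No error
    """
    writers = []
    current = None
    for raw in vssadmin_output.splitlines():
        line = raw.strip()
        if line.startswith("Writer name:"):
            name = line.split(":", 1)[1].strip().strip("'")
            current = {"name": name, "state": "", "last_error": ""}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return [
        w for w in writers
        if not w["state"].endswith("Stable") or w["last_error"] != "No error"
    ]
```

    An empty list back means every writer reported a Stable state with no error.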



    Tuesday, December 11, 2018 2:29 PM
  • No worries at all. I appreciate your help.

    When I run that command, no errors are reported for any of the writers.

    I did notice that if I try the Telnet command multiple times, every now and then it does seem to connect which would probably explain why the backup is occasionally working for smaller databases.

    I might try uninstalling DPM from this server and then re-configuring on the other DPM server where it does seem to connect ok - providing I have enough space.

    Tuesday, December 11, 2018 3:50 PM
  • If possible give the SQL Server a reboot just to see if it could resolve any issues before reinstalling DPM.

    If you're willing to reinstall DPM 2010, don't forget to have a look at the system requirements.

    I understand that backing up SQL databases is crucial and you might want to have backups now before the holidays.
    If the plan is to decommission the old server, without putting too much time and effort on the "old systems" you could always temporarily create SQL dumps as backups in the meantime.

    If a DPM reinstall is possible, you can give that a go of course but either way it could be that the SQL database server might need a reboot along this process.
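
    As a rough illustration of the temporary SQL dump idea, this Python sketch only builds the T-SQL BACKUP DATABASE statement for a given database and folder; it does not connect to SQL Server or run anything. The function name, the .bak naming and the example values are hypothetical, and WITH INIT (overwrite the existing backup file) is just one possible choice.

```python
def backup_statement(database, backup_dir):
    """Build a full-backup T-SQL statement that writes <database>.bak into backup_dir."""
    # Escape ']' inside a bracket-quoted identifier and "'" inside a string literal.
    safe_db = database.replace("]", "]]")
    path = backup_dir.rstrip("\\") + "\\" + database + ".bak"
    safe_path = path.replace("'", "''")
    return "BACKUP DATABASE [{}] TO DISK = N'{}' WITH INIT;".format(safe_db, safe_path)


# Hypothetical database name and target folder, for illustration only.
print(backup_statement("Sales", "D:\\Backups"))
```

    The resulting statement could then be run from SQL Server Management Studio or a scheduled job, ideally writing to a disk other than the one holding the databases.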



    • Marked as answer by Gary OConnor Monday, December 17, 2018 11:06 AM
    Wednesday, December 12, 2018 7:25 AM
  • Hi Leon,

    We rebooted the server and this resolved the problem and everything has now backed up.

    Many thanks for all of your help with this, it is much appreciated.

    Gary

    Monday, December 17, 2018 11:07 AM
    Glad to hear the reboot helped; it is often a quick and simple solution to many problems!


    Monday, December 17, 2018 11:54 AM