Microsoft Azure Recovery Services Management Agent service terminated unexpectedly

  • Question

  • Using the Azure Recovery Services Management Agent to connect SCDPM to an Azure Recovery Services vault for replicating backup data, but the agent keeps crashing every 5 minutes. I've tried uninstalling and reinstalling, and verified that I have the latest version of the agent downloaded from the Azure management portal, but I'm not sure what the problem is. Thoughts? It generates three error events in the Windows event log each time it crashes (details below).

    Log Name:      System
    Source:        Service Control Manager
    Date:          6/28/2018 9:33:08 AM
    Event ID:      7031
    Task Category: None
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:     
    Description:
    The Microsoft Azure Recovery Services Management Agent service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 300000 milliseconds: Restart the service.

    Log Name:      Application
    Source:        Application Error
    Date:          6/28/2018 9:33:08 AM
    Event ID:      1000
    Task Category: (100)
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:     
    Description:
    Faulting application name: OBRecoveryServicesManagementAgent.exe, version: 2.0.9118.0, time stamp: 0x5ab8ef90
    Faulting module name: TraceProvider.dll, version: 2.0.9118.0, time stamp: 0x5ab8efd6
    Exception code: 0xc0000005
    Fault offset: 0x00000000000063a1
    Faulting process id: 0x16e4
    Faulting application start time: 0x01d40ee48ab9f210
    Faulting application path: C:\Program Files\Microsoft Azure Recovery Services Agent\bin\OBRecoveryServicesManagementAgent.exe
    Faulting module path: C:\Program Files\Microsoft Azure Recovery Services Agent\bin\TraceProvider.dll
    Report Id: cac8d9f4-7ad7-11e8-80c5-842b2b6def83
    Faulting package full name:
    Faulting package-relative application ID:

    Log Name:      Application
    Source:        .NET Runtime
    Date:          6/28/2018 9:33:08 AM
    Event ID:      1026
    Task Category: None
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:     
    Description:
    Application: OBRecoveryServicesManagementAgent.exe
    Framework Version: v4.0.30319
    Description: The process was terminated due to an unhandled exception.
    Exception Info: System.AccessViolationException
       at <Module>.CTraceProvider.TraceToErrorFile(CTraceProvider*, DLS_TRACE_EVENT*)
       at Microsoft.Internal.EnterpriseStorage.Dls.Trace.TraceProvider.Trace(Microsoft.Internal.EnterpriseStorage.Dls.Trace.TraceFlag, System.String, Int32, System.Guid*, Boolean, System.String, System.Object[])
       at Microsoft.Internal.EnterpriseStorage.Dls.Trace.TraceProvider._TraceMessage(Microsoft.Internal.EnterpriseStorage.Dls.Trace.TraceFlag, System.String, Int32, System.String, System.Object[])
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.OMCommon.DpmServerFactory.GetServer(System.String, System.ComponentModel.AsyncOperation, Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.OMCommon.DpmServerScope)
       at Microsoft.Internal.EnterpriseStorage.Dls.UI.Cmdlet.Common.ConnectDpmServer.ProcessRecord()
       at System.Management.Automation.CommandProcessor.ProcessRecord()
       at System.Management.Automation.CommandProcessorBase.DoExecute()
       at System.Management.Automation.Internal.PipelineProcessor.SynchronousExecuteEnumerate(System.Object, System.Collections.Hashtable, Boolean)
       at System.Management.Automation.PipelineOps.InvokePipeline(System.Object, Boolean, System.Management.Automation.CommandParameterInternal[][], System.Management.Automation.Language.CommandBaseAst[], System.Management.Automation.CommandRedirection[][], System.Management.Automation.Language.FunctionContext)
       at System.Management.Automation.Interpreter.ActionCallInstruction`6[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.Boolean, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].Run(System.Management.Automation.Interpreter.InterpretedFrame)
       at System.Management.Automation.Interpreter.EnterTryCatchFinallyInstruction.Run(System.Management.Automation.Interpreter.InterpretedFrame)
       at System.Management.Automation.Interpreter.EnterTryCatchFinallyInstruction.Run(System.Management.Automation.Interpreter.InterpretedFrame)
       at System.Management.Automation.Interpreter.EnterTryCatchFinallyInstruction.Run(System.Management.Automation.Interpreter.InterpretedFrame)
       at System.Management.Automation.Interpreter.Interpreter.Run(System.Management.Automation.Interpreter.InterpretedFrame)
       at System.Management.Automation.Interpreter.LightLambda.RunVoid1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]](System.__Canon)
       at System.Management.Automation.DlrScriptCommandProcessor.RunClause(System.Action`1<System.Management.Automation.Language.FunctionContext>, System.Object, System.Object)
       at System.Management.Automation.DlrScriptCommandProcessor.Complete()
       at System.Management.Automation.CommandProcessorBase.DoComplete()
       at System.Management.Automation.Internal.PipelineProcessor.DoCompleteCore(System.Management.Automation.CommandProcessorBase)
       at System.Management.Automation.Internal.PipelineProcessor.SynchronousExecuteEnumerate(System.Object, System.Collections.Hashtable, Boolean)
       at System.Management.Automation.Runspaces.LocalPipeline.InvokeHelper()
       at System.Management.Automation.Runspaces.LocalPipeline.InvokeThreadProc()
       at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
       at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
       at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
       at System.Threading.ThreadHelper.ThreadStart()


    Shaun

    Thursday, June 28, 2018 2:20 PM

All replies

  • I have the same issue as you do.  Heard anything, or figured it out?

    Tuesday, July 3, 2018 6:17 PM
  • Hi

    I'm also getting this issue. Did you manage to find a resolution?

    I have found that if you stop the Microsoft Azure Recovery Services Agent, then start the Microsoft Azure Recovery Services Management Agent, and then restart the Microsoft Azure Recovery Services Agent, the Management Agent will stay running.
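
    For reference, a sketch of that sequence from an elevated command prompt. The display names match the service names mentioned above; verify them in services.msc on your own server first, since net stop/start accepts either the display name or the short service name.

    rem Stop the backup agent so the Management Agent can initialize cleanly
    net stop "Microsoft Azure Recovery Services Agent"
    rem Start the Management Agent while the backup agent is down
    net start "Microsoft Azure Recovery Services Management Agent"
    rem Bring the backup agent back up
    net start "Microsoft Azure Recovery Services Agent"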

    It does not fix my issue for long, and I am still having issues getting my recovery points to sync to Azure, as I get a warning message in the CBEngineCurr.Errlog:

    WARNING    cbpVersion 2.0.9118.0 , latestVersion 2.0.9109.0

    Thanks

    

    Tuesday, July 24, 2018 6:56 AM
  • Same problem here.

    3 errors every 5 minutes:

    1. Application error
    Faulting application name: OBRecoveryServicesManagementAgent.exe, version: 2.0.9127.0, time stamp: 0x5b6221b9
    Faulting module name: TraceProvider.dll, version: 2.0.9127.0, time stamp: 0x5b6221eb
    Exception code: 0xc0000005
    Fault offset: 0x00000000000063a1
    Faulting process id: 0x17d8

    2. .NET Runtime error

    Application: OBRecoveryServicesManagementAgent.exe
    Framework Version: v4.0.30319
    Description: The process was terminated due to an unhandled exception.
    Exception Info: System.AccessViolationException

    3. Service Control Manager

    The Microsoft Azure Recovery Services Management Agent service terminated unexpectedly. 

    Any updates? Any ideas?
    Thanks in advance.

    Monday, August 13, 2018 9:09 AM
  • Hi,

    I confirm that this sequence does make a difference. I managed to go 10 minutes so far and it did not crash.

    Clearly it's not a solution (and it will break at the first restart), but it is a marked improvement!

    thanks

    Thursday, September 6, 2018 3:46 PM
  • I have the same issue with the Azure agent on at least 7 DPM servers; all of them have OBRecoveryServicesManagementAgent.exe crashing every 5 minutes, 24/7.

    Please reply with a resolution or if I should open a Premier support ticket.

    Thanks,

    David

    Friday, September 28, 2018 9:03 PM
  • Workaround (by Michael Britton)

    1. Stop the Microsoft Azure Recovery Services Agent
    2. Start the Microsoft Azure Recovery Services Management Agent
    3. Start the Microsoft Azure Recovery Services Agent

    This will fix the problem until the next restart. Apparently it all depends on the startup sequence: the Management Agent is set to delayed start, so the Recovery Services Agent starts first, and when the Management Agent starts it crashes. After 300 seconds the OS restarts it, it crashes again, and so on. Reverse the sequence and all goes well.
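
    One way to make that ordering stick across reboots (a sketch under assumptions, not an official fix) is to declare a service dependency so the Recovery Services Agent only starts after the Management Agent. The short service names below are assumptions based on the executable names in this thread; confirm them with sc query before running this from an elevated prompt.

    rem Make obengine (Recovery Services Agent) depend on the Management Agent,
    rem so Windows starts the Management Agent first at boot
    sc config obengine depend= obrecoveryservicesmanagementagent

    rem Verify the resulting configuration, including the DEPENDENCIES list
    sc qc obengine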

    BTW: a new version of the agent is out. It may resolve the issue, but I have not tested it yet.


    Monday, October 1, 2018 8:30 AM
  • David,

    Did you (or anyone) open a ticket and find a resolution for this?  I'm plagued with the same issue.  I tried Michael Britton's workaround but I didn't have any luck with it.

    Phil

    Wednesday, October 3, 2018 2:42 PM
  • I did not; even though I have a support plan, I don't want to invest time in this.

    But I did verify that the latest version of the agent (2.0.9127.0) still has the problem, so the only solution that works appears to be the one above. It worked for me several times, and it is easy to automate (use Task Scheduler with a trigger on startup).
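
    In case it helps, a rough sketch of that automation (the task name and script path are placeholders; the .cmd file would contain the stop/start sequence from the workaround above):

    rem Register a task that runs the fix script at every system startup, as SYSTEM
    schtasks /Create /TN "Fix MARS Management Agent" /SC ONSTART /RU SYSTEM /TR "C:\Scripts\fix-mars-agent.cmd"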

    gp

    Wednesday, October 3, 2018 3:54 PM
  • I have a case open with Azure Support and they're taking a look at it now.  They're trying to recreate some issues and hopefully this will lead somewhere.  If I get a resolution (or not), I'll report back.

    Phil

    Wednesday, October 3, 2018 4:23 PM
  • Well, this is really unsatisfying, but it's an answer.  The tech successfully replicated the following issues in his own environment --

    • repeated .NET Runtime System.AccessViolationException (Event ID 1026)
    • repeated Application Error for OBRecoveryServicesManagementAgent.exe version 2.0.9127.0 (Event ID 1000)
    • "Version" and "Azure Backup Agent Version" fields blank in the Azure Recovery Services vault blade where the DPM server is listed under "Backup Management Servers"

    I was told that although these behaviors were not desired, they were low-priority bugs since the backups were still succeeding. I was told to ignore the errors in the Application log and that they would likely be fixed in the next update. I'm not sure if that means a .NET Framework update, a DPM update, or a MARS agent update. I have verified that all my online recovery points are accessible through DPM, so I guess that's as much of an answer as I'll get.


    Wednesday, October 3, 2018 6:14 PM
  • Well, consider that the Management Agent is not really running: it starts, fails, and is restarted after 300 seconds, and so on forever.

    So you could just disable it and save resources, and log space...

    Wednesday, October 3, 2018 6:40 PM
  • You're right that it probably wouldn't hurt anything, but I prefer to hit Microsoft with a firehose of Windows Error Reporting cases that may end up spurring some action.
    Wednesday, October 3, 2018 6:46 PM
  • I am still having this problem with the agent. The solution of reversing the sequence works fine until the next boot, so I made a batch file with this code:

    net stop obengine
    net start RecoveryServicesManagementAgent
    net start obengine

    On the Recovery tab of the RecoveryServicesManagementAgent service's properties, I selected "Run a Program" as the action for the first and second failures and pointed it at this batch file. Now every time the service fails, it runs the batch file and fixes itself. It works for me.
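
    The same recovery action can also be configured from an elevated prompt instead of the GUI. A sketch, assuming the short service name used in the batch file above is correct for your install (check with sc query), with a placeholder script path:

    rem Reset the failure counter after one day (86400 s); on the first and
    rem second failures, run the fix script 60 seconds after the failure
    sc failure RecoveryServicesManagementAgent reset= 86400 command= "C:\Scripts\fix-mars-agent.cmd" actions= run/60000/run/60000//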

    Hope this helps until they fix it.

    Sunday, November 25, 2018 6:40 AM
  • This was happening for me. Brand new install of DPM 2016, and my online job creation was failing for a new protection group.

    I checked the event logs and saw exactly these events. I stopped and started the services in the order suggested, and I was able to create/set up my online backups.

    Hopefully we won't have to restart the server all that often. Let's hope this is fixed soon!

    Thursday, January 17, 2019 2:31 PM
  • Please disable the OBRecoveryServicesManagementAgent.exe service.

    It will crash on DPM 2012 R2 and 2016, but not on the 1801 and 2019 versions.

    This is the service that uploads the data to the Recovery Services vault (and creates a nice report about what you are backing up, etc.), a feature that is supported on DPM 1801, DPM 2019, and Azure Backup Server V2 and V3, but not on DPM 2012 R2 and 2016.

    So you are safe to disable the service.
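
    A sketch of how to do that from an elevated prompt; the short service name is the executable name discussed in this thread, so double-check it with sc query first:

    rem Stop the crashing Management Agent and prevent it from starting again
    net stop obrecoveryservicesmanagementagent
    sc config obrecoveryservicesmanagementagent start= disabled

    rem To revert later, restore the delayed automatic start
    sc config obrecoveryservicesmanagementagent start= delayed-auto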

    Thanks.



    Saturday, February 23, 2019 4:38 PM
  • That stops the crashes, but I cannot create an online recovery point, which was one of the reasons to have DPM protecting our servers.

    I'm now getting "Online recovery point creation jobs for Microsoft Hyper-V %server% on %host% have failed 1 times (ID 3188). Microsoft Azure Backup Agent was unable to initialize the operation (ID 100046)"

    To me, this indicates that the service that was disabled is actually required.

    Tuesday, March 5, 2019 11:15 AM
  • Maybe I didn't explain it correctly! There are two services: OBengine and OBRecoveryServicesManagementAgent.

    If you disable OBengine, the MARS agent will of course NOT work.

    If you disable OBRecoveryServicesManagementAgent, the MARS agent will NOT stop working and the online backups will still complete successfully.

    In Services.msc, the display name for OBengine is Microsoft Azure Recovery Services Agent, and for OBRecoveryServicesManagementAgent the service name is Microsoft Azure Recovery Services Management Agent.

    That said, make sure you disable the correct service, which in this case is Microsoft Azure Recovery Services Management Agent (OBRecoveryServicesManagementAgent).

    Hope this clears up your issue/question...
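
    To double-check which short name maps to which display name before disabling anything, something like this from an elevated prompt:

    rem Print the display name registered for each short service name
    sc getdisplayname obengine
    sc getdisplayname obrecoveryservicesmanagementagent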

    Wednesday, March 6, 2019 9:46 PM
  • Same issue with me; also looking for a fix.
    Wednesday, March 6, 2019 9:50 PM
  • I have this also.
    Wednesday, March 6, 2019 9:53 PM
  • Now it's running, but Data Transferred is always 0 MB. I don't know if it's transferring or not, or whether it will ever finish or is just sitting there without any progress.
    Thursday, March 7, 2019 7:02 AM
  • Thanks Tome, I was a bit confused as the properties of the services didn't show the OBengine name.

    I've done as you suggested and am running the online jobs now; let's see if this resolves the problem.

    Monday, March 11, 2019 11:49 AM
  • It appears that there are no more crashes, but it doesn't look like any data is being transferred to my Azure vault. I've set the online retention to be different from my disk-based retention (14 days online, 5 days on disk), but I only have 5 recovery points for each item being protected.

    Online recovery is a must; otherwise, why have the ability to add an Azure vault if you can't use it?

    Friday, March 15, 2019 11:18 AM
  • Thanks for the workaround
    Wednesday, August 7, 2019 12:15 PM