Agent Install/removal/repair/start failed

  • Question

  • On one system I want to repair an agent. The agent on this system is a mess: it was removed manually by deleting files and registry keys and then reinstalled manually, but it never became a working agent again.

    The system is still known in the console. When I start a repair, it says to look in MOMAgentMgmt.log on the management server, where I see this:

    11:49:30 : CServiceModule::Init : m_bService is set to true. HResult: 0
    11:49:30 : CServiceModule::Start : Service flag is set
    11:49:30 : CServiceModule::SetServiceStatus : State: 2, Error: 0
    11:49:30 : CServiceModule::SetServiceStatus : State: 2, Error: 0
    11:49:30 : IsCurrentMOMServerInstalled : MOM Setup key could not be queried.
    11:49:30 : CServiceModule::_RegisterFile : Before LoadLibrary
    11:49:30 : CServiceModule::_RegisterFile : After LoadLibrary
    11:49:30 : CServiceModule::_RegisterFile : Before GetProcAddress
    11:49:30 : CServiceModule::_RegisterFile : After GetProcAddress
    11:49:30 : CServiceModule::_RegisterFile : After FreeLibrary
    11:49:30 : CServiceModule::RegisterFile : Success return
    11:49:30 : CServiceModule::RegisterFile : WaitForSingleObject return  HResult: 0
    11:49:31 : CServiceModule::SetServiceStatus : State: 4, Error: 0
    11:49:32 : OmAgentFunctions::UninstallOmAgent : Entering
    11:49:32 : OmAgentFunctions::IsAgentManagementApiAvailable : Entering
    11:49:32 : OmAgentFunctions::GetAgentManagementApi : Entering
    11:49:32 : OmAgentFunctions::GetAgentManagementApi : Failed to retrieve Management Service Configuration. So, assuming agent management library is not installed HResult: 800700c1
    11:49:32 : OmAgentFunctions::GetAgentManagementApi : Leaving
    11:49:32 : OmAgentFunctions::IsAgentManagementApiAvailable : Failed to retrieve Management Service Configuration. Assuming Agent Management API is not available. HResult: 800700c1
    11:49:32 : OmAgentFunctions::IsAgentManagementApiAvailable : Leaving
    11:49:32 : OmAgentFunctions::UninstallOmAgent : IsAgentManagementApiAvailable() returned FALSE, giving up.
    11:49:32 : OmAgentFunctions::UninstallOmAgent : Leaving
    11:49:42 : CServiceModule::SetServiceStatus : State: 3, Error: 0
    11:49:42 : CServiceModule::SetServiceStatus : State: 3, Error: 0
    11:49:42 : IsCurrentMOMServerInstalled : MOM Setup key could not be queried.
    11:49:42 : CServiceModule::_RegisterFile : Before LoadLibrary
    11:49:42 : CServiceModule::_RegisterFile : After LoadLibrary
    11:49:42 : CServiceModule::_RegisterFile : Before GetProcAddress
    11:49:42 : CServiceModule::_RegisterFile : After GetProcAddress
    11:49:42 : CServiceModule::_RegisterFile : After FreeLibrary
    11:49:42 : CServiceModule::RegisterFile : Success return
    11:49:42 : CServiceModule::RegisterFile : WaitForSingleObject return  HResult: 0
    11:49:42 : CServiceModule::SetServiceStatus : State: 1, Error: 0

    If I try to repair it manually on the node I get this error:

    Error 25362. Failed to start -2147023843 service. Setup will proceed, please examine your eventlog.

    In the event log I only see that it failed to start the service.

    When I try to remove the agent manually I get this error:

    Error 1723. There is a problem with this Windows Installer package. A DLL required for this install to complete could not be run.

    I also started the HealthService with tracing on, and this is what I get from the trace files:

    [3]4528.5756::02/08/2017-15:49:42.539 [ConfigurationManager] [] [Error] :CConfigStorage::_GetPoolList{configstorage_cpp1639}CInstanceTable::GetInstance for the local health service failed with code WINERROR=80FF0036. An empty pool list will be returned.
    [1]4528.5960::02/08/2017-15:49:42.704 [ConnectorManager] [] [Error] :CQueueManagerInternal::NotifyOutgoingSessionAvailable{queuemanager_cpp2154}NotifyOutgoingSessionAvailable failed with code 8(ERROR_NOT_ENOUGH_MEMORY).
    [2]4528.5680::02/08/2017-15:49:47.281 [ConnectorChannel] [] [Error] :CChannelConnectorImpl::OnRequestCreateSession{channelconnectorimpl_cpp1261}Unable to request creation of a new session for ehv-host.domain.com, failed with code 0x80070015(ERROR_NOT_READY)
    [2]4528.5680::02/08/2017-15:49:47.281 [MOMConnector] [] [Error] :CConnectorDestinationConnection::RequestCreateSession{momconnectordestination_cpp132}Request to create a session for 00000000-0000-0000-0000-000000000000 has failed with 0x80070015(ERROR_NOT_READY)
    [0]4528.5680::02/08/2017-15:49:48.281 [ConnectorChannel] [] [Error] :CChannelConnectorImpl::OnRequestCreateSession{channelconnectorimpl_cpp1261}Unable to request creation of a new session for ehv-host.domain.com, failed with code 0x80070015(ERROR_NOT_READY)

    Any idea how to solve this nicely?

    Gr,

    Ronald


    Ronald


    Monday, February 13, 2017 1:36 PM

Answers

  • Well, finally found the cause and solution of this issue.

    This was caused by msvcr120.dll: a 32-bit version of this DLL was in c:\windows\system32. On an x64 system, system32 holds the 64-bit DLLs and the 32-bit DLLs belong in syswow64.

    It was solved by removing that DLL and reinstalling both the 32-bit and 64-bit versions of vcredist. After that, installation, uninstallation and the agent itself worked fine again. We don't know how these DLLs became mixed up; we suspect some bad packaging.
    We used http://www.dependencywalker.com/ to find this out.
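The bitness of a DLL can also be read straight from its PE header, without a separate tool. The following is a minimal sketch, not what was used in this thread: `pe_machine` and `dll_machine` are hypothetical helper names, and only the two machine types relevant here are mapped.

```python
import struct

# Machine field values from the PE/COFF file header.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def pe_machine(data: bytes) -> str:
    """Return the target architecture recorded in a PE image's COFF header."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte Machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown (0x{machine:04X})")


def dll_machine(path: str) -> str:
    """Read the start of the file at `path` and classify it."""
    with open(path, "rb") as handle:
        return pe_machine(handle.read(4096))
```

On a healthy x64 system, `dll_machine(r"C:\Windows\System32\msvcr120.dll")` should report x64 and the copy in SysWOW64 should report x86. Run it with a 64-bit Python: a 32-bit process is redirected from System32 to SysWOW64 by WOW64 file system redirection and would inspect the wrong file.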

    Thanks to my highly skilled co-worker!

    Gr,

    Ronald


    Ronald

    Thursday, March 2, 2017 9:08 AM

All replies

  • Hi Ronald,

    I have once used this particular guide:

    Forcibly removing a SCOM agent that cannot be uninstalled by normal means

    to get out of the exact same situation. You wrote:

    "it's been removed manually by deleting files and registry and reinstalling it manually but never got this to a working agent again."

    Did you complete all the steps from the article?

    If not, please try to do so and post back. It's also important to restart the affected computer once you have removed all the pieces and before installing the agent again.

    Regards,


    Stoyan (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!)

    • Proposed as answer by Yan Li_ Tuesday, February 14, 2017 5:32 AM
    Monday, February 13, 2017 4:35 PM
  • Hi Stoyan,

    I didn't know this tool, so I performed all the steps in the document. The registry keys are all gone, the agent folder is gone, and I restarted the machine. I then deleted the node in Operations Manager and tried to reinstall it from the console.

    It's still not installing; the installer failed with:

    Error code: 80070645
    Error description: This action is only valid for products that are currently installed.

    Is this still an indication that something is left of the old installation?

    Gr,
    Ronald


    Ronald

    Tuesday, February 14, 2017 9:12 AM
  • Hi Ronald,

    Please take a look here at what you have to remove in order to install the agent successfully:

    SCOM Agent Installation Error – 80070645 – This action is only valid for products that are currently installed

    Post back with the outcome.

    Regards,


    Stoyan

    Tuesday, February 14, 2017 9:52 AM
  • Hi Stoyan,

    That is the same key as the one that was deleted by the MSIZap utility; it is no longer there. I even looked through all the uninstall entries, but none of the remaining ones look related to the agent.

    Seems to be a pretty hopeless case?

    gr,

    Ronald


    Ronald

    Tuesday, February 14, 2017 2:43 PM
  • Hi Ronald,

    Don't give up that easily :) I know it is easy to say, but let's try another approach:

    - Please try pushing the agent from the console after you have cleaned everything. Do you get the same error? Just make sure you have removed the system from SCOM before doing this.

    In addition, you can search the registry for remaining records. You can use MOM, SCOM, Operations and other similar keywords.

    Regards,


    Stoyan

    Tuesday, February 14, 2017 3:06 PM
  • Hi,

    OK, I removed all keys matching the terms you mentioned from the registry and restarted the machine.

    I removed the node from SCOM, and after that I could install the agent from the console successfully, but it didn't turn into a healthy state. In fact the node stays in pending.

    And on the system the HealthService won't start, so I'm back to my first situation.

    If I start a repair on the system itself (the option in the console is grayed out) I get:

    Error 25362. Failed to start -2147023843 service. Setup will proceed, please examine your eventlog.

    If I try to manually uninstall the agent via Add/Remove Programs I get the DLL error again.

    Gr

    Ronald


    Ronald

    Wednesday, February 15, 2017 12:07 PM
  • Hi Ronald,

    Please take a look at this:

    Monitoring Agent installation : Error 25362.Failed to start -2147023843 service

    Unfortunately I have no more ideas that do not involve complex troubleshooting. :(

    In my opinion, if you get the service to start, you will be able to monitor the server properly, or at least do a regular uninstall and then install the agent again.

    So the question is how to troubleshoot this. There are many approaches, but if I were you I would use ProcMon to capture the events during the attempt to start the service, then analyse the file. This may be too much for you for many reasons, but it is the only way I can think of.

    The other possibility is to enable OpsMgr tracing, create an ETL file and analyse it.

    Please write back if you are still interested in troubleshooting this further (if the Visual Studio Tools fix from the linked thread does not help), and I will send you a short guide on how to enable the logging and collect the results, plus some more info on creating a ProcMon log.

    Regards,


    Stoyan

    Wednesday, February 15, 2017 1:38 PM
  • I'm still interested in getting this agent working again.

    I've searched for these Visual Studio tools and only found 'visual studio 2010 office'. Is that the right package? I installed it anyway and restarted the server, without success; it's still not working.

    I also started agent tracing with VER, started the service, stopped tracing and ran FormatTracing. It only creates 5 empty log files.

    I also ran ProcMon with a filter on healthservice.exe. I'm not experienced in interpreting these logs; I see a lot of NAME NOT FOUND results on some DLLs. Is there a way I can attach the export?

    Then I retried installing the agent from the console. The log file from the gateway now shows some interesting output; maybe it is useful for you?

    I removed all lines that don't seem interesting:

    03:50:14 : IsCurrentMOMServerInstalled : MOM Setup key could not be queried.
    03:50:15 : OmAgentFunctions::IsOmAgentPresent : OmAgent found.
    03:50:15 : OmAgentFunctions::IsHsInstalled : HealthService installed.
    03:50:15 : OmAgentFunctions::InstallOmAgent : HealthService found, checking if it is used by other components.
    03:50:15 : OmAgentFunctions::IsOmAgentPresent : OmAgent found.
    03:50:15 : OmAgentFunctions::IsServerMgPresent : RegOpenKeyExW() failed. HResult: 80070002
    03:50:15 : OmAgentFunctions::IsMomServerPresent : IsProductInstalled() failed. HResult: 80070490
    03:50:15 : OmAgentFunctions::IsOmServerPresent : IsProductInstalled() failed. HResult: 80070647
    03:50:15 : OmAgentFunctions::IsOmGatewayPresent : IsProductInstalled() failed. HResult: 80070103
    03:50:15 : OmAgentFunctions::IsSceServerPresent : IsProductInstalled() failed. HResult: 80070647
    03:50:15 : OmAgentFunctions::IsSmServerPresent : IsProductInstalled() failed. HResult: 80070647
    03:50:15 : OmAgentFunctions::IsServer : Server not found.
    03:50:15 : OmAgentFunctions::IsAgentOnly : Only agent is installed.
    03:50:15 : OmAgentFunctions::InstallOmAgent : HealthService not used by other components.
    03:50:15 : OmAgentFunctions::InstallOmAgent : Agent is not current or this is not a repair operation, checking if agent can be upgraded.
    03:50:15 : OmAgentFunctions::IsUpgradeSupported : Existing version:
    03:50:15 : OmAgentFunctions::IsUpgradeSupported : 8.0.10918.0
    03:50:15 : OmAgentFunctions::IsUpgradeSupported : New version:
    03:50:15 : OmAgentFunctions::IsUpgradeSupported : 7.2.11719.0
    03:50:15 : OmAgentFunctions::IsUpgradeSupported : Upgrade not supported.
    03:50:15 : OmAgentFunctions::InstallOmAgent : Agent cannot be upgraded, only adding MG.
    03:50:15 : OmAgentFunctions::GetAgentManagementApi : Failed to retrieve Management Service Configuration. So, assuming agent management library is not installed HResult: 800700c1
    03:50:15 : OmAgentFunctions::IsAgentManagementApiAvailable : Failed to retrieve Management Service Configuration. Assuming Agent Management API is not available. HResult: 800700c1
    03:50:15 : OmAgentFunctions::InstallOmAgent : API for adding MG not available.

    If you want me to rerun other types of tracing then let me know.

    Gr,
    Ronald

     


    Ronald

    Wednesday, February 15, 2017 2:58 PM
  • Hi Ronald,

    I think that they installed this:

    Visual Studio 2010 Tools for Office Runtime

    Regarding the tracing: did you run StopTracing first? Here is the exact order:

    - Run "StopTracing.cmd" (stops the current tracing)
    - Afterwards run "StartTracing.cmd VER" (VER is VERBOSE and is case sensitive)
    - Wait for the issue to reproduce.
    - Run "StopTracing.cmd" (stops the trace)
    - Run "FormatTracing.cmd" (converts the ETL traces into a readable format)
    - Collect the logs from  %windir%\logs\OpsMgrTrace\.log
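The order of the steps above can be scripted so that nothing gets skipped. This is only a sketch: `tools_dir` (the folder that contains the tracing scripts) and the `reproduce` callback are assumptions for illustration, and the script names are taken from the list above.

```python
import subprocess
from pathlib import Path

# Order taken from the steps above; the issue is reproduced after the
# second command, while the verbose trace is active.
STEPS = [
    ("StopTracing.cmd", []),
    ("StartTracing.cmd", ["VER"]),  # VER = verbose, case sensitive
    ("StopTracing.cmd", []),
    ("FormatTracing.cmd", []),
]


def build_commands(tools_dir):
    """Return the command line for each tracing step, in order."""
    tools = Path(tools_dir)
    return [["cmd", "/c", str(tools / script), *args] for script, args in STEPS]


def run_tracing(tools_dir, reproduce):
    """Run the steps, calling `reproduce()` once verbose tracing has started."""
    for index, command in enumerate(build_commands(tools_dir)):
        subprocess.run(command, check=True)
        if index == 1:
            reproduce()
```

Here `reproduce` could be any function that triggers the problem, for example one that tries to start the HealthService and waits for the failure.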

    Try it once again please.

    Regarding ProcMon: filtering on HealthService.exe is a good start. Among the NAME NOT FOUND records there will also be registry records. Can you group them a bit and check whether those keys really exist? I am asking because of the logs you posted, in particular this record:

    03:50:15 : OmAgentFunctions::IsServerMgPresent : RegOpenKeyExW() failed. HResult: 80070002

    The code means "The system cannot find the file specified", i.e. "not found". You can check the most common records from the log file against the registry of a working client to identify what exactly is missing.
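The other HResult values in this thread can be decoded the same way: for codes of the form 8007xxxx, the low 16 bits are a plain Win32 error code. Below is a minimal sketch, with `decode_hresult` as a hypothetical helper and a lookup table limited to codes that appear in the posted logs (names as in winerror.h).

```python
# A few Win32 error codes that appear in this thread (names from winerror.h).
WIN32_NAMES = {
    2: "ERROR_FILE_NOT_FOUND",
    21: "ERROR_NOT_READY",
    193: "ERROR_BAD_EXE_FORMAT",
    259: "ERROR_NO_MORE_ITEMS",
    1053: "ERROR_SERVICE_REQUEST_TIMEOUT",
    1168: "ERROR_NOT_FOUND",
    1605: "ERROR_UNKNOWN_PRODUCT",
    1607: "ERROR_UNKNOWN_COMPONENT",
}

FACILITY_WIN32 = 7


def decode_hresult(value):
    """Split an HRESULT into (facility, code, name-or-None).

    Accepts a hex string as printed in the logs ("80070002") or a signed
    decimal integer as shown by the installer (-2147023843).
    """
    if isinstance(value, str):
        value = int(value, 16)
    value &= 0xFFFFFFFF  # normalise negative decimals to unsigned 32-bit
    facility = (value >> 16) & 0x1FFF  # same mask as the HRESULT_FACILITY macro
    code = value & 0xFFFF
    name = WIN32_NAMES.get(code) if facility == FACILITY_WIN32 else None
    return facility, code, name
```

With this, `decode_hresult("800700c1")` gives ERROR_BAD_EXE_FORMAT ("%1 is not a valid Win32 application"), which typically points at a 32-bit/64-bit mismatch, and `decode_hresult(-2147023843)` gives ERROR_SERVICE_REQUEST_TIMEOUT, the service not responding to the start request in time.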

    Hope this makes some sense to you.

    Regards,


    Stoyan

    Thursday, February 16, 2017 9:14 AM
  • Hi,

    I tried the tracing again but it doesn't contain data. While tracing is running and I start the service, the ETL files stay at 0 bytes. On the stop command they all grow to 64 bytes, and after the format I only get 0 KB log files.

    I compared the missing registry key errors from ProcMon to a working agent but see no issues. I then ran ProcMon again, now filtered on msiexec, during an agent removal and also during a repair. I think the agent is not completely installed and therefore will not start the service.

    On the repair command I get a lot of errors with 'Buffer too small' and 'Buffer overflow'.

    I get these results on registry keys for perflib and also on queries of registry keys for services. Could this be worth focusing on?

    Gr,
    Ronald


    Ronald

    Thursday, February 16, 2017 10:24 AM
  • Hm,

    hard to tell, but most probably they are not that relevant.

    I just saw this:

    03:50:15 : OmAgentFunctions::IsUpgradeSupported : 8.0.10918.0
    03:50:15 : OmAgentFunctions::IsUpgradeSupported : New version:
    03:50:15 : OmAgentFunctions::IsUpgradeSupported : 7.2.11719.0
    03:50:15 : OmAgentFunctions::IsUpgradeSupported : Upgrade not supported.

    You seem to have SCOM 2016 on that system, and you are trying to install an older version? This could be one reason why, for example, the registry and libraries are not updated. Can you check this?
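The "Upgrade not supported" result is what a plain numeric comparison of the two dotted version strings would give. This is only a sketch of that comparison: the log does not show how IsUpgradeSupported actually decides, and `parse_version` and `is_upgrade` are hypothetical names.

```python
def parse_version(text):
    """Turn a dotted version string such as "8.0.10918.0" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_upgrade(existing, new):
    """Return True only when `new` is strictly higher than `existing`."""
    return parse_version(new) > parse_version(existing)


existing_version = "8.0.10918.0"  # "Existing version" from the log
new_version = "7.2.11719.0"       # "New version" from the log
upgrade_possible = is_upgrade(existing_version, new_version)
```

Comparing as tuples of integers rather than as strings matters: a string comparison would rank "10.0" below "9.0".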

    Regards,


    Stoyan

    Thursday, February 16, 2017 10:37 AM
    I noticed that too, but those lines are from the gateway server, so this is from my install attempt via the console; I cannot choose an older version there.

    I have now also checked that the gateway is the correct 2016 version, and it is. The AgentManagement folder has the current agent. I only don't know how this works with the UR1 update for the agent, as I don't see that update in that folder. But the difference between versions 7.2 and 8 doesn't look like the difference between non-UR and UR1, I guess.

    Gr,
    Ronald


    Ronald

    Thursday, February 16, 2017 10:52 AM
  • Hi Ronald

    Thanks a bunch for your guide! It worked wonders for me and saved me quite some time.

    Best regards

    Simon

    Wednesday, August 19, 2020 4:13 PM