Sudden strange errors on Skype for business frontend servers, "Invalid incoming HTTPS certificate" "A server did not respond to HTTP request"

  • Question


  • Hi,
    On Friday our Skype for Business deployment, which had been working fine for a few months, started acting up.
    People can sign in to the client, although some have problems signing in. User presence is also wrong: some users are shown as away for 70 days even though they were online 10 minutes ago.

    SCOM suddenly started reporting a few errors:
    ________________
    “[Skype] Invalid incoming HTTPS certificate.”
    Invalid incoming HTTPS certificate.
    Subject Name: SFB-FEPOOL.domain.com Issuer: GlobalSign Domain Validation CA - SHA256 - G2
    Cause: This can happen if the HTTPS certificate has expired, or is untrusted. The certificate serial number is attached for reference.
    Resolution:
    Please check the remote server and ensure that the certificate is valid. Also ensure that the full certificate chain of the Issuer is present in the local machine.
    Please see the 'Product Knowledge' and the 'Alert Context' tab on Alert Properties view for more information.
    _________________
    [Skype] A server did not respond to HTTP request
    A server did not respond to HTTP request
    Server FE-SERVER.domain.com did not respond to HTTP request PublishCertRequest targeted at https://FE-SERVER.domain.com:444/LiveServer/UserPinService.
    Cause: Server might be down or the network path between servers might not be properly configured.
    Resolution: Please ensure that the server can be connected on the target port using telnet and then re-try.
    Please see the 'Product Knowledge' and the 'Alert Context' tab on Alert Properties view for more information.
    ________________
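
    The telnet test suggested in the alert's resolution can also be scripted. The following is a minimal sketch, assuming Python is available on a machine that can reach the Front End servers; the function name `can_connect` is made up for illustration. It only confirms that a TCP connection to the port can be opened. It does not validate the certificate or the MTLS handshake.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

    For example, `can_connect("FE-SERVER.domain.com", 444)` would test the port named in the alert, using the placeholder host name from the post.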

    SfB is running on Windows Server 2012 R2 servers:
    2 SQL Servers (mirrored)
    1 ME Server
    2 Persistent Chat
    3 Frontends
    2 Edge
    Skype for Business Server 2015, Administrative Tools        6.0.9319.0
    Skype for Business Server 2015, Application Host            6.0.9319.0
    Skype for Business Server 2015, Audio Test Service          6.0.9319.0
    Skype for Business Server 2015, Call Park Service           6.0.9319.0
    Skype for Business Server 2015, Conferencing Announcemen... 6.0.9319.0
    Skype for Business Server 2015, Conferencing Attendant      6.0.9319.235
    Skype for Business Server 2015, Conferencing Server         6.0.9319.102
    Skype for Business Server 2015, Core Components             6.0.9319.235
    Skype for Business Server 2015, Core Management Server      6.0.9319.0
    Skype for Business Server 2015, Front End Server            6.0.9319.235
    Skype for Business Server 2015, Performance Counter Cumu... 6.0.9319.235
    Skype for Business Server 2015, Reach Fonts                 6.0.9319.0
    Skype for Business Server 2015, Response Group Service      6.0.9319.235
    Skype for Business Server 2015, Skype for Business Web A... 6.0.9319.102
    Skype for Business Server 2015, Web Components Server       6.0.9319.235
    Skype for Business Server 2015, Web Conferencing Server     6.0.9319.0
    Skype for Business Server 2015, XMPP Translating Gateway    6.0.9319.0

    I also see these events in the event log on all FE servers:

    ID: 30988
    Sending HTTP request failed. Server functionality will be affected if messages are failing consistently.
    Sending the message to https://FE-SERVER.domain.com:444/liveserver/asmcu/ failed. IP Address is xxx.xxx.xxx.xxx. Error code is 0x2EFE. Content-Type is application/cccp+xml. Http Error Code is 0x0.
    Cause: Network connectivity issues or an incorrectly configured certificate on the destination server. Check the eventlog description for more information.
    Resolution: Check the destination server to see that it is listening on the same URI and it has certificate configured for MTLS. Other reasons might be network connectivity issues between the two servers.

    ID: 32284
    An HTTP message sent to another front end has failed.
    The request failed with the error: 12030.
    Cause: Possible network configuration issue
    Resolution: If a front end in this pool is in maintenance mode, then this alert may be ignored. Otherwise check that the network is properly configured between front ends in the pool. Also check whether any patches have recently been installed.

    ID: 47067
    A server did not respond to HTTP request
    Server FE-SERVER.domain.com did not respond to HTTP request GetPublishedCertRequest targeted at https://FE-SERVER.domain.com:444/LiveServer/UserPinService.
    Cause: Server might be down or the network path between servers might not be properly configured.
    Resolution: Please ensure that the server can be connected on the target port using telnet and then re-try.

    ID: 32042
    Invalid incoming HTTPS certificate.
    Subject Name: SFB-FEPOOL.domain.com Issuer: GlobalSign Domain Validation CA - SHA256 - G2
    Cause: This can happen if the HTTPS certificate has expired, or is untrusted. The certificate serial number is attached for reference.
    Resolution: Please check the remote server and ensure that the certificate is valid. Also ensure that the full certificate chain of the Issuer is present in the local machine.

    ID: 61029
    The process RtcHost(8180) received an invalid client certificate.
    Certificate SN: CN=SFB-FEPOOL.domain.com, OU=Domain Control Validated
    Certificate issuer name: CN=GlobalSign Domain Validation CA - SHA256 - G2, O=GlobalSign nv-sa, C=BE.
    Certificate error: 2148204809.

    ID: 47068
    GetAndPublish web service failed. This may impact user sign-in.
    Certificate could not be published in the database associated with User Services Cluster [SFB-FEPOOL.domain.com]. Request Details - Entity: [name.name@domain.com], Device Id: [{C0FFD9FB-1053-5373-83D4-015F7933611B}], Authenticated User: [sip:name.name@domain.com].
    Additional Context: [Publish_Failure: OtherFailure;HttpNoResponse:"FE-SERVER.domain.com"]
    Cause: This could be due to network connectivity issues with the remote server, or because the database is down.
    Resolution: Please ensure that the user services is reachable and the database is up and running.
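
    The numeric codes in these events are easier to compare once they are written in the same base. The sketch below is not part of the original post. It shows that 0x2EFE (event 30988) and 12030 (event 32284) are the same value, and that the certificate error 2148204809 (event 61029) is 0x800B0109 in hex. The symbolic names are the ones the Windows SDK headers give for these values, on the assumption that the first is a WinHTTP error code and the second is an HRESULT.

```python
# Codes copied from the events above.
HTTP_ERROR_HEX = 0x2EFE       # event 30988
HTTP_ERROR_DEC = 12030        # event 32284
CERT_ERROR_DEC = 2148204809   # event 61029

# Symbolic names from the Windows SDK headers (winhttp.h, winerror.h).
# Assumption: the events report a WinHTTP error code and an HRESULT.
KNOWN_CODES = {
    12030: "ERROR_WINHTTP_CONNECTION_ERROR",
    0x800B0109: "CERT_E_UNTRUSTEDROOT",
}


def describe(code: int) -> str:
    """Format a code as decimal, hex, and symbolic name if known."""
    name = KNOWN_CODES.get(code, "unknown")
    return f"{code} = 0x{code:X} ({name})"


for code in (HTTP_ERROR_HEX, HTTP_ERROR_DEC, CERT_ERROR_DEC):
    print(describe(code))
```

    CERT_E_UNTRUSTEDROOT means the chain ended in a root certificate that the trust provider does not trust, which suggests looking at the certificate stores on the servers rather than at the expiry date of the pool certificate.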

    Note that no certificates have expired! When we check the certificates and chains, they look OK.
    So I am very confused by this problem. Everything also checks out on https://testconnectivity.microsoft.com/
    I have checked TechNet and Google, but most fixes are for Lync 2013 and older Windows versions.
    Does anyone have any ideas?

    Best Regards
    Robban

    Monday, July 11, 2016 8:50 AM

Answers

  • Hi Robban, 

    Thanks for the detailed information about the topology. I just wanted to confirm a few other things as well.

    What is the CMS replication status? Do you see any other errors related to the Lync share? When were the servers last rebooted? Have you scheduled a reboot recently and then monitored the situation?


    Linus || Please mark posts as answers/helpful if it answers your question.

    Monday, July 11, 2016 4:40 PM
  • Deleted
    Monday, July 11, 2016 5:29 PM

All replies

  • Hi! Thanks, guys!

    I took both your ideas and worked on this last night...
    I patched all systems to the June 2016 level and rebooted them, including all Exchange servers.
    CMS replication looked good before and after. The log is a lot quieter now, but there were still errors.

    So I tried https://support.microsoft.com/en-us/kb/2795828 again (I had already done it on Saturday), even though my services were running and the KB is about a service that cannot start. I found a self-signed root CA certificate, on one server, back in the wrong folder.
    I fixed it again and everything looked good, then it came back in that folder by itself. I removed it and checked GPO, and everything looked good again. But then it came back again?! I removed it once more, and since then it has not come back. I'm not sure why, but I hope it stays away now!
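
    The check behind this kind of cleanup can be sketched in a few lines. The post does not say which store "the wrong folder" was, so the sketch assumes two cases: a non-self-signed certificate in the Trusted Root store (store name "Root"), and a self-signed certificate in the Intermediate store (store name "CA"). The `CertRecord` type, the `misplaced` function, and the sample certificate names are hypothetical, and comparing subject to issuer is only a heuristic for "self-signed", not a signature check.

```python
from typing import NamedTuple


class CertRecord(NamedTuple):
    store: str    # "Root" = Trusted Root CAs, "CA" = Intermediate CAs
    subject: str
    issuer: str


def misplaced(certs: list[CertRecord]) -> list[CertRecord]:
    """Return certificates whose store does not match their self-signed status."""
    flagged = []
    for cert in certs:
        self_signed = cert.subject == cert.issuer
        if cert.store == "Root" and not self_signed:
            flagged.append(cert)  # issued by another CA, yet in the root store
        elif cert.store == "CA" and self_signed:
            flagged.append(cert)  # self-signed, yet in the intermediate store
    return flagged


# Hypothetical sample data; these names are invented for illustration.
sample = [
    CertRecord("Root", "CN=Example Root CA", "CN=Example Root CA"),
    CertRecord("CA", "CN=Example Root CA", "CN=Example Root CA"),
    CertRecord("CA", "CN=Example Issuing CA", "CN=Example Root CA"),
]

for cert in misplaced(sample):
    print(f"{cert.store}: {cert.subject}")
```

    With real data, the records would come from an export of the local machine certificate stores on each Front End server.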

    If this self-signed root was the cause, the error logs sent me in the wrong direction, because they pointed to the GlobalSign certificate:
    ID: 61029
    The process RtcHost(8180) received an invalid client certificate.
    Certificate SN: CN=SFB-FEPOOL.domain.com, OU=Domain Control Validated
    Certificate issuer name: CN=GlobalSign Domain Validation CA - SHA256 - G2, O=GlobalSign nv-sa, C=BE.
    Certificate error: 2148204809.

    Thanks so far! I'll let you know if anything changes!

    Best regards
    Robert

    Tuesday, July 12, 2016 8:33 AM
  • Good to hear, Robban.


    Linus || Please mark posts as answers/helpful if it answers your question.

    Tuesday, July 12, 2016 9:59 AM