Problem with Federation with Office 365 Skype for Business clients

  • Question

  • Hi,

    we had successfully set up a Skype for Business server and Edge infrastructure, and it worked fine until about three weeks ago, when federation with Office 365 stopped working. From the on-premises side we can see the presence of the Office 365 Skype user and can send messages, which the Office 365 user receives. Federation with other on-premises Skype for Business users works fine.

    But the other way round it does not work: Office 365 users cannot send us messages, nor can they see our presence.

    On the on-premises side I could not find any errors, but the Office 365 Skype for Business client shows the following messages:

    ms-diagnostics: 1047;reason="Failed to complete TLS negotiation with a federated peer server";ip-address="213.95.138.74";peer-type="FederatedPartner";winsock-code="10054";winsock-info="The peer forced closure of the connection";source="sipfed1E.online.lync.com";source-server="AM41E00EDG05.infra.lync.com"

    ms-diagnostics: 1010;reason="Certificate trust with another server could not be established";ErrorType="The peer did not respond to TLS or MTLS negotiation in a timely manner";PeerServer="sip.hvs-consulting.de";source="sipfed1E.online.lync.com";source-server="AM41E00EDG05.infra.lync.com"
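    For readers decoding headers like the two above, here is a minimal Python sketch that splits an ms-diagnostics value into its numeric code and its key="value" fields. The function name `parse_ms_diagnostics` is hypothetical, and the format is assumed to be exactly what the two samples show. Winsock code 10054 is WSAECONNRESET, i.e. the remote side reset the TCP connection.

```python
import re

def parse_ms_diagnostics(header: str) -> dict:
    """Split an ms-diagnostics header into its code and key="value" fields."""
    # Drop the optional "ms-diagnostics:" prefix.
    if header.lower().startswith("ms-diagnostics:"):
        header = header.split(":", 1)[1]
    code, _, rest = header.strip().partition(";")
    fields = {"code": int(code)}
    # Every remaining field looks like key="value"; values may contain spaces.
    for key, value in re.findall(r'([\w-]+)="([^"]*)"', rest):
        fields[key] = value
    return fields

sample = (
    'ms-diagnostics: 1047;reason="Failed to complete TLS negotiation with a federated peer server";'
    'ip-address="213.95.138.74";peer-type="FederatedPartner";winsock-code="10054";'
    'winsock-info="The peer forced closure of the connection";'
    'source="sipfed1E.online.lync.com";source-server="AM41E00EDG05.infra.lync.com"'
)
info = parse_ms_diagnostics(sample)
print(info["code"], info["winsock-code"], info["ip-address"])
```

    Here `ip-address` is the address the Office 365 edge was talking to when the connection was reset, which helps confirm which public IP (and therefore which device in front of the Edge) to look at.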

    Office 365 complains that it cannot establish certificate trust. We use Comodo certificates, which are installed correctly, and the chain can be resolved (Qualys SSL test). The certificates are still valid.

    Did Microsoft change anything regarding certificates for Office 365 federation? As I said, it worked before.

    Any idea how to find out which certificate is the problem?
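    One way to see which certificate a listener actually presents is to connect from outside and read it back. The following is a minimal Python sketch using only the standard `ssl` module; `fetch_cert` and `summarize_cert` are hypothetical helper names. It assumes the listener completes a handshake with a client that has no certificate of its own. The federation port 5061 may demand a client certificate (MTLS), so the sketch defaults to port 443 on the assumption that the Access Edge presents the same external certificate there.

```python
import socket
import ssl

def summarize_cert(cert: dict, now: float) -> dict:
    """Reduce the dict returned by SSLSocket.getpeercert() to a few key fields."""
    # subject/issuer are tuples of RDNs, each RDN a tuple of (name, value) pairs.
    subject = dict(item for rdn in cert.get("subject", ()) for item in rdn)
    issuer = dict(item for rdn in cert.get("issuer", ()) for item in rdn)
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    return {
        "subject_cn": subject.get("commonName"),
        "issuer_cn": issuer.get("commonName"),
        "san": [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"],
        "days_left": int((expires - now) // 86400),
    }

def fetch_cert(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Complete a validated TLS handshake and return the server certificate."""
    ctx = ssl.create_default_context()  # validates the chain and the host name
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()
```

    Calling `summarize_cert(fetch_cert("sip.hvs-consulting.de"), time.time())` (with `import time`) from a machine outside the firewall shows the subject, issuer, SAN entries and remaining validity as an external peer sees them.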

    Thank you

    Mark


    Regards Mark


    • Edited by Wompi Tuesday, August 21, 2018 1:34 PM
    Tuesday, August 21, 2018 1:33 PM

Answers

  • Hi,

    OMG, we finally fixed it. It did indeed have something to do with our firewall, in this case our F5.

    It seems that the SSL persistence profile on the virtual server for our Edge (external interface, port 5061) caused the problem. When I disabled the profile, everything started working. The profile had been there since the installation of Skype for Business, but maybe F5 changed something in the latest update that broke federation.

    Thanks again all for the help.


    Regards Mark

    • Marked as answer by Wompi Wednesday, September 5, 2018 11:30 AM
    Wednesday, September 5, 2018 11:30 AM

All replies

  • Can you check whether the Baltimore CyberTrust Root certificate is installed on your Edge server? If not, try installing it (make sure it is placed in the Trusted Root Certification Authorities store).

    Download link: https://dl.cacerts.digicert.com/BaltimoreCyberTrustRoot.crt

    Basically, this certificate is used by the Microsoft Office 365 Skype for Business Online servers, which establish the secure connection with your Edge server by performing the MTLS handshake with it.

    Best Regards, Neeranjan

    Tuesday, August 21, 2018 3:45 PM
  • Hi

    please check the Edge internal certificate, the Front End certificates and the OAuth certificate to confirm that none has expired.

    Also check whether the firewall received an automatic update recently; there is a similar case that was caused by a firewall update:

    https://social.technet.microsoft.com/Forums/en-US/4721c4bb-3085-4118-848b-56bf92b1a8ce/unable-to-send-ims-from-onpremises-lync-2013-to-office-365-skype-for-business?forum=lyncprofile


    Best Regards,
    Leon Lu


    Please remember to mark the replies as answers if they helped. If you have feedback for TechNet Subscriber Support, contact tnsf@microsoft.com.



    Wednesday, August 22, 2018 3:01 AM
  • Hi Neeranjan,

    thank you for your answer. We already had the root certificate in the Trusted Root store. I also installed it again, but this did not help.


    Regards Mark

    Wednesday, August 22, 2018 6:49 AM
  • Hi Leon,

    thank you. None of the certificates has expired; all are still valid. We will look into the firewall, but the strange thing is that federation with other on-premises partners works without problems. Only with Microsoft (Office 365 and Skype) does it not work. So it seems to me that the problem is more likely a certificate issue or a new requirement enforced on Microsoft's side, such as TLS 1.2.
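    The TLS 1.2 suspicion can be tested from outside with a client pinned to a single protocol version. This is a minimal Python sketch with the standard `ssl` module; `pinned_context` and `accepts_version` are hypothetical helper names. The function returns False for any failure (including certificate problems or a listener that insists on a client certificate), so a False result is only a hint, not proof that the version is refused.

```python
import socket
import ssl

def pinned_context(version: ssl.TLSVersion) -> ssl.SSLContext:
    """Build a validating client context that offers exactly one TLS version."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx

def accepts_version(host: str, port: int, version: ssl.TLSVersion,
                    timeout: float = 10.0) -> bool:
    """Return True if a handshake restricted to `version` completes."""
    ctx = pinned_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:  # ssl.SSLError and socket timeouts are OSError subclasses
        return False
```

    For example, `accepts_version("sip.hvs-consulting.de", 443, ssl.TLSVersion.TLSv1_2)` run from an external machine shows whether the Access Edge completes a TLS 1.2 handshake through the firewall.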


    Regards Mark

    Wednesday, August 22, 2018 8:32 AM
  • Please check the steps you followed when setting up federation; please refer to this link:

    https://docs.microsoft.com/en-us/skypeforbusiness/skype-for-business-hybrid-solutions/deploy-hybrid-connectivity/configure-federation-with-skype-for-business-online


    Best Regards,
    Leon Lu



    Tuesday, August 28, 2018 11:16 AM
  • Hi,

    thank you for your answer. I have redeployed the provider, but it is still not working.

    It turned out that federation works only one way not just with O365 but also with our federated partners.

    So outbound from our side everything works, but inbound, our partners and O365 users see no presence and cannot IM or call us.

    Also, the certificate error is still there: reason="Certificate trust with another server could not be established";ErrorType="The peer did not respond to TLS or MTLS negotiation in a timely manner"

    Is there any toolkit that checks whether federation is working, like Test-CsFederatedPartner or similar?

    From our side Test-CsFederatedPartner works fine, as does Skype, but I cannot test it from outside.

    Thank you


    Regards Mark

    Tuesday, August 28, 2018 3:15 PM
  • Hi Wompi,

    you do not have your SIP federation TLS DNS entry published.

    Please create an SRV record:

    _sipfederationtls._tcp.hvs-consulting.de

    pointing to sip.hvs-consulting.de

    Telnet to that host on port 5061 works.

    kind regards
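    The two checks mentioned above (the SRV record name and the telnet test on port 5061) can be sketched in Python. The standard library has no SRV resolver, so the sketch only builds the record name to query (for example with `nslookup -type=SRV <name>`) and performs the TCP reachability test; `federation_srv_name` and `tcp_port_open` are hypothetical helper names.

```python
import socket

def federation_srv_name(sip_domain: str) -> str:
    """Return the SRV record name federation partners look up for a SIP domain."""
    return f"_sipfederationtls._tcp.{sip_domain.strip('.').lower()}"

def tcp_port_open(host: str, port: int = 5061, timeout: float = 5.0) -> bool:
    """Equivalent of the telnet test: can a TCP connection be opened at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(federation_srv_name("hvs-consulting.de"))
```

    A successful `tcp_port_open("sip.hvs-consulting.de")` only proves that something accepts TCP connections on 5061; it says nothing about whether the TLS handshake behind it completes.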

    Hi,

    thank you I will have a look.


    Regards Mark

    Tuesday, August 28, 2018 3:16 PM
  • I checked the entry and it seems to be published and points to sip.hvs-consulting.de 5061

    I also checked 5061 with PortQry and it is LISTENING, although I cannot reach the port via PuTTY.


    Regards Mark

    Tuesday, August 28, 2018 3:24 PM
  • What about inbound SSL inspection rules on the firewall?

    Have there been any changes recently?

    Tuesday, August 28, 2018 3:24 PM
  • I already looked into this, but we have no firewall inspection rules. SSL is not intercepted; everything uses SSL pass-through and the ports are open. We also don't have a web application filter or anything similar. :(


    Regards Mark

    Tuesday, August 28, 2018 3:36 PM
  • I asked a colleague from another company to run a federation test from their side, and the following came up:

    Test-CsFederatedPartner : A 504 (Server time-out) response was received from the network and the operation failed. See the exception details for more information.


    Regards Mark

    Tuesday, August 28, 2018 4:29 PM
  • What is strange about your domain: I can't get nslookup to resolve your DNS entries, although MxToolbox can resolve them.

    Do you have DNSSEC mandatory? And can you lower the TTL to something like 2 or 4 hours?

    By the way, I get the same error:

    Diagnosis     : ErrorCode=1047,Source=sip-eu.domain.de,Reason=Failed to complete TLS negotiation with a federated peer server,winsock-code=10054,peer-type=FederatedPartner,ip-address=213.95.138.74,winsock-info=The peer forced closure of the connection
                    Microsoft.Rtc.Signaling.DiagnosticHeader

    Looking into Snooper I see the following:

    The connection was closed before TLS negotiation completed. Did the remote peer accept our certificate?

    We use a Let's Encrypt cert, and federation on this test domain works perfectly with others.

    Can your Edge retrieve CRLs / OCSP?
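    To check CRL/OCSP retrieval from the Edge server itself, the revocation URLs first have to be read out of the certificate in question. A minimal Python sketch, assuming the certificate dict comes from `ssl.SSLSocket.getpeercert()` after a validated handshake (which exposes the `crlDistributionPoints`, `OCSP` and `caIssuers` entries when the certificate carries them); `revocation_urls` and `can_download` are hypothetical helper names.

```python
import urllib.request

def revocation_urls(cert: dict) -> dict:
    """Collect the revocation-related URLs from a getpeercert() dictionary."""
    return {
        "crl": list(cert.get("crlDistributionPoints", ())),
        "ocsp": list(cert.get("OCSP", ())),
        "ca_issuers": list(cert.get("caIssuers", ())),
    }

def can_download(url: str, timeout: float = 10.0) -> bool:
    """Try a plain GET; meaningful for CRL and CA-issuer URLs, not for OCSP."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

    Running `can_download` for each `crl` entry on the Edge server shows whether the server can reach the distribution points through its external NIC and the firewall.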

    • Edited by zaikun Thursday, August 30, 2018 8:09 AM
    Wednesday, August 29, 2018 12:25 PM
  • Hi zaikun

    thank you very much for your help.

    I asked, and we do not have DNSSEC enabled. I will try to change the TTL on the records.

    However, I also noticed that nslookup behaves strangely on our records. I will look into this.

    I also opened your URL in Chrome on the server (as IE cannot access the SIP URLs at all) and it worked fine with a correct certificate. I also tried browsing the CRL distribution points and could download the CRL files.



    Regards Mark


    • Edited by Wompi Thursday, August 30, 2018 7:03 AM
    Thursday, August 30, 2018 7:03 AM
  • Hi Mark,

    OK, then we need to dive deeper.

    From a TCP dump I see that it's your WAN IP that sends the reset during the handshake.

    Firewall-wise, is everything fine? Are ports 5061 and 443 NATed to the Edge server? (Port 443 is correct of course, otherwise your website wouldn't be accessible :) )

    In Topology Builder, is the federation route set to the correct Edge pool / Edge server?

    regards

    This happens after the first cipher change: your server sends something that I guess my Edge can't decrypt or recognise, and after some time your server resets the connection.
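    The observation above (TCP connects, then the handshake is reset) can be reproduced from any external machine without a packet capture. Below is a minimal Python sketch that reports at which stage a connection fails; `classify_handshake_error` and `probe_handshake` are hypothetical helper names. Note the assumption that a generic client without its own certificate is used: against an MTLS listener a late failure is expected, so the interesting outcome is an early reset or timeout.

```python
import socket
import ssl

def classify_handshake_error(exc: BaseException) -> str:
    """Map an exception raised during the TLS handshake to a short label."""
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"          # peer went silent mid-handshake
    if isinstance(exc, ConnectionError):
        return "reset"            # RST or abort, cf. winsock code 10054
    if isinstance(exc, ssl.SSLCertVerificationError):
        return "cert-rejected"    # we did not trust the server certificate
    if isinstance(exc, ssl.SSLEOFError):
        return "eof"              # peer closed cleanly before finishing
    if isinstance(exc, ssl.SSLError):
        return "tls-error"        # alert or protocol-level failure
    return "network-error"

def probe_handshake(host: str, port: int = 5061, timeout: float = 10.0) -> str:
    """Return 'ok', 'connect-failed', or 'handshake-<label>'."""
    ctx = ssl.create_default_context()
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect-failed"
    with raw:
        try:
            with ctx.wrap_socket(raw, server_hostname=host):
                return "ok"
        except OSError as exc:  # ssl.SSLError is an OSError subclass
            return "handshake-" + classify_handshake_error(exc)
```

    A result of `handshake-reset` or `handshake-timeout` while plain TCP connects fine points at something on the path (load balancer, firewall) or on the Edge interfering with TLS rather than at DNS or routing.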


    • Edited by zaikun Thursday, August 30, 2018 7:49 AM
    Thursday, August 30, 2018 7:13 AM
  • Hi Zaikun

    thank you very much for the support.

    The firewall routing and ports seem to be OK.

    We have a standard Skype setup, so one Front End and one Edge.

    In Topology Builder the federation route was not set. I set it, but unfortunately it did not change anything :(

    The route from Edge to Front End is set.

    One thing I came across is that Message Analyzer shows a handshake failure:

    This could explain the sudden reset of the connection, but I don't really know why the payload is incomplete.

    Regards Mark


    • Edited by Wompi Thursday, August 30, 2018 10:11 AM
    Thursday, August 30, 2018 10:10 AM
  • Hi Mark,

    What about external client login? Does this work? Here you use "almost" the same connection: the Access Edge component.

    So can you log in via Skype from an external machine?

    regards

    Thursday, August 30, 2018 11:05 AM
  • Hi Zaikun,

    yes, external client login works without problems, as we always log in via the internet / external IP.

    Our network team will look into the handshake issue. Maybe they will find something.


    Regards Mark


    • Edited by Wompi Thursday, August 30, 2018 11:14 AM
    Thursday, August 30, 2018 11:14 AM
  • OK, if you want we can verify that the routes on the Edge server are defined correctly, etc.

    By the way, do you have a single IP on the Edge for all services, or three IP addresses on the external NIC as per best practice?

    Does DNS resolution work on the Edge server?

    Does the internal NIC have DNS configured but no gateway, and the external NIC a gateway but no DNS?

    • Edited by zaikun Thursday, August 30, 2018 11:24 AM
    Thursday, August 30, 2018 11:23 AM
  • Hello Wompi,

    You might want to check the following; a missing intermediate certificate is a common cause of such an issue.

    1. Use this tool on the Edge server to check that it lists all the certificates (with the option to show the root chain):

    https://www.myskypelab.com/2016/12/skype4b-lync-certificate-checker-tool.html

    2. Verify that the certificate chain is available in the respective store.

    3. Validate the store by checking that the server does not have

    a. duplicate certificates, b. certificates mapped to the wrong store, c. expired certificates

    https://gallery.technet.microsoft.com/LyncSkype4B-Certificate-81944851

    Note: These are recommendations; you perform these steps at your own risk.

    Thursday, August 30, 2018 4:40 PM
  • Hi,

    we have three IPs on the external NIC.

    The internal NIC is in the internal network; the external NIC is in the DMZ.

    DNS resolution works fine. The internal NIC has internal DNS configured and no gateway.

    The external NIC has the external gateway defined.

    Basically the routing seems to work fine (e.g. with the external clients). Federation used to work (until one month ago), and we did not change anything in the topology.


    Regards Mark

    Monday, September 3, 2018 6:39 AM
  • Hi,

    I checked the points and everything was correct, but we found out that our official um certificate has two different paths for validating the chain: one path with only 3 certificates (cert, intermediate, root) and one with 4 (cert, 2x intermediate, root). This can be seen with Qualys, and I guess it is not normal.

    We are in contact with Comodo and are awaiting their response.

    Otherwise we have no more ideas about what the problem could be. :(


    Regards Mark

    Monday, September 3, 2018 6:45 AM
  • Hi Mark,

    could you take a TCP trace on the external NIC and verify that the certificates are correct and that there is no interception, etc.?

    To be on the safe side, could you try to NAT 443 and 5061 without any firewall in between, just to make sure that it's nothing network related?

    Is there any virus scanner on the Edge that you could disable temporarily?

    Also, could you verify the following:

    Open MMC on your Edge server, add the Certificates snap-in for a service account, and choose Skype for Business Access Edge.

    Navigate to the accepted certificates.

    Is the one from sipfed.online.lync.com in there, and is the chain complete? (Check by clicking on the cert and going to Certification Path.)

    If not, you have a "closed federation" and somebody removed the O365 domain from the "SIP federated domains".


    • Edited by zaikun Monday, September 3, 2018 11:38 AM
    Monday, September 3, 2018 9:08 AM
  • Hi Zaikun,

    yes, the capture is a good idea. I will see if I can get a capture from the internet side and also one from the Edge external interface so I can compare the packets. Unfortunately we cannot easily bypass the firewalls.

    I cannot find the certificate in the accepted certificates on the Edge server. But we have "open federation", so everyone should be accepted. We have also set up the hosting provider for O365 via the Skype PowerShell.

    We are still awaiting the answer from Comodo. At the moment my best guess is that something changed with the root certs or similar.

    I will keep you updated.

    Thank you very much anyway.


    Regards Mark

    Monday, September 3, 2018 12:35 PM
  • Hi,

    changing the certificate did not change anything. Still the same problem, so it is not a certificate issue.

    I will try to get the captures and compare them to see if something is screwed up.


    Regards Mark

    Wednesday, September 5, 2018 6:29 AM
  • Hi,

    OMG, we finally fixed it. It did indeed have something to do with our firewall, in this case our F5.

    It seems that the SSL persistence profile on the virtual server for our Edge (external interface, port 5061) caused the problem. When I disabled the profile, everything started working. The profile had been there since the installation of Skype for Business, but maybe F5 changed something in the latest update that broke federation.

    Thanks again all for the help.


    Regards Mark

    • Marked as answer by Wompi Wednesday, September 5, 2018 11:30 AM
    Wednesday, September 5, 2018 11:30 AM
  • it's always the firewall :D

    good to hear that you fixed it.

    Friday, September 7, 2018 7:33 AM
  • Error: ms-diagnostics: 1047;reason="Failed to complete TLS negotiation with a federated peer server"

    -----------

    Glad you got this fixed for your environment!

    This issue can send you down a deep rabbit hole. The following reference pointed me in the right direction: https://blog.valeconsulting.co.uk/2016/03/21/skype-for-business-the-effects-of-packet-inspection/

    First, go to this site from your Edge server to see whether TLS passes. This will tell you a lot.

    https://www.ssllabs.com/ssltest/viewMyClient.html

    Next, check whether your router/firewall is performing deep packet inspection or has SIP ALG enabled. If it is on, turn it OFF. It breaks the TLS protocol, leading to this error on Edge servers and external federation.

    https://www.nextiva.com/blog/disable-sip-alg.html

    If your router/firewall is not listed, you can get the steps from the manufacturer of your model.

    Thanks!


    • Edited by JerseyCatt Wednesday, September 11, 2019 4:33 PM
    Wednesday, September 11, 2019 4:32 PM