SfB Calls Disconnecting After about 10 seconds

  • Question

  • Hi all,

    We have had a stable Skype for Business 2015 on-premises platform in place for a few years now, but have recently been experiencing a new issue that is starting to make us pull our hair out!

    To give a brief overview of the setup:
    3 Frontend Servers in a Pool
    2 Edge Servers in a Pool
    2 Med Servers in a Pool
    + SfB SQL Server

    All SfB servers sit on an internal DMZ VLAN, meaning they have full communication with each other internally.
    Edge1 and Edge2 have three external IP addresses assigned to each - for AV, Access and Conf.

    Med1 and Med2 servers have an external IP address each as well.

    The three FE servers sit behind a pair of ARR servers for load balancing and request routing.

    All servers are running Windows Server 2016 Standard, version 1607, build 14393.3630.
    The 'Get-CsServerPatchVersion' command shows that all our components are running version 6.0.9319.0 or above.

    This setup is fairly mature and, to my knowledge, standard practice. We have been using the system for three, maybe four, years now without any real problems.

    The new problem, which we see quite sporadically, is this: the caller rings the callee successfully, but when the callee picks up there are around 8 to 10 seconds of silence while the call media appears to be trying to establish a connection, and then the call simply drops. This happens on inbound and outbound calls, to and from internal and external SfB users, and also to entirely external non-SfB phone numbers.
    One interesting behaviour is that when making a call from an SfB softphone to an external UK number (mobile or landline, for example), there are times when all the caller can hear is the stock 'SfB' ringtone; it never transitions to being the classic 'ring ring', and these calls that show this behaviour are the ones that never establish the media traffic when the callee picks up.
    On the occasions where the callers can hear the SfB ringtone transition to that classic 'ring ring' tone, the callee is able to pick up and establish the media successfully and both parties can hear each other.

    We have performed SIP tracing and can see the apparently successful SIP INVITE, RINGING and 200 OK messages, followed by what appears to be a final series of ACKs, then around 10 seconds of no SIP messaging (at which point I assume the audio media itself is attempting to flow between caller and callee).
    Finally the call disconnects with a series of SIP BYE messages and the following error appears in the logs:


    Call failed to establish due to a media connectivity failure when one endpoint is internal and the other is remote";component="MediationServer";Exception="Proxy side ICE connectivity check failed."



    What is odd is that trying to replicate the error or behaviour consistently is very hit and miss. Sometimes, at certain times of the day, you may get a short run of 5 or 6 calls that fail to establish.
    We get variations of that error message; some say there was a failure when both endpoints were internal, or when both endpoints were remote.
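
    To see whether one variant of the error dominates at particular times, the diagnostic lines could be tallied by endpoint combination and reporting component. The following is only a minimal sketch: the sample line is the one quoted above, while the exact wording matched for the "both internal" and "both remote" variants is an assumption, as are the helper names parse_fields, classify and tally.

```python
import re
from collections import Counter

# Diagnostic line as quoted in the post (its beginning is truncated in the source).
SAMPLE = ('Call failed to establish due to a media connectivity failure when one '
          'endpoint is internal and the other is remote";component="MediationServer";'
          'Exception="Proxy side ICE connectivity check failed."')

# Matches key="value" pairs such as component="MediationServer".
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')

# Assumed phrasings for the three variants; adjust them to the real log text.
CATEGORY_PATTERNS = [
    ("internal-remote",
     re.compile(r"one endpoint is internal and the other is remote", re.I)),
    ("internal-internal",
     re.compile(r"both endpoints (?:are|were) internal", re.I)),
    ("remote-remote",
     re.compile(r"both endpoints (?:are|were) remote", re.I)),
]


def parse_fields(line):
    """Return the key="value" pairs found in one diagnostic line as a dict."""
    return dict(PAIR_RE.findall(line))


def classify(line):
    """Label a failure by the endpoint combination named in its reason text."""
    for label, pattern in CATEGORY_PATTERNS:
        if pattern.search(line):
            return label
    return "other"


def tally(lines):
    """Count failures per (endpoint combination, reporting component)."""
    return Counter(
        (classify(line), parse_fields(line).get("component", "unknown"))
        for line in lines
    )
```

    Feeding it the failure lines from a busy hour and a quiet hour would show whether the failures cluster on one component or one endpoint combination.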

    We have tried temporarily removing various combinations of the FE, MED and EDGE servers from the pools to see if any one particular server in the pools is causing the issue.
    We have checked the fabric state of the FE servers and all is happy.
    We have rebooted each server, gracefully, to see if that resolves the issue.
    We have checked all the configuration on our firewalls and all relevant ports appear to be open and configured correctly.
    We have checked the event logs on all SfB servers for any Skype for Business errors; there appear to be no errors related to the issue we are seeing, in fact, there are relatively few overall.

    We are starting to run out of ideas on this one and if anyone can offer some insights into what might be causing this, that'd be very helpful. Happy to provide more data and troubleshooting if required.

    Thanks!
    Rory.

    Wednesday, May 27, 2020 8:10 AM

All replies

  • Hi RoryJM!

    Thanks for the detailed description of this issue.

    Are your Edge servers deployed in the DMZ?

    From your description, is it correct that the issue occurs on calls between internal and external users?

    I found a case that seems related to this issue, for your reference. It recommends running a command like the following to set the Edge Server attribute on the Mediation Server:

    Set-CsMediationServer -Identity "MediationServer:standard.kressmark.com" -EdgeServer edge.kressmark.com

     

    For more details, you can refer to:

    https://social.technet.microsoft.com/Forums/lync/en-US/a54f2371-3a9e-4a2d-973d-6be13fd437fc/incoming-pstn-call-to-external-user-fails-to-connect?forum=ocsvoice

    Best Regards,
    Jimmy Yang

    Please remember to mark the replies as answers if they helped. If you have feedback for TechNet Subscriber Support, contact tnsf@microsoft.com.
    Thursday, May 28, 2020 5:39 AM
  • Hi,

    thanks for your reply and link.
    Yes, our edge servers are also deployed in the DMZ zone.
    And, yes, from everything I can see, the issue occurs when either one (or both) of the users in a call is external to our network.
    Internal to internal calls are working fine, from everything we can see.

    I have run the PS command in the link you sent (Get-CsService -MediationServer), and I can see that both our mediation servers have edge server FQDN attributes set correctly.

    Thanks for your help,

    Rory.

    Thursday, May 28, 2020 8:24 AM
  • Hi RoryJM!

    Please make sure the Edge AV address has the correct NAT address. Connections on 443/tcp and 3478/udp cannot be set up if the NAT address is configured incorrectly. The NAT must be one-to-one, for example:

    Internal <-> External
    10.0.0.1 <-> 89.89.89.2 (SIP)
    10.0.0.2 <-> 89.89.89.3 (Webconf)
    10.0.0.3 <-> 89.89.89.4 (AV)
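
    A quick way to confirm that the TCP side of such a mapping is reachable from outside is a plain connect test. This is a rough sketch only: the function names are hypothetical, and the addresses in the commented usage are the placeholder values from the table above, not real servers. It does not cover 3478/udp, which needs a protocol-aware probe because UDP has no connection handshake to observe.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_all(targets, timeout=3.0):
    """Map each (host, port) pair in targets to its TCP reachability."""
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host, port in targets
    }


# Example usage (placeholder external addresses from the table above):
# results = check_all([("89.89.89.2", 443), ("89.89.89.3", 443), ("89.89.89.4", 443)])
# for (host, port), ok in results.items():
#     print(host, port, "open" if ok else "unreachable")
```

    Running it from a machine outside the firewall during a run of failed calls, and again when calls succeed, would show whether TCP reachability changes between the two.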

    Best Regards,
    Jimmy Yang


    Tuesday, June 2, 2020 11:19 AM
  • Hi Jimmy,

    Thanks for your reply.
    I have inspected our Edge AV configuration, and Edge1 and Edge2's external Access, Conf and AV IP addresses are assigned directly to these servers' network adapters - there is no NAT occurring.

    These Edge servers also have a secondary network adapter with the internal IP address for Access, Conf and AV for internal client access.

    The internal and external IPs are assigned correctly to each of the Access, Conf and AV URLs.

    Thanks for your assistance,
    Rory.

    Thursday, June 4, 2020 2:52 PM
  • Hi RoryJM!

    Please check whether the required DNS records have been configured for your servers.

    (The reference screenshot of the DNS configuration is not included here.)

    Best Regards,
    Jimmy Yang


    Wednesday, June 10, 2020 7:56 AM