Edge Server sends packets for federation partners back internally to its internal interface

  • Question

  • We have a very strange issue that has us completely beaten. We have a consolidated SfB topology with a standard Front End server (with the Mediation role) and an Edge server. The Edge server is working fine for remote users, but federation is only working in one direction. Internal users cannot get presence from or initiate chats with external parties, but it works the other way around. We get the following messages when running the federation test and inspecting the logs.

    This appears to be a very common issue, and it is usually caused by the firewall in front of the Edge server's external interface not permitting outbound TCP 443 or 5061. We have established that this is not the case here, as we can telnet on both ports from the Edge server to both sipfed.online.lync.com and another federated partner's Edge server.
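The telnet test described above can also be scripted. The following is a minimal sketch, assuming Python is available on the Edge server; the function names are illustrative and not part of any Skype for Business tooling. It only confirms that a TCP connection can be opened, exactly like telnet, and says nothing about TLS or SIP.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host: str, ports=(443, 5061), timeout: float = 5.0) -> dict:
    """Check each port on host and return a {port: reachable} mapping."""
    return {port: can_connect(host, port, timeout) for port in ports}


# Example call (needs outbound network access, so it is left commented out):
# print(check_ports("sipfed.online.lync.com"))
```

A result of True for both ports only rules out a simple outbound block, which matches what the telnet test showed here.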

    We decided to inspect what is actually happening on the Edge server's external interface when it tries to test external federation. This threw up something surprising: the Edge server is sending the federation connection (SIP on port 5061) to its own internal interface. Below are the sniffed packet and, for comparison, a packet from an almost identical topology, testing federation back to this Edge server.

    Edge server with failing federation

    Edge server with working federation

    Can anybody throw any light onto this for us?

    Monday, October 17, 2016 8:49 AM

Answers

  • Hello Nippertje,

    When a user sends a request, the partner domain first has to be resolved. The request is handed from the external interface to the internal one and so on, after which the DNS resolution takes place. So you will usually see a SIP INFO packet going to the internal interface, and on to the Front End, carrying the DNS resolution information.

    We can collect SIP logs from the client, Front End and Edge, together with a Netmon trace on the Edge, all taken at the same time, to diagnose the issue. We can then check the call leg from the client, after which we should be able to see the exact exchange between the Edge and the Front End.

    • Marked as answer by Nippertje Monday, October 24, 2016 10:01 AM
    Tuesday, October 18, 2016 1:06 AM
  • Hi Nippertje,

    Welcome to our forum.

    To save cost, all internal servers, including the internal interface on the Edge server, use an internal CA, and the external interface uses a public CA. This is what Microsoft recommends, so the issue is not caused by the certificates.

    For this issue, we suggest the following troubleshooting steps:

    1. Make sure the Edge server is not joined to the domain.
    2. Check the federation configuration: https://technet.microsoft.com/en-us/library/dn705313.aspx
    3. Clear the DNS cache.

    If you have any questions or issues, please feel free to let me know and we will be pleased to help. If the above suggestions are helpful, please mark this as the answer so that others with a similar issue can find this thread quickly.

    Best regards,
    Jim Xu
    TechNet Community Support


    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    • Proposed as answer by jim-xu Monday, October 24, 2016 9:55 AM
    • Marked as answer by Nippertje Monday, October 24, 2016 10:01 AM
    Tuesday, October 18, 2016 6:19 AM

All replies

  • Hi Nippertje,

    Do you have persistent routes on the Edge server for your internal networks?

    You should have a single default gateway, specified on the external interface only, so that traffic to any destination other than your internal networks is routed via the external interface.

    There should be no default gateway on the internal interface; persistent routes should be in place to send internal traffic to the internal gateway.

    To add a persistent route, open a command prompt as administrator and add routes like the ones below. I would recommend adding only your internal subnets rather than whole private IP ranges; these are just examples.

    route -p add 10.0.0.0 mask 255.0.0.0 <internal_NIC_IP>

    route -p add 172.16.0.0 mask 255.240.0.0 <internal_NIC_IP>

    route -p add 192.168.0.0 mask 255.255.0.0 <internal_NIC_IP>
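To see why narrow internal routes matter, here is a rough sketch of longest-prefix route selection. The routing table, the interface labels and the addresses are assumptions made up for illustration, and real Windows routing also takes route metrics into account, which this sketch ignores.

```python
import ipaddress

# Hypothetical routing table for a dual-homed Edge server.
# Each entry is (destination network, interface label).
ROUTES = [
    ("0.0.0.0/0", "external"),    # single default gateway on the external NIC
    ("10.0.0.0/24", "internal"),  # persistent route for one internal subnet
]


def pick_interface(destination: str, routes=ROUTES) -> str:
    """Return the interface of the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for net, iface in routes:
        network = ipaddress.ip_network(net)
        if dest in network:
            matches.append((network.prefixlen, iface))
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The longest prefix (most specific network) wins.
    return max(matches)[1]
```

With this table, `pick_interface("10.0.0.25")` selects the internal interface, while a public address such as `203.0.113.10` falls through to the default route on the external interface. Adding a broad route like `10.0.0.0/8` would send every 10.x.x.x destination to the internal side, which is the reason for recommending only the subnets that are actually in use.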

    Also, for internal name resolution you should use the hosts file; using an internal DNS server for this on the Edge server is not recommended. The Edge server's DNS settings should point to an external DNS server.

    Hope this helps, if you have this all in place already just let us know.

    Thanks,

    Martin


    Please remember, if you see a post that helped you please click "Vote As Helpful" and if it answered your question please click "Mark As Answer". Thank you. This forum post is based upon my personal experience and does not reflect the opinion or view of my employer.

    Monday, October 17, 2016 9:16 AM
  • Hi Martin,

    Thanks for your reply. The Edge server does not sit in a DMZ and is dual-homed. The internal interface already has a persistent route to the internal network (10.0.0.0/24), and the default gateway is on the external interface, which is assigned three public addresses. Here is the routing table for the Edge server:

    The DNS server is assigned on the internal interface and is the internal DC. I know that best practice is to use external DNS with static hosts entries for the internal hosts, but this configuration is working for us on other deployments, including the one I showed the successful federation test on, so I doubt this is the problem. Also, in the logs I can see the Edge server successfully resolve the Edge server of the federated domain in the test. See here:

    This all checks out ok, yes?

    Monday, October 17, 2016 9:28 AM
  • Hi Nippertje,

    So the Edge server is not in a DMZ but is dual-homed on the LAN?

    The Edge server should sit in a DMZ with an external-facing DMZ network and an internal-facing DMZ network. The two networks should be on different subnets and not directly routable to each other. The LAN should only communicate with the internal interface. Having the Edge server connected directly to the LAN is not recommended for security reasons.

    Could you give a little more detail on the topology you have deployed, or a diagram, so we can understand how it is deployed?

    DNS-wise it all looks good; I just wanted to add this information for you.

    thanks,

    Martin



    • Proposed as answer by jim-xu Monday, October 24, 2016 9:55 AM
    Monday, October 17, 2016 10:15 AM
  • Here it is. It is only a small deployment for a small company of fewer than 20 staff, so a DMZ is not viewed as essential. We also have similar small environments that work with this topology, which is all the more puzzling. The only significant difference with this deployment is that it (unfortunately) has a .local extension for the internal domain name. We have used an internal CA for the FE and Edge internal interface certificates and use a public certificate on the Edge external interface. Normally, we would use a public cert on all interfaces, but this was not possible. Nothing in our testing suggests a certificate issue though, so I strongly doubt this has any bearing on it.

    Does this information help?

    Monday, October 17, 2016 10:43 AM
  • Deleted
    Monday, October 17, 2016 2:03 PM
  • Thank you for your contribution. No, there is no NAT at all on the Edge external interface. The interface is numbered with the public addresses as shown.
    Monday, October 17, 2016 5:06 PM
  • Hello Nippertje,

    I don't know whether it is possible to test, but federation issues can often be resolved by troubleshooting external sign-in: have user A sign in from user B's domain and vice versa. Once that works, federation should start working.

    Other than that, you can also verify that the required ports and protocols are configured as documented:

    https://technet.microsoft.com/en-us/library/gg425891(v=ocs.14).aspx

    The error you are getting is most often caused by a blocked port, usually when we are unable to reach the partner Edge server on port 443. Federation always needs port 5061 (whether you want to configure it for the Access Edge is optional).

    Please troubleshoot federation with the external domain and with Skype for Business Online separately to make it easier.

    Monday, October 17, 2016 6:49 PM
  • Thanks. Yes, I know that it is usually a firewall issue, but given that we can telnet in and out on 443 and 5061, that is not the case here. Also, we have Wireshark traces showing packets on the external Edge interface being directed to the internal interface on the same server instead of outbound to the federated partner's Edge server. Why on earth would it do that?

    Monday, October 17, 2016 9:27 PM
  • OK, so the solution was DNS. We changed the Edge server's DNS to external and added static entries to the hosts file for the internal hosts, and this fixed it. We have a couple of small deployments where we use the internal DC for DNS on the Edge server and it works fine, but these are split-brain DNS, where the same domain name exists internally as well as externally.

    The one we had problems with was a .local domain.
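As an illustration of the fix, the static entries for the internal hosts could be generated like this. This is only a sketch: the host names and addresses below are hypothetical placeholders, not values from this deployment. On Windows the hosts file is C:\Windows\System32\drivers\etc\hosts.

```python
import ipaddress


def hosts_lines(entries):
    """Format (ip, fqdn) pairs as hosts-file lines, validating each address."""
    lines = []
    for ip, fqdn in entries:
        # Raises ValueError if the address is not a valid IPv4/IPv6 address.
        ipaddress.ip_address(ip)
        lines.append(f"{ip}\t{fqdn}")
    return lines


# Hypothetical internal hosts the Edge server must reach by name.
EXAMPLE_ENTRIES = [
    ("10.0.0.10", "sfbfe01.example.local"),
    ("10.0.0.5", "dc01.example.local"),
]

for line in hosts_lines(EXAMPLE_ENTRIES):
    print(line)
```

The printed lines would be appended to the hosts file on the Edge server, whose network adapter then only needs external DNS servers configured.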

    Thanks for all your help, guys.


    Monday, October 24, 2016 10:04 AM
  • Glad you found the issue, Nippertje, and thanks for letting us know :)


    Tuesday, October 25, 2016 8:25 AM
  • Thanks for the update
    Thursday, October 27, 2016 8:17 PM