Lync fail-over routing using Audiocodes Mediant 1000 gateways

  • Question

  • Hi,

    I am configuring fail-over routing in Lync. I have two sites: Site1 has GW1 and GW2 deployed, and Site2 has a single gateway, GW3. Each site is responsible for its own users. All gateways are Audiocodes Mediant 1000 units running the very latest 6.20.49.XX firmware. Lync is running CU4 at Site2 (CU3 at Site1), as CU4 included some Mediation Server and fail-over routing improvements according to the Knowledge Base articles. I configured fail-over routing so that GW1 and GW2 are used for outbound calls from Site2 users only when GW3 at Site2 is unreachable, unavailable, or busy. In other words, the primary routes point to GW3 and the secondary routes point to GW1/GW2 (GW1 and GW2 can be load-balanced, as they are deployed at Site1 in the same datacenter).

    From time to time, however, users at Site2 complain that their calls went through GW1/GW2 even though GW3 was up and running at the time of the issue.

    In the Lync monitoring logs I see that GW3 has been marked by the Mediation component as "Error 503 unavailable", and that seems to be the primary reason why calls were routed to GW1/GW2 at Site1. Other than a firmware bug that incorrectly answers with SIP 503, I don't see what else could trigger this fail-over logic, since GW3 is reachable and did not seem to reach its trunk capacity even at peak hours.

    Tuesday, January 31, 2012 10:00 AM

All replies

  • Sample log (from Site1)

     

    10503; source="CYLIM1511.region1.domain.com"; reason="Gateway responded with 503 Service Unavailable"; component="MediationServer"; SipResponseCode="503"; SipResponseText="Service Unavailable"; sip-reason="Q.850 ; cause=27"; GatewayFqdn="10.180.51.35"

    Sequence #1 12006 CYNIC1511.region1.domain.com OutboundRouting 12006; reason="Trying next hop"; source="CYNIC1511.region1.domain.com"; PhoneUsage="CY-Lync-Site1_Unrestricted"; PhoneRoute="CY-Lync-Site1_Unrestricted"; Gateway="10.180.32.5"; appName="OutboundRouting"
    Sequence #2 12006 CYNIC1511.region1.domain.com OutboundRouting 12006; reason="Trying next hop"; source="CYNIC1511.region1.domain.com"; PhoneUsage="CY-Lync-Site1_Unrestricted"; PhoneRoute="CY-Lync-Site1_Unrestricted"; Gateway="10.180.32.4"; msgwResponseCode="503"; msgwsource="CYNIC1511.region1.domain.com"; msgwreason="Gateway responded with 503 Service Unavailable"; msgwcomponent="MediationServer"; msgwSipResponseCode="503"; msgwSipResponseText="Service Unavailable"; msgwsip-reason="Q.850 ; cause=27"; msgwGatewayFqdn="10.180.32.5"; appName="OutboundRouting"
    Sequence #3 12006 CYNIC1511.region1.domain.com OutboundRouting 12006; reason="Trying next hop"; source="CYNIC1511.region1.domain.com"; PhoneUsage="CY-Unrestricted_failover"; PhoneRoute="CY-Unrestricted_Failover"; Gateway="10.180.51.35"; msgwResponseCode="503"; msgwsource="CYNIC1511.region1.domain.com"; msgwreason="Gateway responded with 503 Service Unavailable"; msgwcomponent="MediationServer"; msgwSipResponseCode="503"; msgwSipResponseText="Service Unavailable"; msgwsip-reason="Q.850 ; cause=27"; msgwGatewayFqdn="10.180.32.4"; appName="OutboundRouting"
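
The diagnostic strings above are lists of key="value" pairs, so the gateway address, SIP response code, and Q.850 cause can be pulled out mechanically. The following is a minimal Python sketch, assuming the strings are exported as plain text in the format shown; the helper names `parse_diagnostic` and `q850_cause` are illustrative and not part of any Lync tooling.

```python
import re

# Matches key="value" pairs such as SipResponseCode="503"
# or sip-reason="Q.850 ; cause=27" (semicolons inside quotes are kept).
PAIR = re.compile(r'([\w-]+)="([^"]*)"')


def parse_diagnostic(line: str) -> dict:
    """Return the key="value" fields of one diagnostic string as a dict."""
    return dict(PAIR.findall(line))


def q850_cause(fields: dict, key: str = "sip-reason"):
    """Extract the numeric Q.850 cause from a field like 'Q.850 ; cause=27'.

    Returns None when the field is missing or carries no cause value.
    """
    match = re.search(r"cause=(\d+)", fields.get(key, ""))
    return int(match.group(1)) if match else None


# The first log line from this post, copied verbatim.
sample = ('10503; source="CYLIM1511.region1.domain.com"; '
          'reason="Gateway responded with 503 Service Unavailable"; '
          'component="MediationServer"; SipResponseCode="503"; '
          'SipResponseText="Service Unavailable"; '
          'sip-reason="Q.850 ; cause=27"; GatewayFqdn="10.180.51.35"')

fields = parse_diagnostic(sample)
print(fields["GatewayFqdn"], fields["SipResponseCode"], q850_cause(fields))
```

For the "Trying next hop" lines, the same helper works with `key="msgwsip-reason"`, since those entries prefix the Mediation Server fields with `msgw`.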



    Tuesday, January 31, 2012 10:23 AM
  • 10503; source="CYLIM1511.region1.domain.com"; reason="Gateway responded with 503 Service Unavailable"; component="MediationServer"; SipResponseCode="503"; SipResponseText="Service Unavailable"; sip-reason="Q.850 ; cause=27"; GatewayFqdn="10.180.51.35"

    You will have to enable syslog tracing on the Audiocodes gateway and set the debug level to 5 in order to see what is happening and why the gateway responds with a 503 message.

    But according to http://www.vegaassist.com/download/docs/IN_18-Q850_cleardown_cause_codes_10.pdf, cause 27 is "destination out of order". Could it be that your PRI is unavailable at times?
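
To make the cause value easier to read when scanning many log entries, a small lookup table can translate it to text. This is only a sketch: the table lists a handful of common Q.850 causes and is deliberately partial, and the names `Q850_CAUSES` and `describe_cause` are made up for this example.

```python
# A few common Q.850 cause values; this table is deliberately partial.
Q850_CAUSES = {
    1: "Unallocated (unassigned) number",
    16: "Normal call clearing",
    17: "User busy",
    27: "Destination out of order",
    34: "No circuit/channel available",
    38: "Network out of order",
    41: "Temporary failure",
    42: "Switching equipment congestion",
}


def describe_cause(cause: int) -> str:
    """Return a readable description for a Q.850 cause value."""
    return Q850_CAUSES.get(cause, f"Unknown or unlisted cause {cause}")


print(describe_cause(27))
```

Note that cause 27 is distinct from cause 34 (no circuit/channel available), which is the value one would more typically expect when all trunk channels are in use.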

     


    • Edited by Adminiuga Wednesday, February 1, 2012 4:58 AM
    Wednesday, February 1, 2012 4:57 AM
  • I see similar behavior at different customers as well, even in deployments where the PSTN gateway is connected to the PBX instead of going directly to a PSTN provider, so I doubt that the Layer 1/Layer 2 connectivity of the E1 trunk is at fault there.

    Unfortunately, the issue affects only approx. 1% of calls and is therefore very difficult to spot; gathering logs for a whole day would record 2,000-3,000 calls, 99% of them without problems, and finding the 1% is a tremendous task :(

    Wednesday, February 1, 2012 1:37 PM
  • Hi,

    The following post describes a similar issue; I hope it helps:

    http://social.technet.microsoft.com/Forums/pl-PL/ocsvoice/thread/e09dbe88-3477-4887-ac76-02e6b9c0dac5



    Thursday, February 2, 2012 7:14 AM
  • Unfortunately, that didn't seem to help.
    Saturday, March 3, 2012 12:27 PM
  • Have you installed the latest hotfixes for Lync?

    regards Holger Technical Specialist UC

    Saturday, March 3, 2012 5:58 PM
  • There was no fix listed in any of the updates for an issue similar to mine :(

    I will deploy CU5 sometime later, but I don't expect it to fix my problem in any way.

    Sunday, March 4, 2012 6:36 AM
  • Well, the 503 is coming from the gateway, and a Q.850 disconnect cause is supplied, so I don't think this is just a blip on the Lync side that a Lync update would fix. If you would like more insight, you have to run logging on the gateway side. It shouldn't be too problematic to locate the call in the trace, since the Lync monitoring log should give you the date and time of the failed calls.
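
Locating the failed calls in a day-long trace can be scripted by keeping only the gateway log entries that fall close in time to a failure reported by Lync monitoring. The sketch below assumes both sources have already been parsed into `datetime` values (the actual gateway syslog timestamp format is not shown in this thread, so parsing is left out), and that the clocks of both systems are reasonably in sync. The function name, the 10-second window, and the sample entries are all hypothetical.

```python
from datetime import datetime, timedelta


def entries_near_failures(failure_times, log_entries,
                          window=timedelta(seconds=10)):
    """Yield (failure_time, entry_time, text) for every log entry that lies
    within `window` of any failure time. Each log entry is reported once."""
    for entry_time, text in log_entries:
        for failed_at in failure_times:
            if abs(entry_time - failed_at) <= window:
                yield failed_at, entry_time, text
                break  # do not report the same entry for a second failure


# Hypothetical data for illustration only; not taken from a real trace.
failures = [datetime(2012, 1, 31, 9, 41, 12)]
entries = [
    (datetime(2012, 1, 31, 9, 30, 0), "INVITE (unrelated call)"),
    (datetime(2012, 1, 31, 9, 41, 13), "SIP/2.0 503 Service Unavailable"),
    (datetime(2012, 1, 31, 9, 55, 0), "INVITE (unrelated call)"),
]

hits = list(entries_near_failures(failures, entries))
for failed_at, entry_time, text in hits:
    print(failed_at, entry_time, text)
```

With a few thousand calls per day, the nested loop is fast enough; widening the window helps if the gateway and Lync clocks drift apart.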
    Monday, March 5, 2012 3:41 AM