SChannel - Help with Error # 20 (Event ID # 36888)

  • Question

  • Was hoping somebody could help me understand what's causing some SChannel error 20 events I'm seeing in system event logs.

    Running Server 2008 R2 as IIS web servers, have a commercial wildcard SSL certificate in use on multiple sites and we use IIS Crypto's "best practice" settings.

    Majority of our customers, monitoring apps and SSL labs report no issues with HTTPS, however we have one customer with a data-center hosted application which sometimes connects flawlessly, yet other times causes our server to generate fatal alert 20 and reset the connection before it even reaches IIS.

    Can't see any pattern to these issues and very little of the discussion online about error 20 seems to fit here as it mostly relates to invalid server certificates, low-level development with SSL or other "consistent HTTPS failure" scenarios while ours is more intermittent.

    Reading up on error 20 suggests it indicates a "bad record MAC". I'm reading the MAC to be a checksum of the SSL message, which suggests the message may be incomplete, altered or incorrectly signed, but not being an expert on either Schannel or crypto I could be misunderstanding what this means.
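To make the "checksum" idea concrete, here is a minimal Python sketch of a keyed MAC check. It is a simplification, not what Schannel actually does: the real TLS record MAC also covers the record type, protocol version and length, and the key comes from the handshake. The key, the payload and the function names are all made up for illustration.

```python
import hashlib
import hmac


def compute_mac(key: bytes, seq_num: int, record: bytes) -> bytes:
    """Keyed MAC over the record sequence number plus the payload (simplified)."""
    return hmac.new(key, seq_num.to_bytes(8, "big") + record, hashlib.sha1).digest()


def verify_record(key: bytes, seq_num: int, record: bytes, received_mac: bytes) -> bool:
    """Recompute the MAC on the receiving side and compare in constant time."""
    expected = compute_mac(key, seq_num, record)
    return hmac.compare_digest(expected, received_mac)


# Illustrative values only; a real key is derived during the TLS handshake.
key = b"illustrative-mac-key"
record = b"GET /service HTTP/1.1"
mac = compute_mac(key, 0, record)

intact_ok = verify_record(key, 0, record, mac)                   # untouched record
tampered_ok = verify_record(key, 0, record + b"x", mac)          # payload altered in transit
wrong_key_ok = verify_record(b"different-key", 0, record, mac)   # peers disagree on the key
```

The point of the sketch is that the check fails in the same way whether the bytes were changed on the wire or the two sides derived different keys, which is why the alert alone does not say which of the two happened.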

    Attempted to find more detail regarding the internal error state value, with very little luck.

    Tried enabling SChannel logging for errors and warnings (3), but that's not provided any more detail before or after this event.

    Right now I'm not entirely sure what's causing the problem which makes it even harder to look at solutions, so if you have any questions or need more detail let me know, will try and keep an eye on this for the next few days.

    - T

    Log Name: System
    Source: Schannel
    Date: [removed]
    Event ID: 36888
    Task Category: None
    Level: Error
    Keywords: 
    User: SYSTEM
    Computer: [removed]
    Description:
    The following fatal alert was generated: 20. The internal error state is 960.


    Tuesday, February 3, 2015 2:46 PM


  • Hi twrty,

    Alert 20 is “TLS1_ALERT_BAD_RECORD_MAC”. It is possibly caused by a bad SSL implementation, or the payload has been tampered with in transit, e.g. by an FTP firewall rule in front of an FTPS server. Please check the firewall settings.

    The related KB:

    Message Authentication Codes in Schannel

    https://msdn.microsoft.com/en-us/library/windows/desktop/aa378743(v=vs.85).aspx

    Schannel Error Codes for TLS and SSL Alerts

    https://msdn.microsoft.com/en-us/library/windows/desktop/dd721886(v=vs.85).aspx
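For quick triage of the alert number in event 36888, a small lookup can help. This is a sketch in Python covering only a handful of the alert codes defined in the TLS 1.2 specification (RFC 5246); the table is deliberately incomplete and the function name is made up.

```python
# A few TLS alert descriptions from RFC 5246, section 7.2 (not the full list).
TLS_ALERTS = {
    0: "close_notify",
    10: "unexpected_message",
    20: "bad_record_mac",
    22: "record_overflow",
    40: "handshake_failure",
    42: "bad_certificate",
    46: "certificate_unknown",
    48: "unknown_ca",
    70: "protocol_version",
    80: "internal_error",
}


def describe_alert(code: int) -> str:
    """Return the RFC name for an alert code, or a placeholder if not in the table."""
    return TLS_ALERTS.get(code, f"unknown alert ({code})")


alert_name = describe_alert(20)
```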

    I’m glad to be of help to you!


    Please remember to mark the replies as answers if they help and unmark them if they provide no help. If you have feedback for TechNet Support, contact tnmff@microsoft.com

    Friday, February 6, 2015 6:07 AM
    Moderator
  • Good to know we definitely have an issue with the MAC. I can look into whether anything between our firewall and servers would be tampering with encrypted data, but it seems unlikely, as that should make the issue more consistent.

    I'm guessing there's no way to break down why the MAC failed in more detail?

    Given I can only pull diagnostic details from our environment and not the client's, is there anything useful I could do to diagnose this further?

    Have firewall logs, but they can't really tell us much about tampering without a matching log from the client.

    Working on acquiring traceroutes/pingroutes but that needs to be done while it's failing which has been tricky as we haven't been notified of any large failure windows.

    If we could packet capture the failed messages at the firewall, would there be any way to replay or analyse them inside an app (e.g. Wireshark) and get a more detailed breakdown of what failed, given we have access to the certificate and private keys?

    Friday, February 6, 2015 9:38 AM
  • Hi twrty,

    This error can be caused by many things. Typical causes I have seen are: an incorrect certificate bound to HTTPS port 443; certificate authentication enabled but the wrong certificate used; a certificate on the TMG server that is revoked or no longer valid. SSL handshake failures between client and server can also cause these events. Please check all of the above conditions, then disable the firewall's port 443 related security features and monitor again.

    Similar threads:

    Certificate Services - can't connect using SSL

    https://social.technet.microsoft.com/forums/windowsserver/en-US/091a3222-641b-43a3-ae19-6cc238828950/certificate-services-cant-connect-using-ssl

    Error schannel

    https://social.technet.microsoft.com/Forums/windowsserver/en-US/dc661a87-d78a-4398-96d8-e3659d26f282/error-schannel


    Tuesday, February 10, 2015 9:07 AM
    Moderator
  • Hi Alex,

    Can you please tell me why I see this system entry, and what may be causing it?

    Please advise as soon as possible.

    Thanks!

    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Schannel" Guid="{1F678132-5938-4686-9FDC-C8FF68F15C85}" />
        <EventID>36888</EventID>
        <Version>0</Version>
        <Level>2</Level>
        <Task>0</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2016-07-05T07:40:04.867986800Z" />
        <EventRecordID>87794</EventRecordID>
        <Correlation />
        <Execution ProcessID="636" ThreadID="688" />
        <Channel>System</Channel>
        <Computer>KESEAMDB01.kes-gse.local</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="AlertDesc">20</Data>
        <Data Name="ErrorState">960</Data>
      </EventData>
    </Event>
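If many of these events need to be lined up against client-side failure times, the event XML above can be parsed with a short script. The following Python sketch assumes the XML has been exported as text; the embedded sample is a trimmed copy of the event above, and `parse_schannel_event` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML, taken from the event above.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the event shown above, for demonstration only.
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Schannel" />
    <EventID>36888</EventID>
    <TimeCreated SystemTime="2016-07-05T07:40:04.867986800Z" />
  </System>
  <EventData>
    <Data Name="AlertDesc">20</Data>
    <Data Name="ErrorState">960</Data>
  </EventData>
</Event>"""


def parse_schannel_event(xml_text: str) -> dict:
    """Extract the event ID, timestamp, alert code and error state from one event."""
    root = ET.fromstring(xml_text)
    # Collect the named <Data> elements into a simple name -> value mapping.
    fields = {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}
    return {
        "event_id": int(root.findtext("e:System/e:EventID", namespaces=NS)),
        "time": root.find("e:System/e:TimeCreated", NS).get("SystemTime"),
        "alert": int(fields["AlertDesc"]),
        "error_state": int(fields["ErrorState"]),
    }


event = parse_schannel_event(EVENT_XML)
```

The `time` value is in UTC, which matters when comparing against a client log kept in local time.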

    Regards,

    Pradeep

    Wednesday, August 3, 2016 6:36 AM
  • Can anyone kindly suggest a fix? This is affecting production users.

    Thanks in advance.

    Regards,

    Pradeep

    Thursday, August 4, 2016 6:25 AM
  • Hi everyone,

    we're experiencing the exact same issue that twrty has.

    We also have a Windows Server 2008 R2 Machine running IIS with a Website using SSL on it.

    We get no complaints from the customers browsing this site with their browsers.

    However, there is a SOAP Service - in the same IIS Web Site, using the same binding as the clients - that is being used by another server to communicate directly server-to-server.

    The other system sometimes (rarely, but often enough to cause problems) gets errors when trying to reach our SOAP Service. Their error reads:

    Error observed by underlying BIO: Connection reset by peer"#=12Detail: SSL_connect error in tcp_connect()

    At the exact same times, we always get the SChannel error 36888, alertNumber 20 (TLS1_ALERT_BAD_RECORD_MAC):

    The following fatal alert was generated: 20. The internal error state is 960.

    We are pretty sure that there is no general problem with the SSL setup on our IIS, since it's been running well for roughly 3 years, and we get no complaints from the customers using this site. Even the other server can reach our Web Service just fine most of the time; it fails only occasionally.

    We are just the programmers for the Software that includes the SOAP server and have been called to help. Unfortunately, both servers are hosted at another company, so we have to ask them about firewall settings and the possibility of something tampering with the packets.

    Does anybody have another idea? Is there maybe an IIS setting we could change to make the MAC check less stringent which might help in resolving the issue?

    Best,
    Sascha
    Monday, August 8, 2016 10:20 AM
  • I'm investigating an issue with my own client software, talking to a Jetty server that is reporting the same set of errors client-side, and again, intermittently.

    I've narrowed it down to SSL inspection occurring in the local firewall, client-side.

    I get it with the Windows client using either Kaspersky or BitDefender, and can stop the errors occurring if I disable inspection of SSL traffic.

    Hopefully you will find the same scenario, and can reproduce the 'fix'.

    If so, then, like me, you will of course not have found the ultimate fix, as I'm sure your customers won't want to hack their Internet Security software settings just for you.

    ...and you will be just as stumped as to why the failure occurs only intermittently, even with identical REST/SOAP requests (in my case HTTPS posts). Is the security suite actually blocking the transaction intermittently, or is it always allowing it, but the client API is sometimes simply unable to handle some timing issue in the SSL inspection?

    Do post back if this helps with the next step of your diagnosis, and if so, good luck finding the client-side workaround ...I'm at the point of giving up and simply re-attempting each HTTPS post a few times when I get such an error!
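The "re-attempt a few times" workaround could look roughly like the Python sketch below. It is only a sketch: `send` is a hypothetical callable that performs one HTTPS post and raises on failure, the attempt count and delay are arbitrary, and blindly repeating a post is only safe if the remote operation tolerates duplicates.

```python
import ssl
import time


def post_with_retry(send, attempts=3, delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds or the attempts are used up.

    Only TLS errors and connection resets are retried; anything else
    propagates immediately. The last error is re-raised if all attempts fail.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except (ssl.SSLError, ConnectionResetError) as exc:
            last_error = exc
            if attempt < attempts:
                # Back off a little longer after each failed attempt.
                sleep(delay * attempt)
    raise last_error
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use it can be left at its default.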


    Tuesday, August 9, 2016 6:53 PM
  • Hello,

    thank you for your suggestions!

    Unfortunately our access to the firewall and antivirus software of the other web application is very limited. They didn't even tell us which products they use, just that they are sure their firewall isn't tampering with our packets. What I know is that they have a Unix-based environment and are unlikely to have Kaspersky or BitDefender running on their servers or routers.

    As I mentioned, other clients (probably using all different sorts of AV software) do not report any problems. From the computers I use I can certainly say that GData and Kaspersky did not cause any problems for me.

    But there is an interesting new piece of information: the problems started right after we installed Windows Updates on 2016/07/20. That seems a bit too much of a coincidence.

    I will check whether any of the updates installed on that occasion relate to SSL/TLS connections. I will post here if I find anything.

    Best,
    Sascha

    Wednesday, August 10, 2016 11:52 AM
  • Okay, there's an interesting one: 

    https://support.microsoft.com/en-us/kb/3161639

    This one introduced two new cipher suites to Schannel:

    TLS_DHE_RSA_WITH_AES_128_CBC_SHA
    TLS_DHE_RSA_WITH_AES_256_CBC_SHA

    I'm not sure whether one of the new cipher suites has been used in our case, but it might have been. I could also imagine this update interfering with the already implemented cipher suites in some way.

    I will consult with my client and suggest uninstalling this update. I will keep you posted on whether this helps.

    Best,
    Sascha

    Wednesday, August 10, 2016 1:33 PM
  • Hi everyone,

    good news, you can stop analyzing your firewalls ;-)

    Uninstalling the update solved it. We get no more Schannel events 36888 with alert number TLS1_ALERT_BAD_RECORD_MAC (20) after doing so.

    The update you actually have to uninstall is KB3161608 (https://support.microsoft.com/de-de/kb/3161608). The above-mentioned addition of two new cipher suites to Schannel (https://support.microsoft.com/en-us/kb/3161639) is part of that update.

    Uninstalling it can be done in "Programs and Features" > "Show Installed Updates" > right click on the Update > "Uninstall". Reboot is required.

    Best,
    Sascha

    Friday, August 12, 2016 8:20 AM
  • I don't recommend uninstalling security updates if another workaround exists, especially if the update is a rollup.

    I created a GPO as described in https://support.microsoft.com/en-us/kb/3161639 and removed the 2 new cipher suites added by the update.

    1. Go to Computer Configuration > Administrative Templates > Network > SSL Configuration Settings.
    2. Under SSL Configuration Settings, select SSL Cipher Suite Order.
    3. Remove TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA from the list.

    This solved our problem until we can get the ciphers updated on our remote service.
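Editing the comma-separated list by hand is error-prone, so step 3 above can be sketched as a small Python helper. The sample order string is purely illustrative and is not a recommended or complete list; the real value should be copied from the SSL Cipher Suite Order policy on the server. The function name is made up.

```python
# The two suites added by the update, as named in this thread.
NEW_DHE_SUITES = {
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
}


def remove_suites(order: str, unwanted=NEW_DHE_SUITES) -> str:
    """Drop the unwanted suites from a comma-separated list, keeping the order."""
    names = [name.strip() for name in order.split(",")]
    kept = [name for name in names if name and name not in unwanted]
    return ",".join(kept)


# Illustrative input only; paste the real policy value here instead.
current = (
    "TLS_RSA_WITH_AES_256_CBC_SHA,"
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA,"
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA,"
    "TLS_RSA_WITH_AES_128_CBC_SHA"
)
updated = remove_suites(current)
```

The resulting string is what would be pasted back into the policy setting; the relative order of the remaining suites is unchanged.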

    Tuesday, August 23, 2016 2:28 PM
  • Thank you! This is the exact issue we are caught up in. Reordering the cipher suites didn't work.

    @MattBarlak: In your comment you mentioned that removing the two new cipher suites resolved the issue, and you went on to add "until we can get the ciphers updated on our remote service."

    Do you mean the problem occurs because the client does not offer the same cipher suites during the handshake? If so, how would the two cipher suites be reinstated?

    If the client were Unix/Linux or AIX, would an appropriate update of the cipher suites at the client end resolve the issue?


    Saturday, July 21, 2018 3:38 AM