Event IDs 70, 68, 67, 6 - CertificateServicesClient-CertEnroll

    Question

  • Since upgrading from a Windows Server 2003 certificate server to a 2012 R2 AD CS server, we have seen a few anomalies. They started on desktops, but I have now seen them on servers as well. We get a series of errors in the Application log:

    70 - Certificate enrollment for Local system failed because no valid policy can be obtained from policy servers with ID {DFFA2E01-D71F-4A3F-8414-2D5AB6EEA4B4}

    68 - Certificate enrollment for Local system failed in authentication to policy servers with ID {DFFA2E01-D71F-4A3F-8414-2D5AB6EEA4B4}  (A specified logon session does not exist. It may already have been terminated. 0x80070520 (WIN32: 1312))

    67 - Certificate enrollment for Local system failed to load policy from policy servers with ID  A specified logon session does not exist. It may already have been terminated. 0x80070520 (WIN32: 1312) ()

    6 - Automatic certificate enrollment for local system failed (0x80070520) A specified logon session does not exist. It may already have been terminated.

    These are followed by a Security-Kerberos error 4 in the System log for the host. At this point it is impossible to log in to the server with a domain account. The only resolution is to reboot the system; so far I have been fortunate that this has happened off hours for the servers.
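    The 0x80070520 value in the events above is an HRESULT that wraps a Win32 error. The following is a minimal sketch of how the fields can be pulled apart, assuming the standard HRESULT layout (failure bit, 11-bit facility, 16-bit code); the function name is hypothetical.

```python
def decode_hresult(hresult):
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    hresult &= 0xFFFFFFFF
    return {
        "failure": bool(hresult >> 31),       # top bit set means failure
        "facility": (hresult >> 16) & 0x7FF,  # 11-bit facility; 7 is FACILITY_WIN32
        "code": hresult & 0xFFFF,             # low 16 bits carry the wrapped error code
    }


fields = decode_hresult(0x80070520)
print(fields)  # the code field is the Win32 error number quoted in the event text
```

    The decoded code, 1312, matches the "WIN32: 1312" shown in events 68 and 67, so all four events report the same underlying logon-session error.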

    During research, there have been indications that this is related to time sync issues, but I have found no evidence of this. I won't rule it out completely, but the servers are virtualized and synced to the host. If one were off, I would expect the host to be off along with all the other servers.

    I am also curious what "policy servers with ID {DFFA2E01-D71F-4A3F-8414-2D5AB6EEA4B4}" refers to. I have searched the registry and ADSI Edit with no luck finding this GUID. If it refers to the old server, where is the information stored for the new server?

    I hope to figure this out before a critical server experiences the issue during business hours.

    Sunday, July 06, 2014 10:03 PM

All replies

  • Hi Tim,

    Please check the CA server to see whether the delta CRL has expired. If so, please renew the CRL.

    A similar discussion for your reference:

    Error OCS 2007 - OCS 2007 R2 and client Communicator

    Best Regards,

    Anna Wang

    TechNet Community Support

    Monday, July 07, 2014 9:41 AM
  • I checked and they are all current and I don't have any revoked certificates to actually publish.

    When I upgraded the CA from 2003 on a domain controller to standalone on 2012 R2, I did not install the Certificate Enrollment Web Services. I wanted it to act like the 2003 server and did not feel the need to introduce the new service. As I continued to dig, I found the ID from the error using Get-CertificateEnrollmentPolicyServer. The URL seems odd to me, but I have been unable to find a real-world example, just references to TechNet articles for the Enrollment Web Services that I am not using.

    Get-CertificateEnrollmentPolicyServer -context user -scope all

    Id                      : {DFFA2E01-D71F-4A3F-8414-2D5AB6EEA4B4}
    Url                     : ldap:
    AuthType                : Kerberos
    RequireStrongValidation : True
    AutoEnrollmentEnabled   : True
    IsDefault               : True
    Priority                : -1
    Context                 : User

    As far as I can tell, the URL is not being pushed explicitly by Group Policy, but I do have Automatic Certificate Management enabled. Unfortunately, there are no explicit URLs defined with that policy.

    Monday, July 07, 2014 6:39 PM
  • This ldap policy is "classical RPC/DCOM auto-enrollment" in Windows 2008 R2 (or higher) speak, as opposed to the new CEP/CES services, which can use HTTP policies. So this error can be read as: autoenrollment over RPC/DCOM failed.
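    As a rough illustration of that distinction, the Url field from Get-CertificateEnrollmentPolicyServer can be read by its scheme. This is only a sketch of the rule of thumb stated above, not an API of the enrollment client; the function name and the https URL are hypothetical.

```python
def enrollment_transport(policy_url):
    """Classify an enrollment policy URL by its scheme (simplified rule of thumb)."""
    scheme = policy_url.partition(":")[0].lower()
    if scheme == "ldap":
        # Policy read from Active Directory; enrollment goes over RPC/DCOM.
        return "RPC/DCOM (classic autoenrollment)"
    if scheme in ("http", "https"):
        # Policy served by the enrollment web services.
        return "CEP/CES web services"
    return "unknown"


print(enrollment_transport("ldap:"))
print(enrollment_transport("https://cep.example.test/policy"))  # hypothetical URL
```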

    I would focus on the Kerberos error. Since RPC/DCOM-based certificate enrollment cannot fall back to another authentication protocol, Kerberos errors often show up here first while they go undetected with other services.

    Since the migration seems to have triggered this and you get Kerberos error 4 (I assume it is this error), I would check whether there are any issues with the SPN of the new CA machine's computer account. Did you migrate to a new machine, and is the old machine's name in its SPN? Or does a duplicate SPN exist for the new machine (one that didn't exist for the old one)?

    Use adsiedit or ldp to check the CA server's SPNs. They should read HOST/[CA_DNSName] and HOST/[CA_NetBIOSName]. Then search the forest for another user or machine with these SPNs or with more specific SPNs like RPCSS/[CA_DNSName].

    What happens is that the server (the CA in this case) cannot decrypt the ticket because it was encrypted for another server. The most common cause is that the KDC was not able to locate the targeted (CA) service properly because of SPN issues. Duplicate or missing SPNs, sometimes triggered by DNS issues, are quite common causes of Kerberos errors.
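    The duplicate check described above can be sketched offline once the servicePrincipalName values have been exported from the directory. The account names and SPNs below are made up for illustration, and the comparison is case-insensitive on the assumption that SPN matching ignores case.

```python
from collections import defaultdict


def find_duplicate_spns(accounts):
    """Return each SPN registered on more than one account.

    accounts maps an account name to its list of servicePrincipalName values.
    """
    owners = defaultdict(set)
    for account, spns in accounts.items():
        for spn in spns:
            owners[spn.lower()].add(account)
    return {spn: sorted(names) for spn, names in owners.items() if len(names) > 1}


# Hypothetical export: the old DC still carries the CA's HOST SPN.
exported = {
    "CA01$": ["HOST/ca01.example.test", "HOST/CA01"],
    "OLDDC$": ["HOST/olddc.example.test", "host/CA01.example.test"],
}
duplicates = find_duplicate_spns(exported)
print(duplicates)
```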

    Edit / Added:

    Another place to look for issues with names (that could lead to SPN issues) is the attribute dnsHostName of the object under

    Configuration Container/Services/Public Key Services/Enrollment Services

    Here is an old but good TechNet article on Kerberos troubleshooting; check the section on SPNs in particular.

    Elke



    Monday, July 07, 2014 8:49 PM
  • The Kerberos error 4 you mentioned is the one I am seeing, but it seems to apply only to the HOST object of the client; no service was mentioned in the error. I didn't see anything out of the ordinary, but I am continuing to investigate.
    Wednesday, July 09, 2014 12:40 PM
  • I didn't find any duplicates using LDP, and I didn't find any RPCSS services registered. Are there any SPNs besides HOST that a CA should register?

    Wednesday, July 09, 2014 1:44 PM
  • The CA machine should only have HOST SPNs, as HOST also "includes" RPCSS (the same logic applies to HTTP, for example, which is also included in HOST). But since RPCSS is more specific, another machine with that SPN would overrule the CA.
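    A toy model of that lookup order, under the assumption that an exact SPN match is tried before the HOST alias. This only mirrors the behavior described in this reply, not the KDC's actual implementation; the alias set is a small illustrative subset and the account names are hypothetical.

```python
# Service classes treated as covered by HOST (illustrative subset only).
HOST_ALIASES = {"rpcss", "http", "cifs"}


def resolve_spn(requested, directory):
    """Return the account a requested SPN maps to, or None.

    directory maps lower-cased SPNs to account names.
    """
    key = requested.lower()
    if key in directory:  # an explicit, more specific registration wins
        return directory[key]
    service, _, host = key.partition("/")
    if service in HOST_ALIASES:  # otherwise fall back to the HOST SPN
        return directory.get("host/" + host)
    return None


directory = {"host/ca01.example.test": "CA01$"}
print(resolve_spn("RPCSS/ca01.example.test", directory))

# A stray RPCSS registration on another account now takes precedence.
directory["rpcss/ca01.example.test"] = "OTHER$"
print(resolve_spn("RPCSS/ca01.example.test", directory))
```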

    Wednesday, July 09, 2014 2:46 PM
  • I haven't found any duplicates. Three more servers have exhibited this behavior since I started this thread. I did find a time sync issue with my DR site: it was off by 75 seconds. The newly built domain controllers were using Hyper-V synchronization to the hosts, so based on the most recent documentation I could find, I disabled that, and now they are syncing properly to the PDC at the primary site. Physical member servers were still showing the decommissioned DCs as their time source. I corrected that also.

    Most of the servers exhibiting the behavior have been at the primary site, which hasn't experienced time sync issues.
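    For context on the 75-second offset: Kerberos in an Active Directory domain tolerates a clock difference of 5 minutes by default. A minimal check, assuming that default has not been changed by domain policy (the function name is hypothetical):

```python
from datetime import timedelta

# Default "maximum tolerance for computer clock synchronization" in AD.
MAX_CLOCK_SKEW = timedelta(minutes=5)


def within_kerberos_tolerance(offset_seconds, max_skew=MAX_CLOCK_SKEW):
    """True if a measured clock offset is inside the allowed Kerberos skew."""
    return abs(offset_seconds) <= max_skew.total_seconds()


print(within_kerberos_tolerance(75))   # the DR site offset from this thread
print(within_kerberos_tolerance(-400))
```

    Under that assumption, a 75-second offset on its own would not be expected to cause Kerberos failures, which fits the observation that most affected servers were at the primary site.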

    Monday, July 14, 2014 3:56 PM
  • If it is not an issue with SPNs it might be an issue with the CA machine not being able to update its computer password ... so that the KDC would also encrypt tickets with the wrong shared secret.

    Here is a thread related to the same Kerberos error on an enterprise CA - in this case the issue was resolved by re-joining the CA to the domain (using a shortcut via netdom to avoid restoring the CA ... not sure if this will work for 2012).

    The root cause is still unclear, so even if this works, errors may show up again when the computer password is due in 30 days. In this thread an issue with a virtual network adapter seems to have been responsible for failed updates of computer passwords, so if you changed the CA's virtual network adapter when you did the upgrade or migration, that is probably something to check.

    Monday, July 14, 2014 4:59 PM
  • The problem does appear to be related to the computer passwords. All the systems that I have checked had the password reset time right before we started experiencing issues.
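    One way to compare reset times across machines is to read the pwdLastSet attribute of each computer account, which stores a Windows FILETIME (100-nanosecond intervals since 1601-01-01 UTC). The sketch below converts it and estimates the next change, assuming the 30-day machine password age mentioned earlier; the attribute choice and the sample value are assumptions for illustration, not taken from this environment.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def next_password_change(filetime, max_age_days=30):
    """Earliest time the machine would be expected to change its password again."""
    return filetime_to_datetime(filetime) + timedelta(days=max_age_days)


# Placeholder value: the FILETIME for the Unix epoch, easy to verify by hand.
sample = 116444736000000000
print(filetime_to_datetime(sample).isoformat())
print(next_password_change(sample).isoformat())
```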

    Yesterday I checked when several of my computers were due to reset their passwords. I discovered that my PDC was due at 10 PM. I logged in around 10:30 and I thought it showed it was reset at 10:21. No issues. Around 4:56 AM, several computers started reporting errors connecting to the PDC. All Kerberos ID 4 errors. I had to reboot it to recover and discovered that it now showed the password was reset at 4:50 AM. I may have misread the date last night, but that was specifically what I was looking for. Why would it have reset twice in a 7-hour period?

    Also, yesterday we had a desktop with the issue. When I checked its last password reset, it was at 6:15 AM, right after it auto-booted. Three other computers also reset their passwords at the exact same time and none of them experienced the issue. Why some machines and not others?

    The CA was built from scratch and the backup from the old CA was migrated to it. This was not an in place upgrade. I was jumping from 2003 R2 to 2012 R2. The old CA was collocated on the domain controller. This was part of a larger project to upgrade all of the domain controllers to 2012 R2. We are still running at a domain functional level of 2003, but are running completely on 2012 R2 servers.

    Friday, July 18, 2014 7:25 PM
  • Maybe the client machines that managed to have their passwords reset correctly contacted a "good" domain controller, while the other ones talked to a DC that itself has issues with its password?

    Here is an article about resetting the computer password. I would probably try with a client first and reset the password by targeting netdom at different DCs, in order to check whether there are differences between DCs.

    The article also has a procedure for fixing issues with DCs themselves - I never used that so try at your own risk (or probably open a case with MS before you start tinkering with the passwords of the DCs).

    As for the root cause, I can only guess: Did you have W2K3 DCs and Win 7 / Server 2008 R2 machines in parallel at some point of time in your migration - so that you might have run into this issue? Did you migrate machines by restoring virtual machines or virtualizing machines that may have "missed" a computer password update?

    Sunday, July 20, 2014 2:49 PM
  • I had started forcing machine password resets using both Reset-ComputerMachinePassword and netdom. Sometimes it worked and sometimes it required a reboot, but it was on my terms and not in the middle of the day. Unfortunately, when machines are in this state, they don't update without a reboot.

    I opened a case with MS about a week ago. This morning they came back with an analysis of tests they had me perform:

    The logs above indicate that the server cannot find a system key of type 0x12. The key type 0x12 means 18 in decimal which is KERB_ETYPE_AES256. Please understand that when we log onto a server/workstation, we are in fact getting a service ticket to the server/workstation we are trying to log onto. So the Kerberos error “KRB_AP_ERR_MODIFIED” in our case means that a system key of Etype AES is used to encrypt the service ticket but when the server tries to decrypt this, it is not able to find an AES key. Then we received a logon error 0xc000006d which means an invalid logon, please see below.

    Based on my experience, the symptom in our case may occur in one of the following scenarios. And yes, your sense is right that it may be related to the machine password change. Here please allow me to explain more detailed information for you.

    • Have disabled the AES Etype on the problem machine previously and then enabled it again
    • Have performed a domain upgrade from 2003 to 2008 or later recently (AES Etype is only supported in windows server 2008 or later)

    The machine hashed password is stored both on the machine and in AD. For one machine, there may be several hashed passwords which are encrypted by different Etypes. If the AES Etype is disabled or unavailable on the client machine, there is no AES-encrypted password for the client machine stored on the machine or in AD. If AES is then enabled or becomes available, when the machine changes its password, the DC realizes that the client machine now supports the AES Etype, so it will create an AES-encrypted password for the machine. However, the AES-encrypted password will not be available on the machine at the time AES is enabled or becomes available. The machine must reboot to create the AES-encrypted password for itself.
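    The sequence MS describes can be walked through with a toy model. This is only a sketch of the explanation quoted above, not how the Kerberos stack is implemented: the etype numbers are the standard Kerberos values (0x12 is AES256, 0x17 is RC4-HMAC), while the preference order and function names are assumptions made for illustration.

```python
ETYPE_NAMES = {
    0x11: "aes128-cts-hmac-sha1-96",
    0x12: "aes256-cts-hmac-sha1-96",
    0x17: "rc4-hmac",
}
# Assumed preference: strongest key the account has in AD is used first.
KDC_PREFERENCE = (0x12, 0x11, 0x17)


def pick_ticket_etype(account_keys_in_ad):
    """Etype the KDC would use to encrypt a service ticket for the machine."""
    return next(e for e in KDC_PREFERENCE if e in account_keys_in_ad)


def try_decrypt(ticket_etype, keys_loaded_on_machine):
    """Outcome when the machine receives a ticket of the given etype."""
    if ticket_etype in keys_loaded_on_machine:
        return "OK"
    name = ETYPE_NAMES[ticket_etype]
    return f"KRB_AP_ERR_MODIFIED (no key of type {hex(ticket_etype)}, {name})"


# 1. Before AES is available: both sides only hold an RC4 key.
ad_keys, loaded = {0x17}, {0x17}
before = try_decrypt(pick_ticket_etype(ad_keys), loaded)

# 2. Password change once AES is available: AD gains an AES256 key,
#    but the running machine has not created its own yet.
ad_keys = {0x17, 0x12}
after_change = try_decrypt(pick_ticket_etype(ad_keys), loaded)

# 3. Reboot: the machine now holds the AES256 key as well.
loaded = {0x17, 0x12}
after_reboot = try_decrypt(pick_ticket_etype(ad_keys), loaded)

print(before, after_change, after_reboot, sep="\n")
```

    In this model only step 2 fails, and it fails with a missing key of type 0x12, which is the same key type the MS analysis reports as missing.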

    This description seems to fit with what I have seen, but now I need to know whether it will persist or whether this is a one-time thing for all computers. Are there other steps I have to take to get it resolved? Hopefully MS will have some answers.

    Monday, July 21, 2014 5:16 PM
  • I have encountered something like this (incompatible encryption, etype) only once - with Windows 2003 DCs.

    Back then the solution was to set a registry key at the affected DCs  - I think this is the updated version of the related KB article.

    Thanks for the detailed update!

    Monday, July 21, 2014 10:13 PM
  • This recommendation from MS surprised me; I didn't see it in any of the documents I researched when preparing for the domain upgrade:

    Generally, after upgrading domain, we always suggest disjoining and rejoining the member servers/workstations to domain to update the system key so that we can avoid potential authentication issues.

    If this is the recommended course of action, I can see why so many enterprises would still be on 2003.

    Here is a good article they provided on encryption types. 

    http://blogs.msdn.com/b/openspecification/archive/2011/05/31/windows-configurations-for-kerberos-supported-encryption-type.aspx
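    When checking which encryption types an account advertises, the msDS-SupportedEncryptionTypes attribute is a bit mask. A small decoder, assuming the standard bit assignments for that attribute; the sample values are illustrative and not read from this domain.

```python
ETYPE_BITS = {
    0x01: "DES-CBC-CRC",
    0x02: "DES-CBC-MD5",
    0x04: "RC4-HMAC",
    0x08: "AES128-CTS-HMAC-SHA1-96",
    0x10: "AES256-CTS-HMAC-SHA1-96",
}


def decode_supported_etypes(mask):
    """List the encryption types enabled in an msDS-SupportedEncryptionTypes value."""
    return [name for bit, name in ETYPE_BITS.items() if mask & bit]


print(decode_supported_etypes(0x1C))  # RC4 plus both AES types
print(decode_supported_etypes(0x07))  # no AES at all
```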


    Tuesday, July 22, 2014 11:59 AM