Our junior admins woke me at 3 a.m. last night: our four-node ADFS farm had gone down completely.
On service startup we get a series of certificate validity errors, followed by the error below, and then the service stops. I have tried the various PowerShell scripts for working with the certificates, but they fail with errors saying that the WID database rejected the connection (more certificate-based errors?), even though the WID services are running and accessible via SSMS.
About the only thing I can do successfully is remove the role and start again from scratch, which forces an overwrite of the existing database. There doesn't seem to be any path forward. I even grabbed the WID files and ported them to SQL Server, and that connection attempt fails with the same error.
There was an error in enabling endpoints of Federation Service. Fix configuration errors using PowerShell cmdlets and restart the Federation Service.
Additional Data
Exception details:
System.ArgumentNullException: Value cannot be null.
Parameter name: certificate
at System.IdentityModel.Tokens.X509SecurityToken..ctor(X509Certificate2 certificate, String id, Boolean clone, Boolean disposable)
at Microsoft.IdentityServer.Service.Configuration.MSISSecurityTokenServiceConfiguration.Create(Boolean forSaml, Boolean forPassive)
at Microsoft.IdentityServer.Service.Policy.PolicyServer.Service.ProxyPolicyServiceHost.ConfigureWIF()
at Microsoft.IdentityServer.Service.SecurityTokenService.MSISConfigurableServiceHost.Configure()
at Microsoft.IdentityServer.Service.Policy.PolicyServer.Service.ProxyPolicyServiceHost.Create()
at Microsoft.IdentityServer.ServiceHost.STSService.StartProxyPolicyStoreService(ServiceHostManager serviceHostManager)
at Microsoft.IdentityServer.ServiceHost.STSService.OnStartInternal(Boolean requestAdditionalTime)