(2013-09-06) Duplicate SPN Breaks Trust Between Client/Server And Active Directory
Posted by Jorge on 2013-09-06
A colleague and I were setting up an ADFS test environment with ADFS v2 and ADFS v3. We used ADFS v3 as both the Identity Provider and Service Provider and we used ADFS v2 as the Identity Provider. ADFS v3 was installed on just one member server and it was configured with a custom service account and a custom ADFS service FQDN. The ADFS service FQDN was different from the server FQDN, so the service FQDN was registered on the custom service account as (without the quotes) “HOST/<ADFS Service FQDN>”. ADFS v2 was also installed on just one member server and configured with a custom service account but, with laziness as the main reason, we configured the ADFS service FQDN to be exactly the same as the server FQDN (hint: this is where it went wrong!).
So, by default every Windows client and Windows server registers, amongst others, the following SPNs on its own computer account: HOST/<NetBIOS Name Of Computer> (e.g. HOST/MYCOMPUTER) and HOST/<FQDN of Computer> (e.g. HOST/MYCOMPUTER.DOMAIN.COM). As a best practice to get Kerberos working on an ADFS STS, you should register the SPN HOST/<ADFS Service FQDN> (e.g. HOST/ADFS.DOMAIN.COM) on the service account the ADFS service is running as.
In the ADFS v3 case we did it correctly. In the ADFS v2 case, because of our laziness, we made a mistake without even being aware of it. Because the ADFS Service FQDN matched the server FQDN, “HOST/<ADFS Service FQDN>” was identical to “HOST/<Server FQDN>”, and that SPN ended up on two different security principals: the computer account and the ADFS service account. In other words, the cause of all this was a duplicate SPN, which in our case was a stupid mistake!
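To make the collision concrete, here is a minimal Python sketch that models the two setups described above as plain data and reports any SPN registered on more than one account. It does not talk to AD at all. The account names (`ADFS3SRV$`, `svc-adfs3`, `ADFS2SRV$`, `svc-adfs2`) and the helper functions are hypothetical and exist only for illustration. The sketch also assumes SPNs are compared case-insensitively.

```python
from collections import defaultdict


def default_host_spns(netbios_name, fqdn):
    """HOST SPNs a domain-joined computer registers on its own account."""
    return [f"HOST/{netbios_name}", f"HOST/{fqdn}"]


def find_duplicate_spns(accounts):
    """Map each SPN found on more than one account to the accounts holding it.

    SPNs are upper-cased before comparison (assumption: matching is
    case-insensitive).
    """
    owners = defaultdict(set)
    for account, spns in accounts.items():
        for spn in spns:
            owners[spn.upper()].add(account)
    return {spn: sorted(names) for spn, names in owners.items() if len(names) > 1}


# Hypothetical accounts modelled on the ADFS v3 and ADFS v2 setups.
accounts = {
    "ADFS3SRV$": default_host_spns("ADFS3SRV", "ADFS3SRV.DOMAIN.COM"),
    "svc-adfs3": ["HOST/ADFS.DOMAIN.COM"],       # service FQDN differs from server FQDN
    "ADFS2SRV$": default_host_spns("ADFS2SRV", "ADFS2SRV.DOMAIN.COM"),
    "svc-adfs2": ["HOST/adfs2srv.domain.com"],   # service FQDN equals server FQDN
}

duplicates = find_duplicate_spns(accounts)
print(duplicates)
```

Only the ADFS v2 pair is reported, because its service SPN is the same as one of the SPNs the server already registered on its own computer account.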
This is just to explain upfront what happened and what the cause was. ADFS is just an example in all this and is not directly related to the error or its cause. All of this was done on W2K12, but it also applies to all other Windows OSes!
We installed the Windows Server for ADFS and joined it to the AD domain. So far so good. We installed ADFS and performed the necessary configurations. We rebooted the server and were suddenly surprised by the error “The trust relationship between this workstation and the primary domain failed” as shown in figure 1. What the heck!
Figure 1: “The Trust Relationship Between This Workstation And The Primary Domain Failed”
We saw this and our first reaction, honestly, was WTF! We disjoined the server from the AD domain, reset its computer account password and rejoined it to the AD domain. So far so good, no errors at all. When trying to log on, we got the same error again as shown in figure 1. We disjoined the server from the AD domain again, reset its computer account password and rejoined it to the AD domain. Again, no errors at all, and again the same error from figure 1 when trying to log on. OK, something is wrong! No shit, Sherlock!?
On another server I executed the following PowerShell command line, which uses the Search-EventLog-For-String.ps1 script.
.\Search-EventLog-For-String.ps1 -listOfServers R1FSMBSV2.ADCORP.LAB -eventLogName Security -stringToSearchFor "An Error occured during Logon"
Figure 2a: “An Error Occurred During Logon” Event When I Tried To Log On
Figure 2b: “An Error Occurred During Logon” Event When I Tried To Log On
The next event proves that NTLM is being used instead of Kerberos. This is because I’m running the script on a server other than the failing server, and the script contacts the failing server to read its event log using NTLM authN. Remember, Kerberos authN is broken!
.\Search-EventLog-For-String.ps1 -listOfServers R1FSMBSV2.ADCORP.LAB -eventLogName System -stringToSearchFor "NTLM authentication"
Figure 3: Detection Of NTLM AuthN Being Used Instead Of Kerberos AuthN
Just to be sure and based upon the errors found above, we decided to scan the DCs in the same site as the failing server for “KDC_ERR_S_PRINCIPAL_UNKNOWN” errors.
.\Search-EventLog-For-String.ps1 -discoverDC LOCALSITE -eventLogName System -stringToSearchFor "KDC_ERR_S_PRINCIPAL_UNKNOWN" -table $true
.\Search-EventLog-For-String.ps1 -discoverDC LOCALSITE -eventLogName System -stringToSearchFor "KDC_ERR_S_PRINCIPAL_UNKNOWN"
Figure 4: Detection Of SPN Issues
Based on the previous output, we scanned the DCs in the same site as the failing server for “DS_SERVICE_PRINCIPAL_NAME” errors.
.\Search-EventLog-For-String.ps1 -discoverDC LOCALSITE -eventLogName System -stringToSearchFor "DS_SERVICE_PRINCIPAL_NAME" -table $true
.\Search-EventLog-For-String.ps1 -discoverDC LOCALSITE -eventLogName System -stringToSearchFor "DS_SERVICE_PRINCIPAL_NAME"
Figure 5: Duplicate SPNs Detected On Multiple AD accounts
As an additional check, we used SETSPN to determine if duplicate SPNs exist at all.
Figure 6: Accounts Detected In The AD Domain With The Same SPN Registered (By Using SETSPN)
Querying AD for the SPN (without the quotes) “HOST/R1FSMBSV2.ADCORP.LAB” returned the information in figure 7, which shows the accounts that have the same SPN registered!
Figure 7: Accounts Determined That Have The Same SPN Registered
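The post does not show how the AD query was written, so the following is only a sketch of one way to build it, assuming the lookup is an LDAP search on the `servicePrincipalName` attribute. The helper names `escape_ldap_value` and `spn_filter` are hypothetical. The escaping follows the special characters listed in RFC 4515, so that an SPN containing `*` or parentheses cannot change the meaning of the filter.

```python
def escape_ldap_value(value):
    """Escape characters that are special inside an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def spn_filter(spn):
    """Build an LDAP filter matching every object that has this SPN registered."""
    return f"(servicePrincipalName={escape_ldap_value(spn)})"


ldap_filter = spn_filter("HOST/R1FSMBSV2.ADCORP.LAB")
print(ldap_filter)
```

The resulting string can be handed to whatever LDAP query tool you prefer. If the search returns more than one object, that SPN is duplicated.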
The solution in this case was to change the SPN on the ADFS service account. First, we changed the ADFS Service FQDN and registered the matching new SPN on the existing service account. We also removed the old SPN from the service account. In ADFS terms we also had to change the ADFS Service Communication Certificate (just mentioned as a side note).
The moral of the story? Duplicate SPNs break stuff!
* This posting is provided "AS IS" with no warranties and confers no rights!
* Always evaluate/test yourself before using/implementing this!
* DISCLAIMER: http://jorgequestforknowledge.wordpress.com/disclaimer/
############### Jorge’s Quest For Knowledge #############
######### http://JorgeQuestForKnowledge.wordpress.com/ ########