NDES and the dreaded 2 & 10 Event IDs stating "The parameter is incorrect"
Hey guys, Rob here again. Today I am going to go over a pair of typical Network Device Enrollment Service (NDES) event IDs that you will inevitably encounter if you maintain an environment with NDES installed. These two events always seem to run together and can be seen in the Application event log.

Log Name: Application
Source: Microsoft-Windows-NetworkDeviceEnrollmentService
Date: DATE_TIME
Event ID: 2
Level: Error
User: NDES_APPLICATION_POOL_IDENTITY
Computer: NDES_SERVER_COMPUTER_NAME
Description: The Network Device Enrollment Service cannot be started (0x80070057). The parameter is incorrect.

Log Name: Application
Source: Microsoft-Windows-NetworkDeviceEnrollmentService
Date: DATE_TIME
Event ID: 10
Level: Error
User: NDES_APPLICATION_POOL_IDENTITY
Computer: NDES_SERVER_COMPUTER_NAME
Description: The Network Device Enrollment Service cannot retrieve one of its required certificates (0x80070057). The parameter is incorrect.

As you can see, although it is nice to get errors about a service or application, they do you no good if they do not carry enough information to act on. Hopefully, this is where this blog will be helpful to all of you.

The above errors happen for one of three common reasons:

1. The application pool identity account running the SCEP (Simple Certificate Enrollment Protocol) application pool cannot access the private keys for one or both Registration Authority (RA) certificates.
2. One or both RA certificates were NOT issued by the Certification Authority to which NDES is configured to forward certificate signing requests (CSRs).
3. The RA certificates are failing revocation checks. This means that either the RA certificates themselves or one of the CA (certification authority) certificates in the chain are failing revocation checks for some reason.

If you try to access either the /CertSrv/MSCEP/MSCEP.dll or /CertSrv/MSCEP_Admin endpoints on the NDES server, you will also see an HTTP 500 error.
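As a side note on why both events read "The parameter is incorrect": 0x80070057 is an HRESULT that wraps a Win32 error code, and the low 16 bits (0x57 = 87) are ERROR_INVALID_PARAMETER. The sketch below is only an illustration of that bit layout; the function name decode_hresult is made up for this example and is not part of any NDES tooling.

```python
FACILITY_WIN32 = 7  # facility value used when an HRESULT wraps a Win32 error


def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility, and code fields."""
    hresult &= 0xFFFFFFFF
    return {
        "failure": bool(hresult >> 31),        # bit 31: severity (1 = failure)
        "facility": (hresult >> 16) & 0x7FF,   # bits 16-26: facility
        "code": hresult & 0xFFFF,              # bits 0-15: error code
    }


fields = decode_hresult(0x80070057)
if fields["facility"] == FACILITY_WIN32:
    print("Win32 error code:", fields["code"])
```

The takeaway is that the code itself says nothing about which parameter is wrong, which is why the three scenarios in this post have to be checked one by one.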
Missing Private Key permissions

Below are the steps for the first scenario, to validate or add private key permissions for the application pool identity account. First, make sure you gave the application pool identity account permission to the private keys on the newly issued certificates.

1. Run: CertLM.msc
2. Expand: Certificates - Local Computer\Personal\Certificates
3. Click on the new certificate, then right click on it and select "All Tasks".
4. Click on "Manage Private Keys". You should see the Permissions dialog box.
5. Click the Add button, and type in the account running the SCEP application pool. Click the OK button.
6. This account only requires "Allow" "Read" permissions. Once the permissions have been configured, click the OK button.
7. Do this for all certificates that were recently renewed / issued.
8. Open an elevated command prompt and type: IISReset
9. Then, test accessing the website.

If you are unsure what account is being used for the SCEP application pool, you can find out by doing the following:

1. Run: InetMgr.exe
2. Navigate to: SERVERNAME\Application Pools
3. Find the application pool named SCEP, and then look at the Identity column.

NOTE: If the NDES role was configured without a domain service account and it is using ApplicationPoolIdentity, please understand that ApplicationPoolIdentity and NetworkService are not the computer account. For either of these two accounts, you will need to add NetworkService to the private key permissions. These accounts have very restricted rights on the system itself.

Registration Authority certificates issued by wrong CA

The second scenario can only happen when you have more than one Certification Authority in the environment, you have renewed the Registration Authority certificates, and one or both certificates were NOT issued by the Certification Authority that NDES sends its certificate signing requests to. The first thing we need to determine is what CA issued the two NDES certificates.
1. Run: CertLM.msc
2. Expand: Certificates - Local Computer\Personal\Certificates
3. Look at the Issued By column for the current RA certificates. This will tell us the CA that issued the certificates.

To find out which CA NDES is configured to use:

1. Run: Regedit.exe
2. Navigate to: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography\MSCEP\CAInfo
3. The registry value named Configuration shows you the CA computer \ CA Name that NDES is using.

Validate that this is the CA that issued both RA certificates. If not, delete the certificates not issued by this CA and enroll again.

NOTE: If you are using the MMC (Microsoft Management Console) to do the enrollment, you can specify the CA to use when you are filling out the information. Click on the Certification Authority tab and select the CA to use.

After procuring the NDES certificates from the correct CA, you must run IISReset from an elevated command prompt.

One or both RA certificates are failing chaining or revocation checks

The third scenario is a bit trickier, as most customers are not familiar with CAPI2 operational logging and how to interpret the data it provides. I am going to concentrate on looking at the NDES RA certificates to determine if they are failing a revocation check. By no means is this meant to be an exhaustive guide on how to use CAPI2 to troubleshoot chaining or revocation checking failures.

The first two problems usually show up once NDES has been in place for a year or more and fails just after the existing NDES certificates are replaced. So, if everything was working before replacing the RA certificates, please review the two previous scenarios before jumping to an issue with certificate chaining or revocation checking.

What is certificate chaining?

Certificate chaining refers to the computer being able to take an end entity certificate and follow the chain all the way up to a Root CA certificate that is in the Trusted Root store of the computer.
If the certificate cannot be chained to a Root CA certificate, then it is not considered a "trusted" certificate, since the computer does not trust the root CA at the top of the certificate chain.

What is revocation checking?

All certificates except Root CA certificates have a field called CRL Distribution Points (CDP). This field lists the URLs that host a file known as a Certificate Revocation List (CRL). The CRL lists all issued certificates that the CA Manager has decided, for one reason or another, should no longer be trusted, and wants the world to know about. Just like certificates, CRL files have a finite lifetime that restricts how long they can be used. Once the CRL's Next update value has been reached, the CRL is no longer trusted, and the computer MUST download the newest CRL at that time. If the download fails because the URL is not reachable, or because the CRL has not been updated at the URL, a revocation check failure is generated. When the revocation check fails, the computer can no longer trust the certificate it was trying to use. In this case, NDES would not trust the RA certificate, and thus NDES would fail to start / run.

Keep in mind that in a two-tier PKI hierarchy, where you have an Offline Root CA and an Online Enterprise Issuing CA that issued the RA certificates, the following checks will happen:

1. The RA certificate is examined to find the download locations of the Online Enterprise Issuing CA's CRL. Then, we validate that the RA certificate has NOT been revoked by the Online Enterprise Issuing CA Manager.
2. The Online Enterprise Issuing CA's certificate is examined to find the download locations of the Offline Root CA's CRL. Then, we validate that the Online Enterprise Issuing CA certificate has NOT been revoked by the Offline Root CA Manager.
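The two checks above can be sketched as a small loop over the chain. This is a simplified model for illustration only, not how CryptoAPI is implemented: the Cert and Crl classes, the status strings, the CA names, and the serial numbers are all made up for the example, and a CRL that could not be downloaded is modeled as None.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Cert:
    subject: str
    issuer: str
    serial: str


@dataclass
class Crl:
    next_update: datetime
    revoked_serials: frozenset


def check_revocation(chain, crls, now):
    """Check each certificate against its issuer's CRL.

    chain: certificates from the end entity upward, excluding the root.
    crls:  maps an issuer name to its Crl, or to None if the download failed.
    """
    results = []
    for cert in chain:
        crl = crls.get(cert.issuer)
        if crl is None:
            status = "offline"   # CDP URL unreachable
        elif now >= crl.next_update:
            status = "stale"     # Next update reached, CRL no longer trusted
        elif cert.serial in crl.revoked_serials:
            status = "revoked"
        else:
            status = "ok"
        results.append((cert.subject, status))
    return results


# Hypothetical two-tier hierarchy: the Root CA's CRL was never republished.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
chain = [
    Cert("ADATAM-WEB01-MSCEP-RA", "Issuing CA", "1A"),
    Cert("Issuing CA", "Root CA", "02"),
]
crls = {
    "Issuing CA": Crl(datetime(2024, 1, 17, tzinfo=timezone.utc), frozenset({"0F"})),
    "Root CA": Crl(datetime(2024, 1, 1, tzinfo=timezone.utc), frozenset()),
}
result = check_revocation(chain, crls, now)
print(result)
```

In this made-up data the RA certificate itself passes, but the Issuing CA certificate fails because the Root CA's CRL is past its Next update, which is enough for the whole chain to be rejected.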
As you can see, this can get complicated quickly depending on how many tiers the PKI hierarchy has within the environment.

Enabling logging and data collection

First, we must enable CAPI2 Operational event logging on the NDES server in question.

1. Run: Eventvwr.msc
2. Navigate to: Event Viewer\Applications and Services Logs\Microsoft\Windows\CAPI2\Operational
3. Right click on the Operational log and select Properties. We are going to do two things:
   a. Increase the log file size to at least 10000 KB. A 1 MB log file is not going to be enough, and if the server is busy 10 MB might not be enough either, but it is a start.
   b. Enable the log.
4. Click the OK button.
5. Run the following commands in an elevated command prompt:
   a. IISReset
      This makes IIS (Internet Information Services) reload / reset and causes the NDES application pool to try to load the certificates again once someone hits either the /CertSrv/MSCEP/MSCEP.dll or /CertSrv/MSCEP_Admin sites.
   b. CertUtil -SetReg Chain\ChainCacheResyncFileTime @Now
      This command tells CryptoAPI (CAPI) not to rely on the Crypto cache and instead attempt to access the real AIA (Authority Information Access) and CDP locations. If these fail, it still allows the Crypto cache to be used, so it will NOT cause an outage. It just helps put more error events in the log.
   c. CertUtil -URLCache * Delete
      This command tells CryptoAPI to delete everything in its FILE cache. This will NOT delete anything in its memory cache or per-process memory cache.
6. Lastly, try to access the /CertSrv/MSCEP_Admin page. You should see an HTTP 500 error, which is fine at this time, and you should also see NetworkDeviceEnrollmentService events 2 and 10 in the Application event log.

Log Name: Application
Source: Microsoft-Windows-NetworkDeviceEnrollmentService
Date: DATE_TIME
Event ID: 2
Level: Error
User: NDES_APPLICATION_POOL_IDENTITY
Computer: NDES_SERVER_COMPUTER_NAME
Description: The Network Device Enrollment Service cannot be started (0x80070057).
The parameter is incorrect.

Log Name: Application
Source: Microsoft-Windows-NetworkDeviceEnrollmentService
Date: DATE_TIME
Event ID: 10
Level: Error
User: NDES_APPLICATION_POOL_IDENTITY
Computer: NDES_SERVER_COMPUTER_NAME
Description: The Network Device Enrollment Service cannot retrieve one of its required certificates (0x80070057). The parameter is incorrect.

7. Once the issue is reproduced, I suggest you return to the CAPI2 Operational event log properties and disable the log. We do not want other CryptoAPI calls happening on the server to push out or overwrite the data in the event log.

The CAPI2 event log could have quite a few events in it. A good event ID based filter to start with is to show only the following events: 11,30,41-42,51-53,90

To find the events of interest for the Registration Authority (RA) certificates, do a Find in the event log for either the thumbprint value of the certificate or the subject name of the certificate. An example of a common RA certificate subject name is: ADATAM-WEB01-MSCEP-RA

The subject name is usually defined as <NDES server name>-MSCEP-RA. Once the certificates have been renewed at least once, you will want to validate the current subject name of the certificates in use that are based on the Exchange Enrollment Agent (Offline request) and CEP Encryption templates.

CAPI2 Events of Interest

It helps to understand a little about the different CAPI2 events related to chaining and revocation checking that you are going to see in the event log:

Event ID 90 X509 Objects (X509Objects) - Shows all the certificates, CRLs, and OCSP (Online Certificate Status Protocol) responses it was able to collect, either via the certificate stores or via the CryptoAPI cache. This is a good event to review to see whether the OS (operating system) found all certificates in the chain. If you do not see a required certificate, then the chaining function will not succeed.
This event also shows more detail about each certificate than the other events in the log.

Event ID 11 Build Chain (CertGetCertificateChain) - Shows whether the certificate chains to a valid root certification authority. In addition, it performs revocation checks and shows whether each certificate in the chain passes or fails its revocation check.

Event ID 30 Verify Chain Policy (CertVerifyChainPolicy) - The first thing to do with this event is determine which policy is being verified. There are several types of policy checks that this event can cover (see the CertVerifyChainPolicy documentation for the list of policy checks). For the given policy, it will show either success or failure. The policy check itself could pass and still show as a failure: if Event ID 11 fails because of a revocation check failure, you will see that same failure here.

Event ID 41 Verify Revocation (CertVerifyRevocation) - Shows what CAPI2 knows about the status of the CryptoAPI cache with respect to revocation information.

Event ID 42 Reject Revocation Information (CertRejectedRevocationInfo) - Shows that CryptoAPI cached data is being rejected because it is stale or because CryptoAPI needs to go off system to get the latest CRL / OCSP response from the network.

Event ID 53 Retrieve Object from Network (CryptRetrieveObjectByUrlWire) - Shows the status of attempting to access a specific AIA or CDP URI (Uniform Resource Identifier). It will give you the call status too.

Example of troubleshooting with CAPI2 logging enabled

1. First, filter the collected CAPI2 event log with the following: 11,30,41-42,51-53,90
2. Click on Find and type something unique about the certificate, either the subject name or the thumbprint value.
3. We can see the first instance where the subject name is found, and it is shown as an error. When reviewing these events, use the Details tab with Friendly View selected.
4. Now when looking at the event, we will be interested in looking at multiple events in the logs to determine what is going on.
Typically, in this type of problem, you want to look at the events in the following order: 90, 11, 30, and lastly 53.
5. We can see in event 11 that this is failing a revocation check. You will want to pay attention to the TrustStatus field in the Details section. The first TrustStatus is the overall TrustStatus. This tells you about the entire chain, and specifically that one of the certificates in the trust path failed revocation. Below the overall TrustStatus, the event shows each individual Chain Element (each certificate in the chain) in the certificate trust path and its TrustStatus. From looking at the above, we can determine that the ADATAM-WEB01-MSCEP-RA certificate is the one that is failing the revocation check. This means we need to look at the CA that issued this certificate and validate that its CRL is reachable and valid at the URIs in question. If the PKI hierarchy has more CAs, you may discover that the RA certificate is valid and that the Issuing CA's certificate failed validation. If that is the case, it would mean that there is an issue with the Root CA's CRLs.
6. We are not going to look at the Event 30s, as there are no policy checks that would be validated in the context of whether the NDES RA certificates are valid.
7. Next, we would typically jump right to the Event 53s to see what might be going on with accessing the CRL / OCSP URLs. The first thing to look at is the URL in the event. This tells us what path it is trying to access. Next, we can see the HTTP Request and HTTP Response from the OCSP Server. In this example the response was an HTTP 503 error. This tells us there is an issue with the OCSP Server that must be addressed to resolve the NDES problem.
8. Another example is trying to access a Certificate Revocation List (CRL) file. Here it was able to download the CRL file successfully, as evidenced by the HTTP 200 (OK) response.
9. But after successfully downloading the CRL file, we see an Event 42 error.
So, we can see it is stating that the CRL at the HTTP URL path is no longer valid.

Usually, when an HTTP 500 error is seen and is related to revocation checking, the cause is an unreachable or expired CRL. In most of the cases I have seen where this is the issue, the Root CA's CRL has expired, and the customer has forgotten to boot the Root CA and publish the new CRL that gets created at service start.

Common Network Device Enrollment Service (NDES) configuration wizard failures
Hey all! Rob Greene here. We see cases where the Network Device Enrollment Service (NDES) configuration wizard fails to complete successfully. Please keep in mind that you can get these error messages outside of NDES installation; however, we are not going to cover those errors in this blog. This blog assumes that everything is working fine in general with regard to issuing certificates within the environment, but the NDES configuration wizard is failing.

The errors most often encountered by customers are:

Access Denied
RPC Communication
AD CS Service Stop / Start

Access Denied Message

The first error is the dreaded Access Denied message shown while running through the wizard. The same failure is recorded in the deployment operational log:

Event Viewer\Applications and Services Logs\Microsoft\Windows\CertificateServices-Deployment\Operational

Log Name: Microsoft-Windows-CertificateServices-Deployment/Operational
Source: Microsoft-Windows-CertificateServices-Deployment
Date: [Date/Time]
Event ID: 104
Level: Error
User: [DOMAIN\USER]
Computer: [NDES Computer Name]
Description: System.Exception: System.Exception: CMSCEPSetup::InitializeDefaults: Access is denied.
0x80070005 (WIN32: 5 ERROR_ACCESS_DENIED)
at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.PowerShellCommandExecutor.Execute(Command command, IPowerShellEngine powerShellEngine, IRehydrator rehydrator)
at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.PowerShellExecutablePR`2.ExecuteCommand(CommandParameter[] parameters)
at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.NDES.Operations.Initialize.Execute(PostConfigurationTaskData taskData)
at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.AsyncOperationP`1.DoWork(Object sender, DoWorkEventArgs eventArgs)

Several tasks happen when going through the configuration wizard, and most of them require an elevated account. Because of this elevation requirement, seeing Access Denied during the configuration means that the account running the wizard does not have the required permissions. Here is the list of tasks that are done, each followed by a way to check it:

1. Modify the permissions on the certificate template named IPSec (Offline request). The wizard adds the application pool identity account that was specified in the NDES configuration wizard with Enroll permissions for the template.
   Check: Use the CertTmpl.msc console while logged in as the account used to run the NDES configuration wizard, and try to set Enroll permissions on the template. Was it able to successfully set permissions on this template?
2. Modify the CertificateTemplates attribute on the CA's pKIEnrollmentService object. The object is in the Configuration partition (CN=CA NAME,CN=Enrollment Services,CN=Public Key Services,CN=Services,CN=Configuration,DC=Forest Root Domain) of the CA it is targeting. The following template names are added to the CertificateTemplates attribute: IPSECIntermediateOffline, CEPEncryption, and EnrollmentAgentOffline.
   Check: Use the CertSrv.msc console on the CA computer while logged in as the account used to run the NDES configuration wizard, and try to add these templates to the CA. Was it able to successfully add the templates to the CA?
3. Stop and start the Active Directory Certificate Services service on the certification authority (CA) computer.
   Check: From the NDES server, use the Services.msc console and try to restart the AD CS service while logged in as the account used to run the NDES configuration wizard. Was it able to stop and start the AD CS service?

If any of these tasks fail, you will see the Access Denied error message. So, the first thing to check is that the account used to run the NDES configuration wizard can do each of these tasks independently of the wizard.

How RPC communication works

Remote Procedure Call (RPC) has two components:

Endpoint Mapper: The endpoint mapper listens on TCP port 135. Its job is to keep a database of each RPC based application (identified by UUID) and the high / ephemeral port that the RPC application is listening on.

RPC application / DCOM application: When a DCOM or RPC based application starts up, it finds an available high port (also known as an ephemeral port), typically in the range of 49152 to 65535. Once it finds a port, it registers its UUID and that port with the RPC Endpoint Mapper.

When an RPC / DCOM based client application wants to connect to the RPC / DCOM application, it first contacts the RPC Endpoint Mapper and asks for the port number of the RPC / DCOM application, identified by its UUID. The endpoint mapper looks this information up and returns the high port that the RPC / DCOM application registered. The RPC / DCOM client application then attempts to connect to the high port given to it by the RPC endpoint mapper.
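To make the two-step lookup concrete, here is a toy in-memory model of that flow. It is a sketch only: no real networking or RPC happens, the class and function names are invented for the example, the UUID is a made-up placeholder, and a firewall is modeled simply as the set of ports the client is allowed to reach.

```python
import random

EPHEMERAL_RANGE = range(49152, 65536)  # typical high / ephemeral port range


class EndpointMapper:
    """Toy stand-in for the RPC Endpoint Mapper that listens on TCP 135."""

    PORT = 135

    def __init__(self):
        self._registry = {}  # UUID -> high port

    def register(self, uuid, port):
        self._registry[uuid] = port

    def lookup(self, uuid):
        return self._registry.get(uuid)


def start_rpc_server(mapper, uuid, rng=random):
    """Pick a high port and register it with the endpoint mapper."""
    port = rng.choice(EPHEMERAL_RANGE)
    mapper.register(uuid, port)
    return port


def client_connect(mapper, uuid, reachable_ports):
    """Ask the mapper for the port, then try to reach that port."""
    if EndpointMapper.PORT not in reachable_ports:
        return "RPC server is unavailable (endpoint mapper blocked)"
    port = mapper.lookup(uuid)
    if port is None:
        return "endpoint not registered"
    if port not in reachable_ports:
        return "RPC server is unavailable (high port blocked)"
    return f"connected on port {port}"


PLACEHOLDER_UUID = "00000000-0000-0000-0000-000000000001"  # hypothetical
mapper = EndpointMapper()
server_port = start_rpc_server(mapper, PLACEHOLDER_UUID)

# A firewall that only allows TCP 135 still breaks the second step.
print(client_connect(mapper, PLACEHOLDER_UUID, {135}))
print(client_connect(mapper, PLACEHOLDER_UUID, {135, server_port}))
```

The point of the model is that opening TCP 135 alone is not enough: the client also has to reach whichever high port the server registered.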
For more information on RPC and how it works, see this: https://learn.microsoft.com/en-us/troubleshoot/windows-client/networking/rpc-errors-troubleshooting

The RPC server is unavailable (RPC_S_SERVER_UNAVAILABLE) - 0x800706ba / 1722

After Access Denied, this is the other error most often seen when running the configuration wizard. The wizard shows a dialog box stating: The RPC Server is unavailable. 0x800706ba (WIN32: 1722 RPC_S_SERVER_UNAVAILABLE)

The event log entry for this is going to look something like the below:

Log Name: Microsoft-Windows-CertificateServices-Deployment/Operational
Source: Microsoft-Windows-CertificateServices-Deployment
Date: [Date/Time]
Event ID: 104
Level: Error
User: [DOMAIN\USER]
Computer: [NDES Computer Name]
Description: Microsoft.CertificateServices.Deployment.Common.NDES.NetworkDeviceEnrollmentServiceSetupException: Microsoft.CertificateServices.Deployment.Common.NDES.NetworkDeviceEnrollmentServiceSetupException: The Network Device Enrollment Service setup failed because certification authority (CA) "[CA COMPUTERNAME]\CA NAME" could not be contacted. Make sure that the CA is properly configured and available. The error is: The RPC server is unavailable. 0x800706ba (WIN32: 1722 RPC_S_SERVER_UNAVAILABLE)
at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.PowerShellCommandExecutor.Execute(Command command, IPowerShellEngine powerShellEngine, IRehydrator rehydrator)
at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.NDES.NDESPSHProviderContext.Validate()
at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.NDES.Operations.SetCAConfiguration.Execute(CAConfigurationParameters caInformationParameter)
at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.DeploymentWizard.Common.ViewModels.CAConfigurationViewModel.Validate()

This type of error can come from a few different scenarios.
DCOM Permissions / Hardening Mismatch issues

Run the following command from the NDES server, targeting the specific Certification Authority that the NDES server will be a proxy for. If you get back Access Denied, then you have a problem with DCOM permissions.

CA Computer Name of: fab-rt-rootca01.fabrikam.com
CA Name of: Fabrikam Root CA1 G2

CertUtil -Config "fab-rt-rootca01.fabrikam.com\Fabrikam Root CA1 G2" -ping

This could be caused by the DCOM hardening setting being mismatched between the NDES server and the Certification Authority. See the following KB:

KB5004442: Manage changes for Windows DCOM Server Security Feature Bypass (CVE-2021-26414)
https://support.microsoft.com/en-us/topic/kb5004442-manage-changes-for-windows-dcom-server-security-feature-bypass-cve-2021-26414-f1400b52-c141-43d2-941e-37ed901c769c

Ports being blocked by Firewalls

A firewall (hardware based or software based) is preventing RPC / DCOM communications between the NDES server and the server running the Certification Authority service. To see if this is the issue, you can run the following CertUtil.exe command.

CA Computer Name of: fab-rt-rootca01.fabrikam.com
CA Name of: Fabrikam Root CA1 G2

CertUtil -Config "fab-rt-rootca01.fabrikam.com\Fabrikam Root CA1 G2" -ping

When things are correct, you should see output like this:

Connecting to fab-rt-rootca01.fabrikam.com\Fabrikam Root CA1 G2 ...
Server "Fabrikam Root CA1 G2" ICertRequest2 interface is alive (437ms)
CertUtil: -ping command completed successfully.

If this fails with "The RPC Server is unavailable (0x800706ba (WIN32: 1722 RPC_S_SERVER_UNAVAILABLE))", then connectivity from the NDES server to the Certification Authority needs to be investigated. While running the above CertUtil command, collect double-sided network traces. Double-sided network traces means you run a network tracing tool on the NDES server and on the Certification Authority at the same time.
Look in the resulting traces and see whether traffic to the required ports is leaving the NDES server and successfully reaching the Certification Authority server.

Service Control Manager times out waiting for AD CS Service to Stop and Start

As stated earlier, the NDES configuration wizard needs to be able to successfully stop and start the AD CS service on the Certification Authority server. Even if you can stop and start the service, NDES configuration can still fail if the AD CS service cannot be stopped and started within a 30-second window.

NDES stops and starts the service via the Service Control Manager (SCM) APIs. If you have ever attempted to stop or start a service that does not respond quickly, you might have seen a message stating that Service Control Manager cannot tell you whether the service was successfully stopped / started, because it did not report back in a timely fashion. SCM will only wait 30 seconds for the service to return the status of the stop / start command it sent. Once the service takes longer than 30 seconds, SCM stops waiting on it.

NDES first sends the stop command to SCM for AD CS, then uses SCM to find out when the service has successfully stopped. Then it sends the start command to SCM for AD CS and again uses SCM to find out when the service has successfully started.

We typically see this fail in the following two scenarios:

1. The AD CS service uses a Hardware Security Module (HSM), and the AD CS service does not start quickly because it requires the use of Operator Cards or because communication with the HSM is slow.
2. The AD CS service just takes a long time to stop and start. This typically happens because an AD CS auditing setting was enabled on the Certification Authority. The auditing setting is: Start and stop Active Directory Certificate Services.

To turn this auditing setting off:

1. Launch CertSrv.msc
2. Right click on the CA's computer object and select Properties.
3. Click on the Auditing tab.
4. Uncheck "Start and stop Active Directory Certificate Services".
5. Click the OK button.
6. In an elevated command prompt type: Net Stop CertSvc & Net Start CertSvc

Depending on how long the service takes to stop and start with either or both of these issues, the Service Control Manager (SCM) can be configured to wait longer than the default 30 seconds. See this wiki content: Event ID 7011: Service Timeout (TechNet Wiki)

Increase the service timeout period for Service Control Manager (SCM)

The Service Control Manager will generate an event if a service does not respond within the defined timeout period (the default timeout period is 30000 milliseconds). To resolve this problem, use the Registry Editor to change the default timeout value for all services.

To perform this procedure, you must have membership in the Administrators group, or you must have been delegated the appropriate authority.

Caution: Incorrectly editing the registry may severely damage your system. Before making changes to the registry, you should back up any valued data.

To change the service timeout period:

1. Click the Start button, then click Run, type regedit, and click OK.
2. In the Registry Editor, click the registry subkey HKLM\SYSTEM\CurrentControlSet\Control.
3. In the details pane, locate the ServicesPipeTimeout entry, right-click that entry, and then select Modify.
   Note: If the ServicesPipeTimeout entry does not exist, you must create it by selecting New on the Edit menu, followed by DWORD Value, then typing ServicesPipeTimeout and pressing Enter.
4. Click Decimal, enter the new timeout value in milliseconds, and then click OK.
5. Restart the computer.

If you have hit one of these errors, I hope this was helpful in determining what was going on and in resolving the issue for you.
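One addendum on the ServicesPipeTimeout step: the value is entered as a decimal number of milliseconds, so it is easy to be off by a factor of 1000. The helper below is a small sketch for working out a value from how long AD CS was observed to take to stop and start; the function name and the 30-second safety margin are assumptions made for this example, not a recommendation from the wiki article.

```python
DEFAULT_TIMEOUT_MS = 30_000   # SCM default of 30 seconds
DWORD_MAX = 0xFFFFFFFF        # largest value a registry DWORD can hold


def services_pipe_timeout_ms(observed_restart_seconds, margin_seconds=30):
    """Return a decimal millisecond value to enter for ServicesPipeTimeout.

    observed_restart_seconds: how long the service was seen to take.
    margin_seconds: extra headroom (an assumed default, tune as needed).
    """
    if observed_restart_seconds < 0 or margin_seconds < 0:
        raise ValueError("durations must not be negative")
    value = int((observed_restart_seconds + margin_seconds) * 1000)
    value = max(value, DEFAULT_TIMEOUT_MS)  # never go below the default
    if value > DWORD_MAX:
        raise ValueError("value does not fit in a DWORD")
    return value


# Example: AD CS was timed at roughly 75 seconds to come back up.
print(services_pipe_timeout_ms(75))
```

Remember that this setting applies to all services on the machine, so pick the smallest value that covers the slow service.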