Everything Old Is New Again: Hardening the Trust Boundary of VBS Enclaves
Virtualization-Based Security (VBS) enclaves use the hypervisor's virtual trust levels (VTLs) to isolate regions of memory and code execution within a user-mode process. This provides a powerful solution for trusted execution environments (TEEs) that protects sensitive data, like encryption keys, from even malicious administrators. However, it also introduces a new trust boundary: one between the VTL1 enclave and the VTL0 host. This complicates things!

One of the foundational questions in deciding whether data is untrusted is whether that data crosses a trust boundary. Common examples of crossing trust boundaries include a higher privileged process ingesting data from a lower privileged process, a network service receiving packets from the internet, and a word processor opening a file from a USB drive you found in the parking lot. A key difference between those trust boundaries and the one separating an enclave from its host process is that in each of those cases the higher privileged entity is external to the lower privileged one: a kernel driver vs a user-mode process, a network server vs an internet client, a word processor vs a file on a USB drive you found in the parking lot. An enclave, however, exists within its host process, and this new trust boundary is internal to that process. This requires a shift in perspective for the developer, because the enclave cannot trust anything that originates from the host process.

MORSE has partnered closely with teams across Microsoft building VBS enclaves and has collected some lessons learned from this shift in perspective. Since support for third-party enclaves was announced last year, it is important that we highlight this new threat model and its design patterns for the broader developer community. In this blog post, we will present some recommendations that you can follow to help harden your enclave against common vulnerabilities.

Never trust VTL0

The most important thing to remember is that while the host process cannot read or write the enclave's memory region, the converse does not hold true: an enclave can read and write the memory of its VTL0 host process. This can create tricky situations when the enclave operates on pointers passed from the host process to the enclave. There are two guidelines you should always follow when operating on VTL0 data: validate that pointers are actually outside the address range of the VTL1 enclave, and create a copy of parameters' data structures in VTL1 before further validating structure fields.

Validate pointers are in VTL0

An exported enclave function called with the CallEnclave API has a similar type definition to a function invoked by the CreateThread API:

```cpp
LPVOID (WINAPI *PENCLAVE_ROUTINE)(
    LPVOID lpThreadParameter
    );
```

If the host process wants to pass a structure to the enclave routine, this parameter must be a pointer to data in the VTL0 host process's memory region... but there is nothing to enforce that by default. In the analogous trust boundary between kernel and user, there are primitives for checking this (ProbeForRead and ProbeForWrite), but no such primitives exist in the enclave runtime.
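Before looking at what can go wrong, it helps to see how a parameter crosses this boundary in the first place. The following is a minimal host-side (VTL0) sketch of invoking an enclave routine with CallEnclave; it assumes the enclave has already been created, loaded, and initialized, and that getStateRoutine is the resolved address of an exported enclave function (both the routine name and the surrounding setup are illustrative, not part of the example above):

```cpp
#include <windows.h>
#include <cstdio>

// Host-side (VTL0) sketch. Assumes the enclave was already created, loaded,
// and initialized, and that getStateRoutine points at an exported enclave
// routine; both names are hypothetical.
int QueryEnclaveState(PENCLAVE_ROUTINE getStateRoutine)
{
    // This variable lives in the host's VTL0 address space; from the
    // enclave's point of view, the pointer to it is untrusted input.
    int state = 0;
    LPVOID returnValue = nullptr;

    if (!CallEnclave(getStateRoutine, &state, TRUE /* fWaitForThread */, &returnValue))
    {
        printf("CallEnclave failed: %lu\n", GetLastError());
        return -1;
    }

    printf("Enclave routine returned %p, state = %d\n", returnValue, state);
    return state;
}
```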
Consider an example scenario where an enclave holds a secret buffer and the host process can query what the state of the buffer is:

```cpp
enum State
{
    Unallocated,
    Allocated,
    Initialized
};

State g_State = State::Unallocated;
uint8_t* g_SecretData = nullptr;
size_t g_SecretLength = 0;

LPVOID GetState(LPVOID lpParam)
{
    State* state = (State*)lpParam;
    if (state == nullptr)
    {
        return (LPVOID)E_INVALIDARG;
    }
    *state = g_State;
    return (LPVOID)S_OK;
}

LPVOID AllocateBuffer(LPVOID lpParam)
{
    size_t size = (size_t)lpParam;
    g_SecretData = new uint8_t[size];
    if (g_SecretData == nullptr)
    {
        return (LPVOID)E_INVALIDARG;
    }
    g_SecretLength = size;
    g_State = State::Allocated;
    return (LPVOID)S_OK;
}

// Generate secret data in other functions (not displayed)
LPVOID GenerateSecret(LPVOID lpParam) { /* ... */ }
```

A legitimate host will pass a VTL0 pointer to retrieve the state value, but what would happen if you called AllocateBuffer, then called GetState and passed in an address inside the enclave? Let's see what that might look like.

The global variable declarations of g_State, g_SecretData, and g_SecretLength in the code snippet above initialize all three to zero/nullptr. Within the VTL0 address space, the malicious host process allocates a buffer at address 0x00000100`00000000. When the host calls the AllocateBuffer routine with a parameter of 0x20, that function allocates a buffer of length 0x20 and sets g_SecretData to point to that buffer. The function then sets g_SecretLength to the value passed in, 0x20. Finally, the function updates g_State to State::Allocated (1).

Next, the host process calls GetState, but instead of passing the address of a variable in the host process's VTL0 address space, it passes an address at an offset relative to g_SecretData (an address the host can easily calculate based on the exported function addresses). Since the enclave does not validate the parameter, it happily writes the value of g_State over the bytes of g_SecretData. Since g_State is only four bytes wide, it takes a few overlapped writes to fully clear the original value of the g_SecretData pointer. The third function call completes the overwrite, and now g_SecretData points to 0x00000100`00000000. However, g_SecretLength is 0 now, which is a problem. One final write sets g_SecretLength to a usable value again, in this case 0x100, so that any operations relying on the length don't immediately fail because it has been zeroed out. Now that both values have been changed, the host has full control of the secret buffer; it can read it, or it can even modify it if necessary!

To prevent this type of pattern, use the EnclaveGetEnclaveInformation API during your enclave's initialization to determine the bounds of your enclave, then confirm every pointer passed from the VTL0 host process is outside of those bounds before copying data to/from that pointer.
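As a minimal sketch of what that can look like, the enclave can cache its own base address and size once at initialization and reject any parameter range that overlaps it. The helper names, error codes, and exact plumbing below are illustrative rather than canonical; only the EnclaveGetEnclaveInformation call and its BaseAddress/Size fields come from the documented VTL1 runtime API:

```cpp
#include <windows.h>
#include <winenclaveapi.h>

// From the example above.
enum State { Unallocated, Allocated, Initialized };
extern State g_State;

// Enclave bounds, captured once during enclave initialization.
static ULONG_PTR g_EnclaveStart = 0;
static ULONG_PTR g_EnclaveEnd = 0;

// Call once from the enclave's initialization path.
HRESULT CacheEnclaveBounds()
{
    ENCLAVE_INFORMATION info = {};
    HRESULT hr = EnclaveGetEnclaveInformation(sizeof(info), &info);
    if (SUCCEEDED(hr))
    {
        g_EnclaveStart = (ULONG_PTR)info.BaseAddress;
        g_EnclaveEnd = g_EnclaveStart + info.Size;
    }
    return hr;
}

// True only if [address, address + length) lies entirely outside the enclave.
bool IsOutsideEnclave(const void* address, size_t length)
{
    ULONG_PTR start = (ULONG_PTR)address;
    if (g_EnclaveEnd == 0)
    {
        return false; // fail closed if the bounds were never cached
    }
    if (address == nullptr || length == 0 || start + length < start)
    {
        return false; // null, empty, or wrapping range
    }
    ULONG_PTR end = start + length;
    return (end <= g_EnclaveStart) || (start >= g_EnclaveEnd);
}

// GetState hardened with the bounds check: refuse to write through any
// pointer that could alias enclave (VTL1) memory.
LPVOID GetState(LPVOID lpParam)
{
    State* state = (State*)lpParam;
    if (!IsOutsideEnclave(state, sizeof(*state)))
    {
        return (LPVOID)E_INVALIDARG;
    }
    *state = g_State;
    return (LPVOID)S_OK;
}
```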
Capture VTL0 structures in VTL1 before checks

Validating that the function parameter lives in VTL0 is only a piece of the puzzle. If the parameter is a structure, you also need to recursively confirm that every pointer in that structure is in the VTL0 address space, and every pointer in the structures those point to, ad infinitum. This is where developers make the second most common misstep: they do not "capture" the structure in VTL1, which is just a fancy way of saying "copy it". Once you have checked the value of a structure in VTL0, that value is still sitting in VTL0. If the host process is fast enough, it can win the race and change a value with a second thread after the enclave checks a pointer (or another value, like a buffer size!) and before it uses it. This is known as a "time of check, time of use" (TOCTOU) bug.

To avoid this class of bugs, first validate the parameter's address as previously described, then create a local copy of the parameter in VTL1. Take care to ensure any pointers within the captured structure are also validated and captured. After you have done this, you can freely check and use the fields without worrying that they might be changed out from under you!

Avoid reentrancy if possible

The host process can call the enclave's exported functions using the CallEnclave API, but the enclave can also use CallEnclave to call a function in VTL0. A classic use case is when the host calls a function in the enclave that will generate an unknown amount of data; the enclave can invoke a callback in VTL0 that allocates the required space in VTL0, returning that address to VTL1 so that the enclave can copy the data out. However, there is no prohibition on that VTL0 callback calling back into VTL1; this is known as "reentrancy," and this pattern can often be abused to create use-after-free conditions and other types of bugs.

The obvious solution here is to use synchronization primitives – enclaves support the commonly used CRITICAL_SECTION locks – but a subtle problem arises. When the host calls an entry point in the enclave, the Secure Kernel (SK) assigns execution to a VTL1 thread. If, in the process of executing that call, the enclave calls back out to VTL0, execution continues on the original VTL0 thread. If that host thread calls another entry point in the enclave before returning from the first enclave call, the SK will assign execution to the same VTL1 thread again... and there's the rub.

Consider a scenario that is possible with CRITICAL_SECTION locks: they allow recursive locking, which means that if a thread calls EnterCriticalSection a second time before LeaveCriticalSection, that call will succeed. This can lead to situations where one enclave routine is operating on data it thinks is locked while a second enclave routine is deleting or changing it out from under the first. Thus, a developer writing an enclave that uses CRITICAL_SECTION locks must take great care to either avoid reentrancy altogether or establish checks for reentrancy and respond accordingly.

If you absolutely must call back into VTL0 during an enclave routine, the best choice you can make is to use the other primitive supported by enclaves: SRW locks. SRW locks cannot be acquired recursively, which prevents a second enclave routine from modifying data out from under the first (assuming you have correctly protected your data with the locks, but that is true of all multi-threaded applications).
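A minimal sketch of the idea, with illustrative names, is shown below. It assumes the Try variant of the SRW API is available in your enclave's runtime; plain AcquireSRWLockExclusive behaves the same way, except that a reentrant acquisition blocks rather than failing:

```cpp
#include <windows.h>

// Protects g_SecretData / g_SecretLength from the earlier example.
static SRWLOCK g_SecretLock = SRWLOCK_INIT;

LPVOID RegenerateSecret(LPVOID lpParam)
{
    // SRW locks are not recursive: if this routine calls back into VTL0 and
    // the host re-enters the enclave on the same thread, the second routine
    // cannot silently acquire the lock the first routine is still holding.
    if (!TryAcquireSRWLockExclusive(&g_SecretLock))
    {
        return (LPVOID)E_ABORT; // illustrative "busy / reentered" error
    }

    // ... safely free and reallocate g_SecretData, update g_SecretLength ...

    ReleaseSRWLockExclusive(&g_SecretLock);
    return (LPVOID)S_OK;
}
```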
Keep secrets in the enclave

The whole purpose of a trusted execution environment like an enclave is to prevent anyone from ever accessing secrets like encryption keys. Creating this sensitive information in the clear within the untrusted host process and feeding it to the enclave introduces a point of failure that can be exploited to reveal the secret to untrusted actors on the system. Anything you wish for your enclave to protect should always be generated within the confines of the enclave, and it should never leave the enclave unless securely communicated to a trusted entity. A good example of this is the SQL Always Encrypted enclave: encryption keys are generated in the enclave, and sensitive database contents are only passed to and from the trusted client through an encrypted channel that the enclave has securely negotiated with the client.

It is important to remember that any process can load your enclave, and your enclave cannot tell what process loaded it. Imagine a scenario in which you have an enclave that persists an encryption key in an encrypted file on disk, and your host process needs to provision the encryption key when it does not exist. If the host process handles generating that key and passing it to the enclave, what happens if an attacker gets there first? When an attacker's untrusted process loads the enclave and passes it an encryption key they control to encrypt and write to disk, your legitimate host process (and its loaded enclave) will not be able to tell the difference. All data encrypted by the enclave with that key is then fully compromised, because the attacker already knows the secret key that is protecting it. If the enclave itself generates all secret data, then it matters much less whether an attacker tries to provision this data first, because the attacker cannot know the secrets.

Similarly, your enclave should not release secrets outside of a secure channel negotiated with trusted parties. If it does, you might have created the conditions for an oracle attack. If you do have sensitive data that your enclave needs to release to a remote trusted party, take care not to use a negotiation protocol that is vulnerable to an interception attack. For example, a Diffie-Hellman key exchange is not an authenticated protocol and allows an attacker to insert themselves into the exchange. Instead, consider using enclave attestation reports in conjunction with Azure Attestation or Host Guardian Service, which can verify that the enclave's system is healthy and can be trusted. Through this trusted service and attestation report, the enclave can provide a public key that the trusted party can use to encrypt a session key, which can then be used to provision the enclave for future secure communications.

Don't reinvent the wheel

The runtime in enclaves is quite limited compared to a standard user-mode process or even a legacy VTL1 trustlet, so C is the only officially supported language for developing enclaves. This can cause some friction: if a developer wants to use safe coding patterns, like something as simple as a bounds-checking array, they need to "roll their own"... and rolling your own anything can be error-prone and dangerous. Options here are limited; official C++ build support may happen at some point but doesn't yet exist. However, that does not have to stop you! With a little bit of effort and configuration, some C++ standard library features can still be compiled in the enclave environment. Additionally, some of the Windows Implementation Library RAII wrappers can be used once any linking errors are solved through stubbing. If you limit your modifications to only fixing linker errors, you can take advantage of safer containers in C++, rather than reimplementing them yourself.
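As a small illustration of the kind of convenience this buys – assuming you have already worked through the relevant linker errors and are avoiding features such as exceptions that the enclave environment may not support – a fixed-size standard container with an explicit bounds check beats a hand-rolled pointer/length pair:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// A fixed-size secret buffer with an explicit bounds check, instead of a raw
// pointer and length managed by hand. The 64-byte size is illustrative.
static std::array<uint8_t, 64> g_Secret = {};

bool ReadSecretByte(size_t index, uint8_t* out)
{
    if (out == nullptr || index >= g_Secret.size())
    {
        return false; // refuse out-of-bounds access instead of reading past the end
    }
    *out = g_Secret[index];
    return true;
}
```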
But, if we're already delving into the murky waters of "unsupported languages," we might suggest implementing your enclave in Rust. During a recent MORSE hackathon, we built a simple proof-of-concept enclave in Rust. If you tightly constrain any unsafe behavior to a limited amount of glue code, the Rust language brings the borrow checker to bear for added memory safety.

Conclusion

VBS enclaves are a great way to protect your sensitive data from even highly privileged actors, but as we've seen here, there are a lot of ways to step on rakes. Not only are there common errors that resemble those found in traditional trust boundaries, but there are also some new ones that can be subtle. By following the recommendations we've outlined in this article, you can harden your enclave against the common vulnerability patterns we have seen in our reviews!

Evolving the Windows User Model – Introducing Administrator Protection
Previously, in part one, we outlined the history of the multi-user model in Windows, how Microsoft introduced features to secure it, and in what ways we got it right (and wrong). In the final part of this series, we will describe how Microsoft intends to raise the security bar via its new Administrator protection (AP) feature.

Core Principles for Administrator Protection

As the main priority, Administrator protection aims to provide a strong security boundary between elevated and non-elevated user contexts. There are several additional usability goals that we will cover later, but for security, Administrator protection can be summarized by the following five principles:

1. Users operate within the Principle of Least Privilege
2. Administrator privileges only persist for the duration of the task for which they were invoked
3. Strong separation between elevated and non-elevated user accounts, except for paths of intentional access
4. Elevation actions must be explicit (e.g. no silent elevations)
5. A more granular use of elevated privileges by applications, rather than the "up-front" elevation practice common in User Account Control (UAC)

Principles two and three represent major changes to the existing design of the Windows user model, while principles one and four are aimed, respectively, at fulfilling the promises of previous features (standard user and, to a lesser extent, UAC) and at rolling back changes which degraded security (auto-elevation).

What Does Administrator Protection Fix and How?

Administrator protection is nearly as much about what it removes as what it adds. Recall that, beginning with Windows Vista, the split-token administrator user type was added to allow a user to run as both standard user and administrator depending on the level of privilege required for a specific task. It was originally intended to make standard user more viable for widespread adoption and to enforce the Principle of Least Privilege. However, the features did not fully live up to expectations, as UAC bypasses were numerous following the release of Windows 7. As a refresher: when a user was configured as a split-token admin, they would receive two access tokens upon logon – a full-privilege, "elevated" administrator token with the Administrators group enabled, and a restricted, "unelevated" access token with the Administrators group marked deny-only. Depending on the required run level of an application, one token or the other would be used to create the process.

Administrator protection changes the paradigm via System Managed Administrator Accounts (SMAA) – a local administrator account that is linked to a specific standard user account. Upon elevation, if a SMAA does not already exist, it is created. Each SMAA is a separate user profile and a member of the Administrators group. It is a local account named via the following scheme, with extra digits used in the unlikely event of a collision:

Local account: WIN-ABC123\BobFoo
SMAA: WIN-ABC123\admin_BobFoo

Or, on collision:

Local account: WIN-ABC123\BobFoo (the account to be SMAA-linked)
Local account: WIN-ABC123\admin_BobFoo (another standard user account, oddly named)
SMAA: WIN-ABC123\admin1_BobFoo

Similarly, for domain accounts, the scheme remains the same, except the SMAA will still be a local account:

Domain account: Redmond\BobFoo
SMAA: WIN-ABC123\admin_BobFoo

To ensure these accounts can't be abused, they are created as password-less accounts with additional logon restrictions to ensure only specific SYSTEM processes are permitted to log on as the SMAA.
Specifically, following an elevation request, a logon request is made via the Local Security Authority (LSA), and the following conditions are checked:

1. Access check. Call NtAccessCheck, including both an ACE for the SYSTEM account and a SYSTEM IL mandatory ACE with no read up, no write up, and no execute up. The access check must pass.
2. Process path. Call NtOpenProcess with the caller's PID to obtain a process handle, then check the process image path via QueryFullProcessImageName. Compare the path against the hardcoded allow-list of binaries that are allowed to log on SMAA accounts.

The astute reader may notice that process path checks are not enforceable security boundaries in Windows; rather, the check is a defense-in-depth measure to prevent SYSTEM processes such as WinLogon or RDP from exposing SMAA logon surface to the user. In fact, Process Environment Block (PEB) spoofing was a class of UAC bypass in which a trusted image path was faked by a malicious process. However, in this case the PEB is not queried; instead, the kernel EPROCESS object is used to query the image path. As such, the process path check will be used alongside an allowlist to prevent current and future system components from misusing SMAA.

Splitting the Hive

A major design compromise made with the split-token administrator model was that both "halves" of the user shared a common profile. Despite each token being appropriately restricted in its use, both restricted and admin-level processes could access shared resources such as the user file system and the registry. As such, improper access restrictions on a given file or registry key would allow a restricted user to influence a privileged process. In fact, improper access controls on shared resources were the source of many classic UAC bypasses.

As an example, when the Event Viewer application, "eventvwr.exe", attempts to launch "mmc.exe" as a High Integrity Level (IL) process, it searches two registry locations to find the executable path (1):

HKCU\Software\Classes\mscfile\shell\open\command
HKCR\mscfile\shell\open\command

In most circumstances, the first registry location does not exist, so the second is used to launch the process. However, an unprivileged process running within the restricted user context can create the missing key; this would then allow the attacker to run any executable it wished at High IL. As a bonus for the attacker, this attack was silent: Event Viewer is a trusted Windows application and allows for "auto-elevation", meaning no UAC prompt would be displayed.

```powershell
$registryPath = "HKCU:\Software\Classes\mscfile\shell\open\command"
$newValue = "C:\Windows\System32\cmd.exe"

# Check if the registry key exists
if (-not (Test-Path $registryPath)) {
    # Create the registry key if it doesn't exist
    New-Item -Path "HKCU:\Software\Classes\mscfile\shell\open" -Name "command" -Force | Out-Null
}

# Set the registry value
Set-ItemProperty -Path $registryPath -Name "(default)" -Value $newValue

# Run mmc.exe to auto-elevate cmd.exe
Start-Process "mmc.exe"
```

Similarly, the Windows Task Scheduler – which configures processes to run periodically – could be exploited to run arbitrary commands or executables in an elevated context. These attacks used writable local environment variables to overload system variables such as %WINDIR%, allowing an attacker to execute arbitrary applications with elevated privileges – with SilentCleanup being a particular favorite (2).
Such attacks were attractive as an unprivileged process could also trigger the scheduled task to run at any time.

```powershell
New-ItemProperty -Path "HKCU:\Environment" -Name "windir" -Value "cmd.exe /k whoami & " -PropertyType ExpandString
schtasks.exe /Run /TN \Microsoft\Windows\DiskCleanup\SilentCleanup /I
```

Because the standard user and the SMAA are separate-but-linked accounts, each with its own profile, registry hives are no longer shared. Thus, classic UAC bypasses such as the registry key manipulation and environment variable overloading attacks above (like many things in Windows, environment variables are backed by the registry) are mitigated. As an added benefit, administrator tokens can now be created on demand and discarded just as quickly, limiting exposure of the privileged token to the lifetime of the requesting process.

Rolling Back Auto-Elevations

When auto-elevation was added in Windows 7, it was primarily done to improve the user experience and allow simpler administration of a Windows machine. Unfortunately, despite several restrictions placed on applications allowed to auto-elevate, the feature introduced a huge hole in the Windows security model and opened a number of new avenues for UAC bypass. Most prevalent of these bypasses were those which exploited the auto-elevating COM interface IFileOperation. Attackers would leverage this interface to write malicious DLLs to secure locations – a so-called "DLL hijacking" attack. The attack would work whenever a process met all of the conditions for auto-elevation but ran at the Medium Integrity Level (IL). The malicious process would inject code into the target process and request the DLL payload be written to a secure path via IFileOperation. Whenever the DLL was loaded by an elevated process, the malicious code would be run, giving the attacker full privileges on the system.

With Administrator protection, auto-elevation is removed. Users will notice an increase in consent prompts, though many fewer than in the Vista days, as much work has been done to clean up elevation points in most workflows. Additionally, users and administrators will have the option to configure elevation prompts as "credentialed" (biometric/password/PIN) via Windows Hello or as simple confirmation prompts. This change trades some user convenience for a reduction in attack surface of roughly 92 auto-elevating COM interfaces, 11 DLL hijacks, and 23 auto-elevating apps. Of the 79 known UAC bypasses tested, all but one are now fully or partially mitigated. The remaining open issue around token manipulation attacks has been assigned MSRC cases and will be addressed.

It should be noted that not all auto-elevations have been removed. Namely, the Run and RunOnce registry keys found in the HKEY_LOCAL_MACHINE hive will still auto-elevate as needed. Appropriately, these keys are ACL'd such that only an administrator can modify them.

Improving Usability

Administrator protection is not limited to security-focused changes – improved usability is also a major focal point of the feature. Chief amongst the areas targeted for improvement is the removal of unnecessary elevations and "dead ends". Dead ends occur when a functional pathway which requires administrator privileges does not account for a user operating as a standard user and thus presents no elevation path at all, resulting in the user interface either displaying the setting as disabled or not displaying it at all.
In such cases, a so-called "over-the-shoulder" elevation is required – the same underlying mechanism used when elevating to the SMAA user in AP. Such scenarios represent a huge inconvenience for non-administrator accounts in both AP and non-AP configurations. One example of this scenario was the group policy editor (gpedit.exe): when launched as a standard user, an error prompt would be displayed, and the app would be launched in an unusable state.

More Work To Be Done

Administrator protection represents a huge jump in the security of the Windows OS. However, as always, there is more work to be done. While AP has mitigated large classes of vulnerabilities, some remain, albeit in a diminished state.

DLL hijacking attacks prior to AP primarily relied on abusing the auto-elevating IFileOperation COM interface to write a malicious DLL to a secure path. As auto-elevation has been removed, this path no longer exists. However, situations where an unsigned DLL is loaded from an insecure path still represent a potential AP bypass. Note that the user will still be prompted for elevation in such a scenario but may not be aware that a malicious DLL is being included in the process.

Token manipulation bypasses, such as those shown by James Forshaw and splinter_code, remain a class of potential exploitation. Elevation prompts are shown only before creation of an elevated token, not before its use. Therefore, should additional pathways be discovered where an elevated token can be obtained by a malicious process, AP would not be positioned to stop it from silently elevating. However, MSRC cases for known variants of token manipulation/reuse attacks have been filed, and fixes are currently in development.

Lastly, attacks which rely on obtaining a UIAccess capability from another running process are partially mitigated by AP. Previously, UAC bypass attacks would launch an auto-elevating app, such as mmc.exe, and then obtain a UIAccess-enabled token — a token which gives a lower-privileged process the ability to manipulate the UI of a higher-privileged process, typically used for accessibility features. With AP enabled, all attempts to launch an elevated process are met with a consent prompt, which an attacker would be unable to manipulate with a UIAccess token alone. However, in situations where a user has previously elevated a running process, an attacker would be able to obtain a UIAccess token and manipulate the UI with no additional consent prompts.

This list is not exhaustive; it is likely that edge cases will pop up which will require attention. Fortunately, Administrator protection is covered by the Windows Insider Bug Bounty Program, and internal efforts by MORSE and others will continue to identify remaining issues.

A Welcome Security Boundary

We in MORSE review quite a few features in Windows and are big fans of Administrator protection. It addresses many gaps left by UAC today and adds protections which, for all intents and purposes, simply did not exist before. The feature is far from complete, usability improvements are needed, and there are some remaining bugs which will take time to resolve. However, the short-term inconvenience is worth the long-term security benefit to users. While Administrator protection will certainly experience some growing pains, even in its current state it's a leap forward for user security. Going forward, we encourage users who prioritize strong security to give Administrator protection a try. If you encounter an issue, send us feedback using the feedback tool. Lastly, for app developers, we ask that they update their applications to support Administrator protection, as it will eventually become the default option in Windows.
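For developers making that transition, a useful pattern is to check at runtime whether the process is already elevated and to request elevation only for the specific operation that needs it, rather than assuming up-front elevation. The sketch below uses standard Win32 token and shell APIs; the helper executable name is hypothetical, and how you factor the privileged operation out of your application will differ:

```cpp
#include <windows.h>
#include <shellapi.h>

// Returns true if the current process token is elevated.
bool IsProcessElevated()
{
    HANDLE token = nullptr;
    TOKEN_ELEVATION elevation = {};
    DWORD returnedLength = 0;
    bool elevated = false;

    if (OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &token))
    {
        if (GetTokenInformation(token, TokenElevation, &elevation,
                                sizeof(elevation), &returnedLength))
        {
            elevated = (elevation.TokenIsElevated != 0);
        }
        CloseHandle(token);
    }
    return elevated;
}

// Launch a narrowly scoped helper with the "runas" verb so the user sees an
// elevation prompt only for this task, not for the whole application.
bool RunAdminTaskElevated()
{
    SHELLEXECUTEINFOW sei = {};
    sei.cbSize = sizeof(sei);
    sei.lpVerb = L"runas";
    sei.lpFile = L"MyAppAdminTask.exe"; // hypothetical helper executable
    sei.nShow  = SW_NORMAL;
    return ShellExecuteExW(&sei) != FALSE;
}
```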
References

1. UAC Bypass – Event Viewer – Penetration Testing Lab
2. Tyranid's Lair: Exploiting Environment Variables in Scheduled Tasks for UAC Bypass
3. Tyranid's Lair: Bypassing UAC in the most Complex Way Possible!
4. Bypassing UAC with SSPI Datagram Contexts
5. Administrator protection on Windows 11 | Microsoft Community Hub

A new, modern, and secure print experience from Windows
Over the past year, the MORSE team has been working in collaboration with the Windows Print team to modernize the Windows Print System. This new design, called Windows Protected Print, greatly enhances user security.