A couple of years ago I reverse-engineered a critical security bug in the VMware ESXi vmxnet3 virtual network adapter (CVE-2018-6981) that was leveraged in a full guest-to-host VM escape exploit demonstrated at the GeekPwn 2018 competition. I published the results on Twitter and GitHub and, since there was no way to know at the time whether I had identified the right bug, later confirmed them against the official bug details. This blog article is a quick wrap-up of that research project.
In mid-2019 I noticed that VMware ESXi, probably the world's most popular corporate-grade hypervisor (perhaps better known by its full product name, VMware vSphere), is suspiciously low on known security bugs. While security issues in VMware Workstation show up regularly in the VMware Security Advisories and at security conferences, similar issues in ESXi are significantly rarer [1]. In fact, given the near-total absence of vulnerability research writeups or whitepapers targeting it, ESXi seemed like forbidden territory that saw no security research scrutiny whatsoever. To an experienced security researcher, a situation like this looks like a big red flag.
After some OSINT research I found only one reputable mention of a serious (and proven exploitable) security issue in VMware ESXi: just a couple of months before then, a full guest-to-host escape from a VMware ESXi virtual machine (a zero-day exploit) was publicly demonstrated at GeekPwn 2018, the Chinese counterpart of Pwn2Own. However, no details of the bug were available, aside from a brief non-technical mention in the VMware security advisory crediting Zhangyanyu of Chaitin Tech. I decided to spare a few nights and investigate the vulnerability myself, based on the binary security patches that the vendor had already rolled out.
The project was an intriguing one for me, because at that moment I had no knowledge of ESXi system internals, no research platform, no debugging capabilities, and a very limited understanding of virtual hardware in general. However, I had solid skills in analyzing binary security patches, which make it possible to recover the details of a patched vulnerability and create a proof-of-concept or exploit when no technical details of the bug are available. The combination of "something that you are good at", "something that you want to learn" and "something novel and intriguing" is an ideal combo that drives you to learn new things fast. So I jumped in, analyzed the patch, and created a proof of concept that I later published on my GitHub.
Meanwhile, the original vulnerability finder and his team decided to publish a whitepaper about their exploit, which was presented at the Usenix WOOT 2019 conference. You can find the paper here. You'll see that the bug details they chose to disclose are quite scarce: they don't discuss the virtual device protocol or how to trigger the bug, and they never published the exploit. The proof-of-concept that I published remains a unique connection between the issue and exploit development workflows.
The process of recovering vulnerability details and performing root cause analysis based on a binary patch is well understood, though its difficulty is somewhat underestimated. In fact, in many security patches for mainstream software products, a bug's details are simply impossible to identify unless you know exactly what you are looking for.
For complex systems such as hypervisors the workflow is quite involved, and consists of several steps that require completely different skills and sets of knowledge. This is the procedure I followed for the bug in question:
0. Collect all the information that is publicly available. This is important. Even tiny details such as the class of bug (an uninitialized variable in this case) may be critical for the success of the project.
1. Inspect the binary security patch. Identify the affected subsystem and the insecurity pattern in the low-level code. Input from step (0) is critical here.
2. Reverse-engineer (partially) the target software/subsystem. The goal is to understand the abstract model, input flows, and potential attack vectors.
3. Analyze the binary patch again. Now, with (2), you can make sense of the security issue in the high-level context of the target subsystem (the virtual networking adapter in this case). For example, get a general idea of which device commands could trigger the vulnerable code branch from inside the guest OS.
4. See if you can find some external code that uses the affected subsystem and may help you reach the vulnerable code with inputs. Map the code to the abstract model of the subsystem (2) and the security issue (3). If such code exists, it can serve as the base for rapid prototyping of the PoC or as a testing harness. If no ready code is found, you'll have to write it yourself.
5. Now the goal is to figure out exactly which malformed input you need in order to trigger the bug, and exactly how (by which interface, protocol, timing) it must be supplied to the target subsystem (4).
6. Put everything together, write some code, set up debuggers and sanitizers, and test it. Now you have a PoC testcase (usually it demonstrates a crash of the target system, though in some cases the malicious effect can be observed only in a debugger) which can be further developed into an exploit.
The important thing about this workflow is that it uses a lot of generic "connecting the dots" cognitive processes, so being attentive here is just as important as being technical. If you miss just one bit of information, any of the next steps may fail. For example, for this bug the "uninitialized variable" bit turned out to be immensely helpful (if necessary) for identifying the bug in the patch relatively quickly.
It may be interesting to note that I used both Diaphora and BinDiff for this workflow, the two most usable open source binary diffing tools available today. This is a common approach, as the two tools tend to produce very different output. However, neither of them was able to solve the task right away, so I ended up using custom in-lab heuristics to quickly pin down the culprit code.
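The in-lab heuristics themselves are not described here. As a generic illustration of how a known bug class can narrow down a diff, the following minimal sketch ranks functions that changed between two builds (small changes first) and lists the instructions that exist only in the patched version. Everything in it is an assumption made for illustration: the input format (a function name mapped to a list of normalized instruction strings), the function names, and the sample instructions, none of which come from the actual ESXi patch.

```python
from difflib import SequenceMatcher


def changed_functions(old, new):
    """Return (name, similarity) pairs for functions whose code differs.

    `old` and `new` map a function name to a list of normalized instruction
    strings. This export format is hypothetical; a disassembler script
    would have to produce it.
    """
    results = []
    for name in old.keys() & new.keys():
        if old[name] != new[name]:
            matcher = SequenceMatcher(None, old[name], new[name], autojunk=False)
            results.append((name, matcher.ratio()))
    # Security fixes are often small, so list the most similar functions first.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


def added_instructions(old_insns, new_insns):
    """List instructions that appear only in the patched version, in order."""
    matcher = SequenceMatcher(None, old_insns, new_insns, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(new_insns[j1:j2])
    return added


# Made-up sample data: the patched build zeroes a stack slot before a call.
OLD = {
    "handle_cmd": ["push rbp", "call map_guest", "mov rax, [rbp-8]", "ret"],
    "unrelated": ["xor eax, eax", "ret"],
}
NEW = {
    "handle_cmd": ["push rbp", "mov qword [rbp-8], 0", "call map_guest",
                   "mov rax, [rbp-8]", "ret"],
    "unrelated": ["xor eax, eax", "ret"],
}

for name, similarity in changed_functions(OLD, NEW):
    print(name, round(similarity, 2), added_instructions(OLD[name], NEW[name]))
```

With the "uninitialized variable" hint in hand, a newly added store that initializes a local variable, like the one in the sample data, is the kind of change worth inspecting first.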
VMware ESXi is based on a custom operating system named VMkernel. The bug in question, however, resides not in the VMkernel per se, but in the additional software layer that implements virtualized devices (attack surface #2 in the custom taxonomy of my Hypervisor Threat Model, greatly simplified in this picture).
VMXNET3 is a synthetic virtual PCI device (a virtual network adapter) developed by VMware for use in its virtualization products. Technically it is an emulated device: although no real prototype of it exists in the physical world, it still uses emulated hardware I/O and an emulated PCI bus, and thus cannot be classified as a paravirtualized device (which is how VMware publications position it). In VMware ESXi the VMXNET3 adapter is the default, while in Workstation it can be enabled with undocumented configuration properties. The code overlaps, which makes it possible to analyze the issue based on Workstation binaries.
VMXNET3 uses a relatively simple hardware I/O protocol that involves MMIO to two PCI BARs and some DMA, which is similar to the common I/O protocols of real networking hardware. The security issue lies in the fact that the virtual device code failed to properly handle emulation of access to an invalid DMA address: a variable was left uninitialized, which, due to internal code logic, could further lead to memory corruption [2].
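The general shape of this bug class can be modeled with a toy sketch. It is not the ESXi code: the helper `map_guest_address`, both handler functions, the memory size and the `STALE` marker are hypothetical names and values chosen for illustration. Because Python has no uninitialized variables, the leftover contents of the variable are modeled explicitly.

```python
GUEST_RAM_SIZE = 0x1000  # toy guest memory size (assumption)
STALE = 0xDEADBEEF       # stands in for whatever the variable held before


def map_guest_address(guest_pa, length):
    """Hypothetical helper: translate a guest-physical range for DMA.

    Returns an offset into guest RAM, or None if the range is not valid.
    """
    if guest_pa < 0 or length < 0 or guest_pa + length > GUEST_RAM_SIZE:
        return None
    return guest_pa


def handler_vulnerable(guest_pa, length, leftover=STALE):
    # `host_ptr` starts out holding leftover data, like an uninitialized C local.
    host_ptr = leftover
    mapped = map_guest_address(guest_pa, length)
    if mapped is not None:
        host_ptr = mapped
    # Bug pattern: the failure case is not handled, so the leftover value
    # is returned and would be used as if it were a valid pointer.
    return host_ptr


def handler_fixed(guest_pa, length):
    mapped = map_guest_address(guest_pa, length)
    if mapped is None:
        return None  # reject the request instead of touching memory
    return mapped


print(hex(handler_vulnerable(0x100, 16)))       # valid address: mapped normally
print(hex(handler_vulnerable(0xFFFF0000, 16)))  # invalid address: leftover value
print(handler_fixed(0xFFFF0000, 16))            # invalid address: rejected
```

The fixed variant shows only one plausible shape of a fix (checking the mapping result); whether the actual patch does this or simply initializes the variable is not something this sketch claims.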
Triggering the bug consists of three distinct steps:
0. Put the device into the operational state by completing initialization and setup. In my proof-of-concept this step is done by the original VMware vmxnet3 kernel mode driver (included in the Linux kernel).
1. Supply a malformed DMA address of the shared memory region to one of the registers of the virtual device model. It is saved in the device's internal state.
2. Send a special device command, according to the device protocol, via one of the PCI MMIO BARs. The command executes the code branch that tries to use the saved DMA address.
This leads to an access violation in most tests. I did not pursue further development or weaponization of this proof-of-concept into a full exploit.
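The three-step sequence can be illustrated against a toy device model. This is a simplified sketch, not the vmxnet3 protocol: the register offsets, command values, class names and the single-command activation are all made up, and the chain from uninitialized variable to crash is collapsed into one simulated fault.

```python
class AccessViolation(Exception):
    """Stands in for the host-side crash observed with the PoC."""


class ToyDevice:
    # Register offsets and command values are invented for this sketch.
    REG_SHARED_LO = 0x00
    REG_SHARED_HI = 0x04
    REG_CMD = 0x08
    CMD_ACTIVATE = 0x1
    CMD_USE_SHARED = 0x2

    def __init__(self, guest_ram_size):
        self.guest_ram_size = guest_ram_size
        self.active = False
        self.shared_pa = 0  # saved without validation, as in step 1

    def mmio_write(self, offset, value):
        value &= 0xFFFFFFFF  # registers are modeled as 32 bits wide
        if offset == self.REG_SHARED_LO:
            self.shared_pa = (self.shared_pa & ~0xFFFFFFFF) | value
        elif offset == self.REG_SHARED_HI:
            self.shared_pa = (self.shared_pa & 0xFFFFFFFF) | (value << 32)
        elif offset == self.REG_CMD:
            self._command(value)

    def _command(self, cmd):
        if cmd == self.CMD_ACTIVATE:
            self.active = True
        elif cmd == self.CMD_USE_SHARED and self.active:
            # The saved address is only examined here, when it is used.
            if self.shared_pa >= self.guest_ram_size:
                raise AccessViolation(hex(self.shared_pa))


def trigger(dev, bad_pa):
    dev.mmio_write(ToyDevice.REG_CMD, ToyDevice.CMD_ACTIVATE)     # step 0
    dev.mmio_write(ToyDevice.REG_SHARED_LO, bad_pa & 0xFFFFFFFF)  # step 1
    dev.mmio_write(ToyDevice.REG_SHARED_HI, bad_pa >> 32)
    dev.mmio_write(ToyDevice.REG_CMD, ToyDevice.CMD_USE_SHARED)   # step 2


try:
    trigger(ToyDevice(guest_ram_size=0x1000), 0xFFFFFFFF00000000)
except AccessViolation as fault:
    print("simulated host fault at", fault)
```

The point of the model is the ordering: the bad address is accepted silently in step 1, and nothing happens until a later command makes the device use it. In the real proof-of-concept, step 0 is a full driver initialization rather than a single command.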
The proof-of-concept code is here. Apply the patch against the vmxnet3 driver and build it.
An astute reader of technical vulnerability blogs may notice a curious similarity of this security issue to another critical bug that affected VMware Workstation.
Just as "general fuzzing" is not enough to find good bugs in complex systems, "general binary diffing" is merely a small (if still essential) part of the workflow for analyzing vulnerabilities in complex systems based on binary patches. In fact, the most impactful, exclusive and innovative work is always done by the human mind, and cannot be replaced by automation.
Virtualized devices in hypervisors remain a major source of security issues. Now that paravirtualization is becoming the de facto standard for implementing virtual devices, the attack surface will be somewhat less uniform across the industry, but the general trend is unchanged.
Deep technical knowledge about past bugs in your target class of software is a direct path to finding new bugs.
I discuss this vulnerability in complete detail and technical context in the training "Hypervisor Vulnerability Research". A full set of exercises is included, which follows my workflow for inspecting the bug, from patch to proof-of-concept.
If you are interested in VMware ESXi specifically, the mini-class "Deep Dive ESXi OpenSLP heap overflow (CVE-2019-5544)" may also be worthy of consideration.
VMware ESXi is one of my research interests. If you have an idea of a research or collab project, you can reach me at: firstname.lastname@example.org.
[1] If you take a closer look at VMware security advisories, you'll notice that some vulnerabilities in VMware Workstation also affect ESXi. However, in most of these advisories, the bug either affects a non-default configuration (Workstation and ESXi have different defaults) or is an unexploitable issue such as a DoS.
[2] Uninitialized variable bugs are not usually exploitable for RCE and are primarily useful for infoleaks, though in some rare cases an uninitialized variable issue can lead to memory corruption.