Zero Day Engineering Insights

Deep Dive: Qualcomm MSM Linux Kernel & ARM Mali GPU 0-day Exploit Attacks of October 2023

December 13th, 2023 - by Alisa Esage

Overview

This deep technical note briefly covers five kernel vulnerabilities in Qualcomm chipsets and the ARM Mali GPU, which were added to the CISA Known Exploited Vulnerabilities Catalog between October and December 2023. All of the bugs were reported as exploited "in the wild" by Google TAG and Project Zero trackers in October 2023, and they were sometimes covered together in security news. At the time of writing, exploit details have not been publicly disclosed. The respective vendors released security patches and advisories, which include some general information about the underlying vulnerabilities, such as the vulnerable product and bug class. With no access to the original 0-day exploit code, I looked at the patches to derive additional information about the vulnerabilities, in order to clarify the practical impact of exploitation and to inform offensive research based on 0-day trends.

Summary
ID | Vendor | Product | Component | Code target | Bug class | Patch
CVE-2023-33063 | Qualcomm | MSM kernel | DSP Services | aDSP FastRPC DMA memory management | Use-after-free | msm-5.15, msm-4.14
CVE-2023-33106 | Qualcomm | MSM kernel | Adreno GPU driver | KGSL high-level IOCTL logic | Memory Corruption due to Unsanitized Input | msm-4.19
CVE-2023-33107 | Qualcomm | MSM kernel | Adreno GPU driver | KGSL IOMMU SVM memory management | Memory Corruption due to Integer Overflow | msm-4.19
CVE-2022-22071 | Qualcomm | MSM kernel | DSP Services | aDSP driver memory management | Use-after-free | msm-5.4
CVE-2023-4211 | ARM | Mali GPU | Kernel software driver | IOCTL management of driver memory | Memory corruption due to Unsanitized Input(?) | n/a

Table 1. Summary of vulnerabilities in discussion, based on my independent technical analysis

Background

Smartphones based on Qualcomm chipsets use a custom version of the Linux kernel known as MSM, which is where the majority of the bugs under consideration reside. Most MSM-specific additions to the Linux kernel implement hardware support for Qualcomm-specific functionality: kernel device drivers for on-SoC microdevices, higher-level OS kernel components, and communication interfaces. Take a look at the technical brief (PDF) for the Snapdragon 801, one of the many Snapdragon SoC models manufactured by Qualcomm. This particular one powers NASA's Ingenuity helicopter, which landed on Mars on February 18, 2021.

Source: Snapdragon 801 Processor Product Brief (Qualcomm)

The SoC (System on Chip) board here holds only 9 components: aside from the (obviously) CPU, there is an Adreno GPU, a Hexagon DSP (Digital Signal Processor), an integrated wireless connectivity chip, and sensor, camera, satellite support, display and multimedia chips. Each of these components is a more or less self-contained hardware device that requires a specialized kernel driver in the OS. In addition, certain components on a Snapdragon SoC typically run a separate RTOS (Real Time Operating System), which requires dedicated communication interfaces with the HLOS (High Level Operating System, such as AOSP) for offloading specialized workflows to DSPs and reading back the results (more on that later). MSM is also in charge of supporting such communication channels on the HLOS side - such as QMI.

A typical modern smartphone-grade Snapdragon SoC is somewhat more complex, with more CPU cores, many specialized hardware interfaces, a higher clock rate, 64-bit rather than 32-bit cores, a more developed wireless connectivity infrastructure, etc. Many Android smartphones are based on Snapdragon SoCs (roughly 30%-40%, averaging over long-term trends), while iPhones nowadays use only a standalone cellular modem chip from the Snapdragon line. Here is a more realistic map of a typical Snapdragon SoC (820e), which gives an idea of how much custom OS kernel code is needed to support it.

Image is taken from the slides of my talk Advanced Hexagon Diag. It seems that Qualcomm has simplified their technical briefs since then, and I couldn't quickly find the original technical specification document online which the slide refers to.

Now that we know what Qualcomm MSM is, and that it holds 4 out of the 5 bugs in discussion, let's consider the ARM Mali target (CVE-2023-4211). ARM Mali is a GPU core design from Arm Holdings, which they license to industry partners for use in their hardware products. The list of SoCs that embed a Mali GPU is quite long, and includes those from MediaTek, Qualcomm's main competitor in the mobile hardware world - which brings the market share of Mali-equipped smartphones on par with or somewhat above Qualcomm's. In summary, Mali GPU and Adreno GPU are the two most popular mobile GPU cores and more or less equal competitors, each found in roughly 30% to 40% of Android-based smartphones. (iPhones use Apple's own proprietary GPU design.) Adreno GPU can be seen in the picture above as one of the 9 microchips on the board.

But wait, why the two GPUs? Isn't Qualcomm an ARM technology vendor? While a fully qualified answer would require another article (and a very diligent analysis of the legal paperwork behind Arm IP licensing options!), the short answer is NO. Snapdragon != ARM. Specifically, ARM defines a CPU ISA (Instruction Set Architecture), plus processor hardware designs based on it, both of which were licensed by Qualcomm for use in just one component on the SoC board: the CPU. On our sample SoC that would be the 32-bit Krait CPU - 64-bit Snapdragons have a Kryo CPU instead, which also uses the ARM ISA. The remaining 8 components on the board have nothing to do with ARM - including the Adreno GPU, which is of Qualcomm's own design and production. Consequently, being two different and incompatible hardware designs, Adreno GPU and Mali GPU require different OS kernel drivers, in which vulnerabilities can be found. Having gained some clarity around our vulnerable targets, let's look at the actual bugs.

CVE-2023-33063: Use-after-free in aDSP FastRPC DMA Memory Management

From Qualcomm advisory: "Use After Free in [MSM Linux Kernel] DSP Services - Memory corruption in DSP Services during a remote call from HLOS to DSP."

Interestingly, the bug was added to the CISA KEV list only in December (2023-12-05), while Qualcomm reported 0-day exploitation in October 2023.

The bug is in the aDSP device driver code which implements FastRPC. The patch is too long to show here in full. We'll need some extra background information for this one.

Modern Snapdragon SoCs have not one, but multiple DSP processors, specialized for different classes of computation tasks. aDSP (Application DSP) was historically responsible for audio and sensor input processing (nowadays sensor processing is offloaded to a dedicated sDSP core), while mDSP runs cellular modem code, and cDSP runs AI compute. All DSP cores are architecturally based on Hexagon, Qualcomm's proprietary processor architecture with unique traits and assembly coding patterns, and one of the very few VLIW ISA designs that managed to succeed. From the hardware architecture perspective, DSP cores are standalone, largely independent from the CPU, and managed by a separate operating system. The CPU runs the HLOS, such as Android Open Source Project or its OEM customizations. The DSP runs an RTOS, which is QuRT in the case of modern Qualcomm chipsets.

CPU and DSP cores are isolated from each other, and the programs they execute cannot talk to each other directly - so they must communicate through some interface, which is (on the CPU/HLOS side) implemented in the MSM kernel. RPC protocols generally encode the high-level aspects of such a communication interface, and the adsprpc.c source module, in which the vulnerability resides, implements this code. Specifically, the aDSP processor exposes a fastrpc device through the hardware bus, which implements the CPU-DSP communication interface based on DMA hardware technology. DMA support in an OS kernel requires mapping and unmapping of memory buffers, so that data can be mirrored from the hardware device to CPU-accessible RAM and back. When such mapping and unmapping is done dynamically, rather than once upon device initialization, pointer management of the allocated regions can be mishandled, leading to exploitable Use-after-free conditions.
Apparently, this is what happened in this bug. The patch introduces a new variable, unsigned int ctx_refs, which holds the number of context references to fastrpc buffers. This suggests that in the vulnerable code, one of the fastrpc buffers was freed out of turn while some code was still using it, which led to a Use-after-free vulnerability. I did not verify where exactly this happens; looking at the patch, the most likely candidate is the unmapping code, for which the patch adds an extra condition checking the value of the new ctx_refs variable.
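To make the role of such a counter concrete, here is a minimal Python model of a reference-guarded unmap. It is a sketch under stated assumptions, not the driver's code: only the name ctx_refs comes from the patch, while the class and all function names are hypothetical, and the "free" is simulated by dropping the backing bytearray.

```python
class FastrpcBufModel:
    """Hypothetical stand-in for a DMA buffer tracked by the driver."""

    def __init__(self, size):
        self.data = bytearray(size)  # backing memory while mapped
        self.ctx_refs = 0            # number of in-flight contexts using it


def ctx_acquire(buf):
    # A context starts using the buffer.
    buf.ctx_refs += 1


def ctx_release(buf):
    # A context is done with the buffer.
    buf.ctx_refs -= 1


def ctx_read(buf):
    # A real kernel would silently touch freed memory; the model raises.
    if buf.data is None:
        raise RuntimeError("use-after-free: buffer already unmapped")
    return len(buf.data)


def unmap_unguarded(buf):
    # Models the suspected vulnerable behavior: free regardless of users.
    buf.data = None
    return True


def unmap_guarded(buf):
    # Models the patched idea: refuse while any context holds a reference.
    if buf.ctx_refs > 0:
        return False
    buf.data = None
    return True
```

In this model, calling unmap_unguarded while a context still holds the buffer makes the next ctx_read fail, whereas unmap_guarded returns False until the last ctx_release and only then frees the memory.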

Sample snippet of the patch for CVE-2023-33063

The vulnerability opens an attack vector from userland to kernel, most likely through the IOCTL interface. Use-after-free bugs are, in most cases, easily exploitable. This and other bugs in the Qualcomm MSM Linux kernel can be exploited on Qualcomm-based smartphones to escape the Android application sandbox and thus elevate the privileges of the exploit payload code. This may also be a stepping stone to achieving boot persistence (rooting the device), or simply to executing more powerful arbitrary code.

CVE-2023-33106: Unsanitized Input in Adreno GPU KGSL IOCTL

From Qualcomm security advisory: "Use of Out-of-range Pointer Offset in Graphics - Memory corruption while submitting a large list of sync points in an AUX command to the IOCTL_KGSL_GPU_AUX_COMMAND".

Full patch code for CVE-2023-33106

This one is easy. Similarly to DSP microdevices on mobile SoCs, the GPU microdevice requires dedicated OS kernel support, which, for Snapdragon's Adreno GPU, is implemented in the kernel driver named KGSL. One of the source modules which implement this driver in the MSM Linux kernel is drivers/gpu/msm/kgsl.c, where the bug resides. Narrowing down the affected code further, it is in the IOCTL processing logic.

IOCTL is a universal mechanism of communication between OS userland and kernel, used in all the major operating systems. Actual communication is achieved by sending module-specific commands and data through the interface exposed by the kernel. For example, on Linux you'd use the ioctl() syscall with a specialized command ID and parameters, which the kernel core forwards to the corresponding kernel driver, where the call parameters are parsed. The implementation of IOCTL processing is module-specific, and kernel modules can mishandle the data supplied in IOCTL parameters, which may lead to exploitable memory corruption and logic vulnerabilities - as in the case of this bug. The patch adds a sanity check for the numsyncs variable, which comes from userland in an IOCTL_KGSL_GPU_AUX_COMMAND request. Before the patch, this variable was used as an unchecked loop counter in downstream code (kgsl_drawobj.c), leading to memory corruption.
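The effect of an unchecked, user-controlled loop counter can be illustrated with a small Python model. This is only a sketch: the limit MAX_SYNCPOINTS, the slot size, the memory layout and the function names are assumptions chosen for illustration and are not taken from the KGSL sources. A bytearray plays the role of kernel memory, with a fixed table of sync point slots followed by unrelated "victim" data.

```python
import struct

MAX_SYNCPOINTS = 4                      # hypothetical limit on sync points
SLOT_SIZE = 4                           # hypothetical size of one slot, bytes
SLOTS_END = MAX_SYNCPOINTS * SLOT_SIZE  # first byte past the slot table


def new_arena():
    # Slot table followed by adjacent data that the loop must never touch.
    return bytearray(SLOTS_END) + bytearray(b"VICTIM!!")


def add_syncpoints_unchecked(arena, numsyncs):
    # numsyncs is attacker-controlled and used directly as the loop bound.
    for i in range(numsyncs):
        struct.pack_into("<I", arena, i * SLOT_SIZE, 0x41414141)


def add_syncpoints_checked(arena, numsyncs):
    # The kind of sanity check the patch adds: reject oversized lists early.
    if numsyncs > MAX_SYNCPOINTS:
        raise ValueError("numsyncs exceeds the slot table")
    add_syncpoints_unchecked(arena, numsyncs)
```

With numsyncs set to 6, the unchecked variant overwrites the eight bytes that follow the table, while the checked variant rejects the request before any write happens.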

CVE-2023-33107: Integer Overflow in KGSL IOMMU SVM memory management

Vendor: "Integer Overflow or Wraparound in Graphics Linux - Memory corruption in Graphics Linux while assigning shared virtual memory region during IOCTL call."

Full patch code for CVE-2023-33107

Similarly to the previous bug, this one is in the kgsl driver code for the Adreno GPU. The low-level aspect of the vulnerability is obvious from the patch: there is an integer wraparound in the arithmetic computation of the sum of the gpuaddr and size variables. In addition, the patch description suggests that the bug is in IOCTL processing (this is not immediately obvious in the patched code). However, the high-level context of this bug is quite complex and interesting.

The GPU, which computes independently from the CPU, can communicate with it via a feature named SVM (Shared Virtual Memory), introduced in OpenCL 2.0 - not to be confused with AMD SVM, a hardware-based virtualization technology. An SVM shared memory region can be accessed from both Android applications (on the CPU) and graphics shaders (on the GPU). In order to enable such communication for Android applications, the GPU kernel driver must implement low-level infrastructure for establishing and managing shared memory regions, reading and writing data, atomics (ideally), and so on. A subset of this communication infrastructure is exposed through the IOCTL interface of the kgsl character device.

Looking again at the bug, it is in the kgsl_iommu.c source module, and the vulnerable function iommu_addr_in_svm_ranges implements a (failed) sanity check for three input parameters related to low-level SVM memory management of the GPU. MMU internals are one of the most complex subjects in OS kernel practice and warrant a dedicated treatment; for our purposes it is only important to know how exactly the user can influence those parameters. Here is where the check procedure is called:

Invocation of the buggy check (kgsl_iommu.c)
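The wraparound itself is easy to model. The following is a minimal Python sketch, not the driver code: the range bounds and function names are hypothetical, and 64-bit unsigned arithmetic is emulated with an explicit mask because Python integers do not overflow.

```python
U64_MASK = (1 << 64) - 1  # emulate 64-bit unsigned arithmetic


def in_range_buggy(gpuaddr, size, va_start, va_end):
    # The sum silently wraps modulo 2**64, as it would in C.
    end = (gpuaddr + size) & U64_MASK
    return va_start <= gpuaddr < va_end and va_start < end <= va_end


def in_range_fixed(gpuaddr, size, va_start, va_end):
    end = (gpuaddr + size) & U64_MASK
    # Wraparound check: a valid end address can never be below the start.
    if end < gpuaddr:
        return False
    return va_start <= gpuaddr < va_end and va_start < end <= va_end


# Hypothetical SVM range and a malicious request whose size wraps the sum.
VA_START, VA_END = 0x1000, 0x100000
EVIL_ADDR = 0x2000
EVIL_SIZE = (1 << 64) - 0x800  # gpuaddr + size wraps around to 0x1800
```

With these values the buggy check accepts a region of almost 2**64 bytes because the wrapped end address 0x1800 falls inside the range, while the fixed check rejects it and still accepts an ordinary request such as address 0x2000 with size 0x1000.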

When the check is miscalculated due to integer wraparound, the malicious gpuaddr and size parameters pass through to downstream code and cause memory corruption (I'm guessing, via corrupted linked list management). The code of kgsl_iommu_set_svm_region can be reached, after a long chain of calls deep in the internals of the driver, with IOCTL_KGSL_GPUOBJ_IMPORT, which has the following arguments:

Definition of IOCTL_KGSL_GPUOBJ_IMPORT and parameters
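For readers less familiar with how such command IDs are formed: Linux ioctl command numbers are conventionally built with the _IO/_IOR/_IOW/_IOWR macros, which pack a direction, the size of the argument structure, a driver "type" byte and a command number into one 32-bit value. The sketch below reproduces that encoding in Python using the generic bit layout found on ARM and x86 (some other architectures differ). The type byte, command number and argument layout used in the example are hypothetical placeholders, not the real KGSL values.

```python
import struct

# Generic Linux ioctl bit layout (asm-generic), as used on ARM and x86.
IOC_NRSHIFT = 0
IOC_TYPESHIFT = 8
IOC_SIZESHIFT = 16
IOC_DIRSHIFT = 30
IOC_WRITE = 1
IOC_READ = 2


def ioc(direction, type_, nr, size):
    # Pack the four fields into a single 32-bit command number.
    return ((direction << IOC_DIRSHIFT) | (size << IOC_SIZESHIFT)
            | (type_ << IOC_TYPESHIFT) | (nr << IOC_NRSHIFT))


def iowr(type_, nr, size):
    # Equivalent of _IOWR: the argument is both read and written.
    return ioc(IOC_READ | IOC_WRITE, type_, nr, size)


def ioc_decode(cmd):
    # Returns (direction, type, nr, size) for a command number.
    return ((cmd >> IOC_DIRSHIFT) & 0x3,
            (cmd >> IOC_TYPESHIFT) & 0xFF,
            (cmd >> IOC_NRSHIFT) & 0xFF,
            (cmd >> IOC_SIZESHIFT) & 0x3FFF)


# Hypothetical driver: type byte 0x7A, command 0x01, and an argument
# struct of two u64 fields, one u32 field and 4 bytes of padding.
EXAMPLE_ARG_SIZE = struct.calcsize("<QQI4x")
EXAMPLE_CMD = iowr(0x7A, 0x01, EXAMPLE_ARG_SIZE)
```

Decoding a command number this way is a quick method for mapping a value observed in a trace back to the driver and argument structure it belongs to.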

Interestingly, this IOCTL command still represents a rather high-level implementation of the GPU communication protocol for SVM (passing "GPU objects" across the inter-processor boundary).

CVE-2022-22071: Use-after-free in aDSP driver memory management

From Qualcomm advisory (May 2022 Security Bulletin, not a typo): "Use After Free in Automotive OS Platform Android - Possible use after free when process shell memory is freed using IOCTL munmap call and process initialization is in progress".

Full patch code for CVE-2022-22071

This vulnerability stands out for several reasons. First, it is a bug that was patched in May 2022 and attributed to the security researcher Seonung Jang(@IFdLRx4At1WFm74), while exploitation "in the wild" was reported in October 2023. When a 0-day exploit attack is reported, credit is usually given to the incident response analyst who caught the exploit, while the security researcher who discovered the vulnerability in the code remains unknown. This line of thinking suggests that the bug was either exploited as an N-day on unpatched devices, or as a pseudo-0-day on OEM devices that ship outdated software and fail to update it. Either way, it is most likely that the October attack was informed by patch analysis, similar to the workflow which I show in this article.

Second, the bug target is marked as "Automotive OS Platform Android", while the patch itself is in the same adsprpc.c source module that was seen in CVE-2023-33063. Was it exploited on cars? Or can the bug be triggered only on the Automotive platform? The vulnerable code apparently affects the aDSP driver platform-wide. There is a lot of information in here.

Third, this is the only bug in the list which was added to the CISA KEV list in October rather than December. It seems that CISA doesn't rely directly on reporting by Google TAG and software vendors, but rather seeks to establish 0-day attacks via other sources.

The meaning of the patch is simple: it informs the adsprpc device driver that some memory is still in use, so that the pointer behind it will not be freed. Specifically, the is_filemap variable is a "flag to indicate map used in process init", and it's checked in the unmapping code like this:

Details for CVE-2022-22071

Here, fastrpc_mmap_remove would be easily reached with one of the FASTRPC_IOCTL_MUNMAP* ioctl calls.
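The idea of the flag can be modeled in a few lines of Python. This is a hedged sketch rather than a transcription of the patch: only the name is_filemap and its meaning come from the patch, while the Mapping class, the mmap_remove helper, its honor_filemap switch and the exact matching conditions are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    va: int                   # virtual address of the mapping
    refs: int = 1             # reference count
    is_filemap: bool = False  # True for a map used in process init


def mmap_remove(maps, va, honor_filemap):
    """Detach and return the mapping at va, or None if it must stay.

    honor_filemap=False models the pre-patch behavior, where a munmap
    request could detach the process-init mapping while it was in use.
    """
    for m in maps:
        if m.va != va or m.refs != 1:
            continue
        if honor_filemap and m.is_filemap:
            return None  # patched idea: skip memory used by process init
        maps.remove(m)
        return m
    return None
```

In this model, a munmap-style request against the init mapping succeeds without the flag check, leaving the initialization path with a dangling reference, and is refused with it, while ordinary mappings are removed in both cases.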

CVE-2023-4211: Mali GPU IOCTL processing

From ARM advisory: "Mali GPU Kernel Driver allows improper GPU memory processing operations - A local non-privileged user can make improper GPU memory processing operations to gain access to already freed memory."

This is the only bug in the bunch which affects non-Qualcomm Android devices - such as those based on MediaTek SoC. Looking at Bifrost GPU Kernel Driver code, between r42p0 and r43p0 the code was changed substantially (the source diff is around 10k lines) and not only for security reasons, which makes it non-trivial to identify the specific vulnerability patch. Upon a quick inspection, the change in kbase_csf_queue_register makes a good candidate:

Suspected patch for CVE-2023-4211

Here, the kbase_csf_queue_register procedure can be reached with the KBASE_IOCTL_CS_QUEUE_REGISTER ioctl call, and the change clearly adds sanitization of input parameters related to GPU memory management. If this is not CVE-2023-4211, then it must be a different but very similar vulnerability, which could be exploited using the same technique.

Conclusions

This technical note covered a total of 5 distinct vulnerabilities in the Android kernel that were exploited "in the wild" in October 2023. Previously unknown technical details were disclosed, together with a limited root cause analysis of selected bugs. This information was inferred from security patches and other public sources, with no access to the original exploit code.

Taken together, the vulnerabilities covered here make it possible to attack around 80+% of Android devices, including smartphones (primarily) and potentially also smart cars, IoT devices, and wearables. The 80% figure is based on a rough approximation of 35% Qualcomm-based and 35% MediaTek-based mobile platforms, plus a small percentage of non-MediaTek embedders of the Mali GPU. The impact of all the bugs is the same: they make it possible to escape the Android application sandbox and execute arbitrary code with elevated privileges. Such bugs would typically be exploited in the third stage of a full-chain exploit, post-RCE through the browser or a similar remote attack vector. Further, such bugs open an attack surface for persistence, which would normally require yet another vulnerability.

To outline the big picture, a complete full-chain exploit for a modern smartphone (Android-based or iPhone, no big difference) will normally require *at least* four distinct vulnerabilities:

1. Remote code execution bug in Application (such as an exploit for the JavaScript engine of a mobile browser).
2. Memory disclosure (infoleak) bug in Application, to defeat ASLR.
3. Application sandbox bypass and/or elevation of privilege bug.
4. Bootloader or a similar level vulnerability to bypass Secure Boot technologies and persist on the device between reboots.

Issues covered in this article represent common examples of stage 3 vulnerabilities. In more peculiar scenarios, such as attacking hardened systems or using less powerful bugs, stage 3 of a full-chain exploit may need to be further split into two separate vulnerabilities.

References

1. Qualcomm Security Bulletin, December 2023 https://docs.qualcomm.com/product/publicresources/securitybulletin/december-2023-bulletin.html
2. Qualcomm Security Bulletin, October 2023 https://docs.qualcomm.com/product/publicresources/securitybulletin/october-2023-bulletin.html
3. Qualcomm MSM kernel open source code https://git.codelinaro.org/clo/la/kernel/msm-5.4
4. ARM Security Center https://developer.arm.com/Arm%20Security%20Center
5. CISA Known Exploited Vulnerabilities Catalog https://www.cisa.gov/known-exploited-vulnerabilities-catalog

Metadata

Categories: 0-Day Insights


Tags: Qualcomm, Snapdragon, ARM, GPU, Elevation of Privilege, 0day
