"Race conditions arise when multiple threads attempt to access a shared resource without proper synchronization, often leading to vulnerabilities such as concurrent use-after-free. To mitigate their occurrence, operating systems rely on synchronization primitives such as mutexes, spinlocks, etc."

"Our key finding is that all the common synchronization primitives implemented using conditional branches can be microarchitecturally bypassed on speculative paths using a branch misprediction attack, turning all architecturally race-free critical regions into Speculative Race Conditions, allowing attackers to leak information from the target."

Um. What? That's crazy!

"Mutex" here means "mutual exclusion". It is a lock that allows only one concurrent threat to enter a section. "Spinlock" refers to a more primitive technique, where a threat asks "are you unlocked yet?" over and over in a loop until the lock is released and it can acquire it. In modern systems, the hardware and the operating system work together to enable threads to go to sleep and get woken up when their locks are released instead of doing the spinlock thing.

Digging into this further, the researchers say:

"Since 2018, after the discovery of Spectre and Meltdown, transient execution attacks have become an intensively studied area of research."

You know, I remember hearing about Spectre but didn't look into the details of it.

"Whenever a modern CPU implements speculative optimizations (e.g., branch prediction), it speculatively executes a sequence of instructions. The two possible outcome for these instructions are that either they are committed and made visible to the architectural level or they are squashed due to mispeculation (e.g., misprediction) -- leading to transient execution. When the instructions are squashed, the CPU rollbacks the state. Despite the rollback, some microarchitectural side effects are left and can be observed through one of the many side channels available (e.g., data cache, branch target buffer, port contention, etc.) to leak sensitive information."

"Spectre-PHT, also known as Spectre-v1, is the first known attack of this kind, targeting the pattern history table and exploiting a code pattern. The code checks for x to be in-bound before performing a double array access. For exploitation purposes, the attacker can ensure x is out-of-bound and array1_ size is not present in the cache. In this scenario, instead of waiting for array1_size to be loaded from main memory to perform the comparison, the CPU speculates and starts to transiently execute the instructions beyond the comparison. If the comparison has been executed several times before with x in-bound, the CPU is prone to speculate that x is once again in-bound, hence transiently performing the out-of-bound access of array1. When the not cached array2 is accessed using the byte retrieved from the out-of-bound access of array1, the specific accessed location is loaded into the cache. The attacker can complete the 1 byte leak by testing which location of array2 can be accessed faster than the others. Its position within the buffer reveals the secret byte value. Notably, Spectre-PHT remains unmitigated in hardware. Software developers remain responsible to harden potentially vulnerable branches with mitigations (e.g., fencing to prevent speculation), but the extent to which all the 'right' branches have been adequately hardened in large high-value codebases such as the Linux kernel remains an open question."

"Concurrency bugs are a category of bugs which affect multithreaded programs and occur due to the absence or the incorrect use of synchronization primitives. Due to their nondeterministic behavior, concurrency bugs are one of the most elusive and difficult to triage classes of bugs. Under certain conditions, concurrency bugs can also lead to memory error vulnerabilities. In modern operating systems such as the Linux kernel, one of the most common memory error vulnerability caused by concurrency bugs is use-after-free."

"In a use-after-free attack, the first step is generally to free a memory object. This operation invalidates all the pointers to that object, which become dangling. The second step generally involves forcing the allocator to reuse the memory slot of the free object for the allocation of a new object. This step reinitialize the previously freed memory slot. The final step of the attack is generally to force the victim to use one of the dangling pointers, which now points to the newly allocated object. A read from or write to such pointer to controlled data can be used to exploit the bug in a variety of ways."

"When this attack is performed in concurrency settings, and the free step and the use step are executed by distinct threads sharing the underlying object. Such concurrent use-after-free vulnerability is harder to exploit than the single-threaded use-after-free case, since exploitation depends on thread interleaving and the availability of a sufficient race window. While the community has invested significant effort in investigating traditional concurrency bugs and concurrent use-after-free -- e.g., studies demonstrating that more than 40% of the use-after-free vulnerabilities patched in Linux kernel drivers are concurrent use-after-free -- their microarchitectural properties have largely been neglected. In this paper, we study such properties and their security implications for the first time, uncovering a new class of speculative execution vulnerabilities in the process."

They go on to explain their new exploitation technique, which can precisely interrupt any (kernel) thread and create an architecturally unbounded use-after-free exploitation window. It works by first identifying use-after-free exploitation windows, which can be as tiny as eight instructions. They then use high-precision hardware timers to interrupt the victim thread at just the right time and amplify the original UAF window. After that, they rely on user interfaces to trigger an interrupt storm that hits the victim thread inside the amplified window, which has the effect of stretching the UAF window indefinitely. I should probably mention that by "user interfaces" here, they mean things like the host controller interface layer of the near field communication (NFC) driver.

Then they go on to exploit speculative race conditions, their new term for speculative execution vulnerabilities "affecting all common synchronization primitives", by which they mean mutexes, spinlocks, etc. "We can consistently trick speculative execution into acquiring a mutex and entering the guarded critical region. Since this is the case regardless of the current (architectural) state of the mutex, we can speculatively acquire a mutex already held by another thread. In other words, the mutex becomes a no-op on the speculative path, leading to a speculative race condition and opening the door to arbitrary concurrency vulnerabilities at the microarchitectural level."
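
To see why a conditional branch matters here, consider this simplified sketch of a lock-guarded critical region. The names (lock_word, try_acquire, guarded_increment) are hypothetical and this is not the kernel's actual mutex code; the point is only that "did I get the lock?" ends up as an if statement, and an if statement is something the branch predictor guesses.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int lock_word;   /* 0 = free, 1 = held (assumed encoding) */
static int shared_counter;     /* data the lock is supposed to protect */

static inline bool try_acquire(void) {
    int expected = 0;
    /* Atomically swap 0 -> 1; fails if another thread holds the lock. */
    return atomic_compare_exchange_strong(&lock_word, &expected, 1);
}

static inline void release(void) {
    atomic_store(&lock_word, 0);
}

bool guarded_increment(void) {
    if (!try_acquire())        /* conditional branch on the lock result */
        return false;          /* architecturally: lock held, stay out */
    shared_counter++;          /* critical region */
    release();
    return true;
}
```

Architecturally, guarded_increment() never touches shared_counter while the lock is held elsewhere. If the CPU mispredicts that branch, though, the critical region can execute transiently anyway, which is the situation the quote describes as the mutex becoming a no-op on the speculative path.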

The end result of all this is that they can leak memory from the Linux kernel at a rate of 12 KB/s.

I have to say, I'm amazed people exist who can pull stuff like this off.

GhostRace: Exploiting and mitigating speculative race conditions - Syssec@IBM Research

#solidstatelife #cybersecurity