Introduction to Procedural Debugging through binary libification

Introduction to Procedural Debugging

through Binary Libification


Jonathan Brossard, Conservatoire National des Arts et Métiers, Paris
https://www.usenix.org/conference/woot24/presentation/brossard

This paper is included in the Proceedings of the


18th USENIX WOOT Conference on Offensive Technologies.
August 12–13, 2024 • Philadelphia, PA, USA
ISBN 978-1-939133-43-4

Open access to the


Proceedings of the 18th USENIX WOOT
Conference on Offensive Technologies
is sponsored by USENIX.

Abstract

Assessing the existence, exact impact and exploitability of a known (or theoretical) memory corruption vulnerability in an arbitrary piece of compiled software has arguably not become simpler. The current methodology essentially boils down to writing an exploit - or at least a trigger - for each potential vulnerability. Writing an exploit for a weird machine involves several undecidable steps, starting with overcoming the reachability problem. In this article, we introduce the notions of “libification” and “procedural debugging” to facilitate partial debugging of binaries at the procedural level. These techniques allow the transformation of arbitrary dynamically linked ELF binaries into shared libraries, and the study of memory corruption bugs by directly calling the vulnerable functions, hence separating the memory corruption intraprocedural analysis from the reachability problem. Finally, we publish a framework [3] to implement such a libification under a permissive open-source license to facilitate its adoption within the security community.

1 Introduction

Triaging bugs has become an essential part of security. The Product Security function as a whole is becoming ever more critical for software manufacturers as legal frameworks around the globe mandate more clarity, speed, and transparency in dealing with existing and new vulnerabilities. The Cyber Resilience Act being implemented in Europe and the Executive Order on Improving the Nation’s Cybersecurity published in the US, for instance, both mandate the use of Software Bill of Materials (sBOMs) and their communication to clients and third parties, effectively rendering the superficial - software version based - vulnerability assessment of potential new CVEs affecting their software, seemingly more apparent.

However, assessing the actual existence, exact impact, and exploitability of a given memory corruption bug, as required by the above laws, has not become significantly more manageable over time. The current methodology to assess the presence and impact of a given CVE in a piece of software essentially requires writing an exploit for each potential vulnerability. As such, this situation creates a seemingly unreasonable burden on Product Security teams, where triaging bugs requires performing operations like overcoming the reachability problem multiple times.

Writing exploits for a weird machine involves three steps: reaching, triggering, and exploiting. Much work has been done in automating the first step. Arguably, all of the fuzzing and dynamic testing performed hitherto follows this top-bottom approach, where execution starts from an application’s entry point, toward the leaves of the application, across the application’s call graph.

In this article, we aim to focus on the second step alone - without requiring solving the first one, which is undecidable in general.

Our methodology starts with modifying the ELF headers and dynamic section of an arbitrary dynamically linked ELF executable to turn it into a more workable shared library. The benefit of this technique is that any public function within the binary becomes callable without crafting an input to reach the attractive, potentially vulnerable function. Subsequently, we can render an arbitrary function within an ELF callable, even turn the entire ELF application into a callable API, and finally manually produce more limited, partial vulnerability triggers under the form of simple text files.

In the rest of this article, we will focus on memory corruption vulnerabilities unless stated otherwise and limit ourselves to C/C++ applications compiled as ELF binaries, as used under GNU/Linux and Unix-like operating systems, when implementing our framework. We will assume that the application’s source code is unavailable to the auditor.
by the above laws, has not become significantly more man-
ageable over time. The current methodology to assess the



Our first contribution is a methodology to transform an arbitrary ET_EXEC or ET_DYN dynamically linked ELF binary into a shared library. We provide a tool named the Witchcraft Linker to perform this operation on ELF32 and ELF64 executables alike, regardless of their architectures. Our second contribution is a methodology to invoke arbitrary C or C++ functions within ELF shared libraries without prior knowledge of their exact prototypes. We implement an original type of debugger, procedural-based, allowing the invocation of arbitrary C/C++ functions. This debugger, the Witchcraft Shell, noticeably does not use ptrace(), breakpoints, or single stepping. We name this new form of debugging procedural since analysis is performed at the granularity of function calls.

2 Previous Work

2.1 Exploitability of a Vulnerability

As noted by Wang et al. [49], in general, the only definitive way to prove the exploitability of a vulnerability is to write an exploit for that vulnerability. This constitutes proof by construction since the expert exhibits an exploit that demonstrates the exploitability of the vulnerability. On the other hand, proving that a vulnerability is not exploitable is a difficult problem, according to Suciu et al. [44]. Demonstrating the non-exploitability of a vulnerability via formal proof based on a crash analysis is sometimes possible despite the explosive nature of proofs based on symbolic execution [9] [49].

Green et al. [24] consider that when it comes to vulnerabilities such as memory corruptions, the fact that an attacker controls the next instruction to be executed (the “Program Counter”) is a strong indicator of a function’s exploitability. However, in the presence of countermeasures, this condition may not be entirely sufficient [20].

2.2 Automatic Exploitation of Vulnerabilities Detected via Static Analysis

Several research projects focus on exploiting (or at least triggering) vulnerabilities detected using a preliminary static analysis to demonstrate that they are true positives. ExpRace [31], for example, focuses on a single class of vulnerabilities: race conditions in the Linux kernel. After having distinguished race conditions involving several variables (qualified as difficult) and race conditions involving a single variable in the kernel (qualified as easy), the authors propose a generic methodology for exploiting reusable single-variable race conditions on several cores, running under Intel processors, making it possible to trigger the previously identified vulnerability, taking advantage of the fact that an unprivileged process (or secondary thread) in user mode can significantly increase the race window using common system calls (mmap() and mprotect()) to trigger the synchronization of memory tables (“Translation Lookaside Buffers”) between the cores of the same Intel microprocessor. The tool is very specialized since it only addresses the problem of single-variable race conditions in kernel mode under Linux.

The FUZE tool [51] aims at dynamically triggering, to prove their existence, “Use After Free” vulnerabilities in kernel mode under Linux. By combining open-source frameworks such as syzkaller (fuzzer), angr [46] (for binary analysis, function call graph generation, decompilation, and symbolic execution), and kernel mode debugging techniques (parsing the list of kernel modules, “LKM linked list”), it dramatically reduces the complexity of UAF vulnerability analysis by determining the few paths and system calls that can potentially modify a variable in kernel mode, then using combined fuzzing and symbolic execution techniques to generate user inputs capable of automatically triggering the vulnerability, and thus proving its existence.

The article “A Hybrid Interface Recovery Method for Android Kernel Fuzzing” [32] is also specialized. The problem raised by the authors is the addition of undocumented interfaces (system calls or ioctls) between user and kernel modes by mobile phone equipment manufacturers. These new interfaces are typically additions via proprietary kernel modules (the source code of which is unavailable, implying an analysis partially to be made in black box mode) to the Android kernel (which is based on Linux and is, therefore, open-source, auditable in white box mode). However, these interfaces are prime targets for privilege escalation attacks, where a program in unprivileged user mode will purposely call these extra interfaces to the privileged mode of the kernel to trigger vulnerabilities. Therefore, the analysis is gray, combining a white box analysis of the open-source Android part of the kernel and a black box analysis of the non-open-source, proprietary part added by the equipment manufacturer. The methodology followed is a taint analysis of proprietary modules, including type propagation, to find the prototypes of the interfaces introduced (whether they are new system calls in their own right or, more commonly, new valid ioctl calls on arbitrary device drivers). Once the prototypes of these interfaces have been determined, it becomes possible to use classic whitebox fuzzing tools, such as syzkaller, by measuring the impact of calling these new system calls dynamically on the rest of the kernel (that is, by instrumenting only the open-source part of the kernel).

The PhD thesis “Finding race conditions in kernels: from fuzzing to symbolic execution” [52] proposes an original approach to the detection and exploitation of “time of check to time of use” (or TOCTOU) vulnerabilities, which are a subclass of race conditions, where a kernel resource is validated at time t, then read back and used at time t+1. The underlying fundamental issue is that this resource may have changed in the meantime, the Linux kernel being multi-tasking and concurrent, leading to false assumptions about the core properties of said resource. It should be noted that several vulnerabilities of this type have been discovered on the Linux kernel



in recent years, hence the renewed interest in an automatic approach to the discovery and practical validation of this peculiar vulnerability subclass. The methodology followed is to modify the Linux kernel (using source patches) to instrument regions likely to contain TOCTOU vulnerabilities, selected by a preliminary static analysis, then use a fuzzer guided by symbolic execution toward these regions to be more purposefully scrutinized. This methodology is limited to TOCTOU vulnerabilities and does not apply to kernels whose code is unavailable.

Furthermore, the article “From source code to crash test cases through software testing automation” [16] offers a methodology for creating proof of exploits (that is, the automatic generation of user input triggering a vulnerability previously identified in source code), by combining a preliminary static analysis of the application (generation of its call graph), a fuzzing engine to traverse this graph, and the use of a symbolic execution engine (named Triton) to guide the fuzzer toward the vulnerable function. Although the source code is essential to this methodology, it applies to several classes of vulnerabilities, giving it notable genericity.

2.3 Defense in Depth: Hardened Compilation Techniques

Countermeasures have been developed to prevent or limit the exploitability of vulnerabilities in compiled applications, particularly those developed in C or C++. Detecting and taking into account, where applicable, the presence of these countermeasures is critical when writing an exploit taking advantage of memory corruption.

Khalsan et al. [28] identify in particular the DEP (“Data Execution Prevention”) technique introduced in Microsoft Windows XP SP2, which makes the stack, dynamic memory, and variables in the data sections of an application non-executable. According to the authors, the non-execution of the stack is made possible thanks to hardware extensions (“NX” bit on AMD processors or “XD” equivalent on Intel processors). These countermeasures primarily aim to prevent the introduction and execution of shellcode [13] in all writable sections of the application. We also find the term W^X to name the segregation of variables (writable, non-executable) and code (executable, non-writable) in the literature [34] [10].

Khalsan et al. also describe the use of ASCII armoring, which ensures that all virtual addresses used by an application contain at least one 0x00 ASCII byte (in hexadecimal code). Given that the functions manipulating character strings end with a 0x00 (named ASCIIZ format), exploiting a stack buffer overflow vulnerability via the functions from libc making a copy of strings of characters is made impossible. Introducing ASCII armoring requires modifying the kernel and dynamic linker to provide armoring on the main binary and all its dynamic libraries.

Khalsan et al. detail the use of Address Space Layout Randomization (ASLR) [43], which consists of making the base address of a binary and each dynamic library in memory non-predictable. An attacker can no longer hardcode return addresses when writing an exploit. The introduction of ASLR typically requires modifying the kernel and the file format of executables to allow arbitrary relocation of protected binaries [34].

Khalsan et al. also describe binary protection techniques using canaries. These techniques have known several names, such as Propolice [21], StackGuard [15], and Stack Smashing Protection (SSP) [39]. This involves modifying the compiler in such a way as to introduce a canary (or “stack cookie”) before the return address in the stack, the integrity of which will be checked in the epilogue of each instrumented function. If the canary has been modified, the stack is corrupted, and the program will be immediately terminated rather than risk arbitrary code execution by an attacker. These techniques have undergone several successive improvements until they no longer have any significant cost during the execution of the protected application [53].

Khalsan et al. finally detail FORTIFY (standardized in the ISO/IEC TR 247315 standard). This compilation option automatically replaces specific C library functions vulnerable to buffer overflows with functions including an additional argument, the maximum size of the destination buffer (which can often be inferred by the compiler). In the event of a stack buffer overflow during the program execution, the application is terminated rather than allowing the attacker to execute arbitrary code [30] [23].

These techniques have been extended to other architectures and operating systems, such as Linux [39], Android [33], OSX [39] or iOS [28].

Finally, there are protections against memory corruption at the hardware level of specific microprocessors, such as Intel Control Flow Integrity (CFI) [7] [29], which makes it possible, by instrumenting the start of each block of code (an endbr64 instruction is added at compile time under Intel x86-64) [29], to ensure at any point in time that the control flow of the application has not been altered via memory corruption exploitation techniques such as ret2libc [10] or Return Oriented Programming (ROP) [40] [34] [1] [12]. During a transfer of execution via branching or when returning to a calling function, the microprocessor can check whether the destination address is an endbr64 instruction under x86-64 (respectively endbr32 under x86) and terminate the application if this is not the case.

These countermeasures to exploiting memory corruption vulnerabilities are effective against their respective vulnerability subclasses but require activation (often at compile time) to operate correctly [50].
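As an illustration of how the presence of one of these countermeasures can be detected when auditing a binary, the sketch below tests whether a region of machine code starts with an endbr64 or endbr32 instruction. The helper names are hypothetical. The byte sequences F3 0F 1E FA (endbr64) and F3 0F 1E FB (endbr32) are the documented encodings of these instructions. The check is only a heuristic, since it inspects raw bytes at a given offset without disassembling the code.

```c
#include <stddef.h>
#include <string.h>

/* Encodings of the Intel CET landing-pad instructions. */
static const unsigned char ENDBR64_BYTES[4] = {0xF3, 0x0F, 0x1E, 0xFA};
static const unsigned char ENDBR32_BYTES[4] = {0xF3, 0x0F, 0x1E, 0xFB};

/* Hypothetical helper: returns 1 if the `len` bytes at `code` begin with
 * an endbr64 instruction, 0 otherwise. */
int starts_with_endbr64(const unsigned char *code, size_t len)
{
    return len >= sizeof ENDBR64_BYTES &&
           memcmp(code, ENDBR64_BYTES, sizeof ENDBR64_BYTES) == 0;
}

/* Same check for the 32-bit variant, endbr32. */
int starts_with_endbr32(const unsigned char *code, size_t len)
{
    return len >= sizeof ENDBR32_BYTES &&
           memcmp(code, ENDBR32_BYTES, sizeof ENDBR32_BYTES) == 0;
}
```

Applied to the first bytes of a function, a positive result suggests that the binary was compiled with this protection enabled, while a classic prologue such as push rbp (0x55) yields a negative result.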



2.4 Binary Loaders and Binary Post-Compilation Instrumentation

The idea of statically or dynamically loading and instrumenting binaries is fundamental in analyzing compiled applications.

The most basic form of dynamic instrumentation is simply using the trap instructions to force an application to divert its execution flow, as seen in DTrace [14].

A more complex tool like Valgrind and its popular memcheck [42] memory sanitizer can perform a Just in Time (JIT) dynamic recompilation of executables. It is a complex framework that starts by transforming the original basic blocks of the application into an intermediate representation, then applies instrumentation and code optimization before translating the intermediate representation back to machine code [36]. Such an instrumentation is heavy and incurs an execution penalty of 10x or higher.

Some of the techniques available include dynamically rewriting a single basic block of code at a time, while the application is running, using a shadow memory mechanism. This is the foundation of tools like DynamoRIO [4] [6] [5], a framework reused in popular security tools such as WinAFL [55].

Dinesh et al., on the other hand, opt for a pure static rewriting of binaries to retrofit into binaries instrumentation that is usually introduced at compile time, such as AFL [54] and Address Sanitizer [41]. Their framework, named RETROWRITE [17], works by diverting the flow of execution through the insertion of trampolines. A preliminary static analysis involves building the control flow graph, which is a difficult problem in general [35] and undecidable [22].

This mechanism, where a preliminary disassembly and control flow recovery precedes a static rewriting of portions of the binary to introduce instrumentation code, is a popular design [2] [48] [37] [47], subject essentially to the same limitations: recovery of the control flow is undecidable in general [22].

To avoid this pitfall, Duck et al. [19] developed a suite of binary rewriting techniques, implemented under the E9Patch framework, that can insert jump trampolines without requiring an understanding of the binary’s control flow. As such, their instrumentation is more robust and scales to large applications such as web browsers. They leverage techniques such as instruction punning [11], which may safely replace branching conditions and introduce trampoline code by overwriting exactly one assembly instruction.

Furthermore, it is worth mentioning the idea of recovering individual object files from a compiled binary [8] thanks to a control flow and data flow analysis. When individual compilation units can be unlinked, they may be subsequently relinked and instrumented.

Finally, custom loaders may allow the loading of Windows dynamic libraries under Linux [38] or rewriting Windows Executables so they may be loaded as DLLs [18].

In light of this state of the art, it seems relevant to introduce a more lightweight form of binary rewriting focused solely on modifying an application’s metadata. As such, it shall not suffer from the limitations of the techniques based on control flow recovery or the runtime penalty of dynamic instrumentation.

3 Overview of the Libification Process

3.1 Libification: Methodology

In this section, we describe the production of a libifier, that is to say, a tool able to reliably and automatically transform an arbitrary ELF binary into a shared library. We detail this methodology, so it may be extended in the future, if necessary, to compensate for breaking changes in the GNU dynamic linker, or adapted to other toolchains.

The POSIX 2001 standard specifies the API of the dynamic linker, and in particular the dlopen() function, which allows loading an arbitrary shared library in memory:

    #include <dlfcn.h>
    void *dlopen(const char *filename, int flags);

The filename parameter must point to the path to the library to be loaded on the file system.

The flags parameter controls the locality (local or global) of the symbols loaded in the address space, as well as the behavior of the dynamic linker. In particular, if the RTLD_LAZY bit is set, the dynamic linker performs lazy binding of symbols when necessary, as opposed to an immediate binding at load time if the RTLD_NOW bit is set, in which case the Global Offset Table may be safely remapped read-only.

In the remainder of this article, we will define a shared library as an ELF file that can be loaded in memory via dlopen().

A minimal oracle to determine whether the dynamic linker can load an ELF file can be summarized with the following code:

    #include <dlfcn.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        void *handle = NULL;

        /* Ask the dynamic linker to load the candidate library,
           binding all of its symbols immediately. */
        handle = dlopen("./test.so", RTLD_NOW);

        /* dlopen() returns NULL on failure; dlerror() gives the cause. */
        if (handle == NULL) {
            printf("!! ERROR: %s\n", dlerror());
            exit(EXIT_FAILURE);
        }

        printf("Loading successful\n");
        return 0;
    }



If successful, the return code from this oracle will be 0. It will be non-zero otherwise, and an error message stemming from the dynamic linker will indicate the cause of the memory loading error. Empirically, the work of the libifier will, therefore, be to modify the binary in a way that prevents the error returned by dlerror() from occurring. The developer of the libifier will then read the code of the dynamic linker, identify the cause of the error, and modify the libifier to patch the input binary to prevent this last error from occurring.

The goal - and hope - of the developer of the libifier is that through this iterative and empirical process, the shared library produced by the libifier will be able to pass all of the dlopen() parsing checks, and eventually be loaded in memory. There is no guarantee that such a libification will be or remain possible in the future or across an arbitrary corpus of executables since this libification is a reverse engineering technique and not a standardized feature of a dynamic linker guaranteed in any form or fashion.

3.2 Practical Libification

The operations performed by the Witchcraft Linker to libify an arbitrary ELF binary modify the ELF header, the dynamic section, and the GNU-specific symbols versioning section of an input executable.

First, within the ELF header, the libifier must ensure that the e_type field is set to ET_DYN since all shared libraries are of type ET_DYN.

Then, the dynamic section of the ELF must be parsed and possibly modified:

The DT_BIND_NOW entry shall be changed to DT_NULL if present in the .dynamic section.

The DT_FLAGS_1 flags present in the .dynamic section may need to be modified: the DF_1_PIE and DF_1_NOOPEN bits must be removed if set. This last flag prevents an object from being loaded via dlopen().

If the binary features constructors or destructors, those may not expect to be called from dlopen(). The Witchcraft Linker, therefore, features an optional command line argument to prevent constructors and destructors from being called. Within the .dynamic section, setting the values of DT_INIT_ARRAYSZ and DT_INIT_ARRAY to zero inhibits the instantiation of constructors, and setting the values of DT_FINI_ARRAYSZ and DT_FINI_ARRAY to zero inhibits the calls to destructors.

Finally, because the dynamic linker may refuse to load multiple versions of symbols if symbols versioning is in use within the libified binary, the Witchcraft Linker will simply zero out the entire section of type SHT_GNU_versym.

Currently, the Witchcraft Linker (wld) version v0.0.6 can libify all of the binaries of a standard GNU/Linux distribution such as Ubuntu 22.04 LTS, so that they may be loaded via the dlopen() function of the GNU dynamic linker version 2.35.

3.3 Toward Procedural Debugging

Once the principle of libifying an ELF has been acquired, writing a debugger capable of loading a libified executable in its own address space is straightforward: simply load the libified binary via the dlopen() function of the dynamic linker.

It appears appealing to integrate an interpreter into our debugger to allow a developer to interact with the functions exposed by the libified binary. Due to its tiny size, the choice of interpreter fell on the Lua language [25] since a Lua interpreter, including all its dependencies, occupies less than 500 kilobytes of memory footprint.

We wish to make the entire API available in the address space available to the Lua interpreter once the libified binary is loaded in memory. This API is made up, on the one hand, of the functions exported directly by the libified binary but also of the APIs exported by all the dynamic libraries loaded in memory by the dynamic linker when loading the libified binary in memory via dlopen(). The case of functions declared static and hence not exported at compile time is left aside for now¹. Obtaining these APIs can be done via the use of the dlinfo() function of the dynamic linker [27] [45].

¹ Static functions whose addresses relative to the base address of libraries or executables are known thanks to a preliminary control flow analysis may be named and called within the debugger.

By making the entire API available in memory exposed to the Lua interpreter, we simply make these APIs available to the developer. One of the advantages of this methodology is that a developer or security analyst may invoke any function loaded in the address space without worrying too much about the actual calling conventions or prototypes (number and type of arguments) of these functions. Additionally, they may do so without compilation from a Lua interpreter, which facilitates manual exploration of said APIs.

We name this technique, which allows invoking a single function at a time, “procedural debugging”.

3.4 An Empirical Assessment of the Side Effects of Libification

In this section, we address the question of the side effects introduced by the libification of a binary over its main security hardening properties.

We successively consider the following properties: the base address of the executable mapping (ASLR), the presence of stack cookies aimed at preventing buffer overflows, the stack’s executability, the presence of static relocations (RELRO), and the presence of Control Flow Integrity type protections (Intel FCF).

Libification of an ET_DYN binary does not modify its ASLR properties: the binary being initially mappable to an arbitrary address remains so. In the case of the libification of an ET_EXEC binary, which was initially only mappable to a fixed address, the ASLR is not modified either: the library



thus generated is only mappable at the same address. Loading happens as if the binary had been transformed into a library by prelinking to the same base address [26].

The stack executability of a library loaded via dlopen() is determined by the stack executability of our debugger since the latter loads the library in its own address space. This debugger property can be arbitrarily changed via the execstack² application.

² https://linux.die.net/man/8/execstack

Libification does not modify the presence of static relocations (binary or library with the BIND_NOW flag in their dynamic sections).

The presence of stack cookies protecting the stack is intrinsic to each function since it is implemented by instrumenting the prologue and epilogue of each function. Libification, therefore, does not modify this property of the functions present in the libified binary.

The presence of Intel Integrity Protection (Intel FCF) type protections is characterized by the presence of endbr64 instructions at the start of each basic block in each protected function. Libification does not modify this intrinsic property either.

Finally, this empirical study overall suggests that libifying an ELF binary into a shared library does not modify its fundamental security properties, particularly the countermeasures possibly introduced into the binary at compile time. In a nutshell, libification does not introduce notable security side effects from an exploitability standpoint.

3.5 Limits to Binary Libification

Libifying an ET_EXEC binary as a shared library generates a somewhat special shared library since it cannot be remapped to an arbitrary address. This induces a limit to our libifier. On the one hand, a libified library can generate a collision with the address space of a program trying to load it, as noted by beta testers³. On the other hand, it is not possible to load two libified ET_EXEC binaries initially provided with the same base address in our debugger.

³ Thanks to Dan Kaminsky https://github.com/endrazine/wcc/issues/26

3.6 Validation

The libification process and the WSH debugger were validated under GNU/Linux Ubuntu 22.04 equipped with an Intel 64-bit processor using the following binaries:

    Software         Version          Status   Time
    Google Chrome    114.0.5735.198   OK       < 0.01s
    OpenSSH Server   8.9p1            OK       < 0.01s
    Apache2          2.4.52           OK       < 0.01s
    Nginx            1.18.0           OK       < 0.01s
    GCC              11.4.0           OK       < 0.01s

Furthermore, copying, libifying, and loading via dlopen() the 435 binaries in the default path of a default Ubuntu 22.04 AMD64 install took less than 3 seconds (in total) on a laptop featuring a Core i7-11850H CPU and 32 GB of RAM.

4 Conclusion and Future Work

In this article, we presented a methodology to transform a dynamically linked ELF binary into a shared library. We called this methodology “libification”.

We then introduced a very simple debugger able to load such a libified executable within its own address space, hence rendering nonstatic functions within the binary callable. We named this technique, which facilitates the invocation of arbitrary functions in isolation and out of context, “procedural debugging”.

Thus, a security analyst seeking to manually experiment with a possible vulnerability within an executable may now directly invoke the function featuring the vulnerability via procedural debugging without needing to produce user inputs traversing the application’s call graph before reaching the vulnerable function. This is notable since the reachability problem is undecidable in general.

We verified the reproducibility of the libification process on some of the most complex user-mode binaries available under GNU/Linux, as well as across an entire widespread Linux distribution, which validates the generality of the approach.

In the future, we hope to be able to automatically generate scripts to trigger a vulnerability within a compiled binary, which would save significant time for Product Security teams.

Availability

The Witchcraft Compiler Collection [3], including the Witchcraft Linker described in this article, is published under a permissive dual MIT/BSD open-source license. The framework is available from https://github.com/endrazine/wcc and via the package managers of several GNU/Linux distributions, including at least Debian, Ubuntu, and Arch Linux.

References

[1] Salman Ahmed, Long Cheng, Hans Liljestrand, N Asokan, and Danfeng Daphne Yao. Tutorial: Investigating advanced exploits for system security assurance. In 2021 IEEE Secure Development Conference (SecDev), pages 3–4. IEEE, 2021.

[2] Erick Bauman, Zhiqiang Lin, Kevin W Hamlen, et al. Superset disassembly: Statically rewriting x86 binaries without heuristics. In NDSS, 2018.

[3] Jonathan Brossard. The witchcraft compiler collection. https://zenodo.org/doi/10.5281/zenodo.11298208, May 2024.

[4] Derek Bruening and Saman Amarasinghe. Efficient, transparent, and comprehensive runtime code manipulation. 2004.

[5] Derek Bruening and Timothy Garnett. Building dynamic instrumentation tools with dynamorio. In Proc. Int. Conf. IEEE/ACM Code Generation and Optimization (CGO), Shen Zhen, China, 2013.

[6] Derek Bruening and Qin Zhao. Building dynamic instrumentation tools with dynamorio.

[7] Nathan Burow, Scott A Carr, Joseph Nash, Per Larsen, Michael Franz, Stefan Brunthaler, and Mathias Payer. Control-flow integrity: Precision, security, and performance. ACM Computing Surveys (CSUR), 50(1):1–33, 2017.

[8] Mauro Capeletti. Unlinker: an approach to identify original compilation units in stripped binaries. 2016.

[9] Sang Kil Cha, Thanassis Avgerinos, Alexandre Rebert, and David Brumley. Unleashing mayhem on binary code. In 2012 IEEE Symposium on Security and Privacy, pages 380–394. IEEE, 2012.

[10] S Sibi Chakkaravarthy, Dhamodara Sangeetha, and V Vaidehi. A survey on malware analysis and mitigation techniques. Computer Science Review, 32:1–23, 2019.

[11] Buddhika Chamith, Bo Joel Svensson, Luke Dalessandro, and Ryan R Newton. Instruction punning: Lightweight instrumentation for x86-64. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 320–332, 2017.

[12] Long Cheng, Salman Ahmed, Hans Liljestrand, Thomas Nyman, Haipeng Cai, Trent Jaeger, N Asokan, and Danfeng Yao. Exploitation techniques for data-oriented attacks with existing and potential defense approaches. ACM Transactions on Privacy and Security (TOPS), 24(4):1–36, 2021.

[13] Tsung-Huan Cheng, Ying-Dar Lin, Yuan-Cheng Lai, and Po-Ching Lin. Evasion techniques: Sneaking through your intrusion detection/prevention systems. IEEE Communications Surveys & Tutorials, 14(4):1011–1020, 2011.

[14] Greg Cooper. Dtrace: dynamic tracing in oracle solaris, mac os x, and free bsd by brendan gregg and jim mauro. ACM SIGSOFT Software Engineering Notes, 37:34, 2012.

[15] Crispin Cowan, Steve Beattie, Ryan Finnin Day, Calton Pu, Perry Wagle, and Erik Walthinsen. Protecting systems from stack smashing attacks with stackguard. In Linux Expo, 1999.

[16] Robin David, Jonathan Salwan, and Justin Bourroux. From source code to crash test-cases through software testing automation. Proceedings of the 28th C&ESAR, page 27, 2021.

[17] Sushant Dinesh, Nathan Burow, Dongyan Xu, and Mathias Payer. Retrowrite: Statically instrumenting cots binaries for fuzzing and sanitization. In 2020 IEEE Symposium on Security and Privacy (SP), pages 1497–1511. IEEE, 2020.

[18] Aleksandra Doniec. Converts a exe into dll. https://github.com/hasherezade/exe_to_dll, 2020.

[19] Gregory J Duck, Xiang Gao, and Abhik Roychoudhury. Binary rewriting without control flow recovery. In Proceedings of the 41st ACM SIGPLAN conference on programming language design and implementation, pages 151–163, 2020.

[20] Thomas Dullien. Weird machines, exploitability, and provable unexploitability. IEEE Transactions on Emerging Topics in Computing, 8(2):391–403, 2017.

[21] Hiroaki Etoh and Kunikazu Yoda. Propolice: Protecting from stack-smashing attacks. Technical Report, IBM Research Division, Tokyo Research Laboratory, 2000.

[22] Isaac Evans, Fan Long, Ulziibayar Otgonbaatar, Howard Shrobe, Martin Rinard, Hamed Okhravi, and Stelios Sidiroglou-Douskos. Control jujutsu: On the weaknesses of fine-grained control flow integrity. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 901–913, 2015.

[23] Jeff Gennari, Shaun Hedrick, Frederick W Long, Justin Pincar, and Robert C Seacord. Ranged integers for the c programming language. 2007.

[24] Matthew Green, Mathias Hall-Andersen, Eric Hennenfent, Gabriel Kaptchuk, Benjamin Perez, and Gijs Van Laer. Efficient proofs of software exploitability for real-world processors. Proceedings on Privacy Enhancing Technologies, 2023.

[25] Roberto Ierusalimschy. Programming in lua. Roberto Ierusalimschy, 2006.

[26] Changhee Jung, Duk-Kyun Woo, Kanghee Kim, and Sung-Soo Lim. Performance characterization of prelinking and preloading for embedded systems. In Proceedings of the 7th ACM & IEEE international conference on Embedded software, pages 213–220, 2007.

[27] David Keller, Timothy Roscoe, Reto Achermann, and Simon Gerber. Bachelor’s thesis nr. 137b.

[28] Mahmood Jasim Khalsan and Michael Opoku Agyeman. An overview of prevention/mitigation against memory corruption attack. In Proceedings of the 2nd International Symposium on Computer Science and Intelligent Control, pages 1–6, 2018.

[29] Sandeep Kumar, Diksha Moolchandani, and Smruti R Sarangi. Hardware-assisted mechanisms to enforce control flow integrity: A comprehensive survey. Journal of Systems Architecture, 130:102644, 2022.

[30] Marc-André Laverdière, Serguei A Mokhov, and Djamel Benredjem. On implementation of a safer c library, iso/iec tr 24731. arXiv preprint arXiv:0906.2512, 2009.

[31] Yoochan Lee, Changwoo Min, and Byoungyoung Lee. ExpRace: Exploiting kernel races through raising interrupts. In 30th USENIX Security Symposium (USENIX Security 21), pages 2363–2380, 2021.

[32] Shuaibing Lu, Xiaohui Kuang, Yuanping Nie, and Zhechao Lin. A hybrid interface recovery method for android kernels fuzzing. In 2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS), pages 335–346. IEEE, 2020.

[33] Héctor Marco-Gisbert and Ismael Ripoll-Ripoll. Sspfa: effective stack smashing protection for android os. International Journal of Information Security, 18(4):519–532, 2019.

[34] Jonathan AP Marpaung, Mangal Sain, and Hoon-Jae Lee. Survey on malware evasion techniques: State of the art and challenges. In 2012 14th International Conference on Advanced Communication Technology (ICACT), pages 744–749. IEEE, 2012.

[35] Xiaozhu Meng and Barton P Miller. Binary code is not easy. In Proceedings of the 25th International Symposium on Software Testing and Analysis, pages 24–35, 2016.

[36] Nicholas Nethercote. Dynamic binary analysis and instrumentation. Technical report, University of Cambridge, Computer Laboratory, 2004.

[37] Trail of Bits. Mcsema. https://github.com/lifting-bits/mcsema, 2020.

[38] Tavis Ormandy. Porting windows dynamic link libraries to linux. https://github.com/taviso/loadlibrary, 2017.

[39] Conor Pirry, Hector Marco-Gisbert, and Carolyn Begg. A review of memory errors exploitation in x86-64. Computers, 9(2):48, 2020.

[40] Yefeng Ruan, Sivapriya Kalyanasundaram, and Xukai Zou. Survey of return-oriented programming defense mechanisms. Security and Communication Networks, 9(10):1247–1265, 2016.

[41] Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. AddressSanitizer: A fast address sanity checker. In 2012 USENIX annual technical conference (USENIX ATC 12), pages 309–318, 2012.

[42] Julian Seward and Nicholas Nethercote. Using valgrind to detect undefined value errors with bit-precision. In USENIX Annual Technical Conference, General Track, pages 17–30, 2005.

[43] Zhidong Shen and Weiying Chen. A survey of research on runtime rerandomization under memory disclosure. IEEE Access, 7:105432–105440, 2019.

[44] Octavian Suciu, Connor Nelson, Zhuoer Lyu, Tiffany Bao, and Tudor Dumitraș. Expected exploitability: Predicting the development of functional vulnerability exploits. In 31st USENIX Security Symposium (USENIX Security 22), pages 377–394, 2022.

[45] Justin Tracey. Building a better tor experimentation platform from the magic of dynamic elfs. Master’s thesis, University of Waterloo, 2017.

[46] Fish Wang and Yan Shoshitaishvili. Angr-the next generation of binary analysis. In 2017 IEEE Cybersecurity Development (SecDev), pages 8–9. IEEE, 2017.

[47] Ruoyu Wang, Yan Shoshitaishvili, Antonio Bianchi, Aravind Machiry, John Grosen, Paul Grosen, Christopher Kruegel, and Giovanni Vigna. Ramblr: Making reassembly great again. In NDSS, 2017.

[48] Shuai Wang, Pei Wang, and Dinghao Wu. Reassembleable disassembling. In 24th USENIX Security Symposium (USENIX Security 15), pages 627–642, 2015.

[49] Yan Wang, Wei Wu, Chao Zhang, Xinyu Xing, Xiaorui Gong, and Wei Zou. From proof-of-concept to exploitable. Cybersecurity, 2(1):1–25, 2019.

[50] Ye Wang, Qingbao Li, Zhifeng Chen, Ping Zhang, and Guimin Zhang. A survey of exploitation techniques and defenses for program data attacks. Journal of Network and Computer Applications, 154:102534, 2020.

[51] Wei Wu, Yueqi Chen, Jun Xu, Xinyu Xing, Xiaorui Gong, and Wei Zou. FUZE: Towards facilitating exploit generation for kernel Use-After-Free vulnerabilities. In 27th USENIX Security Symposium (USENIX Security 18), pages 781–797, 2018.

[52] Meng Xu. Finding race conditions in kernels: The symbolic way and the fuzzy way. 2020.

[53] Yves Younan, Davide Pozza, Frank Piessens, and Wouter Joosen. Extended protection against stack smashing attacks without performance loss. In 2006 22nd Annual Computer Security Applications Conference (ACSAC’06), pages 429–438. IEEE, 2006.

[54] Michal Zalewski. Afl: American fuzzy lop. https://lcamtuf.coredump.cx/afl/, 2016.

[55] Google Project Zero. Winafl. https://github.com/googleprojectzero/winafl, 2016.
