As part of a university class, we were tasked with writing an eBPF application. This not only gave me the opportunity to learn more about eBPF, but also to improve my skills in programming inside the Linux kernel. In addition, I got to learn more about the ncurses library, which was a great addition to the finished program. The following blog post outlines the background and functionality of the eBPF Linux Kernel Exploit Scanner program. As its name suggests, the tool detects some types of memory management and privilege escalation exploits in the Linux kernel.
This is the write-up for the project. The full code is available on GitHub.
The project was created by a group of students that I was part of. I want to thank everyone for their work and for the permission to publish the code on my personal blog.
Modern operating systems require tools to monitor performance, debug issues, and enhance security. Traditionally, kernel instrumentation techniques rely on kernel modules or custom kernel builds, which are difficult to deploy, error-prone, and can be unsafe. These limitations motivate the need for mechanisms that allow dynamic, safe, and efficient inspection of kernel behavior. Extended Berkeley Packet Filter (eBPF) addresses these challenges by allowing user-defined programs to run inside the Linux kernel in a sandboxed environment. eBPF programs can be attached to kernel hooks such as system calls, tracepoints and network events, enabling fine-grained monitoring without requiring kernel recompilation or a reboot. The motivation behind this project is to gain hands-on experience with eBPF by implementing a small yet functional program that demonstrates eBPF's core capabilities. By developing a minimal eBPF application, the aim was to understand how eBPF programs interact with the kernel, how data is safely transferred between kernel space and user space, and what implications arise from such instrumentation.
According to its own documentation, eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring changes to kernel source code or loading kernel modules. As I understand it, eBPF programs allow us to run code quickly and safely in a sandboxed environment in kernel space. This is quite handy, since it addresses two main hurdles of running code in the kernel: security and complexity.
One part of the eBPF logic consists of detecting memory management exploits. In this work, the team intentionally focused on one group of such exploits, namely Dirty COW. To detect Dirty COW exploits, the eBPF program monitors kernel events related to memory management and file write operations that are indicative of copy-on-write abuse. Dirty COW exploits rely on repeatedly triggering race conditions between memory write operations and memory mapping invalidation, often involving functions such as mmap, madvise, or page fault handling routines. The eBPF program is attached to selected kernel hooks associated with these operations (e.g., tracepoints or kprobes). It records relevant metadata such as process identifiers, target file identifiers, and the frequency of suspicious operations. This information is stored in eBPF maps for later analysis in user space. The detection logic focuses on identifying anomalous patterns, such as unusually high rates of memory advice calls combined with write attempts on read-only mappings. These patterns are characteristic of Dirty COW exploitation attempts and are unlikely to occur during normal application behaviour.
The second detection component targets privilege escalation attempts. Privilege escalation is one of the most common goals of kernel exploitation, since successful exploitation often enables an attacker to modify kernel-managed credential structures and gain root-level privileges. Unlike Dirty COW detection, which is focused on memory management patterns, privilege escalation detection focuses on monitoring changes in process identity and authorization state. The design monitors events associated with credential transitions, such as changes in user privileges, permission changes on files, and writes to sensitive locations. The eBPF program collects metadata whenever a process undergoes a privilege transition, including the process identifier, the old and new privilege values, and a timestamp. This information is forwarded to user space, where it can be evaluated against simple security policies. For example, the user-space program can raise an alert if certain actions are performed in succession. This approach provides visibility into the final security outcome of an exploit attempt, even when the exploit mechanism itself is unknown.
The design uses eBPF maps to store persistent process information in the kernel and to make this information accessible to user space. The user-space component reads the collected data continuously and outputs readable logs for monitoring and debugging. This separation between kernel-space collection and user-space analysis keeps the kernel program small and safe while still enabling meaningful detection logic.
This chapter covers the implementation of the memory management and privilege escalation detection modules, including the ncurses user-space frontend.
The memory management exploit detection component was implemented as an eBPF tracing program that monitors kernel activity associated with typical execution patterns of this exploit class, using the Dirty COW exploit family as a reference. The module focuses on detecting three key behaviors that are strongly correlated with the Dirty COW exploit: the use of madvise() with the MADV_DONTNEED flag, a high number of calls to mmap() with the PROT_READ and MAP_PRIVATE | MAP_ANONYMOUS flags, and continuous reads and writes to the process's own /proc/self/mem file.
To support tracking and correlation across multiple system calls, the implementation uses eBPF maps as shared state. A process statistics map (proc_stats_map in the project) stores per-process information such as the process ID, the comm name, and data related to the exploit indicators. In addition, the implementation defines a temporary hash map called mem_fd_map, which associates a process with the file descriptor returned when opening /proc/self/mem. This association is required because later syscalls such as write() only provide a file descriptor number and do not include the original filename.
The first hook is implemented using the tracepoint sys_enter_mmap. This hook executes every time a process calls mmap(), but only acts on calls with the specific PROT_READ and MAP_PRIVATE | MAP_ANONYMOUS flags. Inside the handler, the program extracts the calling process ID and reads the third and fourth arguments (prot and flags) from the ctx struct provided by eBPF. The implementation then checks whether these arguments match the values we want to track. This call pattern is used by the Dirty COW exploit family to set up the private, read-only mapping that the race condition later targets.
The second hook uses the tracepoint sys_enter_madvise. The principle is the same as before: we check a specific function argument, in this case the third argument, named advice, against the value we are interested in. In the context of Dirty COW, we check whether the advice argument is equal to MADV_DONTNEED, which the exploit passes repeatedly to discard pages and increase the likelihood of winning the race condition.
Lastly, the third group of hooks works together to monitor opens of and writes to the /proc/self/mem file of the calling process. This is done using the tracepoints sys_enter_openat, sys_exit_openat and sys_enter_write. The first two determine whether the file was opened, by checking the path argument passed to openat and recording its return value, which is the file descriptor referring to /proc/self/mem. If the opened file is not /proc/self/mem, the handler exits. If the file matches, the program stores the file descriptor in a temporary, kernel-only map (mem_fd_map) and updates the process entry in proc_stats_map to mark that the process has opened /proc/self/mem. The sys_enter_write handler then checks whether the same file descriptor is passed to write(), which correlates the open event with subsequent write activity.
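The three indicator checks can be modelled in ordinary user-space C, which makes the logic easy to try out without loading anything into the kernel. The sketch below is not the project's eBPF code: a fixed-size array stands in for mem_fd_map, and the function names (is_tracked_mmap, is_tracked_madvise, is_proc_self_mem, remember_mem_fd, is_mem_write) are hypothetical. Whether the real handler compares prot for exact equality is also an assumption.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and MADV_* in strict modes */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Indicator 1: mmap() asking for a read-only, private, anonymous mapping. */
bool is_tracked_mmap(unsigned long prot, unsigned long flags)
{
    const unsigned long wanted = MAP_PRIVATE | MAP_ANONYMOUS;
    return prot == PROT_READ && (flags & wanted) == wanted;
}

/* Indicator 2: madvise() called with MADV_DONTNEED as its advice argument. */
bool is_tracked_madvise(int advice)
{
    return advice == MADV_DONTNEED;
}

/* Indicator 3: array stand-in for mem_fd_map, one entry per PID. */
#define MAX_TRACKED 64
struct mem_fd_entry { int pid; int fd; bool used; };
static struct mem_fd_entry mem_fd_table[MAX_TRACKED];

/* Step for sys_enter_openat: is this the path we care about? */
bool is_proc_self_mem(const char *path)
{
    assert(path != NULL);
    return strcmp(path, "/proc/self/mem") == 0;
}

/* Step for sys_exit_openat: remember the returned descriptor for this PID. */
bool remember_mem_fd(int pid, int fd)
{
    if (fd < 0)
        return false; /* openat() failed, nothing to track */
    struct mem_fd_entry *free_slot = NULL;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (mem_fd_table[i].used && mem_fd_table[i].pid == pid) {
            mem_fd_table[i].fd = fd; /* overwrite, like a map update */
            return true;
        }
        if (!mem_fd_table[i].used && free_slot == NULL)
            free_slot = &mem_fd_table[i];
    }
    if (free_slot == NULL)
        return false; /* table full */
    *free_slot = (struct mem_fd_entry){ .pid = pid, .fd = fd, .used = true };
    return true;
}

/* Step for sys_enter_write: does this write target the remembered fd? */
bool is_mem_write(int pid, int fd)
{
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (mem_fd_table[i].used && mem_fd_table[i].pid == pid &&
            mem_fd_table[i].fd == fd)
            return true;
    }
    return false;
}
```

In an actual eBPF program the path would be copied from user memory with a bounded read helper and compared byte by byte instead of using strcmp(), and the array would be replaced by a BPF hash map.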
The first monitored stage is credential modification, implemented using a kprobe attached to the kernel function commit_creds. Although it is a commonly used hook, it is a high-value one, because many kernel privilege escalations eventually call commit_creds() to elevate their permissions. When the kprobe triggers, the old and new UID are read using BPF_CORE_READ. If the new UID is equal to 0, a new entry is created for the executing TGID if none exists already, and a timestamp of the execution is written into the pid_record_map. In isolation, this event is assigned a baseline severity of LOW because legitimate system operations can also trigger credential changes, but it becomes more suspicious when correlated with other signals. If the same TGID has already triggered any of the other hooks, the severity is adjusted accordingly.
The second stage monitors suspicious permission changes involving SUID or SGID bits. This is implemented using the tracepoint sys_enter_fchmodat. The handler reads the mode argument and checks whether the SUID/SGID bits are set by masking the mode with 06000 (04000 for setuid, 02000 for setgid). If the chmod operation does not set these bits, the handler exits early. If it does, the program creates a new pid_record in the pid_record_map if none is present for the current TGID. If there is already an entry for the current TGID, the execution timestamp is updated. This event is assigned a baseline severity of LOW but is elevated to MEDIUM if it occurs within the correlation window of a credential change, since the combination of commit_creds activity and SUID/SGID changes can indicate an attempt to establish persistent privilege escalation.
The third stage detects the execution of a root shell. This is implemented using the syscall tracepoint sys_enter_execve. The handler first checks whether the calling process has UID 0, ensuring that only root shells are considered. It then applies a lightweight string-based filter to detect common shells such as sh, bash, and zsh using the helper function is_shell(), which reads the filename string from user space via bpf_probe_read_user_str. If both conditions match, the program updates the root_shell_timestamp in the per-process state. This event is assigned a baseline severity of LOW, but it is elevated to MEDIUM if it occurs within the correlation window of a recent commit_creds event. Additionally, the detector assigns HIGH severity if the process exhibits a more complete escalation chain, specifically when a credential change and a SUID/SGID chmod are observed and are followed by either a root shell execution or a sensitive file write.
The second operation that can elevate the severity to HIGH is writing to a sensitive file. For this proof of concept, /etc/sudoers, /etc/shadow and /etc/passwd were selected. This is implemented using the syscall tracepoint sys_enter_openat. The handler checks the open flags to determine whether the file is being opened in a write-capable mode. This is done using is_write_flags(), which checks for write access modes such as O_WRONLY and O_RDWR, and for modifiers such as O_APPEND or O_TRUNC. The helper function is_sensitive_path() determines whether one of the selected paths is targeted by the action. This function performs a safe string copy from user space and uses manual character comparisons rather than higher-level library calls, ensuring verifier compatibility. If a sensitive file is opened for writing, the program updates the sensitive_write_timestamp in the per-process state and emits an event. This stage is assigned a baseline severity of MEDIUM because writing to these files is a strong indicator of privilege escalation or persistence behavior, even if it occurs without other correlated signals.
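The checks and the severity rules described above are easier to follow when written out. The following is a user-space C model, not the project's eBPF code. The helper names is_write_flags and is_sensitive_path and the fields root_shell_timestamp and sensitive_write_timestamp come from the text; everything else (sets_suid_or_sgid, path_equals, classify, the other two timestamp fields, the 5-second window, and the choice not to apply the window to the HIGH rule) is an assumption made for this sketch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stage 2: does the chmod mode set SUID (04000) or SGID (02000)? */
bool sets_suid_or_sgid(unsigned int mode)
{
    return (mode & 06000) != 0;
}

/* Sensitive write: write access mode, or a modifier that alters the file. */
bool is_write_flags(int flags)
{
    int acc = flags & O_ACCMODE;
    return acc == O_WRONLY || acc == O_RDWR ||
           (flags & (O_APPEND | O_TRUNC)) != 0;
}

/* Bounded, manual comparison in the style the verifier accepts. */
static bool path_equals(const char *path, const char *want)
{
    for (size_t i = 0; i < 64; i++) {
        if (path[i] != want[i])
            return false;
        if (path[i] == '\0')
            return true;
    }
    return false;
}

bool is_sensitive_path(const char *path)
{
    assert(path != NULL);
    return path_equals(path, "/etc/sudoers") ||
           path_equals(path, "/etc/shadow") ||
           path_equals(path, "/etc/passwd");
}

#define CORRELATION_WINDOW_NS 5000000000ULL /* hypothetical: 5 seconds */

enum severity { SEV_NONE, SEV_LOW, SEV_MEDIUM, SEV_HIGH };

struct pid_record {                     /* a timestamp of 0 means "not seen" */
    uint64_t commit_creds_timestamp;    /* hypothetical field name */
    uint64_t suid_chmod_timestamp;      /* hypothetical field name */
    uint64_t root_shell_timestamp;
    uint64_t sensitive_write_timestamp;
};

static bool within_window(uint64_t a, uint64_t b)
{
    if (a == 0 || b == 0)
        return false;
    uint64_t diff = a > b ? a - b : b - a;
    return diff <= CORRELATION_WINDOW_NS;
}

enum severity classify(const struct pid_record *r)
{
    assert(r != NULL);
    bool cred = r->commit_creds_timestamp != 0;
    bool chmod_seen = r->suid_chmod_timestamp != 0;

    /* HIGH: credential change and SUID/SGID chmod, then shell or write. */
    if (cred && chmod_seen) {
        uint64_t chain = r->commit_creds_timestamp > r->suid_chmod_timestamp
                             ? r->commit_creds_timestamp
                             : r->suid_chmod_timestamp;
        if (r->root_shell_timestamp >= chain ||
            r->sensitive_write_timestamp >= chain)
            return SEV_HIGH;
    }
    /* MEDIUM: a sensitive write alone, or a correlated pair of events. */
    if (r->sensitive_write_timestamp != 0)
        return SEV_MEDIUM;
    if (within_window(r->commit_creds_timestamp, r->suid_chmod_timestamp) ||
        within_window(r->commit_creds_timestamp, r->root_shell_timestamp))
        return SEV_MEDIUM;
    /* LOW: any single remaining signal in isolation. */
    if (cred || chmod_seen || r->root_shell_timestamp != 0)
        return SEV_LOW;
    return SEV_NONE;
}
```

Evaluating the rules from most to least severe means a record always reports the highest level it qualifies for, which matches the escalation behaviour described in the text.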
Both modules, including the maps needed in user and kernel space, are loaded by a loader, which automatically injects all the required logic into the kernel of the running host.
The kernel-space UI is limited to running `sudo cat /sys/kernel/debug/tracing/trace_pipe` in a separate terminal while the eBPF modules are loaded. Several printk-style calls in the eBPF code then print their output directly to that buffer. This is the minimum required to run the program and see its output. For better readability, consider using the user-space UI.
The same data gathered in kernel space can be presented in user space. This was done using the ncurses library in C. There are three sub-windows, each presenting different data. The log window prints the processes that were tagged during the program's runtime, and the two smaller windows show medium and high severity processes separately from the rest (except for privilege escalation exploits, where low severity processes are shown as well). Depending on the threat indicator, a process is shown in either the Memory Management or the Privilege Escalation window. Note, however, that using the TUI is not necessary for the program to function. The ncurses UI was primarily built out of interest and to present the user with cleaner output.
The repository contains two Makefiles, which can be used to compile the program components. One is used to compile user space dependencies while the second one, residing in the ./module folder, is used to compile the eBPF part of the program.
Some of the planned features had to be dropped due to time constraints. Some of them are listed below: