+++
title = 'Handy tracing tools with eBPF'
date = 2024-11-17T14:36:07+01:00
draft = true
tags = ['ebpf', 'tracing', 'network', 'kernel']
+++

[eBPF](https://ebpf.io/) allows event-driven programs, written in high-level languages, to be attached to pre-defined hooks such as syscalls, function invocations, and network events. The technology makes it possible to build, from user space, many tools which previously required a kernel implementation or module.

While researching the technology and scoping out potentially interesting use-cases, I've discovered that the eBPF ecosystem comes with a collection of simple but useful tracing tools. I suspect that I'll be reaching for these frequently in the future, especially for those tricky bugs where more traditional debugging techniques fail to deliver.

The introspection provided by many, if not all, of these tools was previously achievable in other ways. However, that often meant using tooling such as [strace](https://man7.org/linux/man-pages/man1/strace.1.html), which has performance issues and is not always a convenient choice. Note that eBPF is a Linux kernel technology, so the tools covered here are Linux-only; on macOS, similar introspection has traditionally come from DTrace-based tools.

## Installing bcc-tools

The tooling is provided by the [bcc-tools](https://github.com/iovisor/bcc) package. To install on Arch Linux:

```bash
% pacman -S bcc bcc-tools python-bcc linux-headers
# required for bashreadline:
% pacman -S python-pyelftools
```

The tools will be installed to `/usr/share/bcc/tools`. Inconveniently, this is probably outside of your search path. It's easy to add this path to your `PATH` variable:

```bash
% export PATH=/usr/share/bcc/tools:$PATH
```

Now, let's try the `execsnoop` command. This tool traces the creation of new processes.

```sh
% execsnoop
bpf: Failed to load program: Operation not permitted
Traceback (most recent call last):
  File "/usr/share/bcc/tools/execsnoop", line 268, in <module>
    b.attach_kprobe(event=execve_fnname, fn_name="syscall__execve")
  File "/usr/lib/python3.12/site-packages/bcc/__init__.py", line 851, in attach_kprobe
    fn = self.load_func(fn_name, BPF.KPROBE)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/bcc/__init__.py", line 523, in load_func
    raise Exception("Need super-user privileges to run")
Exception: Need super-user privileges to run
```

Running code which installs eBPF hooks requires root privileges or the `CAP_BPF` capability. When running outside of containerized environments this probably means running bcc tools with `sudo`. However, when I tried to run the `execsnoop` tool with sudo:

```bash
% sudo execsnoop
sudo: execsnoop: command not found
```

Of course, this is a common issue with sudo. Many Linux distributions are configured to not preserve environment variables, including `$PATH`, when running commands with `sudo`. Somehow I've managed to live with this over the years with a number of unsatisfying workarounds, but the prospect of running this tooling with minimal friction finally inspired me to find a better way. This [was achieved](https://git.netflux.io/rob/dotfiles/commit/c39ec29b21751744a645b9bef5ba06d5fabee9bf) with a simple alias:

```bash
alias sudop='sudo env PATH=$PATH'
```

After adding the alias, it's easy to run a bcc tool: :tada:

```bash
% sudop execsnoop
PCOMM            PID     PPID    RET ARGS
```

## Useful commands

This section only scratches the surface of what's available out-of-the-box with bcc-tools. Most of my inspiration came from Brendan Gregg's [blog post](https://www.brendangregg.com/ebpf.html).

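To get a feel for how much more is in there, a small shell helper can list whatever is installed in the tools directory. This is only a sketch: `list_bcc_tools` is a hypothetical name of my own, not something shipped with bcc, and it assumes the `/usr/share/bcc/tools` location used above (other distributions may install the tools elsewhere).

```bash
# list_bcc_tools is a hypothetical helper, not part of bcc.
# It prints the names of the executable files in the tools directory,
# defaulting to the install path used earlier in this post (an assumption
# that may not hold on other distributions).
list_bcc_tools() {
  local dir="${1:-/usr/share/bcc/tools}"
  local f
  for f in "$dir"/*; do
    # skip subdirectories and anything that is not executable
    if [ -f "$f" ] && [ -x "$f" ]; then
      basename "$f"
    fi
  done
}
```

Piping the result through `grep snoop` (that is, `list_bcc_tools | grep snoop`) should narrow the list down to the snoop-style tracers covered below.
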
### Tracing newly created processes

As mentioned above, you can watch process creation with `execsnoop`, which is named after the `exec` family of syscalls:

```bash
% sudop execsnoop
PCOMM            PID     PPID    RET ARGS
sh               497415  2580      0 /bin/sh -c tmbatinfo
tmbatinfo        497415  2580      0 /home/rob/script/tmbatinfo
bash             497415  2580      0 /usr/bin/bash /home/rob/script/tmbatinfo
batinfo          497417  497415    0 /home/rob/script/batinfo
bash             497417  497415    0 /usr/bin/bash /home/rob/script/batinfo
sh               497416  2580      0 /bin/sh -c sysinfo
sysinfo          497416  2580      0 /home/rob/script/sysinfo
```

This outputs the following columns:

* `PCOMM`: some googling suggests this should be the _parent_ command name, but it seems to me to be the name of the launched (child) command.
* `PID`: the PID of the launched process.
* `PPID`: the PID of the parent process.
* `RET`: I assume this is the return value of the syscall.
* `ARGS`: the arguments provided to the syscall, which at least in my experiments seem to include the file or path of the command as well as its arguments.

Importantly, the `-x` flag can be passed to make `execsnoop` show failed syscall attempts as well as those that are successful.

### Tracing open files

Similarly, `opensnoop` allows tracing of the `open` syscall, which is used to open files. I used to rely heavily on the pre-eBPF, DTrace-based implementation of `opensnoop` back in my Mac days, and it's nice to discover that I can call upon it from Linux.

```bash
% sudop opensnoop
PID    COMM               FD ERR PATH
347357 ThreadPoolForeg    27   0 /home/rob/.cache/google-chrome/Default/Cache/Cache_Data/1b2c2ddd2d7ef7c5_0
347357 Chrome_ChildIOT    32   0 /dev/shm/.com.google.Chrome.ZkPQQO
347312 ThreadPoolForeg   103   0 /home/rob/.config/google-chrome/Default/.com.google.Chrome.lWY8pP
347312 ThreadPoolForeg   114   0 /dev/shm/.com.google.Chrome.aLGFXG
347312 ThreadPoolForeg   192   0 /proc/347426/stat
347312 ThreadPoolSingl   192   0 /proc/347426/task/347426/status
347312 chrome            192   0 /dev/shm/.com.google.Chrome.ddntL8
347312 ThreadPoolForeg   109   0 /proc/347426/stat
347312 ThreadPoolForeg   103   0 /proc/347426/stat
347312 ThreadPoolForeg   103   0 /home/rob/.config/google-chrome/Default/Extensions/fmkadmapgofadopljbjfkapdkoienihi/6.0.1_0/build/proxy.js
347312 ThreadPoolForeg   103   0 /home/rob/.config/google-chrome/Default/Extensions/fmkadmapgofadopljbjfkapdkoienihi/6.0.1_0/build/fileFetcher.js
```

The columns are similar to `execsnoop`:

* `PID`: the PID of the process calling `open`.
* `COMM`: the name of the calling process.
* `FD`: the process-scoped file descriptor returned by the call.
* `ERR`: error code, 0 for success.
* `PATH`: the path of the file being opened, which may of course live on a virtual filesystem such as `/proc`.

### Tracing outgoing TCP connections

With `tcpconnect` we can trace outgoing TCP connections (using the `connect` syscall), in this case triggered by a `curl https://google.com` from my local network address (`192.168.1.147`) to Google's server at `172.217.17.14:443`.

```bash
% sudop tcpconnect
Tracing connect ... Hit Ctrl-C to end
PID     COMM         IP SADDR            DADDR            DPORT
515773  curl         4  192.168.1.147    172.217.17.14    443
```

### Tracing incoming TCP connections

Similarly, with `tcpaccept` we can trace incoming TCP connections (using the `accept` syscall). In this case, I used Ruby to spin up an HTTP server on port 8001:

```bash
$ ruby -run -ehttpd . -p8001
```

And then again used `curl` to make a request, which was traced successfully:

```bash
% sudop tcpaccept
PID     COMM         IP RADDR            RPORT LADDR            LPORT
516996  ruby         6  ::1              41200 ::1              8001
```

## Conclusions

I am just starting to explore the world of eBPF, but I'm excited to have discovered a suite of lightweight system tracing utilities that I can see being a genuinely useful addition to my toolkit.

Once again, don't forget to check out the [eBPF website](https://ebpf.io) and [Brendan Gregg's blog](https://brendangregg.com/ebpf.html) to dive deeper.