+++
title = 'Handy tracing tools with eBPF'
date = 2024-11-17T14:36:07+01:00
draft = false
tags = ['ebpf', 'tracing', 'network', 'kernel']
+++
eBPF allows event-driven programs, written in high-level languages, to be attached to pre-defined hooks such as syscalls, function invocations, and network events. The programs are loaded from user space and run in a sandbox inside the kernel, which makes it possible to build many tools that previously required a custom kernel implementation or module.
While researching the technology and scoping out potentially interesting use-cases, I discovered that the eBPF ecosystem, by way of the BCC project, ships with a collection of simple but useful tracing tools. I suspect that I'll be reaching for these frequently in the future, especially for those tricky bugs where more traditional debugging techniques fail to deliver.
The introspection provided by many, if not all, of these tools was previously achievable in other ways. However, that often meant using tooling such as strace, which can slow the traced process considerably and is not always a convenient choice. Additionally, eBPF is not tied to a single operating system in principle: it originated on Linux, where the tools in this post run, and there is also an eBPF for Windows project. macOS does not support eBPF, although it has DTrace-based equivalents of several of these tools.
Installing bcc-tools
The tooling is provided by the bcc-tools package. To install on Arch Linux:
% pacman -S bcc bcc-tools python-bcc linux-headers
# required for bashreadline:
% pacman -S python-pyelftools
The tools will be installed to /usr/share/bcc/tools. Inconveniently, this is probably outside of your search path.
It's easy to add this path to your PATH variable:
% export PATH=/usr/share/bcc/tools:$PATH
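As a quick sanity check that the directory is now searched, here is a minimal Python sketch. The helper name `find_tool` is hypothetical and only wraps the standard library's `shutil.which`; the directory constant is the install location mentioned above.

```python
import shutil
from typing import Optional

# Install location mentioned above for the Arch Linux packages.
BCC_TOOLS_DIR = "/usr/share/bcc/tools"


def find_tool(name: str, search_path: Optional[str] = None) -> Optional[str]:
    """Return the full path of `name` if an executable with that name is
    found on the search path, otherwise None.

    With search_path=None the current PATH environment variable is used.
    """
    return shutil.which(name, path=search_path)
```

If the export above took effect, I would expect `find_tool("execsnoop")` to return a path under `BCC_TOOLS_DIR`, and `None` otherwise.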
Now, let's try the execsnoop command. This tool traces the creation of new processes.
% execsnoop
bpf: Failed to load program: Operation not permitted
Traceback (most recent call last):
  File "/usr/share/bcc/tools/execsnoop", line 268, in <module>
    b.attach_kprobe(event=execve_fnname, fn_name="syscall__execve")
  File "/usr/lib/python3.12/site-packages/bcc/__init__.py", line 851, in attach_kprobe
    fn = self.load_func(fn_name, BPF.KPROBE)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/bcc/__init__.py", line 523, in load_func
    raise Exception("Need super-user privileges to run")
Exception: Need super-user privileges to run
Running code which installs eBPF hooks requires root privileges or the CAP_BPF capability. When running outside of containerized environments this probably means running bcc tools with sudo. However, when I tried to run the execsnoop tool with sudo:
% sudo execsnoop
sudo: execsnoop: command not found
Of course, this is a common issue with sudo. Many Linux distributions are configured to not preserve environment variables, including $PATH, when running commands with sudo. Somehow I've managed to live with this over the years with a number of unsatisfying workarounds, but the prospect of being able to run this tooling with minimal friction finally provided the inspiration for me to find a better way. This was achieved with a simple alias:
alias sudop='sudo env "PATH=$PATH"'
After adding the alias, it's easy to run a bcc tool: 🎉
% sudop execsnoop
PCOMM PID PPID RET ARGS
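The same idea can be sketched in Python for use from scripts. This is a rough sketch, not a definitive launcher: the function names are hypothetical, the bit number for CAP_BPF (39) comes from linux/capability.h, and treating CAP_BPF on its own as sufficient simply follows the text above (a tracing tool may well need more than that in practice).

```python
import os
from typing import List, Optional, Sequence

CAP_BPF = 39  # bit number from linux/capability.h (Linux 5.8+)


def has_capability(cap_eff_hex: str, bit: int) -> bool:
    """Check one bit of a CapEff mask as printed in /proc/<pid>/status."""
    return bool(int(cap_eff_hex, 16) >> bit & 1)


def read_cap_eff(status_path: str = "/proc/self/status") -> str:
    """Return the effective capability mask as hex, or "0" if unavailable."""
    try:
        with open(status_path) as f:
            for line in f:
                if line.startswith("CapEff:"):
                    return line.split()[1]
    except OSError:
        pass
    return "0"


def build_command(
    tool: str,
    args: Sequence[str] = (),
    euid: Optional[int] = None,
    cap_eff: Optional[str] = None,
    path: Optional[str] = None,
) -> List[str]:
    """Build an argv list for running a bcc tool.

    If we are already root (or hold CAP_BPF, an assumption taken from the
    text), run the tool directly. Otherwise mirror the sudop alias:
    sudo env PATH=<current PATH> <tool> <args>.
    """
    euid = os.geteuid() if euid is None else euid
    cap_eff = read_cap_eff() if cap_eff is None else cap_eff
    path = os.environ.get("PATH", "") if path is None else path
    if euid == 0 or has_capability(cap_eff, CAP_BPF):
        return [tool, *args]
    return ["sudo", "env", f"PATH={path}", tool, *args]
```

The resulting list can be handed to `subprocess.run`. Building an argv list rather than a shell string sidesteps quoting problems if the PATH contains spaces.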
Useful commands
This section only scratches the surface of what's available out-of-the-box with bcc-tools. Most of my inspiration came from Brendan Gregg's blog post.
Tracing newly created processes
As mentioned above, you can watch process creation with execsnoop, which is named after the exec syscall:
% sudop execsnoop
PCOMM PID PPID RET ARGS
sh 497415 2580 0 /bin/sh -c tmbatinfo
tmbatinfo 497415 2580 0 /home/rob/script/tmbatinfo
bash 497415 2580 0 /usr/bin/bash /home/rob/script/tmbatinfo
batinfo 497417 497415 0 /home/rob/script/batinfo
bash 497417 497415 0 /usr/bin/bash /home/rob/script/batinfo
sh 497416 2580 0 /bin/sh -c sysinfo
sysinfo 497416 2580 0 /home/rob/script/sysinfo
This outputs columns:
PCOMM: some googling suggests this should be the parent command name, but it seems to me to be the name of the launched (child) command.
PID: the PID of the launched process.
PPID: the PID of the parent process.
RET: I assume this is the return value of the syscall.
ARGS: the arguments provided to the syscall, which at least in my experiments seem to include the file or path of the command as well as the arguments.
Importantly, the -x flag can be passed to make execsnoop show failed syscall attempts as well as those that are successful.
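To work with this output in a script, the columns can be split apart. Below is a minimal Python sketch, assuming the whitespace-separated layout shown above and a command name without spaces; `ExecEvent` and `parse_execsnoop_line` are hypothetical names, and the sample rows are copied from the output earlier in this section.

```python
from typing import NamedTuple, Optional


class ExecEvent(NamedTuple):
    pcomm: str
    pid: int
    ppid: int
    ret: int
    args: str


def parse_execsnoop_line(line: str) -> Optional[ExecEvent]:
    """Parse one row of execsnoop output, or return None for the header
    and any other line that does not match the expected columns."""
    parts = line.split(None, 4)  # ARGS keeps its internal spaces
    if len(parts) < 4:
        return None
    try:
        pid, ppid, ret = int(parts[1]), int(parts[2]), int(parts[3])
    except ValueError:
        return None  # e.g. the "PCOMM PID PPID RET ARGS" header
    args = parts[4].rstrip() if len(parts) == 5 else ""
    return ExecEvent(parts[0], pid, ppid, ret, args)


SAMPLE = """\
PCOMM PID PPID RET ARGS
sh 497415 2580 0 /bin/sh -c tmbatinfo
batinfo 497417 497415 0 /home/rob/script/batinfo
"""

events = [e for e in map(parse_execsnoop_line, SAMPLE.splitlines()) if e]
# With -x, rows whose RET is non-zero would be the failed attempts.
failed = [e for e in events if e.ret != 0]
```

Reading the live output line by line from a `subprocess.Popen` pipe and feeding each line to the parser would be the natural next step.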
Tracing open files
Similarly, opensnoop allows tracing of the open syscall which is used to open files.
I used to rely heavily on the pre-eBPF, dtrace-based implementation of opensnoop back in my Mac days, and it's nice to discover that I can call upon it from Linux.
% sudop opensnoop
PID COMM FD ERR PATH
347357 ThreadPoolForeg 27 0 /home/rob/.cache/google-chrome/Default/Cache/Cache_Data/1b2c2ddd2d7ef7c5_0
347357 Chrome_ChildIOT 32 0 /dev/shm/.com.google.Chrome.ZkPQQO
347312 ThreadPoolForeg 103 0 /home/rob/.config/google-chrome/Default/.com.google.Chrome.lWY8pP
347312 ThreadPoolForeg 114 0 /dev/shm/.com.google.Chrome.aLGFXG
347312 ThreadPoolForeg 192 0 /proc/347426/stat
347312 ThreadPoolSingl 192 0 /proc/347426/task/347426/status
347312 chrome 192 0 /dev/shm/.com.google.Chrome.ddntL8
347312 ThreadPoolForeg 109 0 /proc/347426/stat
347312 ThreadPoolForeg 103 0 /proc/347426/stat
347312 ThreadPoolForeg 103 0 /home/rob/.config/google-chrome/Default/Extensions/fmkadmapgofadopljbjfkapdkoienihi/6.0.1_0/build/proxy.js
347312 ThreadPoolForeg 103 0 /home/rob/.config/google-chrome/Default/Extensions/fmkadmapgofadopljbjfkapdkoienihi/6.0.1_0/build/fileFetcher.js
The columns are similar to execsnoop:
PID: the PID of the process calling open.
COMM: the name of the calling process.
FD: this is the process-scoped Linux file descriptor which is opened.
ERR: error code, 0 for success.
PATH: the path of the file being opened, which may of course be a Linux virtual filesystem.
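Because opensnoop can be very chatty, a small aggregation helps to spot hot paths. The following Python sketch counts opens per path; it assumes the column layout described above and a COMM value without spaces, the function names are hypothetical, and the sample rows are taken from the trace above.

```python
from collections import Counter
from typing import Dict, List

SAMPLE = """\
PID COMM FD ERR PATH
347357 Chrome_ChildIOT 32 0 /dev/shm/.com.google.Chrome.ZkPQQO
347312 ThreadPoolForeg 192 0 /proc/347426/stat
347312 ThreadPoolSingl 192 0 /proc/347426/task/347426/status
347312 ThreadPoolForeg 109 0 /proc/347426/stat
347312 ThreadPoolForeg 103 0 /proc/347426/stat
"""


def parse_opensnoop(text: str) -> List[Dict]:
    """Turn opensnoop rows into dicts, skipping the header and bad lines."""
    events = []
    for line in text.splitlines():
        parts = line.split(None, 4)  # PATH is the last column
        if len(parts) != 5:
            continue
        pid, comm, fd, err, path = parts
        try:
            events.append({"pid": int(pid), "comm": comm, "fd": int(fd),
                           "err": int(err), "path": path})
        except ValueError:
            continue  # header row: "PID" is not a number
    return events


def opens_per_path(events: List[Dict]) -> Counter:
    """Count how many times each path was opened."""
    return Counter(e["path"] for e in events)


events = parse_opensnoop(SAMPLE)
hottest = opens_per_path(events).most_common(1)
# Rows with a non-zero ERR column are the failed opens.
failed = [e for e in events if e["err"] != 0]
```

Grouping by `pid` or `comm` instead of `path` works the same way and answers the question of which process is doing the opening.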
Tracing outgoing TCP connections
With tcpconnect we can trace outgoing TCP connections (using the connect syscall), in this case triggered by a curl https://google.com from my local network address (192.168.1.147) to Google's server at 172.217.17.14:443.
% sudop tcpconnect
Tracing connect ... Hit Ctrl-C to end
PID COMM IP SADDR DADDR DPORT
515773 curl 4 192.168.1.147 172.217.17.14 443
Tracing incoming TCP connections
Similarly, with tcpaccept we can trace incoming TCP connections (using the accept syscall). In this case, I used Ruby to spin up an HTTP server on port 8001:
$ ruby -run -ehttpd . -p8001
And then again used curl to make a request which was traced successfully:
% sudop tcpaccept
PID COMM IP RADDR RPORT LADDR LPORT
516996 ruby 6 ::1 41200 ::1 8001
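For a self-contained way to generate test traffic without curl or Ruby, here is a minimal Python sketch that opens a loopback server and connects to it once. The function name is hypothetical. With both tools running, I would expect the single round trip to produce one row in tcpconnect and one in tcpaccept, though I have not verified the exact rows.

```python
import socket
import threading


def loopback_roundtrip(payload: bytes = b"ping") -> bytes:
    """Accept one TCP connection on 127.0.0.1 and return what it received.

    The client side performs a connect() and the server side an accept(),
    which are the two events the tools above trace.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def serve() -> None:
        conn, _addr = server.accept()  # the incoming side
        with conn:
            received.append(conn.recv(1024))

    worker = threading.Thread(target=serve)
    worker.start()
    # The outgoing side: connect, send the payload, then close.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
    worker.join()
    server.close()
    return received[0]
```

Calling `loopback_roundtrip()` in one terminal while the tracing tools run in another gives a repeatable event to look for.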
Conclusions
I am just starting to explore the world of eBPF but I'm excited to discover a suite of lightweight system tracing utilities that I can see being a genuinely useful addition to my toolkit.
Once again, don't forget to check out the eBPF website and Brendan Gregg's blog to dive deeper.