The Magic of eBPF
Welcome to Cloud Native Compass, a podcast to help you navigate
the vast landscape of the cloud native ecosystem. We're your
hosts. I'm David Flanagan, a technology magpie that can't stop playing
with new, shiny things. I'm Laura Santamaria, a forever
learner who is constantly breaking production. eBPF is
Turing complete and can be written in Rust. So obviously,
I'm sold. Those are just two of the things we learned in today's
episode. Want to learn more? Contrary to not so popular
belief, eBPF is not some secret Berkeley green project
that's out there to eat your refrigerator, but it is a gateway to
becoming a Linux kernel maintainer. If you're curious about
what eBPF is, why it matters, how to
pronounce it, and how badly you can break your kernel when trying to learn it,
this is the episode for you as we talk with Liz Rice,
the author of Learning eBPF and Chief Open
Source Officer at Isovalent. All right, thank you.
Liz, could you please say hello and tell us a little bit more about you?
Hi. Yeah. So my name is Liz Rice. I am Chief Open Source
Officer at Isovalent, which is the company that originally
created Cilium, and a lot of people will have
heard of Cilium being based on eBPF.
And earlier this year, I published a book about eBPF called Learning
eBPF. I feel like I may have answered more than
one question in one go there. Hey, no, that's fine.
That's fine. Little bit of context. Just us. Perfect. Yeah. Good
context, considering we're about to talk about eBPF and go into a little
bit more detail there, so yeah. Awesome. All right, well, we
know not everyone is familiar with
eBPF. So could you give us the TL;DR? What do people need to know to
understand the rest of the conversation today? So they don't need to know what
it stands for. It stands for Extended Berkeley Packet
Filter, but honestly, forget that. It doesn't really help
because it does a lot more than packet filtering now. So we
tend to say the acronym doesn't really mean anything anymore.
What it really is is the ability to run
programs within the kernel, within the operating
system kernel, so we can dynamically change the way that the
kernel behaves by loading these
eBPF programs. And I think when I say that,
I have to make sure that people really know what I mean when
I say the kernel. The kernel is the part of the operating system
that interfaces between our applications and the
hardware, that is, the processor and its
peripherals. So if you are writing to a
file, doing anything over the network, writing anything
to the screen, even allocating memory, the kernel has
to get involved. Your application can't do it directly. It has to
ask the kernel for help. And the kernel is also
coordinating all the different processes that might be running on the machine. And
that means the kernel is involved whenever you're doing anything
interesting, really. So it's a really great
place to write things like observability tools
and security tools. And we can do that with
eBPF, and we also get to customize the way that the kernel
behaves for the things that it takes on, things like the networking
stack. We can modify the way that behaves with eBPF. So
it's really powerful and a really interesting way to
instrument all of your different applications that
are running on that one kernel. Nice.
And here I was hoping you could actually explain why it says
Berkeley in the middle of all of it, but that's a whole other
thing. The original packet filter paper
was written by two people whose names I can't quite
remember right now, but they were at Berkeley at the
time. The
paper that it all originates from says, like, Lawrence Berkeley
Lab or however. That's, you know, just for
fun for somebody like me, who's like, why does it say
Berkeley? Okay, anyway,
so I'm curious then, did you join
Isovalent because you were really interested in eBPF, or are you now really interested
in eBPF because you've joined Isovalent? Like, what came first?
No, I really got interested in eBPF the first time
I heard of it. I saw Thomas Graf, who is the
CTO of Isovalent, talking about Cilium and
eBPF at DockerCon back in 2017. And at the time I
thought, well, that's pretty interesting technology. And at that
point, it was really cutting edge in the kernel.
It wasn't available to most people in production. It was available to hardly anybody
in the Linux distributions that they were using back
then. But I thought, this is a really interesting
technology, and I'd kind of kept an eye on it. A
couple of years later, I started working on a project that was sort of
using eBPF, and I was also, as part of
learning about it myself, I was going out and doing talks.
I find that the best way to make sure you really understand it is to
try and explain something to somebody else.
So, yeah, I'd started doing talks about eBPF as
well, and through that, actually got
invited back to the eBPF Summit in, I guess,
2020, which Isovalent put
on sort of on behalf of the eBPF community.
And there was so much really cool stuff going on
in the world of eBPF, and particularly at Isovalent.
It turns out I hadn't realized before that summit just how much
Isovalent had been involved with eBPF right from the get-go. So
Daniel Borkmann, who's one of the three maintainers of
eBPF in the kernel, was one of the early engineers at
Isovalent. And he's still, you know, we're
so embedded in the way that eBPF
has developed over the years. I really do get
to work with the people who created it and the people who've been
using it and had the vision for building things like Cilium.
So, yeah, I joined Isovalent because it's just
absolutely full of really cool people doing really fun things with
eBPF. Awesome. Makes sense. So
you started experimenting with eBPF in 2017.
We're now in the latter half of 2023, which just
seems absurd to me now, but over those years, you've seen
the adoption grow, as we all have, especially across the industry and even the
CNCF, with projects like Pixie and Falco and of course Cilium as
well. Why has that adoption grown
so quickly for a relatively niche? I don't know. Is that a niche technology? I
think it is. Why are people I don't
know. It seems like it's kind of everywhere. Yeah, it's one of those things
that, I guess, expertise in it
is pretty niche, but a lot of people are using it without really even
knowing that they're using it. I mean, there's probably people using
Cilium who don't realize that it's based on eBPF. Certainly a lot
of people will be using things like tcpdump and never
really sort of thought about eBPF. And that's fine.
There's so many really powerful tools that have been
built. I think
things that Brendan Gregg has
popularized in the kind of
2017 period, he was already out there talking about
how Netflix were using eBPF for
observability purposes, for tracing, for
diagnosing, and then improving performance
issues and really
popularizing the power of
eBPF. The reason
why I think there was that kind of sudden
upturn in adoption is
the fact that the level of
eBPF support in the kernel had reached a point,
I think around the 4.18 kernel version, around that kind
of time frame, is where you really start getting sufficient
eBPF support to do really interesting things. The more
modern your kernel is, the more additional capabilities in
eBPF and probably lots of other areas of the kernel as
well. But there was this real turning point
when, I would say, particularly RHEL, which was
probably the last of the distributions to kind of...
it's always relatively cautious about upgrading
to new versions of the kernel. And at the point where really all
of the distributions were using a modern enough kernel, that meant
you could just deploy these eBPF based tools in production
regardless of your distro. And I think that really made a huge difference to the
adoption. Yeah. Awesome. So you
mentioned the Berkeley packet filter, and for the people that are not
aware, it's like a networking thing that allows you to do iptables-like
stuff. I'm not going to go into it in any more detail than that
because I'll make an absolute mess of it. But it does networking stuff, right? It
blocks packets, it reroutes packets, it does other stuff. But eBPF has kind
of grown beyond that. Now we're seeing it used for a whole
variety of different technologies like Falco. And
is there like what's the right way to phrase this question? Really? I should have
had it prepared. But why has it extended beyond this? Why does it have these
new capabilities? What is it enabling within the kernel for people? Why is it
interesting to you and to others? I know that's a very broad
question. Yeah. So I think the original idea
of packet filtering was to be able to look at each incoming packet
and make decisions about what to do with that packet.
And I think in the very first place, it was really just am I
interested in sort of seeing this packet? Maybe I want to
filter packets that are going to a particular port so that I can
count them or something like that. So it was making
fairly simple decisions about
what to do with these packets. The
extended part involved, I
think, a few different trains of thought. One was the idea that if
you extended this sort of relatively small
instruction set that could be used to examine
packets, if you turn that into something a
bit more kind of like a virtual machine
instruction set, if you look at BPF bytecode,
it's very reminiscent of machine code.
It's all about registers and loading values into
registers and comparing them and jumping to other instructions.
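
To make the register-and-jump idea concrete, here is a minimal sketch of a toy register machine that filters packets. It is not real BPF bytecode: the opcode names (`load_u16`, `jump_if_eq`, `ret`), the tuple encoding, and the four-byte packet header are all assumptions made up for illustration.

```python
# Toy register machine, loosely in the spirit of a packet filter.
# NOT real BPF bytecode: opcode names and encoding are invented for illustration.

def run_filter(program, packet):
    """Run `program` against `packet` (bytes); return True to accept the packet."""
    regs = {"r0": 0, "r1": 0}
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "load_u16":
            # Load a big-endian 16-bit field from the packet into a register.
            reg, offset = args
            regs[reg] = int.from_bytes(packet[offset:offset + 2], "big")
        elif op == "jump_if_eq":
            # Compare a register with a constant; skip ahead if they match.
            reg, value, skip = args
            if regs[reg] == value:
                pc += skip
        elif op == "ret":
            # Stop and return a verdict (1 = accept, 0 = reject).
            return bool(args[0])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return False  # falling off the end rejects the packet


# Accept only packets whose (hypothetical) destination-port field,
# stored at byte offset 2, equals 80.
PORT_80_FILTER = [
    ("load_u16", "r0", 2),
    ("jump_if_eq", "r0", 80, 1),  # if equal, skip the next instruction
    ("ret", 0),                   # reject
    ("ret", 1),                   # accept
]


def make_packet(src_port, dst_port, payload=b""):
    """Build a fake 4-byte header (source port, destination port) plus payload."""
    return src_port.to_bytes(2, "big") + dst_port.to_bytes(2, "big") + payload
```

Calling `run_filter(PORT_80_FILTER, make_packet(12345, 80))` accepts the packet, while the same call with destination port 443 rejects it. The real instruction set is richer, but the shape is the same: load a value into a register, compare it, jump.
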
It's very similar to machine code. So there was this
idea that having a virtual
machine in the kernel could allow you to do all sorts of interesting
things. There was the idea that maybe you could attach these
programs to other points in the kernel, not just to incoming packets,
but you could make decisions or change the
behavior at other points in the kernel. And I
think the last major thing that distinguished extended from
its predecessors is what's called eBPF maps.
And maps are these data structures that you can access from
within an eBPF program and you can share them between eBPF
programs, and you can also access them using system
calls from user space. So it's a way of
exchanging information between user space and eBPF programs
or between multiple different eBPF programs.
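
As a rough illustration of that sharing pattern, the sketch below models a map as a plain in-process key/value store with one writer standing in for an eBPF program and one reader standing in for a user space tool. The class and function names (`ToyMap`, `on_packet`, `report`) are hypothetical; a real eBPF map lives in the kernel and user space reaches it through system calls rather than a shared Python object.

```python
from collections import defaultdict


class ToyMap:
    """Stand-in for an eBPF hash map: a key/value store shared by several parties."""

    def __init__(self):
        self._data = defaultdict(int)

    def increment(self, key):
        # Would be done by the eBPF program running in the kernel.
        self._data[key] += 1

    def lookup(self, key):
        # Would be done from user space; missing keys read as zero here.
        return self._data.get(key, 0)

    def items(self):
        # Return a copy so the reader cannot mutate the shared state.
        return dict(self._data)


packets_per_port = ToyMap()


def on_packet(dst_port):
    """Plays the role of an eBPF program triggered for each incoming packet."""
    packets_per_port.increment(dst_port)


def report():
    """Plays the role of a user space tool reading the map."""
    return packets_per_port.items()


# Simulate five packets arriving, then read the counters back.
for port in (80, 443, 80, 22, 80):
    on_packet(port)
```

After the loop, `report()` returns the per-port counts that the "kernel side" accumulated, which is the basic pattern behind many eBPF observability tools: count or record in the kernel, read and display from user space.
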
And all those things kind of combined have turned out
to be really powerful to the extent that one of my colleagues
recently did a talk at one of the kind
of Linux kernel developer conferences where he showed that
eBPF is now Turing complete, which is
pretty cool. A question just on that, right?
Because the talks I've seen from yourself and others in this
space, when you talk about eBPF, one of the things that's always mentioned is the
fact that the eBPF sandbox, the virtual machine, can't and
shouldn't ever crash. Is that
still true even with the ability for eBPF programs to communicate with each
other and with user space programs? Yeah, exactly. So
the reason we're able to make that claim is because of
a thing in the kernel called the eBPF verifier. So as
you load a program into the kernel, it goes through this verification
process, which is really analyzing all the possible paths through the program
and ensuring that, well, first
of all, it will run to completion.
A long time ago, that used to be no loops at all. Now that's
been kind of improved and optimized and you can have loops.
It's checking for things like there's only a limited
set of what's called helper functions that you can call from an eBPF
program, and the set that you can call depends on really
the event that triggered it. So if you were being
triggered because a network packet arrived, then you can
call helper functions that are related to looking at that network packet.
But you can't, for example, ask for, well, what's the
user space process associated with this packet because there is no
user space process associated at that time. Whereas
if you were in an eBPF program attached to
a user space program making a system call, then you
absolutely could ask a helper function to give
you the process ID. So the verifier is checking that all these sort of
contextual helper functions are being used
appropriately and that memory access is safe,
that if you're going to dereference a pointer, you have to
explicitly check that it's not null before you do so,
because dereferencing a null pointer... If anybody has ever
written a C program, they will have crashed their C program by dereferencing a null
pointer. I guarantee it. Yeah. So the
verifier is really just checking that that program is
safe to run. Safe in the sense that it can't crash the machine, that
memory access is safe. Of course it can't
tell the difference between maybe I'm a
legitimate eBPF program that's filtering network packets,
maybe I'm protecting against DDoS attacks, or
maybe I'm a malicious eBPF program and I am
just throwing away packets for fun. The
Verifier can't tell the difference between those two things. But when we talk about
being safe to run in this context, we really mean it's not going to crash
or hang the machine. Okay, cool. And I think it's worth
pointing... You've got your thinking face on here. Yeah, I've got my thinking face on.
Sorry. Go ahead. Go ahead, Liz. No, keep going, keep
going. Well, it's the thing that really
distinguishes eBPF from writing a kernel module.
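
Two of the verifier checks described above (helpers restricted by the kind of event, and a null check required before a pointer is dereferenced) can be sketched as a toy checker that inspects a program before anything runs. Everything here is an assumption for illustration: the program types, helper names, and instruction format are hypothetical, and this only walks a straight-line list, whereas the real verifier analyzes every possible path through the program.

```python
# Toy verifier: inspects a "program" before it runs and reports rule violations.
# Program types, helper names, and the instruction format are all hypothetical.

ALLOWED_HELPERS = {
    "packet": {"read_packet_bytes"},   # triggered by an incoming network packet
    "syscall": {"get_current_pid"},    # triggered by a process making a system call
}


def verify(program_type, instructions):
    """Return a list of error strings; an empty list means the program is accepted."""
    errors = []
    checked = set()  # pointers that have been null-checked so far
    for index, (op, arg) in enumerate(instructions):
        if op == "call":
            # Helpers are only valid for the event type that triggers the program.
            if arg not in ALLOWED_HELPERS[program_type]:
                errors.append(
                    f"{index}: helper {arg} not allowed for {program_type} programs"
                )
        elif op == "null_check":
            checked.add(arg)
        elif op == "deref":
            # A pointer must be explicitly null-checked before it is dereferenced.
            if arg not in checked:
                errors.append(
                    f"{index}: pointer {arg} dereferenced without a null check"
                )
    return errors


GOOD_PROGRAM = [("call", "get_current_pid"), ("null_check", "p"), ("deref", "p")]
BAD_PROGRAM = [("call", "get_current_pid"), ("deref", "p")]
```

Here `verify("syscall", GOOD_PROGRAM)` returns no errors, while `verify("packet", BAD_PROGRAM)` reports both a disallowed helper and an unchecked dereference. The point of the design is that rejection happens at load time, so a faulty program never gets the chance to run.
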
So you always could extend the kernel. Always.
For a very long time you've been able to extend the kernel by writing kernel
modules. But people are pretty reluctant
to install kernel modules because
if there is some kind of bug in it, if it does crash, it's going
to bring down the whole machine and there's no kind of safety
net like what the eBPF Verifier is bringing. Makes
sense. So, out of curiosity, obviously we're talking a
lot more in depth kernel kind of things. And obviously
you referenced if you've ever developed a C program, you know what this is like
in machine code and things like that. Who would you say is probably
the most common user that you encounter, or
like the people who are really using this the most? Who
do you think this is the most relevant for, I guess, right now? So
lots of people will use eBPF
through tooling that's built on
it. David mentioned there's various different projects in the CNCF.
There's all the bpftrace
and the BCC family of tools that people can use on the command
line to do observability. There's lots of
different tools that have been built on
eBPF as a platform. And I think for the vast majority of people,
that's how they'll really experience it. They'll use things like
Cilium or Falco
or Pixie and they may be
interested in the fact that it's using eBPF, but they don't actually
have to get involved in the details. Which turns out to be a
really good thing because I'm the sort of person who really wants to
understand how the sausage is made. I want to
kind of get inside and get a feel for how is this really working?
You can learn about eBPF
programming. It's relatively easy to get
things like a hello world or some basic
networking capabilities running in eBPF
programs that you've written yourself. But you do quite
quickly start hitting the point where you're
interacting with kernel data structures. And at that point you kind of need to
understand what those data structures represent and what the effect of you
changing them might have. So it does quite
quickly turn into kernel programming. So I kind of say that on
the one hand, most people just don't need to know anything about the details
of it at all. But if they're interested, it's
relatively accessible for people who are comfortable with
programming to kind of dip your toe in.
And then if you really want to become an expert eBPF
programmer, which I by no means consider myself at
all, that really does start to require kind of
kernel expertise. But fortunately I work with lots of people at
Isovalent who have that kernel expertise. So would you
say this is the gateway to actually becoming a Linux
maintainer? Honestly, yes.
I've not done it, but it never crossed my mind that I would ever
make a contribution to the Linux kernel. But now I'm kind of in that world,
I sort of start to think, if I had another 25 hours in the day...
The temptation is high.
The temptation is high to go get involved. I understand.
I've asked my one obligatory question, David, now it's your turn.
I like that. I like that discussion, right? Because it's one of those questions
I do a couple of talks that touch on eBPF, right? I don't go deep
on it because I'm not that smart, but I always do the same demos and
it's the iovisor bpfcc-tools
demo. Specifically, I show off execsnoop
and opensnoop. You know, in the SRE,
platform, DevOps-y world, it's quite interesting and
important from a security perspective and an automation perspective to be able to show when
certain sensitive files are opened on a disk, and eBPF makes that really
simple. And another really cool demo is just by using execsnoop you
can actually monitor for sudo and setuid-bit
binaries on the machine. So when people elevate their privileges, you can get
notifications for that kind of stuff too. And the question is always like
how much do I need to know about eBPF to then start doing tools similar
to that. And then you show them the source code and it's like 20 lines
of Python. It's not a lot to do these kind of things. And I think
that's because I feel like people can start to build
eBPF programs using tracy's without going deep into it in the same way
that with containers, we can all run containers on our machine. But you don't really
need to understand what a control group is or a namespace is anymore. And I
feel like eBPF may make that same transition, probably already has made that
same transition. So I'm going to flip that around a little bit and throw the
question to Liz. People do see these demos, they listen to
this and they're like, OK, eBPF is really cool. What are the languages
and the SDKs that they can go and start to work with right away to
experiment with the new tech? Yeah. So you kind of have
to answer that question in two parts. There's the
actual eBPF program itself that's going to run in the kernel. And
then there's the user space part that might interact with it. There are
some occasions where you don't even need a user space part. So for
example, if you're doing networking
functions, sometimes they
don't need any kind of user space interaction because you can just load them into
the kernel and they can do what they need to do. But usually
we're going to have both these parts.
For the kernel, the program is going to be,
ultimately, in eBPF
bytecode form. I talked about there being these bytecode instructions
that look like machine code. You could
just write the machine code by hand. Apparently there are people who do that,
but for me I would rather
write in a slightly higher-level language than that. And
the languages are restricted in that
they have to be able to compile down to bytecode.
So the compilers that support it right now
are Clang and GCC, both of which can compile
C, and also the Rust compiler.
I'm not aware of there being other programs that support
BPF bytecode as a target.
So yeah, C or Rust
really become your choice there. There is a little bit of a
caveat in that. There is this project called
iovisor BCC. David mentioned the tools
and things like opensnoop and execsnoop that come from that
project. And BCC gives
you some friendly kind of
macros such that you can write your code in a sort of
hybrid of Python and C. And it
takes care of a few things for you from
your C program. But then there's the user space
side of things and there you have a much wider choice. I
mean, really, you're not restricted
at all, except that you probably want some SDKs that will make system calls
for you and make it easier for you to interact
with the eBPF program through that syscall
interface. There's
a Go SDK, there's a Rust
SDK. In fact, there's a couple of Go ones,
and there's a C one, which is probably, I would say
today, the most widely used, called libbpf.
Cilium uses Go. We have a Go
eBPF library, but I think a lot of the
projects outside of that are probably directly using
libbpf. Yeah, you said one thing there that I completely disagree
with, and you said you have a choice between Rust and C. That's not
a choice. I knew this was coming. I knew this was coming. I knew it.
Of course. Talk about Rust. I'm going to just break in before he gets going
on it. It occurs to me
there was the mention of when
containers came around and things like that, things kind of change.
And as there are more and more languages that people are familiar with that you
can compile down to the eBPF bytecode.
How often do you find people getting into trouble like they used to do
when containers first came around because they didn't quite know what they were doing,
but they kind of got it enough to get away with it. So how often
do you find people getting in trouble and how do you get them out? That's
always my question. How do you troubleshoot this thing? Especially when
this is like kernel level and you can really mess things up
fast? Yeah, I would say that there's probably
two major ways that people get caught out.
One of them is around the kind of the tool chain
and installing things that are compatible with each other,
because every kernel version has different eBPF
support. And then you need maybe your user space libraries
that maybe are or aren't compatible. And different distributions
of Linux might have different
versions of either libbpf or
tools, things like the BCC tools. Or
particularly, there's a thing called bpftool that you can use for managing
BPF programs and making sure that
you've got a compatible set of
the packages, the source code, the kernel, the tools that
you want can trip you up in numerous
different ways. And the other thing that catches people out,
once they've got everything installed and everything seems to be compatible, and they've started
compiling some code, and then they go to load it into the kernel and
they hit verification errors.
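
For the first of those pitfalls, one small sanity check is to compare the running kernel's version against a minimum before trying anything else. The sketch below assumes kernel 4.18, the version mentioned earlier in the conversation, as a rough threshold. The function names are hypothetical, and the check is only a guide, since distributions can backport eBPF features to older version numbers.

```python
import platform
import re

# Rough threshold taken from the conversation; treat it as a guide, not a guarantee.
MIN_KERNEL = (4, 18)


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_recent_enough(release=None, minimum=MIN_KERNEL):
    """Return True if the given (or currently running) kernel meets the minimum."""
    if release is None:
        # On Linux this is the same string that `uname -r` prints.
        release = platform.release()
    return parse_kernel_release(release) >= minimum
```

Passing an explicit release string, for example `kernel_is_recent_enough("3.10.0-1160.el7.x86_64")`, makes it easy to check the versions of machines other than the one running the script.
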
And I would say over the last
few years, there's been a lot of improvement in the
kind of information that the Verifier gives you about why it's
objecting to whatever it is it's objecting to. But
there's a blog post somewhere that describes the Verifier as a
fickle beast. I think it is.
All right, so, you know, we've covered a lot about
eBPF so far, and just kind of to understand the
landscape right now, it's heavily used for networking. Cilium has gone all
in on eBPF, even the service mesh angle
is now available, and they even now have Tetragon going for the security
angle and trying to help people with, you know... we've seen
Pixie and Falco do more security and monitoring, automated
observability, all of this.
I don't know if there's an eBPF maturity curve, right? But as
people start to do more of it and the skills become more aware or
people are using it more, will eBPF creep into
our day to day application code? Do you see people using eBPF
to write their CMS or their
proprietary applications? I don't know what those use cases are. I don't know if they
exist. But will eBPF become more than what it is today? Which is a bit
of a forward-thinking question, but maybe...
It's really interesting to sort of speculate about what you could do and
also what would be useful. But I think one interesting parallel is the
way that networking
capabilities that used to be in user space
have migrated into the kernel. The TCP
stack, I am old enough to remember when that was more
commonly in user space, you'd use
a TCP library. And now
we just expect the kernel to take care of that. And I think
what eBPF will allow us to do is to gradually
move more of that kind of functionality into the
kernel, but in a way that doesn't require everybody to take
the leap at the same time, because we don't all have to be running the
same eBPF programs. So I think something like
service mesh is a really great example where
Cilium as a CNI networking component
for Kubernetes is in this really great sort of position in the kernel
to be able to pick up network packets and put them where they
need to be and observe them and report on them
and do kind of security related operations on
them. All of which are very much the kind of things that we expect from
a service mesh today. We can't do
everything in the kernel. I mean, theoretically I
think we could, but in practice, I think for all the kind of layer
seven operations, we're using a user space proxy to do
that. We're using Envoy to do that. I
think over time, I expect that in
some number of years time, all of that code will
be in the kernel. But maybe Kubernetes will be in
the kernel too. Who knows, maybe that's kind of the future.
If you fancy rewriting Kubernetes, perhaps we should all do it in
Rust in eBPF.
David's looking like he might actually do that.
Yeah, no, he might. I
am not personally writing any Kubernetes components in Rust, but people
are exploring that these days.
You never know, right? I don't trust that
statement that you're not doing it.
I'm not. I don't have time. I'm too busy talking to you.
All right, well, is there anything else, Liz, you
would like us to throw at you to ask before we wrap this up? Anything
that's just sitting on the tip of your tongue waiting to be said? Sorry. Yeah,
no, there is one thing I would like to mention, which is the upcoming eBPF
Summit. Because if people are interested
in hearing more about what's going on in the kind of eBPF
community, learning more about how it's being used,
seeing some of the kind of interesting directions that people are
going with it, and learning more about the future of eBPF
itself. eBPF Summit. It's online, it's
free, it's a virtual conference. This is going to be, I think,
the fourth time that we've held it. It's on September
the 13th. And yeah, come join in. If you go
to ebpf.io, you'll find a link to the summit
there and yeah, it's always been a really
nice kind of community feel event, so I'm really looking forward to
it. Excellent. All right, if you want
to shamelessly plug anything else yourself, Isovalent,
anything else, feel free to mention it now or forever hold your peace.
We'll make sure all the links end up in the show notes as well. I
guess it would be remiss of me not to mention Learning eBPF.
There you go. Available either, you know, if you want the PDF version,
you can download that from isovalent.com, or you
can order it from your favorite local bookseller or
get it from Amazon if that's your bag.
Shop local. Exactly. All right,
well, thank you so much for your time. Pleasure. Thank you, Liz. Thanks for joining
us. If you want to keep up with us, consider subscribing to the podcast
on your favorite podcasting app or even go to cloudnativecompass.fm.
And if you want to talk with someone specific or cover
a specific topic, reach out to us on any social media
platform. Until next time, when exploring the cloud native
landscape. On three. One,
two, three. Don't
forget your compass. Don't forget your compass.
Bye.