August 25, 2023 32:00 E9

The Magic of eBPF

Welcome to Cloud Native Compass, a podcast to help you navigate

the vast landscape of the cloud native ecosystem. We're your

hosts. I'm David Flanagan, a technology magpie that can't stop playing

with new, shiny things. I'm Laura Santamaria, a forever

learner who is constantly breaking production. eBPF is

Turing complete and can be written in Rust. So obviously,

I'm sold. That's just two of the things we learned in today's

episode. Want to learn more? Contrary to not so popular

belief, eBPF is not some secret Berkeley green project

that's out there to eat your refrigerator, but it is a gateway to

becoming a Linux kernel maintainer. If you're curious about

what eBPF is, why it matters, how to

pronounce it, and how badly you can break your kernel when trying to learn it,

this is the episode for you as we talk with Liz Rice,

the author of Learning eBPF and Chief Open

Source Officer at Isovalent. All right, thank you.

Liz, could you please say hello and tell us a little bit more about you?

Hi. Yeah. So my name is Liz Rice. I am Chief Open Source

Officer at Isovalent, which is the company that originally

created Cilium, and a lot of people will have

heard of Cilium being based on eBPF.

And earlier this year, I published a book about eBPF called Learning

eBPF. I feel like I may have answered more than

one question in one go there. Hey, no, that's fine.

That's fine. Little bit of context. Just us. Perfect. Yeah. Good

context, considering we're about to talk about eBPF and go into a little

bit more detail there, so yeah. Awesome. All right, well, let's

just, you know, not everyone is familiar with

eBPF. So could you give us the TLDR? What do people need to know to

understand the rest of the conversation today? So they don't need to know what

it stands for? It stands for Extended Berkeley Packet

Filter, but honestly, forget that, it doesn't really help

because it does a lot more than packet filtering now. So we

tend to say the acronym doesn't really mean anything anymore.

What it really is is the ability to run

programs within the kernel, within the operating

system kernel, so we can dynamically change the way that the

kernel behaves by loading these

eBPF programs. And I think when I say that,

I have to make sure that people really know what I mean when

I say the kernel. The kernel is the part of the operating system

that interfaces between our applications and the

hardware, that is, the processor and its

peripherals. So if you are writing to a

file, you're doing anything over the network, writing anything

to screen, even allocating memory, the kernel has

to get involved. Your application can't do it directly. It has to

ask the kernel for help. And the kernel is also

coordinating all the different processes that might be running on the machine. And

that means the kernel is involved whenever you're doing anything

interesting, really. So it's a really great

place to write things like observability tools

and security tools. And we can do that with

eBPF, and we also get to customize the way that the kernel

behaves for the things that it takes on, things like the networking

stack. We can modify the way that behaves with eBPF. So

it's really powerful and a really interesting way to

instrument all of your different applications that

are running on that one kernel. Nice.

And here I was hoping you could... I was going to say, here I was

hoping you could actually explain why it said Berkeley in the middle of all of

it, but that's a whole more than anything else. The

original packet filter paper

was written by two people whose names I can't quite

remember right now, but they were at Berkeley at the

time. That makes sense. The precise

paper that it all originates from says, like, Lawrence Berkeley

Lab or however. That's, you know, just for

fun for somebody like me, who's like, why does it say

Berkeley? Okay, anyway,

so I'm curious then, did you join

Isovalent because you were really interested in eBPF, or are you now really interested

in eBPF because you've joined Isovalent? Like, what came first?

No, I really got interested in eBPF the first time

I heard of it. I saw Thomas Graf, who is the

CTO of Isovalent, talking about Cilium and

eBPF at DockerCon back in 2017. And at the time I

thought, well, that's pretty interesting technology. And at that

point, it was really cutting edge in the kernel.

It wasn't available to most people in production. It wasn't available to hardly anybody

in the Linux distributions that they were using back

then. But I thought, this is a really interesting

technology, and I'd kind of kept an eye on it. A

couple of years later, I started working on a project that was sort of

using eBPF, and I was also, as part of

learning about it myself, I was going out and doing talks.

I find that the best way to make sure you really understand it is to

try and explain something to somebody else.

So, yeah, I'd started doing talks about eBPF as

well, and through that, actually got

invited back to the eBPF Summit in, I guess,

2020, which Isovalent put

on sort of on behalf of the eBPF community.

And there was so much really cool stuff going on

in the world of eBPF, and particularly at Isovalent.

It turns out I hadn't realized before that summit just how much

Isovalent had been involved with eBPF right from the get go. So

Daniel Borkmann, who's one of the three maintainers of

eBPF in the kernel, was one of the early engineers at

Isovalent. And he's still, you know, we're

so embedded in the way that eBPF

has developed over the years. I really do get

to work with the people who created it and the people who've been

using it and had the vision for building things like Cilium.

So, yeah, I joined Isovalent because it's just

absolutely full of really cool people doing really fun things with

eBPF. Awesome. Makes sense. So

you started experimenting with eBPF in 2017.

We're now in the latter half of 2023, which just

seems absurd to me now, but over those years, you've seen

the adoption grow, as we all have, especially across the industry and even the

CNCF, with projects like Pixie and Falco and of course Cilium as

well. Why has that adoption grown

so quickly for a relatively niche? I don't know. Is that a niche technology? I

think it is. Why are people I don't

know. It seems like it's kind of everywhere. Yeah, it's one of those things

that, I guess, expertise in it

is pretty niche, but a lot of people are using it without really even

knowing that they're using it. I mean, there's probably people using

Cilium who don't realize that it's based on eBPF. Certainly a lot

of people will be using things like tcpdump and never

really sort of thought about eBPF. And that's fine.

There's so many really powerful tools that have been

built. I think

things that Brendan Gregg has

popularized in the kind of

2017 period, he was already out there talking about

how Netflix were using eBPF for

observability purposes, for tracing, for

diagnosing, and then improving performance

issues and really

popularizing the power of

eBPF. The reason

why I think there was that kind of sudden

upturn in adoption is

the fact that the level of

eBPF support in the kernel had reached a point,

I think around the 4.18 kernel version, around that kind

of time frame, is where you really start getting sufficient

eBPF support to do really interesting things. The more

modern your kernel is, the more additional capabilities in

eBPF and probably lots of other areas of the kernel as

well. But there was this real turning point

when I would say particularly when RHEL was

probably the last of the distributions to kind of

it's always relatively cautious about upgrading

to new versions of the kernel. And at the point where really all

of the distributions were using a modern enough kernel, that meant

you could just deploy these eBPF based tools in production

regardless of your distro. And I think that really made a huge difference to the

adoption. Yeah. Awesome. So you

mentioned the Berkeley packet filter, and for the people that are not

aware, it's like a networking thing that allows you to do iptables-like

stuff. I'm not going to go into it in any more detail than that

because I'll make an absolute mess of it. But it does networking stuff, right? It

blocks packets, it reroutes packets, it does other stuff. But eBPF has kind

of grown beyond that. Now we're seeing it used for a whole

variety of different technologies like Falco. And

is there like what's the right way to phrase this question? Really? I should have

had it prepared. But why has it extended beyond this? Why does it have these

new capabilities? What is it enabling within the kernel for people? Why is it

interesting to you and to others? I know that's a very broad

question. Yeah. So I think the original idea

of packet filtering was to be able to look at each incoming packet

and make decisions about what to do with that packet.

And I think in the very first place, it was really just am I

interested in sort of seeing this packet? Maybe I want to

filter packets that are going to a particular port so that I can

count them or something like that. So it was making

fairly simple decisions about

what to do with these packets. The

extended part involved, I

think, a few different trains of thought. One was the idea that if

you extended this sort of relatively small

instruction set that could be used to examine

packets, if you turn that into something a

bit more kind of like a virtual machine

instruction set, if you look at BPF bytecode,

it's very reminiscent of machine code.

It's all about registers and loading values into

registers and comparing them and jumping to other instructions.
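
To make the "registers, compare, jump" description concrete, here is a toy register machine in Python. It is only a sketch under loud assumptions: the opcode names (`mov`, `jne`, `exit`) and the tuple encoding are invented for illustration and are not the real eBPF instruction format. It shows a tiny "packet filter" that returns 1 for packets to port 443 and 0 otherwise.

```python
# Toy register machine, loosely inspired by the shape of BPF bytecode.
# The opcode names and tuple encoding are invented for illustration;
# real eBPF uses fixed-size binary instructions and more registers.

def run(program, packet_port):
    regs = [0] * 4          # a few general-purpose registers
    regs[1] = packet_port   # r1 holds the input (here: a destination port)
    pc = 0                  # program counter
    while pc < len(program):
        op, a, b = program[pc]
        if op == "mov":     # load the immediate value b into register a
            regs[a] = b
        elif op == "jne":   # skip the next b instructions if r1 != a
            if regs[1] != a:
                pc += b
        elif op == "exit":  # return the value held in r0
            return regs[0]
        pc += 1
    raise RuntimeError("program fell off the end without exit")

# Return 1 ("interested") for port 443, 0 for everything else.
FILTER_443 = [
    ("mov", 0, 0),      # r0 = 0 (default verdict)
    ("jne", 443, 1),    # if r1 != 443, skip the next instruction
    ("mov", 0, 1),      # r0 = 1
    ("exit", 0, 0),     # return r0
]

print(run(FILTER_443, 443))
print(run(FILTER_443, 80))
```

The point is only the flavour: values move into registers, get compared, and control flow is a jump to another instruction.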

It's very similar to machine code. So there was this

idea that having a virtual

machine in the kernel could allow you to do all sorts of interesting

things. There was the idea that maybe you could attach these

programs to other points in the kernel, not just to incoming packets,

but you could make decisions or change the

behavior at other points in the kernel. And I

think the last major thing that distinguished extended from

its predecessors is what's called eBPF maps.

And maps are these data structures that you can access from

within an eBPF program and you can share them between eBPF

programs, and you can also access them using system

calls from user space. So it's a way of

exchanging information between user space and eBPF programs

or between multiple different eBPF programs.
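
The map idea can be sketched with a purely conceptual Python model. Everything here is an assumption for illustration: a plain dict stands in for an eBPF hash map, and the function names `on_packet` and `read_counts` are hypothetical. In a real system the map lives in the kernel and user space reaches it through system calls.

```python
# Conceptual model only: a plain dict stands in for an eBPF hash map.
# The names below are hypothetical and not part of any real eBPF API.

packet_counts = {}   # shared "map": destination port -> packets seen

def on_packet(dst_port):
    """Stands in for an eBPF program that runs once per incoming packet."""
    packet_counts[dst_port] = packet_counts.get(dst_port, 0) + 1

def read_counts():
    """Stands in for a user space tool reading the map contents."""
    return dict(sorted(packet_counts.items()))

# Simulate five packets arriving, then read the totals from "user space".
for port in (443, 80, 443, 22, 443):
    on_packet(port)

print(read_counts())
```

The kernel-side program only writes small updates into the map, and the user space side decides when to read and how to present them.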

And all those things kind of combined has turned out

to be really powerful to the extent that one of my colleagues

recently did a talk at one of the kind

of Linux kernel developer conferences where he showed that

eBPF is now Turing complete, which is

pretty cool. A question just on that, right?

Because the talks I've seen from yourself and others in this

space, when you talk about eBPF, one of the things that's always mentioned is the

fact that the eBPF sandbox, the virtual machine, can't and

should never crash. Is that

still true even with the ability for eBPF programs to communicate with each

other and with user space programs? Yeah, exactly. So

the reason we're able to make that claim is because of

a thing in the kernel called the eBPF verifier. So as

you load a program into the kernel, it goes through this verification

process, which is really analyzing all the possible paths through the program

and ensuring that, well, first

of all, it will run to completion.

A long time ago, that used to be no loops at all. Now that's

been kind of improved and optimized and you can have loops.
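
The old "no loops at all" rule can be sketched as a check that rejects any backward jump. This is a toy under stated assumptions: the `(op, offset)` encoding and the `verify` function are invented here, and the real verifier does far more (and, as mentioned, now accepts some loops).

```python
# Toy "runs to completion" check in the spirit of the old no-loops rule.
# Instructions are (op, offset) tuples; a jump offset is relative to the
# next instruction. This encoding is invented for illustration.

def verify(program):
    if not program or program[-1][0] != "exit":
        return "rejected: last instruction must be exit"
    for pc, (op, offset) in enumerate(program):
        if op != "jmp":
            continue
        target = pc + 1 + offset
        if target <= pc:
            # Jumping to this instruction or an earlier one could loop forever.
            return "rejected: backward jump (possible loop)"
        if target >= len(program):
            return "rejected: jump out of range"
    # Only forward, in-range jumps remain, so execution must reach an exit.
    return "accepted"

straight_line = [("nop", 0), ("jmp", 1), ("nop", 0), ("exit", 0)]
looping = [("nop", 0), ("jmp", -2), ("exit", 0)]

print(verify(straight_line))
print(verify(looping))
```

Because every accepted jump moves strictly forward and the program ends in `exit`, the program counter can only increase, which is what guarantees termination in this toy model.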

It's checking for things like there's only a limited

set of what's called helper functions that you can call from an eBPF

program, and the set that you can call depends on really

the event that triggered it. So if you were being

triggered because a network packet arrived, then you can

call helper functions that are related to looking at that network packet.

But you can't, for example, ask for, well, what's the

user space process associated with this packet because there is no

user space process associated at that time. Whereas

if you were in an eBPF program attached to

a user space program making a system call, then you

absolutely could ask a helper function to give

you the process ID. So the verifier's checking that all these sort of

contextual helper functions are being used

appropriately and that memory access is safe,

that if you're going to dereference a pointer, you have to

explicitly check that it's not nil before you do so,

because dereferencing a nil pointer... If anybody has ever

written a C program, they will have crashed their C program by dereferencing a null

pointer. I guarantee it. Yeah. So the

verifier is really just checking that that program is

safe to run. Safe in the sense that it can't crash the machine, that

memory access is safe. Of course it can't

tell the difference between maybe I'm

a legitimate networking packet that's a

legitimate eBPF program that's filtering network packets,

maybe I'm protecting against DDoS attacks, or

maybe I'm a malicious eBPF program and I am

just throwing away packets for fun. The

Verifier can't tell the difference between those two things. But when we talk about

being safe to run in this context, we really mean it's not going to crash

or hang the machine. Okay, cool. And I think worth

pointing face on here. Yeah, I've got my thinking face on.

Sorry. Go ahead. Go ahead, Liz. No, keep going, keep

going. Well, it's the thing that really

distinguishes eBPF from writing a kernel module.

So you always could extend the kernel. Always.

For a very long time you've been able to extend the kernel by writing kernel

modules. But people are pretty reluctant

to install kernel modules because

if there is some kind of bug in it, if it does crash, it's going

to bring down the whole machine and there's no kind of safety

net like what the eBPF Verifier is bringing. Makes

sense. So, out of curiosity, obviously we're talking a

lot more in depth kernel kind of things. And obviously

you referenced if you've ever developed a C program, you know what this is like

in machine code and things like that. Who would you say are probably

the most common user that you encounter or

like the people who are really using this the most. Who

do you think this is the most relevant for, I guess, right now? So

lots of people will use eBPF

through tooling that's built on it.

David mentioned there's various different projects in the CNCF.

There's all the bpftrace

and the BCC family of tools that people can use on the command

line to do observability. There's lots of

different tools that have been built on

eBPF as a platform. And I think for the vast majority of people,

that's how they'll really experience it. They'll use things like

Cilium or Falco

or Pixie and they may be

interested in the fact that it's using eBPF, but they don't actually

have to get involved in the details. Which turns out to be a

really good thing because I'm the sort of person who really wants to

understand how the sausage is made. I want to

kind of get inside and get a feel for how is this really working?

You can learn about eBPF

programming. It's relatively easy to get

things like a hello world or some basic

networking capabilities running in eBPF

programs that you've written yourself. But you do quite

quickly start hitting the point where you're

interacting with kernel data structures. And at that point you kind of need to

understand what those data structures represent and what the effect of you

changing them might have. So it does quite

quickly turn into kernel programming. So I kind of say that on

the one hand, most people just don't need to know anything about the details

of it at all. But if they're interested, it's

relatively accessible for people who are comfortable with

programming to kind of dip your toe in.

And then if you really want to become an expert eBPF

programmer, which I by no means consider myself at

all, that really does start to require kind of

kernel expertise. But fortunately I work with lots of people at

Isovalent who have that kernel expertise. So would you

say this is the gateway to actually becoming a Linux

maintainer? Honestly, yes.

I've not done it, but it never crossed my mind that I would ever

make a contribution to the Linux kernel. But now I'm kind of in that world,

I sort of start to think if I had another 25 hours in.

The temptation is high.

The temptation is high to go get involved. I understand.

I've asked my one obligatory question, David, now it's your turn.

Like that. I like that discussion, right? Because it's one of those questions

I do a couple of talks that touch on eBPF, right? I don't go deep

on it because I'm not that smart, but I always do the same demos and

it's the iovisor bpfcc-tools

demo. Specifically, I show off execsnoop

and opensnoop. You know, in the SRE

platform DevOpsy world, it's quite interesting and

important from a security perspective and an automation perspective to be able to show when

certain sensitive files are opened on a disk and eBPF makes that really

simple. And another really cool demo is just by using execsnoop you

can actually monitor for sudo and setuid-bit

binaries on the machine. So when people elevate their privileges, you can get

notifications for that kind of stuff too. And the question is always like

how much do I need to know about eBPF to then start doing tools similar

to that. And then you show them the source code and it's like 20 lines

of Python. It's not a lot to do these kind of things. And I think

that's because I feel like people can start to build

eBPF programs using tracy's without going deep into it in the same way

that with containers, we can all run containers on our machine. But you don't really

need to understand what a control group is or a namespace is anymore. And I

feel like eBPF may make that same transition, probably already has made that

same transition. So I'm going to flip that around a little bit and throw the

question to Liz. People do see these demos, they listen to

this and they're like, OK, eBPF is really cool. What are the languages

and the SDKs that they can go and start to work with right away to

experiment with the new tech? Yeah. So you kind of have

to answer that question in two parts. There's the

actual eBPF program itself that's going to run in the kernel. And

then there's the user space part that might interact with it. There are

some occasions where you don't even need a user space part. So for

example, if you're doing networking

functions, sometimes they

don't need any kind of user space interaction because you can just load them into

the kernel and they can do what they need to do. But usually

we're going to have both these parts

for the kernel. The program is going to be

in ultimately it's going to be in eBPF

bytecode form. I talked about there being these bytecode instructions

that look like machine code. You could

just write the machine code by hand. Apparently there are people who do that,

but for me I would rather

write in a slightly higher language than that. And

the languages are restricted by being

they have to be able to compile down to bytecode.

So the compilers that support it right now

are Clang and GCC, both of which can compile

C and also the Rust compiler.

I'm not aware of there being other programs that support

BPF bytecode as a target.

So yeah, C or Rust

really become your choice there. There is a little bit of a

caveat in that. There is this project called

iovisor BCC. David mentioned the tools

and things like opensnoop and execsnoop that come from that

project. And BCC gives

you some friendly kind of

macros such that you can write your code in a sort of

hybrid of Python and C. And it

takes care of a few things for you from

your C program. But then there's the user space

side of things and there you have a much wider choice. I

mean, really, you're not restricted

at all, except that you probably want some SDKs that will make system calls

for you and make it easier for you to interact

with the eBPF program through that syscall

interface. There's

a Go SDK, there's a Rust

SDK. In fact, there's a couple of Go ones,

and there's a C one, which is probably, I would say

today, the most widely used, called libbpf.

Cilium uses Go. We have a Go

eBPF library, but I think a lot of the

projects outside of that are probably directly using

libbpf. Yeah, you said one thing there that I completely disagree

with, and you said you have a choice between Rust and C. That's not

a choice. I knew this was coming. I knew this was coming. I knew it.

Of course. Talk about Rust. I'm going to just break in before he gets going

on it. It occurs to me

there was the mention of when

containers came around and things like that, things kind of change.

And as there are more and more languages that people are familiar with that you

can compile down to the eBPF bytecode.

How often do you find people getting into trouble like they used to do

when containers first came around because they didn't quite know what they were doing,

but they kind of got it enough to get away with it. So how often

do you find people getting in trouble and how do you get them out? That's

always my question. How do you troubleshoot this thing? Especially when

this is like kernel level and you can really mess things up

fast? Yeah, I would say that there's probably

two major ways that people get caught out.

One of them is around the kind of the tool chain

and installing things that are compatible with each other,

because every kernel version has different eBPF

support. And then you need maybe your user space libraries

that maybe are or aren't compatible. And different distributions

of Linux might have different

versions of either libbpf or

tools, things like the BCC tools. Or

particularly, there's a thing called bpftool that you can use for managing

BPF programs. And making sure that

you've got a compatible set of

the packages, the source code, the kernel, the tools that

you want can trip you up in numerous

different ways. And the other thing that catches people out,

once they got everything installed and everything seems to be compatible, and they've started

compiling some code, and then they go to load it into the kernel and

they hit verification errors.

And I would say over the last

few years, there's been a lot of improvement in the

kind of information that the Verifier gives you about why it's

objecting to whatever it is it's objecting to. But

there's a blog post somewhere that describes the Verifier as a

fickle beast. I think it is.

All right, so, you know, we've covered a lot about

eBPF so far, and just kind of to understand the

landscape right now, it's heavily used for networking. Cilium has gone all

in on eBPF, even towards the service mesh angle

now is available and even now have Tetragon going for the security

angle and trying to help people with you know, we've seen like

Pixie and Falco to do more security and monitoring automated

observability, all of this

as I don't know if there's an eBPF maturity curve right. But as

people start to do more of it and the skills become more aware or

people are using it more, will eBPF creep into

our day to day application code? Do you see people using eBPF

to write their CMS or their

proprietary applications? I don't know what those use cases are. I don't know if they

exist. But will eBPF become more than what it is today? Which is a bit

of a forward thinking question, but maybe...

It's really interesting to sort of speculate about what you could do and

also what would be useful. But I think one interesting parallel is the

way that networking

capabilities that used to be in user space

have migrated into the kernel TCP

stack. I am old enough to remember when that was more

commonly in user space, you'd use

a TCP library. And now

we just expect the kernel to take care of that. And I think

what eBPF will allow us to do is to gradually

move more of that kind of functionality into the

kernel, but in a way that doesn't require everybody to take

the leap at the same time, because we don't all have to be running the

same eBPF programs. So I think something like

service mesh is a really great example where

Cilium as a CNI networking component

for Kubernetes is in this really great sort of position in the kernel

to be able to pick up network packets and put them where they

need to be and observe them and report on them

and do kind of security related operations on

them. All of which are very much the kind of things that we expect from

a service mesh today. We can't do

everything in the kernel. I mean, theoretically I

think we could, but in practice, think all the kind of layer

seven operations. We're using a user space proxy to do

that. We're using Envoy to do that. I

think over time, I expect that in

some number of years time, all of that code will

be in the kernel. But maybe Kubernetes will be in

the kernel too. Who knows, maybe that's kind of future.

If you fancy rewriting Kubernetes, perhaps we should all do it in

Rust in eBPF.

David's looking like he might actually do that.

Yeah, no, he might. I

am not personally writing any Kubernetes components in Rust, but people

are exploring that these days.

You never know, right? I don't trust that

statement that you're not doing it.

I'm not. I don't have time. I'm too busy talking to you.

All right, well, is there anything else, Liz, you

would like us to throw at you to ask before we wrap this up? Anything

that's just sitting on the tip of your tongue waiting to be said? Sorry. Yeah,

no, there is one thing I would like to mention, which is the upcoming eBPF

Summit. Because if people are interested

in hearing more about what's going on in the kind of eBPF

community, learning more about how it's being used,

seeing some of the kind of interesting directions that people are

going with it, and learning more about the future of eBPF

itself. eBPF Summit. It's online, it's

free, it's a virtual conference. This is going to be, I think,

the fourth time that we've held it. It's on September

the 13th. And yeah, come join in. If you'll go

to ebpf.io, you'll find a link to the summit

there and yeah, it's always been a really

nice kind of community feel event, so I'm really looking forward to

it. Excellent. All right, if you want

to shamelessly plug anything else yourself, Isovalent,

anything else, feel free to mention it now or forever hold your peace.

We'll make sure all the links end up in the show notes as well. I

guess it would be remiss of me not to mention Learning eBPF.

There you go. Available either, you know, if you want the PDF version,

you can download that from Isovalent.com or you

can order it from your favorite local bookseller or

get it from Amazon if that's your bag.

Shop local. Exactly. All right,

well, thank you so much for your time. Pleasure. Thank you, Liz. Thanks for joining

us. If you want to keep up with us, consider subscribing to the podcast

on your favorite podcasting app or even go to cloudnativecompass.fm.

And if you want us to talk with someone specific or cover

a specific topic, reach out to us on any social media

platform. Until next time, when exploring the cloud native

landscape. On three, on

three. One, two, three. Don't

forget your compass. Forget your compass.

Bye.

Creators and Guests

Host

David Flanagan

I teach people advanced Kubernetes & Cloud Native patterns and practices. I am the founder of the Rawkode Academy and KubeHuddle, and I co-organise Kubernetes London.

Host

Laura Santamaria

🌻💙💛Developer Advocate 🥑 I ❤️ DevOps. Recovering Earth/Atmo Sci educator, cloud aficionado. Curator #AMinuteOnTheMic; cohost The Hallway Track, ex-PulumiTV.

Guest

Liz Rice

Open Source @isovalent / @ciliumproject / #CNCF GB / OpenUK board / O'Reilly author / #AWS hero / #Golang #GDE / 🎶 @insidernine / not paying for my blue tick