Cgroup SKB
Source Code
Full code for the example in this chapter is available here
What is Cgroup SKB?
Cgroup SKB programs are attached to v2 cgroups and are triggered by network traffic (egress or ingress) associated with processes inside the given cgroup. They let you intercept and filter the traffic associated with particular cgroups (and therefore containers).
What's the difference between Cgroup SKB and Classifiers?
Both Cgroup SKB and Classifiers receive the same type of context, `SkBuffContext`.
The difference is that Classifiers are attached to a network interface, while Cgroup SKB programs are attached to a cgroup.
Example project
This example is similar to the Classifier example: a program that drops egress traffic, but only for a specific cgroup.
Design
We're going to:
- Create a `HashMap` that will act as a blocklist.
- Check the destination IP address from the packet against the `HashMap` to make a policy decision (pass or drop).
- Add entries to the blocklist from userspace.
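The three design steps can be tried out in ordinary userspace Rust before any eBPF is involved. The sketch below is only an illustration: it uses `std::collections::HashMap` in place of the eBPF map, and the `Blocklist` type and the `DROP`/`PASS` constants are hypothetical names chosen for this example, not part of Aya.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// Verdicts, using the same values the eBPF program in this chapter returns.
const DROP: i32 = 0;
const PASS: i32 = 1;

/// Hypothetical userspace stand-in for the eBPF blocklist map.
struct Blocklist {
    entries: HashMap<u32, u32>,
}

impl Blocklist {
    /// Step 1: create the map.
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Step 3: add an address to the blocklist.
    fn block(&mut self, addr: Ipv4Addr) {
        // Only the key matters; the value is a placeholder.
        self.entries.insert(u32::from(addr), 0);
    }

    /// Step 2: decide what to do with a packet sent to `destination`.
    fn verdict(&self, destination: Ipv4Addr) -> i32 {
        if self.entries.contains_key(&u32::from(destination)) {
            DROP
        } else {
            PASS
        }
    }
}

fn main() {
    let mut blocklist = Blocklist::new();
    blocklist.block(Ipv4Addr::new(1, 1, 1, 1));

    println!("1.1.1.1 -> {}", blocklist.verdict(Ipv4Addr::new(1, 1, 1, 1)));
    println!("8.8.8.8 -> {}", blocklist.verdict(Ipv4Addr::new(8, 8, 8, 8)));
}
```

In the real program the map lives in the kernel, the lookup happens in eBPF code, and the insert happens in userspace, but the decision logic has the same shape.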
Generating bindings to vmlinux.h
In this example, we are going to use one kernel structure called `iphdr`, which represents the IP protocol header. We need to generate Rust bindings to it.
First, we must make sure that `bindgen` is installed.
```sh
cargo install bindgen-cli
```
Let's use `xtask` to automate the process of generating bindings so we can easily reproduce it in the future by adding the following code:
Once we've generated our file using `cargo xtask codegen` from the root of the project, we can access it by including `mod bindings` from eBPF code.
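For orientation, the following is roughly what the fixed part of an IPv4 header looks like as a Rust struct. It is a hand-written simplification, not the output of `bindgen`: the generated `iphdr` uses kernel type aliases and represents the version and header-length nibbles differently. The names `IpHdr` and `daddr_offset` are hypothetical and exist only for this sketch.

```rust
/// Simplified, hand-written view of the fixed IPv4 header layout.
#[allow(dead_code)]
#[repr(C)]
#[derive(Default)]
struct IpHdr {
    version_ihl: u8, // version (high nibble) and header length (low nibble)
    tos: u8,
    tot_len: u16,
    id: u16,
    frag_off: u16,
    ttl: u8,
    protocol: u8,
    check: u16,
    saddr: u32, // source address, network byte order
    daddr: u32, // destination address, network byte order
}

/// Byte offset of `daddr` from the start of the header.
fn daddr_offset() -> usize {
    let hdr = IpHdr::default();
    let base = &hdr as *const IpHdr as usize;
    let field = &hdr.daddr as *const u32 as usize;
    field - base
}

fn main() {
    println!("header size  = {}", std::mem::size_of::<IpHdr>());
    println!("daddr offset = {}", daddr_offset());
}
```

The fixed IPv4 header is 20 bytes long and the destination address starts at byte 16. The eBPF code obtains that offset from the generated binding rather than hard-coding it.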
eBPF code
The program starts with the definition of the `BLOCKLIST` map. To enforce the policy, the program looks up the destination IP address in that map. If a map entry for that address exists, we drop the packet by returning `0`. Otherwise, we accept it by returning `1`.
Here's what the eBPF code looks like:
```rust linenums="1" title="cgroup-skb-egress-ebpf/src/main.rs"
#![no_std]
#![no_main]

use aya_ebpf::{
    macros::{cgroup_skb, map},
    maps::{HashMap, PerfEventArray},
    programs::SkBuffContext,
};
use memoffset::offset_of;

use cgroup_skb_egress_common::PacketLog;

#[allow(non_upper_case_globals)]
#[allow(non_snake_case)]
#[allow(non_camel_case_types)]
#[allow(dead_code)]
mod bindings;
use bindings::iphdr;

#[map]
static EVENTS: PerfEventArray<PacketLog> = PerfEventArray::new(0);

#[map] // (1)
static BLOCKLIST: HashMap<u32, u32> = HashMap::with_max_entries(1024, 0);

#[cgroup_skb]
pub fn cgroup_skb_egress(ctx: SkBuffContext) -> i32 {
    match try_cgroup_skb_egress(ctx) {
        Ok(ret) => ret,
        // Drop the packet if it could not be inspected.
        Err(_) => 0,
    }
}

// (2)
fn block_ip(address: u32) -> bool {
    unsafe { BLOCKLIST.get(&address).is_some() }
}

fn try_cgroup_skb_egress(ctx: SkBuffContext) -> Result<i32, i64> {
    // Let non-IPv4 traffic through untouched.
    let protocol = unsafe { (*ctx.skb.skb).protocol };
    if protocol != ETH_P_IP {
        return Ok(1);
    }

    // Read the destination address from the IPv4 header and convert it from
    // network byte order to host byte order.
    let destination = u32::from_be(ctx.load(offset_of!(iphdr, daddr))?);

    // (3)
    let action = if block_ip(destination) { 0 } else { 1 };

    let log_entry = PacketLog {
        ipv4_address: destination,
        action,
    };
    EVENTS.output(&ctx, &log_entry, 0);
    Ok(action)
}

// `ETH_P_IP` (0x0800) as it appears when the big-endian `protocol` field is
// read on a little-endian host.
const ETH_P_IP: u32 = 8;

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
```
1. Create our map.
2. Check if we should allow or deny our packet.
3. Return the correct action.
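Two details of the program above are easy to miss: why `ETH_P_IP` is `8` rather than `0x0800`, and why the destination address goes through `u32::from_be`. The sketch below reproduces both on a plain byte slice in userspace. The helpers `ethertype_as_seen_on_le` and `destination_address` are hypothetical names for this illustration, not Aya APIs, and the offset `16` assumes a standard IPv4 header at the start of the slice.

```rust
/// EtherType for IPv4 as it is defined on the wire.
const ETHERTYPE_IPV4: u16 = 0x0800;

/// Offset of the destination address within an IPv4 header.
const DADDR_OFFSET: usize = 16;

/// Value a little-endian CPU sees when it reads a big-endian EtherType
/// without converting it first.
fn ethertype_as_seen_on_le(ethertype: u16) -> u16 {
    u16::from_le_bytes(ethertype.to_be_bytes())
}

/// Reads the destination address from an IPv4 header and converts it from
/// network byte order to a host-order `u32`.
fn destination_address(header: &[u8]) -> Option<u32> {
    let b = header.get(DADDR_OFFSET..DADDR_OFFSET + 4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

fn main() {
    // Minimal 20-byte header with only the destination address filled in.
    let mut header = [0u8; 20];
    header[16..20].copy_from_slice(&[1, 1, 1, 1]);

    println!("ETH_P_IP seen on LE: {}", ethertype_as_seen_on_le(ETHERTYPE_IPV4));
    println!("destination: {:?}", destination_address(&header));
}
```

Converting to host byte order on the eBPF side means the map key has the same numeric value that userspace gets when it converts an `Ipv4Addr` to `u32`.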
Userspace code
The purpose of the userspace code is to load the eBPF program, attach it to the cgroup and then populate the map with an address to block.
In this example, we'll block all egress traffic going to `1.1.1.1`.
Here's what the code looks like:
```rust linenums="1" title="cgroup-skb-egress/src/main.rs"
use std::net::Ipv4Addr;

use aya::{
    include_bytes_aligned,
    maps::{perf::AsyncPerfEventArray, HashMap},
    programs::{CgroupAttachMode, CgroupSkb, CgroupSkbAttachType},
    util::online_cpus,
    Ebpf,
};
use bytes::BytesMut;
use clap::Parser;
use log::info;
use tokio::{signal, task};

use cgroup_skb_egress_common::PacketLog;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "/sys/fs/cgroup/unified")]
    cgroup_path: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    #[cfg(debug_assertions)]
    let mut bpf = Ebpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/debug/cgroup-skb-egress"
    ))?;
    #[cfg(not(debug_assertions))]
    let mut bpf = Ebpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/release/cgroup-skb-egress"
    ))?;
    let program: &mut CgroupSkb =
        bpf.program_mut("cgroup_skb_egress").unwrap().try_into()?;
    let cgroup = std::fs::File::open(opt.cgroup_path)?;
    // (1)
    program.load()?;
    // (2)
    program.attach(
        cgroup,
        CgroupSkbAttachType::Egress,
        CgroupAttachMode::Single,
    )?;

    let mut blocklist: HashMap<_, u32, u32> =
        HashMap::try_from(bpf.map_mut("BLOCKLIST").unwrap())?;

    // The conversion from `Ipv4Addr` to `u32` cannot fail.
    let block_addr: u32 = Ipv4Addr::new(1, 1, 1, 1).into();

    // (3)
    blocklist.insert(block_addr, 0, 0)?;

    let mut perf_array =
        AsyncPerfEventArray::try_from(bpf.take_map("EVENTS").unwrap())?;

    // Spawn one reader task per online CPU.
    for cpu_id in online_cpus().map_err(|(_, error)| error)? {
        let mut buf = perf_array.open(cpu_id, None)?;

        task::spawn(async move {
            let mut buffers = (0..10)
                .map(|_| BytesMut::with_capacity(1024))
                .collect::<Vec<_>>();

            loop {
                let events = buf.read_events(&mut buffers).await.unwrap();
                for buf in buffers.iter_mut().take(events.read) {
                    // Reinterpret the raw bytes as the `PacketLog` written by the eBPF program.
                    let ptr = buf.as_ptr() as *const PacketLog;
                    let data = unsafe { ptr.read_unaligned() };
                    let dst_addr = Ipv4Addr::from(data.ipv4_address);
                    info!("LOG: DST {}, ACTION {}", dst_addr, data.action);
                }
            }
        });
    }

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}
```
1. Loading the eBPF program.
2. Attaching it to the given cgroup.
3. Populating the map with the remote IP addresses to which we want to block egress traffic.
The third step is done by getting a reference to the `BLOCKLIST` map and calling `blocklist.insert`. Using the `Ipv4Addr` type in Rust lets us write the IP address in its human-readable form and convert it to `u32`, which is an appropriate type to use in eBPF maps.
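As a small standalone illustration of that conversion, the sketch below parses a dotted-quad string into an `Ipv4Addr` and turns it into the `u32` that would serve as a map key. The helper name `key_from_str` is hypothetical; the conversions themselves come from the Rust standard library.

```rust
use std::net::{AddrParseError, Ipv4Addr};

/// Parses a dotted-quad string and returns the host-order `u32` map key.
fn key_from_str(s: &str) -> Result<u32, AddrParseError> {
    let addr: Ipv4Addr = s.parse()?;
    Ok(u32::from(addr))
}

fn main() {
    for text in ["1.1.1.1", "192.168.88.10", "not an address"] {
        match key_from_str(text) {
            // The conversion is reversible, so the key can be turned back
            // into an address for logging.
            Ok(key) => println!("{} -> {:#010x} -> {}", text, key, Ipv4Addr::from(key)),
            Err(err) => println!("{} -> error: {}", text, err),
        }
    }
}
```

The first octet ends up in the most significant byte, so `192.168.88.10` becomes `0xc0a8580a`.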
Testing the program
First, check where cgroups v2 are mounted:
```console
$ mount | grep cgroup2
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
```
The most common locations are either `/sys/fs/cgroup` or `/sys/fs/cgroup/unified`.
Inside that location, we need to create our new cgroup (as root):
```console
# mkdir /sys/fs/cgroup/foo
```
Then run the program with:
```console
RUST_LOG=info cargo xtask run
```
And then, in a separate terminal, as root, try to access `1.1.1.1`:
```console
# bash -c "echo \$$ >> /sys/fs/cgroup/foo/cgroup.procs && curl 1.1.1.1"
```
That command should hang and the logs of our program should look like:
```console
LOG: DST 1.1.1.1, ACTION 0
LOG: DST 1.1.1.1, ACTION 0
```
On the other hand, accessing any other address should be successful, for example:
```console
# bash -c "echo \$$ >> /sys/fs/cgroup/foo/cgroup.procs && curl google.com"
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
```
This should result in the following logs:
```console
LOG: DST 192.168.88.10, ACTION 1
LOG: DST 192.168.88.10, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
```
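To make the link between these log lines and the bytes read from the perf buffer explicit, here is a minimal userspace sketch of the decoding step. The `PacketLog` layout is an assumption inferred from how the fields are used in this chapter (a `u32` address and an `i32` action); the real definition lives in the `cgroup-skb-egress-common` crate, which is not shown here. The helpers `decode` and `format_log` are hypothetical.

```rust
use std::net::Ipv4Addr;

/// Assumed layout of the log record shared between eBPF and userspace.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct PacketLog {
    ipv4_address: u32,
    action: i32,
}

/// Decodes a `PacketLog` from the start of `buf`, if it is long enough.
fn decode(buf: &[u8]) -> Option<PacketLog> {
    if buf.len() < std::mem::size_of::<PacketLog>() {
        return None;
    }
    // SAFETY: the length was checked above, `PacketLog` contains only plain
    // integers, and `read_unaligned` has no alignment requirement.
    Some(unsafe { (buf.as_ptr() as *const PacketLog).read_unaligned() })
}

/// Formats a record the same way the userspace program logs it.
fn format_log(entry: &PacketLog) -> String {
    format!(
        "LOG: DST {}, ACTION {}",
        Ipv4Addr::from(entry.ipv4_address),
        entry.action
    )
}

fn main() {
    // Build the bytes as the kernel side would write them on this machine.
    let mut buf = Vec::new();
    buf.extend_from_slice(&u32::from(Ipv4Addr::new(1, 1, 1, 1)).to_ne_bytes());
    buf.extend_from_slice(&0i32.to_ne_bytes());

    if let Some(entry) = decode(&buf) {
        println!("{}", format_log(&entry));
    }
}
```

Unlike the tutorial code, this sketch checks the buffer length before reading, which is a cheap safeguard when the record size might change.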