1# Compiling P4 to EBPF 2 3Mihai Budiu - mbudiu@barefootnetworks.com 4 5September 22, 2015 6 7## Abstract 8 9This document describes a prototype compiler that translates programs 10written in the P4 programming languages to eBPF programs. The 11translation is performed by generating programs written in a subset of 12the C programming language, that are converted to EBPF using the BPF 13Compiler Collection tools. 14 15The compiler code is licensed under an [Apache v2.0 license] 16(http://www.apache.org/licenses/LICENSE-2.0.html). 17 18## Preliminaries 19 20In this section we give a brief overview of P4 and EBPF. A detailed 21treatment of these topics is outside the scope of this text. 22 23### P4 24 25P4 (http://p4.org) is a domain-specific programming language for 26specifying the behavior of the dataplanes of network-forwarding 27elements. The name of the programming language comes from the title 28of a paper published in the proceedings of SIGCOMM Computer 29Communications Review in 2014: 30http://www.sigcomm.org/ccr/papers/2014/July/0000000.0000004: 31"Programming Protocol-Independent Packet Processors". 32 33P4 itself is protocol-independent but allows programmers to express a 34rich set of data plane behaviors and protocols. The core P4 35abstractions are: 36 37* Header definitions describe the format (the set of fields and their 38 sizes) of each header within a packet. 39 40* Parse graphs (finite-state machines) describe the permitted header 41 sequences within received packets. 42 43* Tables associate keys to actions. P4 tables generalize traditional 44 forwarding tables; they can be used to implement routing tables, 45 flow lookup tables, access-control lists, etc. 46 47* Actions describe how packet header fields and metadata are manipulated. 48 49* Match-action units stitch together tables and actions, and perform 50 the following sequence of operations: 51 52 * Construct lookup keys from packet fields or computed metadata, 53 54 * Use the constructed lookup key to index into tables, choosing an 55 action to execute, 56 57 * Finally, execute the selected action. 58 59* Control flow is expressed as an imperative program describing the 60 data-dependent packet processing within a pipeline, including the 61 data-dependent sequence of match-action unit invocations. 62 63P4 programs describe the behavior of network-processing dataplanes. A 64P4 program is designed to operate in concert with a separate *control 65plane* program. The control plane is responsible for managing at 66runtime the contents of the P4 tables. P4 cannot be used to specify 67control-planes; however, a P4 program implicitly specifies the 68interface between the data-plane and the control-plane. 69 70The P4 language is under active development; the current stable 71version is 1.0.2 (see http://p4.org/spec); a reference implementation 72of a compiler and associated tools is freely available using a Apache 732 open-source license (see http://p4.org/code). 74 75### EBPF 76 77#### Safe code 78 79EBPF is a acronym that stands for Extended Berkeley Packet Filters. 80In essence EBPF is a low-level programming language (similar to 81machine code); EBPF programs are traditionally executed by a virtual 82machine that resides in the Linux kernel. EBPF programs can be 83inserted and removed from a live kernel using dynamic code 84instrumentation. The main feature of EBPF programs is their *static 85safety*: prior to execution all EBPF programs have to be validated as 86being safe, and unsafe programs cannot be executed. A safe program 87provably cannot compromise the machine it is running on: 88 89* it can only access a restricted memory region (on the local stack) 90 91* it can run only for a limited amount of time; during execution it 92 cannot block, sleep or take any locks 93 94* it cannot use any kernel resources with the exception of a limited 95 set of kernel services which have been specifically whitelisted, 96 including operations to manipulate tables (described below) 97 98#### Kernel hooks 99 100EBPF programs are inserted into the kernel using *hooks*. There are 101several types of hooks available: 102 103* any function entry point in the kernel can act as a hook; attaching 104 an EBPF program to a function `foo()` will cause the EBPF program to 105 execute every time some kernel thread executes `foo()`. 106 107* EBPF programs can also be attached using the Linux Traffic Control 108 (TC) subsystem, in the network packet processing datapath. Such 109 programs can be used as TC classifiers and actions. 110 111* EBPF programs can also be attached to sockets or network interfaces. 112 In this case they can be used for processing packets that flow 113 through the socket/interface. 114 115EBPF programs can be used for many purposes; the main use cases are 116dynamic tracing and monitoring, and packet processing. We are mostly 117interested in the latter use case in this document. 118 119#### EBPF Tables 120 121The EBPF runtime exposes a bi-directional kernel-userspace data 122communication channel, called *tables* (also called maps in some EBPF 123documents and code samples). EBPF tables are essentially key-value 124stores, where keys and values are arbitrary fixed-size bitstrings. 125The key width, value width and table size (maximum number of entries 126that can be stored) are declared statically, at table creation time. 127 128In user-space tables handles are exposed as file descriptors. Both 129user- and kernel-space programs can manipulate tables, by inserting, 130deleting, looking up, modifying, and enumerating entries in a table. 131 132In kernel space the keys and values are exposed as pointers to the raw 133underlying data stored in the table, whereas in user-space the 134pointers point to copies of the data. 135 136#### Concurrency 137 138An important aspect to understand related to EBPF is the execution 139model. An EBPF program is triggered by a kernel hook; multiple 140instances of the same kernel hook can be running simultaneously on 141different cores. 142 143Each table however has a single instances across all the cores. A 144single table may be accessed simultaneously by multiple instances of 145the same EBPF program running as separate kernel threads on different 146cores. EBPF tables are native kernel objects, and access to the table 147contents is protected using the kernel RCU mechanism. This makes 148access to table entries safe under concurrent execution; for example, 149the memory associated to a value cannot be accidentally freed while an 150EBPF program holds a pointer to the respective value. However, 151accessing tables is prone to data races; since EBPF programs cannot 152use locks, some of these races often cannot be avoided. 153 154EBPF and the associated tools are also under active development, and 155new capabilities are added frequently. The P4 compiler generates code 156that can be compiled using the BPF Compiler Collection (BCC) 157(https://github.com/iovisor/bcc) 158 159## Compiling P4 to EBPF 160 161From the above description it is apparent that the P4 and EBPF 162programming languages have different expressive powers. However, 163there is a significant overlap in their capabilities, in particular, 164in the domain of network packet processing. The following image 165illustrates the situation: 166 167![P4 and EBPF overlap in capabilities](scope.png) 168 169We expect that the overlapping region will grow in size as both P4 and 170EBPF continue to mature. 171 172The current version of the P4 to EBPF compiler translates programs 173written in the version 1.1 of the P4 programming language to programs 174written in a restricted subset of C. The subset of C is chosen such 175that it should be compilable to EBPF using BCC. 176 177``` 178 -------------- ------- 179P4 ---> | P4-to-EBPF | ---> C ----> | BCC | --> EBPF 180 -------------- ------- 181``` 182 183The P4 program only describes the packet processing *data plane*, that 184runs in the Linux kernel. The *control plane* must be separately 185implemented by the user. The BCC tools simplify this task 186considerably, by generating C and/or Python APIs that expose the 187dataplane/control-plane APIs. 188 189### Dependencies 190 191EBPF programs require a Linux kernel with version 4.2 or newer. 192 193In order to use the P4 to EBPF compiler the following software must be installed: 194 195* The compiler itself is written in the Python (v2.x) programming 196 language. 197 198* the P4 compiler front-end: (https://github.com/p4lang/p4-hlir). 199 This is required for parsing the P4 programs. 200 201* the BCC compiler collection tools: (https://github.com/iovisor/bcc). 202 This is required for compiling the generated code. Also, BCC comes 203 with a set of Python utilities which can be used to implement 204 control-plane programs that operate in concert with the kernel EBPF 205 datapath. 206 207The P4 to EBPF compiler generates code that is designed for being used 208as a classifier using the Linux TC subsystem. 209 210Furthermore, the test code provided is written using the Python (v3.x) 211programming language and requires several Python packages to be 212installed. 213 214### Supported capabilities 215 216The current version of the P4 to EBPF compiler supports a relatively 217narrow subset of the P4 language, but still powerful enough to write 218very complex packet filters and simple packet forwarding engines. In 219the spirit of open-source "release early, release often", we expect 220that the compiler's capabilities will improve gradually. 221 222* Packet filtering is performed using the `drop()` action. Packets 223 that are not dropped will be forwarded. 224 225* Packet forwarding is performed by setting the 226 `standard_metadata.egress_port` to the index of the destination 227 network interface 228 229Here are some limitations imposed on the P4 programs: 230 231* Currently both the ingress and the egress P4 pipelines are executed 232 at the same hook (wherever the user chooses to insert the generated 233 EBPF program). In the future the compiler should probably generate 234 two separate EBPF programs. 235 236* arbitrary parsers can be compiled, but the BCC compiler will reject 237 parsers that contain cycles 238 239* arithmetic on data wider than 32 bits is not supported 240 241* checksum computations are not implemented. In consequence, programs 242 that IP/TCP/UDP headers will produce incorrect packet headers. 243 244* EBPF does not offer support for ternary or LPM tables 245 246* P4 cloning and recirculation and not supported 247 248* meters and registers are not supported; only direct counters are 249 currently supported. EBPF can potentially support registers and 250 arbitrary counters, so these may appear in the future. 251 252* learning (i.e. `generate_digest`) is not implemented 253 254### Translating P4 to C 255 256To simplify the translation, the P4 programmer should refrain using 257identifiers whose name starts with `ebpf_`. 258 259The following table provides a brief summary of how each P4 construct 260is mapped to a corresponding C construct: 261 262#### Translating parsers 263 264P4 Construct | C Translation 265----------|------------ 266`header_type` | `struct` type 267`header` | `struct` instance with an additional `valid` bit 268`metadata` | `struct` instance 269parser state | code block 270state transition | `goto` statement 271`extract` | load/shift/mask data from packet buffer 272 273#### Translating match-action pipelines 274 275P4 Construct | C Translation 276----------|------------ 277table | 2 EBPF tables: second one used just for the default action 278table key | `struct` type 279table `actions` block | tagged `union` with all possible actions 280`action` arguments | `struct` 281table `reads` | EBPF table access 282`action` body | code block 283table `apply` | `switch` statement 284counters | additional EBPF table 285 286### Code organization 287 288The compiler code is organized in two folders: 289 290* `compiler`: the complete compiler source code, in Python v2.x 291 The compiler entry point is `p4toEbpf.py`. 292 293* `test`: testing code and data. There are two testing programs: 294 * `testP4toEbpf.py`: which compiles all P4 files in the testprograms folder 295 296 * `endToEndTest.py`: which compiles and executes the simple.p4 297 program, and includes a simple control plane 298 299Currently the compiler contains no installation capabilities. 300 301### Invoking the compiler 302 303Invoking the compiler is just a matter of invoking the python program 304with a suitable input P4 file: 305 306``` 307p4toEbpf.py file.p4 -o file.c 308``` 309 310#### Compiler options 311 312The P4 compiler first runs the C preprocessor on the input P4 file. 313Some of the command-line options are passed directly to the 314preprocessor. 315 316The following compiler options are available: 317 318Option | Meaning 319-------|-------- 320`-D macro` | Option passed to C preprocessor 321`-I path` | Option passed to C preprocessor 322`-U macro` | Option passed to C preprocessor 323`-g [router|filter]` | Controls whether the generated code behaves like a router or a filter. 324`-o outoutFile` | writes the generated C code to the specified output file. 325 326The `-g` option controls the nature of the generated code: 327 328* `-g filter` generates a filter; the only P4 action that has an 329 effect is the `drop()` action. Setting metadata in P4 (e.g., 330 `egress_port`) has no effect. 331 332* `-g router` generates a simple router; both `drop()` and 333 `egress_port` impact packet processing. 334 335#### Using the generated code 336 337The resulting file contains the complete data structures, tables, and 338a C function named `ebpf_filter` that implements the P4-specified 339data-plane. This C file can be manipulated using the BCC tools; 340please refer to the BCC project documentation and sample test files of 341the P4 to EBPF source code for an in-depth understanding. A minimal 342Python program that compiles and loads into the kernel the generated 343file into EBPF is: 344 345``` 346#!/usr/bin/env python3 347from bcc import BPF 348 349b = BPF(src_file="file.c", debug=0) 350fn = b.load_func("ebpf_filter", BPF.SCHED_CLS) 351``` 352 353##### Connecting the generated program with the TC 354 355The EBPF code that is generated is intended to be used as a classifier 356attached to the ingress packet path using the Linux TC subsystem. The 357same EBPF code should be attached to all interfaces. Note however 358that all EBPF code instances share a single set of tables, which are 359used to control the program behavior. 360 361The following code fragment illustrates how the EBPF code can be 362hooked up to the `eth0` interface using a Python program. (The `fn` 363variable is the one produced by the previous code fragment). 364 365``` 366from pyroute2 import IPRoute 367 368ipr = IPRoute() 369interface_name="eth0" 370if_index = ipr.link_lookup(ifname=interface_name)[0] 371ipr.tc("add", "ingress", if_index, "ffff:") 372ipr.tc("add-filter", "bpf", if_index, ":1", fd=fn.fd, 373 name=fn.name, parent="ffff:", action="ok", classid=1) 374``` 375