• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# Compiling P4 to EBPF
2
3Mihai Budiu - mbudiu@barefootnetworks.com
4
5September 22, 2015
6
7## Abstract
8
9This document describes a prototype compiler that translates programs
10written in the P4 programming languages to eBPF programs.  The
11translation is performed by generating programs written in a subset of
12the C programming language, that are converted to EBPF using the BPF
13Compiler Collection tools.
14
15The compiler code is licensed under an [Apache v2.0 license]
16(http://www.apache.org/licenses/LICENSE-2.0.html).
17
18## Preliminaries
19
20In this section we give a brief overview of P4 and EBPF.  A detailed
21treatment of these topics is outside the scope of this text.
22
23### P4
24
25P4 (http://p4.org) is a domain-specific programming language for
26specifying the behavior of the dataplanes of network-forwarding
27elements.  The name of the programming language comes from the title
28of a paper published in the proceedings of SIGCOMM Computer
29Communications Review in 2014:
30http://www.sigcomm.org/ccr/papers/2014/July/0000000.0000004:
31"Programming Protocol-Independent Packet Processors".
32
33P4 itself is protocol-independent but allows programmers to express a
34rich set of data plane behaviors and protocols. The core P4
35abstractions are:
36
37* Header definitions describe the format (the set of fields and their
38  sizes) of each header within a packet.
39
40* Parse graphs (finite-state machines) describe the permitted header
41  sequences within received packets.
42
43* Tables associate keys to actions. P4 tables generalize traditional
44  forwarding tables; they can be used to implement routing tables,
45  flow lookup tables, access-control lists, etc.
46
47* Actions describe how packet header fields and metadata are manipulated.
48
49* Match-action units stitch together tables and actions, and perform
50  the following sequence of operations:
51
52  * Construct lookup keys from packet fields or computed metadata,
53
54  * Use the constructed lookup key to index into tables, choosing an
55  action to execute,
56
57  * Finally, execute the selected action.
58
59* Control flow is expressed as an imperative program describing the
60  data-dependent packet processing within a pipeline, including the
61  data-dependent sequence of match-action unit invocations.
62
63P4 programs describe the behavior of network-processing dataplanes.  A
64P4 program is designed to operate in concert with a separate *control
65plane* program.  The control plane is responsible for managing at
66runtime the contents of the P4 tables.  P4 cannot be used to specify
67control-planes; however, a P4 program implicitly specifies the
68interface between the data-plane and the control-plane.
69
70The P4 language is under active development; the current stable
71version is 1.0.2 (see http://p4.org/spec); a reference implementation
72of a compiler and associated tools is freely available using a Apache
732 open-source license (see http://p4.org/code).
74
75### EBPF
76
77#### Safe code
78
79EBPF is a acronym that stands for Extended Berkeley Packet Filters.
80In essence EBPF is a low-level programming language (similar to
81machine code); EBPF programs are traditionally executed by a virtual
82machine that resides in the Linux kernel.  EBPF programs can be
83inserted and removed from a live kernel using dynamic code
84instrumentation.  The main feature of EBPF programs is their *static
85safety*: prior to execution all EBPF programs have to be validated as
86being safe, and unsafe programs cannot be executed.  A safe program
87provably cannot compromise the machine it is running on:
88
89* it can only access a restricted memory region (on the local stack)
90
91* it can run only for a limited amount of time; during execution it
92  cannot block, sleep or take any locks
93
94* it cannot use any kernel resources with the exception of a limited
95  set of kernel services which have been specifically whitelisted,
96  including operations to manipulate tables (described below)
97
98#### Kernel hooks
99
100EBPF programs are inserted into the kernel using *hooks*.  There are
101several types of hooks available:
102
103* any function entry point in the kernel can act as a hook; attaching
104  an EBPF program to a function `foo()` will cause the EBPF program to
105  execute every time some kernel thread executes `foo()`.
106
107* EBPF programs can also be attached using the Linux Traffic Control
108  (TC) subsystem, in the network packet processing datapath.  Such
109  programs can be used as TC classifiers and actions.
110
111* EBPF programs can also be attached to sockets or network interfaces.
112  In this case they can be used for processing packets that flow
113  through the socket/interface.
114
115EBPF programs can be used for many purposes; the main use cases are
116dynamic tracing and monitoring, and packet processing.  We are mostly
117interested in the latter use case in this document.
118
119#### EBPF Tables
120
121The EBPF runtime exposes a bi-directional kernel-userspace data
122communication channel, called *tables* (also called maps in some EBPF
123documents and code samples).  EBPF tables are essentially key-value
124stores, where keys and values are arbitrary fixed-size bitstrings.
125The key width, value width and table size (maximum number of entries
126that can be stored) are declared statically, at table creation time.
127
128In user-space tables handles are exposed as file descriptors.  Both
129user- and kernel-space programs can manipulate tables, by inserting,
130deleting, looking up, modifying, and enumerating entries in a table.
131
132In kernel space the keys and values are exposed as pointers to the raw
133underlying data stored in the table, whereas in user-space the
134pointers point to copies of the data.
135
136#### Concurrency
137
138An important aspect to understand related to EBPF is the execution
139model.  An EBPF program is triggered by a kernel hook; multiple
140instances of the same kernel hook can be running simultaneously on
141different cores.
142
143Each table however has a single instances across all the cores.  A
144single table may be accessed simultaneously by multiple instances of
145the same EBPF program running as separate kernel threads on different
146cores.  EBPF tables are native kernel objects, and access to the table
147contents is protected using the kernel RCU mechanism.  This makes
148access to table entries safe under concurrent execution; for example,
149the memory associated to a value cannot be accidentally freed while an
150EBPF program holds a pointer to the respective value.  However,
151accessing tables is prone to data races; since EBPF programs cannot
152use locks, some of these races often cannot be avoided.
153
154EBPF and the associated tools are also under active development, and
155new capabilities are added frequently.  The P4 compiler generates code
156that can be compiled using the BPF Compiler Collection (BCC)
157(https://github.com/iovisor/bcc)
158
159## Compiling P4 to EBPF
160
161From the above description it is apparent that the P4 and EBPF
162programming languages have different expressive powers.  However,
163there is a significant overlap in their capabilities, in particular,
164in the domain of network packet processing.  The following image
165illustrates the situation:
166
167![P4 and EBPF overlap in capabilities](scope.png)
168
169We expect that the overlapping region will grow in size as both P4 and
170EBPF continue to mature.
171
172The current version of the P4 to EBPF compiler translates programs
173written in the version 1.1 of the P4 programming language to programs
174written in a restricted subset of C.  The subset of C is chosen such
175that it should be compilable to EBPF using BCC.
176
177```
178         --------------              -------
179P4 --->  | P4-to-EBPF | ---> C ----> | BCC | --> EBPF
180         --------------              -------
181```
182
183The P4 program only describes the packet processing *data plane*, that
184runs in the Linux kernel.  The *control plane* must be separately
185implemented by the user.  The BCC tools simplify this task
186considerably, by generating C and/or Python APIs that expose the
187dataplane/control-plane APIs.
188
189### Dependencies
190
191EBPF programs require a Linux kernel with version 4.2 or newer.
192
193In order to use the P4 to EBPF compiler the following software must be installed:
194
195* The compiler itself is written in the Python (v2.x) programming
196  language.
197
198* the P4 compiler front-end: (https://github.com/p4lang/p4-hlir).
199  This is required for parsing the P4 programs.
200
201* the BCC compiler collection tools: (https://github.com/iovisor/bcc).
202  This is required for compiling the generated code.  Also, BCC comes
203  with a set of Python utilities which can be used to implement
204  control-plane programs that operate in concert with the kernel EBPF
205  datapath.
206
207The P4 to EBPF compiler generates code that is designed for being used
208as a classifier using the Linux TC subsystem.
209
210Furthermore, the test code provided is written using the Python (v3.x)
211programming language and requires several Python packages to be
212installed.
213
214### Supported capabilities
215
216The current version of the P4 to EBPF compiler supports a relatively
217narrow subset of the P4 language, but still powerful enough to write
218very complex packet filters and simple packet forwarding engines.  In
219the spirit of open-source "release early, release often", we expect
220that the compiler's capabilities will improve gradually.
221
222* Packet filtering is performed using the `drop()` action.  Packets
223  that are not dropped will be forwarded.
224
225* Packet forwarding is performed by setting the
226  `standard_metadata.egress_port` to the index of the destination
227  network interface
228
229Here are some limitations imposed on the P4 programs:
230
231* Currently both the ingress and the egress P4 pipelines are executed
232  at the same hook (wherever the user chooses to insert the generated
233  EBPF program).  In the future the compiler should probably generate
234  two separate EBPF programs.
235
236* arbitrary parsers can be compiled, but the BCC compiler will reject
237  parsers that contain cycles
238
239* arithmetic on data wider than 32 bits is not supported
240
241* checksum computations are not implemented.  In consequence, programs
242  that IP/TCP/UDP headers will produce incorrect packet headers.
243
244* EBPF does not offer support for ternary or LPM tables
245
246* P4 cloning and recirculation and not supported
247
248* meters and registers are not supported; only direct counters are
249  currently supported.  EBPF can potentially support registers and
250  arbitrary counters, so these may appear in the future.
251
252* learning (i.e. `generate_digest`) is not implemented
253
254### Translating P4 to C
255
256To simplify the translation, the P4 programmer should refrain using
257identifiers whose name starts with `ebpf_`.
258
259The following table provides a brief summary of how each P4 construct
260is mapped to a corresponding C construct:
261
262#### Translating parsers
263
264P4 Construct | C Translation
265----------|------------
266`header_type` | `struct` type
267`header`      | `struct` instance with an additional `valid` bit
268`metadata`    | `struct` instance
269parser state  | code block
270state transition | `goto` statement
271`extract` | load/shift/mask data from packet buffer
272
273#### Translating match-action pipelines
274
275P4 Construct | C Translation
276----------|------------
277table  | 2 EBPF tables: second one used just for the default action
278table key | `struct` type
279table `actions` block | tagged `union` with all possible actions
280`action` arguments | `struct`
281table `reads` | EBPF table access
282`action` body | code block
283table `apply` | `switch` statement
284counters | additional EBPF table
285
286### Code organization
287
288The compiler code is organized in two folders:
289
290* `compiler`: the complete compiler source code, in Python v2.x
291  The compiler entry point is `p4toEbpf.py`.
292
293* `test`: testing code and data.  There are two testing programs:
294  * `testP4toEbpf.py`: which compiles all P4 files in the testprograms folder
295
296  * `endToEndTest.py`: which compiles and executes the simple.p4
297  program, and includes a simple control plane
298
299Currently the compiler contains no installation capabilities.
300
301### Invoking the compiler
302
303Invoking the compiler is just a matter of invoking the python program
304with a suitable input P4 file:
305
306```
307p4toEbpf.py file.p4 -o file.c
308```
309
310#### Compiler options
311
312The P4 compiler first runs the C preprocessor on the input P4 file.
313Some of the command-line options are passed directly to the
314preprocessor.
315
316The following compiler options are available:
317
318Option | Meaning
319-------|--------
320`-D macro` | Option passed to C preprocessor
321`-I path`  | Option passed to C preprocessor
322`-U macro` | Option passed to C preprocessor
323`-g [router|filter]` | Controls whether the generated code behaves like a router or a filter.
324`-o outoutFile` | writes the generated C code to the specified output file.
325
326The `-g` option controls the nature of the generated code:
327
328* `-g filter` generates a filter; the only P4 action that has an
329  effect is the `drop()` action.  Setting metadata in P4 (e.g.,
330  `egress_port`) has no effect.
331
332* `-g router` generates a simple router; both `drop()` and
333  `egress_port` impact packet processing.
334
335#### Using the generated code
336
337The resulting file contains the complete data structures, tables, and
338a C function named `ebpf_filter` that implements the P4-specified
339data-plane.  This C file can be manipulated using the BCC tools;
340please refer to the BCC project documentation and sample test files of
341the P4 to EBPF source code for an in-depth understanding.  A minimal
342Python program that compiles and loads into the kernel the generated
343file into EBPF is:
344
345```
346#!/usr/bin/env python3
347from bcc import BPF
348
349b = BPF(src_file="file.c", debug=0)
350fn = b.load_func("ebpf_filter", BPF.SCHED_CLS)
351```
352
353##### Connecting the generated program with the TC
354
355The EBPF code that is generated is intended to be used as a classifier
356attached to the ingress packet path using the Linux TC subsystem.  The
357same EBPF code should be attached to all interfaces.  Note however
358that all EBPF code instances share a single set of tables, which are
359used to control the program behavior.
360
361The following code fragment illustrates how the EBPF code can be
362hooked up to the `eth0` interface using a Python program.  (The `fn`
363variable is the one produced by the previous code fragment).
364
365```
366from pyroute2 import IPRoute
367
368ipr = IPRoute()
369interface_name="eth0"
370if_index = ipr.link_lookup(ifname=interface_name)[0]
371ipr.tc("add", "ingress", if_index, "ffff:")
372ipr.tc("add-filter", "bpf", if_index, ":1", fd=fn.fd,
373       name=fn.name, parent="ffff:", action="ok", classid=1)
374```
375