Runs PROGRAM inside a sandbox. See minijail0(1) for details.
Run in a PID and VFS namespace without superuser capabilities (but still as root) and with a private view of /proc:
.EX
# minijail0 -p -v -r -c 0 /bin/ps
  PID TTY          TIME CMD
    1 pts/0    00:00:00 minijail0
    2 pts/0    00:00:00 ps
.EE
Running a process with a seccomp filter policy at reduced privileges:
.EX
# minijail0 -S /usr/share/minijail0/$(uname -m)/cat.policy -- \\
            /bin/cat /proc/self/seccomp_filter
...
.EE
Long lines may be broken up using \ at the end.
A policy that emulates seccomp(2) in mode 1 may look like:
.EX
read: 1
write: 1
sig_return: 1
exit: 1
.EE
The "1" acts as a wildcard and allows any use of the mentioned system call. More advanced filtering is possible if your kernel supports CONFIG_FTRACE_SYSCALLS. For example, we can allow a process to open any file read only and mmap PROT_READ only:
.EX
# open with O_LARGEFILE|O_RDONLY|O_NONBLOCK or some combination.
open: arg1 == 32768 || arg1 == 0 || arg1 == 34816 || arg1 == 2048
mmap2: arg2 == 0x0
munmap: 1
close: 1
.EE
The supported arguments may be found by reviewing the system call prototypes in the Linux kernel source code. Be aware that any non-numeric comparison may be subject to time-of-check-time-of-use attacks and cannot be considered safe.
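The numeric values in the open rule above are combinations of open(2) flag bits. As a rough illustration, the Python sketch below rebuilds them from the octal values that 32-bit x86 Linux uses for these flags. The values are an assumption for that architecture and may differ elsewhere, which is one reason to prefer named constants where Minijail supports them.

```python
# Flag values as defined for 32-bit x86 Linux (assumption: other
# architectures may use different numbers for the same flags).
O_RDONLY = 0o0
O_NONBLOCK = 0o4000
O_LARGEFILE = 0o100000

# Every combination accepted by the example open rule.
allowed = sorted({
    O_RDONLY,
    O_NONBLOCK,
    O_LARGEFILE,
    O_LARGEFILE | O_NONBLOCK,
})
print(allowed)  # [0, 2048, 32768, 34816]
```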
execve may only be used when invoking with CAP_SYS_ADMIN privileges.
In order to promote reusability, policy files can include other policy files using the following syntax:
.EX
@include /absolute/path/to/file.policy
@include ./path/relative/to/CWD/file.policy
.EE
Inclusion is limited to a single level (i.e. files that are @included cannot themselves @include more files), since deeper nesting makes policies harder to understand.
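The single-level rule can be pictured with a small Python sketch. This is only a model of the behaviour described above, not Minijail's implementation: the expand_includes function is hypothetical, and a dictionary named files stands in for the file system.

```python
def expand_includes(name, files, depth=0):
    """Return the lines of files[name] with @include lines expanded.

    `files` maps a path to the text of a policy file (a stand-in for
    reading from disk). An @include found inside an included file is
    rejected, mirroring the single-level limit.
    """
    lines = []
    for line in files[name].splitlines():
        stripped = line.strip()
        if stripped.startswith("@include "):
            if depth >= 1:
                raise ValueError(f"{name}: nested @include is not allowed")
            target = stripped[len("@include "):].strip()
            lines.extend(expand_includes(target, files, depth + 1))
        else:
            lines.append(line)
    return lines
```

For example, a top-level policy that includes one other file expands to the concatenation of both, while a second level of inclusion raises an error.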
The comparison operators ==, !=, <, <=, >, and >= have their usual meanings.
& will test for a flag being set, for example, O_RDONLY for open(2):
.EX
open: arg1 & O_RDONLY
.EE
Minijail supports most common named constants, like O_RDONLY. It's preferable to use named constants rather than numeric values as not all architectures use the same numeric value.
When the possible combinations of allowed flags grow, specifying them all can be cumbersome. This is where the in operator comes in handy. The system call will be allowed if and only if the flags set in the argument are included (as a set) in the flags in the policy:
.EX
mmap: arg3 in MAP_PRIVATE|MAP_ANONYMOUS
.EE
This will allow mmap(2) as long as arg3 (flags) has any combination of MAP_PRIVATE and MAP_ANONYMOUS, but nothing else. One common use of this is to restrict mmap(2) / mprotect(2) to only allow write^exec mappings:
.EX
mmap: arg2 in ~PROT_EXEC || arg2 in ~PROT_WRITE
mprotect: arg2 in ~PROT_EXEC || arg2 in ~PROT_WRITE
.EE
A rule that consists only of a return clause, such as "read: return EBADF", will block the read(2) syscall, make it return -1, and set errno to EBADF (9 on x86 platforms).
An expression can also include an optional return <errno> clause, separated by a semicolon:
.EX
read: arg0 == 0; return EBADF
.EE
That is, if the first argument to read is 0, then allow the syscall; else, block the syscall, return -1, and set errno to EBADF.
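The semantics of the in operator and of the return clause can be modelled in a few lines of Python. This is a minimal sketch of the behaviour described above, not Minijail's filter code: the function names are hypothetical, and the flag values are the x86 Linux ones, hard-coded as an assumption.

```python
import errno

# Values from the Linux headers on x86 (assumption; hard-coded so the
# sketch does not depend on the host platform).
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
MAP_SHARED, MAP_PRIVATE, MAP_ANONYMOUS = 0x01, 0x02, 0x20

def flags_in(arg, allowed):
    """Model of `arg in allowed`: no bit outside `allowed` may be set."""
    return (arg & ~allowed) == 0

def prot_allowed(prot):
    """Model of `arg2 in ~PROT_EXEC || arg2 in ~PROT_WRITE`."""
    return flags_in(prot, ~PROT_EXEC) or flags_in(prot, ~PROT_WRITE)

def read_rule(arg0):
    """Model of `read: arg0 == 0; return EBADF`.

    Returns None when the call is allowed, otherwise the errno the
    caller would observe (together with a return value of -1).
    """
    return None if arg0 == 0 else errno.EBADF
```

Under this model a mapping that is writable or executable passes prot_allowed, while one that is both writable and executable does not, which is the write^exec restriction.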
It's also possible to analyze the binary, checking every non-dead function to determine whether it issues system calls. There is no active implementation for this, but something like code.google.com/p/seccompsandbox is one possible runtime variant.
It supports the following syntax:
.EX
% minijail-config-file v0
<option>=<argument>
<no-argument-option>
<empty line>
# any single line comment
.EE
Long lines may be broken up using \ at the end.
The special directive "% minijail-config-file v0" must occupy the first line. "v0" also declares the version of the config file format.
Keys contain only alphabetic characters and '-'. Values can be any non-empty string. Leading and trailing whitespace around keys and values is permitted but will be stripped before processing.
Currently all long options, such as mount and bind-mount, are supported. An option that takes no argument occupies a single line, without '=' or a value. Otherwise, any string given after the '=' is interpreted as the argument.
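The rules above can be summarized as a small parser. The following Python sketch is an illustration only and is not Minijail's parser: parse_config is a hypothetical name, and details the text leaves open (such as joining continued lines before looking for comments) are assumptions.

```python
import re

HEADER = "% minijail-config-file v0"
KEY_RE = re.compile(r"[A-Za-z-]+")  # alphabetic characters and '-'

def parse_config(text):
    """Return a list of (option, argument-or-None) pairs."""
    lines = text.split("\n")
    if lines[0].strip() != HEADER:
        raise ValueError("first line must be the config-file directive")
    entries = []
    pending = ""
    for raw in lines[1:]:
        # A trailing backslash joins this line with the next one.
        # (A dangling continuation at end of input is ignored here.)
        if raw.endswith("\\"):
            pending += raw[:-1]
            continue
        line = (pending + raw).strip()
        pending = ""
        if not line or line.startswith("#"):
            continue  # empty line or single-line comment
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not KEY_RE.fullmatch(key):
            raise ValueError(f"invalid option name: {key!r}")
        if sep and not value:
            raise ValueError(f"option {key!r} has an empty argument")
        entries.append((key, value if sep else None))
    return entries
```

In this model an option without '=' is reported with None as its argument, and an option followed by '=' but no value is rejected because values must be non-empty.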