.. SPDX-License-Identifier: GPL-2.0

Protected virtual machines (pKVM)
=================================

Introduction
------------

Protected KVM (pKVM) is a KVM/arm64 extension which uses the two-stage
translation capability of the Armv8 MMU to isolate guest memory from the host
system. This allows for the creation of a confidential computing environment
without relying on whizz-bang features in hardware, but still allowing room for
complementary technologies such as memory encryption and hardware-backed
attestation.

The major implementation change brought about by pKVM is that the hypervisor
code running at EL2 is now largely independent of (and isolated from) the rest
of the host kernel running at EL1 and therefore additional hypercalls are
introduced to manage manipulation of guest stage-2 page tables, creation of VM
data structures and reclamation of memory on teardown. An immediate consequence
of this change is that the host itself runs with an identity mapping enabled
at stage-2, providing the hypervisor code with a mechanism to restrict host
access to an arbitrary physical page.

Enabling pKVM
-------------

The pKVM hypervisor is enabled by booting the host kernel at EL2 with
"``kvm-arm.mode=protected``" on the command-line. Once enabled, VMs can be spawned
in either protected or non-protected state, although the hypervisor is still
responsible for managing most of the VM metadata in either case.

Limitations
-----------

Enabling pKVM places some significant limitations on KVM guests, regardless of
whether they are spawned in protected state. It is therefore recommended only
to enable pKVM if protected VMs are required, with non-protected state acting
primarily as a debug and development aid.

If you're still keen, then here is an incomplete list of caveats that apply
to all VMs running under pKVM:

- Guest memory cannot be file-backed (with the exception of shmem/memfd) and is
  pinned as it is mapped into the guest. This prevents the host from
  swapping-out, migrating, merging or generally doing anything useful with the
  guest pages. It also requires that the VMM has either ``CAP_IPC_LOCK`` or
  sufficient ``RLIMIT_MEMLOCK`` to account for this pinned memory.

- GICv2 is not supported and therefore GICv3 hardware is required in order
  to expose a virtual GICv3 to the guest.

- Read-only memslots are unsupported and therefore dirty logging cannot be
  enabled.

- Memslot configuration is fixed once a VM has started running, with subsequent
  move or deletion requests being rejected with ``-EPERM``.

- There are probably many others.

Since the host is unable to tear down the hypervisor when pKVM is enabled,
hibernation (``CONFIG_HIBERNATION``) and kexec (``CONFIG_KEXEC``) will fail
with ``-EBUSY``.

If you are not happy with these limitations, then please don't enable pKVM :)

VM creation
-----------

When pKVM is enabled, protected VMs can be created by specifying the
``KVM_VM_TYPE_ARM_PROTECTED`` flag in the machine type identifier parameter
passed to ``KVM_CREATE_VM``.

Protected VMs are instantiated according to a fixed vCPU configuration
described by the ID register definitions in
``arch/arm64/include/asm/kvm_pkvm.h``. Only a subset of the architectural
features that may be available to the host are exposed to the guest and the
capabilities advertised by ``KVM_CHECK_EXTENSION`` are limited accordingly,
with the vCPU registers being initialised to their architecturally-defined
values.
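
As a rough illustration of the ``KVM_CREATE_VM`` step described above, a VMM
might request a protected VM along the following lines. This is only a sketch:
the helper names ``pkvm_machine_type()`` and ``pkvm_create_protected_vm()`` are
hypothetical, and the fallback value given for ``KVM_VM_TYPE_ARM_PROTECTED`` is
an assumption that exists purely so the example builds against uapi headers
which do not yet define the flag. Always prefer the definition shipped with
the kernel you are targeting.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * ASSUMPTION: uapi headers that predate pKVM do not define this flag. The
 * value below is a placeholder so that the sketch compiles; use the
 * definition provided by your kernel's headers instead.
 */
#ifndef KVM_VM_TYPE_ARM_PROTECTED
#define KVM_VM_TYPE_ARM_PROTECTED	(1UL << 31)
#endif

/* Hypothetical helper: add the protected flag to a machine type identifier. */
unsigned long pkvm_machine_type(unsigned long base_type)
{
	return base_type | KVM_VM_TYPE_ARM_PROTECTED;
}

/*
 * Hypothetical helper: ask KVM for a protected VM using an already-open
 * /dev/kvm file descriptor. Returns the new VM file descriptor on success,
 * or -1 with errno set by ioctl() on failure.
 */
int pkvm_create_protected_vm(int kvm_fd)
{
	return ioctl(kvm_fd, KVM_CREATE_VM, pkvm_machine_type(0));
}
```

A caller would typically obtain ``kvm_fd`` by opening ``/dev/kvm`` read-write
and should treat a negative return value as a sign that protected VMs are
unavailable, for instance because the host was not booted with
``kvm-arm.mode=protected``.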

Where not defined by the architecture, the registers of a protected vCPU
are reset to zero with the exception of the PC and X0 which can be set
either by the ``KVM_SET_ONE_REG`` interface or by a call to PSCI ``CPU_ON``.

VM runtime
----------

By default, memory pages mapped into a protected guest are inaccessible to the
host and any attempt by the host to access such a page will result in the
injection of an abort at EL1 by the hypervisor. For accesses originating from
EL0, the host will then terminate the current task with a ``SIGSEGV``.

pKVM exposes additional hypercalls to protected guests, primarily for the
purpose of establishing shared-memory regions with the host for communication
and I/O. These hypercalls are documented in hypercalls.rst.