1 .. SPDX-License-Identifier: GPL-2.0
4 The Definitive KVM (Kernel-based Virtual Machine) API Documentation
13 - System ioctls: These query and set global attributes which affect the
14 whole kvm subsystem. In addition a system ioctl is used to create
17 - VM ioctls: These query and set attributes that affect an entire virtual
18 machine, for example memory layout. In addition a VM ioctl is used to
24 - vcpu ioctls: These query and set attributes that control the operation
28 the vcpu, except for asynchronous vcpu ioctls that are marked as such in
32 - device ioctls: These query and set attributes that control the operation
51 In general file descriptors can be migrated among processes by means
60 file descriptor, not its creator (process). In other words, the VM and
80 facility that allows backward-compatible extensions to the API to be
104 the ioctl returns -ENOTTY.
122 -----------------------
139 -----------------
158 In order to create user controlled virtual machines on S390, check
167 memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
176 KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
178 address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
196 ioctl() at run-time.
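As an illustration of the bits[7-0] encoding described above, the following is a minimal sketch, not the kernel's own definition. The mask and helper names are hypothetical and chosen only for this example; real code would use the KVM_VM_TYPE_ARM_IPA_SIZE() macro named in the text.

```c
#include <stdint.h>

/* Illustrative mask (assumption): the low 8 bits of the machine type
 * identifier carry IPA_Bits, as the text describes. */
#define EXAMPLE_IPA_SIZE_MASK 0xffULL

/* Place the requested IPA size (in bits) into bits[7-0] of the type. */
static inline uint64_t example_vm_type_for_ipa_bits(unsigned int ipa_bits)
{
    return (uint64_t)ipa_bits & EXAMPLE_IPA_SIZE_MASK;
}

/* Recover IPA_Bits from a machine type identifier. */
static inline unsigned int example_ipa_bits_from_vm_type(uint64_t type)
{
    return (unsigned int)(type & EXAMPLE_IPA_SIZE_MASK);
}
```

A caller would normally query the supported limit through KVM_CHECK_EXTENSION first and only then build the machine type.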
202 exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
208 ----------------------------------------------------------
213 :Parameters: struct kvm_msr_list (in/out)
214 :Returns: 0 on success; -1 on error
220 E2BIG the msr index list is too big to fit in the array specified by
227 __u32 nmsrs; /* number of msrs in entries */
231 The user fills in the size of the indices array in nmsrs, and in return
232 kvm adjusts nmsrs to reflect the actual number of msrs and fills in the
239 not returned in the MSR list, as different vcpus can have a different number
250 -----------------------
262 additional information in the integer return value.
269 --------------------------
275 :Returns: size of vcpu mmap area, in bytes
282 the VCPU file descriptor can be mmap-ed, including:
284 - if KVM_CAP_COALESCED_MMIO is available, a page at
286 this page is included in the result of KVM_GET_VCPU_MMAP_SIZE.
289 - if KVM_CAP_DIRTY_LOG_RING is available, a number of pages at
295 -------------------
301 :Returns: vcpu fd on success, -1 on error
304 The vcpu id is an integer in the range [0, max_vcpu_id).
307 the KVM_CHECK_EXTENSION ioctl() at run-time.
309 KVM_CAP_MAX_VCPUS of the KVM_CHECK_EXTENSION ioctl() at run-time.
317 KVM_CAP_MAX_VCPU_ID of the KVM_CHECK_EXTENSION ioctl() at run-time.
323 threads in one or more virtual CPU cores. (This is because the
324 hardware requires all the hardware threads in a CPU core to be in the
327 dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
328 given vcore will always be in the same physical core as each other
332 single-threaded guest vcpus, it should make all vcpu ids be a multiple
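The vcpu-id-to-vcore relationship described above is plain integer division. The sketch below shows it under the assumption that the number of vcpus per vcore is known and nonzero; both helper names are hypothetical.

```c
#include <stdint.h>

/* Virtual core that a vcpu id maps to: vcpu id divided by the number of
 * vcpus per vcore. vcpus_per_vcore must be nonzero. */
static inline uint32_t example_vcore_of(uint32_t vcpu_id,
                                        uint32_t vcpus_per_vcore)
{
    return vcpu_id / vcpus_per_vcore;
}

/* A vcpu id that is a multiple of vcpus_per_vcore, i.e. the first
 * thread slot of the given vcore. */
static inline uint32_t example_first_vcpu_id_in_vcore(uint32_t vcore,
                                                      uint32_t vcpus_per_vcore)
{
    return vcore * vcpus_per_vcore;
}
```

With four vcpus per vcore, ids 4 to 7 land in vcore 1, and a userspace that wants one vcpu per core would use ids 0, 4, 8, and so on.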
337 KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
342 --------------------------------
347 :Parameters: struct kvm_dirty_log (in/out)
348 :Returns: 0 on success, -1 on error
363 since the last call to this ioctl. Bit 0 is the first page in the
367 If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of the slot field specify
371 The bits in the dirty bitmap are cleared before the ioctl returns, unless
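A short sketch of how userspace might scan a dirty bitmap, given that bit 0 stands for the first page of the slot. It assumes the bitmap is viewed as an array of 64-bit words in host byte order; the helper names are hypothetical and not part of the KVM API.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if page number `page` (0 = first page of the slot) is dirty.
 * Assumption: the bitmap is an array of 64-bit words in host order. */
static inline int example_page_is_dirty(const uint64_t *bitmap, size_t page)
{
    return (int)((bitmap[page / 64] >> (page % 64)) & 1);
}

/* Count the dirty pages among the first `npages` pages of the slot. */
static size_t example_count_dirty(const uint64_t *bitmap, size_t npages)
{
    size_t n = 0;

    for (size_t page = 0; page < npages; page++)
        n += (size_t)example_page_is_dirty(bitmap, page);
    return n;
}
```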
380 ------------
386 :Returns: 0 on success, -1 on error
407 -----------------
413 :Returns: 0 on success, -1 on error
421 /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
431 /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
440 /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
447 -----------------
452 :Parameters: struct kvm_regs (in)
453 :Returns: 0 on success, -1 on error
461 ------------------
467 :Returns: 0 on success, -1 on error
484 /* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */
492 ------------------
497 :Parameters: struct kvm_sregs (in)
498 :Returns: 0 on success, -1 on error
505 ------------------
510 :Parameters: struct kvm_translation (in/out)
511 :Returns: 0 on success, -1 on error
519 /* in */
532 ------------------
537 :Parameters: struct kvm_interrupt (in)
546 /* in */
557 -EEXIST if an interrupt is already enqueued
558 -EINVAL the irq number is invalid
559 -ENXIO if the PIC is in the kernel
560 -EFAULT if the pointer is invalid
564 ioctl is useful if the in-kernel PIC is not used.
604 RISC-V:
631 -----------------
636 :Parameters: struct kvm_msrs (in/out)
638 -1 on error
641 Reads the values of MSR-based features that are available for the VM. This
643 The list of msr-based features can be obtained using KVM_GET_MSR_FEATURE_INDEX_LIST
644 in a system ioctl.
647 Reads model-specific registers from the vcpu. Supported msr indices can
648 be obtained using KVM_GET_MSR_INDEX_LIST in a system ioctl.
653 __u32 nmsrs; /* number of msrs in entries */
667 kvm will fill in the 'data' member.
671 -----------------
676 :Parameters: struct kvm_msrs (in)
677 :Returns: number of msrs successfully set (see below), -1 on error
679 Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
686 It tries to set the MSRs in array entries[] one by one. If setting an MSR
693 ------------------
698 :Parameters: struct kvm_cpuid (in)
699 :Returns: 0 on success, -1 on error
705 - If this IOCTL fails, KVM gives no guarantees that previous valid CPUID
707 of the resulting CPUID configuration through KVM_GET_CPUID2 if needed.
708 - Using KVM_SET_CPUID{,2} after KVM_RUN, i.e. changing the guest vCPU model
710 - Using heterogeneous CPUID configurations, modulo APIC IDs, topology, etc...
733 ------------------------
738 :Parameters: struct kvm_signal_mask (in)
739 :Returns: 0 on success, -1 on error
744 their traditional behaviour) will cause KVM_RUN to return with -EINTR.
759 ----------------
765 :Returns: 0 on success, -1 on error
776 __u8 ftwx; /* in fxsave format */
797 ----------------
802 :Parameters: struct kvm_fpu (in)
803 :Returns: 0 on success, -1 on error
814 __u8 ftwx; /* in fxsave format */
835 -----------------------
841 :Returns: 0 on success, -1 on error
843 Creates an interrupt controller model in the kernel.
845 future vcpus to have a local APIC. IRQ routing for GSIs 0-15 is set to both
846 PIC and IOAPIC; GSI 16-23 only go to the IOAPIC.
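The default routing stated above can be written as a small lookup. This is only a sketch of that rule; the flag values and function name are hypothetical and do not correspond to KVM constants.

```c
/* Illustrative destination flags (assumption, not KVM constants). */
enum {
    EXAMPLE_ROUTE_PIC    = 1 << 0,
    EXAMPLE_ROUTE_IOAPIC = 1 << 1
};

/* Default destinations for a GSI after KVM_CREATE_IRQCHIP on x86:
 * GSIs 0-15 go to both PIC and IOAPIC, GSIs 16-23 only to the IOAPIC. */
static inline unsigned int example_default_gsi_routes(unsigned int gsi)
{
    if (gsi <= 15)
        return EXAMPLE_ROUTE_PIC | EXAMPLE_ROUTE_IOAPIC;
    if (gsi <= 23)
        return EXAMPLE_ROUTE_IOAPIC;
    return 0; /* no default route described for higher GSIs */
}
```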
857 -----------------
863 :Returns: 0 on success, -1 on error
865 Sets the level of a GSI input to the interrupt controller model in the kernel.
867 been previously created with KVM_CREATE_IRQCHIP. Note that edge-triggered
870 On real hardware, interrupt pins can be active-low or active-high. This
875 (active-low/active-high) for level-triggered interrupts, and KVM used
876 to consider the polarity. However, due to bitrot in the handling of
877 active-low interrupts, the above convention is now valid on x86 too.
879 should not present interrupts to the guest as active-low unless this
880 capability is present (or unless it is not using the in-kernel irqchip,
885 in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
894 - KVM_ARM_IRQ_TYPE_CPU:
895 out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
896 - KVM_ARM_IRQ_TYPE_SPI:
897 in-kernel GIC: SPI, irq_id between 32 and 1019 (incl.)
899 - KVM_ARM_IRQ_TYPE_PPI:
900 in-kernel GIC: PPI, irq_id between 16 and 31 (incl.)
902 (The irq_id field thus corresponds nicely to the IRQ ID in the ARM GIC specs)
904 In both cases, level is used to assert/deassert the line.
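The irq_id ranges listed above for the in-kernel GIC can be checked with a helper like the following sketch. The enum and function names are hypothetical; only the numeric ranges come from the text.

```c
/* Illustrative classification of an in-kernel GIC irq_id. */
enum example_gic_irq_class {
    EXAMPLE_GIC_OTHER, /* outside the PPI and SPI ranges given above */
    EXAMPLE_GIC_PPI,   /* 16..31 inclusive */
    EXAMPLE_GIC_SPI    /* 32..1019 inclusive */
};

static inline enum example_gic_irq_class
example_classify_gic_irq_id(unsigned int irq_id)
{
    if (irq_id >= 16 && irq_id <= 31)
        return EXAMPLE_GIC_PPI;
    if (irq_id >= 32 && irq_id <= 1019)
        return EXAMPLE_GIC_SPI;
    return EXAMPLE_GIC_OTHER;
}
```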
911 injection of interrupts for the in-kernel irqchip. KVM_IRQ_LINE can always
926 --------------------
931 :Parameters: struct kvm_irqchip (in/out)
932 :Returns: 0 on success, -1 on error
951 --------------------
956 :Parameters: struct kvm_irqchip (in)
957 :Returns: 0 on success, -1 on error
976 -----------------------
981 :Parameters: struct kvm_xen_hvm_config (in)
982 :Returns: 0 on success, -1 on error
986 blobs in userspace. When the guest writes the MSR, kvm copies one
987 page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
1003 be set in the flags field of this ioctl:
1007 intercepted and passed to userspace through KVM_EXIT_XEN. In this
1013 structures directly. This, in turn, may allow KVM to enable features
1019 No other flags are currently valid in the struct kvm_xen_hvm_config.
1022 ------------------
1028 :Returns: 0 on success, -1 on error
1030 Gets the current timestamp of kvmclock as seen by the current guest. In
1035 set of bits that KVM can return in struct kvm_clock_data's flag member.
1048 If set, the `realtime` field in the kvm_clock_data
1054 If set, the `host_tsc` field in the kvm_clock_data
1072 ------------------
1077 :Parameters: struct kvm_clock_data (in)
1078 :Returns: 0 on success, -1 on error
1080 Sets the current timestamp of kvmclock to the value specified in its parameter.
1081 In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity in scenarios
1089 KVM_SET_CLOCK was called. The difference in elapsed time is added to the final
1107 ------------------------
1114 :Returns: 0 on success, -1 on error
1157 The following bits are defined in the flags field:
1159 - KVM_VCPUEVENT_VALID_SHADOW may be set to signal that
1162 - KVM_VCPUEVENT_VALID_SMM may be set to signal that smi contains a
1165 - KVM_VCPUEVENT_VALID_PAYLOAD may be set to signal that the
1170 - KVM_VCPUEVENT_VALID_TRIPLE_FAULT may be set to signal that the
1177 If the guest accesses a device that is being emulated by the host kernel in
1189 guest-visible registers. It is not possible to 'cancel' an SError that has been
1192 A device being emulated in user-space may also wish to generate an SError. To do
1193 this the events structure can be populated by user-space. The current state
1201 advertise KVM_CAP_ARM_INJECT_SERROR_ESR. In this case exception.has_esr will
1202 always have a non-zero value when read, and the agent making an SError pending
1203 should specify the ISS field in the lower 24 bits of exception.serror_esr. If
1204 the system supports KVM_CAP_ARM_INJECT_SERROR_ESR, but user-space sets the events
1208 -EINVAL. Setting anything other than the lower 24 bits of exception.serror_esr
1209 will return -EINVAL.
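The 24-bit restriction on exception.serror_esr can be pre-checked in userspace before the ioctl is issued. The sketch below mirrors the rule stated above; the mask and function names are hypothetical, and the kernel's own validation remains authoritative.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative mask (assumption): the ISS occupies the lower 24 bits. */
#define EXAMPLE_SERROR_ISS_MASK 0x00ffffffULL

/* Returns 0 if only ISS bits are set, -EINVAL otherwise, matching the
 * behaviour the text describes for KVM_SET_VCPU_EVENTS. */
static inline int example_check_serror_esr(uint64_t serror_esr)
{
    return (serror_esr & ~EXAMPLE_SERROR_ISS_MASK) ? -EINVAL : 0;
}
```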
1230 ------------------------
1236 :Parameters: struct kvm_vcpu_events (in)
1237 :Returns: 0 on success, -1 on error
1249 smi.pending. Keep the corresponding bits in the flags field cleared to
1250 suppress overwriting the current in-kernel state. The bits are:
1255 KVM_VCPUEVENT_VALID_SMM transfer the smi sub-struct.
1258 If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
1265 can be set in the flags field to signal that the
1270 can be set in the flags field to signal that the triple_fault field contains
1288 KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in
1297 ----------------------
1303 :Returns: 0 on success, -1 on error
1319 ----------------------
1324 :Parameters: struct kvm_debugregs (in)
1325 :Returns: 0 on success, -1 on error
1334 -------------------------------
1339 :Parameters: struct kvm_userspace_memory_region (in)
1340 :Returns: 0 on success, -1 on error
1357 memory slot. Bits 0-15 of "slot" specify the slot id and this value
1360 Slots may not overlap in guest physical address space.
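A minimal sketch of the two rules above: packing the slot id into bits 0-15 of "slot" (with the address space id in bits 16-31 when KVM_CAP_MULTI_ADDRESS_SPACE is available), and testing two slots for overlap in guest physical address space. The helper names are hypothetical, sizes are assumed nonzero, and the overlap test ignores address wrap-around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build the "slot" field: slot id in bits 0-15, address space id in
 * bits 16-31 (pass 0 when multiple address spaces are not in use). */
static inline uint32_t example_slot_field(uint16_t as_id, uint16_t slot_id)
{
    return ((uint32_t)as_id << 16) | slot_id;
}

/* True if [gpa_a, gpa_a + size_a) and [gpa_b, gpa_b + size_b) intersect.
 * Assumes nonzero sizes and that neither range wraps past 2^64. */
static inline bool example_slots_overlap(uint64_t gpa_a, uint64_t size_a,
                                         uint64_t gpa_b, uint64_t size_b)
{
    return gpa_a < gpa_b + size_b && gpa_b < gpa_a + size_a;
}
```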
1362 If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of "slot"
1365 KVM_CAP_MULTI_ADDRESS_SPACE capability. Slots in separate address spaces
1370 an existing slot, it may be moved in the guest physical memory space,
1382 be identical. This allows large pages in the guest to be backed by large
1383 pages in the host.
1386 KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
1389 to make a new slot read-only. In this case, writes to this memory will be
1392 When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
1397 Note: On arm64, a write generated by the page-table walker (to update
1398 the Access and Dirty flags, for example) never results in a
1401 page-table walker, making it impossible to emulate the access.
1402 Instead, an abort (data abort if the cause of the page-table update
1404 fetch) is injected in the guest.
1409 Returns -EINVAL if the VM has the KVM_VM_S390_UCONTROL flag set.
1410 Returns -EINVAL if called on a protected VM.
1413 ---------------------
1418 :Parameters: unsigned long tss_address (in)
1419 :Returns: 0 on success, -1 on error
1421 This ioctl defines the physical address of a three-page region in the guest
1427 This ioctl is required on Intel-based hosts. This is needed on Intel hardware
1428 because of a quirk in the virtualization implementation (see the internals
1433 -------------------
1438 :Parameters: struct kvm_enable_cap (in)
1439 :Returns: 0 on success; -1 on error
1444 :Parameters: struct kvm_enable_cap (in)
1445 :Returns: 0 on success; -1 on error
1461 /* in */
1484 The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
1485 for vm-wide capabilities.
1488 ---------------------
1494 :Returns: 0 on success; -1 on error
1519 KVM_MP_STATE_CHECK_STOP the vcpu is in a special error state [s390]
1522 KVM_MP_STATE_LOAD the vcpu is in a special load/startup state
1524 KVM_MP_STATE_SUSPENDED the vcpu is in a suspend state and is waiting
1529 in-kernel irqchip, the multiprocessing state must be maintained by userspace on
1535 If a vCPU is in the KVM_MP_STATE_SUSPENDED state, KVM will emulate the
1542 event in subsequent calls to KVM_RUN.
1546 If userspace intends to keep the vCPU in a SUSPENDED state, it is
1569 ---------------------
1574 :Parameters: struct kvm_mp_state (in)
1575 :Returns: 0 on success; -1 on error
1581 in-kernel irqchip, the multiprocessing state must be maintained by userspace on
1594 ------------------------------
1599 :Parameters: unsigned long identity (in)
1600 :Returns: 0 on success, -1 on error
1602 This ioctl defines the physical address of a one-page region in the guest
1608 Setting the address to 0 will result in resetting the address to its default
1611 This ioctl is required on Intel-based hosts. This is needed on Intel hardware
1612 because of a quirk in the virtualization implementation (see the internals
1618 ------------------------
1624 :Returns: 0 on success, -1 on error
1627 as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default
1633 ------------------
1639 :Returns: 0 on success, -1 on error
1653 ------------------
1658 :Parameters: struct kvm_xsave (in)
1659 :Returns: 0 on success, -1 on error
1674 enabled with ``arch_prctl()``, but this may change in the future.
1676 The offsets of the state save areas in struct kvm_xsave follow the
1681 -----------------
1687 :Returns: 0 on success, -1 on error
1708 -----------------
1713 :Parameters: struct kvm_xcrs (in)
1714 :Returns: 0 on success, -1 on error
1735 ----------------------------
1740 :Parameters: struct kvm_cpuid2 (in/out)
1741 :Returns: 0 on success, -1 on error
1767 hardware and kvm in its default configuration. Userspace can use the
1774 Dynamically-enabled feature bits need to be requested with
1779 expose cpuid features (e.g. MONITOR) which are not supported by kvm in
1784 with the 'nent' field indicating the number of entries in the variable-size
1789 entries in the 'entries' array, which is then filled.
1793 x2apic), may not be present in the host cpu, but are exposed by kvm if it can
1794 emulate them efficiently. The fields in each entry are defined as follows:
1820 feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
1824 -----------------------
1846 additional piece of information will be set in the flags bitmap.
1854 ------------------------
1859 :Parameters: struct kvm_irq_routing (in)
1860 :Returns: 0 on success, -1 on error
1866 - GSI routing does not apply to KVM_IRQ_LINE but only to KVM_IRQFD.
1904 - KVM_MSI_VALID_DEVID: used along with KVM_IRQ_ROUTING_MSI routing entry
1905 type, specifies that the devid field contains a valid value. The per-VM
1909 - zero otherwise
1930 BDF identifier in the lower 16 bits.
1934 address_hi bits 31-8 provide bits 31-8 of the destination id. Bits 7-0 of
1960 in its indication of supported features, routing to Xen event channels
1963 2 level event channels. FIFO event channel support may be added in
1968 --------------------
1974 :Returns: 0 on success, -1 on error
1984 --------------------
1990 :Returns: virtual tsc-khz on success, negative value on error
1993 KHz. If the host has unstable tsc this ioctl returns -EIO instead as an
1998 ------------------
2004 :Returns: 0 on success, -1 on error
2014 data format and layout are the same as documented in the architecture manual.
2018 (reported by MSR_IA32_APICBASE) of its VCPU. x2APIC stores APIC ID in
2019 the APIC_ID register (bytes 32-35). xAPIC only allows an 8-bit APIC ID
2020 which is stored in bits 31-24 of the APIC register, or equivalently in
2029 ------------------
2034 :Parameters: struct kvm_lapic_state (in)
2035 :Returns: 0 on success, -1 on error
2045 and layout are the same as documented in the architecture manual.
2047 The format of the APIC ID register (bytes 32-35 of struct kvm_lapic_state's
2049 See the note in KVM_GET_LAPIC.
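To make the APIC ID layout concrete, the sketch below reads the APIC_ID register from a register image and extracts the ID for both formats. It assumes a little-endian image with the register at bytes 32-35, as described in the KVM_GET_LAPIC note; the offset macro and function names are hypothetical, and the buffer stands in for the regs field of struct kvm_lapic_state.

```c
#include <stdint.h>

/* Byte offset of the APIC_ID register within the register image. */
#define EXAMPLE_APIC_ID_OFFSET 32

/* Assemble the 32-bit APIC_ID register from bytes 32-35 (little-endian). */
static inline uint32_t example_read_apic_id_reg(const uint8_t *regs)
{
    const uint8_t *p = regs + EXAMPLE_APIC_ID_OFFSET;

    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* x2APIC format: the whole register is the APIC ID. */
static inline uint32_t example_x2apic_id(const uint8_t *regs)
{
    return example_read_apic_id_reg(regs);
}

/* xAPIC format: the 8-bit APIC ID lives in bits 31-24. */
static inline uint8_t example_xapic_id(const uint8_t *regs)
{
    return (uint8_t)(example_read_apic_id_reg(regs) >> 24);
}
```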
2053 ------------------
2058 :Parameters: struct kvm_ioeventfd (in)
2062 within the guest. A guest write to the registered address will signal the
2076 For the special case of virtio-ccw devices on s390, the ioevent is matched
2088 to the registered address is equal to datamatch in struct kvm_ioeventfd.
2090 For virtio-ccw devices, addr contains the subchannel id and datamatch the
2099 ------------------
2104 :Parameters: struct kvm_dirty_tlb (in)
2105 :Returns: 0 on success, -1 on error
2114 This must be called whenever userspace has changed an entry in the shared
2122 Each bit corresponds to one TLB entry, ordered the same as in the shared TLB
2125 The array is little-endian: bit 0 is the least significant bit of the
2131 be set to the number of set bits in the bitmap.
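A sketch of how userspace might maintain such a bitmap: one helper marks a TLB entry dirty using the little-endian bit order described above, and another counts the set bits, which is the value num_dirty is expected to carry. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Mark TLB entry `entry` dirty: bit 0 is the least significant bit of
 * the first byte, bit 8 the least significant bit of the second byte. */
static inline void example_mark_tlb_entry_dirty(uint8_t *bitmap, size_t entry)
{
    bitmap[entry / 8] |= (uint8_t)(1u << (entry % 8));
}

/* Count the set bits in the bitmap (a candidate value for num_dirty). */
static uint32_t example_count_set_bits(const uint8_t *bitmap, size_t nbytes)
{
    uint32_t count = 0;

    for (size_t i = 0; i < nbytes; i++)
        for (uint8_t b = bitmap[i]; b; b &= (uint8_t)(b - 1))
            count++; /* each iteration clears the lowest set bit */
    return count;
}
```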
2135 -------------------------
2140 :Parameters: struct kvm_create_spapr_tce (in)
2144 is an IOMMU for PAPR-style virtual I/O. It is used to translate
2145 logical addresses used in virtual I/O into guest physical addresses,
2158 which this TCE table will translate - the table will contain one 64
2163 in real mode, updating the TCE table. H_PUT_TCE calls for other
2168 the entries written by kernel-handled H_PUT_TCE calls, and also lets
2169 userspace update the TCE table directly which is useful in some
2174 ---------------------
2183 time by the kernel. An RMA is a physically-contiguous, aligned region
2184 of memory used on older POWER processors to provide the memory which
2185 will be accessed by real-mode (MMU off) accesses in a KVM guest.
2186 POWER processors support a set of sizes for the RMA that usually
2199 RMA for a virtual machine. The size of the RMA in bytes (which is
2200 fixed at host kernel boot time) is returned in the rma_size field of
2210 ------------
2216 :Returns: 0 on success, -1 on error
2226 - pause the vcpu
2227 - read the local APIC's state (KVM_GET_LAPIC)
2228 - check whether changing LINT1 will queue an NMI (see the LVT entry for LINT1)
2229 - if so, issue KVM_NMI
2230 - resume the vcpu
2232 Some guests configure the LINT1 NMI input to cause a panic, aiding in
2237 ----------------------
2242 :Parameters: struct kvm_s390_ucas_mapping (in)
2243 :Returns: 0 in case of success
2259 ------------------------
2264 :Parameters: struct kvm_s390_ucas_mapping (in)
2265 :Returns: 0 in case of success
2275 This ioctl unmaps the memory in the vcpu's address space starting at
2281 ------------------------
2286 :Parameters: vcpu absolute address (in)
2287 :Returns: 0 in case of success
2294 controlled virtual machines to fault in the virtual cpu's lowcore pages
2299 --------------------
2304 :Parameters: struct kvm_one_reg (in)
2311 EINVAL invalid register ID, or no such register or used with VMs in
2319 code being returned in a specific situation.)
2329 defined by user space with the passed in struct kvm_one_reg, where id
2333 and their own constants and width. To keep track of the implemented
2542 ARM 32-bit CP15 registers have the following id bit patterns::
2546 ARM 64-bit CP15 registers have the following id bit patterns::
2554 ARM 32-bit VFP control registers have the following id bit patterns::
2558 ARM 64-bit FP registers have the following id bit patterns::
2562 ARM firmware pseudo-registers have the following bit pattern::
2570 arm64 core/FP-SIMD registers have the following id bit patterns. Note
2573 value in the kvm_regs structure seen as a 32bit array::
2604 .. [1] These encodings are not accepted for SVE-enabled vcpus. See
2629 arm64 firmware pseudo-registers have the following bit pattern::
2638 0x6060 0000 0015 ffff KVM_REG_ARM64_SVE_VLS pseudo-register
2641 ENOENT. max_vq is the vcpu's maximum supported vector length in 128-bit
2647 In addition, except for KVM_REG_ARM64_SVE_VLS, these registers are not
2652 KVM_REG_ARM64_SVE_VLS is a pseudo-register that allows the set of vector
2662 ((vector_lengths[(vq - KVM_ARM64_SVE_VQ_MIN) / 64] >>
2663 ((vq - KVM_ARM64_SVE_VQ_MIN) % 64)) & 1))
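The bitmap test in the fragment above can be wrapped in a standalone helper. The following is a sketch only: the two constants are local stand-ins (assumed here to be 1 and 8) for KVM_ARM64_SVE_VQ_MIN and the number of 64-bit words in the vector-length set, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins (assumptions) for the kernel's constants. */
#define EXAMPLE_SVE_VQ_MIN    1
#define EXAMPLE_SVE_VLS_WORDS 8

/* True if vector length vq (in 128-bit quadwords) is within range and
 * its bit is set in the KVM_REG_ARM64_SVE_VLS style bitmap. */
static bool example_sve_vq_supported(const uint64_t *vector_lengths,
                                     unsigned int vq, unsigned int max_vq)
{
    if (vq < EXAMPLE_SVE_VQ_MIN || vq > max_vq)
        return false;

    unsigned int bit = vq - EXAMPLE_SVE_VQ_MIN;

    if (bit / 64 >= EXAMPLE_SVE_VLS_WORDS)
        return false; /* outside the bitmap */
    return (vector_lengths[bit / 64] >> (bit % 64)) & 1;
}
```

A set bit for vq means vector length vq * 16 bytes is available to the vcpu.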
2685 is hardware-dependent and may not be available. Attempting to configure
2692 arm64 bitmap feature firmware pseudo-registers have the following bit pattern::
2705 run at least once. A KVM_SET_ONE_REG in such a scenario will return
2706 a -EBUSY to userspace.
2719 patterns depending on whether they're 32-bit or 64-bit registers::
2721 0x7020 0000 0001 00 <reg:5> <sel:3> (32-bit)
2722 0x7030 0000 0001 00 <reg:5> <sel:3> (64-bit)
2726 hardware, host kernel, guest, and whether XPA is present in the guest, i.e.
2727 with the RI and XI bits (if they exist) in bits 63 and 62 respectively, and
2747 0x7020 0000 0003 00 <0:3> <reg:5> (32-bit FPU registers)
2748 0x7030 0000 0003 00 <0:3> <reg:5> (64-bit FPU registers)
2749 0x7040 0000 0003 00 <0:3> <reg:5> (128-bit MSA vector registers)
2761 RISC-V registers are mapped using the lower 32 bits. The upper 8 bits of
2764 RISC-V config registers are meant for configuring a Guest VCPU and it has
2770 Following are the RISC-V config registers:
2782 RISC-V core registers represent the general execution state of a Guest VCPU
2788 Following are the RISC-V core registers:
2825 0x80x0 0000 0200 0020 mode Privilege mode (1 = S-mode or 0 = U-mode)
2828 RISC-V csr registers represent the supervisor mode control/status registers
2834 Following are the RISC-V csr registers:
2850 RISC-V timer registers represent the timer state of a Guest VCPU and it has
2855 Following are the RISC-V timer registers:
2860 0x8030 0000 0400 0000 frequency Time base frequency (read-only)
2866 RISC-V F-extension registers represent the single precision floating point
2871 Following are the RISC-V F-extension registers:
2882 RISC-V D-extension registers represent the double precision floating point
2886 0x8030 0000 06 <index into the __riscv_d_ext_state struct:24> (non-fcsr)
2888 Following are the RISC-V D-extension registers:
2905 0x9030 0000 0001 00 <reg:5> <sel:3> (64-bit)
2914 --------------------
2919 :Parameters: struct kvm_one_reg (in and out)
2926 EINVAL invalid register ID, or no such register or used with VMs in
2932 code being returned in a specific situation.)
2935 in a vcpu. The register to read is indicated by the "id" field of the
2936 kvm_one_reg struct passed in. On success, the register value can be found
2940 list in 4.68.
2944 ----------------------
2950 :Returns: 0 on success, -1 on error
2955 The host will set a flag in the pvclock structure that is checked from the
2961 load-link/store-conditional, or equivalent must be used. There are two cases
2968 -------------------
2973 :Parameters: struct kvm_msi (in)
2974 :Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error
2976 Directly inject a MSI message. Only valid with in-kernel irqchip that handles
2991 KVM_MSI_VALID_DEVID: devid contains a valid value. The per-VM
2998 BDF identifier in the lower 16 bits.
3002 address_hi bits 31-8 provide bits 31-8 of the destination id. Bits 7-0 of
3007 --------------------
3012 :Parameters: struct kvm_pit_config (in)
3013 :Returns: 0 on success, -1 on error
3015 Creates an in-kernel device model for the i8254 PIT. This call is only valid
3016 after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. The following
3028 PIT timer interrupts may use a per-VM kernel thread for injection. If it
3031 kvm-pit/<owner-process-pid>
3040 -----------------
3046 :Returns: 0 on success, -1 on error
3048 Retrieves the state of the in-kernel PIT model. Only valid after
3049 KVM_CREATE_PIT2. The state is returned in the following structure::
3059 /* disable PIT in HPET legacy mode */
3068 -----------------
3073 :Parameters: struct kvm_pit_state2 (in)
3074 :Returns: 0 on success, -1 on error
3076 Sets the state of the in-kernel PIT model. Only valid after KVM_CREATE_PIT2.
3083 --------------------------
3089 :Returns: 0 on success, -1 on error
3093 This can in turn be used by userspace to generate the appropriate
3094 device-tree properties for the guest operating system.
3108 - KVM_PPC_PAGE_SIZES_REAL:
3110 store page sizes. When not set, any page size in the list can
3113 - KVM_PPC_1T_SEGMENTS
3114 The emulated MMU supports 1T segments in addition to the
3117 - KVM_PPC_NO_HASH
3124 page sizes for a segment in increasing order. Each entry is defined
3134 organized in increasing order, a lookup can stop when encountering
3137 The "slb_enc" field provides the encoding to use in the SLB for the
3138 page size. The bits are in positions such that the value can directly
3144 corresponding encoding in the hash PTE. Similarly, the array is
3150 __u32 pte_enc; /* Encoding in the HPTE (>>12) */
3158 --------------
3163 :Parameters: struct kvm_irqfd (in)
3164 :Returns: 0 on success, -1 on error
3174 With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
3175 mechanism allowing emulation of level-triggered, irqfd-based
3177 additional eventfd in the kvm_irqfd.resamplefd field. When operating
3178 in resample mode, posting of an interrupt through kvm_irq.fd asserts
3179 the specified gsi in the irqchip. When the irqchip is resampled, such
3180 as from an EOI, the gsi is de-asserted and the user is notified via
3181 kvm_irqfd.resamplefd. It is the user's responsibility to re-queue
3189 - in case no routing entry is associated to this gsi, injection fails
3190 - in case the gsi is associated to an irqchip routing entry,
3192 - in case the gsi is associated to an MSI routing entry, the MSI
3194 to GICv3 ITS in-kernel emulation).
3197 --------------------------
3202 :Parameters: Pointer to u32 containing hash table order (in/out)
3203 :Returns: 0 on success, -1 on error
3215 The parameter is a pointer to a 32-bit unsigned integer variable
3222 default-sized hash table (16 MB).
3229 all HPTEs). In either case, if the guest is using the virtualized
3230 real-mode area (VRMA) facility, the kernel will re-create the VRMA
3234 -----------------------
3239 :Parameters: struct kvm_s390_interrupt (in)
3240 :Returns: 0 on success, -1 on error
3256 - sigp stop; optional flags in parm
3258 - program check; code in parm
3260 - sigp set prefix; prefix address in parm
3262 - restart
3264 - clock comparator interrupt
3266 - CPU timer interrupt
3268 - virtio external interrupt; external interrupt
3269 parameters in parm and parm64
3271 - sclp external interrupt; sclp parameter in parm
3273 - sigp emergency; source cpu in parm
3275 - sigp external call; source cpu in parm
3277 - compound value to indicate an
3278 I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
3279 I/O interruption parameters in parm (subchannel) and parm64 (intparm,
3282 - machine check interrupt; cr 14 bits in parm, machine check interrupt
3283 code in parm64 (note that machine checks needing further payload are not
3289 ------------------------
3294 :Parameters: Pointer to struct kvm_get_htab_fd (in)
3295 :Returns: file descriptor number (>= 0) on success, -1 on error
3298 entries in the guest's hashed page table (HPT), or to write entries to
3300 KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
3315 The 'start_index' field gives the index in the HPT of the entry at
3330 in the stream. The header format is::
3338 Writes to the fd create HPT entries starting at the index given in the
3344 ----------------------
3349 :Parameters: struct kvm_create_device (in/out)
3350 :Returns: 0 on success, -1 on error
3363 Creates an emulated device in the kernel. The file descriptor returned
3364 in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
3368 in the current vm).
3377 __u32 type; /* in: KVM_DEV_TYPE_xxx */
3379 __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */
3383 --------------------------------------------
3391 :Returns: 0 on success, -1 on error
3399 (e.g. read-only attribute, or attribute that only makes
3400 sense when the device is in a different state)
3406 semantics are device-specific. See individual device documentation in
3414 __u32 group; /* device-defined */
3415 __u64 attr; /* group-defined */
3420 ------------------------
3427 :Returns: 0 on success, -1 on error
3438 indicate that the attribute can be read or written in the device's
3444 ----------------------
3449 :Parameters: struct kvm_vcpu_init (in)
3450 :Returns: 0 on success; -1 on error
3465 - Processor state:
3470 - General Purpose registers, including PC and SP: set to 0
3471 - FPSIMD/NEON registers: set to 0
3472 - SVE registers: set to 0
3473 - System registers: Reset to their architecturally defined
3486 - KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
3489 - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32-bit mode.
3491 - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
3494 - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
3497 - KVM_ARM_VCPU_PTRAUTH_ADDRESS: Enables Address Pointer authentication
3505 - KVM_ARM_VCPU_PTRAUTH_GENERIC: Enables Generic Pointer authentication
3513 - KVM_ARM_VCPU_SVE: Enables SVE for the CPU (arm64 only).
3519 - KVM_REG_ARM64_SVE_VLS may be read using KVM_GET_ONE_REG: the
3520 initial value of this pseudo-register indicates the best set of
3525 - KVM_RUN and KVM_GET_REG_LIST are not available;
3527 - KVM_GET_ONE_REG and KVM_SET_ONE_REG cannot be used to access
3532 - KVM_REG_ARM64_SVE_VLS may optionally be written using
3538 - the KVM_REG_ARM64_SVE_VLS pseudo-register is immutable, and can
3542 -----------------------------
3548 :Returns: 0 on success; -1 on error
3561 kvm_vcpu_init->features bitmap returned will have feature bits set if
3566 of struct kvm_vcpu_init for KVM_ARM_VCPU_INIT ioctl which will result in
3571 ---------------------
3576 :Parameters: struct kvm_reg_list (in/out)
3577 :Returns: 0 on success; -1 on error
3582 E2BIG the reg index list is too big to fit in the array specified by
3589 __u64 n; /* number of registers in reg[] */
3598 -----------------------------------------
3603 :Parameters: struct kvm_arm_device_address (in)
3604 :Returns: 0 on success, -1 on error
3623 Specify a device address in the guest's physical address space where guests
3634 arm64 currently only require this when using the in-kernel GIC
3640 base addresses will return -EEXIST.
3647 ------------------------------
3653 :Returns: 0 on success, -1 on error
3656 service in order to allow it to be handled in the kernel. The
3658 of a service that has a kernel-side implementation. If the token
3659 value is non-zero, it will be associated with that service, and
3667 ------------------------
3672 :Parameters: struct kvm_guest_debug (in)
3673 :Returns: 0 on success; -1 on error
3688 - KVM_GUESTDBG_ENABLE: guest debugging is enabled
3689 - KVM_GUESTDBG_SINGLESTEP: the next run should single-step
3694 - KVM_GUESTDBG_USE_SW_BP: using software breakpoints [x86, arm64]
3695 - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390]
3696 - KVM_GUESTDBG_USE_HW: using hardware debug events [arm64]
3697 - KVM_GUESTDBG_INJECT_DB: inject DB type exception [x86]
3698 - KVM_GUESTDBG_INJECT_BP: inject BP type exception [x86]
3699 - KVM_GUESTDBG_EXIT_PENDING: trigger an immediate guest exit [s390]
3700 - KVM_GUESTDBG_BLOCKIRQ: avoid injecting interrupts/NMI/SMI [x86]
3703 are enabled in memory so we need to ensure breakpoint exceptions are
3718 the single-step debug event (KVM_GUESTDBG_SINGLESTEP) is supported.
3721 supported KVM_GUESTDBG_* bits in the control field.
3728 ---------------------------
3733 :Parameters: struct kvm_cpuid2 (in/out)
3734 :Returns: 0 on success, -1 on error
3768 structure with the 'nent' field indicating the number of entries in
3769 the variable-size array 'entries'. If the number of entries is too low
3773 to the number of valid entries in the 'entries' array, which is then
3780 Features like x2apic, for example, may not be present in the host cpu
3781 but are exposed by kvm in KVM_GET_SUPPORTED_CPUID because they can be
3784 The fields in each entry are defined as follows:
3803 --------------------
3808 :Parameters: struct kvm_s390_mem_op (in)
3810 < 0 on generic error (e.g. -EFAULT or -ENOMEM),
3824 __u64 buf; /* buffer in userspace */
3837 The start address of the memory region has to be specified in the "gaddr"
3838 field, and the length of the region in the "size" field (which must not
3847 The type of operation is specified in the "op" field. Flags modifying
3848 their behavior can be set in the "flags" field. Undefined flag bits must
3868 Logical accesses are permitted for non-protected guests only.
3877 no actual access to the data in memory at the destination is performed.
3878 In this case, "buf" is unused and can be NULL.
3880 In case an access exception occurred during the access (or would occur
3881 in case of KVM_S390_MEMOP_F_CHECK_ONLY), the ioctl returns a positive
3886 translation-exception identifier (TEID) indicates suppression.
3889 protection is also in effect and may cause exceptions if accesses are
3895 after memory has been modified. In this case, if the exception is injected,
3909 Absolute accesses are permitted for non-protected guests only.
3926 parameter. "size" must be a power of two up to and including 16.
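The "size" constraint stated above (a power of two, up to and including 16) can be checked before issuing the ioctl. The following is a minimal sketch, not the kernel's own validation; the helper names ``memop_size_is_valid`` and ``memop_size_selfcheck`` are hypothetical and do not exist in the KVM headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the KVM uAPI): returns true if "size"
 * is a power of two up to and including 16, i.e. one of 1, 2, 4, 8, 16.
 */
static inline bool memop_size_is_valid(uint32_t size)
{
    /* A power of two has exactly one bit set, so size & (size - 1) == 0. */
    return size != 0 && size <= 16 && (size & (size - 1)) == 0;
}

/* Small self-check that can be called from a test or at start-up. */
static inline void memop_size_selfcheck(void)
{
    assert(memop_size_is_valid(16));
    assert(!memop_size_is_valid(12));
}
```

Rejecting a bad size in userspace avoids a round trip into the kernel just to receive an error.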
3948 -----------------------
3974 will cause the ioctl to return -EINVAL.
3980 -----------------------
3998 will cause the ioctl to return -EINVAL.
4001 storage keys. Each byte in the buffer will be set as the storage key for a
4004 Note: If any architecturally invalid key value is found in the given data then
4005 the ioctl will return -EINVAL.
4008 -----------------
4013 :Parameters: struct kvm_s390_irq (in)
4014 :Returns: 0 on success, -1 on error
4055 - KVM_S390_SIGP_STOP - sigp stop; parameter in .stop
4056 - KVM_S390_PROGRAM_INT - program check; parameters in .pgm
4057 - KVM_S390_SIGP_SET_PREFIX - sigp set prefix; parameters in .prefix
4058 - KVM_S390_RESTART - restart; no parameters
4059 - KVM_S390_INT_CLOCK_COMP - clock comparator interrupt; no parameters
4060 - KVM_S390_INT_CPU_TIMER - CPU timer interrupt; no parameters
4061 - KVM_S390_INT_EMERGENCY - sigp emergency; parameters in .emerg
4062 - KVM_S390_INT_EXTERNAL_CALL - sigp external call; parameters in .extcall
4063 - KVM_S390_MCHK - machine check interrupt; parameters in .mchk
4068 ---------------------------
4075 -EINVAL if buffer size is 0,
4076 -ENOBUFS if buffer size is too small to fit all pending interrupts,
4077 -EFAULT if the buffer address was invalid
4080 pending interrupts in a single buffer. Use cases include migration
4091 Userspace passes in the above struct and for each pending interrupt a
4095 the kernel never checked for flags == 0 and QEMU never pre-zeroed flags and
4096 reserved, these fields can not be used in the future without breaking
4099 If -ENOBUFS is returned the buffer provided was too small and userspace
4103 ---------------------------
4108 :Parameters: struct kvm_s390_irq_state (in)
4110 -EFAULT if the buffer address was invalid,
4111 -EINVAL for an invalid buffer length (see below),
4112 -EBUSY if there were already interrupts pending,
4116 This ioctl allows userspace to set the complete state of all cpu-local
4138 which is the maximum number of possibly pending cpu-local interrupts.
4141 ------------
4147 :Returns: 0 on success, -1 on error
4152 ----------------------------
4166 __u32 nmsrs; /* number of msrs in bitmap */
4168 __u8 *bitmap; /* a 1 bit allows the operations in flags, 0 denies */
4183 Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
4190 Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
4222 the access in accordance with the vCPU model. Note, KVM may still ultimately
4226 By default, KVM operates in KVM_MSR_FILTER_DEFAULT_ALLOW mode with no MSR range
4230 filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
4241 part of VM-Enter/VM-Exit emulation.
4244 of VM-Enter/VM-Exit emulation. If an MSR access is denied on VM-Enter, KVM
4245 synthesizes a consistency check VM-Exit(EXIT_REASON_MSR_LOAD_FAIL). If an
4246 MSR access is denied on VM-Exit, KVM synthesizes a VM-Abort. In short, KVM
4248 the VM-Enter/VM-Exit MSR list. It is platform owner's responsibility to
4256 filter, e.g. MSRs with identical settings in both the old and new filter will
4262 result in KVM injecting a #GP instead of exiting to userspace.
4265 ----------------------------
4270 :Parameters: struct kvm_create_spapr_tce_64 (in)
4274 windows, described in 4.62 KVM_CREATE_SPAPR_TCE
4276 This capability uses extended struct in ioctl interface::
4283 __u64 offset; /* in pages */
4284 __u64 size; /* in pages */
4298 -------------------------
4303 :Parameters: struct kvm_reinject_control (in)
4305 -EFAULT if struct kvm_reinject_control cannot be read,
4306 -ENXIO if KVM_CREATE_PIT or KVM_CREATE_PIT2 didn't succeed earlier.
4325 ------------------------------
4330 :Parameters: struct kvm_ppc_mmuv3_cfg (in)
4332 -EFAULT if struct kvm_ppc_mmuv3_cfg cannot be read,
4333 -EINVAL if the configuration is invalid
4346 There are two bits that can be set in flags; KVM_PPC_MMUV3_RADIX and
4354 process table, which is in the guest's space. This field is formatted
4355 as the second doubleword of the partition table entry, as defined in
4356 the Power ISA V3.00, Book III section 5.7.6.1.
4359 ---------------------------
4366 -EFAULT if struct kvm_ppc_rmmu_info cannot be written,
4367 -EINVAL if no useful information can be returned
4371 page sizes to put in the "AP" (actual page size) field for the tlbie
4386 radix page table, in terms of the log base 2 of the smallest page
4388 the PTE level up to the PGD level in that order. Any unused entries
4389 will have 0 in the page_shift field.
4392 encodings, encoded with the AP value in the top 3 bits and the log
4393 base 2 of the page size in the bottom 6 bits.
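The encoding described above (AP value in the top 3 bits, log base 2 of the page size in the bottom 6 bits) can be unpacked as sketched below. This assumes each encoding is a 32-bit word, so that "top 3 bits" means bits 31:29; the struct and function names are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one decoded entry (not a KVM uAPI type). */
struct ap_encoding {
    unsigned int ap;          /* value for the "AP" field */
    unsigned int page_shift;  /* log base 2 of the page size */
};

/* Assumes a 32-bit encoding word: AP in bits 31:29, shift in bits 5:0. */
static inline struct ap_encoding decode_ap_encoding(uint32_t enc)
{
    struct ap_encoding d = {
        .ap = enc >> 29,
        .page_shift = enc & 0x3f,
    };
    return d;
}

/* Inverse of decode_ap_encoding(), under the same layout assumption. */
static inline uint32_t encode_ap_encoding(unsigned int ap,
                                          unsigned int page_shift)
{
    assert(ap < 8 && page_shift < 64);
    return ((uint32_t)ap << 29) | page_shift;
}
```

For example, an entry built from AP value 5 and a page shift of 16 (64 KiB pages) decodes back to the same pair.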
4396 --------------------------------
4401 :Parameters: struct kvm_ppc_resize_hpt (in)
4405 -EFAULT if struct kvm_ppc_resize_hpt cannot be read,
4406 -EINVAL if the supplied shift or flags are invalid,
4407 -ENOMEM if unable to allocate the new HPT,
4428 requested in the parameters, discards the existing pending HPT and
4436 * If preparation of the pending HPT is still in progress, return an
4440 returns 0 (i.e. cancels any in-progress preparation).
4442 flags is reserved for future expansion; currently, setting any bits in
4443 flags will result in -EINVAL.
4450 -------------------------------
4455 :Parameters: struct kvm_ppc_resize_hpt (in)
4457 -EFAULT if struct kvm_ppc_resize_hpt cannot be read,
4458 -EINVAL if the supplied shift or flags are invalid,
4459 -ENXIO if there is no pending HPT, or the pending HPT doesn't
4461 -EBUSY if the pending HPT is not fully prepared,
4462 -ENOSPC if there was a hash collision when moving existing
4464 -EIO on other error conditions
4480 returned 0 with the same parameters. In other cases
4481 KVM_PPC_RESIZE_HPT_COMMIT will return an error (usually -ENXIO or
4482 -EBUSY, though others may be possible if the preparation was started,
4486 placed itself in a quiescent state where no vcpu will make MMU enabled
4495 -----------------------------------
4501 :Returns: 0 on success, -1 on error
4508 -----------------------
4513 :Parameters: u64 mcg_cap (in)
4515 -EFAULT if u64 mcg_cap cannot be read,
4516 -EINVAL if the requested number of banks is invalid,
4517 -EINVAL if requested MCE capability is not supported.
4522 supported number of error-reporting banks can be retrieved when
4527 ---------------------
4532 :Parameters: struct kvm_x86_mce (in)
4534 -EFAULT if struct kvm_x86_mce cannot be read,
4535 -EINVAL if the bank number is invalid,
4536 -EINVAL if VAL bit is not set in status field.
4553 MCG_STATUS register reports that an MCE is in progress, KVM
4557 store it in the corresponding bank (provided this bank is
4561 ----------------------------
4566 :Parameters: struct kvm_s390_cmma_log (in, out)
4582 architecture. It is meant to be used in two scenarios:
4584 - During live migration to save the CMMA values. Live migration needs
4586 - To non-destructively peek at the CMMA values, with the flag
4591 member in the kvm_s390_cmma_log struct. The values in the input struct are
4612 count is the length of the buffer in bytes,
4617 KVM_S390_SKEYS_MAX. KVM_S390_SKEYS_MAX is re-used for consistency with
4620 The result is written in the buffer pointed to by the field values, and
4631 count will indicate the number of bytes actually written in the buffer.
4634 are then not copied in the buffer). Since a CMMA migration block needs
4644 the existing storage attributes are read even when not in migration
4652 In both cases:
4662 ----------------------------
4667 :Parameters: struct kvm_s390_cmma_log (in)
4691 count indicates how many values are to be considered in the buffer,
4699 values points to the buffer in userspace where to store the values.
4701 This ioctl can fail with -ENOMEM if not enough memory can be allocated to
4702 complete the task, with -ENXIO if CMMA is not enabled, with -EINVAL if
4704 if the flags field was not 0, with -EFAULT if the userspace address is
4710 --------------------------
4717 -EFAULT if struct kvm_ppc_cpu_char cannot be written
4722 CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754). The information is
4723 returned in struct kvm_ppc_cpu_char, which looks like this::
4728 __u64 character_mask; /* valid bits in character */
4729 __u64 behaviour_mask; /* valid bits in behaviour */
4733 indicate which bits of character and behaviour have been filled in by
4734 the kernel. If the set of defined bits is extended in future then
4739 with preventing inadvertent information disclosure - specifically,
4740 whether there is an instruction to flash-invalidate the L1 data cache
4757 ---------------------------
4762 :Parameters: an opaque platform specific structure (in/out)
4763 :Returns: 0 on success; -1 on error
4766 for issuing platform-specific memory encryption commands to manage those
4770 (SEV) commands on AMD Processors. The SEV commands are defined in
4771 Documentation/virt/kvm/x86/amd-memory-encryption.rst.
4774 -----------------------------------
4779 :Parameters: struct kvm_enc_region (in)
4780 :Returns: 0 on success; -1 on error
4785 It is used in the SEV-enabled guest. When encryption is enabled, a guest
4789 moving ciphertext of those pages will not result in plaintext being
4798 -------------------------------------
4803 :Parameters: struct kvm_enc_region (in)
4804 :Returns: 0 on success; -1 on error
4810 ------------------------
4815 :Parameters: struct kvm_hyperv_eventfd (in)
4818 the specified Hyper-V connection id through the SIGNAL_EVENT hypercall, without
4819 causing a user exit. SIGNAL_EVENT hypercall with non-zero event flag number
4820 (bits 24-31) still triggers a KVM_EXIT_HYPERV_HCALL user exit.
4840 -EINVAL if conn_id or flags is outside the allowed range,
4841 -ENOENT on deassign if the conn_id isn't registered,
4842 -EEXIST on assign if the conn_id is already registered
4845 --------------------------
4850 :Parameters: struct kvm_nested_state (in/out)
4851 :Returns: 0 on success, -1 on error
4919 --------------------------
4924 :Parameters: struct kvm_nested_state (in)
4925 :Returns: 0 on success, -1 on error
4931 -------------------------------------
4946 do not exit to userspace and their value is recorded in a ring buffer
4960 ------------------------------------
4965 :Parameters: struct kvm_clear_dirty_log (in)
4966 :Returns: 0 on success, -1 on error
4981 The ioctl clears the dirty status of pages in a memory slot, according to
4982 the bitmap that is passed in struct kvm_clear_dirty_log's dirty_bitmap
4983 field. Bit 0 of the bitmap corresponds to page "first_page" in the
4984 memory slot, and num_pages is the size in bits of the input bitmap.
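Because bit 0 of the bitmap corresponds to page "first_page", a page number within the slot has to be rebased before its bit is set. The sketch below shows that arithmetic only. It assumes the bitmap is stored as 64-bit words with bit i in word i / 64 at position i % 64 (the layout of an ``unsigned long`` bitmap on a 64-bit little-endian host); the helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: mark slot-relative page "page" in a bitmap whose
 * bit 0 stands for "first_page" and which is "num_pages" bits long.
 * Returns 0 on success, -1 if the page falls outside the bitmap.
 */
static inline int mark_page_clean(uint64_t *bitmap, uint64_t first_page,
                                  uint32_t num_pages, uint64_t page)
{
    assert(bitmap != NULL);

    if (page < first_page || page - first_page >= num_pages)
        return -1;

    uint64_t bit = page - first_page;           /* rebase to bit 0 */
    bitmap[bit / 64] |= UINT64_C(1) << (bit % 64);
    return 0;
}
```

With first_page = 64 and num_pages = 128, page 64 maps to bit 0 and page 130 maps to bit 66, while pages 63 and 192 are rejected.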
4987 bit that is set in the input bitmap, the corresponding page is marked "clean"
4988 in KVM's dirty bitmap, and dirty tracking is re-enabled for that page
4989 (for example via write-protection, or by clearing the dirty bit in
4992 If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of the slot field specify
5002 --------------------------------
5007 :Parameters: struct kvm_cpuid2 (in/out)
5008 :Returns: 0 on success, -1 on error
5029 This ioctl returns x86 cpuid feature leaves related to Hyper-V emulation in
5031 cpuid information presented to guests consuming Hyper-V enlightenments (e.g.
5032 Windows or Hyper-V guests).
5034 CPUID feature leaves returned by this ioctl are defined by Hyper-V Top Level
5041 - HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
5042 - HYPERV_CPUID_INTERFACE
5043 - HYPERV_CPUID_VERSION
5044 - HYPERV_CPUID_FEATURES
5045 - HYPERV_CPUID_ENLIGHTMENT_INFO
5046 - HYPERV_CPUID_IMPLEMENT_LIMITS
5047 - HYPERV_CPUID_NESTED_FEATURES
5048 - HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS
5049 - HYPERV_CPUID_SYNDBG_INTERFACE
5050 - HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES
5053 with the 'nent' field indicating the number of entries in the variable-size
5054 array 'entries'. If the number of entries is too low to describe all Hyper-V
5056 to the number of Hyper-V feature leaves, the 'nent' field is adjusted to the
5057 number of valid entries in the 'entries' array, which is then filled.
5059 'index' and 'flags' fields in 'struct kvm_cpuid_entry2' are currently reserved,
5066 - HYPERV_CPUID_NESTED_FEATURES leaf and HV_X64_ENLIGHTENED_VMCS_RECOMMENDED
5069 - HV_STIMER_DIRECT_MODE_AVAILABLE bit is only exposed with in-kernel LAPIC.
5073 ---------------------------
5077 :Parameters: int feature (in)
5078 :Returns: 0 on success, -1 on error
5096 means of a successful KVM_ARM_VCPU_INIT call with the appropriate flag set in
5104 that should be performed and how to do it are feature-dependent.
5108 -EPERM unless the feature has already been finalized by means of a
5115 ------------------------------
5120 :Parameters: struct kvm_pmu_event_filter (in)
5121 :Returns: 0 on success, -1 on error
5127 EINVAL args[0] contains invalid data in the filter or filter events
5157 In this mode each event will contain an event select + unit mask.
5166 In this mode each filter event will contain an event select, mask, match, and
5174 ---- -----------
5183 When the guest attempts to program the PMU, these steps are followed in
5203 When setting a new pmu event filter, -EINVAL will be returned if any of the
5204 unused fields are set or if any of the high bits (35:32) in the event
5215 Specifically, KVM follows the following pseudo-code when determining whether to
5216 allow the guest FixCtr[i] to count its pre-defined fixed event::
5231 ---------------------
5255 ---------------------------
5264 the cpu reset definition in the POP (Principles Of Operation).
5267 ----------------------------
5276 the initial cpu reset definition in the POP. However, the cpu is not
5280 --------------------------
5289 the clear cpu reset definition in the POP. However, the cpu is not put
5294 -------------------------
5339 All registered VCPUs are converted back to non-protected ones. If a
5342 KVM_PV_ASYNC_CLEANUP_PERFORM, it will be torn down in this call
5346 Pass the image header from VM memory to the Ultravisor in
5363 valid fields if more response fields are added in the future.
5389 hosts. These values are likely also exported as files in the sysfs
5391 programs in this API.
5440 not succeed all other subcommands will fail with -EINVAL. This
5441 subcommand will return -EINVAL if a dump process has not yet been
5445 allowed` PCF bit 34 in the SE header to allow dumping.
5472 resume execution immediately as non-protected. There can be at most
5476 fail. In that case, the userspace process should issue a normal
5497 --------------------------
5543 Sets the ABI mode of the VM to 32-bit or 64-bit (long mode). This
5549 32 vCPUs in the shared_info page, KVM does not automatically do so
5553 in the shared_info page. This is because KVM may not be aware of
5570 If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
5573 will always be fixed in the VMM regardless of where it is mapped
5574 in guest physical address space. This attribute should be used in
5577 re-mapped in guest physical address space.
5583 This is the HVM-wide vector injected directly by the hypervisor
5595 by setting KVM_XEN_EVTCHN_UPDATE in a subsequent call, but other
5597 removed by using KVM_XEN_EVTCHN_DEASSIGN in the flags field. Passing
5598 KVM_XEN_EVTCHN_RESET in the flags field removes all interception of
5605 the 32-bit version code returned to the guest when it invokes the
5620 --------------------------
5633 ---------------------------
5678 If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
5682 in the shared_info page. In this case it is safe to assume the
5685 regardless of where it is mapped in guest physical address space
5718 other four times. The state field must be set to -1, or to a valid
5726 vCPU ID of the given vCPU, to allow timer-related VCPU operations to
5739 per-vCPU local APIC upcall vector, configured by a Xen guest with
5741 used by Windows guests, and is distinct from the HVM-wide upcall
5747 ---------------------------
5762 ---------------------------
5768 :Returns: number of bytes copied, < 0 on error (-EINVAL for incorrect
5769 arguments, -EFAULT if memory cannot be accessed).
5783 ``length`` must not be bigger than 2^31 - PAGE_SIZE bytes. The ``addr``
5790 (granules in MTE are 16 bytes long). Each byte contains a single tag
5800 --------------------
5806 :Returns: 0 on success, -1 on error
5814 /* out (KVM_GET_SREGS2) / in (KVM_SET_SREGS2) */
5833 --------------------
5838 :Parameters: struct kvm_sregs2 (in)
5839 :Returns: 0 on success, -1 on error
5846 ----------------------
5861 The returned file descriptor can be used to read VM/vCPU statistics data in
5862 binary format. The data in the file descriptor consists of four blocks
5865 +-------------+
5867 +-------------+
5869 +-------------+
5871 +-------------+
5873 +-------------+
5876 not guaranteed that the four blocks are adjacent or in the above order;
5877 the offsets of the id, descriptors and data blocks are found in the
5878 header. However, all four blocks are aligned to 64 bit offsets in the
5885 All data is in system endianness.
5900 The ``name_size`` field is the size (in bytes) of the statistics name string
5901 (including trailing '\0') which is contained in the "id string" block and
5904 The ``num_desc`` field is the number of descriptors that are included in the
5905 descriptor block. (The actual number of values in the data block may be
5919 trailing ``'\0'``, is indicated by the ``name_size`` field in the header.
5963 Bits 0-3 of ``flags`` encode the type:
5967 Most of the counters used in KVM are of this type.
5978 of items in a hash table bucket, the longest time waited and so on.
5985 is [``hist_param``*(N-1), ``hist_param``*N), while the range of the last
5986 bucket is [``hist_param``*(``size``-1), +INF). (+INF means positive infinity
5991 [0, 1), while the range of the last bucket is [pow(2, ``size``-2), +INF).
5993 [pow(2, N-2), pow(2, N-1)).
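The bucket ranges given above can be turned around to find which bucket a sample belongs to. The following is a sketch under the stated ranges only, using 1-based bucket numbers N as in the text; the function names are hypothetical and this is not how KVM itself records samples.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Linear histogram: bucket N covers [hist_param*(N-1), hist_param*N),
 * and the last bucket (N == size) extends to +INF.
 */
static inline uint32_t linear_hist_bucket(uint64_t value, uint64_t hist_param,
                                          uint32_t size)
{
    assert(hist_param > 0 && size > 0);

    uint64_t n = value / hist_param + 1;
    return n > size ? size : (uint32_t)n;
}

/*
 * Logarithmic histogram: bucket 1 covers [0, 1), bucket N (1 < N < size)
 * covers [pow(2, N-2), pow(2, N-1)), and the last bucket extends to +INF.
 */
static inline uint32_t log_hist_bucket(uint64_t value, uint32_t size)
{
    assert(size > 0);

    uint32_t n = 1;            /* a value of 0 stays in bucket 1 */
    while (value != 0) {       /* each significant bit moves up one bucket */
        value >>= 1;
        n++;
    }
    return n > size ? size : n;
}
```

For a logarithmic histogram with ``size`` equal to 5, the values 2 and 3 both fall in bucket 3, whose range is [2, 4), and every value of 8 or more falls in the last bucket.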
5995 Bits 4-7 of ``flags`` encode the unit:
6001 It indicates that the statistics data is used to measure memory size, in the
6003 determined by the ``exponent`` field in the descriptor.
6014 Note that, in the case of histograms, the unit applies to the bucket
6015 ranges, while the bucket value indicates how many samples fell in the
6018 Bits 8-11 of ``flags``, together with ``exponent``, encode the scale of the
6022 The scale is based on power of 10. It is used for measurement of time and
6023 CPU clock cycles. For example, an exponent of -9 can be used with
6026 The scale is based on power of 2. It is used for measurement of memory size.
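The three bit fields of ``flags`` described in this section (bits 0-3 for the type, bits 4-7 for the unit, bits 8-11 for the scale base) can be separated with shifts and masks. This is a minimal sketch; the struct and function names are hypothetical, and a real reader would compare the extracted values against the constants defined in the KVM headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields (not a KVM uAPI type). */
struct stats_flags {
    unsigned int type;  /* bits 0-3 */
    unsigned int unit;  /* bits 4-7 */
    unsigned int base;  /* bits 8-11, interpreted together with exponent */
};

static inline struct stats_flags decode_stats_flags(uint32_t flags)
{
    struct stats_flags f = {
        .type = flags & 0xf,
        .unit = (flags >> 4) & 0xf,
        .base = (flags >> 8) & 0xf,
    };
    return f;
}

/* Small self-check that can be called from a test or at start-up. */
static inline void stats_flags_selfcheck(void)
{
    assert(decode_stats_flags(0x123).type == 3);
    assert(decode_stats_flags(0x123).unit == 2);
    assert(decode_stats_flags(0x123).base == 1);
}
```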
6039 bucket in the unit expressed by bits 4-11 of ``flags`` together with ``exponent``.
6043 the trailing ``'\0'``, is indicated by ``name_size`` in the header.
6045 The Stats Data block contains an array of 64-bit values in the same order
6046 as the descriptors in the Descriptors block.
6049 --------------------
6055 :Returns: 0 on success, -1 on error
6070 enabled with ``arch_prctl()``, but this may change in the future.
6072 The offsets of the state save areas in struct kvm_xsave follow the contents
6076 -----------------------------
6096 -----------------------------
6105 for vcpus. It re-uses the kvm_s390_pv_dmp struct and hence also shares
6121 ----------------------
6126 :Parameters: struct kvm_s390_zpci_op (in)
6129 Used to manage hardware-assisted virtualization features for zPCI devices.
6134 /* in */
6153 The type of operation is specified in the "op" field.
6168 --------------------------------
6173 :Parameters: struct kvm_arm_counter_offset (in)
6176 This capability indicates that userspace is able to apply a single VM-wide
6196 Any value other than 0 for the "reserved" field may result in an error
6197 (-EINVAL) being returned. This ioctl can also return -EBUSY if any vcpu
6200 Note that using this ioctl results in KVM ignoring subsequent userspace
6208 -------------------------------------------
6213 :Parameters: struct reg_mask_range (in/out)
6247 op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, op2=={0-7}.
6256 ---------------------------------
6261 :Parameters: struct kvm_userspace_memory_region2 (in)
6262 :Returns: 0 on success, -1 on error
6267 in flags to have KVM bind the memory region to a given guest_memfd range of
6292 on-demand.
6303 Returns -EINVAL if the VM has the KVM_VM_S390_UCONTROL flag set.
6304 Returns -EINVAL if called on a protected VM.
6307 -------------------------------
6312 :Parameters: struct kvm_memory_attributes (in)
6343 ----------------------------
6348 :Parameters: struct kvm_create_guest_memfd (in)
6353 via memfd_create(), e.g. guest_memfd files live in RAM, have volatile storage,
6376 and more specifically via the guest_memfd and guest_memfd_offset fields in
6386 ---------------------------
6391 :Parameters: struct kvm_pre_fault_memory (in/out)
6410 /* in/out */
6413 /* in */
6418 KVM_PRE_FAULT_MEMORY populates KVM's stage-2 page tables used to map memory
6420 stage-2 read page fault, e.g. faults in memory as needed, but doesn't break
6421 CoW. However, KVM does not mark any newly created stage-2 PTE as Accessed.
6423 In the case of confidential VM types where there is an initial set up of
6429 In some cases, multiple vCPUs might share the page tables. In this
6430 case, the ioctl can be called in parallel.
6451 execution by changing fields in kvm_run prior to calling the KVM_RUN
6458 /* in */
6462 interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.
6468 This field is polled once when KVM_RUN starts; if non-zero, KVM_RUN
6469 exits immediately, returning -EINTR. In the common scenario where a
6473 a signal handler that sets run->immediate_exit to a non-zero value.
6499 The value of the current interrupt flag. Only valid if in-kernel
6506 More architecture-specific flags detailing state of the VCPU that may
6509 /* x86, set if the VCPU is in system management mode */
6511 /* x86, set if bus lock detected in VM */
6521 /* in (pre_kvm_run), out (post_kvm_run) */
6524 The value of the cr8 register. Only valid if in-kernel local APIC is
6531 The value of the APIC BASE msr. Only valid if in-kernel local
6543 reasons. Further architecture-specific information is available in
6555 to unknown reasons. Further architecture-specific information is
6556 available in hardware_entry_failure_reason.
6608 executed a memory-mapped I/O instruction which could not be satisfied
6612 The 'data' member contains, in its first 'len' bytes, the value as it would
6621 has re-entered the kernel with KVM_RUN. The kernel side will first finish
6624 The pending state of the operation is not preserved in state which is
6626 completed before performing a live migration. Userspace can re-enter the
6649 ----------
6660 - ``KVM_HYPERCALL_EXIT_SMC``: Indicates that the guest used the SMC
6664 - ``KVM_HYPERCALL_EXIT_16BIT``: Indicates that the guest used a 16bit
6720 in the cpu's lowcore are presented here as defined by the z Architecture
6721 Principles of Operation Book in the Chapter for Dynamic Address Translation
6733 Deprecated - was used for 440 KVM.
6748 in this struct.
6759 This is used on 64-bit PowerPC when emulating a pSeries partition,
6760 e.g. with the 'pseries' machine type in qemu. It occurs when the
6763 the arguments (from the guest R4 - R12). Userspace should put the
6764 return code in 'ret' and any extra returned values in args[].
6765 The possible hypercalls are defined in the Power Architecture Platform
6766 Requirements (PAPR) document available from www.power.org (free
6800 In case the interrupt controller lives in user space, we need to do
6824 a system-level event using some architecture specific mechanism (hypercall
6825 or some special instruction). In case of ARM64, this is triggered using
6828 The 'type' field describes the system-level event type.
6831 - KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
6835 - KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
6837 to schedule the reset to occur in the future and may call KVM_RUN again.
6838 - KVM_SYSTEM_EVENT_CRASH -- the guest crash occurred and the guest
6842 - KVM_SYSTEM_EVENT_SEV_TERM -- an AMD SEV guest requested termination.
6843 The guest physical address of the guest's GHCB is stored in `data[0]`.
6844 - KVM_SYSTEM_EVENT_WAKEUP -- the exiting vCPU is in a suspended state and
6847 - KVM_SYSTEM_EVENT_SUSPEND -- the guest has requested a suspension of
6851 architecture specific information for the system-level event. Only
6854 - for arm64, data[0] is set to KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2 if
6858 - for RISC-V, data[0] is set to the value of the second argument of the
6861 Previous versions of Linux defined a `flags` member in this struct. The
6866 --------------
6876 the call parameters are left in-place in the vCPU registers.
6881 - Honor the guest request to suspend the VM. Userspace can request
6882 in-kernel emulation of suspension by setting the calling vCPU's
6888 - Deny the guest request to suspend the VM. See ARM DEN0022D.b 5.19.2
6898 Indicates that the VCPU's in-kernel local APIC received an EOI for a
6899 level-triggered IOAPIC interrupt. This exit only triggers when the
6900 IOAPIC is implemented in userspace (i.e. KVM_CAP_SPLIT_IRQCHIP is enabled);
6941 related to Hyper-V emulation.
6945 - KVM_EXIT_HYPERV_SYNIC -- synchronously notify user-space about
6947 Hyper-V SynIC state change. Notification is used to remap SynIC
6949 in userspace.
6951 - KVM_EXIT_HYPERV_SYNDBG -- synchronously notify user-space about
6953 Hyper-V Synthetic debugger state change. Notification is used to either update
6955 in send_page or recv a buffer to recv_page).
6965 Used on arm64 systems. If a guest accesses memory not in a memslot,
6969 the instruction from the VM is overly complicated to live in the kernel.
6972 the VM. KVM assumed that if the guest accessed non-memslot memory, it was
6976 meaningful warning message and an external abort in the guest, if the access
6982 the ESR_EL2 in the esr_iss field, and the faulting IPA in the fault_ipa field.
6985 executing the guest, or it can decide to suspend, dump, or restart the guest.
6993 Instead, a data abort exception is directly injected in the guest.
7002 __u8 error; /* user -> kernel */
7004 __u32 reason; /* kernel -> user */
7005 __u32 index; /* kernel -> user */
7006 __u64 data; /* kernel <-> user */
7029 If the RDMSR request was unsuccessful, userspace indicates that with a "1" in
7064 - KVM_EXIT_XEN_HCALL -- synchronously notify user-space about Xen hypercall.
7079 done a SBI call which is not handled by KVM RISC-V kernel module. The details
7080 of the SBI call are available in 'riscv_sbi' member of kvm_run structure. The
7085 values of SBI call before resuming the VCPU. For more details on the RISC-V SBI
7086 spec, refer to https://github.com/riscv/riscv-sbi-doc.
7099 could not be resolved by KVM. The 'gpa' and 'size' (in bytes) describe the
7103 - KVM_MEMORY_EXIT_FLAG_PRIVATE - When set, indicates the memory fault occurred
7107 Note! KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that it
7108 accompanies a return code of '-1', not '0'! errno will always be set to EFAULT
7121 enabled, a VM exit is generated if no event window occurs in VM non-root mode
7129 - KVM_NOTIFY_CONTEXT_INVALID -- the VM context is corrupted and not valid
7130 in VMCS. Resuming the target VM would produce an unknown result.
7162 values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
7180 whether this is a per-vcpu or per-vm capability.
7191 -------------------
7196 :Returns: 0 on success; -1 on error
7200 were invented by Mac-on-Linux to have a standardized communication mechanism
7207 --------------------
7212 :Returns: 0 on success; -1 on error
7218 runs in "hypervisor" privilege mode with a few missing features.
7220 In addition to the above, it changes the semantics of SDR1. In this mode, the
7228 ------------------
7233 :Returns: 0 on success; -1 on error
7246 addresses of mmu-type-specific data structures. The "array_len" field is a
7247 safety mechanism, and should be set to the size in bytes of the memory that
7252 contents are undefined, and any modification by userspace results in
7262 - The "params" field is of type "struct kvm_book3e_206_tlb_params".
7263 - The "array" field points to an array of type "struct
7265 - The array consists of all entries in the first TLB, followed by all
7266 entries in the second TLB.
7267 - Within a TLB, entries are ordered first by increasing set number. Within a
7269 - The hash for determining set number in TLB0 is: (MAS2 >> 12) & (num_sets - 1)
7271 - The tsize field of mas1 shall be set to 4K on TLB0, even though the
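The TLB0 set-number hash given in the list above, (MAS2 >> 12) & (num_sets - 1), is shown here as a small helper. The sketch assumes num_sets is a power of two, which the mask form of the formula implies; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: set number of an entry in TLB0, computed as
 * (MAS2 >> 12) & (num_sets - 1). Assumes num_sets is a power of two.
 */
static inline uint32_t tlb0_set_number(uint64_t mas2, uint32_t num_sets)
{
    assert(num_sets != 0 && (num_sets & (num_sets - 1)) == 0);

    return (uint32_t)((mas2 >> 12) & (num_sets - 1));
}
```

For instance, with 128 sets a MAS2 value of 0x12345000 selects set 0x45, because only the low seven bits of the shifted value survive the mask.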
7275 ----------------------------
7280 :Returns: 0 on success; -1 on error
7285 handled in-kernel, while the other I/O instructions are passed to userspace.
7290 Note that even though this capability is enabled per-vcpu, the complete
7294 -------------------
7299 :Returns: 0 on success; -1 on error
7313 --------------------
7319 This capability connects the vcpu to an in-kernel MPIC device.
7322 --------------------
7329 This capability connects the vcpu to an in-kernel XICS device.
7332 ------------------------
7338 This capability enables the in-kernel irqchip for s390. Please refer to
7342 --------------------
7349 allows the Config1.FP bit to be set to enable the FPU in the guest. Once this is
7356 ---------------------
7369 ----------------------
7374 :Returns: x86: KVM_CHECK_EXTENSION returns a bit-array indicating which register
7376 (bitfields defined in arch/x86/include/uapi/asm/kvm.h).
7378 As described above in the kvm_sync_regs struct info in section 5 (kvm_run):
7383 modifications, e.g. when emulating and/or intercepting instructions in
7390 - the register sets to be copied out to kvm_run are selectable
7392 - vcpu_events are available in addition to regs and sregs.
7395 function as an input bit-array field set by userspace to indicate the
7404 Unused bitfields in the bitarrays must be set to zero.
7415 -------------------------
7422 This capability connects the vcpu to an in-kernel XIVE device.
7447 ----------------------------
7451 args[1] is 0 to disable, 1 to enable in-kernel handling
7454 get handled by the kernel or not. Enabling or disabling in-kernel
7456 initial set of hcalls are enabled for in-kernel handling, which
7457 consists of those hcalls for which in-kernel handlers were implemented
7464 If the hcall number specified is not one that has an in-kernel
7469 --------------------------
7474 This capability controls which SIGP orders will be handled completely in user
7476 in the kernel:
7478 - SENSE
7479 - SENSE RUNNING
7480 - EXTERNAL CALL
7481 - EMERGENCY SIGNAL
7482 - CONDITIONAL EMERGENCY SIGNAL
7484 All other orders will be handled completely in user space.
7486 Only privileged operation exceptions will be checked for in the kernel (or even
7487 in the hardware prior to interception). If this capability is not enabled, the
7488 old way of handling SIGP orders is used (partially in kernel and user space).
7491 ---------------------------------
7499 return -EINVAL if the machine does not support vectors.
7502 --------------------------
7507 This capability allows post-handlers for the STSI instruction. After
7508 initial handling in the kernel, KVM exits to user space with
7511 Before exiting to userspace, KVM handlers should fill in the s390_stsi field of
7512 vcpu->run::
7523 @addr - guest address of STSI SYSIB
7524 @fc - function code
7525 @sel1 - selector 1
7526 @sel2 - selector 2
7527 @ar - access register number
7529 KVM handlers should exit to userspace with rc = -EREMOTE.
7532 -------------------------
7535 :Parameters: args[0] - number of routes reserved for userspace IOAPICs
7536 :Returns: 0 on success, -1 on error
7538 Create a local APIC for each processor in the kernel. This can be used
7543 This capability also enables in-kernel routing of interrupt requests;
7545 used in the IRQ routing table. The first args[0] MSI routes are reserved
7549 Fails if VCPU has already been created, or if the irqchip is already in the
7553 -------------------
7558 Allows use of runtime-instrumentation introduced with the zEC12 processor.
7559 Will return -EINVAL if the machine does not support runtime-instrumentation.
7560 Will return -EBUSY if a VCPU has already been created.
7563 ----------------------
7566 :Parameters: args[0] - features that should be enabled
7567 :Returns: 0 on success, -EINVAL when args[0] contains invalid features
7569 Valid feature flags in args[0] are::
7576 allowing the use of 32-bit APIC IDs. See KVM_CAP_X2APIC_API in their
7580 in logical mode or with more than 255 VCPUs. Otherwise, KVM treats 0xff
7581 as a broadcast even in x2APIC mode in order to support physical x2APIC
7582 without interrupt remapping. This is undesirable in logical mode,
7583 where 0xff represents CPUs 0-7 in cluster 0.
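To make the ambiguity of 0xff concrete: in x2APIC logical mode a destination carries a cluster number in bits 31:16 and a 16-bit CPU mask in bits 15:0. The sketch below decodes such a value; the struct and function names are illustrative only and do not come from KVM.

```c
#include <stdint.h>

/* Illustrative type, not a KVM or kernel definition. */
struct x2apic_logical_dest {
    uint16_t cluster;   /* bits 31:16 of the logical destination */
    uint16_t cpu_mask;  /* bits 15:0, one bit per CPU in the cluster */
};

/* Split an x2APIC logical destination into cluster and CPU mask. */
static inline struct x2apic_logical_dest x2apic_decode_logical(uint32_t dest)
{
    struct x2apic_logical_dest d = {
        .cluster  = (uint16_t)(dest >> 16),
        .cpu_mask = (uint16_t)(dest & 0xffff),
    };
    return d;
}
```

Decoding 0xff this way gives cluster 0 with mask bits 0 through 7 set, which is why treating it as a broadcast is undesirable in logical mode.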
7586 ----------------------------
7593 mechanism e.g. to realize 2-byte software breakpoints. The kernel will
7601 -------------------
7605 :Returns: 0 on success; -EINVAL if the machine does not support
7606 guarded storage; -EBUSY if a VCPU has already been created.
7611 ---------------------
7616 Allow use of adapter-interruption suppression.
7617 :Returns: 0 on success; -EBUSY if a VCPU has already been created.
7620 --------------------
7627 virtual core). The virtual SMT mode, vsmt_mode, must be a power of 2
7630 be 0. A successful call to enable this capability will result in
7638 ----------------------
7643 With this capability a machine check exception in the guest address
7650 ------------------------------
7654 :Returns: 0 on success, -EINVAL when args[0] contains invalid exits
7656 Valid bits in args[0] are::
7664 longer intercept some instructions for improved latency in some
7666 physical CPUs. More bits can be added in the future; userspace can
7673 --------------------------
7677 :Returns: 0 on success, -EINVAL if hpage module parameter was not set
7685 hpage module parameter is not set to 1, -EINVAL is returned.
7691 ------------------------------
7701 --------------------------
7705 :Returns: 0 on success, -EINVAL when the implementation doesn't support
7706 nested-HV virtualization.
7708 HV-KVM on POWER9 and later systems allows for "nested-HV"
7710 can run using the CPU's supervisor mode (privileged non-hypervisor
7713 kvm-hv module parameter.
7716 ------------------------------
7722 emulated VM-exit when L1 intercepts a #PF exception that occurs in
7723 L2. Similarly, for kvm-intel only, DR6 will not be modified prior to
7724 the emulated VM-exit when L1 intercepts a #DB exception that occurs in
7727 faulting address (or the new DR6 bits*) will be reported in the
7730 exception.has_payload and to put the faulting address - or the new DR6
7731 bits\ [#]_ - in the exception_payload field.
7733 This capability also enables exception.pending in struct
7742 --------------------------------------
7753 automatically clear and write-protect all pages that are returned as dirty.
7759 KVM_CLEAR_DIRTY_LOG ioctl can operate on a 64-page granularity rather
7761 take spinlocks for an extended period of time. Second, in some cases a
7763 userspace actually using the data in the page. Pages can be modified
7772 dirty logging can be enabled gradually in small chunks on the first call
7784 ------------------------------
7801 ----------------------
7805 :Parameters: args[0] is the maximum poll time in nanoseconds
7806 :Returns: 0 on success; -1 on error
7809 maximum halt-polling time for all vCPUs in the target VM. This capability can
7811 maximum halt-polling time.
7813 See Documentation/virt/kvm/halt-polling.rst for more information on halt
7817 -------------------------------
7822 :Returns: 0 on success; -1 on error
7832 this capability. With it enabled, MSR accesses that match the mask specified in
7849 -------------------------------
7853 :Parameters: args[0] defines the policy used when bus locks are detected in the guest
7854 :Returns: 0 on success, -EINVAL when args[0] contains invalid bits
7856 Valid bits in args[0] are::
7862 policy to handle the bus locks detected in the guest. Userspace can obtain the
7864 the KVM_ENABLE_CAP. The supported modes are mutually-exclusive.
7866 This capability allows userspace to force VM exits on bus locks detected in the
7867 guest, irrespective of whether or not the host has enabled split-lock detection
7873 exit, although the host kernel's split-lock #AC detection still applies, if
7877 bus locks in the guest trigger a VM exit, and KVM exits to userspace for all
7879 apply some other policy-based mitigation. When exiting to userspace, KVM sets
7880 KVM_RUN_X86_BUS_LOCK in vcpu->run->flags, and conditionally sets the exit_reason
7888 ----------------------
7892 :Returns: 0 on success, -EINVAL when CPU doesn't support 2nd DAWR
7899 -------------------------------------
7909 This is intended to support in-guest workloads scheduled by the host. This
7910 allows the in-guest workload to maintain its own NPTs and keeps the two vms
7915 --------------------------
7919 :Parameters: args[0] is a file handle of a SGX attribute file in securityfs
7920 :Returns: 0 on success, -EINVAL if the file handle is invalid or if a requested
7932 by running an enclave in a VM, KVM prevents access to privileged attributes by
7938 -------------------------------
7947 In order to enable the use of H_RPT_INVALIDATE in the guest,
7949 IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is
7950 present in the "ibm,hypertas-functions" device-tree property.
7956 --------------------------------------
7961 When this capability is enabled, an emulation failure will result in an exit
7968 defines the 'flags' field which is used to describe the fields in the struct
7970 set in the 'flags' field then both 'insn_size' and 'insn_bytes' have valid data
7971 in them.)
7974 --------------------
7982 available to a guest running in AArch64 mode and enabling this capability will
7990 When this capability is enabled all memory in memslots must be mapped as
7991 ``MAP_ANONYMOUS`` or with a RAM-based file mapping (``tmpfs``, ``memfd``);
7992 attempts to create a memslot with an invalid mmap will result in an
7993 -EINVAL return.
7999 -------------------------------------
8009 This is intended to support intra-host migration of VMs between userspace VMMs,
8013 -------------------------------
8023 This capability allows a guest kernel to use a better-performance mode for
8027 ----------------------------
8030 :Parameters: args[0] - set of KVM quirks to disable
8038 quirks that can be disabled in KVM.
8044 The valid bits in cap.args[0] are:
8054 that runs in perpetuity with CR0.CD, i.e.
8055 with caches in "no fill" mode.
8064 LAPIC is in x2APIC mode.
8066 KVM_X86_QUIRK_OUT_7E_INC_RIP By default, KVM pre-increments %rip before
8069 KVM does not pre-increment %rip before
8096 KVM will modify MONITOR/MWAIT support in
8102 invalidates all SPTEs in all memslots and
8113 ------------------------
8117 :Parameters: args[0] - maximum APIC ID value set for current VM
8118 :Returns: 0 on success, -EINVAL if args[0] is beyond KVM_MAX_VCPU_IDS
8119 supported in KVM or if it has been set.
8134 ------------------------------
8139 :Returns: 0 on success, -EINVAL if args[0] contains invalid flags or notify
8149 in per-VM scope during VM creation. Notify VM exit is disabled by default.
8150 When userspace sets KVM_X86_NOTIFY_VMEXIT_ENABLED bit in args[0], VMM will
8152 a VM exit if no event window occurs in VM non-root mode for a specified amount of
8155 If KVM_X86_NOTIFY_VMEXIT_USER is set in args[0], when notify VM exits happen,
8163 ------------------------------
8166 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
8169 kvm_run.memory_fault if KVM cannot resolve a guest page fault VM-Exit, e.g. if
8173 The information in kvm_run.memory_fault is valid if and only if KVM_RUN returns
8184 -----------------------------------
8188 :Parameters: args[0] is the desired APIC bus clock rate, in nanoseconds
8189 :Returns: 0 on success, -EINVAL if args[0] contains an invalid value for the
8190 frequency or if any vCPUs have been created, -ENXIO if a virtual
8193 This capability sets the VM's APIC bus clock frequency, used by KVM's in-kernel
8198 core crystal clock frequency, if a non-zero CPUID 0x15 is exposed to the guest.
8201 ------------------------------
8204 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
8207 KVM_RUN_X86_GUEST_MODE bit in kvm_run.flags to indicate whether the
8221 ---------------------
8227 H_RANDOM hypercall backed by a hardware random-number generator.
8232 ------------------------
8238 Hyper-V Synthetic interrupt controller (SynIC). Hyper-V SynIC is
8239 used to support Windows Hyper-V based guest paravirt drivers (VMBus).
8241 In order to use SynIC, it has to be activated by setting this
8244 by the CPU, as it's incompatible with SynIC auto-EOI behavior.
8247 -------------------------
8253 radix MMU defined in Power ISA V3.00 (as implemented in the POWER9
8257 ---------------------------
8263 hashed page table MMU defined in Power ISA V3.00 (as implemented in
8264 the POWER9 processor), including in-memory segment tables.
8267 -------------------
8288 0 The trap & emulate implementation is in use to run guest code in user
8289 mode. Guest virtual memory segments are rearranged to fit the guest in the
8292 1 The MIPS VZ ASE is in use, providing full hardware assisted
8297 -------------------
8303 run guest code in user mode, even if KVM_CAP_MIPS_VZ indicates that hardware
8311 ----------------------
8325 Both registers and addresses are 32 bits wide.
8326 It will only be possible to run 32-bit guest code.
8328 1 MIPS64 or microMIPS64 with access only to 32-bit compatibility segments.
8329 Registers are 64 bits wide, but addresses are 32 bits wide.
8330 64-bit guest code may run but cannot access MIPS64 memory segments.
8331 It will also be possible to run 32-bit guest code.
8334 Both registers and addresses are 64 bits wide.
8335 It will be possible to run 64-bit or 32-bit guest code.
8339 ------------------------
8344 that if userspace creates a VM without an in-kernel interrupt controller, it
8345 will be notified of changes to the output level of in-kernel emulated devices,
8348 updates the vcpu's run->s.regs.device_irq_level field to represent the actual
8351 Whenever kvm detects a change in the device output level, kvm guarantees at
8354 userspace can always sample the device output level and re-compute the state of
8356 of run->s.regs.device_irq_level on every kvm exit.
8357 The value in run->s.regs.device_irq_level can represent both level and edge
8359 signals will exit to userspace with the bit in run->s.regs.device_irq_level
8362 The field run->s.regs.device_irq_level is available independent of
8363 run->kvm_valid_regs or run->kvm_dirty_regs bits.
8367 and thereby which bits in run->s.regs.device_irq_level can signal values.
8373 KVM_ARM_DEV_EL1_VTIMER - EL1 virtual timer
8374 KVM_ARM_DEV_EL1_PTIMER - EL1 physical timer
8375 KVM_ARM_DEV_PMU - ARM PMU overflow interrupt signal
8382 -----------------------------
8392 --------------------------
8396 This capability enables a newer version of Hyper-V Synthetic interrupt
8402 ----------------------------
8412 -------------------------------
8422 ---------------------
8429 ----------------------
8434 be anywhere in the user memory address space, as long as the memory slots are
8438 ---------------------
8443 use copy-on-write semantics as well as dirty pages tracking via read-only page
8447 ---------------------
8456 ----------------------------
8460 This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
8466 ----------------------------------
8476 AArch64, this value will be reported in the ISS field of ESR_ELx.
8481 ----------------------------
8485 This capability indicates that KVM supports paravirtualized Hyper-V IPI send
8490 -----------------------------------
8494 This capability indicates that KVM running on top of Hyper-V hypervisor
8496 hypercalls are handled by Level 0 hypervisor (Hyper-V) bypassing KVM.
8497 Due to the different ABI for hypercall parameters between Hyper-V and
8500 flush hypercalls by Hyper-V) so userspace should disable KVM identification
8501 in CPUID and only expose Hyper-V identification. In this case, the guest
8502 thinks it's running on Hyper-V and only uses Hyper-V hypercalls.
8505 -----------------------------
8513 ---------------------------
8524 -----------------------
8530 architecture-specific interfaces. This capability and the architecture-
8537 -------------------------
8547 an 8-byte value consisting of a one-byte Control Program Name Code (CPNC) and
8548 a 7-byte Control Program Version Code (CPVC). The CPNC determines what
8549 environment the control program is running in (e.g. Linux, z/VM...), and the
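The split of the 8-byte value into CPNC and CPVC can be sketched in C as follows. This is only an illustration: the names are hypothetical, and it assumes that the one-byte CPNC occupies the most significant byte with the 7-byte CPVC in the remaining 56 bits.

```c
#include <stdint.h>

/* Illustrative container; these names are not part of the KVM UAPI. */
struct cp_info {
    uint8_t  cpnc;  /* Control Program Name Code, 1 byte */
    uint64_t cpvc;  /* Control Program Version Code, low 56 bits used */
};

#define CPVC_MASK 0x00ffffffffffffffULL

/* ASSUMPTION: CPNC is the most significant byte of the 8-byte value. */
static inline struct cp_info cp_info_unpack(uint64_t val)
{
    struct cp_info info = {
        .cpnc = (uint8_t)(val >> 56),
        .cpvc = val & CPVC_MASK,
    };
    return info;
}

/* Inverse of cp_info_unpack(); bits of cpvc above bit 55 are dropped. */
static inline uint64_t cp_info_pack(uint8_t cpnc, uint64_t cpvc)
{
    return ((uint64_t)cpnc << 56) | (cpvc & CPVC_MASK);
}
```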
8557 -------------------------------
8568 ---------------------------
8577 In combination with KVM_CAP_X86_USER_SPACE_MSR, this allows user space to
8582 -------------------------------------
8587 guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
8592 ----------------------------------------------------------
8595 :Parameters: args[0] - size of the dirty log ring
8619 vCPU, and the size of the ring must be a power of two. The larger the
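A VMM may want to validate the ring size before passing it in args[0]. The check below is a minimal sketch covering only the power-of-two requirement; the function name is hypothetical, and any other limits KVM applies to the size are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pre-check: true if n has exactly one bit set. */
static inline bool is_power_of_two(uint64_t n)
{
    /* n & (n - 1) clears the lowest set bit; zero is rejected explicitly. */
    return n != 0 && (n & (n - 1)) == 0;
}
```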
8626 set in KVM_SET_USER_MEMORY_REGION. Once a memory region is registered
8630 An entry in the ring buffer can be unused (flag bits ``00``),
8635 00 -----------> 01 -------------> 1X -------+
8638 +------------------------------------------+
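The state diagram can be read as a classification of the two flag bits of a ring entry. The sketch below assumes bit 0 is the "dirty" flag and bit 1 is the "reset" flag, so that ``00`` is unused, ``01`` is dirty and ``1X`` is harvested; the constants and names are local stand-ins for illustration, not the UAPI definitions.

```c
#include <stdint.h>

/* Local stand-ins for the entry flag bits (assumed layout). */
#define GFN_F_DIRTY (1u << 0)
#define GFN_F_RESET (1u << 1)

enum gfn_state { GFN_UNUSED, GFN_DIRTY, GFN_HARVESTED };

/* Map the low two flag bits of a ring entry to its state. */
static inline enum gfn_state gfn_state_from_flags(uint32_t flags)
{
    if (flags & GFN_F_RESET)
        return GFN_HARVESTED;   /* 1X: the dirty bit is ignored */
    if (flags & GFN_F_DIRTY)
        return GFN_DIRTY;       /* 01 */
    return GFN_UNUSED;          /* 00 */
}
```

Testing the reset bit first mirrors the ``1X`` notation: once an entry is harvested, the value of the dirty bit no longer matters.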
8652 using load-acquire/store-release accessors when available, or any
8656 However it must collect the dirty GFNs in sequence, i.e., the userspace
8659 After processing one or more entries in the ring buffer, userspace
8669 KVM_GET_DIRTY_LOG interface in that, when reading the dirty ring from
8677 should be exposed by weakly ordered architecture, in order to indicate
8680 Architectures with TSO-like ordering (such as x86) are allowed to
8686 ring structures can be backed by per-slot bitmaps. With this capability
8689 maintained in the bitmap structure. KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP
8696 context. Otherwise, the stand-alone per-slot bitmap mechanism needs to
8699 To collect dirty bits in the backup bitmap, userspace can use the same
8701 the generation of the dirty bits is done in a single pass. Collecting
8709 KVM device "kvm-arm-vgic-its". (2) restore vgic/its tables through
8711 "kvm-arm-vgic-its". VGICv3 LPI pending status is restored. (3) save
8713 command on KVM device "kvm-arm-vgic-v3".
8716 --------------------
8736 provided in the flags to KVM_XEN_HVM_CONFIG, without providing hypercall page
8746 The KVM_XEN_HVM_CONFIG_RUNSTATE flag indicates that the runstate-related
8763 the KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG attribute in the KVM_XEN_SET_ATTR
8765 XEN_RUNSTATE_UPDATE flag in guest memory mapped vcpu_runstate_info during
8775 clearing the PVCLOCK_TSC_STABLE_BIT flag in Xen pvclock sources. This will be
8780 -------------------------
8791 in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
8793 In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
8795 IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
8796 present in the "ibm,hypertas-functions" device-tree property.
8799 in the kernel-based fast path. If they cannot be handled by the kernel,
8801 an implementation for these despite the in-kernel acceleration.
8806 --------------------
8811 supported in the host. A VMM can check whether the service is
8815 ---------------------------------
8819 When enabled, KVM will disable emulated Hyper-V features provided to the
8820 guest according to the bits in Hyper-V CPUID feature leaves. Otherwise, all
8821 currently implemented Hyper-V features are provided unconditionally when
8822 Hyper-V identification is set in the HYPERV_CPUID_INTERFACE (0x40000001)
8826 ---------------------------
8841 the hypercalls whose corresponding bit is in the argument, and return
8845 ---------------------------
8851 :Returns: 0 on success, -EINVAL when args[0] contains invalid bits
8853 This capability alters PMU virtualization in KVM.
8867 -------------------------------
8874 type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
8877 --------------------------------
8890 -------------------------------------
8896 :Returns: 0 on success, -EPERM if the userspace process does not
8897 have CAP_SYS_BOOT, -EINVAL if args[0] is not 0 or any vCPUs have been
8907 ------------------------------
8928 When getting the Modified Change Topology Report value, the attr->addr
8932 ---------------------------------------
8938 :Returns: 0 on success, -EINVAL if any memslot was already created.
8940 This capability sets the chunk size used in Eager Page Splitting.
8942 Eager Page Splitting improves the performance of dirty-logging (used
8943 in live migrations) when guest memory is backed by huge-pages. It
8944 avoids splitting huge-pages (into PAGE_SIZE pages) on fault, by doing
8954 block sizes is exposed in KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES as a
8955 64-bit bitmap (each bit describing a block size). The default value is
8959 ---------------------
8965 This capability returns a bitmap of supported VM types. The 1-setting of bit @n
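Interpreting the returned bitmap is a single bit test. The sketch below assumes that bit @n being set means the VM type with value @n is available; the function name is hypothetical and the bitmap is taken as a plain unsigned integer.

```c
#include <stdbool.h>

/* Hypothetical helper: test bit `type` of the VM-type bitmap. */
static inline bool vm_type_supported(unsigned int bitmap, unsigned int type)
{
    /* Types beyond the width of the bitmap cannot be reported. */
    if (type >= sizeof(bitmap) * 8)
        return false;
    return (bitmap >> type) & 1u;
}
```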
8974 Do not use KVM_X86_SW_PROTECTED_VM for "real" VMs, and especially not in
8975 production. The behavior and effective ABI for software-protected VMs is
8981 In some cases, KVM's API has some inconsistencies or common pitfalls
8989 --------
8994 In general, ``KVM_GET_SUPPORTED_CPUID`` is designed so that it is possible
8996 documents some cases in which that requires some care.
9003 ``KVM_ENABLE_CAP(KVM_CAP_IRQCHIP_SPLIT)`` are used to enable in-kernel emulation of
9010 has enabled in-kernel emulation of the local APIC.
9021 the values of these three leaves differ for each CPU. In particular,
9022 the APIC ID is found in EDX for all subleaves of 0x0b and 0x1f, and in EAX
9023 for 0x8000001e; the latter also encodes the core id and node id in bits