The Definitive KVM (Kernel-based Virtual Machine) API Documentation
===================================================================

1. General description
----------------------

The kvm API is a set of ioctls that are issued to control various aspects
of a virtual machine. The ioctls belong to four classes:

 - System ioctls: These query and set global attributes which affect the
   whole kvm subsystem. In addition a system ioctl is used to create
   virtual machines.

 - VM ioctls: These query and set attributes that affect an entire virtual
   machine, for example memory layout. In addition a VM ioctl is used to
   create virtual cpus (vcpus) and devices.

   Only run VM ioctls from the same process (address space) that was used
   to create the VM.

 - vcpu ioctls: These query and set attributes that control the operation
   of a single virtual cpu.

   Only run vcpu ioctls from the same thread that was used to create the
   vcpu.

 - device ioctls: These query and set attributes that control the operation
   of a single device.

   device ioctls must be issued from the same process (address space) that
   was used to create the VM.

2. File descriptors
-------------------

The kvm API is centered around file descriptors. An initial
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
handle will create a VM file descriptor which can be used to issue VM
ioctls. A KVM_CREATE_VCPU or KVM_CREATE_DEVICE ioctl on a VM fd will
create a virtual cpu or device and return a file descriptor pointing to
the new resource. Finally, ioctls on a vcpu or device fd can be used
to control the vcpu or device. For vcpus, this includes the important
task of actually running guest code.

In general file descriptors can be migrated among processes by means
of fork() and the SCM_RIGHTS facility of unix domain sockets.
These kinds of tricks are explicitly not supported by kvm. While they will
not cause harm to the host, their actual behavior is not guaranteed by
the API. The only supported use is one virtual machine per process,
and one vcpu per thread.


3. Extensions
-------------

As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
incompatible changes are allowed. However, there is an extension
facility that allows backward-compatible extensions to the API to be
queried and used.

The extension mechanism is not based on the Linux version number.
Instead, kvm defines extension identifiers and a facility to query
whether a particular extension identifier is available. If it is, a
set of ioctls is available for application use.


4. API description
------------------

This section describes ioctls that can be used to control kvm guests.
For each ioctl, the following information is provided along with a
description:

  Capability: which KVM extension provides this ioctl. Can be 'basic',
      which means that it will be provided by any kernel that supports
      API version 12 (see section 4.1), a KVM_CAP_xyz constant, which
      means availability needs to be checked with KVM_CHECK_EXTENSION
      (see section 4.4), or 'none' which means that while not all kernels
      support this ioctl, there's no capability bit to check its
      availability: for kernels that don't support the ioctl,
      the ioctl returns -ENOTTY.

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Type: system, vm, or vcpu.

  Parameters: what parameters are accepted by the ioctl.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.
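
The file descriptor hierarchy from section 2 can be sketched as follows.
This is a minimal illustration, not a reference implementation: the
function name kvm_fd_walk is hypothetical, machine type 0 and vcpu id 0
are assumptions, and the function simply reports -1 on hosts where
/dev/kvm is missing or inaccessible.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Walk the fd hierarchy: /dev/kvm -> VM fd -> vcpu fd.
 * Returns 0 on success, -1 if any step fails (for example when the host
 * has no /dev/kvm or the caller may not open it).
 * kvm_fd_walk is a hypothetical helper name used only for illustration.
 */
int kvm_fd_walk(void)
{
	int ret = -1;
	int vm_fd = -1, vcpu_fd = -1;

	/* System fd: used to issue system ioctls. */
	int sys_fd = open("/dev/kvm", O_RDWR);
	if (sys_fd < 0)
		return -1;

	/* Refuse to continue unless the stable API version is reported. */
	if (ioctl(sys_fd, KVM_GET_API_VERSION, 0) != KVM_API_VERSION)
		goto out;

	/* VM fd: machine type 0 is an assumption for this sketch. */
	vm_fd = ioctl(sys_fd, KVM_CREATE_VM, 0UL);
	if (vm_fd < 0)
		goto out;

	/* vcpu fd: vcpu id 0; vcpu ioctls must stay on this thread. */
	vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0UL);
	if (vcpu_fd < 0)
		goto out;

	ret = 0;
out:
	if (vcpu_fd >= 0)
		close(vcpu_fd);
	if (vm_fd >= 0)
		close(vm_fd);
	close(sys_fd);
	return ret;
}
```

Each level is created by an ioctl on the level above it, so closing
happens in the reverse order of creation.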


4.1 KVM_GET_API_VERSION

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: the constant KVM_API_VERSION (=12)

This identifies the API version as the stable kvm API. It is not
expected that this number will change. However, Linux 2.6.20 and
2.6.21 report earlier versions; these are not documented and not
supported. Applications should refuse to run if KVM_GET_API_VERSION
returns a value other than 12. If this check passes, all ioctls
described as 'basic' will be available.


4.2 KVM_CREATE_VM

Capability: basic
Architectures: all
Type: system ioctl
Parameters: machine type identifier (KVM_VM_*)
Returns: a VM fd that can be used to control the new virtual machine.

The new VM has no virtual cpus and no memory. An mmap() of a VM fd
will access the virtual machine's physical address space; offset zero
corresponds to guest physical address zero. Use of mmap() on a VM fd
is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
available.
You most certainly want to use 0 as machine type.

In order to create user controlled virtual machines on S390, check
KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
privileged user (CAP_SYS_ADMIN).


4.3 KVM_GET_MSR_INDEX_LIST

Capability: basic
Architectures: x86
Type: system ioctl
Parameters: struct kvm_msr_list (in/out)
Returns: 0 on success; -1 on error
Errors:
  E2BIG:     the msr index list is too big to fit in the array specified by
             the user.

struct kvm_msr_list {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 indices[0];
};

This ioctl returns the guest msrs that are supported. The list varies
by kvm version and host processor, but does not change otherwise.
The user fills in the size of the indices array in nmsrs, and in return
kvm adjusts nmsrs to reflect the actual number of msrs and fills in
the indices array with their numbers.

Note: if kvm indicates support for MCE (KVM_CAP_MCE), then the MCE bank MSRs
are not returned in the MSR list, as different vcpus can have a different
number of banks, as set via the KVM_X86_SETUP_MCE ioctl.


4.4 KVM_CHECK_EXTENSION

Capability: basic, KVM_CAP_CHECK_EXTENSION_VM for vm ioctl
Architectures: all
Type: system ioctl, vm ioctl
Parameters: extension identifier (KVM_CAP_*)
Returns: 0 if unsupported; 1 (or some other positive integer) if supported

The API allows the application to query about extensions to the core
kvm API. Userspace passes an extension identifier (an integer) and
receives an integer that describes the extension availability.
Generally 0 means no and 1 means yes, but some extensions may report
additional information in the integer return value.

Based on their initialization different VMs may have different capabilities.
It is thus encouraged to use the vm ioctl to query for capabilities (available
with KVM_CAP_CHECK_EXTENSION_VM on the vm fd).

4.5 KVM_GET_VCPU_MMAP_SIZE

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: size of vcpu mmap area, in bytes

The KVM_RUN ioctl (see section 4.10) communicates with userspace via a
shared memory region. This ioctl returns the size of that region. See the
KVM_RUN documentation for details.


4.6 KVM_SET_MEMORY_REGION

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: struct kvm_memory_region (in)
Returns: 0 on success, -1 on error

This ioctl is obsolete and has been removed.
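
As an illustration of sections 4.4 and 4.5, the sketch below queries an
extension and maps the shared region that KVM_RUN uses. The helper names
(kvm_check_cap, kvm_check_cap_prefer_vm, kvm_map_run) are hypothetical and
the fds are assumed to come from open("/dev/kvm"), KVM_CREATE_VM and
KVM_CREATE_VCPU; it is one possible arrangement, not the required one.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

/*
 * Query an extension identifier on a system or VM fd.
 * Returns 0 if unsupported, a positive value if supported, and -1 if the
 * ioctl itself fails (for example on an invalid fd).
 */
int kvm_check_cap(int fd, long cap)
{
	return ioctl(fd, KVM_CHECK_EXTENSION, cap);
}

/*
 * Prefer the VM fd when KVM_CAP_CHECK_EXTENSION_VM is advertised, since
 * different VMs may have different capabilities; otherwise fall back to
 * the system fd.
 */
int kvm_check_cap_prefer_vm(int sys_fd, int vm_fd, long cap)
{
	if (vm_fd >= 0 && kvm_check_cap(sys_fd, KVM_CAP_CHECK_EXTENSION_VM) > 0)
		return kvm_check_cap(vm_fd, cap);
	return kvm_check_cap(sys_fd, cap);
}

/*
 * Map the shared 'struct kvm_run' region of a vcpu at offset 0, using the
 * size reported by KVM_GET_VCPU_MMAP_SIZE. Returns NULL on failure.
 */
struct kvm_run *kvm_map_run(int sys_fd, int vcpu_fd)
{
	long size = ioctl(sys_fd, KVM_GET_VCPU_MMAP_SIZE, 0);
	void *p;

	if (size <= 0)
		return NULL;
	p = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED,
		 vcpu_fd, 0);
	return p == MAP_FAILED ? NULL : p;
}
```

Callers should treat any positive return from kvm_check_cap as "supported",
because some extensions encode extra information in the value.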


4.7 KVM_CREATE_VCPU

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: vcpu id (apic id on x86)
Returns: vcpu fd on success, -1 on error

This API adds a vcpu to a virtual machine. The vcpu id is a small integer
in the range [0, max_vcpus).

The recommended max_vcpus value can be retrieved using the KVM_CAP_NR_VCPUS of
the KVM_CHECK_EXTENSION ioctl() at run-time.
The maximum possible value for max_vcpus can be retrieved using the
KVM_CAP_MAX_VCPUS of the KVM_CHECK_EXTENSION ioctl() at run-time.

If KVM_CAP_NR_VCPUS does not exist, you should assume that max_vcpus is
at most 4.
If KVM_CAP_MAX_VCPUS does not exist, you should assume that max_vcpus is
the same as the value returned from KVM_CAP_NR_VCPUS.

On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
threads in one or more virtual CPU cores. (This is because the
hardware requires all the hardware threads in a CPU core to be in the
same partition.) The KVM_CAP_PPC_SMT capability indicates the number
of vcpus per virtual core (vcore). The vcore id is obtained by
dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
given vcore will always be in the same physical core as each other
(though that might be a different physical core from time to time).
Userspace can control the threading (SMT) mode of the guest by its
allocation of vcpu ids. For example, if userspace wants
single-threaded guest vcpus, it should make all vcpu ids be a multiple
of the number of vcpus per vcore.

For virtual cpus that have been created with S390 user controlled virtual
machines, the resulting vcpu fd can be memory mapped at page offset
KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
cpu's hardware control block.
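
The fallback rules for max_vcpus and the book3s_hv vcore arithmetic above
can be written down as plain functions. The names below are hypothetical,
and the inputs are assumed to be the integers returned by
KVM_CHECK_EXTENSION for the respective capabilities.

```c
/*
 * Resolve the recommended and maximum vcpu counts from the values that
 * KVM_CHECK_EXTENSION returned for KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS.
 * A value of 0 or less is treated as "capability does not exist".
 */
void kvm_vcpu_limits(int nr_vcpus_cap, int max_vcpus_cap,
		     int *recommended, int *maximum)
{
	/* Without KVM_CAP_NR_VCPUS, assume at most 4 vcpus. */
	*recommended = nr_vcpus_cap > 0 ? nr_vcpus_cap : 4;
	/* Without KVM_CAP_MAX_VCPUS, reuse the recommended value. */
	*maximum = max_vcpus_cap > 0 ? max_vcpus_cap : *recommended;
}

/* book3s_hv: the vcore id is the vcpu id divided by the vcpus per vcore. */
int ppc_vcore_id(int vcpu_id, int vcpus_per_vcore)
{
	return vcpu_id / vcpus_per_vcore;
}

/*
 * For single-threaded guest vcpus, give the n-th vcpu an id that is a
 * multiple of the number of vcpus per vcore, so each lands in its own vcore.
 */
int ppc_single_thread_vcpu_id(int n, int vcpus_per_vcore)
{
	return n * vcpus_per_vcore;
}
```

With 4 vcpus per vcore, for example, single-threaded vcpus get the ids
0, 4, 8, 12 and therefore the vcore ids 0, 1, 2, 3.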


4.8 KVM_GET_DIRTY_LOG (vm ioctl)

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_dirty_log (in/out)
Returns: 0 on success, -1 on error

/* for KVM_GET_DIRTY_LOG */
struct kvm_dirty_log {
	__u32 slot;
	__u32 padding1;
	union {
		void __user *dirty_bitmap; /* one bit per page */
		__u64 padding2;
	};
};

Given a memory slot, return a bitmap containing any pages dirtied
since the last call to this ioctl. Bit 0 is the first page in the
memory slot. Ensure the entire structure is cleared to avoid padding
issues.

If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of the slot field
specify the address space for which you want to return the dirty bitmap.
They must be less than the value that KVM_CHECK_EXTENSION returns for
the KVM_CAP_MULTI_ADDRESS_SPACE capability.


4.9 KVM_SET_MEMORY_ALIAS

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_memory_alias (in)
Returns: 0 (success), -1 (error)

This ioctl is obsolete and has been removed.


4.10 KVM_RUN

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error
Errors:
  EINTR:     an unmasked signal is pending

This ioctl is used to run a guest virtual cpu. While there are no
explicit parameters, there is an implicit parameter block that can be
obtained by mmap()ing the vcpu fd at offset 0, with the size given by
KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
kvm_run' (see below).


4.11 KVM_GET_REGS

Capability: basic
Architectures: all except ARM, arm64
Type: vcpu ioctl
Parameters: struct kvm_regs (out)
Returns: 0 on success, -1 on error

Reads the general purpose registers from the vcpu.

/* x86 */
struct kvm_regs {
	/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
	__u64 rax, rbx, rcx, rdx;
	__u64 rsi, rdi, rsp, rbp;
	__u64 r8,  r9,  r10, r11;
	__u64 r12, r13, r14, r15;
	__u64 rip, rflags;
};

/* mips */
struct kvm_regs {
	/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
	__u64 gpr[32];
	__u64 hi;
	__u64 lo;
	__u64 pc;
};


4.12 KVM_SET_REGS

Capability: basic
Architectures: all except ARM, arm64
Type: vcpu ioctl
Parameters: struct kvm_regs (in)
Returns: 0 on success, -1 on error

Writes the general purpose registers into the vcpu.

See KVM_GET_REGS for the data structure.


4.13 KVM_GET_SREGS

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_sregs (out)
Returns: 0 on success, -1 on error

Reads special registers from the vcpu.

/* x86 */
struct kvm_sregs {
	struct kvm_segment cs, ds, es, fs, gs, ss;
	struct kvm_segment tr, ldt;
	struct kvm_dtable gdt, idt;
	__u64 cr0, cr2, cr3, cr4, cr8;
	__u64 efer;
	__u64 apic_base;
	__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
};

/* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */

interrupt_bitmap is a bitmap of pending external interrupts. At most
one bit may be set. This interrupt has been acknowledged by the APIC
but not yet injected into the cpu core.


4.14 KVM_SET_SREGS

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_sregs (in)
Returns: 0 on success, -1 on error

Writes special registers into the vcpu. See KVM_GET_SREGS for the
data structures.


4.15 KVM_TRANSLATE

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_translation (in/out)
Returns: 0 on success, -1 on error

Translates a virtual address according to the vcpu's current address
translation mode.

struct kvm_translation {
	/* in */
	__u64 linear_address;

	/* out */
	__u64 physical_address;
	__u8  valid;
	__u8  writeable;
	__u8  usermode;
	__u8  pad[5];
};


4.16 KVM_INTERRUPT

Capability: basic
Architectures: x86, ppc, mips
Type: vcpu ioctl
Parameters: struct kvm_interrupt (in)
Returns: 0 on success, negative on failure.

Queues a hardware interrupt vector to be injected.

/* for KVM_INTERRUPT */
struct kvm_interrupt {
	/* in */
	__u32 irq;
};

X86:

Returns: 0 on success,
	 -EEXIST if an interrupt is already enqueued
	 -EINVAL if the irq number is invalid
	 -ENXIO if the PIC is in the kernel
	 -EFAULT if the pointer is invalid

Note 'irq' is an interrupt vector, not an interrupt pin or line. This
ioctl is useful if the in-kernel PIC is not used.

PPC:

Queues an external interrupt to be injected. This ioctl is overloaded
with 3 different irq values:

a) KVM_INTERRUPT_SET

  This injects an edge type external interrupt into the guest once it's ready
  to receive interrupts. When injected, the interrupt is done.

b) KVM_INTERRUPT_UNSET

  This unsets any pending interrupt.

  Only available with KVM_CAP_PPC_UNSET_IRQ.

c) KVM_INTERRUPT_SET_LEVEL

  This injects a level type external interrupt into the guest context. The
  interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
  is triggered.

  Only available with KVM_CAP_PPC_IRQ_LEVEL.

Note that any value for 'irq' other than the ones stated above is invalid
and incurs unexpected behavior.

MIPS:

Queues an external interrupt to be injected into the virtual CPU. A negative
interrupt number dequeues the interrupt.
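
A minimal sketch of issuing KVM_INTERRUPT and interpreting the x86 error
codes listed above follows. The helper names are hypothetical, and the
error is reported as a negated errno value, which is an assumption of
this sketch (the raw ioctl returns -1 and sets errno).

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Queue an interrupt on a vcpu fd. On x86 'irq' is an interrupt vector.
 * Returns 0 on success or a negated errno value on failure.
 */
int kvm_queue_interrupt(int vcpu_fd, unsigned int irq)
{
	struct kvm_interrupt intr;

	memset(&intr, 0, sizeof(intr));
	intr.irq = irq;
	if (ioctl(vcpu_fd, KVM_INTERRUPT, &intr) < 0)
		return -errno;
	return 0;
}

/* Describe the x86-specific results documented for KVM_INTERRUPT. */
const char *kvm_interrupt_strerror_x86(int err)
{
	switch (err) {
	case 0:
		return "queued";
	case -EEXIST:
		return "an interrupt is already enqueued";
	case -EINVAL:
		return "the irq number is invalid";
	case -ENXIO:
		return "the PIC is in the kernel";
	case -EFAULT:
		return "the pointer is invalid";
	default:
		return "other error";
	}
}
```

A caller that receives -EEXIST would typically run the vcpu again and
retry once the previously queued interrupt has been injected.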


4.17 KVM_DEBUG_GUEST

Capability: basic
Architectures: none
Type: vcpu ioctl
Parameters: none
Returns: -1 on error

Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.


4.18 KVM_GET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in/out)
Returns: 0 on success, -1 on error

Reads model-specific registers from the vcpu. Supported msr indices can
be obtained using KVM_GET_MSR_INDEX_LIST.

struct kvm_msrs {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 pad;

	struct kvm_msr_entry entries[0];
};

struct kvm_msr_entry {
	__u32 index;
	__u32 reserved;
	__u64 data;
};

Application code should set the 'nmsrs' member (which indicates the
size of the entries array) and the 'index' member of each array entry.
kvm will fill in the 'data' member.


4.19 KVM_SET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in)
Returns: 0 on success, -1 on error

Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
data structures.

Application code should set the 'nmsrs' member (which indicates the
size of the entries array), and the 'index' and 'data' members of each
array entry.


4.20 KVM_SET_CPUID

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_cpuid (in)
Returns: 0 on success, -1 on error

Defines the vcpu responses to the cpuid instruction. Applications
should use the KVM_SET_CPUID2 ioctl if available.


struct kvm_cpuid_entry {
	__u32 function;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding;
};

/* for KVM_SET_CPUID */
struct kvm_cpuid {
	__u32 nent;
	__u32 padding;
	struct kvm_cpuid_entry entries[0];
};


4.21 KVM_SET_SIGNAL_MASK

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_signal_mask (in)
Returns: 0 on success, -1 on error

Defines which signals are blocked during execution of KVM_RUN. This
signal mask temporarily overrides the thread's signal mask. Any
unblocked signal received (except SIGKILL and SIGSTOP, which retain
their traditional behaviour) will cause KVM_RUN to return with -EINTR.

Note the signal will only be delivered if not blocked by the original
signal mask.

/* for KVM_SET_SIGNAL_MASK */
struct kvm_signal_mask {
	__u32 len;
	__u8  sigset[0];
};


4.22 KVM_GET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (out)
Returns: 0 on success, -1 on error

Reads the floating point state from the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8  fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8  ftwx;  /* in fxsave format */
	__u8  pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8  xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};


4.23 KVM_SET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (in)
Returns: 0 on success, -1 on error

Writes the floating point state to the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8  fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8  ftwx;  /* in fxsave format */
	__u8  pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8  xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};


4.24 KVM_CREATE_IRQCHIP

Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
Architectures: x86, ARM, arm64, s390
Type: vm ioctl
Parameters: none
Returns: 0 on success, -1 on error

Creates an interrupt controller model in the kernel.
On x86, creates a virtual ioapic, a virtual PIC (two PICs, nested), and sets up
future vcpus to have a local APIC. IRQ routing for GSIs 0-15 is set to both
PIC and IOAPIC; GSI 16-23 only go to the IOAPIC.
On ARM/arm64, a GICv2 is created. Any other GIC versions require the usage of
KVM_CREATE_DEVICE, which also supports creating a GICv2. Using
KVM_CREATE_DEVICE is preferred over KVM_CREATE_IRQCHIP for GICv2.
On s390, a dummy irq routing table is created.

Note that on s390 the KVM_CAP_S390_IRQCHIP vm capability needs to be enabled
before KVM_CREATE_IRQCHIP can be used.


4.25 KVM_IRQ_LINE

Capability: KVM_CAP_IRQCHIP
Architectures: x86, arm, arm64
Type: vm ioctl
Parameters: struct kvm_irq_level
Returns: 0 on success, -1 on error

Sets the level of a GSI input to the interrupt controller model in the kernel.
On some architectures it is required that an interrupt controller model has
been previously created with KVM_CREATE_IRQCHIP. Note that edge-triggered
interrupts require the level to be set to 1 and then back to 0.

On real hardware, interrupt pins can be active-low or active-high. This
does not matter for the level field of struct kvm_irq_level: 1 always
means active (asserted), 0 means inactive (deasserted).
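
The level convention above, together with the ARM/arm64 irq field layout
described further down in this section, can be sketched as follows. The
helper names are hypothetical; the shift values are taken directly from
the bit layout table in this section rather than from any header.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Set one interrupt input to 'level' (1 = asserted, 0 = deasserted). */
static int kvm_set_irq_level(int vm_fd, uint32_t irq, uint32_t level)
{
	struct kvm_irq_level req;

	memset(&req, 0, sizeof(req));
	req.irq = irq;
	req.level = level;
	return ioctl(vm_fd, KVM_IRQ_LINE, &req);
}

/* Edge-triggered interrupt: set the level to 1, then back to 0. */
int kvm_pulse_irq(int vm_fd, uint32_t irq)
{
	if (kvm_set_irq_level(vm_fd, irq, 1) < 0)
		return -1;
	return kvm_set_irq_level(vm_fd, irq, 0);
}

/*
 * Pack the ARM/arm64 irq field: irq_type in bits 31-24, vcpu_index in
 * bits 23-16, irq_id in bits 15-0.
 */
uint32_t arm_irq_encode(uint32_t irq_type, uint32_t vcpu_index,
			uint32_t irq_id)
{
	return ((irq_type & 0xff) << 24) | ((vcpu_index & 0xff) << 16) |
	       (irq_id & 0xffff);
}
```

For instance, an in-kernel GIC SPI with irq_id 32 would be passed as
arm_irq_encode(1, 0, 32), since the vcpu_index is ignored for SPIs.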

x86 allows the operating system to program the interrupt polarity
(active-low/active-high) for level-triggered interrupts, and KVM used
to consider the polarity. However, due to bitrot in the handling of
active-low interrupts, the above convention is now valid on x86 too.
This is signaled by KVM_CAP_X86_IOAPIC_POLARITY_IGNORED. Userspace
should not present interrupts to the guest as active-low unless this
capability is present (or unless it is not using the in-kernel irqchip,
of course).


ARM/arm64 can signal an interrupt either at the CPU level, or at the
in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
use PPIs designated for specific cpus. The irq field is interpreted
like this:

  bits:  | 31 ... 24 | 23  ... 16 | 15    ...    0 |
  field: | irq_type  | vcpu_index |     irq_id     |

The irq_type field has the following values:
- irq_type[0]: out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
- irq_type[1]: in-kernel GIC: SPI, irq_id between 32 and 1019 (incl.)
               (the vcpu_index field is ignored)
- irq_type[2]: in-kernel GIC: PPI, irq_id between 16 and 31 (incl.)

(The irq_id field thus corresponds nicely to the IRQ ID in the ARM GIC specs)

In both cases, level is used to assert/deassert the line.

struct kvm_irq_level {
	union {
		__u32 irq;     /* GSI */
		__s32 status;  /* not used for KVM_IRQ_LINE */
	};
	__u32 level;           /* 0 or 1 */
};


4.26 KVM_GET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_irqchip (in/out)
Returns: 0 on success, -1 on error

Reads the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP into a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512];  /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};


4.27 KVM_SET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_irqchip (in)
Returns: 0 on success, -1 on error

Sets the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP from a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512];  /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};


4.28 KVM_XEN_HVM_CONFIG

Capability: KVM_CAP_XEN_HVM
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_xen_hvm_config (in)
Returns: 0 on success, -1 on error

Sets the MSR that the Xen HVM guest uses to initialize its hypercall
page, and provides the starting address and size of the hypercall
blobs in userspace. When the guest writes the MSR, kvm copies one
page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
memory.

struct kvm_xen_hvm_config {
	__u32 flags;
	__u32 msr;
	__u64 blob_addr_32;
	__u64 blob_addr_64;
	__u8 blob_size_32;
	__u8 blob_size_64;
	__u8 pad2[30];
};


4.29 KVM_GET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (out)
Returns: 0 on success, -1 on error

Gets the current timestamp of kvmclock as seen by the current guest. In
conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity on scenarios
such as migration.

struct kvm_clock_data {
	__u64 clock;  /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};


4.30 KVM_SET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (in)
Returns: 0 on success, -1 on error

Sets the current timestamp of kvmclock to the value specified in its parameter.
In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity on scenarios
such as migration.

struct kvm_clock_data {
	__u64 clock;  /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};


4.31 KVM_GET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (out)
Returns: 0 on success, -1 on error

Gets currently pending exceptions, interrupts, and NMIs as well as related
states of the vcpu.

struct kvm_vcpu_events {
	struct {
		__u8 injected;
		__u8 nr;
		__u8 has_error_code;
		__u8 pad;
		__u32 error_code;
	} exception;
	struct {
		__u8 injected;
		__u8 nr;
		__u8 soft;
		__u8 shadow;
	} interrupt;
	struct {
		__u8 injected;
		__u8 pending;
		__u8 masked;
		__u8 pad;
	} nmi;
	__u32 sipi_vector;
	__u32 flags;
	struct {
		__u8 smm;
		__u8 pending;
		__u8 smm_inside_nmi;
		__u8 latched_init;
	} smi;
};

Only two flags are defined for the flags field:

- KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
  interrupt.shadow contains a valid state.

- KVM_VCPUEVENT_VALID_SMM may be set in the flags field to signal that
  smi contains a valid state.
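
As a hedged illustration of how the flags field qualifies the data that
KVM_GET_VCPU_EVENTS returns, the sketch below reads interrupt.shadow only
when KVM_VCPUEVENT_VALID_SHADOW is set. The function name is hypothetical,
and because the structure is x86-specific the sketch compiles to a stub
on other hosts (an assumption made so the example builds everywhere).

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#if defined(__x86_64__) || defined(__i386__)
/*
 * Return the interrupt shadow state of a vcpu, or -1 if the ioctl fails
 * or the kernel did not mark interrupt.shadow as valid.
 */
int kvm_get_interrupt_shadow(int vcpu_fd)
{
	struct kvm_vcpu_events ev;

	memset(&ev, 0, sizeof(ev));
	if (ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &ev) < 0)
		return -1;
	/* interrupt.shadow is only meaningful when this flag is set. */
	if (!(ev.flags & KVM_VCPUEVENT_VALID_SHADOW))
		return -1;
	return ev.interrupt.shadow;
}
#else
/* Stub for non-x86 hosts, where this ioctl is not documented here. */
int kvm_get_interrupt_shadow(int vcpu_fd)
{
	(void)vcpu_fd;
	return -1;
}
#endif
```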


4.32 KVM_SET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (in)
Returns: 0 on success, -1 on error

Set pending exceptions, interrupts, and NMIs as well as related states of the
vcpu.

See KVM_GET_VCPU_EVENTS for the data structure.

Fields that may be modified asynchronously by running VCPUs can be excluded
from the update. These fields are nmi.pending, sipi_vector, smi.smm,
smi.pending. Keep the corresponding bits in the flags field cleared to
suppress overwriting the current in-kernel state. The bits are:

KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector
KVM_VCPUEVENT_VALID_SMM         - transfer the smi sub-struct.

If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
the flags field to signal that interrupt.shadow contains a valid state and
shall be written into the VCPU.

KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.


4.33 KVM_GET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (out)
Returns: 0 on success, -1 on error

Reads debug registers from the vcpu.

struct kvm_debugregs {
	__u64 db[4];
	__u64 dr6;
	__u64 dr7;
	__u64 flags;
	__u64 reserved[9];
};


4.34 KVM_SET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (in)
Returns: 0 on success, -1 on error

Writes debug registers into the vcpu.

See KVM_GET_DEBUGREGS for the data structure. The flags field is not yet
used and must be cleared on entry.
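
The selective update described in section 4.32 could look roughly like
the sketch below, which changes nmi.pending while leaving the in-kernel
sipi_vector and smi state untouched. The function name is hypothetical,
reading the current state first is a design choice of this sketch, and
non-x86 hosts get a stub so the example still builds.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#if defined(__x86_64__) || defined(__i386__)
/* Set or clear a pending NMI on a vcpu. Returns 0 on success, -1 on error. */
int kvm_set_nmi_pending(int vcpu_fd, int pending)
{
	struct kvm_vcpu_events ev;

	/* Start from the current state so other fields are preserved. */
	memset(&ev, 0, sizeof(ev));
	if (ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &ev) < 0)
		return -1;

	ev.nmi.pending = pending ? 1 : 0;

	/* Transfer nmi.pending only; keep in-kernel sipi_vector and smi. */
	ev.flags &= ~(KVM_VCPUEVENT_VALID_SIPI_VECTOR | KVM_VCPUEVENT_VALID_SMM);
	ev.flags |= KVM_VCPUEVENT_VALID_NMI_PENDING;

	return ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &ev);
}
#else
/* Stub for non-x86 hosts, where this ioctl is not documented here. */
int kvm_set_nmi_pending(int vcpu_fd, int pending)
{
	(void)vcpu_fd;
	(void)pending;
	return -1;
}
#endif
```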


4.35 KVM_SET_USER_MEMORY_REGION

Capability: KVM_CAP_USER_MEMORY
Architectures: all
Type: vm ioctl
Parameters: struct kvm_userspace_memory_region (in)
Returns: 0 on success, -1 on error

struct kvm_userspace_memory_region {
	__u32 slot;
	__u32 flags;
	__u64 guest_phys_addr;
	__u64 memory_size; /* bytes */
	__u64 userspace_addr; /* start of the userspace allocated memory */
};

/* for kvm_memory_region::flags */
#define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
#define KVM_MEM_READONLY	(1UL << 1)

This ioctl allows the user to create or modify a guest physical memory
slot. When changing an existing slot, it may be moved in the guest
physical memory space, or its flags may be modified. It may not be
resized. Slots may not overlap in guest physical address space.

If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of "slot"
specify the address space which is being modified. They must be
less than the value that KVM_CHECK_EXTENSION returns for the
KVM_CAP_MULTI_ADDRESS_SPACE capability. Slots in separate address spaces
are unrelated; the restriction on overlapping slots only applies within
each address space.

Memory for the region is taken starting at the address denoted by the
field userspace_addr, which must point at user addressable memory for
the entire memory slot size. Any object may back this memory, including
anonymous memory, ordinary files, and hugetlbfs.

It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.

The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
writes to memory within the slot. See the KVM_GET_DIRTY_LOG ioctl to know
how to use it.
The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
to make a new slot read-only. In this case, writes to this memory will be
posted to userspace as KVM_EXIT_MMIO exits.

When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately. Another
example is madvise(MADV_DROP).

It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
allocation and is deprecated.


4.36 KVM_SET_TSS_ADDR

Capability: KVM_CAP_SET_TSS_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long tss_address (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a three-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).


4.37 KVM_ENABLE_CAP

Capability: KVM_CAP_ENABLE_CAP, KVM_CAP_ENABLE_CAP_VM
Architectures: x86 (only KVM_CAP_ENABLE_CAP_VM),
	       mips (only KVM_CAP_ENABLE_CAP), ppc, s390
Type: vcpu ioctl, vm ioctl (with KVM_CAP_ENABLE_CAP_VM)
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error

Not all extensions are enabled by default. Using this ioctl the application
can enable an extension, making it available to the guest.

On systems that do not support this ioctl, it always fails.
On systems that do support it, it only works for extensions that are
supported for enablement.

To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
be used.

struct kvm_enable_cap {
       /* in */
       __u32 cap;

The capability that is supposed to get enabled.

       __u32 flags;

A bitfield indicating future enhancements. Has to be 0 for now.

       __u64 args[4];

Arguments for enabling a feature. If a feature needs initial values to
function properly, this is the place to put them.

       __u8  pad[64];
};

The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
for vm-wide capabilities.

4.38 KVM_GET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, s390, arm, arm64
Type: vcpu ioctl
Parameters: struct kvm_mp_state (out)
Returns: 0 on success; -1 on error

struct kvm_mp_state {
	__u32 mp_state;
};

Returns the vcpu's current "multiprocessing state" (though also valid on
uniprocessor guests).

Possible values are:

 - KVM_MP_STATE_RUNNABLE:        the vcpu is currently running [x86,arm/arm64]
 - KVM_MP_STATE_UNINITIALIZED:   the vcpu is an application processor (AP)
                                 which has not yet received an INIT signal [x86]
 - KVM_MP_STATE_INIT_RECEIVED:   the vcpu has received an INIT signal, and is
                                 now ready for a SIPI [x86]
 - KVM_MP_STATE_HALTED:          the vcpu has executed a HLT instruction and
                                 is waiting for an interrupt [x86]
 - KVM_MP_STATE_SIPI_RECEIVED:   the vcpu has just received a SIPI (vector
                                 accessible via KVM_GET_VCPU_EVENTS) [x86]
 - KVM_MP_STATE_STOPPED:         the vcpu is stopped [s390,arm/arm64]
 - KVM_MP_STATE_CHECK_STOP:      the vcpu is in a special error state [s390]
 - KVM_MP_STATE_OPERATING:       the vcpu is operating (running or halted)
                                 [s390]
 - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
                                 [s390]

On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
these architectures.

For arm/arm64:

The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.

4.39 KVM_SET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, s390, arm, arm64
Type: vcpu ioctl
Parameters: struct kvm_mp_state (in)
Returns: 0 on success; -1 on error

Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
arguments.

On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
these architectures.

For arm/arm64:

The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
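
As a rough illustration of KVM_GET_MP_STATE and KVM_SET_MP_STATE, the sketch
below maps an mp_state value to a printable name and pauses or resumes a vcpu
using the arm/arm64 semantics described above. The helper names
mp_state_name() and vcpu_set_paused() are made up for this example and are
not part of the KVM API. vcpu_fd is assumed to be a descriptor obtained from
KVM_CREATE_VCPU, and the host is assumed to provide <linux/kvm.h>.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: printable name for an mp_state value, NULL if unknown. */
static const char *mp_state_name(__u32 mp_state)
{
	switch (mp_state) {
	case KVM_MP_STATE_RUNNABLE:      return "RUNNABLE";
	case KVM_MP_STATE_UNINITIALIZED: return "UNINITIALIZED";
	case KVM_MP_STATE_INIT_RECEIVED: return "INIT_RECEIVED";
	case KVM_MP_STATE_HALTED:        return "HALTED";
	case KVM_MP_STATE_SIPI_RECEIVED: return "SIPI_RECEIVED";
	case KVM_MP_STATE_STOPPED:       return "STOPPED";
	case KVM_MP_STATE_CHECK_STOP:    return "CHECK_STOP";
	case KVM_MP_STATE_OPERATING:     return "OPERATING";
	case KVM_MP_STATE_LOAD:          return "LOAD";
	default:                         return NULL;
	}
}

/*
 * Hypothetical helper: on arm/arm64 only STOPPED and RUNNABLE are valid,
 * so pausing is a matter of choosing between those two values.
 * Returns 0 on success, -1 with errno set on error (plain ioctl semantics).
 */
static int vcpu_set_paused(int vcpu_fd, int paused)
{
	struct kvm_mp_state st = {
		.mp_state = paused ? KVM_MP_STATE_STOPPED
				   : KVM_MP_STATE_RUNNABLE,
	};

	return ioctl(vcpu_fd, KVM_SET_MP_STATE, &st);
}
```

A caller would typically read the state back with KVM_GET_MP_STATE and log
mp_state_name(st.mp_state) to confirm the change took effect.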

4.40 KVM_SET_IDENTITY_MAP_ADDR

Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long identity (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a one-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).


4.41 KVM_SET_BOOT_CPU_ID

Capability: KVM_CAP_SET_BOOT_CPU_ID
Architectures: x86
Type: vm ioctl
Parameters: unsigned long vcpu_id
Returns: 0 on success, -1 on error

Defines which vcpu is the Bootstrap Processor (BSP). Values are the same
as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default
is vcpu 0.


4.42 KVM_GET_XSAVE

Capability: KVM_CAP_XSAVE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xsave (out)
Returns: 0 on success, -1 on error

struct kvm_xsave {
	__u32 region[1024];
};

This ioctl copies the current vcpu's xsave struct to userspace.


4.43 KVM_SET_XSAVE

Capability: KVM_CAP_XSAVE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xsave (in)
Returns: 0 on success, -1 on error

struct kvm_xsave {
	__u32 region[1024];
};

This ioctl copies userspace's xsave struct to the kernel.
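
The following sketch shows one way userspace might inspect a buffer obtained
with KVM_GET_XSAVE before handing it back with KVM_SET_XSAVE. It assumes the
region uses the standard XSAVE layout from the architecture manual, where the
XSAVE header (and its XSTATE_BV bitmap) starts at byte offset 512; that
layout is an assumption of this example, not something stated above. The
type struct xsave_image and both helpers are hypothetical names; the struct
only mirrors the documented struct kvm_xsave so the example builds without
KVM headers.

```c
#include <stdint.h>

/* Mirror of the documented struct kvm_xsave (hypothetical local name). */
struct xsave_image {
	uint32_t region[1024];
};

/* Assumption: the XSAVE header begins at byte offset 512 of the region. */
#define XSAVE_HDR_BYTE_OFFSET 512

/* Return the 64-bit XSTATE_BV bitmap stored at the start of the header. */
static uint64_t xsave_xstate_bv(const struct xsave_image *img)
{
	uint32_t lo = img->region[XSAVE_HDR_BYTE_OFFSET / 4];
	uint32_t hi = img->region[XSAVE_HDR_BYTE_OFFSET / 4 + 1];

	return ((uint64_t)hi << 32) | lo;
}

/* Non-zero if state component 'bit' (e.g. 2 for AVX) is marked present. */
static int xsave_has_component(const struct xsave_image *img, unsigned int bit)
{
	return bit < 64 && ((xsave_xstate_bv(img) >> bit) & 1);
}
```

In a real program the buffer would be a struct kvm_xsave filled by
ioctl(vcpu_fd, KVM_GET_XSAVE, &xsave) and later restored with
ioctl(vcpu_fd, KVM_SET_XSAVE, &xsave).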


4.44 KVM_GET_XCRS

Capability: KVM_CAP_XCRS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xcrs (out)
Returns: 0 on success, -1 on error

struct kvm_xcr {
	__u32 xcr;
	__u32 reserved;
	__u64 value;
};

struct kvm_xcrs {
	__u32 nr_xcrs;
	__u32 flags;
	struct kvm_xcr xcrs[KVM_MAX_XCRS];
	__u64 padding[16];
};

This ioctl copies the current vcpu's xcrs to userspace.


4.45 KVM_SET_XCRS

Capability: KVM_CAP_XCRS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xcrs (in)
Returns: 0 on success, -1 on error

struct kvm_xcr {
	__u32 xcr;
	__u32 reserved;
	__u64 value;
};

struct kvm_xcrs {
	__u32 nr_xcrs;
	__u32 flags;
	struct kvm_xcr xcrs[KVM_MAX_XCRS];
	__u64 padding[16];
};

This ioctl sets the vcpu's xcrs to the values userspace specified.


4.46 KVM_GET_SUPPORTED_CPUID

Capability: KVM_CAP_EXT_CPUID
Architectures: x86
Type: system ioctl
Parameters: struct kvm_cpuid2 (in/out)
Returns: 0 on success, -1 on error

struct kvm_cpuid2 {
	__u32 nent;
	__u32 padding;
	struct kvm_cpuid_entry2 entries[0];
};

#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX		BIT(0)
#define KVM_CPUID_FLAG_STATEFUL_FUNC		BIT(1)
#define KVM_CPUID_FLAG_STATE_READ_NEXT		BIT(2)

struct kvm_cpuid_entry2 {
	__u32 function;
	__u32 index;
	__u32 flags;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding[3];
};

This ioctl returns x86 cpuid features which are supported by both the hardware
and kvm.
Userspace can use the information returned by this ioctl to
construct cpuid information (for KVM_SET_CPUID2) that is consistent with
hardware, kernel, and userspace capabilities, and with user requirements (for
example, the user may wish to constrain cpuid to emulate older hardware,
or for feature consistency across a cluster).

Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
with the 'nent' field indicating the number of entries in the variable-size
array 'entries'. If the number of entries is too low to describe the cpu
capabilities, an error (E2BIG) is returned. If the number is too high,
the 'nent' field is adjusted and an error (ENOMEM) is returned. If the
number is just right, the 'nent' field is adjusted to the number of valid
entries in the 'entries' array, which is then filled.

The entries returned are the host cpuid as returned by the cpuid instruction,
with unknown or unsupported features masked out. Some features (for example,
x2apic) may not be present in the host cpu, but are exposed by kvm if it can
emulate them efficiently.
The fields in each entry are defined as follows:

  function: the eax value used to obtain the entry
  index: the ecx value used to obtain the entry (for entries that are
         affected by ecx)
  flags: an OR of zero or more of the following:
        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid
        KVM_CPUID_FLAG_STATEFUL_FUNC:
           if cpuid for this function returns different values for successive
           invocations; there will be several entries with the same function,
           all with this flag set
        KVM_CPUID_FLAG_STATE_READ_NEXT:
           for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
           the first entry to be read by a cpu
  eax, ebx, ecx, edx: the values returned by the cpuid instruction for
         this function/index combination

The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
support. Instead it is reported via

  ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)

If that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
feature in userspace, then you can enable the feature for KVM_SET_CPUID2.


4.47 KVM_PPC_GET_PVINFO

Capability: KVM_CAP_PPC_GET_PVINFO
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_ppc_pvinfo (out)
Returns: 0 on success, !0 on error

struct kvm_ppc_pvinfo {
	__u32 flags;
	__u32 hcall[4];
	__u8  pad[108];
};

This ioctl fetches PV specific information that needs to be passed to the guest
using the device tree or other means from vm context.

The hcall array defines 4 instructions that make up a hypercall.

If any additional field gets added to this structure later on, a bit for that
additional piece of information will be set in the flags bitmap.

The flags bitmap is defined as:

   /* the host supports the ePAPR idle hcall */
   #define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)

4.48 KVM_ASSIGN_PCI_DEVICE (deprecated)

Capability: none
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Assigns a host PCI device to the VM.

struct kvm_assigned_pci_dev {
	__u32 assigned_dev_id;
	__u32 busnr;
	__u32 devfn;
	__u32 flags;
	__u32 segnr;
	union {
		__u32 reserved[11];
	};
};

The PCI device is specified by the triple segnr, busnr, and devfn.
Identification in succeeding service requests is done via assigned_dev_id. The
following flags are specified:

/* Depends on KVM_CAP_IOMMU */
#define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
/* The following two depend on KVM_CAP_PCI_2_3 */
#define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
#define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)

If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts
via the PCI-2.3-compliant device-level mask, thus enabling IRQ sharing with
other assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.

The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
isolation of the device. Usages not specifying this flag are deprecated.

Only PCI header type 0 devices with PCI BAR resources are supported by
device assignment. The user requesting this ioctl must have read/write
access to the PCI sysfs resource files associated with the device.

Errors:
  ENOTTY: kernel does not support this ioctl

  Other error conditions may be defined by individual device types or
  have their standard meanings.


4.49 KVM_DEASSIGN_PCI_DEVICE (deprecated)

Capability: none
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Ends PCI device assignment, releasing all associated resources.

See KVM_ASSIGN_PCI_DEVICE for the data structure. Only assigned_dev_id is
used in kvm_assigned_pci_dev to identify the device.

Errors:
  ENOTTY: kernel does not support this ioctl

  Other error conditions may be defined by individual device types or
  have their standard meanings.

4.50 KVM_ASSIGN_DEV_IRQ (deprecated)

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_irq (in)
Returns: 0 on success, -1 on error

Assigns an IRQ to a passed-through device.

struct kvm_assigned_irq {
	__u32 assigned_dev_id;
	__u32 host_irq; /* ignored (legacy field) */
	__u32 guest_irq;
	__u32 flags;
	union {
		__u32 reserved[12];
	};
};

The following flags are defined:

#define KVM_DEV_IRQ_HOST_INTX    (1 << 0)
#define KVM_DEV_IRQ_HOST_MSI     (1 << 1)
#define KVM_DEV_IRQ_HOST_MSIX    (1 << 2)

#define KVM_DEV_IRQ_GUEST_INTX   (1 << 8)
#define KVM_DEV_IRQ_GUEST_MSI    (1 << 9)
#define KVM_DEV_IRQ_GUEST_MSIX   (1 << 10)

It is not valid to specify multiple types per host or guest IRQ. However, the
IRQ type of host and guest can differ or can even be null.

Errors:
  ENOTTY: kernel does not support this ioctl

  Other error conditions may be defined by individual device types or
  have their standard meanings.


4.51 KVM_DEASSIGN_DEV_IRQ (deprecated)

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_irq (in)
Returns: 0 on success, -1 on error

Ends an IRQ assignment to a passed-through device.

See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
by assigned_dev_id, flags must correspond to the IRQ type specified on
KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or guest IRQ is allowed.


4.52 KVM_SET_GSI_ROUTING

Capability: KVM_CAP_IRQ_ROUTING
Architectures: x86 s390
Type: vm ioctl
Parameters: struct kvm_irq_routing (in)
Returns: 0 on success, -1 on error

Sets the GSI routing table entries, overwriting any previously set entries.

struct kvm_irq_routing {
	__u32 nr;
	__u32 flags;
	struct kvm_irq_routing_entry entries[0];
};

No flags are specified so far, the corresponding field must be set to zero.

struct kvm_irq_routing_entry {
	__u32 gsi;
	__u32 type;
	__u32 flags;
	__u32 pad;
	union {
		struct kvm_irq_routing_irqchip irqchip;
		struct kvm_irq_routing_msi msi;
		struct kvm_irq_routing_s390_adapter adapter;
		__u32 pad[8];
	} u;
};

/* gsi routing entry types */
#define KVM_IRQ_ROUTING_IRQCHIP 1
#define KVM_IRQ_ROUTING_MSI 2
#define KVM_IRQ_ROUTING_S390_ADAPTER 3

No flags are specified so far, the corresponding field must be set to zero.

struct kvm_irq_routing_irqchip {
	__u32 irqchip;
	__u32 pin;
};

struct kvm_irq_routing_msi {
	__u32 address_lo;
	__u32 address_hi;
	__u32 data;
	__u32 pad;
};

struct kvm_irq_routing_s390_adapter {
	__u64 ind_addr;
	__u64 summary_addr;
	__u64 ind_offset;
	__u32 summary_offset;
	__u32 adapter_id;
};


4.53 KVM_ASSIGN_SET_MSIX_NR (deprecated)

Capability: none
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_msix_nr (in)
Returns: 0 on success, -1 on error

Set the number of MSI-X interrupts for an assigned device.
The number is reset again by terminating the MSI-X assignment of the device via
KVM_DEASSIGN_DEV_IRQ. Calling this service more than once at any earlier
point will fail.

struct kvm_assigned_msix_nr {
	__u32 assigned_dev_id;
	__u16 entry_nr;
	__u16 padding;
};

#define KVM_MAX_MSIX_PER_DEV		256


4.54 KVM_ASSIGN_SET_MSIX_ENTRY (deprecated)

Capability: none
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_msix_entry (in)
Returns: 0 on success, -1 on error

Specifies the routing of an MSI-X assigned device interrupt to a GSI. Setting
the GSI vector to zero means disabling the interrupt.

struct kvm_assigned_msix_entry {
	__u32 assigned_dev_id;
	__u32 gsi;
	__u16 entry; /* The index of entry in the MSI-X table */
	__u16 padding[3];
};

Errors:
  ENOTTY: kernel does not support this ioctl

  Other error conditions may be defined by individual device types or
  have their standard meanings.


4.55 KVM_SET_TSC_KHZ

Capability: KVM_CAP_TSC_CONTROL
Architectures: x86
Type: vcpu ioctl
Parameters: virtual tsc_khz
Returns: 0 on success, -1 on error

Specifies the tsc frequency for the virtual machine. The unit of the
frequency is kHz.


4.56 KVM_GET_TSC_KHZ

Capability: KVM_CAP_GET_TSC_KHZ
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: virtual tsc-khz on success, negative value on error

Returns the tsc frequency of the guest. The unit of the return value is
kHz. If the host has an unstable tsc, this ioctl returns -EIO as an error
instead.
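
Because the frequency is reported in kHz, it equals the number of tsc ticks
per millisecond. The sketch below, offered only as an illustration, queries
the guest frequency and converts a tick count to nanoseconds. The helper
names guest_tsc_khz() and tsc_ticks_to_ns() are invented for this example;
vcpu_fd is assumed to be a vcpu file descriptor, and since the C library
ioctl() wrapper reports failures as -1 with errno set, the first helper
converts that back into a negative error number.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: guest tsc frequency in kHz, or a negative errno. */
static long guest_tsc_khz(int vcpu_fd)
{
	int khz = ioctl(vcpu_fd, KVM_GET_TSC_KHZ, 0);

	return khz < 0 ? -(long)errno : khz;
}

/*
 * Hypothetical helper: convert tsc ticks to nanoseconds.
 * tsc_khz must be non-zero. The division is split into whole
 * milliseconds plus a remainder so the intermediate product cannot
 * overflow 64 bits for any realistic input.
 */
static uint64_t tsc_ticks_to_ns(uint64_t ticks, uint32_t tsc_khz)
{
	uint64_t whole_ms = ticks / tsc_khz;	/* kHz == ticks per ms */
	uint64_t rem_ticks = ticks % tsc_khz;

	return whole_ms * 1000000u + rem_ticks * 1000000u / tsc_khz;
}
```

For example, with a 2.5 GHz guest tsc (tsc_khz of 2500000), 2500000 ticks
correspond to one millisecond.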


4.57 KVM_GET_LAPIC

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_lapic_state (out)
Returns: 0 on success, -1 on error

#define KVM_APIC_REG_SIZE 0x400
struct kvm_lapic_state {
	char regs[KVM_APIC_REG_SIZE];
};

Reads the Local APIC registers and copies them into the input argument. The
data format and layout are the same as documented in the architecture manual.


4.58 KVM_SET_LAPIC

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_lapic_state (in)
Returns: 0 on success, -1 on error

#define KVM_APIC_REG_SIZE 0x400
struct kvm_lapic_state {
	char regs[KVM_APIC_REG_SIZE];
};

Copies the input argument into the Local APIC registers. The data format
and layout are the same as documented in the architecture manual.


4.59 KVM_IOEVENTFD

Capability: KVM_CAP_IOEVENTFD
Architectures: all
Type: vm ioctl
Parameters: struct kvm_ioeventfd (in)
Returns: 0 on success, !0 on error

This ioctl attaches or detaches an ioeventfd to a legal pio/mmio address
within the guest. A guest write in the registered address will signal the
provided event instead of triggering an exit.

struct kvm_ioeventfd {
	__u64 datamatch;
	__u64 addr;        /* legal pio/mmio address */
	__u32 len;         /* 0, 1, 2, 4, or 8 bytes */
	__s32 fd;
	__u32 flags;
	__u8  pad[36];
};

For the special case of virtio-ccw devices on s390, the ioevent is matched
to a subchannel/virtqueue tuple instead.

The following flags are defined:

#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
#define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
#define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)
#define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
	(1 << kvm_ioeventfd_flag_nr_virtio_ccw_notify)

If the datamatch flag is set, the event will be signaled only if the value
written to the registered address is equal to datamatch in struct kvm_ioeventfd.

For virtio-ccw devices, addr contains the subchannel id and datamatch the
virtqueue index.

With KVM_CAP_IOEVENTFD_ANY_LENGTH, a zero length ioeventfd is allowed, and
the kernel will ignore the length of the guest write and may get a faster
vmexit. The speedup may only apply to specific architectures, but the
ioeventfd will work anyway.

4.60 KVM_DIRTY_TLB

Capability: KVM_CAP_SW_TLB
Architectures: ppc
Type: vcpu ioctl
Parameters: struct kvm_dirty_tlb (in)
Returns: 0 on success, -1 on error

struct kvm_dirty_tlb {
	__u64 bitmap;
	__u32 num_dirty;
};

This must be called whenever userspace has changed an entry in the shared
TLB, prior to calling KVM_RUN on the associated vcpu.

The "bitmap" field is the userspace address of an array. This array
consists of a number of bits, equal to the total number of TLB entries as
determined by the last successful call to KVM_CONFIG_TLB, rounded up to the
nearest multiple of 64.

Each bit corresponds to one TLB entry, ordered the same as in the shared TLB
array.

The array is little-endian: bit 0 is the least significant bit of the
first byte, bit 8 is the least significant bit of the second byte, etc.
This avoids any complications with differing word sizes.
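
The bit layout described above can be sketched with byte-wise helpers, which
sidestep word-size and host-endianness questions entirely. This is only an
illustration; the function names are invented here and nothing below is part
of the KVM API. The set-bit count is included because struct kvm_dirty_tlb
also carries a num_dirty field.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed: one bit per TLB entry, rounded up to a multiple of 64 bits. */
static size_t dirty_bitmap_bytes(uint32_t num_tlb_entries)
{
	return (((size_t)num_tlb_entries + 63) / 64) * 8;
}

/* Mark TLB entry 'entry' dirty: bit (entry % 8) of byte (entry / 8). */
static void dirty_bitmap_set(uint8_t *bitmap, uint32_t entry)
{
	bitmap[entry / 8] |= (uint8_t)(1u << (entry % 8));
}

/* Count the set bits, i.e. the number of entries marked dirty. */
static uint32_t dirty_bitmap_count(const uint8_t *bitmap, size_t bytes)
{
	uint32_t n = 0;

	for (size_t i = 0; i < bytes; i++)
		for (uint8_t b = bitmap[i]; b; b &= (uint8_t)(b - 1))
			n++;	/* each iteration clears the lowest set bit */
	return n;
}
```

A caller would place the address of the byte array in the bitmap field and
the count in num_dirty before issuing KVM_DIRTY_TLB.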

The "num_dirty" field is a performance hint for KVM to determine whether it
should skip processing the bitmap and just invalidate everything. It must
be set to the number of set bits in the bitmap.


4.61 KVM_ASSIGN_SET_INTX_MASK (deprecated)

Capability: KVM_CAP_PCI_2_3
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Allows userspace to mask PCI INTx interrupts from the assigned device. The
kernel will not deliver INTx interrupts to the guest between setting and
clearing of KVM_ASSIGN_SET_INTX_MASK via this interface. This enables use of
and emulation of PCI 2.3 INTx disable command register behavior.

This may be used for both PCI 2.3 devices supporting INTx disable natively and
older devices lacking this support. Userspace is responsible for emulating the
read value of the INTx disable bit in the guest visible PCI command register.
When modifying the INTx disable state, userspace should precede updating the
physical device command register by calling this ioctl to inform the kernel of
the new intended INTx mask state.

Note that the kernel uses the device INTx disable bit to internally manage the
device interrupt state for PCI 2.3 devices. Reads of this register may
therefore not match the expected value. Writes should always use the guest
intended INTx disable value rather than attempting to read-copy-update the
current physical device state. Races between user and kernel updates to the
INTx disable bit are handled lazily in the kernel. It's possible the device
may generate unintended interrupts, but they will not be injected into the
guest.

See KVM_ASSIGN_PCI_DEVICE for the data structure. The target device is
specified by assigned_dev_id. In the flags field, only
KVM_DEV_ASSIGN_MASK_INTX is evaluated.


4.62 KVM_CREATE_SPAPR_TCE

Capability: KVM_CAP_SPAPR_TCE
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_create_spapr_tce (in)
Returns: file descriptor for manipulating the created TCE table

This creates a virtual TCE (translation control entry) table, which
is an IOMMU for PAPR-style virtual I/O. It is used to translate
logical addresses used in virtual I/O into guest physical addresses,
and provides a scatter/gather capability for PAPR virtual I/O.

/* for KVM_CAP_SPAPR_TCE */
struct kvm_create_spapr_tce {
	__u64 liobn;
	__u32 window_size;
};

The liobn field gives the logical IO bus number for which to create a
TCE table. The window_size field specifies the size of the DMA window
which this TCE table will translate; the table will contain one 64-bit
TCE entry for every 4KiB of the DMA window.

When the guest issues an H_PUT_TCE hcall on a liobn for which a TCE
table has been created using this ioctl(), the kernel will handle it
in real mode, updating the TCE table. H_PUT_TCE calls for other
liobns will cause a vm exit and must be handled by userspace.

The return value is a file descriptor which can be passed to mmap(2)
to map the created TCE table into userspace. This lets userspace read
the entries written by kernel-handled H_PUT_TCE calls, and also lets
userspace update the TCE table directly which is useful in some
circumstances.


4.63 KVM_ALLOCATE_RMA

Capability: KVM_CAP_PPC_RMA
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_allocate_rma (out)
Returns: file descriptor for mapping the allocated RMA

This allocates a Real Mode Area (RMA) from the pool allocated at boot
time by the kernel.
An RMA is a physically-contiguous, aligned region
of memory used on older POWER processors to provide the memory which
will be accessed by real-mode (MMU off) accesses in a KVM guest.
POWER processors support a set of sizes for the RMA that usually
includes 64MB, 128MB, 256MB and some larger powers of two.

/* for KVM_ALLOCATE_RMA */
struct kvm_allocate_rma {
	__u64 rma_size;
};

The return value is a file descriptor which can be passed to mmap(2)
to map the allocated RMA into userspace. The mapped area can then be
passed to the KVM_SET_USER_MEMORY_REGION ioctl to establish it as the
RMA for a virtual machine. The size of the RMA in bytes (which is
fixed at host kernel boot time) is returned in the rma_size field of
the argument structure.

The KVM_CAP_PPC_RMA capability is 1 or 2 if the KVM_ALLOCATE_RMA ioctl
is supported; 2 if the processor requires all virtual machines to have
an RMA, or 1 if the processor can use an RMA but doesn't require it,
because it supports the Virtual RMA (VRMA) facility.


4.64 KVM_NMI

Capability: KVM_CAP_USER_NMI
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error

Queues an NMI on the thread's vcpu. Note this is well defined only
when KVM_CREATE_IRQCHIP has not been called, since this is an interface
between the virtual cpu core and virtual local APIC. After KVM_CREATE_IRQCHIP
has been called, this interface is completely emulated within the kernel.

To use this to emulate the LINT1 input with KVM_CREATE_IRQCHIP, use the
following algorithm:

  - pause the vcpu
  - read the local APIC's state (KVM_GET_LAPIC)
  - check whether changing LINT1 will queue an NMI (see the LVT entry for LINT1)
  - if so, issue KVM_NMI
  - resume the vcpu

Some guests configure the LINT1 NMI input to cause a panic, aiding in
debugging.
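
The "check whether changing LINT1 will queue an NMI" step of the algorithm
above might look like the following sketch. It relies on the register layout
from the architecture manual (LVT LINT1 at offset 0x360, mask in bit 16,
delivery mode in bits 10:8 with the value 100b meaning NMI) and decodes the
register byte by byte as a little-endian 32-bit value. The macro and function
names are invented for this example. It also leaves out trigger-mode and
pin-polarity handling, which a complete emulation would need to consider.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants taken from the local APIC register layout. */
#define LINT1_LVT_OFFSET	0x360		/* LVT LINT1 register */
#define LVT_MASKED		(1u << 16)	/* interrupt is masked */
#define LVT_DELIVERY_MODE_MASK	(7u << 8)	/* bits 10:8 */
#define LVT_DELIVERY_MODE_NMI	(4u << 8)	/* 100b = NMI */

/* Read a 32-bit little-endian register from the KVM_GET_LAPIC image. */
static uint32_t lapic_reg(const unsigned char *regs, unsigned int offset)
{
	return (uint32_t)regs[offset] |
	       (uint32_t)regs[offset + 1] << 8 |
	       (uint32_t)regs[offset + 2] << 16 |
	       (uint32_t)regs[offset + 3] << 24;
}

/* True if LINT1 is unmasked and programmed to deliver an NMI. */
static bool lint1_delivers_nmi(const unsigned char *regs)
{
	uint32_t lvt = lapic_reg(regs, LINT1_LVT_OFFSET);

	if (lvt & LVT_MASKED)
		return false;
	return (lvt & LVT_DELIVERY_MODE_MASK) == LVT_DELIVERY_MODE_NMI;
}
```

With the vcpu paused, userspace would pass the regs array of the struct
kvm_lapic_state filled by KVM_GET_LAPIC to lint1_delivers_nmi() and issue
KVM_NMI on the vcpu fd only when it returns true.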


4.65 KVM_S390_UCAS_MAP

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_ucas_mapping (in)
Returns: 0 in case of success

The parameter is defined like this:
	struct kvm_s390_ucas_mapping {
		__u64 user_addr;
		__u64 vcpu_addr;
		__u64 length;
	};

This ioctl maps the memory at "user_addr" with the length "length" to
the vcpu's address space starting at "vcpu_addr". All parameters need to
be aligned by 1 megabyte.


4.66 KVM_S390_UCAS_UNMAP

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_ucas_mapping (in)
Returns: 0 in case of success

The parameter is defined like this:
	struct kvm_s390_ucas_mapping {
		__u64 user_addr;
		__u64 vcpu_addr;
		__u64 length;
	};

This ioctl unmaps the memory in the vcpu's address space starting at
"vcpu_addr" with the length "length". The field "user_addr" is ignored.
All parameters need to be aligned by 1 megabyte.


4.67 KVM_S390_VCPU_FAULT

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: vcpu absolute address (in)
Returns: 0 in case of success

This call creates a page table entry on the virtual cpu's address space
(for user controlled virtual machines) or the virtual machine's address
space (for regular virtual machines). This only works for minor faults,
thus it's recommended to access subject memory page via the user page
table upfront. This is useful to handle validity intercepts for user
controlled virtual machines to fault in the virtual cpu's lowcore pages
prior to calling the KVM_RUN ioctl.
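
Since KVM_S390_UCAS_MAP and KVM_S390_UCAS_UNMAP both require every field to
be aligned to 1 megabyte, userspace may want to validate a mapping before
issuing the ioctl. The sketch below is one possible way to do that. The type
struct ucas_mapping is a local mirror of the documented
struct kvm_s390_ucas_mapping so that the example builds on any host, and
ucas_mapping_init() is a hypothetical helper, not a KVM interface.

```c
#include <errno.h>
#include <stdint.h>

#define UCAS_ALIGNMENT (1ULL << 20)	/* 1 megabyte, as required above */

/* Local mirror of the documented struct kvm_s390_ucas_mapping. */
struct ucas_mapping {
	uint64_t user_addr;
	uint64_t vcpu_addr;
	uint64_t length;
};

/*
 * Hypothetical helper: fill in a mapping after checking that all three
 * values are 1 megabyte aligned. Returns 0 on success or -EINVAL if any
 * value is misaligned, in which case *m is left untouched.
 */
static int ucas_mapping_init(struct ucas_mapping *m, uint64_t user_addr,
			     uint64_t vcpu_addr, uint64_t length)
{
	if ((user_addr | vcpu_addr | length) & (UCAS_ALIGNMENT - 1))
		return -EINVAL;

	m->user_addr = user_addr;
	m->vcpu_addr = vcpu_addr;
	m->length = length;
	return 0;
}
```

On an s390 host the same values would be placed in a
struct kvm_s390_ucas_mapping and passed to the vcpu fd with
KVM_S390_UCAS_MAP (or KVM_S390_UCAS_UNMAP, which ignores user_addr).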


4.68 KVM_SET_ONE_REG

Capability: KVM_CAP_ONE_REG
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in)
Returns: 0 on success, negative value on failure

struct kvm_one_reg {
	__u64 id;
	__u64 addr;
};

Using this ioctl, a single vcpu register can be set to a specific value
defined by user space with the passed in struct kvm_one_reg, where id
refers to the register identifier as described below and addr is a pointer
to a variable with the respective size. There can be architecture agnostic
and architecture specific registers. Each has its own range of operation
and its own constants and width. To keep track of the implemented
registers, find a list below:

  Arch  | Register                    | Width (bits)
        |                             |
  PPC   | KVM_REG_PPC_HIOR            | 64
  PPC   | KVM_REG_PPC_IAC1            | 64
  PPC   | KVM_REG_PPC_IAC2            | 64
  PPC   | KVM_REG_PPC_IAC3            | 64
  PPC   | KVM_REG_PPC_IAC4            | 64
  PPC   | KVM_REG_PPC_DAC1            | 64
  PPC   | KVM_REG_PPC_DAC2            | 64
  PPC   | KVM_REG_PPC_DABR            | 64
  PPC   | KVM_REG_PPC_DSCR            | 64
  PPC   | KVM_REG_PPC_PURR            | 64
  PPC   | KVM_REG_PPC_SPURR           | 64
  PPC   | KVM_REG_PPC_DAR             | 64
  PPC   | KVM_REG_PPC_DSISR           | 32
  PPC   | KVM_REG_PPC_AMR             | 64
  PPC   | KVM_REG_PPC_UAMOR           | 64
  PPC   | KVM_REG_PPC_MMCR0           | 64
  PPC   | KVM_REG_PPC_MMCR1           | 64
  PPC   | KVM_REG_PPC_MMCRA           | 64
  PPC   | KVM_REG_PPC_MMCR2           | 64
  PPC   | KVM_REG_PPC_MMCRS           | 64
  PPC   | KVM_REG_PPC_SIAR            | 64
  PPC   | KVM_REG_PPC_SDAR            | 64
  PPC   | KVM_REG_PPC_SIER            | 64
  PPC   | KVM_REG_PPC_PMC1            | 32
  PPC   | KVM_REG_PPC_PMC2            | 32
  PPC   | KVM_REG_PPC_PMC3            | 32
  PPC   | KVM_REG_PPC_PMC4            | 32
  PPC   | KVM_REG_PPC_PMC5            | 32
  PPC   | KVM_REG_PPC_PMC6            | 32
  PPC   | KVM_REG_PPC_PMC7            | 32
  PPC   | KVM_REG_PPC_PMC8            | 32
  PPC   | KVM_REG_PPC_FPR0            | 64
          ...
  PPC   | KVM_REG_PPC_FPR31           | 64
  PPC   | KVM_REG_PPC_VR0             | 128
          ...
  PPC   | KVM_REG_PPC_VR31            | 128
  PPC   | KVM_REG_PPC_VSR0            | 128
          ...
  PPC   | KVM_REG_PPC_VSR31           | 128
  PPC   | KVM_REG_PPC_FPSCR           | 64
  PPC   | KVM_REG_PPC_VSCR            | 32
  PPC   | KVM_REG_PPC_VPA_ADDR        | 64
  PPC   | KVM_REG_PPC_VPA_SLB         | 128
  PPC   | KVM_REG_PPC_VPA_DTL         | 128
  PPC   | KVM_REG_PPC_EPCR            | 32
  PPC   | KVM_REG_PPC_EPR             | 32
  PPC   | KVM_REG_PPC_TCR             | 32
  PPC   | KVM_REG_PPC_TSR             | 32
  PPC   | KVM_REG_PPC_OR_TSR          | 32
  PPC   | KVM_REG_PPC_CLEAR_TSR       | 32
  PPC   | KVM_REG_PPC_MAS0            | 32
  PPC   | KVM_REG_PPC_MAS1            | 32
  PPC   | KVM_REG_PPC_MAS2            | 64
  PPC   | KVM_REG_PPC_MAS7_3          | 64
  PPC   | KVM_REG_PPC_MAS4            | 32
  PPC   | KVM_REG_PPC_MAS6            | 32
  PPC   | KVM_REG_PPC_MMUCFG          | 32
  PPC   | KVM_REG_PPC_TLB0CFG         | 32
  PPC   | KVM_REG_PPC_TLB1CFG         | 32
  PPC   | KVM_REG_PPC_TLB2CFG         | 32
  PPC   | KVM_REG_PPC_TLB3CFG         | 32
  PPC   | KVM_REG_PPC_TLB0PS          | 32
  PPC   | KVM_REG_PPC_TLB1PS          | 32
  PPC   | KVM_REG_PPC_TLB2PS          | 32
  PPC   | KVM_REG_PPC_TLB3PS          | 32
  PPC   | KVM_REG_PPC_EPTCFG          | 32
  PPC   | KVM_REG_PPC_ICP_STATE       | 64
  PPC   | KVM_REG_PPC_TB_OFFSET       | 64
  PPC   | KVM_REG_PPC_SPMC1           | 32
  PPC   | KVM_REG_PPC_SPMC2           | 32
  PPC   | KVM_REG_PPC_IAMR            | 64
  PPC   | KVM_REG_PPC_TFHAR           | 64
  PPC   | KVM_REG_PPC_TFIAR           | 64
  PPC   | KVM_REG_PPC_TEXASR          | 64
  PPC   | KVM_REG_PPC_FSCR            | 64
  PPC   | KVM_REG_PPC_PSPB            | 32
  PPC   | KVM_REG_PPC_EBBHR           | 64
  PPC   | KVM_REG_PPC_EBBRR           | 64
  PPC   | KVM_REG_PPC_BESCR           | 64
  PPC   | KVM_REG_PPC_TAR             | 64
  PPC   | KVM_REG_PPC_DPDES           | 64
  PPC   | KVM_REG_PPC_DAWR            | 64
  PPC   | KVM_REG_PPC_DAWRX           | 64
  PPC   | KVM_REG_PPC_CIABR           | 64
  PPC   | KVM_REG_PPC_IC              | 64
  PPC   | KVM_REG_PPC_VTB             | 64
  PPC   | KVM_REG_PPC_CSIGR           | 64
  PPC   | KVM_REG_PPC_TACR            | 64
  PPC   | KVM_REG_PPC_TCSCR           | 64
  PPC   | KVM_REG_PPC_PID             | 64
  PPC   | KVM_REG_PPC_ACOP            | 64
  PPC   | KVM_REG_PPC_VRSAVE          | 32
  PPC   | KVM_REG_PPC_LPCR            | 32
  PPC   | KVM_REG_PPC_LPCR_64         | 64
  PPC   | KVM_REG_PPC_PPR             | 64
  PPC   | KVM_REG_PPC_ARCH_COMPAT     | 32
  PPC   | KVM_REG_PPC_DABRX           | 32
  PPC   | KVM_REG_PPC_WORT            | 64
  PPC   | KVM_REG_PPC_SPRG9           | 64
  PPC   | KVM_REG_PPC_DBSR            | 32
  PPC   | KVM_REG_PPC_TM_GPR0         | 64
          ...
  PPC   | KVM_REG_PPC_TM_GPR31        | 64
  PPC   | KVM_REG_PPC_TM_VSR0         | 128
          ...
  PPC   | KVM_REG_PPC_TM_VSR63        | 128
  PPC   | KVM_REG_PPC_TM_CR           | 64
  PPC   | KVM_REG_PPC_TM_LR           | 64
  PPC   | KVM_REG_PPC_TM_CTR          | 64
  PPC   | KVM_REG_PPC_TM_FPSCR        | 64
  PPC   | KVM_REG_PPC_TM_AMR          | 64
  PPC   | KVM_REG_PPC_TM_PPR          | 64
  PPC   | KVM_REG_PPC_TM_VRSAVE       | 64
  PPC   | KVM_REG_PPC_TM_VSCR         | 32
  PPC   | KVM_REG_PPC_TM_DSCR         | 64
  PPC   | KVM_REG_PPC_TM_TAR          | 64
  PPC   | KVM_REG_PPC_TM_XER          | 64
        |                             |
  MIPS  | KVM_REG_MIPS_R0             | 64
          ...
  MIPS  | KVM_REG_MIPS_R31            | 64
  MIPS  | KVM_REG_MIPS_HI             | 64
  MIPS  | KVM_REG_MIPS_LO             | 64
  MIPS  | KVM_REG_MIPS_PC             | 64
  MIPS  | KVM_REG_MIPS_CP0_INDEX      | 32
  MIPS  | KVM_REG_MIPS_CP0_CONTEXT    | 64
  MIPS  | KVM_REG_MIPS_CP0_USERLOCAL  | 64
  MIPS  | KVM_REG_MIPS_CP0_PAGEMASK   | 32
  MIPS  | KVM_REG_MIPS_CP0_WIRED      | 32
  MIPS  | KVM_REG_MIPS_CP0_HWRENA     | 32
  MIPS  | KVM_REG_MIPS_CP0_BADVADDR   | 64
  MIPS  | KVM_REG_MIPS_CP0_COUNT      | 32
  MIPS  | KVM_REG_MIPS_CP0_ENTRYHI    | 64
  MIPS  | KVM_REG_MIPS_CP0_COMPARE    | 32
  MIPS  | KVM_REG_MIPS_CP0_STATUS     | 32
  MIPS  | KVM_REG_MIPS_CP0_CAUSE      | 32
  MIPS  | KVM_REG_MIPS_CP0_EPC        | 64
  MIPS  | KVM_REG_MIPS_CP0_PRID       | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG     | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG1    | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG2    | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG3    | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG4    | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG5    | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG7    | 32
  MIPS  | KVM_REG_MIPS_CP0_ERROREPC   | 64
  MIPS  | KVM_REG_MIPS_COUNT_CTL      | 64
  MIPS  | KVM_REG_MIPS_COUNT_RESUME   | 64
  MIPS  | KVM_REG_MIPS_COUNT_HZ       | 64
  MIPS  | KVM_REG_MIPS_FPR_32(0..31)  | 32
  MIPS  | KVM_REG_MIPS_FPR_64(0..31)  | 64
  MIPS  | KVM_REG_MIPS_VEC_128(0..31) | 128
  MIPS  | KVM_REG_MIPS_FCR_IR         | 32
  MIPS  | KVM_REG_MIPS_FCR_CSR        | 32
  MIPS  | KVM_REG_MIPS_MSA_IR         | 32
  MIPS  | KVM_REG_MIPS_MSA_CSR        | 32

ARM registers are mapped using the lower 32 bits. The upper 16 of that
is the register group type, or coprocessor number:

ARM core registers have the following id bit patterns:
  0x4020 0000 0010 <index into the kvm_regs struct:16>

ARM 32-bit CP15 registers have the following id bit patterns:
  0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>

ARM 64-bit CP15 registers have the following id bit patterns:
  0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>

ARM CCSIDR registers are demultiplexed by CSSELR value:
  0x4020 0000 0011 00 <csselr:8>

ARM 32-bit VFP control registers have the following id bit patterns:
  0x4020 0000 0012 1 <regno:12>

ARM 64-bit FP registers have the following id bit patterns:
  0x4030 0000 0012 0 <regno:12>


arm64 registers are mapped using the lower 32 bits. The upper 16 of
that is the register group type, or coprocessor number:

arm64 core/FP-SIMD registers have the following id bit patterns. Note
that the size of the access is variable, as the kvm_regs structure
contains elements ranging from 32 to 128 bits. The index is a 32bit
value in the kvm_regs structure seen as a 32bit array.
  0x60x0 0000 0010 <index into the kvm_regs struct:16>

arm64 CCSIDR registers are demultiplexed by CSSELR value:
  0x6020 0000 0011 00 <csselr:8>

arm64 system registers have the following id bit patterns:
  0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>


MIPS registers are mapped using the lower 32 bits.
The upper 16 of that is
the register group type:

MIPS core registers (see above) have the following id bit patterns:
  0x7030 0000 0000 <reg:16>

MIPS CP0 registers (see KVM_REG_MIPS_CP0_* above) have the following id bit
patterns depending on whether they're 32-bit or 64-bit registers:
  0x7020 0000 0001 00 <reg:5> <sel:3>   (32-bit)
  0x7030 0000 0001 00 <reg:5> <sel:3>   (64-bit)

MIPS KVM control registers (see above) have the following id bit patterns:
  0x7030 0000 0002 <reg:16>

MIPS FPU registers (see KVM_REG_MIPS_FPR_{32,64}() above) have the following
id bit patterns depending on the size of the register being accessed. They are
always accessed according to the current guest FPU mode (Status.FR and
Config5.FRE), i.e. as the guest would see them, and they become unpredictable
if the guest FPU mode is changed. MIPS SIMD Architecture (MSA) vector
registers (see KVM_REG_MIPS_VEC_128() above) have similar patterns as they
overlap the FPU registers:
  0x7020 0000 0003 00 <0:3> <reg:5> (32-bit FPU registers)
  0x7030 0000 0003 00 <0:3> <reg:5> (64-bit FPU registers)
  0x7040 0000 0003 00 <0:3> <reg:5> (128-bit MSA vector registers)

MIPS FPU control registers (see KVM_REG_MIPS_FCR_{IR,CSR} above) have the
following id bit patterns:
  0x7020 0000 0003 01 <0:3> <reg:5>

MIPS MSA control registers (see KVM_REG_MIPS_MSA_{IR,CSR} above) have the
following id bit patterns:
  0x7020 0000 0003 02 <0:3> <reg:5>


4.69 KVM_GET_ONE_REG

Capability: KVM_CAP_ONE_REG
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in and out)
Returns: 0 on success, negative value on failure

This ioctl reads the value of a single register implemented
in a vcpu. The register to read is indicated by the "id" field of the
kvm_one_reg struct passed in.
On success, the register value can be found
at the memory location pointed to by "addr".

The list of registers accessible using this interface is identical to the
list in 4.68.


4.70 KVM_KVMCLOCK_CTRL

Capability: KVM_CAP_KVMCLOCK_CTRL
Architectures: Any that implement pvclocks (currently x86 only)
Type: vcpu ioctl
Parameters: None
Returns: 0 on success, -1 on error

This signals to the host kernel that the specified guest is being paused by
userspace. The host will set a flag in the pvclock structure that is checked
from the soft lockup watchdog. The flag is part of the pvclock structure that
is shared between guest and host, specifically the second bit of the flags
field of the pvclock_vcpu_time_info structure. It will be set exclusively by
the host and read/cleared exclusively by the guest. The guest operation of
checking and clearing the flag must be an atomic operation, so
load-link/store-conditional or an equivalent must be used. There are two cases
where the guest will clear the flag: when the soft lockup watchdog timer resets
itself or when a soft lockup is detected. This ioctl can be called any time
after pausing the vcpu, but before it is resumed.


4.71 KVM_SIGNAL_MSI

Capability: KVM_CAP_SIGNAL_MSI
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_msi (in)
Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error

Directly inject an MSI message. Only valid with in-kernel irqchip that handles
MSI messages.

struct kvm_msi {
	__u32 address_lo;
	__u32 address_hi;
	__u32 data;
	__u32 flags;
	__u8  pad[16];
};

No flags are defined so far. The corresponding field must be 0.
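
As a rough sketch of how userspace might drive KVM_SIGNAL_MSI, the helpers
below split a 64-bit MSI address into address_lo/address_hi, leave flags and
padding zeroed, and issue the ioctl on a VM fd. The names fill_msi and
signal_msi and the vm_fd parameter are illustrative assumptions, not part of
the API.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: build a kvm_msi from a 64-bit address and data word. */
static void fill_msi(struct kvm_msi *msi, uint64_t addr, uint32_t data)
{
	memset(msi, 0, sizeof(*msi));	/* flags and pad must be 0 */
	msi->address_lo = (uint32_t)addr;
	msi->address_hi = (uint32_t)(addr >> 32);
	msi->data = data;
}

/*
 * Hypothetical wrapper. vm_fd is assumed to come from KVM_CREATE_VM on a VM
 * with an in-kernel irqchip. Returns >0 on delivery, 0 if the guest blocked
 * the MSI, and -1 on error (errno is set).
 */
static int signal_msi(int vm_fd, uint64_t addr, uint32_t data)
{
	struct kvm_msi msi;

	fill_msi(&msi, addr, data);
	return ioctl(vm_fd, KVM_SIGNAL_MSI, &msi);
}
```

A caller would normally check KVM_CAP_SIGNAL_MSI with KVM_CHECK_EXTENSION
before relying on this path.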


4.71 KVM_CREATE_PIT2

Capability: KVM_CAP_PIT2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_config (in)
Returns: 0 on success, -1 on error

Creates an in-kernel device model for the i8254 PIT. This call is only valid
after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. The following
parameters have to be passed:

struct kvm_pit_config {
	__u32 flags;
	__u32 pad[15];
};

Valid flags are:

#define KVM_PIT_SPEAKER_DUMMY     1 /* emulate speaker port stub */

PIT timer interrupts may use a per-VM kernel thread for injection. If it
exists, this thread will have a name of the following pattern:

kvm-pit/<owner-process-pid>

When running a guest with elevated priorities, the scheduling parameters of
this thread may have to be adjusted accordingly.

This IOCTL replaces the obsolete KVM_CREATE_PIT.


4.72 KVM_GET_PIT2

Capability: KVM_CAP_PIT_STATE2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_state2 (out)
Returns: 0 on success, -1 on error

Retrieves the state of the in-kernel PIT model. Only valid after
KVM_CREATE_PIT2. The state is returned in the following structure:

struct kvm_pit_state2 {
	struct kvm_pit_channel_state channels[3];
	__u32 flags;
	__u32 reserved[9];
};

Valid flags are:

/* disable PIT in HPET legacy mode */
#define KVM_PIT_FLAGS_HPET_LEGACY  0x00000001

This IOCTL replaces the obsolete KVM_GET_PIT.


4.73 KVM_SET_PIT2

Capability: KVM_CAP_PIT_STATE2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_state2 (in)
Returns: 0 on success, -1 on error

Sets the state of the in-kernel PIT model. Only valid after KVM_CREATE_PIT2.
See KVM_GET_PIT2 for details on struct kvm_pit_state2.

This IOCTL replaces the obsolete KVM_SET_PIT.
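
The ordering constraints of the three PIT2 calls above can be sketched as
follows: create the PIT once, then read its state back. This is a minimal
illustration for an x86 build, assuming vm_fd is a VM fd on which
KVM_CREATE_IRQCHIP has already succeeded; the function name and its
speaker_stub parameter are hypothetical.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: create the in-kernel PIT and read back its state.
 * Returns 0 on success, -1 on error (errno is set).
 */
static int create_pit_and_get_state(int vm_fd, int speaker_stub,
				    struct kvm_pit_state2 *state)
{
	struct kvm_pit_config cfg;

	memset(&cfg, 0, sizeof(cfg));		/* pad must stay zeroed */
	if (speaker_stub)
		cfg.flags |= KVM_PIT_SPEAKER_DUMMY;

	if (ioctl(vm_fd, KVM_CREATE_PIT2, &cfg) < 0)
		return -1;

	/* Only valid once KVM_CREATE_PIT2 has succeeded. */
	memset(state, 0, sizeof(*state));
	return ioctl(vm_fd, KVM_GET_PIT2, state);
}
```

The returned state could later be handed back unchanged to KVM_SET_PIT2, for
example when restoring a saved VM.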


4.74 KVM_PPC_GET_SMMU_INFO

Capability: KVM_CAP_PPC_GET_SMMU_INFO
Architectures: powerpc
Type: vm ioctl
Parameters: None
Returns: 0 on success, -1 on error

This populates and returns a structure describing the features of
the "Server" class MMU emulation supported by KVM.
This can in turn be used by userspace to generate the appropriate
device-tree properties for the guest operating system.

The structure contains some global information, followed by an
array of supported segment page sizes:

      struct kvm_ppc_smmu_info {
	     __u64 flags;
	     __u32 slb_size;
	     __u32 pad;
	     struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
      };

The supported flags are:

    - KVM_PPC_PAGE_SIZES_REAL:
        When that flag is set, guest page sizes must "fit" the backing
        store page sizes. When not set, any page size in the list can
        be used regardless of how they are backed by userspace.

    - KVM_PPC_1T_SEGMENTS
        The emulated MMU supports 1T segments in addition to the
        standard 256M ones.

The "slb_size" field indicates how many SLB entries are supported.

The "sps" array contains 8 entries indicating the supported base
page sizes for a segment in increasing order. Each entry is defined
as follows:

   struct kvm_ppc_one_seg_page_size {
	__u32 page_shift;	/* Base page shift of segment (or 0) */
	__u32 slb_enc;		/* SLB encoding for BookS */
	struct kvm_ppc_one_page_size enc[KVM_PPC_PAGE_SIZES_MAX_SZ];
   };

An entry with a "page_shift" of 0 is unused. Because the array is
organized in increasing order, a lookup can stop when encountering
such an entry.

The "slb_enc" field provides the encoding to use in the SLB for the
page size. The bits are in positions such that the value can be directly
OR'ed into the "vsid" argument of the slbmte instruction.
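
The lookup rule for the "sps" array (sorted by size, terminated by an entry
with a page_shift of 0) can be sketched as below. The function name
find_seg_page_size is a hypothetical helper; only the structure and field
names come from the definitions above.

```c
#include <stddef.h>
#include <linux/kvm.h>

/*
 * Hypothetical lookup: return the sps[] entry whose base page shift equals
 * 'shift', or NULL if that segment base page size is not supported.
 */
static const struct kvm_ppc_one_seg_page_size *
find_seg_page_size(const struct kvm_ppc_smmu_info *info, unsigned int shift)
{
	size_t i;

	for (i = 0; i < KVM_PPC_PAGE_SIZES_MAX_SZ; i++) {
		if (info->sps[i].page_shift == 0)
			break;		/* unused entry terminates the list */
		if (info->sps[i].page_shift == shift)
			return &info->sps[i];
	}
	return NULL;
}
```

On a match, the entry's slb_enc value is what would be OR'ed into the slbmte
"vsid" argument.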

The "enc" array is a list which, for each segment base page size,
provides the list of supported actual page sizes (which can only be
larger than or equal to the base page size), along with the
corresponding encoding in the hash PTE. Similarly, the array is
8 entries sorted by increasing sizes and an entry with a "0" shift
is an empty entry and a terminator:

   struct kvm_ppc_one_page_size {
	__u32 page_shift;	/* Page shift (or 0) */
	__u32 pte_enc;		/* Encoding in the HPTE (>>12) */
   };

The "pte_enc" field provides a value that can be OR'ed into the hash
PTE's RPN field (i.e., it needs to be shifted left by 12 to OR it
into the hash PTE second double word).

4.75 KVM_IRQFD

Capability: KVM_CAP_IRQFD
Architectures: x86 s390 arm arm64
Type: vm ioctl
Parameters: struct kvm_irqfd (in)
Returns: 0 on success, -1 on error

Allows setting an eventfd to directly trigger a guest interrupt.
kvm_irqfd.fd specifies the file descriptor to use as the eventfd and
kvm_irqfd.gsi specifies the irqchip pin toggled by this event. When
an event is triggered on the eventfd, an interrupt is injected into
the guest using the specified gsi pin. The irqfd is removed using
the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
and kvm_irqfd.gsi.

With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
mechanism allowing emulation of level-triggered, irqfd-based
interrupts. When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
additional eventfd in the kvm_irqfd.resamplefd field. When operating
in resample mode, posting of an interrupt through kvm_irqfd.fd asserts
the specified gsi in the irqchip. When the irqchip is resampled, such
as from an EOI, the gsi is de-asserted and the user is notified via
kvm_irqfd.resamplefd.
It is the user's responsibility to re-queue
the interrupt if the device making use of it still requires service.
Note that closing the resamplefd is not sufficient to disable the
irqfd. The KVM_IRQFD_FLAG_RESAMPLE flag is only necessary on assignment
and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.

On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
given by gsi + 32.

4.76 KVM_PPC_ALLOCATE_HTAB

Capability: KVM_CAP_PPC_ALLOC_HTAB
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to u32 containing hash table order (in/out)
Returns: 0 on success, -1 on error

This requests the host kernel to allocate an MMU hash table for a
guest using the PAPR paravirtualization interface. This only does
anything if the kernel is configured to use the Book 3S HV style of
virtualization. Otherwise the capability doesn't exist and the ioctl
returns an ENOTTY error. The rest of this description assumes Book 3S
HV.

There must be no vcpus running when this ioctl is called; if there
are, it will do nothing and return an EBUSY error.

The parameter is a pointer to a 32-bit unsigned integer variable
containing the order (log base 2) of the desired size of the hash
table, which must be between 18 and 46. On successful return from the
ioctl, it will have been updated with the order of the hash table that
was allocated.

If no hash table has been allocated when any vcpu is asked to run
(with the KVM_RUN ioctl), the host kernel will allocate a
default-sized hash table (16 MB).

If this ioctl is called when a hash table has already been allocated,
the kernel will clear out the existing hash table (zero all HPTEs) and
return the hash table order in the parameter.
(If the guest is using
the virtualized real-mode area (VRMA) facility, the kernel will
re-create the VRMA HPTEs on the next KVM_RUN of any vcpu.)

4.77 KVM_S390_INTERRUPT

Capability: basic
Architectures: s390
Type: vm ioctl, vcpu ioctl
Parameters: struct kvm_s390_interrupt (in)
Returns: 0 on success, -1 on error

Injects an interrupt into the guest. Interrupts can be floating
(vm ioctl) or per cpu (vcpu ioctl), depending on the interrupt type.

Interrupt parameters are passed via kvm_s390_interrupt:

struct kvm_s390_interrupt {
	__u32 type;
	__u32 parm;
	__u64 parm64;
};

type can be one of the following:

KVM_S390_SIGP_STOP (vcpu) - sigp stop; optional flags in parm
KVM_S390_PROGRAM_INT (vcpu) - program check; code in parm
KVM_S390_SIGP_SET_PREFIX (vcpu) - sigp set prefix; prefix address in parm
KVM_S390_RESTART (vcpu) - restart
KVM_S390_INT_CLOCK_COMP (vcpu) - clock comparator interrupt
KVM_S390_INT_CPU_TIMER (vcpu) - CPU timer interrupt
KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
			   parameters in parm and parm64
KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
    I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
    I/O interruption parameters in parm (subchannel) and parm64 (intparm,
    interruption subclass)
KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
                           machine check interrupt code in parm64 (note that
                           machine checks needing further payload are not
                           supported by this ioctl)

Note that the vcpu ioctl is asynchronous to vcpu execution.
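
To illustrate the split between per-cpu and floating interrupts in
KVM_S390_INTERRUPT, the sketch below issues one of each. The helper names and
the fd parameters are assumptions made for the example; which fd a type
requires follows the (vcpu)/(vm) annotations in the list above.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: per-cpu program check, so a vcpu fd is required. */
static int inject_program_check(int vcpu_fd, __u32 code)
{
	struct kvm_s390_interrupt irq;

	memset(&irq, 0, sizeof(irq));
	irq.type = KVM_S390_PROGRAM_INT;
	irq.parm = code;			/* program check code */
	return ioctl(vcpu_fd, KVM_S390_INTERRUPT, &irq);
}

/* Hypothetical helper: floating sclp service interrupt, so a VM fd is used. */
static int inject_service_int(int vm_fd, __u32 sclp_parm)
{
	struct kvm_s390_interrupt irq;

	memset(&irq, 0, sizeof(irq));
	irq.type = KVM_S390_INT_SERVICE;
	irq.parm = sclp_parm;			/* sclp parameter */
	return ioctl(vm_fd, KVM_S390_INTERRUPT, &irq);
}
```

Both helpers return 0 on success and -1 on error. Because the vcpu variant is
asynchronous, success only means the interrupt was queued.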

4.78 KVM_PPC_GET_HTAB_FD

Capability: KVM_CAP_PPC_HTAB_FD
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to struct kvm_get_htab_fd (in)
Returns: file descriptor number (>= 0) on success, -1 on error

This returns a file descriptor that can be used either to read out the
entries in the guest's hashed page table (HPT), or to write entries to
initialize the HPT. The returned fd can only be written to if the
KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
can only be read if that bit is clear. The argument struct looks like
this:

/* For KVM_PPC_GET_HTAB_FD */
struct kvm_get_htab_fd {
	__u64	flags;
	__u64	start_index;
	__u64	reserved[2];
};

/* Values for kvm_get_htab_fd.flags */
#define KVM_GET_HTAB_BOLTED_ONLY	((__u64)0x1)
#define KVM_GET_HTAB_WRITE		((__u64)0x2)

The `start_index' field gives the index in the HPT of the entry at
which to start reading. It is ignored when writing.

Reads on the fd will initially supply information about all
"interesting" HPT entries. Interesting entries are those with the
bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
all entries. When the end of the HPT is reached, the read() will
return. If read() is called again on the fd, it will start again from
the beginning of the HPT, but will only return HPT entries that have
changed since they were last read.

Data read or written is structured as a header (8 bytes) followed by a
series of valid HPT entries (16 bytes each). The header indicates how
many valid HPT entries there are and how many invalid entries follow
the valid entries. The invalid entries are not represented explicitly
in the stream.
The header format is:

struct kvm_get_htab_header {
	__u32	index;
	__u16	n_valid;
	__u16	n_invalid;
};

Writes to the fd create HPT entries starting at the index given in the
header; first `n_valid' valid entries with contents from the data
written, then `n_invalid' invalid entries, invalidating any previously
valid entries found.

4.79 KVM_CREATE_DEVICE

Capability: KVM_CAP_DEVICE_CTRL
Type: vm ioctl
Parameters: struct kvm_create_device (in/out)
Returns: 0 on success, -1 on error
Errors:
  ENODEV: The device type is unknown or unsupported
  EEXIST: Device already created, and this type of device may not
          be instantiated multiple times

  Other error conditions may be defined by individual device types or
  have their standard meanings.

Creates an emulated device in the kernel. The file descriptor returned
in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.

If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
device type is supported (not necessarily whether it can be created
in the current vm).

Individual devices should not define flags. Attributes should be used
for specifying any behavior that is not implied by the device type
number.

struct kvm_create_device {
	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
	__u32	fd;	/* out: device handle */
	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
};

4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
  ENXIO:  The group or attribute is unknown/unsupported for this device
  EPERM:  The attribute cannot (currently) be accessed this way
          (e.g. read-only attribute, or attribute that only makes
          sense when the device is in a different state)

  Other error conditions may be defined by individual device types.

Gets/sets a specified piece of device configuration and/or state. The
semantics are device-specific. See individual device documentation in
the "devices" directory. As with ONE_REG, the size of the data
transferred is defined by the particular attribute.

struct kvm_device_attr {
	__u32	flags;		/* no flags currently defined */
	__u32	group;		/* device-defined */
	__u64	attr;		/* group-defined */
	__u64	addr;		/* userspace address of attr data */
};

4.81 KVM_HAS_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
  ENXIO:  The group or attribute is unknown/unsupported for this device

Tests whether a device supports a particular attribute. A successful
return indicates the attribute is implemented. It does not necessarily
indicate that the attribute can be read or written in the device's
current state. "addr" is ignored.

4.82 KVM_ARM_VCPU_INIT

Capability: basic
Architectures: arm, arm64
Type: vcpu ioctl
Parameters: struct kvm_vcpu_init (in)
Returns: 0 on success; -1 on error
Errors:
  EINVAL:    the target is unknown, or the combination of features is invalid.
  ENOENT:    a features bit specified is unknown.

This tells KVM what type of CPU to present to the guest, and what
optional features it should have. This will cause a reset of the cpu
registers to their initial values. If this is not called, KVM_RUN will
return ENOEXEC for that vcpu.

Note that because some registers reflect machine topology, all vcpus
should be created before this ioctl is invoked.

Userspace can call this function multiple times for a given vcpu, including
after the vcpu has been run. This will reset the vcpu to its initial
state. All calls to this function after the initial call must use the same
target and same set of feature flags, otherwise EINVAL will be returned.

Possible features:
	- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
	  Depends on KVM_CAP_ARM_PSCI. If not set, the CPU will be powered on
	  and execute guest code when KVM_RUN is called.
	- KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
	  Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
	- KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
	  Depends on KVM_CAP_ARM_PSCI_0_2.


4.83 KVM_ARM_PREFERRED_TARGET

Capability: basic
Architectures: arm, arm64
Type: vm ioctl
Parameters: struct kvm_vcpu_init (out)
Returns: 0 on success; -1 on error
Errors:
  ENODEV:    no preferred target available for the host

This queries KVM for the preferred CPU target type which can be emulated
by KVM on the underlying host.

The ioctl returns a struct kvm_vcpu_init instance containing information
about the preferred CPU target type and recommended features for it. The
kvm_vcpu_init->features bitmap returned will have feature bits set if
the preferred target recommends setting these features, but this is
not mandatory.

The information returned by this ioctl can be used to prepare an instance
of struct kvm_vcpu_init for the KVM_ARM_VCPU_INIT ioctl, which will result
in a VCPU matching the underlying host.


4.84 KVM_GET_REG_LIST

Capability: basic
Architectures: arm, arm64, mips
Type: vcpu ioctl
Parameters: struct kvm_reg_list (in/out)
Returns: 0 on success; -1 on error
Errors:
  E2BIG:     the reg index list is too big to fit in the array specified by
             the user (the number required will be written into n).

struct kvm_reg_list {
	__u64 n; /* number of registers in reg[] */
	__u64 reg[0];
};

This ioctl returns the guest registers that are supported for the
KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.


4.85 KVM_ARM_SET_DEVICE_ADDR (deprecated)

Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
Architectures: arm, arm64
Type: vm ioctl
Parameters: struct kvm_arm_device_addr (in)
Returns: 0 on success, -1 on error
Errors:
  ENODEV: The device id is unknown
  ENXIO:  Device not supported on current system
  EEXIST: Address already set
  E2BIG:  Address outside guest physical address space
  EBUSY:  Address overlaps with other device range

struct kvm_arm_device_addr {
	__u64 id;
	__u64 addr;
};

Specify a device address in the guest's physical address space where guests
can access emulated or directly exposed devices, which the host kernel needs
to know about. The id field is an architecture specific identifier for a
specific device.

ARM/arm64 divides the id field into two parts, a device id and an
address type id specific to the individual device.

  bits:  | 63   ...   32 | 31   ...   16 | 15   ...    0 |
  field: |  0x00000000   |   device id   | addr type id  |

ARM/arm64 currently only require this when using the in-kernel GIC
support for the hardware VGIC features, using KVM_ARM_DEVICE_VGIC_V2
as the device id.
When setting the base address for the guest's
mapping of the VGIC virtual CPU and distributor interface, the ioctl
must be called after calling KVM_CREATE_IRQCHIP, but before calling
KVM_RUN on any of the VCPUs. Calling this ioctl twice for any of the
base addresses will return -EEXIST.

Note, this IOCTL is deprecated and the more flexible SET/GET_DEVICE_ATTR API
should be used instead.


4.86 KVM_PPC_RTAS_DEFINE_TOKEN

Capability: KVM_CAP_PPC_RTAS
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_rtas_token_args
Returns: 0 on success, -1 on error

Defines a token value for a RTAS (Run Time Abstraction Services)
service in order to allow it to be handled in the kernel. The
argument struct gives the name of the service, which must be the name
of a service that has a kernel-side implementation. If the token
value is non-zero, it will be associated with that service, and
subsequent RTAS calls by the guest specifying that token will be
handled by the kernel. If the token value is 0, then any token
associated with the service will be forgotten, and subsequent RTAS
calls by the guest for that service will be passed to userspace to be
handled.

4.87 KVM_SET_GUEST_DEBUG

Capability: KVM_CAP_SET_GUEST_DEBUG
Architectures: x86, s390, ppc, arm64
Type: vcpu ioctl
Parameters: struct kvm_guest_debug (in)
Returns: 0 on success; -1 on error

struct kvm_guest_debug {
       __u32 control;
       __u32 pad;
       struct kvm_guest_debug_arch arch;
};

Set up the processor specific debug registers and configure the vcpu for
handling guest debug events. There are two parts to the structure. The
first is a control bitfield that indicates the type of debug events to
handle when running.
Common control bits are:

  - KVM_GUESTDBG_ENABLE:        guest debugging is enabled
  - KVM_GUESTDBG_SINGLESTEP:    the next run should single-step

The top 16 bits of the control field are architecture specific control
flags which can include the following:

  - KVM_GUESTDBG_USE_SW_BP:     using software breakpoints [x86, arm64]
  - KVM_GUESTDBG_USE_HW_BP:     using hardware breakpoints [x86, s390, arm64]
  - KVM_GUESTDBG_INJECT_DB:     inject DB type exception [x86]
  - KVM_GUESTDBG_INJECT_BP:     inject BP type exception [x86]
  - KVM_GUESTDBG_EXIT_PENDING:  trigger an immediate guest exit [s390]

For example KVM_GUESTDBG_USE_SW_BP indicates that software breakpoints
are enabled in memory, so we need to ensure breakpoint exceptions are
correctly trapped and the KVM run loop exits at the breakpoint rather
than running off into the normal guest vector. For KVM_GUESTDBG_USE_HW_BP
we need to ensure the guest vCPU's architecture specific registers are
updated to the correct (supplied) values.

The second part of the structure is architecture specific and
typically contains a set of debug registers.

For arm64 the number of debug registers is implementation defined and
can be determined by querying the KVM_CAP_GUEST_DEBUG_HW_BPS and
KVM_CAP_GUEST_DEBUG_HW_WPS capabilities which return a positive number
indicating the number of supported registers.

Debug events exit the main run loop with the reason KVM_EXIT_DEBUG,
with the kvm_debug_exit_arch part of the kvm_run structure containing
architecture specific debug information.
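
As an illustration of how the control bitfield above might be assembled, the
sketch below enables debugging with optional single-stepping and software
breakpoints and leaves the architecture specific part zeroed. The helper
names and parameters are hypothetical, and the example assumes an
architecture whose headers define KVM_GUESTDBG_USE_SW_BP.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: fill in the common and sw-breakpoint control bits. */
static void fill_guest_debug(struct kvm_guest_debug *dbg,
			     int single_step, int sw_bp)
{
	memset(dbg, 0, sizeof(*dbg));	/* arch debug registers left zeroed */
	dbg->control = KVM_GUESTDBG_ENABLE;
	if (single_step)
		dbg->control |= KVM_GUESTDBG_SINGLESTEP;
	if (sw_bp)
		dbg->control |= KVM_GUESTDBG_USE_SW_BP;
}

/* Hypothetical wrapper: returns 0 on success, -1 on error (errno is set). */
static int set_guest_debug(int vcpu_fd, int single_step, int sw_bp)
{
	struct kvm_guest_debug dbg;

	fill_guest_debug(&dbg, single_step, sw_bp);
	return ioctl(vcpu_fd, KVM_SET_GUEST_DEBUG, &dbg);
}
```

After a successful call, the next KVM_RUN on that vcpu would be expected to
return with exit reason KVM_EXIT_DEBUG when a debug event fires.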

4.88 KVM_GET_EMULATED_CPUID

Capability: KVM_CAP_EXT_EMUL_CPUID
Architectures: x86
Type: system ioctl
Parameters: struct kvm_cpuid2 (in/out)
Returns: 0 on success, -1 on error

struct kvm_cpuid2 {
	__u32 nent;
	__u32 flags;
	struct kvm_cpuid_entry2 entries[0];
};

The member 'flags' is used for passing flags from userspace.

#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX		BIT(0)
#define KVM_CPUID_FLAG_STATEFUL_FUNC		BIT(1)
#define KVM_CPUID_FLAG_STATE_READ_NEXT		BIT(2)

struct kvm_cpuid_entry2 {
	__u32 function;
	__u32 index;
	__u32 flags;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding[3];
};

This ioctl returns x86 cpuid features which are emulated by
kvm. Userspace can use the information returned by this ioctl to query
which features are emulated by kvm instead of being present natively.

Userspace invokes KVM_GET_EMULATED_CPUID by passing a kvm_cpuid2
structure with the 'nent' field indicating the number of entries in
the variable-size array 'entries'. If the number of entries is too low
to describe the cpu capabilities, an error (E2BIG) is returned. If the
number is too high, the 'nent' field is adjusted and an error (ENOMEM)
is returned. If the number is just right, the 'nent' field is adjusted
to the number of valid entries in the 'entries' array, which is then
filled.

The entries returned are the set CPUID bits of the respective features
which kvm emulates, as returned by the CPUID instruction, with unknown
or unsupported feature bits cleared.

Features like x2apic, for example, may not be present in the host cpu
but are exposed by kvm in KVM_GET_SUPPORTED_CPUID because they can be
emulated efficiently and are thus not included here.
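
One way userspace might cope with the variable-size 'entries' array is to
grow the buffer while the ioctl reports E2BIG. The sketch below assumes an
x86 build and a kvm_fd obtained from open("/dev/kvm"); the function name,
the starting size of 8 and the cap of 1024 entries are arbitrary choices for
the example, not values defined by the API.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: returns a malloc'ed kvm_cpuid2 whose nent holds the
 * number of valid entries, or NULL on failure. The caller frees the result.
 */
static struct kvm_cpuid2 *get_emulated_cpuid(int kvm_fd)
{
	__u32 nent = 8;			/* arbitrary starting guess */

	for (;;) {
		struct kvm_cpuid2 *c;
		int err;

		c = calloc(1, sizeof(*c) + nent * sizeof(c->entries[0]));
		if (!c)
			return NULL;
		c->nent = nent;

		if (ioctl(kvm_fd, KVM_GET_EMULATED_CPUID, c) == 0)
			return c;

		err = errno;		/* save before free() can touch it */
		free(c);
		if (err != E2BIG || nent >= 1024)
			return NULL;	/* other error, or give up growing */
		nent *= 2;		/* too few entries: retry with more */
	}
}
```

The same retry pattern would apply to any kvm_cpuid2-based query that reports
E2BIG for an undersized array.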

The fields in each entry are defined as follows:

  function: the eax value used to obtain the entry
  index: the ecx value used to obtain the entry (for entries that are
         affected by ecx)
  flags: an OR of zero or more of the following:
        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid
        KVM_CPUID_FLAG_STATEFUL_FUNC:
           if cpuid for this function returns different values for successive
           invocations; there will be several entries with the same function,
           all with this flag set
        KVM_CPUID_FLAG_STATE_READ_NEXT:
           for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
           the first entry to be read by a cpu
   eax, ebx, ecx, edx: the values returned by the cpuid instruction for
         this function/index combination

4.89 KVM_S390_MEM_OP

Capability: KVM_CAP_S390_MEM_OP
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_mem_op (in)
Returns: = 0 on success,
         < 0 on generic error (e.g. -EFAULT or -ENOMEM),
         > 0 if an exception occurred while walking the page tables

Read or write data from/to the logical (virtual) memory of a VCPU.

Parameters are specified via the following structure:

struct kvm_s390_mem_op {
	__u64 gaddr;		/* the guest address */
	__u64 flags;		/* flags */
	__u32 size;		/* amount of bytes */
	__u32 op;		/* type of operation */
	__u64 buf;		/* buffer in userspace */
	__u8 ar;		/* the access register number */
	__u8 reserved[31];	/* should be set to 0 */
};

The type of operation is specified in the "op" field. It is either
KVM_S390_MEMOP_LOGICAL_READ for reading from logical memory space or
KVM_S390_MEMOP_LOGICAL_WRITE for writing to logical memory space.
The KVM_S390_MEMOP_F_CHECK_ONLY flag can be set in the "flags" field to check
whether the corresponding memory access would create an access exception
(without touching the data in the memory at the destination). In case an
access exception occurred while walking the MMU tables of the guest, the
ioctl returns a positive error number to indicate the type of exception.
This exception is also raised directly at the corresponding VCPU if the
flag KVM_S390_MEMOP_F_INJECT_EXCEPTION is set in the "flags" field.

The start address of the memory region has to be specified in the "gaddr"
field, and the length of the region in the "size" field. "buf" is the buffer
supplied by the userspace application where the read data should be written
to for KVM_S390_MEMOP_LOGICAL_READ, or where the data that should be written
is stored for a KVM_S390_MEMOP_LOGICAL_WRITE. "buf" is unused and can be NULL
when KVM_S390_MEMOP_F_CHECK_ONLY is specified. "ar" designates the access
register number to be used.

The "reserved" field is meant for future extensions. It is not used by
KVM with the currently defined set of flags.

4.90 KVM_S390_GET_SKEYS

Capability: KVM_CAP_S390_SKEYS
Architectures: s390
Type: vm ioctl
Parameters: struct kvm_s390_skeys
Returns: 0 on success, KVM_S390_GET_SKEYS_NONE if guest is not using storage
         keys, negative value on error

This ioctl is used to get guest storage key values on the s390
architecture. The ioctl takes parameters via the kvm_s390_skeys struct.

struct kvm_s390_skeys {
	__u64 start_gfn;
	__u64 count;
	__u64 skeydata_addr;
	__u32 flags;
	__u32 reserved[9];
};

The start_gfn field is the number of the first guest frame whose storage keys
you want to get.

The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to get.
The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range
will cause the ioctl to return -EINVAL.

The skeydata_addr field is the address of a buffer large enough to hold count
bytes. This buffer will be filled with storage key data by the ioctl.

4.91 KVM_S390_SET_SKEYS

Capability: KVM_CAP_S390_SKEYS
Architectures: s390
Type: vm ioctl
Parameters: struct kvm_s390_skeys
Returns: 0 on success, negative value on error

This ioctl is used to set guest storage key values on the s390
architecture. The ioctl takes parameters via the kvm_s390_skeys struct.
See section on KVM_S390_GET_SKEYS for struct definition.

The start_gfn field is the number of the first guest frame whose storage keys
you want to set.

The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to set. The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range
will cause the ioctl to return -EINVAL.

The skeydata_addr field is the address of a buffer containing count bytes of
storage keys. Each byte in the buffer will be set as the storage key for a
single frame starting at start_gfn for count frames.

Note: If any architecturally invalid key value is found in the given data then
the ioctl will return -EINVAL.
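
As a rough illustration of the count and buffer rules for the two storage key
ioctls, the sketch below fills in an argument block before it would be handed
to KVM_S390_GET_SKEYS or KVM_S390_SET_SKEYS. The struct is a local mirror of
the layout shown in section 4.90, the helper name skeys_prepare is
hypothetical, and the alloc_max parameter merely stands in for
KVM_S390_SKEYS_ALLOC_MAX. Real code would use the kernel's own definitions.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the struct kvm_s390_skeys layout shown above
 * (illustration only, not the kernel's definition). */
struct skeys_args {
	uint64_t start_gfn;
	uint64_t count;
	uint64_t skeydata_addr;
	uint32_t flags;
	uint32_t reserved[9];
};

/* Hypothetical helper: reject a count outside [1, alloc_max], as the
 * ioctl itself is documented to do, then fill in the argument block.
 * 'keys' must provide one byte per guest frame, i.e. 'count' bytes. */
static int skeys_prepare(struct skeys_args *a, uint64_t start_gfn,
			 uint64_t count, uint8_t *keys, uint64_t alloc_max)
{
	if (count < 1 || count > alloc_max)
		return -EINVAL;

	memset(a, 0, sizeof(*a));	/* flags and reserved stay zero */
	a->start_gfn = start_gfn;
	a->count = count;
	a->skeydata_addr = (uint64_t)(uintptr_t)keys;
	return 0;
}
```

Checking the range in userspace first is only a convenience; the kernel
remains the authority and returns -EINVAL on its own.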

4.92 KVM_S390_IRQ

Capability: KVM_CAP_S390_INJECT_IRQ
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq (in)
Returns: 0 on success, -1 on error
Errors:
  EINVAL: interrupt type is invalid
          type is KVM_S390_SIGP_STOP and the flag parameter has an invalid
            value
          type is KVM_S390_INT_EXTERNAL_CALL and code is larger
            than the maximum number of VCPUs
  EBUSY:  type is KVM_S390_SIGP_SET_PREFIX and vcpu is not stopped
          type is KVM_S390_SIGP_STOP and a stop irq is already pending
          type is KVM_S390_INT_EXTERNAL_CALL and an external call interrupt
            is already pending

Injects an interrupt into the guest.

Using struct kvm_s390_irq as a parameter makes it possible to inject
additional payload, which is not possible via KVM_S390_INTERRUPT.

Interrupt parameters are passed via kvm_s390_irq:

struct kvm_s390_irq {
	__u64 type;
	union {
		struct kvm_s390_io_info io;
		struct kvm_s390_ext_info ext;
		struct kvm_s390_pgm_info pgm;
		struct kvm_s390_emerg_info emerg;
		struct kvm_s390_extcall_info extcall;
		struct kvm_s390_prefix_info prefix;
		struct kvm_s390_stop_info stop;
		struct kvm_s390_mchk_info mchk;
		char reserved[64];
	} u;
};

type can be one of the following:

KVM_S390_SIGP_STOP - sigp stop; parameter in .stop
KVM_S390_PROGRAM_INT - program check; parameters in .pgm
KVM_S390_SIGP_SET_PREFIX - sigp set prefix; parameters in .prefix
KVM_S390_RESTART - restart; no parameters
KVM_S390_INT_CLOCK_COMP - clock comparator interrupt; no parameters
KVM_S390_INT_CPU_TIMER - CPU timer interrupt; no parameters
KVM_S390_INT_EMERGENCY - sigp emergency; parameters in .emerg
KVM_S390_INT_EXTERNAL_CALL - sigp external call; parameters in .extcall
KVM_S390_MCHK - machine check interrupt; parameters in .mchk


Note that the vcpu ioctl is asynchronous to vcpu execution.

4.94 KVM_S390_GET_IRQ_STATE

Capability: KVM_CAP_S390_IRQ_STATE
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq_state (out)
Returns: number of bytes copied into the buffer (>= 0) on success,
         -EINVAL if buffer size is 0,
         -ENOBUFS if buffer size is too small to fit all pending interrupts,
         -EFAULT if the buffer address was invalid

This ioctl allows userspace to retrieve the complete state of all currently
pending interrupts in a single buffer. Use cases include migration
and introspection. The parameter structure contains the address of a
userspace buffer and its length:

struct kvm_s390_irq_state {
	__u64 buf;
	__u32 flags;
	__u32 len;
	__u32 reserved[4];
};

Userspace passes in the above struct and for each pending interrupt a
struct kvm_s390_irq is copied to the provided buffer.

If -ENOBUFS is returned the buffer provided was too small and userspace
may retry with a bigger buffer.

4.95 KVM_S390_SET_IRQ_STATE

Capability: KVM_CAP_S390_IRQ_STATE
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq_state (in)
Returns: 0 on success,
         -EFAULT if the buffer address was invalid,
         -EINVAL for an invalid buffer length (see below),
         -EBUSY if there were already interrupts pending,
         errors occurring when actually injecting the
          interrupt. See KVM_S390_IRQ.

This ioctl allows userspace to set the complete state of all cpu-local
interrupts currently pending for the vcpu. It is intended for restoring
interrupt state after a migration. The input parameter is a userspace buffer
containing a struct kvm_s390_irq_state:

struct kvm_s390_irq_state {
	__u64 buf;
	__u32 flags;
	__u32 len;
	__u32 reserved[4];
};

The userspace memory referenced by buf contains a struct kvm_s390_irq
for each interrupt to be injected into the guest.
If one of the interrupts could not be injected for some reason the
ioctl aborts.

len must be a multiple of sizeof(struct kvm_s390_irq). It must be > 0
and it must not exceed (max_vcpus + 32) * sizeof(struct kvm_s390_irq),
which is the maximum number of possibly pending cpu-local interrupts.

4.96 KVM_SMI

Capability: KVM_CAP_X86_SMM
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error

Queues an SMI on the thread's vcpu.

5. The kvm_run structure
------------------------

Application code obtains a pointer to the kvm_run structure by
mmap()ing a vcpu fd. From that point, application code can control
execution by changing fields in kvm_run prior to calling the KVM_RUN
ioctl, and obtain information about the reason KVM_RUN returned by
looking up structure members.

struct kvm_run {
	/* in */
	__u8 request_interrupt_window;

Request that KVM_RUN return when it becomes possible to inject external
interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.

	__u8 padding1[7];

	/* out */
	__u32 exit_reason;

When KVM_RUN has returned successfully (return value 0), this informs
application code why KVM_RUN has returned. Allowable values for this
field are detailed below.

	__u8 ready_for_interrupt_injection;

If request_interrupt_window has been specified, this field indicates
an interrupt can be injected now with KVM_INTERRUPT.

	__u8 if_flag;

The value of the current interrupt flag. Only valid if in-kernel
local APIC is not used.

	__u16 flags;

More architecture-specific flags detailing state of the VCPU that may
affect the device's behavior. The only currently defined flag is
KVM_RUN_X86_SMM, which is valid on x86 machines and is set if the
VCPU is in system management mode.

	/* in (pre_kvm_run), out (post_kvm_run) */
	__u64 cr8;

The value of the cr8 register. Only valid if in-kernel local APIC is
not used. Both input and output.

	__u64 apic_base;

The value of the APIC BASE msr. Only valid if in-kernel local
APIC is not used. Both input and output.

	union {
		/* KVM_EXIT_UNKNOWN */
		struct {
			__u64 hardware_exit_reason;
		} hw;

If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
reasons. Further architecture-specific information is available in
hardware_exit_reason.

		/* KVM_EXIT_FAIL_ENTRY */
		struct {
			__u64 hardware_entry_failure_reason;
		} fail_entry;

If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
to unknown reasons. Further architecture-specific information is
available in hardware_entry_failure_reason.

		/* KVM_EXIT_EXCEPTION */
		struct {
			__u32 exception;
			__u32 error_code;
		} ex;

Unused.

		/* KVM_EXIT_IO */
		struct {
#define KVM_EXIT_IO_IN  0
#define KVM_EXIT_IO_OUT 1
			__u8 direction;
			__u8 size; /* bytes */
			__u16 port;
			__u32 count;
			__u64 data_offset; /* relative to kvm_run start */
		} io;

If exit_reason is KVM_EXIT_IO, then the vcpu has
executed a port I/O instruction which could not be satisfied by kvm.
data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
where kvm expects application code to place the data for the next
KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a packed array.

		/* KVM_EXIT_DEBUG */
		struct {
			struct kvm_debug_exit_arch arch;
		} debug;

If the exit_reason is KVM_EXIT_DEBUG, then a vcpu is processing a debug event
for which architecture specific information is returned.
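
To make the KVM_EXIT_IO data layout described above more concrete, here is a
minimal sketch of how userspace might walk the packed array that starts at
data_offset bytes from the beginning of kvm_run. The struct is a local mirror
of the 'io' member, and handle_io_exit, port_fn and toy_port are hypothetical
names used only for this illustration.

```c
#include <stdint.h>

/* Local mirror of the 'io' member shown above (illustration only). */
struct io_exit {
	uint8_t  direction;	/* 0 = KVM_EXIT_IO_IN, 1 = KVM_EXIT_IO_OUT */
	uint8_t  size;		/* bytes per item */
	uint16_t port;
	uint32_t count;		/* number of items in the packed array */
	uint64_t data_offset;	/* relative to kvm_run start */
};

/* Hypothetical device callback: transfers one item of 'size' bytes. */
typedef void (*port_fn)(uint16_t port, int is_out, void *item, uint8_t size);

/* Visit each of the 'count' items, 'size' bytes apart, in the buffer
 * located at run_base + data_offset. */
static void handle_io_exit(void *run_base, const struct io_exit *io,
			   port_fn dev)
{
	uint8_t *p = (uint8_t *)run_base + io->data_offset;

	for (uint32_t i = 0; i < io->count; i++, p += io->size)
		dev(io->port, io->direction == 1, p, io->size);
}

/* Toy device: bytes written by the guest are summed, reads return 0xff. */
static unsigned out_sum;

static void toy_port(uint16_t port, int is_out, void *item, uint8_t size)
{
	uint8_t *b = item;

	(void)port;
	for (uint8_t i = 0; i < size; i++) {
		if (is_out)
			out_sum += b[i];
		else
			b[i] = 0xff;
	}
}
```

For KVM_EXIT_IO_IN the callback writes into the shared area, so the data is
in place when KVM_RUN is invoked again.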

		/* KVM_EXIT_MMIO */
		struct {
			__u64 phys_addr;
			__u8 data[8];
			__u32 len;
			__u8 is_write;
		} mmio;

If exit_reason is KVM_EXIT_MMIO, then the vcpu has
executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.

The 'data' member contains, in its first 'len' bytes, the value as it would
appear if the VCPU performed a load or store of the appropriate width directly
to the byte array.

NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR_HCALL and
      KVM_EXIT_EPR the corresponding
operations are complete (and guest state is consistent) only after userspace
has re-entered the kernel with KVM_RUN. The kernel side will first finish
incomplete operations and then check for pending signals. Userspace
can re-enter the guest with an unmasked signal pending to complete
pending operations.

		/* KVM_EXIT_HYPERCALL */
		struct {
			__u64 nr;
			__u64 args[6];
			__u64 ret;
			__u32 longmode;
			__u32 pad;
		} hypercall;

Unused. This was once used for 'hypercall to userspace'. To implement
such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
Note KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.

		/* KVM_EXIT_TPR_ACCESS */
		struct {
			__u64 rip;
			__u32 is_write;
			__u32 pad;
		} tpr_access;

To be documented (KVM_TPR_ACCESS_REPORTING).

		/* KVM_EXIT_S390_SIEIC */
		struct {
			__u8 icptcode;
			__u64 mask; /* psw upper half */
			__u64 addr; /* psw lower half */
			__u16 ipa;
			__u32 ipb;
		} s390_sieic;

s390 specific.

		/* KVM_EXIT_S390_RESET */
#define KVM_S390_RESET_POR       1
#define KVM_S390_RESET_CLEAR     2
#define KVM_S390_RESET_SUBSYSTEM 4
#define KVM_S390_RESET_CPU_INIT  8
#define KVM_S390_RESET_IPL       16
		__u64 s390_reset_flags;

s390 specific.

		/* KVM_EXIT_S390_UCONTROL */
		struct {
			__u64 trans_exc_code;
			__u32 pgm_code;
		} s390_ucontrol;

s390 specific. A page fault has occurred for a user controlled virtual
machine (KVM_VM_S390_UCONTROL) on its host page table that cannot be
resolved by the kernel.
The program code and the translation exception code that were placed
in the cpu's lowcore are presented here as defined by the z Architecture
Principles of Operation Book in the Chapter for Dynamic Address Translation
(DAT).

		/* KVM_EXIT_DCR */
		struct {
			__u32 dcrn;
			__u32 data;
			__u8 is_write;
		} dcr;

Deprecated - was used for 440 KVM.

		/* KVM_EXIT_OSI */
		struct {
			__u64 gprs[32];
		} osi;

MOL uses a special hypercall interface it calls 'OSI'. To enable it, we catch
hypercalls and exit with this exit struct that contains all the guest gprs.

If exit_reason is KVM_EXIT_OSI, then the vcpu has triggered such a hypercall.
Userspace can now handle the hypercall and when it's done modify the gprs as
necessary. Upon guest entry all guest GPRs will then be replaced by the values
in this struct.

		/* KVM_EXIT_PAPR_HCALL */
		struct {
			__u64 nr;
			__u64 ret;
			__u64 args[9];
		} papr_hcall;

This is used on 64-bit PowerPC when emulating a pSeries partition,
e.g. with the 'pseries' machine type in qemu. It occurs when the
guest does a hypercall using the 'sc 1' instruction. The 'nr' field
contains the hypercall number (from the guest R3), and 'args' contains
the arguments (from the guest R4 - R12).
Userspace should put the
return code in 'ret' and any extra returned values in args[].
The possible hypercalls are defined in the Power Architecture Platform
Requirements (PAPR) document available from www.power.org (free
developer registration required to access it).

		/* KVM_EXIT_S390_TSCH */
		struct {
			__u16 subchannel_id;
			__u16 subchannel_nr;
			__u32 io_int_parm;
			__u32 io_int_word;
			__u32 ipb;
			__u8 dequeued;
		} s390_tsch;

s390 specific. This exit occurs when KVM_CAP_S390_CSS_SUPPORT has been enabled
and TEST SUBCHANNEL was intercepted. If dequeued is set, a pending I/O
interrupt for the target subchannel has been dequeued and subchannel_id,
subchannel_nr, io_int_parm and io_int_word contain the parameters for that
interrupt. ipb is needed for instruction parameter decoding.

		/* KVM_EXIT_EPR */
		struct {
			__u32 epr;
		} epr;

On FSL BookE PowerPC chips, the interrupt controller has a fast path
interrupt acknowledge path to the core. When the core successfully
delivers an interrupt, it automatically populates the EPR register with
the interrupt vector number and acknowledges the interrupt inside
the interrupt controller.

In case the interrupt controller lives in user space, we need to do
the interrupt acknowledge cycle through it to fetch the next to be
delivered interrupt vector using this exit.

It gets triggered whenever KVM_CAP_PPC_EPR is enabled and an
external interrupt has just been delivered into the guest. User space
should put the acknowledged interrupt vector into the 'epr' field.
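
For the KVM_EXIT_PAPR_HCALL exit described earlier in this section, the
following sketch shows one way userspace might dispatch on 'nr' and report
results through 'ret' and args[]. The struct is a local mirror of the
'papr_hcall' member. DEMO_HCALL_ADD is a made-up hypercall number used purely
for illustration, and the two return codes are assumed to match the PAPR
values H_SUCCESS (0) and H_FUNCTION (-2).

```c
#include <stdint.h>

/* Local mirror of the 'papr_hcall' member shown above. */
struct papr_hcall_exit {
	uint64_t nr;		/* hypercall number, from guest R3 */
	uint64_t ret;		/* return code, filled in by userspace */
	uint64_t args[9];	/* arguments, from guest R4 - R12 */
};

#define H_SUCCESS	0	/* assumed PAPR return code */
#define H_FUNCTION	(-2)	/* assumed PAPR "unsupported function" code */
#define DEMO_HCALL_ADD	0xf000	/* hypothetical, not a real PAPR hcall */

/* Handle one exit: known calls set 'ret' and may return extra values
 * in args[]; unknown calls are rejected with H_FUNCTION. */
static void handle_papr_hcall(struct papr_hcall_exit *h)
{
	switch (h->nr) {
	case DEMO_HCALL_ADD:
		h->args[0] = h->args[0] + h->args[1];
		h->ret = H_SUCCESS;
		break;
	default:
		h->ret = (uint64_t)H_FUNCTION;
		break;
	}
}
```

As noted for KVM_EXIT_MMIO and friends, the hypercall only completes once
userspace re-enters the kernel with KVM_RUN.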

		/* KVM_EXIT_SYSTEM_EVENT */
		struct {
#define KVM_SYSTEM_EVENT_SHUTDOWN       1
#define KVM_SYSTEM_EVENT_RESET          2
#define KVM_SYSTEM_EVENT_CRASH          3
			__u32 type;
			__u64 flags;
		} system_event;

If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
a system-level event using some architecture specific mechanism (hypercall
or some special instruction). In case of ARM/ARM64, this is triggered using
HVC instruction based PSCI call from the vcpu. The 'type' field describes
the system-level event type. The 'flags' field describes architecture
specific flags for the system-level event.

Valid values for 'type' are:
  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
   VM. Userspace is not obliged to honour this, and if it does honour
   this does not need to destroy the VM synchronously (i.e. it may call
   KVM_RUN again before shutdown finally occurs).
  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
   As with SHUTDOWN, userspace can choose to ignore the request, or
   to schedule the reset to occur in the future and may call KVM_RUN again.
  KVM_SYSTEM_EVENT_CRASH -- a guest crash occurred and the guest
   has requested a crash condition maintenance. Userspace can choose
   to ignore the request, or to gather a VM memory core dump and/or
   reset/shut down the VM.

		/* KVM_EXIT_IOAPIC_EOI */
		struct {
			__u8 vector;
		} eoi;

Indicates that the VCPU's in-kernel local APIC received an EOI for a
level-triggered IOAPIC interrupt. This exit only triggers when the
IOAPIC is implemented in userspace (i.e. KVM_CAP_SPLIT_IRQCHIP is enabled);
the userspace IOAPIC should process the EOI and retrigger the interrupt if
it is still asserted. Vector is the LAPIC interrupt vector for which the
EOI was received.

		/* Fix the size of the union.
		 */
		char padding[256];
	};

	/*
	 * shared registers between kvm and userspace.
	 * kvm_valid_regs specifies the register classes set by the host
	 * kvm_dirty_regs specifies the register classes dirtied by userspace
	 * struct kvm_sync_regs is architecture specific, as well as the
	 * bits for kvm_valid_regs and kvm_dirty_regs
	 */
	__u64 kvm_valid_regs;
	__u64 kvm_dirty_regs;
	union {
		struct kvm_sync_regs regs;
		char padding[1024];
	} s;

If KVM_CAP_SYNC_REGS is defined, these fields allow userspace to access
certain guest registers without having to call SET/GET_*REGS. Thus we can
avoid some system call overhead if userspace has to handle the exit.
Userspace can query the validity of the structure by checking
kvm_valid_regs for specific bits. These bits are architecture specific
and usually define the validity of a group of registers (e.g. one bit
for general purpose registers).

Please note that the kernel is allowed to use the kvm_run structure as the
primary storage for certain register types. Therefore, the kernel may use the
values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.

};



6. Capabilities that can be enabled on vCPUs
--------------------------------------------

There are certain capabilities that change the behavior of the virtual CPU or
the virtual machine when enabled. To enable them, please see section 4.37.
Below you can find a list of capabilities and what their effect on the vCPU or
the virtual machine is when enabling them.

The following information is provided along with the description:

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Target: whether this is a per-vcpu or per-vm capability.

  Parameters: what parameters are accepted by the capability.

  Returns: the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


6.1 KVM_CAP_PPC_OSI

Architectures: ppc
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error

This capability enables interception of OSI hypercalls that otherwise would
be treated as normal system calls to be injected into the guest. OSI hypercalls
were invented by Mac-on-Linux to have a standardized communication mechanism
between the guest and the host.

When this capability is enabled, KVM_EXIT_OSI can occur.


6.2 KVM_CAP_PPC_PAPR

Architectures: ppc
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error

This capability enables interception of PAPR hypercalls. PAPR hypercalls are
done using the hypercall instruction "sc 1".

It also sets the guest privilege level to "supervisor" mode. Usually the guest
runs in "hypervisor" privilege mode with a few missing features.

In addition to the above, it changes the semantics of SDR1. In this mode, the
HTAB address part of SDR1 contains an HVA instead of a GPA, as PAPR keeps the
HTAB invisible to the guest.

When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur.


6.3 KVM_CAP_SW_TLB

Architectures: ppc
Target: vcpu
Parameters: args[0] is the address of a struct kvm_config_tlb
Returns: 0 on success; -1 on error

struct kvm_config_tlb {
	__u64 params;
	__u64 array;
	__u32 mmu_type;
	__u32 array_len;
};

Configures the virtual CPU's TLB array, establishing a shared memory area
between userspace and KVM. The "params" and "array" fields are userspace
addresses of mmu-type-specific data structures.
The "array_len" field is a
safety mechanism, and should be set to the size in bytes of the memory that
userspace has reserved for the array. It must be at least the size dictated
by "mmu_type" and "params".

While KVM_RUN is active, the shared region is under control of KVM. Its
contents are undefined, and any modification by userspace results in
boundedly undefined behavior.

On return from KVM_RUN, the shared region will reflect the current state of
the guest's TLB. If userspace makes any changes, it must call KVM_DIRTY_TLB
to tell KVM which entries have been changed, prior to calling KVM_RUN again
on this vcpu.

For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
 - The "params" field is of type "struct kvm_book3e_206_tlb_params".
 - The "array" field points to an array of type "struct
   kvm_book3e_206_tlb_entry".
 - The array consists of all entries in the first TLB, followed by all
   entries in the second TLB.
 - Within a TLB, entries are ordered first by increasing set number. Within a
   set, entries are ordered by way (increasing ESEL).
 - The hash for determining set number in TLB0 is: (MAS2 >> 12) & (num_sets - 1)
   where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value.
 - The tsize field of mas1 shall be set to 4K on TLB0, even though the
   hardware ignores this value for TLB0.

6.4 KVM_CAP_S390_CSS_SUPPORT

Architectures: s390
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error

This capability enables support for handling of channel I/O instructions.

TEST PENDING INTERRUPTION and the interrupt portion of TEST SUBCHANNEL are
handled in-kernel, while the other I/O instructions are passed to userspace.

When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
SUBCHANNEL intercepts.
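
As an aside on the KVM_CAP_SW_TLB array layout from section 6.3, the ordering
rules for the FSL BookE mmu types can be turned into an index computation for
TLB0 entries. The sketch below assumes that num_sets is a power of two (which
the documented mask implies) and that tlb0_size and tlb0_ways are the
tlb_sizes[0] and tlb_ways[0] values. The function name is hypothetical.

```c
#include <stdint.h>

/* Index into the shared array of the TLB0 entry that maps the page
 * described by 'mas2' and lives in the given way.  TLB0 comes first in
 * the array; within it, entries are ordered by set, then by way. */
static uint32_t tlb0_array_index(uint64_t mas2, uint32_t tlb0_size,
				 uint32_t tlb0_ways, uint32_t way)
{
	uint32_t num_sets = tlb0_size / tlb0_ways;
	uint32_t set = (uint32_t)(mas2 >> 12) & (num_sets - 1);

	return set * tlb0_ways + way;
}
```

For example, with 512 entries and 4 ways there are 128 sets, so a MAS2 value
of 0x12345000 selects set 0x45 (69) and way 2 of that set sits at index 278.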

Note that even though this capability is enabled per-vcpu, the complete
virtual machine is affected.

6.5 KVM_CAP_PPC_EPR

Architectures: ppc
Target: vcpu
Parameters: args[0] defines whether the proxy facility is active
Returns: 0 on success; -1 on error

This capability enables or disables the delivery of interrupts through the
external proxy facility.

When enabled (args[0] != 0), every time the guest gets an external interrupt
delivered, it automatically exits into user space with a KVM_EXIT_EPR exit
to receive the topmost interrupt vector.

When disabled (args[0] == 0), behavior is as if this facility is unsupported.

When this capability is enabled, KVM_EXIT_EPR can occur.

6.6 KVM_CAP_IRQ_MPIC

Architectures: ppc
Parameters: args[0] is the MPIC device fd
            args[1] is the MPIC CPU number for this vcpu

This capability connects the vcpu to an in-kernel MPIC device.

6.7 KVM_CAP_IRQ_XICS

Architectures: ppc
Target: vcpu
Parameters: args[0] is the XICS device fd
            args[1] is the XICS CPU number (server ID) for this vcpu

This capability connects the vcpu to an in-kernel XICS device.

6.8 KVM_CAP_S390_IRQCHIP

Architectures: s390
Target: vm
Parameters: none

This capability enables the in-kernel irqchip for s390. Please refer to
"4.24 KVM_CREATE_IRQCHIP" for details.

6.9 KVM_CAP_MIPS_FPU

Architectures: mips
Target: vcpu
Parameters: args[0] is reserved for future use (should be 0).

This capability allows the use of the host Floating Point Unit by the guest. It
allows the Config1.FP bit to be set to enable the FPU in the guest.
Once this is
done the KVM_REG_MIPS_FPR_* and KVM_REG_MIPS_FCR_* registers can be accessed
(depending on the current guest FPU register mode), and the Status.FR,
Config5.FRE bits are accessible via the KVM API and also from the guest,
depending on them being supported by the FPU.

6.10 KVM_CAP_MIPS_MSA

Architectures: mips
Target: vcpu
Parameters: args[0] is reserved for future use (should be 0).

This capability allows the use of the MIPS SIMD Architecture (MSA) by the guest.
It allows the Config3.MSAP bit to be set to enable the use of MSA by the guest.
Once this is done the KVM_REG_MIPS_VEC_* and KVM_REG_MIPS_MSA_* registers can be
accessed, and the Config5.MSAEn bit is accessible via the KVM API and also from
the guest.

7. Capabilities that can be enabled on VMs
------------------------------------------

There are certain capabilities that change the behavior of the virtual
machine when enabled. To enable them, please see section 4.37. Below
you can find a list of capabilities and what their effect on the VM
is when enabling them.

The following information is provided along with the description:

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Parameters: what parameters are accepted by the capability.

  Returns: the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


7.1 KVM_CAP_PPC_ENABLE_HCALL

Architectures: ppc
Parameters: args[0] is the sPAPR hcall number
	    args[1] is 0 to disable, 1 to enable in-kernel handling

This capability controls whether individual sPAPR hypercalls (hcalls)
get handled by the kernel or not. Enabling or disabling in-kernel
handling of an hcall is effective across the VM.
On creation, an
initial set of hcalls are enabled for in-kernel handling, which
consists of those hcalls for which in-kernel handlers were implemented
before this capability was implemented. If disabled, the kernel will
not attempt to handle the hcall, but will always exit to userspace
to handle it. Note that it may not make sense to enable some and
disable others of a group of related hcalls, but KVM does not prevent
userspace from doing that.

If the hcall number specified is not one that has an in-kernel
implementation, the KVM_ENABLE_CAP ioctl will fail with an EINVAL
error.

7.2 KVM_CAP_S390_USER_SIGP

Architectures: s390
Parameters: none

This capability controls which SIGP orders will be handled completely in user
space. With this capability enabled, all fast orders will be handled completely
in the kernel:
- SENSE
- SENSE RUNNING
- EXTERNAL CALL
- EMERGENCY SIGNAL
- CONDITIONAL EMERGENCY SIGNAL

All other orders will be handled completely in user space.

Only privileged operation exceptions will be checked for in the kernel (or even
in the hardware prior to interception). If this capability is not enabled, the
old way of handling SIGP orders is used (partially in kernel and user space).

7.3 KVM_CAP_S390_VECTOR_REGISTERS

Architectures: s390
Parameters: none
Returns: 0 on success, negative value on error

Allows use of the vector registers introduced with z13 processor, and
provides for the synchronization between host and user space.  Will
return -EINVAL if the machine does not support vectors.

7.4 KVM_CAP_S390_USER_STSI

Architectures: s390
Parameters: none

This capability allows post-handlers for the STSI instruction. After
initial handling in the kernel, KVM exits to user space with
KVM_EXIT_S390_STSI to allow user space to insert further data.

Before exiting to userspace, kvm handlers should fill in s390_stsi field of
vcpu->run:
struct {
	__u64 addr;
	__u8 ar;
	__u8 reserved;
	__u8 fc;
	__u8 sel1;
	__u16 sel2;
} s390_stsi;

@addr - guest address of STSI SYSIB
@fc - function code
@sel1 - selector 1
@sel2 - selector 2
@ar - access register number

KVM handlers should exit to userspace with rc = -EREMOTE.

7.5 KVM_CAP_SPLIT_IRQCHIP

Architectures: x86
Parameters: args[0] - number of routes reserved for userspace IOAPICs
Returns: 0 on success, -1 on error

Create a local apic for each processor in the kernel. This can be used
instead of KVM_CREATE_IRQCHIP if the userspace VMM wishes to emulate the
IOAPIC and PIC (and also the PIT, even though this has to be enabled
separately).

This capability also enables in kernel routing of interrupt requests;
when KVM_CAP_SPLIT_IRQCHIP is enabled, only routes of KVM_IRQ_ROUTING_MSI
type are used in the IRQ routing table. The first args[0] MSI routes are
reserved for the IOAPIC pins. Whenever the LAPIC receives an EOI for these
routes, a KVM_EXIT_IOAPIC_EOI vmexit will be reported to userspace.

Fails if VCPU has already been created, or if the irqchip is already in the
kernel (i.e. KVM_CREATE_IRQCHIP has already been called).


8. Other capabilities.
----------------------

This section lists capabilities that give information about other
features of the KVM implementation.

8.1 KVM_CAP_PPC_HWRNG

Architectures: ppc

This capability, if KVM_CHECK_EXTENSION indicates that it is
available, means that the kernel has an implementation of the
H_RANDOM hypercall backed by a hardware random-number generator.
If present, the kernel H_RANDOM handler can be enabled for guest use
with the KVM_CAP_PPC_ENABLE_HCALL capability.
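
The two-step flow described for KVM_CAP_PPC_HWRNG might look roughly like the
sketch below: query the capability with KVM_CHECK_EXTENSION, then enable
in-kernel handling through KVM_ENABLE_CAP on the VM fd using
KVM_CAP_PPC_ENABLE_HCALL. The function name enable_hwrng_hcall is
hypothetical, and the H_RANDOM hcall number is left to the caller rather than
assumed here.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Returns 1 if in-kernel H_RANDOM handling was enabled, 0 if the kernel
 * reports no hardware-backed H_RANDOM, and -1 if an ioctl failed.
 * 'kvm_fd' is the /dev/kvm handle and 'vm_fd' the VM file descriptor. */
static int enable_hwrng_hcall(int kvm_fd, int vm_fd, uint64_t h_random_nr)
{
	struct kvm_enable_cap cap;
	int avail = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_HWRNG);

	if (avail < 0)
		return -1;
	if (avail == 0)
		return 0;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_PPC_ENABLE_HCALL;
	cap.args[0] = h_random_nr;	/* sPAPR hcall number */
	cap.args[1] = 1;		/* 1 = enable in-kernel handling */

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0 ? -1 : 1;
}
```

On failure errno is left as set by the failing ioctl, so a caller can
distinguish, for instance, the EINVAL documented in section 7.1.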