diff options
Diffstat (limited to 'Documentation/virt')
-rw-r--r-- | Documentation/virt/acrn/cpuid.rst | 46 | ||||
-rw-r--r-- | Documentation/virt/acrn/index.rst | 12 | ||||
-rw-r--r-- | Documentation/virt/acrn/introduction.rst | 43 | ||||
-rw-r--r-- | Documentation/virt/acrn/io-request.rst | 97 | ||||
-rw-r--r-- | Documentation/virt/index.rst | 1 | ||||
-rw-r--r-- | Documentation/virt/kvm/amd-memory-encryption.rst | 21 | ||||
-rw-r--r-- | Documentation/virt/kvm/api.rst | 353 | ||||
-rw-r--r-- | Documentation/virt/kvm/arm/hyp-abi.rst | 9 | ||||
-rw-r--r-- | Documentation/virt/kvm/locking.rst | 9 | ||||
-rw-r--r-- | Documentation/virt/kvm/s390-pv-boot.rst | 2 |
10 files changed, 527 insertions, 66 deletions
diff --git a/Documentation/virt/acrn/cpuid.rst b/Documentation/virt/acrn/cpuid.rst new file mode 100644 index 000000000000..65fa4b9c1798 --- /dev/null +++ b/Documentation/virt/acrn/cpuid.rst @@ -0,0 +1,46 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=============== +ACRN CPUID bits +=============== + +A guest VM running on an ACRN hypervisor can check some of its features using +CPUID. + +ACRN cpuid functions are: + +function: 0x40000000 + +returns:: + + eax = 0x40000010 + ebx = 0x4e524341 + ecx = 0x4e524341 + edx = 0x4e524341 + +Note that this value in ebx, ecx and edx corresponds to the string +"ACRNACRNACRN". The value in eax corresponds to the maximum cpuid function +present in this leaf, and will be updated if more functions are added in the +future. + +function: define ACRN_CPUID_FEATURES (0x40000001) + +returns:: + + ebx, ecx, edx + eax = an OR'ed group of (1 << flag) + +where ``flag`` is defined as below: + +================================= =========== ================================ +flag value meaning +================================= =========== ================================ +ACRN_FEATURE_PRIVILEGED_VM 0 guest VM is a privileged VM +================================= =========== ================================ + +function: 0x40000010 + +returns:: + + ebx, ecx, edx + eax = (Virtual) TSC frequency in kHz. diff --git a/Documentation/virt/acrn/index.rst b/Documentation/virt/acrn/index.rst new file mode 100644 index 000000000000..b5f793e73df5 --- /dev/null +++ b/Documentation/virt/acrn/index.rst @@ -0,0 +1,12 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=============== +ACRN Hypervisor +=============== + +.. toctree:: + :maxdepth: 1 + + introduction + io-request + cpuid diff --git a/Documentation/virt/acrn/introduction.rst b/Documentation/virt/acrn/introduction.rst new file mode 100644 index 000000000000..f8d081bc084d --- /dev/null +++ b/Documentation/virt/acrn/introduction.rst @@ -0,0 +1,43 @@ +.. SPDX-License-Identifier: GPL-2.0 + +ACRN Hypervisor Introduction +============================ + +The ACRN Hypervisor is a Type 1 hypervisor, running directly on bare-metal +hardware. It has a privileged management VM, called Service VM, to manage User +VMs and do I/O emulation. + +ACRN userspace is an application running in the Service VM that emulates +devices for a User VM based on command line configurations. ACRN Hypervisor +Service Module (HSM) is a kernel module in the Service VM which provides +hypervisor services to the ACRN userspace. + +Below figure shows the architecture. + +:: + + Service VM User VM + +----------------------------+ | +------------------+ + | +--------------+ | | | | + | |ACRN userspace| | | | | + | +--------------+ | | | | + |-----------------ioctl------| | | | ... + |kernel space +----------+ | | | | + | | HSM | | | | Drivers | + | +----------+ | | | | + +--------------------|-------+ | +------------------+ + +---------------------hypercall----------------------------------------+ + | ACRN Hypervisor | + +----------------------------------------------------------------------+ + | Hardware | + +----------------------------------------------------------------------+ + +ACRN userspace allocates memory for the User VM, configures and initializes the +devices used by the User VM, loads the virtual bootloader, initializes the +virtual CPU state and handles I/O request accesses from the User VM. It uses +ioctls to communicate with the HSM. HSM implements hypervisor services by +interacting with the ACRN Hypervisor via hypercalls. HSM exports a char device +interface (/dev/acrn_hsm) to userspace. + +The ACRN hypervisor is open for contribution from anyone. The source repo is +available at https://github.com/projectacrn/acrn-hypervisor. diff --git a/Documentation/virt/acrn/io-request.rst b/Documentation/virt/acrn/io-request.rst new file mode 100644 index 000000000000..6cc3ea0fa1f5 --- /dev/null +++ b/Documentation/virt/acrn/io-request.rst @@ -0,0 +1,97 @@ +.. SPDX-License-Identifier: GPL-2.0 + +I/O request handling +==================== + +An I/O request of a User VM, which is constructed by the hypervisor, is +distributed by the ACRN Hypervisor Service Module to an I/O client +corresponding to the address range of the I/O request. Details of I/O request +handling are described in the following sections. + +1. I/O request +-------------- + +For each User VM, there is a shared 4-KByte memory region used for I/O requests +communication between the hypervisor and Service VM. An I/O request is a +256-byte structure buffer, which is 'struct acrn_io_request', that is filled by +an I/O handler of the hypervisor when a trapped I/O access happens in a User +VM. ACRN userspace in the Service VM first allocates a 4-KByte page and passes +the GPA (Guest Physical Address) of the buffer to the hypervisor. The buffer is +used as an array of 16 I/O request slots with each I/O request slot being 256 +bytes. This array is indexed by vCPU ID. + +2. I/O clients +-------------- + +An I/O client is responsible for handling User VM I/O requests whose accessed +GPA falls in a certain range. Multiple I/O clients can be associated with each +User VM. There is a special client associated with each User VM, called the +default client, that handles all I/O requests that do not fit into the range of +any other clients. The ACRN userspace acts as the default client for each User +VM. + +Below illustration shows the relationship between I/O requests shared buffer, +I/O requests and I/O clients. + +:: + + +------------------------------------------------------+ + | Service VM | + |+--------------------------------------------------+ | + || +----------------------------------------+ | | + || | shared page ACRN userspace | | | + || | +-----------------+ +------------+ | | | + || +----+->| acrn_io_request |<-+ default | | | | + || | | | +-----------------+ | I/O client | | | | + || | | | | ... | +------------+ | | | + || | | | +-----------------+ | | | + || | +-|--------------------------------------+ | | + ||---|----|-----------------------------------------| | + || | | kernel | | + || | | +----------------------+ | | + || | | | +-------------+ HSM | | | + || | +--------------+ | | | | + || | | | I/O clients | | | | + || | | | | | | | + || | | +-------------+ | | | + || | +----------------------+ | | + |+---|----------------------------------------------+ | + +----|-------------------------------------------------+ + | + +----|-------------------------------------------------+ + | +-+-----------+ | + | | I/O handler | ACRN Hypervisor | + | +-------------+ | + +------------------------------------------------------+ + +3. I/O request state transition +------------------------------- + +The state transitions of an ACRN I/O request are as follows. + +:: + + FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ... + +- FREE: this I/O request slot is empty +- PENDING: a valid I/O request is pending in this slot +- PROCESSING: the I/O request is being processed +- COMPLETE: the I/O request has been processed + +An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM and +ACRN userspace are in charge of processing the others. + +4. Processing flow of I/O requests +---------------------------------- + +a. The I/O handler of the hypervisor will fill an I/O request with PENDING + state when a trapped I/O access happens in a User VM. +b. The hypervisor makes an upcall, which is a notification interrupt, to + the Service VM. +c. The upcall handler schedules a worker to dispatch I/O requests. +d. The worker looks for the PENDING I/O requests, assigns them to different + registered clients based on the address of the I/O accesses, updates + their state to PROCESSING, and notifies the corresponding client to handle. +e. The notified client handles the assigned I/O requests. +f. The HSM updates I/O requests states to COMPLETE and notifies the hypervisor + of the completion via hypercalls. diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst index 350f5c869b56..edea7fea95a8 100644 --- a/Documentation/virt/index.rst +++ b/Documentation/virt/index.rst @@ -12,6 +12,7 @@ Linux Virtualization Support paravirt_ops guest-halt-polling ne_overview + acrn/index .. only:: html and subproject diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst index 09a8f2a34e39..469a6308765b 100644 --- a/Documentation/virt/kvm/amd-memory-encryption.rst +++ b/Documentation/virt/kvm/amd-memory-encryption.rst @@ -263,6 +263,27 @@ Returns: 0 on success, -negative on error __u32 trans_len; }; +10. KVM_SEV_GET_ATTESTATION_REPORT +---------------------------------- + +The KVM_SEV_GET_ATTESTATION_REPORT command can be used by the hypervisor to query the attestation +report containing the SHA-256 digest of the guest memory and VMSA passed through the KVM_SEV_LAUNCH +commands and signed with the PEK. The digest returned by the command should match the digest +used by the guest owner with the KVM_SEV_LAUNCH_MEASURE. + +Parameters (in): struct kvm_sev_attestation + +Returns: 0 on success, -negative on error + +:: + + struct kvm_sev_attestation_report { + __u8 mnonce[16]; /* A random mnonce that will be placed in the report */ + + __u64 uaddr; /* userspace address where the report should be copied */ + __u32 len; + }; + References ========== diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 99ceb978c8b0..307f2fcf1b02 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -182,6 +182,9 @@ is dependent on the CPU capability and the kernel configuration. The limit can be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the KVM_CHECK_EXTENSION ioctl() at run-time. +Creation of the VM will fail if the requested IPA size (whether it is +implicit or explicit) is unsupported on the host. + Please note that configuring the IPA size does not affect the capability exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects size of the address translated by the stage2 level (guest physical to @@ -960,6 +963,14 @@ memory. __u8 pad2[30]; }; +If the KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL flag is returned from the +KVM_CAP_XEN_HVM check, it may be set in the flags field of this ioctl. +This requests KVM to generate the contents of the hypercall page +automatically; hypercalls will be intercepted and passed to userspace +through KVM_EXIT_XEN. In this case, all of the blob size and address +fields must be zero. + +No other flags are currently valid in the struct kvm_xen_hvm_config. 4.29 KVM_GET_CLOCK ------------------ @@ -1484,7 +1495,8 @@ Fails if any VCPU has already been created. Define which vcpu is the Bootstrap Processor (BSP). Values are the same as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default -is vcpu 0. +is vcpu 0. This ioctl has to be called before vcpu creation, +otherwise it will return EBUSY error. 4.42 KVM_GET_XSAVE @@ -2268,6 +2280,8 @@ registers, find a list below: PPC KVM_REG_PPC_PSSCR 64 PPC KVM_REG_PPC_DEC_EXPIRY 64 PPC KVM_REG_PPC_PTCR 64 + PPC KVM_REG_PPC_DAWR1 64 + PPC KVM_REG_PPC_DAWRX1 64 PPC KVM_REG_PPC_TM_GPR0 64 ... PPC KVM_REG_PPC_TM_GPR31 64 @@ -3846,49 +3860,20 @@ base 2 of the page size in the bottom 6 bits. -EFAULT if struct kvm_reinject_control cannot be read, -EINVAL if the supplied shift or flags are invalid, -ENOMEM if unable to allocate the new HPT, - -ENOSPC if there was a hash collision - -:: - - struct kvm_ppc_rmmu_info { - struct kvm_ppc_radix_geom { - __u8 page_shift; - __u8 level_bits[4]; - __u8 pad[3]; - } geometries[8]; - __u32 ap_encodings[8]; - }; - -The geometries[] field gives up to 8 supported geometries for the -radix page table, in terms of the log base 2 of the smallest page -size, and the number of bits indexed at each level of the tree, from -the PTE level up to the PGD level in that order. Any unused entries -will have 0 in the page_shift field. - -The ap_encodings gives the supported page sizes and their AP field -encodings, encoded with the AP value in the top 3 bits and the log -base 2 of the page size in the bottom 6 bits. - -4.102 KVM_PPC_RESIZE_HPT_PREPARE --------------------------------- - -:Capability: KVM_CAP_SPAPR_RESIZE_HPT -:Architectures: powerpc -:Type: vm ioctl -:Parameters: struct kvm_ppc_resize_hpt (in) -:Returns: 0 on successful completion, - >0 if a new HPT is being prepared, the value is an estimated - number of milliseconds until preparation is complete, - -EFAULT if struct kvm_reinject_control cannot be read, - -EINVAL if the supplied shift or flags are invalid,when moving existing - HPT entries to the new HPT, - -EIO on other error conditions Used to implement the PAPR extension for runtime resizing of a guest's Hashed Page Table (HPT). Specifically this starts, stops or monitors the preparation of a new potential HPT for the guest, essentially implementing the H_RESIZE_HPT_PREPARE hypercall. +:: + + struct kvm_ppc_resize_hpt { + __u64 flags; + __u32 shift; + __u32 pad; + }; + If called with shift > 0 when there is no pending HPT for the guest, this begins preparation of a new pending HPT of size 2^(shift) bytes. It then returns a positive integer with the estimated number of @@ -3916,14 +3901,6 @@ Normally this will be called repeatedly with the same parameters until it returns <= 0. The first call will initiate preparation, subsequent ones will monitor preparation until it completes or fails. -:: - - struct kvm_ppc_resize_hpt { - __u64 flags; - __u32 shift; - __u32 pad; - }; - 4.103 KVM_PPC_RESIZE_HPT_COMMIT ------------------------------- @@ -3946,6 +3923,14 @@ Hashed Page Table (HPT). Specifically this requests that the guest be transferred to working with the new HPT, essentially implementing the H_RESIZE_HPT_COMMIT hypercall. +:: + + struct kvm_ppc_resize_hpt { + __u64 flags; + __u32 shift; + __u32 pad; + }; + This should only be called after KVM_PPC_RESIZE_HPT_PREPARE has returned 0 with the same parameters. In other cases KVM_PPC_RESIZE_HPT_COMMIT will return an error (usually -ENXIO or @@ -3961,14 +3946,6 @@ HPT and the previous HPT will be discarded. On failure, the guest will still be operating on its previous HPT. -:: - - struct kvm_ppc_resize_hpt { - __u64 flags; - __u32 shift; - __u32 pad; - }; - 4.104 KVM_X86_GET_MCE_CAP_SUPPORTED ----------------------------------- @@ -4509,6 +4486,7 @@ KVM_GET_SUPPORTED_CPUID ioctl because some of them intersect with KVM feature leaves (0x40000000, 0x40000001). Currently, the following list of CPUID leaves are returned: + - HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS - HYPERV_CPUID_INTERFACE - HYPERV_CPUID_VERSION @@ -4533,6 +4511,7 @@ userspace should not expect to get any particular value there. Note, vcpu version of KVM_GET_SUPPORTED_HV_CPUID is currently deprecated. Unlike system ioctl which exposes all supported feature bits unconditionally, vcpu version has the following quirks: + - HYPERV_CPUID_NESTED_FEATURES leaf and HV_X64_ENLIGHTENED_VMCS_RECOMMENDED feature bit are only exposed when Enlightened VMCS was previously enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS). @@ -4828,9 +4807,142 @@ If an MSR access is not permitted through the filtering, it generates a allows user space to deflect and potentially handle various MSR accesses into user space. -If a vCPU is in running state while this ioctl is invoked, the vCPU may -experience inconsistent filtering behavior on MSR accesses. +Note, invoking this ioctl with a vCPU is running is inherently racy. However, +KVM does guarantee that vCPUs will see either the previous filter or the new +filter, e.g. MSRs with identical settings in both the old and new filter will +have deterministic behavior. + +4.127 KVM_XEN_HVM_SET_ATTR +-------------------------- + +:Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO +:Architectures: x86 +:Type: vm ioctl +:Parameters: struct kvm_xen_hvm_attr +:Returns: 0 on success, < 0 on error + +:: + + struct kvm_xen_hvm_attr { + __u16 type; + __u16 pad[3]; + union { + __u8 long_mode; + __u8 vector; + struct { + __u64 gfn; + } shared_info; + __u64 pad[4]; + } u; + }; + +type values: + +KVM_XEN_ATTR_TYPE_LONG_MODE + Sets the ABI mode of the VM to 32-bit or 64-bit (long mode). This + determines the layout of the shared info pages exposed to the VM. + +KVM_XEN_ATTR_TYPE_SHARED_INFO + Sets the guest physical frame number at which the Xen "shared info" + page resides. Note that although Xen places vcpu_info for the first + 32 vCPUs in the shared_info page, KVM does not automatically do so + and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used + explicitly even when the vcpu_info for a given vCPU resides at the + "default" location in the shared_info page. This is because KVM is + not aware of the Xen CPU id which is used as the index into the + vcpu_info[] array, so cannot know the correct default location. + +KVM_XEN_ATTR_TYPE_UPCALL_VECTOR + Sets the exception vector used to deliver Xen event channel upcalls. +4.128 KVM_XEN_HVM_GET_ATTR +-------------------------- + +:Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO +:Architectures: x86 +:Type: vm ioctl +:Parameters: struct kvm_xen_hvm_attr +:Returns: 0 on success, < 0 on error + +Allows Xen VM attributes to be read. For the structure and types, +see KVM_XEN_HVM_SET_ATTR above. + +4.129 KVM_XEN_VCPU_SET_ATTR +--------------------------- + +:Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO +:Architectures: x86 +:Type: vcpu ioctl +:Parameters: struct kvm_xen_vcpu_attr +:Returns: 0 on success, < 0 on error + +:: + + struct kvm_xen_vcpu_attr { + __u16 type; + __u16 pad[3]; + union { + __u64 gpa; + __u64 pad[4]; + struct { + __u64 state; + __u64 state_entry_time; + __u64 time_running; + __u64 time_runnable; + __u64 time_blocked; + __u64 time_offline; + } runstate; + } u; + }; + +type values: + +KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO + Sets the guest physical address of the vcpu_info for a given vCPU. + +KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO + Sets the guest physical address of an additional pvclock structure + for a given vCPU. This is typically used for guest vsyscall support. + +KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR + Sets the guest physical address of the vcpu_runstate_info for a given + vCPU. This is how a Xen guest tracks CPU state such as steal time. + +KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT + Sets the runstate (RUNSTATE_running/_runnable/_blocked/_offline) of + the given vCPU from the .u.runstate.state member of the structure. + KVM automatically accounts running and runnable time but blocked + and offline states are only entered explicitly. + +KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_DATA + Sets all fields of the vCPU runstate data from the .u.runstate member + of the structure, including the current runstate. The state_entry_time + must equal the sum of the other four times. + +KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST + This *adds* the contents of the .u.runstate members of the structure + to the corresponding members of the given vCPU's runstate data, thus + permitting atomic adjustments to the runstate times. The adjustment + to the state_entry_time must equal the sum of the adjustments to the + other four times. The state field must be set to -1, or to a valid + runstate value (RUNSTATE_running, RUNSTATE_runnable, RUNSTATE_blocked + or RUNSTATE_offline) to set the current accounted state as of the + adjusted state_entry_time. + +4.130 KVM_XEN_VCPU_GET_ATTR +--------------------------- + +:Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO +:Architectures: x86 +:Type: vcpu ioctl +:Parameters: struct kvm_xen_vcpu_attr +:Returns: 0 on success, < 0 on error + +Allows Xen vCPU attributes to be read. For the structure and types, +see KVM_XEN_VCPU_SET_ATTR above. + +The KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST type may not be used +with the KVM_XEN_VCPU_GET_ATTR ioctl. 5. The kvm_run structure ======================== @@ -4893,9 +5005,12 @@ local APIC is not used. __u16 flags; More architecture-specific flags detailing state of the VCPU that may -affect the device's behavior. The only currently defined flag is -KVM_RUN_X86_SMM, which is valid on x86 machines and is set if the -VCPU is in system management mode. +affect the device's behavior. Current defined flags:: + + /* x86, set if the VCPU is in system management mode */ + #define KVM_RUN_X86_SMM (1 << 0) + /* x86, set if bus lock detected in VM */ + #define KVM_RUN_BUS_LOCK (1 << 1) :: @@ -4996,13 +5111,18 @@ to the byte array. .. note:: - For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR, + For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR, KVM_EXIT_XEN, KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding operations are complete (and guest state is consistent) only after userspace has re-entered the kernel with KVM_RUN. The kernel side will first finish - incomplete operations and then check for pending signals. Userspace - can re-enter the guest with an unmasked signal pending to complete - pending operations. + incomplete operations and then check for pending signals. + + The pending state of the operation is not preserved in state which is + visible to userspace, thus userspace should ensure that the operation is + completed before performing a live migration. Userspace can re-enter the + guest with an unmasked signal pending or with the immediate_exit field set + to complete pending operations without allowing any further instructions + to be executed. :: @@ -5329,6 +5449,34 @@ vCPU execution. If the MSR write was unsuccessful, user space also sets the :: + + struct kvm_xen_exit { + #define KVM_EXIT_XEN_HCALL 1 + __u32 type; + union { + struct { + __u32 longmode; + __u32 cpl; + __u64 input; + __u64 result; + __u64 params[6]; + } hcall; + } u; + }; + /* KVM_EXIT_XEN */ + struct kvm_hyperv_exit xen; + +Indicates that the VCPU exits into userspace to process some tasks +related to Xen emulation. + +Valid values for 'type' are: + + - KVM_EXIT_XEN_HCALL -- synchronously notify user-space about Xen hypercall. + Userspace is expected to place the hypercall result into the appropriate + field before invoking KVM_RUN again. + +:: + /* Fix the size of the union. */ char padding[256]; }; @@ -6038,6 +6186,53 @@ KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit notifications which user space can then handle to implement model specific MSR handling and/or user notifications to inform a user that an MSR was not handled. +7.22 KVM_CAP_X86_BUS_LOCK_EXIT +------------------------------- + +:Architectures: x86 +:Target: VM +:Parameters: args[0] defines the policy used when bus locks detected in guest +:Returns: 0 on success, -EINVAL when args[0] contains invalid bits + +Valid bits in args[0] are:: + + #define KVM_BUS_LOCK_DETECTION_OFF (1 << 0) + #define KVM_BUS_LOCK_DETECTION_EXIT (1 << 1) + +Enabling this capability on a VM provides userspace with a way to select +a policy to handle the bus locks detected in guest. Userspace can obtain +the supported modes from the result of KVM_CHECK_EXTENSION and define it +through the KVM_ENABLE_CAP. + +KVM_BUS_LOCK_DETECTION_OFF and KVM_BUS_LOCK_DETECTION_EXIT are supported +currently and mutually exclusive with each other. More bits can be added in +the future. + +With KVM_BUS_LOCK_DETECTION_OFF set, bus locks in guest will not cause vm exits +so that no additional actions are needed. This is the default mode. + +With KVM_BUS_LOCK_DETECTION_EXIT set, vm exits happen when bus lock detected +in VM. KVM just exits to userspace when handling them. Userspace can enforce +its own throttling or other policy based mitigations. + +This capability is aimed to address the thread that VM can exploit bus locks to +degree the performance of the whole system. Once the userspace enable this +capability and select the KVM_BUS_LOCK_DETECTION_EXIT mode, KVM will set the +KVM_RUN_BUS_LOCK flag in vcpu-run->flags field and exit to userspace. Concerning +the bus lock vm exit can be preempted by a higher priority VM exit, the exit +notifications to userspace can be KVM_EXIT_BUS_LOCK or other reasons. +KVM_RUN_BUS_LOCK flag is used to distinguish between them. + +7.23 KVM_CAP_PPC_DAWR1 +---------------------- + +:Architectures: ppc +:Parameters: none +:Returns: 0 on success, -EINVAL when CPU doesn't support 2nd DAWR + +This capability can be used to check / enable 2nd DAWR feature provided +by POWER10 processor. + 8. Other capabilities. ====================== @@ -6415,7 +6610,6 @@ guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf (0x40000001). Otherwise, a guest may use the paravirtual features regardless of what has actually been exposed through the CPUID leaf. - 8.29 KVM_CAP_DIRTY_LOG_RING --------------------------- @@ -6502,3 +6696,34 @@ KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG. After enabling KVM_CAP_DIRTY_LOG_RING with an acceptable dirty ring size, the virtual machine will switch to ring-buffer dirty page tracking and further KVM_GET_DIRTY_LOG or KVM_CLEAR_DIRTY_LOG ioctls will fail. + +8.30 KVM_CAP_XEN_HVM +-------------------- + +:Architectures: x86 + +This capability indicates the features that Xen supports for hosting Xen +PVHVM guests. Valid flags are:: + + #define KVM_XEN_HVM_CONFIG_HYPERCALL_MSR (1 << 0) + #define KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL (1 << 1) + #define KVM_XEN_HVM_CONFIG_SHARED_INFO (1 << 2) + #define KVM_XEN_HVM_CONFIG_RUNSTATE (1 << 2) + +The KVM_XEN_HVM_CONFIG_HYPERCALL_MSR flag indicates that the KVM_XEN_HVM_CONFIG +ioctl is available, for the guest to set its hypercall page. + +If KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL is also set, the same flag may also be +provided in the flags to KVM_XEN_HVM_CONFIG, without providing hypercall page +contents, to request that KVM generate hypercall page content automatically +and also enable interception of guest hypercalls with KVM_EXIT_XEN. + +The KVM_XEN_HVM_CONFIG_SHARED_INFO flag indicates the availability of the +KVM_XEN_HVM_SET_ATTR, KVM_XEN_HVM_GET_ATTR, KVM_XEN_VCPU_SET_ATTR and +KVM_XEN_VCPU_GET_ATTR ioctls, as well as the delivery of exception vectors +for event channel upcalls when the evtchn_upcall_pending field of a vcpu's +vcpu_info is set. + +The KVM_XEN_HVM_CONFIG_RUNSTATE flag indicates that the runstate-related +features KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR/_CURRENT/_DATA/_ADJUST are +supported by the KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTR ioctls. diff --git a/Documentation/virt/kvm/arm/hyp-abi.rst b/Documentation/virt/kvm/arm/hyp-abi.rst index 83cadd8186fa..4d43fbc25195 100644 --- a/Documentation/virt/kvm/arm/hyp-abi.rst +++ b/Documentation/virt/kvm/arm/hyp-abi.rst @@ -58,6 +58,15 @@ these functions (see arch/arm{,64}/include/asm/virt.h): into place (arm64 only), and jump to the restart address while at HYP/EL2. This hypercall is not expected to return to its caller. +* :: + + x0 = HVC_VHE_RESTART (arm64 only) + + Attempt to upgrade the kernel's exception level from EL1 to EL2 by enabling + the VHE mode. This is conditioned by the CPU supporting VHE, the EL2 MMU + being off, and VHE not being disabled by any other means (command line + option, for example). + Any other value of r0/x0 triggers a hypervisor-specific handling, which is not documented here. diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst index b21a34c34a21..0aa4817b466d 100644 --- a/Documentation/virt/kvm/locking.rst +++ b/Documentation/virt/kvm/locking.rst @@ -16,7 +16,14 @@ The acquisition orders for mutexes are as follows: - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring them together is quite rare. -On x86, vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock. +On x86: + +- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock + +- kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock is + taken inside kvm->arch.mmu_lock, and cannot be taken without already + holding kvm->arch.mmu_lock (typically with ``read_lock``, otherwise + there's no need to take kvm->arch.tdp_mmu_pages_lock at all). Everything else is a leaf: no other lock is taken inside the critical sections. diff --git a/Documentation/virt/kvm/s390-pv-boot.rst b/Documentation/virt/kvm/s390-pv-boot.rst index 8b8fa0390409..ad1f7866c001 100644 --- a/Documentation/virt/kvm/s390-pv-boot.rst +++ b/Documentation/virt/kvm/s390-pv-boot.rst @@ -80,5 +80,5 @@ Keys ---- Every CEC will have a unique public key to enable tooling to build encrypted images. -See `s390-tools <https://github.com/ibm-s390-tools/s390-tools/>`_ +See `s390-tools <https://github.com/ibm-s390-linux/s390-tools/>`_ for the tooling. |