KVM RAS Test Suite
==================
The KVM RAS Test Suite is a collection of test scripts for testing the
Linux kernel MCE processing features in a KVM guest system.

Jan 26th, 2010

Jiajia Zheng


In the Package
----------------

Here is a short description of what is included in the package:

host/*
    Contains host test scripts, which drive the test procedure on the host system.
guest/*
    Contains guest test scripts, which drive the test procedure on the guest system.

Dependencies
----------------

The KVM RAS Test Suite has the following dependencies on the kernel and other tools:

* Linux Kernel:
    Version 2.6.32 or newer, with MCA high level handlers enabled.

* mce-inject:
    A tool to inject MCE errors into the kernel.

* page-types:
    A tool to query page types, which ships with the Linux kernel
    source (2.6.32 or newer, $KERNEL_SRC/Documentation/vm/page-types.c).

* simple_process:
    A process that constantly accesses its allocated memory. (../tools/simple_process)

* kpartx:
    A tool to list partition mappings from partition tables.

* (optionally) losetup:
    A tool to set up and control loop devices.

* (optionally) pvdisplay/vgchange:
    Tools to display physical volume attributes and change volume group attributes.

* (optionally) ssh-keygen:
    A tool for authentication key generation, management and conversion.

Test method
---------------
- Start a process in the guest OS and get a virtual address from it.

- Translate this address until we get the physical address on the host OS.

- Inject an SRAO MCE in software at that physical address from the host OS.

- (optionally) Write to the address from the guest, i.e. attempt to
  write to the poisoned page from the guest.

The expected result:

HOST system dmesg:

...
Machine check injector initialized
Triggering MCE exception on CPU 0
Disabling lock debugging due to kernel taint
[Hardware Error]: Machine check events logged
MCE exception done on CPU 0
MCE 0x806324: Killing qemu-system-x86:8829 early due to hardware memory corruption
MCE 0x806324: dirty LRU page recovery: Recovered
...


GUEST system dmesg:
...
[Hardware Error]: Machine check events logged
MCE 0x75925: Killing simple_process:2273 early due to hardware memory corruption
MCE 0x75925: dirty LRU page recovery : Recovered
...


Installation
---------------
1. Build the *host* kernel with
     CONFIG_KVM=y
     CONFIG_KVM_INTEL=y
     CONFIG_X86_MCE=y
     CONFIG_X86_MCE_INTEL=y
     CONFIG_X86_MCE_INJECT=y or CONFIG_X86_MCE_INJECT=m

   and the following config on both *host* and *guest*
     CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
     CONFIG_MEMORY_FAILURE=y

   NOTE: if the host machine doesn't support software error recovery
   (MCG_SER_P in IA32_MCG_CAP[24]), please apply the patch fake_ser_p.patch
   under ./patches/
2. Use ssh-keygen to generate public and private keys on the host OS,
   and copy id_rsa and id_rsa.pub from ~/.ssh/ into the testing directory
   on the host system.
3. Compile and install qemu-kvm
   The qemu-kvm source can be obtained from git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git.
   Before compiling qemu-kvm, apply the patch p2v.patch, which
   is located under ./patches/
   Please ensure the *SDL-devel* library is installed on the host machine; otherwise
   only VNC can be used in place of local graphic output.
4. Install the guest OS in qemu
   e.g.
     step 1: qemu-img create -f qcow2 test.img 10G
     step 2: qemu-system-x86_64 -hda ./test.img -m 2048 -cdrom rhel6.iso -boot d
   After the installation, please be sure to perform the following checks:
   a) Add the necessary command line parameters to your boot entry.
      This enables console output redirection to the serial port.
      e.g.
      before:
        title Red Hat Enterprise Linux Server (2.6.32kvm)
        root (hd0,1)
        kernel /boot/vmlinuz-2.6.32kvm ro root=/dev/sda1
        initrd /boot/initramfs-2.6.32kvm.img
      after:
        title Red Hat Enterprise Linux Server (2.6.32kvm)
        root (hd0,1)
        kernel /boot/vmlinuz-2.6.32kvm ro root=/dev/sda1 console=tty0 console=ttyS0,115200n8
        initrd /boot/initramfs-2.6.32kvm.img
   b) Configure the guest ethernet card via DHCP. This ensures that the
      network connection works.
      e.g.
        bash> dhclient eth0
   c) Enable SSH public/private key authentication; otherwise, the SSH password
      has to be entered during the test run.
      Please check /etc/ssh/sshd_config and ensure the related options are enabled.
      e.g.
        RSAAuthentication yes
        PubkeyAuthentication yes
      After the related options are enabled, please restart the ssh service
      e.g.
        bash> service sshd restart


Start Testing
---------------
Run the tests with
    ./host_run.sh <option> <argument>
You can get the help information with
    ./host_run.sh -h
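
Before starting a run, it can help to confirm that the kernel options listed
under Installation are actually set. The following is a minimal sketch, not
part of the test suite: the helper names check_kernel_config, check_host_config
and check_guest_config are hypothetical, and the script only assumes that the
kernel configuration is available as a plain-text file.

```shell
#!/bin/sh
# Sketch only: these helpers are illustrative and do not ship with the suite.

# check_kernel_config FILE PATTERN...
# Prints every pattern that has no exactly matching line in FILE and
# returns non-zero if at least one pattern is missing.
check_kernel_config() {
    config_file=$1
    shift
    missing=0
    for pattern in "$@"; do
        # Each pattern is an extended regex anchored to a whole line.
        if ! grep -Eq "^${pattern}\$" "$config_file"; then
            echo "missing: ${pattern}"
            missing=1
        fi
    done
    return "$missing"
}

# Options required for the host kernel (Installation, step 1).
# CONFIG_X86_MCE_INJECT may be built in (=y) or built as a module (=m).
check_host_config() {
    check_kernel_config "$1" \
        'CONFIG_KVM=y' 'CONFIG_KVM_INTEL=y' \
        'CONFIG_X86_MCE=y' 'CONFIG_X86_MCE_INTEL=y' \
        'CONFIG_X86_MCE_INJECT=[ym]' \
        'CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y' 'CONFIG_MEMORY_FAILURE=y'
}

# Options required for the guest kernel (Installation, step 1).
check_guest_config() {
    check_kernel_config "$1" \
        'CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y' 'CONFIG_MEMORY_FAILURE=y'
}
```

After sourcing the script, the host could be checked with, for example,
    check_host_config "/boot/config-$(uname -r)"
The /boot/config-* location is an assumption that holds on many distributions;
pass the .config file of your kernel build tree if yours differs.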