KVM RAS Test Suite
==================
The KVM RAS Test Suite is a collection of test scripts for testing the
Linux kernel MCE processing features in a KVM guest system.

Jan 26th, 2010

Jiajia Zheng


In the Package
----------------

Here is a short description of what is included in the package:

host/*
	Contains host test scripts, which drive the test procedure on the host system.
guest/*
	Contains guest test scripts, which drive the test procedure on the guest system.

Dependencies
----------------

The KVM RAS Test Suite has the following dependencies on the kernel and other tools:

* Linux Kernel:
  Version 2.6.32 or newer, with MCA high level handlers enabled.

* mce-inject:
  A tool to inject MCE errors into the kernel.

* page-types:
  A tool to query page types, which ships with the Linux kernel
  source (2.6.32 or newer, $KERNEL_SRC/Documentation/vm/page-types.c).

* simple_process:
  A process that constantly accesses its allocated memory. (../tools/simple_process)

* kpartx:
  A tool to list partition mappings from partition tables.

* (optionally) losetup:
  A tool to set up and control loop devices.

* (optionally) pvdisplay/vgchange:
  Tools to display physical volume attributes and change volume group attributes.

* (optionally) ssh-keygen:
  A tool for authentication key generation, management and conversion.
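
Before running the suite it can help to confirm that the tools listed above
are reachable. The following is a minimal sketch, not part of the suite:
missing_tools is a hypothetical helper, and it assumes every tool is
installed somewhere on PATH, which may not hold for page-types or
simple_process if they are run from their build directories.

```sh
# Print each named tool that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names are taken from the dependency list; the PATH lookup is an assumption.
missing=$(missing_tools mce-inject page-types simple_process kpartx)
if [ -n "$missing" ]; then
    echo "missing required tools:" $missing
fi
```

The optional tools (losetup, pvdisplay, vgchange, ssh-keygen) can be passed
to the same helper when they are needed.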

Test method
---------------
- Start a process in the guest OS and get a virtual address from it.

- Translate this address until we get the physical address on the host OS.

- Inject an SRAO MCE in software at that physical address from the host OS.

- (optionally) Write to the address from the guest, i.e. attempt to
  write to the poisoned page from the guest.
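
The last stage of the address translation is plain page arithmetic: once
the page frame number (PFN) backing a page is known, the physical address
is the PFN shifted by the page size, combined with the offset inside the
page. A minimal sketch, assuming 4 KiB pages; pfn_to_paddr is a
hypothetical helper, and how the suite itself obtains the PFN is not
shown here. The input values are arbitrary examples.

```sh
# Combine a page frame number with the in-page offset of a virtual address.
# Assumes 4 KiB pages, i.e. 12 offset bits.
pfn_to_paddr() {
    printf '0x%x\n' $(( ($1 << 12) | ($2 & 0xfff) ))
}

pfn_to_paddr 0x806324 0x7f3a12345abc   # prints 0x806324abc
```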

The expected result:

HOST system dmesg:

...
Machine check injector initialized
Triggering MCE exception on CPU 0
Disabling lock debugging due to kernel taint
[Hardware Error]: Machine check events logged
MCE exception done on CPU 0
MCE 0x806324: Killing qemu-system-x86:8829 early due to hardware memory corruption
MCE 0x806324: dirty LRU page recovery: Recovered
...


GUEST system dmesg:
...
[Hardware Error]: Machine check events logged
MCE 0x75925: Killing simple_process:2273 early due to hardware memory corruption
MCE 0x75925: dirty LRU page recovery : Recovered
...
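
Checking for the expected result can be scripted by searching the kernel
log for the recovery line. This is a sketch only: recovered_in_log is a
hypothetical helper, and the pattern is derived from the two sample logs
above, so other kernel versions may word the message differently.

```sh
# Succeed if the log on stdin contains a memory-failure "Recovered" line.
# The pattern tolerates the optional space before the colon seen in the guest log.
recovered_in_log() {
    grep -Eq 'MCE 0x[0-9a-f]+: .* recovery ?: Recovered'
}

# Typical use on either system:
#   dmesg | recovered_in_log && echo "page recovered"
```

The "Killing ..." lines are deliberately not matched, since they report the
signal sent to the process, not the outcome of the page recovery.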


Installation
---------------
1. Build the *host* kernel with
	CONFIG_KVM=y
	CONFIG_KVM_INTEL=y
	CONFIG_X86_MCE=y
	CONFIG_X86_MCE_INTEL=y
	CONFIG_X86_MCE_INJECT=y or CONFIG_X86_MCE_INJECT=m

   and the following config on both *host* and *guest*
	CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
	CONFIG_MEMORY_FAILURE=y

   NOTE: if the host machine doesn't support software error recovery
   (MCG_SER_P in IA32_MCG_CAP[24]), please apply the patch fake_ser_p.patch
   under ./patches/
2. Use ssh-keygen to generate public and private keys on the host OS,
   and copy id_rsa and id_rsa.pub from ~/.ssh/ into the testing directory
   on the host system.
3. Compile and install qemu-kvm.
   The qemu-kvm source can be obtained from git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git.
   Before compiling qemu-kvm, apply the patch p2v.patch, which
   is located under ./patches/
   Please ensure the *SDL-devel* library is installed on the host machine; otherwise
   only VNC can be used as a substitute for local graphic output.
4. Install the guest OS on qemu,
   e.g.
   step 1: qemu-img create -f qcow2 test.img 10G
   step 2: qemu-system-x86_64 -hda ./test.img -m 2048 -cdrom rhel6.iso -boot d
   After the installation, be sure to perform the following checks:
     a) Add the necessary command line parameters to your boot entry.
        This enables console output redirection to the serial port.
        e.g.
          before:
            title Red Hat Enterprise Linux Server (2.6.32kvm)
                root (hd0,1)
                kernel /boot/vmlinuz-2.6.32kvm ro root=/dev/sda1
                initrd /boot/initramfs-2.6.32kvm.img
          after:
            title Red Hat Enterprise Linux Server (2.6.32kvm)
                root (hd0,1)
                kernel /boot/vmlinuz-2.6.32kvm ro root=/dev/sda1 console=tty0 console=ttyS0,115200n8
                initrd /boot/initramfs-2.6.32kvm.img
     b) Configure the guest ethernet card via DHCP to ensure the network
        connection works.
        e.g.
          bash> dhclient eth0
     c) Enable SSH public/private key authentication; otherwise the SSH password
        has to be entered during the test.
        Check /etc/ssh/sshd_config and ensure the related options are enabled.
        e.g.
          RSAAuthentication yes
          PubkeyAuthentication yes
        After the related options are enabled, restart the ssh service,
        e.g.
          bash> service sshd restart
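
The kernel prerequisites in step 1 can be verified mechanically. The
sketch below is illustrative only: missing_kconfig and has_ser_p are
hypothetical helpers, the /boot/config-$(uname -r) path is an assumption
that holds on many but not all distributions, and reading the MSR with
rdmsr (from msr-tools) at address 0x179 is likewise an assumption about
the host setup.

```sh
# Print each pattern that matches no line of the given kernel config file.
# Arguments: config file, then one extended regex per required option line.
missing_kconfig() {
    cfg=$1; shift
    for pattern in "$@"; do
        grep -Eq "^${pattern}\$" "$cfg" || printf '%s\n' "$pattern"
    done
}

# Succeed if bit 24 (MCG_SER_P) is set in the given IA32_MCG_CAP value.
has_ser_p() {
    [ $(( ($1 >> 24) & 1 )) -eq 1 ]
}

# Possible use on the host (paths and rdmsr are assumptions, see above):
#   missing_kconfig "/boot/config-$(uname -r)" \
#       CONFIG_KVM=y CONFIG_KVM_INTEL=y CONFIG_X86_MCE=y CONFIG_X86_MCE_INTEL=y \
#       'CONFIG_X86_MCE_INJECT=[ym]' \
#       CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y CONFIG_MEMORY_FAILURE=y
#   has_ser_p "0x$(rdmsr 0x179)" || echo "apply fake_ser_p.patch"
```

The bracket pattern is quoted so that the shell does not treat it as a file
glob; it accepts either the built-in or the modular form of the injector.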


Start Testing
---------------
Run the tests with
	./host_run.sh <option> <argument>
You can get help information with
	./host_run.sh -h