MCE Stress Test HOWTO
====================

Oct 10th, 2009

Haicheng Li

Abstract
--------

This document explains the design and structure of the MCE stress test
suite, the kernel configuration and user space tools required for automated
stress testing, and how to use the suite.

0. Quick Shortcut
-----------------

- Install a Linux kernel (2.6.32 or newer) with full MCA recovery support.
  Make sure the following configuration options are enabled:
        CONFIG_X86_MCE=y
        CONFIG_MEMORY_FAILURE=y
  With these two options enabled, you can do stress testing through the
  madvise syscall (sec 4.1).

- Install the page-types tool (sec 3.3), which ships with the Linux kernel
  source (2.6.32 or newer).
        # cd $KERNEL_SRC/Documentation/vm/
        # gcc -o page-types page-types.c
        # cp page-types /usr/bin/

- Get the latest LTP (Linux Test Project) image from http://ltp.sf.net.
  Refer to the INSTALL file of LTP to install LTP on your machine.

- Build and run stress testing
        # make
        # cd stress
        # ./hwpoison.sh -d $YOUR_PARTITION -M -o $YOUR_LTP_DIR -N
  Note that '-d $YOUR_PARTITION' is a mandatory option. The test creates
  all temporary files on $YOUR_PARTITION, and error injection only affects
  the pages associated with $YOUR_PARTITION. So you must provide a free
  disk partition to the stress test driver!

  This does the stress testing through the madvise syscall (sec 4.1).
  However, more advanced test methods are also provided (sec 4.2, 4.3).

Note, for all examples in the rest of this doc, it is assumed that $PWD is
the stress subdir.

1. Overview
-----------

The MCE stress test suite is a collection of tools and test scripts
intended to stress test the Linux kernel MCA high level handlers, which
include HWPoison page recovery, soft page offline, and so on.

In general, this test suite is designed to do stress testing through
various test interfaces, i.e. the madvise syscall, the HWPoison page
injector, and the APEI injector (see ACPI4.0 spec).
It also supports most of the popular Linux file systems (FS); that is,
there is an option to specify which FS type the test should run on.

If you just want to start testing as quickly as possible, you can skip
sections 2 and 3 and go directly to section 4.

2. Design Details
-----------------

The MCE stress test suite consists of four parts: test driver, workload
controller, customized workloads, and background workloads. The main test
idea is as follows:

- The test driver launches various customized workloads to continuously
  generate lots of pages with the expected page states. Note that all of
  these workloads know their expected results, which should not be
  affected by the Linux MCE high level handlers.

- The test driver then injects MCE errors into these pages through either
  the madvise syscall, the HWPoison injector, or the APEI injector. While
  the Linux kernel is handling these MCE errors, all the workloads
  continue running normally.

- After a long run, the test driver collects the test result of each
  workload to see if any unexpected failures happened. In this way, it can
  decide whether a bug has been found.

- If any system panic or FS corruption happens, there must be a bug. This
  is the bottom line for deciding whether the test passes.

2.1 Test Driver

The test driver (a.k.a. hwpoison.sh) drives the whole test procedure. It is
responsible for managing the test environment, setting up the error
injection interface, controlling test progress, launching workloads,
injecting page errors, as well as recording test logs and reporting the
test result.

For detailed usage of the hwpoison.sh test driver, please refer to:
        # ./hwpoison.sh -h

2.2 Workload Controller

The workload controller needs to keep various test workloads running in
parallel and continuously for a required duration. We select the ltp-pan
program of the Linux Test Project (LTP) as the workload controller of this
stress test suite.

The test driver (hwpoison.sh) interacts with ltp-pan in the following ways:
- hwpoison.sh generates a test config file that lists the workload types
  to be launched by ltp-pan.
- hwpoison.sh also passes the test duration and other workload specific
  parameters to ltp-pan via the test config file.
- ltp-pan makes each workload run and finish in time; the test driver can
  then get the result of each workload via the corresponding result files.
- Finally, hwpoison.sh decides the overall test result based on each
  workload result, and reports the final result.

2.3 Customized Workloads

There are three types of customized workloads, which are intended to
generate pages with various page states.
* Type0: page-poisoning workload, meant to cover:
  - anonymous page operations.
  - file data operations.
* Type1: fs-metadata workload, meant to cover:
  - inode operations.
* Type2: fs_type specific workload, meant to cover:
  - extended functions of some special FS.

2.4 Background Workloads

LTP is selected as the background workload to simulate normal system
operations in the background while stress testing is running. Besides LTP,
there are also some alternatives, like AIM. We might add more background
workloads in the future.

2.5 Test Result

How to determine that stress testing passes?
- At the very least, no kernel panic happens during stress testing.
- fsck on the target disk at the end of stress testing passes.
- No failure is found by the customized workloads, especially by the
  page-poisoning workload.

Where to get the detailed test result?
- When stress testing is done, the general test result is recorded in
  result/hwpoison.result, and the general test log is in
  result/hwpoison.log. However, you can specify other locations as
  follows:
        # ./hwpoison.sh -r $YOUR_RESULT -l $YOUR_LOG
- The test result and test log of each workload are recorded as
  log/$workload/$workload.result and log/$workload/$workload.log.
  For example, for the page-poisoning workload, the test result and test
  log are log/page-poisoning/page-poisoning.result and
  log/page-poisoning/page-poisoning.log.
- Besides, under each workload result dir, you can find extra logs such as
  pan_log and pan_output. These logs are generated by the ltp-pan workload
  controller. Usually they can help you understand what has been going on
  with ltp-pan while the workload is running. Please refer to the ltp-pan
  doc for details.

3. Tools
--------

3.1 page-poisoning

This is the page-poisoning workload. It is an extension of the tinjpage
test program with a multi-process model. It spawns thousands of processes
that inject HWPoison errors into various pages simultaneously through the
madvise syscall. It then checks whether these errors get handled correctly,
i.e. whether each test process receives or doesn't receive a SIGBUS signal
as expected.

For more info about the page-poisoning workload, please read the README
file under stress/tools/page-poisoning/.

3.2 fs-metadata

This is the fs-metadata workload. fs-metadata is designed to test inode
operations under heavy load and make sure every inode operation gets the
expected result. In detail, it first generates a huge directory hierarchy
on the target disk, then performs unlink operations on this directory
hierarchy and duplicates a copy of the directory, and finally checks
whether these two directories are the same, as expected.

For more info about the fs-metadata workload, please read the README file
under stress/tools/fs-metadata/.

3.3 page-types

page-types is a tool to query the page type of every memory page in the
system. We use it to select pages with the required page types. The test
injects errors into these pages via the error injector, although the page
filter of the HWPoison handler in the Linux kernel filters them a second
time. Note that the only reason to do this first pass of filtering with
page-types is performance.

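The first pass of filtering described above can be sketched with a small
wrapper around page-types. This is only an illustration and not necessarily
how hwpoison.sh invokes the tool: the function name list_pages and the
PAGE_TYPES variable are made up for this example, and the options used
(-p, -b, -l, -N) should be checked against "page-types -h" for your kernel
version.

```shell
#!/bin/sh
# Illustrative only; hwpoison.sh may invoke page-types differently.
# PAGE_TYPES (hypothetical variable) can point at another binary.
PAGE_TYPES=${PAGE_TYPES:-page-types}

# list_pages PID BITS   (hypothetical helper)
#   PID   process whose address space is walked (-p)
#   BITS  bits-spec selecting page flags (-b); see "page-types -h"
# -l lists the matching pages in ranges, -N suppresses the summary.
list_pages() {
    "$PAGE_TYPES" -p "$1" -b "$2" -l -N
}
```

For example, "list_pages $$ lru,dirty" would list the matching pages of the
current shell; reading page flags usually requires root.
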
To install page-types on your test machine:
        # cd $KERNEL_SRC/Documentation/vm/
        # gcc -o page-types page-types.c
        # cp page-types /usr/bin/

3.4 ltp-pan

This is the workload controller of this stress test suite. ltp-pan is the
test harness of LTP (Linux Test Project) and is included in the LTP
package. For more information, please refer to the ltp-pan document of LTP.

4. Usage Guide
--------------

This section shows how to conduct the stress testing through the various
test interfaces. As an example, we run stress testing on partition
/dev/sda1 for 1 hour. Note that we've installed LTP to /ltp.

4.1 Stress Test through Madvise Syscall

To run this stress testing, you need to strictly follow the test
instructions below.

* Test instructions:
  - make sure the following kernel options are enabled:
        CONFIG_X86_MCE=y
        CONFIG_MEMORY_FAILURE=y
  - build and run stress testing
        # make
        # ./hwpoison.sh -d $YOUR_PARTITION -M -o $YOUR_LTP_DIR

* Example:
  - launch testing
        # ./hwpoison.sh -d /dev/sda1 -M -t 3600
  - general test results
        result: result/hwpoison.result
        logs:   result/hwpoison.log
  - detailed workload results
        result: log/page-poisoning/page-poisoning.result
        log:    log/page-poisoning/page-poisoning.log

4.2 Stress Test through HWPoison Page Injector

This is the default test method of this stress test suite. To run this
stress testing, you need to strictly follow the test instructions below.

* Test instructions:
  - make sure the following kernel options are enabled:
        CONFIG_X86_MCE=y
        CONFIG_MEMORY_FAILURE=y
        CONFIG_DEBUG_KERNEL=y
        CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
        CONFIG_HWPOISON_INJECT=y
  - build and run stress testing
        # make
        # ./hwpoison.sh -d $YOUR_PARTITION -o $YOUR_LTP_DIR -L

* Example:
  - launch testing
        # ./hwpoison.sh -d /dev/sda1 -t 3600 -L
  - general test results
        result: result/hwpoison.result
        logs:   result/hwpoison.log
  - detailed workload results
        fs-metadata result: log/fs-metadata/fs-metadata.result
        fs-metadata log:    log/fs-metadata/fs-metadata.log
        ltp result:         log/ltp/ltp.result
        ltp log:            log/ltp/ltp.log
        fs-specific result: log/fs-specific/fs-specific.result
        fs-specific log:    log/fs-specific/fs-specific.log

4.3 Stress Test through APEI Injector

To run this stress testing, you need to follow the test instructions below.

* Test instructions:
  - make sure the following kernel options are enabled:
        CONFIG_X86_MCE=y
        CONFIG_X86_MCE_INTEL=y
        CONFIG_MEMORY_FAILURE=y
        CONFIG_ACPI_APEI=y
        CONFIG_ACPI_APEI_EINJ=y
  - build and run stress testing
        # make
        # ./hwpoison.sh -d $YOUR_PARTITION -o $YOUR_LTP_DIR -L -A

* Example:
  - launch testing
        # ./hwpoison.sh -d /dev/sda1 -t 3600 -L -A
  - general test results
        result: result/hwpoison.result
        logs:   result/hwpoison.log
  - detailed workload results
        fs-metadata result: log/fs-metadata/fs-metadata.result
        fs-metadata log:    log/fs-metadata/fs-metadata.log
        ltp result:         log/ltp/ltp.result
        ltp log:            log/ltp/ltp.log
        fs-specific result: log/fs-specific/fs-specific.result
        fs-specific log:    log/fs-specific/fs-specific.log

5. FAQs
-------

Here is a collection of frequently asked questions:

Q: How to tell the test driver not to format my disk partition?
A: Use the option '-N'.

Q: Can the three types of tests run on the same system simultaneously?
A: No. There are limitations in Linux kernel HWPoison page filtering.

Q: Can I run this stress testing on multiple disks in parallel?
A: Yes. But it requires updated kernel patches for HWPoison page filtering.
   Currently, it only supports running the same test with the same
   pagetype flags specified.
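
The kernel options listed in sec 4.1, 4.2 and 4.3 can be checked before
launching a run. The following is only a sketch and not part of the test
suite: the function name missing_options and the *_OPTS variables are made
up for this example, and the location of the kernel config file (for
instance /boot/config-$(uname -r) on many distributions) is an assumption
that depends on your system.

```shell
#!/bin/sh
# Hypothetical helper, not part of the test suite: print every required
# kernel option that is not enabled in the given kernel config file.
# Usage: missing_options CONFIG_FILE OPTION...
missing_options() {
    config_file=$1
    shift
    for opt in "$@"; do
        # An enabled built-in option appears as a line "OPTION=y".
        # Options built as modules (=m) are reported as missing here,
        # because this document lists every option as =y.
        grep -q "^${opt}=y\$" "$config_file" || echo "$opt"
    done
}

# Option lists copied from sec 4.1, 4.2 and 4.3 of this document.
MADVISE_OPTS="CONFIG_X86_MCE CONFIG_MEMORY_FAILURE"
HWPOISON_OPTS="$MADVISE_OPTS CONFIG_DEBUG_KERNEL CONFIG_CGROUP_MEM_RES_CTLR_SWAP CONFIG_HWPOISON_INJECT"
APEI_OPTS="CONFIG_X86_MCE CONFIG_X86_MCE_INTEL CONFIG_MEMORY_FAILURE CONFIG_ACPI_APEI CONFIG_ACPI_APEI_EINJ"
```

For example, "missing_options /boot/config-$(uname -r) $HWPOISON_OPTS"
prints nothing when all options needed for sec 4.2 are enabled.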