* Add more test cases. Categories we'd like to cover (with reasonably
  real-world tests, preferably not microbenchmarks) include:

  (X marks the ones that are fairly well covered now.)

  X math (general)
  X bitops
  X 3-d (the math bits)
  - crypto / encoding
  X string processing
  - regexps
  - date processing
  - array processing
  - control flow
  - function calls / recursion
  - object access (unclear if it is possible to make a realistic
    benchmark that isolates this)

  I'd specifically like to add all the computer language shootout
  tests that Mozilla is using.

* Normalize tests. Most of the test cases available have a repeat
  count of some sort, so the time they take can be tuned. The tests
  should be tuned so that each category contributes about the same
  total time, and so that each test within a category contributes
  about the same amount. The open question is which implementation
  should be the baseline. My current thought is to either pick a
  specific browser on a specific platform (IE 7 or Firefox 2,
  perhaps), or to target the average that some set of same-generation
  release browsers get on each test. The latter is more work. IE 7 is
  probably a reasonable normalization target, since it is the latest
  version of the most popular browser; results on this benchmark
  would then tell you how much you stand to gain or lose by using a
  different browser. (A rough sketch of the repeat-count tuning is at
  the end of this file.)

* Instead of using the standard error directly, the correct way to
  calculate a 95% confidence interval for a small sample is to use
  the t-distribution <http://en.wikipedia.org/wiki/Student%27s_t-test>.
  Basically, this means multiplying the standard error by a value
  from a two-tailed t-distribution table (indexed by degrees of
  freedom, i.e. sample size minus one) instead of by 1.96, the value
  for the normal distribution. A table is available at
  <http://www.medcalc.be/manual/t-distribution.php>. (A sketch of
  this calculation is also at the end of this file.)

* Add support to compare two different engines (or two builds of the
  same engine) interleaved.

* Add support to compare two existing sets of saved results.

* Allow the repeat count to be controlled from the browser-hosted
  version and from the WebKitTools wrapper script.

* Add support for running only a subset of the tests (in both the
  command-line and web versions).

* Add a profile mode to the command-line version that runs the tests
  repeatedly in the same command-line interpreter instance, for ease
  of profiling.

* Make the browser-hosted version prettier, both in general design
  and perhaps by using bar graphs for the output.

* Make it possible to track changes over time and to generate a graph
  per result, showing the result and its error bar for each version.

* Hook up to automated testing / buildbot infrastructure.

* Possibly... add the ability to download iBench from its original
  server, pull out the JS test content, preprocess it, and add it as
  a category to the benchmark.

* Profit.
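
Sketch of the repeat-count tuning mentioned in the "Normalize tests"
item: a minimal illustration, not harness code. It assumes we have
already measured the per-iteration time of each test on the chosen
baseline engine; the function name, the 50 ms budget, and the test
names in the example are all made up for illustration.

    // Hypothetical sketch -- not the actual harness code.  Given the
    // time one iteration of each test takes on the chosen baseline
    // engine (in ms), compute repeat counts so that every test runs
    // for about the same total time on that baseline.

    var targetMsPerTest = 50;  // assumed per-test time budget

    function tuneRepeatCounts(baselineMsPerIteration) {
        var repeats = {};
        for (var test in baselineMsPerIteration) {
            // At least one iteration; round to keep the count integral.
            repeats[test] = Math.max(1,
                Math.round(targetMsPerTest / baselineMsPerIteration[test]));
        }
        return repeats;
    }

    // Example: at 2.5 ms/iteration a test gets a repeat count of 20,
    // so it runs for roughly the 50 ms target on the baseline engine.
    var counts = tuneRepeatCounts({ "string-base64": 2.5,
                                    "math-cordic": 0.5 });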
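
Sketch of the t-distribution confidence interval from the item above:
again a minimal illustration rather than the harness's actual analysis
code. The function name is made up; the table holds standard two-tailed
95% critical values, indexed by degrees of freedom, and covers only
small sample sizes.

    // Hypothetical sketch -- not the actual harness code.  tTable maps
    // degrees of freedom (sample size minus one) to the two-tailed 95%
    // critical value of the t-distribution.
    var tTable = { 1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776,
                   5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262 };

    function confidenceInterval95(samples) {
        var n = samples.length;
        var sum = 0;
        for (var i = 0; i < n; i++)
            sum += samples[i];
        var mean = sum / n;

        // Sample standard deviation, with the n - 1 (Bessel) correction.
        var sumOfSquares = 0;
        for (i = 0; i < n; i++)
            sumOfSquares += (samples[i] - mean) * (samples[i] - mean);
        var standardError = Math.sqrt(sumOfSquares / (n - 1)) / Math.sqrt(n);

        // Multiply by the t critical value instead of the normal-curve 1.96.
        return { mean: mean, plusMinus: tTable[n - 1] * standardError };
    }

    // Example: five timing samples (ms); the 95% confidence interval
    // is mean +/- plusMinus.
    var ci = confidenceInterval95([102, 98, 105, 99, 101]);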