README
    
________________________________________________________________________

PYBENCH - A Python Benchmark Suite
________________________________________________________________________

     Extendable suite of low-level benchmarks for measuring
          the performance of the Python implementation
                 (interpreter, compiler or VM).

pybench is a collection of tests that provides a standardized way to
measure the performance of Python implementations. It takes a very
close look at different aspects of Python programs and lets you
decide which factors are more important to you than others, rather
than wrapping everything up in one number, as other performance
tests do (e.g. pystone, which is included in the Python Standard
Library).

pybench has been used in the past by several Python developers to
track down performance bottlenecks or to demonstrate the impact of
optimizations and new features in Python.

The command line interface for pybench is the file pybench.py. Run
this script with option '--help' to get a listing of the possible
options. Without options, pybench will simply execute the benchmark
and then print out a report to stdout.


Micro-Manual
------------

Run 'pybench.py -h' to see the help screen.  Run 'pybench.py' to run
the benchmark suite using default settings and 'pybench.py -f <file>'
to have it store the results in a file too.

It is usually a good idea to run pybench.py multiple times to see
whether the environment, timers and benchmark run-times are suitable
for doing benchmark tests.

You can use the comparison feature of pybench.py ('pybench.py -c
<file>') to check how well the system behaves in comparison to a
reference run.

If the differences are well below 10% for each test, then you have a
system that is good for doing benchmark testing.  If you get random
differences of more than 10% or significant differences between the
values for minimum and average time, then you likely have some
background processes running which cause the readings to become
inconsistent. Examples include: web browsers, email clients, RSS
readers, music players, backup programs, etc.
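The 10% rule of thumb above can be sketched in a few lines. This is
only an illustration written for Python 3, not code taken from
pybench: the function names, the dictionaries of minimum timings and
the sample values are all hypothetical.

```python
def relative_difference(reference, current):
    """Return the relative change of `current` against `reference`."""
    return (current - reference) / reference


def unstable_tests(reference_run, current_run, tolerance=0.10):
    """List tests whose minimum time moved by more than `tolerance`.

    Both arguments map a test name to its minimum time in seconds.
    The 0.10 default mirrors the 10% rule of thumb from the text.
    """
    flagged = []
    for name in sorted(reference_run):
        if name not in current_run:
            continue  # nothing to compare against
        diff = relative_difference(reference_run[name], current_run[name])
        if abs(diff) > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical minimum timings (seconds) from two runs on one machine.
reference_run = {"ForLoops": 0.110, "Recursion": 0.156, "TryExcept": 0.125}
current_run = {"ForLoops": 0.112, "Recursion": 0.190, "TryExcept": 0.126}

print(unstable_tests(reference_run, current_run))  # prints ['Recursion']
```

A non-empty result on an otherwise unchanged system suggests that
something in the background is disturbing the measurements.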

If you are only interested in a few tests of the whole suite, you can
use the filtering option, e.g. 'pybench.py -t string' will only
run/show the tests that have 'string' in their name.
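As a rough picture of what such a filter does, here is a minimal
Python 3 sketch. It assumes a case-insensitive substring match; the
exact matching rule used by pybench.py may differ, and select_tests()
is a hypothetical helper, not part of pybench.

```python
def select_tests(test_names, pattern):
    """Keep the names that contain `pattern`, ignoring case.

    Assumption: plain substring matching, as the text describes it.
    """
    needle = pattern.lower()
    return [name for name in test_names if needle in name.lower()]


# Test names taken from the sample output of the suite.
names = ["CompareStrings", "ConcatStrings", "DictCreation", "StringSlicing"]

print(select_tests(names, "string"))
```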

This is the current output of pybench.py --help:

"""
------------------------------------------------------------------------
PYBENCH - a benchmark test suite for Python interpreters/compilers.
------------------------------------------------------------------------

Synopsis:
 pybench.py [option] files...

Options and default settings:
  -n arg           number of rounds (10)
  -f arg           save benchmark to file arg ()
  -c arg           compare benchmark with the one in file arg ()
  -s arg           show benchmark in file arg, then exit ()
  -w arg           set warp factor to arg (10)
  -t arg           run only tests with names matching arg ()
  -C arg           set the number of calibration runs to arg (20)
  -d               hide noise in comparisons (0)
  -v               verbose output (not recommended) (0)
  --with-gc        enable garbage collection (0)
  --with-syscheck  use default sys check interval (0)
  --timer arg      use given timer (time.time)
  -h               show this help text
  --help           show this help text
  --debug          enable debugging
  --copyright      show copyright
  --examples       show examples of usage

Version:
 2.0

The normal operation is to run the suite and display the
results. Use -f to save them for later reuse or comparisons.

Available timers:

   time.time
   time.clock
   systimes.processtime

Examples:

python2.1 pybench.py -f p21.pybench
python2.5 pybench.py -f p25.pybench
python pybench.py -s p25.pybench -c p21.pybench
"""

License
-------

See LICENSE file.


Sample output
-------------

"""
-------------------------------------------------------------------------------
PYBENCH 2.0
-------------------------------------------------------------------------------
* using Python 2.4.2
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.time

Calibrating tests. Please wait...

Running 10 round(s) of the suite at warp factor 10:

* Round 1 done in 6.388 seconds.
* Round 2 done in 6.485 seconds.
* Round 3 done in 6.786 seconds.
...
* Round 10 done in 6.546 seconds.

-------------------------------------------------------------------------------
Benchmark: 2006-06-12 12:09:25
-------------------------------------------------------------------------------

    Rounds: 10
    Warp:   10
    Timer:  time.time

    Machine Details:
       Platform ID:  Linux-2.6.8-24.19-default-x86_64-with-SuSE-9.2-x86-64
       Processor:    x86_64

    Python:
       Executable:   /usr/local/bin/python
       Version:      2.4.2
       Compiler:     GCC 3.3.4 (pre 3.3.5 20040809)
       Bits:         64bit
       Build:        Oct  1 2005 15:24:35 (#1)
       Unicode:      UCS2


Test                             minimum  average  operation  overhead
-------------------------------------------------------------------------------
          BuiltinFunctionCalls:    126ms    145ms    0.28us    0.274ms
           BuiltinMethodLookup:    124ms    130ms    0.12us    0.316ms
                 CompareFloats:    109ms    110ms    0.09us    0.361ms
         CompareFloatsIntegers:    100ms    104ms    0.12us    0.271ms
               CompareIntegers:    137ms    138ms    0.08us    0.542ms
        CompareInternedStrings:    124ms    127ms    0.08us    1.367ms
                  CompareLongs:    100ms    104ms    0.10us    0.316ms
                CompareStrings:    111ms    115ms    0.12us    0.929ms
                CompareUnicode:    108ms    128ms    0.17us    0.693ms
                 ConcatStrings:    142ms    155ms    0.31us    0.562ms
                 ConcatUnicode:    119ms    127ms    0.42us    0.384ms
               CreateInstances:    123ms    128ms    1.14us    0.367ms
            CreateNewInstances:    121ms    126ms    1.49us    0.335ms
       CreateStringsWithConcat:    130ms    135ms    0.14us    0.916ms
       CreateUnicodeWithConcat:    130ms    135ms    0.34us    0.361ms
                  DictCreation:    108ms    109ms    0.27us    0.361ms
             DictWithFloatKeys:    149ms    153ms    0.17us    0.678ms
           DictWithIntegerKeys:    124ms    126ms    0.11us    0.915ms
            DictWithStringKeys:    114ms    117ms    0.10us    0.905ms
                      ForLoops:    110ms    111ms    4.46us    0.063ms
                    IfThenElse:    118ms    119ms    0.09us    0.685ms
                   ListSlicing:    116ms    120ms    8.59us    0.103ms
                NestedForLoops:    125ms    137ms    0.09us    0.019ms
          NormalClassAttribute:    124ms    136ms    0.11us    0.457ms
       NormalInstanceAttribute:    110ms    117ms    0.10us    0.454ms
           PythonFunctionCalls:    107ms    113ms    0.34us    0.271ms
             PythonMethodCalls:    140ms    149ms    0.66us    0.141ms
                     Recursion:    156ms    166ms    3.32us    0.452ms
                  SecondImport:    112ms    118ms    1.18us    0.180ms
           SecondPackageImport:    118ms    127ms    1.27us    0.180ms
         SecondSubmoduleImport:    140ms    151ms    1.51us    0.180ms
       SimpleComplexArithmetic:    128ms    139ms    0.16us    0.361ms
        SimpleDictManipulation:    134ms    136ms    0.11us    0.452ms
         SimpleFloatArithmetic:    110ms    113ms    0.09us    0.571ms
      SimpleIntFloatArithmetic:    106ms    111ms    0.08us    0.548ms
       SimpleIntegerArithmetic:    106ms    109ms    0.08us    0.544ms
        SimpleListManipulation:    103ms    113ms    0.10us    0.587ms
          SimpleLongArithmetic:    112ms    118ms    0.18us    0.271ms
                    SmallLists:    105ms    116ms    0.17us    0.366ms
                   SmallTuples:    108ms    128ms    0.24us    0.406ms
         SpecialClassAttribute:    119ms    136ms    0.11us    0.453ms
      SpecialInstanceAttribute:    143ms    155ms    0.13us    0.454ms
                StringMappings:    115ms    121ms    0.48us    0.405ms
              StringPredicates:    120ms    129ms    0.18us    2.064ms
                 StringSlicing:    111ms    127ms    0.23us    0.781ms
                     TryExcept:    125ms    126ms    0.06us    0.681ms
                TryRaiseExcept:    133ms    137ms    2.14us    0.361ms
                  TupleSlicing:    117ms    120ms    0.46us    0.066ms
               UnicodeMappings:    156ms    160ms    4.44us    0.429ms
             UnicodePredicates:    117ms    121ms    0.22us    2.487ms
             UnicodeProperties:    115ms    153ms    0.38us    2.070ms
                UnicodeSlicing:    126ms    129ms    0.26us    0.689ms
-------------------------------------------------------------------------------
Totals:                           6283ms   6673ms
"""
________________________________________________________________________

Writing New Tests
________________________________________________________________________

pybench tests are simple modules defining one or more pybench.Test
subclasses.

Writing a test essentially boils down to providing two methods:
.test(), which runs .rounds rounds of .operations test operations
each, and .calibrate(), which does the same except that it doesn't
actually execute the operations.


Here's an example:
------------------
225
226from pybench import Test
227
228class IntegerCounting(Test):
229
230    # Version number of the test as float (x.yy); this is important
231    # for comparisons of benchmark runs - tests with unequal version
232    # number will not get compared.
233    version = 1.0
234
235    # The number of abstract operations done in each round of the
236    # test. An operation is the basic unit of what you want to
237    # measure. The benchmark will output the amount of run-time per
238    # operation. Note that in order to raise the measured timings
239    # significantly above noise level, it is often required to repeat
240    # sets of operations more than once per test round. The measured
241    # overhead per test round should be less than 1 second.
242    operations = 20
243
244    # Number of rounds to execute per test run. This should be
245    # adjusted to a figure that results in a test run-time of between
246    # 1-2 seconds (at warp 1).
247    rounds = 100000
248
249    def test(self):
250
251	""" Run the test.
252
253	    The test needs to run self.rounds executing
254	    self.operations number of operations each.
255
256        """
257        # Init the test
258        a = 1
259
260        # Run test rounds
261	#
262        # NOTE: Use xrange() for all test loops unless you want to face
263	# a 20MB process !
264	#
265        for i in xrange(self.rounds):
266
267            # Repeat the operations per round to raise the run-time
268            # per operation significantly above the noise level of the
269            # for-loop overhead.
270
271	    # Execute 20 operations (a += 1):
272            a += 1
273            a += 1
274            a += 1
275            a += 1
276            a += 1
277            a += 1
278            a += 1
279            a += 1
280            a += 1
281            a += 1
282            a += 1
283            a += 1
284            a += 1
285            a += 1
286            a += 1
287            a += 1
288            a += 1
289            a += 1
290            a += 1
291            a += 1
292
293    def calibrate(self):
294
295	""" Calibrate the test.
296
297	    This method should execute everything that is needed to
298	    setup and run the test - except for the actual operations
299	    that you intend to measure. pybench uses this method to
300            measure the test implementation overhead.
301
302        """
303        # Init the test
304        a = 1
305
306        # Run test rounds (without actually doing any operation)
307        for i in xrange(self.rounds):
308
309	    # Skip the actual execution of the operations, since we
310	    # only want to measure the test's administration overhead.
311            pass
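
To make the role of .calibrate() more concrete, the following Python 3
sketch times a stand-in test and subtracts the calibrated loop
overhead before dividing by the number of operations. It does not
import pybench: CountingSketch, time_call() and time_per_operation()
are hypothetical names, and the exact arithmetic pybench performs
internally is assumed here, not quoted.

```python
import time


class CountingSketch:
    """Stand-in for a pybench-style test; not the real pybench.Test."""

    operations = 5
    rounds = 10000

    def test(self):
        a = 1
        for _ in range(self.rounds):
            # Five operations per round, matching `operations` above.
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1

    def calibrate(self):
        a = 1
        for _ in range(self.rounds):
            # Same loop, but without the operations being measured.
            pass


def time_call(func, repeats=5, timer=time.perf_counter):
    """Return the minimum wall-clock time of `repeats` calls to func."""
    timings = []
    for _ in range(repeats):
        start = timer()
        func()
        timings.append(timer() - start)
    return min(timings)


def time_per_operation(test):
    """Estimate seconds per operation with the loop overhead removed."""
    gross = time_call(test.test)
    overhead = time_call(test.calibrate)
    net = max(gross - overhead, 0.0)  # guard against timer noise
    return net / (test.rounds * test.operations)


print("%.3f us per operation" % (time_per_operation(CountingSketch()) * 1e6))
```

Taking the minimum of several runs follows the estimator named in the
version history; the clamp to zero is a choice made for this sketch.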

Registering a new test module
-----------------------------

To register a test module with pybench, the classes need to be
imported into the pybench.Setup module. pybench will then scan all the
symbols defined in that module for subclasses of pybench.Test and
automatically add them to the benchmark suite.
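
The scanning step can be pictured with the Python 3 sketch below. It
is a minimal illustration under stated assumptions: the Test class is
a stand-in for pybench.Test, the throwaway module plays the role of
pybench.Setup, and find_tests() is a hypothetical helper, not the
function pybench itself uses.

```python
import types


class Test:
    """Stand-in base class; the real one is pybench.Test."""


def find_tests(module, base=Test):
    """Return the subclasses of `base` found in `module`, keyed by name."""
    found = {}
    for name, obj in vars(module).items():
        if isinstance(obj, type) and issubclass(obj, base) and obj is not base:
            found[name] = obj
    return found


# Build a throwaway module that plays the role of pybench.Setup.
setup = types.ModuleType("setup_sketch")
setup.Test = Test
setup.IntegerCounting = type("IntegerCounting", (Test,), {})
setup.helper = lambda: None  # not a Test subclass, so it is ignored

print(sorted(find_tests(setup)))  # prints ['IntegerCounting']
```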


Breaking Comparability
----------------------

If a change is made to any individual test that means it is no
longer strictly comparable with previous runs, the '.version' class
variable should be updated. Thereafter, comparisons with previous
versions of the test will list as "n/a" to reflect the change.
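
The effect of a version bump on a comparison can be sketched as
follows (Python 3). The (version, minimum time) tuples and the
compare_entry() helper are hypothetical; only the "n/a" outcome for
unequal versions comes from the text above.

```python
def compare_entry(current, reference):
    """Format the change between two (version, min_time) results.

    Returns "n/a" when the test versions differ, since the timings
    are then no longer comparable.
    """
    cur_version, cur_time = current
    ref_version, ref_time = reference
    if cur_version != ref_version:
        return "n/a"
    return "%+.1f%%" % ((cur_time - ref_time) / ref_time * 100.0)


# Same version on both sides: a percentage is reported.
print(compare_entry((2.0, 0.110), (2.0, 0.100)))  # prints +10.0%

# Version was bumped in the current run: no comparison is made.
print(compare_entry((2.1, 0.110), (2.0, 0.100)))  # prints n/a
```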


Version History
---------------

  2.0: rewrote parts of pybench which resulted in more repeatable
       timings:
        - made timer a parameter
        - changed the platform default timer to use high-resolution
          timers rather than process timers (which have a much lower
          resolution)
        - added option to select timer
        - added process time timer (using systimes.py)
        - changed to use min() as timing estimator (average
          is still taken as well to provide an idea of the difference)
        - garbage collection is turned off by default
        - sys check interval is set to the highest possible value
        - calibration is now a separate step and done using
          a different strategy that allows measuring the test
          overhead more accurately
        - modified the tests to each give a run-time of between
          100 and 200ms using warp 10
        - changed default warp factor to 10 (from 20)
        - compared results with timeit.py and confirmed measurements
        - bumped all test versions to 2.0
        - updated platform.py to the latest version
        - changed the output format a bit to make it look
          nicer
        - refactored the APIs somewhat
  1.3+: Steve Holden added the NewInstances test and the filtering
       option during the NeedForSpeed sprint; this also triggered a long
       discussion on how to improve benchmark timing and finally
       resulted in the release of 2.0
  1.3: initial checkin into the Python SVN repository


Have fun,
--
Marc-Andre Lemburg
mal@lemburg.com