stringbench is a set of performance tests comparing byte string
operations with unicode operations.  The two string implementations
are loosely based on each other, and sometimes the algorithm for one
is faster than the other.

This test set was started at the Need For Speed sprint in Reykjavik
to identify which string methods could be sped up quickly and to
identify obvious places for improvement.

Here is an example of a benchmark:


@bench('"Andrew".startswith("A")', 'startswith single character', 1000)
def startswith_single(STR):
    s1 = STR("Andrew")
    s2 = STR("A")
    s1_startswith = s1.startswith
    for x in _RANGE_1000:
        s1_startswith(s2)

The bench decorator takes three parameters.  The first is a short
description of how the code works.  In most cases this is a Python
code snippet.  It is not the code which is actually run, because the
real code is hand-optimized to focus on the method being tested.

The second parameter is a group title.  All benchmarks with the same
group title are listed together.  This lets you compare different
implementations of the same algorithm, such as "t in s"
vs. "s.find(t)".

The last is a count.  Each benchmark loops over the algorithm either
100 or 1000 times, depending on the algorithm performance.  The output
time is the time per benchmark call, so the reader needs the count to
know how to scale the reported time to a single operation.

These parameters become function attributes.


Here is an example of the output:


========== count newlines
38.54   41.60   92.7    ...text.with.2000.newlines.count("\n") (*100)
========== early match, single character
1.14    1.18    96.8    ("A"*1000).find("A") (*1000)
0.44    0.41    105.6   "A" in "A"*1000 (*1000)
1.15    1.17    98.1    ("A"*1000).index("A") (*1000)

The first column is the run time in milliseconds for byte strings.
The second is the run time for unicode strings.  The third is a
percentage: byte time / unicode time, times 100.  It shows the speed
of unicode relative to byte strings: values above 100 mean unicode was
faster, and values below 100 mean byte strings were faster.

The last column contains the code snippet and the repeat count for the
internal benchmark loop.

The times are computed with 'timeit.py', which repeats the test more
and more times until the total time exceeds 0.2 seconds, returning
the best time for a single iteration.

The final line of the output is the cumulative time for byte and
unicode strings, and the overall performance of unicode relative to
bytes.  For example:

4079.83 5432.25 75.1    TOTAL

However, this total has no real meaning because it weights every test
evenly.
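
The pieces described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the actual
stringbench source: the attribute names (comment, group, repeat_count)
and the helpers as_bytes, best_time_ms and relative_percent are
hypothetical, and the timing loop only approximates the "repeat until
the total exceeds 0.2 seconds" behaviour.

```python
import timeit

# Assumed helper: a prebuilt range for the inner benchmark loop.
_RANGE_1000 = range(1000)


def bench(comment, group, repeat_count):
    """Store the three decorator parameters as function attributes."""
    def decorator(func):
        func.comment = comment            # short description / code snippet
        func.group = group                # group title used to list results
        func.repeat_count = repeat_count  # size of the internal loop
        return func
    return decorator


@bench('"Andrew".startswith("A")', 'startswith single character', 1000)
def startswith_single(STR):
    s1 = STR("Andrew")
    s2 = STR("A")
    s1_startswith = s1.startswith
    for x in _RANGE_1000:
        s1_startswith(s2)


def as_bytes(s):
    """Hypothetical STR factory for byte strings."""
    return s.encode("ascii")


def best_time_ms(func, STR, min_total=0.2):
    """Call func(STR) in growing batches until one batch takes longer
    than min_total seconds; return the best time per call in ms."""
    number = 1
    best = None
    while True:
        elapsed = timeit.timeit(lambda: func(STR), number=number)
        per_call = elapsed / number
        if best is None or per_call < best:
            best = per_call
        if elapsed > min_total:
            return best * 1000.0
        number *= 10


def relative_percent(byte_ms, unicode_ms):
    """Third output column: byte time / unicode time, as a percentage."""
    return 100.0 * byte_ms / unicode_ms


if __name__ == "__main__":
    byte_ms = best_time_ms(startswith_single, as_bytes)
    unicode_ms = best_time_ms(startswith_single, str)
    print("%.2f\t%.2f\t%.1f\t%s (*%d)" % (
        byte_ms, unicode_ms, relative_percent(byte_ms, unicode_ms),
        startswith_single.comment, startswith_single.repeat_count))
```

Applied to the TOTAL line shown earlier, relative_percent(4079.83,
5432.25) rounds to 75.1, which matches the third column of that line.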