This document describes some caveats about the use of Valgrind with
Python.  Valgrind is used periodically by Python developers to try
to ensure there are no memory leaks or invalid memory reads/writes.

If you want to enable Valgrind support in Python, you will need to
configure Python with the --with-valgrind option or the older
--without-pymalloc option.

UPDATE: Python 3.6 now supports the PYTHONMALLOC=malloc environment
variable, which can be used to force the usage of the malloc()
allocator of the C library.

If you don't want to read about the details of using Valgrind, there
are still two things you must do to suppress the warnings.  First,
you must use a suppressions file.  One is supplied in
Misc/valgrind-python.supp.  Second, you must uncomment the lines in
Misc/valgrind-python.supp that suppress the warnings for PyObject_Free and
PyObject_Realloc.

If you want to use Valgrind more effectively and catch even more
memory leaks, you will need to configure Python --without-pymalloc.
PyMalloc allocates a few blocks in big chunks, and most object
allocations don't call malloc; they use chunks doled out by PyMalloc
from the big blocks.  This means Valgrind can't detect
many allocations (and frees), except for those that are forwarded
to the system malloc.  Note: configuring Python --without-pymalloc
makes Python run much slower, especially when running under Valgrind.
You may need to run the tests in batches under Valgrind to keep
the memory usage down to allow the tests to complete.  It seems to take
about 5 times longer to run --without-pymalloc.

Apr 15, 2006:
  test_ctypes causes Valgrind 3.1.1 to fail (crash).
  test_socket_ssl should be skipped when running Valgrind.
    The reason is that it purposely uses uninitialized memory.
    This causes many spurious warnings, so it's easier to just skip it.


Details:
--------
Python uses its own small-object allocation scheme on top of malloc,
called PyMalloc.
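
As a rough illustration of what "on top of malloc" means for a tool
like Valgrind, here is a minimal sketch of a small-object allocator.
It is not PyMalloc's actual code: small_alloc, SMALL_LIMIT, ALIGNMENT
and system_malloc_calls are hypothetical names, the 512-byte threshold
is an assumption, and nothing is ever freed.  The only point is that
many small requests are served from one big chunk, so the system
malloc (and therefore Valgrind) sees a single allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE  ((size_t)256 * 1024)  /* big chunk taken from malloc */
#define SMALL_LIMIT ((size_t)512)         /* assumed "small object" limit */
#define ALIGNMENT   ((size_t)8)

static unsigned char *chunk;              /* current big chunk, or NULL */
static size_t chunk_used;                 /* bytes handed out from it */
static unsigned long system_malloc_calls; /* what a malloc-level tool sees */

/* Hypothetical allocator: small requests are carved out of a big chunk,
 * everything else is forwarded to the system malloc. */
void *small_alloc(size_t nbytes)
{
    if (nbytes == 0 || nbytes > SMALL_LIMIT) {
        system_malloc_calls++;
        return malloc(nbytes ? nbytes : 1);
    }

    /* Round the request up so every returned pointer stays aligned. */
    size_t rounded = (nbytes + ALIGNMENT - 1) & ~(ALIGNMENT - 1);

    if (chunk == NULL || rounded > CHUNK_SIZE - chunk_used) {
        /* Sketch only: an exhausted chunk is simply abandoned. */
        unsigned char *fresh = malloc(CHUNK_SIZE);
        if (fresh == NULL)
            return NULL;
        system_malloc_calls++;
        chunk = fresh;
        chunk_used = 0;
    }

    void *p = chunk + chunk_used;
    chunk_used += rounded;
    assert(chunk_used <= CHUNK_SIZE);
    return p;
}
```

With this sketch, a thousand 32-byte requests result in one call to
the system malloc, while a single 4096-byte request is forwarded and
counted separately.  A write past the end of one small object lands
in its neighbour inside the same chunk, which a malloc-level checker
has no way to tell apart from a legitimate write.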

Valgrind may show some unexpected results when PyMalloc is used.
Starting with Python 2.3, PyMalloc is used by default.  You can disable
PyMalloc when configuring Python by adding the --without-pymalloc option.
If you disable PyMalloc, most of the information in this document and
the supplied suppressions file will not be useful.  As discussed above,
disabling PyMalloc lets Valgrind catch more problems.

PyMalloc uses 256KB chunks of memory, so Valgrind can't detect anything
wrong within these blocks.  For that reason, compiling Python
--without-pymalloc usually increases the usefulness of other tools.

If you use Valgrind on a default build of Python, you will see
many errors like:

    ==6399== Use of uninitialised value of size 4
    ==6399==    at 0x4A9BDE7E: PyObject_Free (obmalloc.c:711)
    ==6399==    by 0x4A9B8198: dictresize (dictobject.c:477)

These are expected and not a problem.  Tim Peters explains
the situation:

    PyMalloc needs to know whether an arbitrary address is one
    that's managed by it, or is managed by the system malloc.
    The current scheme allows this to be determined in constant
    time, regardless of how many memory areas are under pymalloc's
    control.

    The memory pymalloc manages itself is in one or more "arenas",
    each a large contiguous memory area obtained from malloc.
    The base address of each arena is saved by pymalloc
    in a vector.  Each arena is carved into "pools", and a field at
    the start of each pool contains the index of that pool's arena's
    base address in that vector.

    Given an arbitrary address, pymalloc computes the pool base
    address corresponding to it, then looks at "the index" stored
    near there.  If the index read up is out of bounds for the
    vector of arena base addresses pymalloc maintains, then
    pymalloc knows for certain that this address is not under
    pymalloc's control.
    Otherwise the index is in bounds, and pymalloc compares

        the arena base address stored at that index in the vector

    to

        the arbitrary address pymalloc is investigating

    pymalloc controls this arbitrary address if and only if it lies
    in the arena the address's pool's index claims it lies in.

    It doesn't matter whether the memory pymalloc reads up ("the
    index") is initialized.  If it's not initialized, then
    whatever trash gets read up will lead pymalloc to conclude
    (correctly) that the address isn't controlled by it, either
    because the index is out of bounds, or the index is in bounds
    but the arena it represents doesn't contain the address.

    This determination has to be made on every call to one of
    pymalloc's free/realloc entry points, so its speed is critical
    (Python allocates and frees dynamic memory at a ferocious rate
    -- everything in Python, from integers to "stack frames",
    lives in the heap).
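
The check described in the quote can be sketched in C roughly as
follows.  This is a simplified illustration, not the code in
obmalloc.c: the names (new_arena, pool_base_of, address_in_range as
written here), the 4KB pool size and the fixed-size vector are all
assumptions made for the example.  The 256KB arena size is taken from
the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ARENA_SIZE ((uintptr_t)256 * 1024)  /* arena size from the text */
#define POOL_SIZE  ((uintptr_t)4 * 1024)    /* assumed pool size */
#define MAX_ARENAS 8u                       /* assumed vector capacity */

static uintptr_t arena_base[MAX_ARENAS];    /* the vector of arena bases */
static unsigned int arena_count;

/* Round an address down to the start of the pool that would hold it. */
uintptr_t pool_base_of(uintptr_t addr)
{
    return addr & ~(POOL_SIZE - 1);
}

/* Get a new arena from malloc, record its base in the vector, and
 * store the arena's index at the start of each of its pools.
 * Returns the arena base, or 0 on failure.  Sketch only: arenas are
 * never released. */
uintptr_t new_arena(void)
{
    if (arena_count == MAX_ARENAS)
        return 0;
    /* Over-allocate by one pool so the base can be pool-aligned. */
    unsigned char *raw = malloc(ARENA_SIZE + POOL_SIZE);
    if (raw == NULL)
        return 0;
    uintptr_t base = pool_base_of((uintptr_t)raw + POOL_SIZE - 1);
    assert((base & (POOL_SIZE - 1)) == 0);

    unsigned int index = arena_count++;
    arena_base[index] = base;
    for (uintptr_t pool = base; pool < base + ARENA_SIZE; pool += POOL_SIZE)
        memcpy((void *)pool, &index, sizeof index);
    return base;
}

/* Constant-time test: is p inside one of our arenas? */
int address_in_range(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    unsigned int index;

    /* For a foreign address this reads whatever happens to be stored
     * at the pool base; this is the read Valgrind reports. */
    memcpy(&index, (const void *)pool_base_of(addr), sizeof index);

    if (index >= arena_count)
        return 0;               /* out of bounds: certainly not ours */

    /* Unsigned subtraction wraps around when addr is below the base,
     * so a single comparison covers both ends of the arena. */
    return addr - arena_base[index] < ARENA_SIZE;
}
```

Whatever value the "index" read produces, the answer is correct: a
garbage index is either out of bounds for the vector or names an
arena that does not contain the address.  Calling address_in_range on
a pointer inside an arena returned by new_arena gives 1; calling it
on a pointer into an ordinary malloc'ed buffer gives 0, whether that
buffer happens to hold an out-of-bounds value or an in-bounds one at
the computed pool base.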