.. highlight:: c

.. _defining-new-types:

**********************************
Defining Extension Types: Tutorial
**********************************

.. sectionauthor:: Michael Hudson <mwh@python.net>
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
.. sectionauthor:: Jim Fulton <jim@zope.com>


Python allows the writer of a C extension module to define new types that
can be manipulated from Python code, much like the built-in :class:`str`
and :class:`list` types.  The code for all extension types follows a
pattern, but there are some details that you need to understand before you
can get started.  This document is a gentle introduction to the topic.


.. _dnt-basics:

The Basics
==========

The :term:`CPython` runtime sees all Python objects as variables of type
:c:expr:`PyObject*`, which serves as a "base type" for all Python objects.
The :c:type:`PyObject` structure itself only contains the object's
:term:`reference count` and a pointer to the object's "type object".
This is where the action is; the type object determines which (C) functions
get called by the interpreter when, for instance, an attribute gets looked up
on an object, a method called, or it is multiplied by another object.  These
C functions are called "type methods".
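
Both fields of the object header can be observed from Python, which may help
make the description above concrete.  The following is only an illustration
and assumes CPython: :func:`sys.getrefcount` usually counts its own argument
as a temporary reference, so only the difference between two readings is
meaningful.  The variable names are made up for this example.

.. code-block:: python

   import sys

   value = [1, 2, 3]

   # type() returns the object's "type object" (the ob_type pointer in C).
   type_object = type(value)

   # sys.getrefcount() reads the reference count stored in the object header.
   before = sys.getrefcount(value)
   alias = value                    # binding another name adds one reference
   after = sys.getrefcount(value)

   # Operators are dispatched through the type object: multiplying a list
   # ends up in a C function that the list type registered for that purpose.
   repeated = type_object.__mul__(value, 2)

Here ``type_object`` is :class:`list`, ``after`` is one larger than
``before``, and ``repeated`` holds the original elements twice.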

So, if you want to define a new extension type, you need to create a new type
object.

This sort of thing can only be explained by example, so here's a minimal, but
complete, module that defines a new type named :class:`Custom` inside a C
extension module :mod:`custom`:

.. note::
   What we're showing here is the traditional way of defining *static*
   extension types.  It should be adequate for most uses.  The C API also
   allows defining heap-allocated extension types using the
   :c:func:`PyType_FromSpec` function, which isn't covered in this tutorial.

.. literalinclude:: ../includes/custom.c

Now that's quite a bit to take in at once, but hopefully bits will seem familiar
from the previous chapter.  This file defines three things:

#. What a :class:`Custom` **object** contains: this is the ``CustomObject``
   struct, which is allocated once for each :class:`Custom` instance.
#. How the :class:`Custom` **type** behaves: this is the ``CustomType`` struct,
   which defines a set of flags and function pointers that the interpreter
   inspects when specific operations are requested.
#. How to initialize the :mod:`custom` module: this is the ``PyInit_custom``
   function and the associated ``custommodule`` struct.

The first bit is::

   typedef struct {
       PyObject_HEAD
   } CustomObject;

This is what a Custom object will contain.  ``PyObject_HEAD`` is mandatory
at the start of each object struct and defines a field called ``ob_base``
of type :c:type:`PyObject`, containing a pointer to a type object and a
reference count (these can be accessed using the macros :c:macro:`Py_TYPE`
and :c:macro:`Py_REFCNT` respectively).  The reason for the macro is to
abstract away the layout and to enable additional fields in :ref:`debug builds
<debug-build>`.

.. note::
   There is no semicolon above after the :c:macro:`PyObject_HEAD` macro.
   Be wary of adding one by accident: some compilers will complain.

Of course, objects generally store additional data besides the standard
``PyObject_HEAD`` boilerplate; for example, here is the definition for
standard Python floats::

   typedef struct {
       PyObject_HEAD
       double ob_fval;
   } PyFloatObject;

The second bit is the definition of the type object. ::

   static PyTypeObject CustomType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       .tp_name = "custom.Custom",
       .tp_doc = PyDoc_STR("Custom objects"),
       .tp_basicsize = sizeof(CustomObject),
       .tp_itemsize = 0,
       .tp_flags = Py_TPFLAGS_DEFAULT,
       .tp_new = PyType_GenericNew,
   };

.. note::
   We recommend using C99-style designated initializers as above, to
   avoid listing all the :c:type:`PyTypeObject` fields that you don't care
   about and also to avoid caring about the fields' declaration order.

The actual definition of :c:type:`PyTypeObject` in :file:`object.h` has
many more :ref:`fields <type-structs>` than the definition above.  The
remaining fields will be filled with zeros by the C compiler, and it's
common practice to not specify them explicitly unless you need them.

We're going to pick it apart, one field at a time::

   PyVarObject_HEAD_INIT(NULL, 0)

This line is mandatory boilerplate to initialize the ``ob_base``
field mentioned above. ::

   .tp_name = "custom.Custom",

The name of our type.  This will appear in the default textual representation of
our objects and in some error messages, for example:

.. code-block:: pycon

   >>> "" + custom.Custom()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: can only concatenate str (not "custom.Custom") to str

Note that the name is a dotted name that includes both the module name and the
name of the type within the module. The module in this case is :mod:`custom` and
the type is :class:`Custom`, so we set the type name to :class:`custom.Custom`.
Using the real dotted import path is important to make your type compatible
with the :mod:`pydoc` and :mod:`pickle` modules. ::

   .tp_basicsize = sizeof(CustomObject),
   .tp_itemsize = 0,

This is so that Python knows how much memory to allocate when creating
new :class:`Custom` instances.  :c:member:`~PyTypeObject.tp_itemsize` is
only used for variable-sized objects and should otherwise be zero.
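
These two slots are exposed to Python as the ``__basicsize__`` and
``__itemsize__`` attributes of a type, so the values chosen by built-in types
can be inspected directly.  A small illustration, assuming CPython; the exact
byte counts depend on the platform and build, so none are shown here.

.. code-block:: python

   # Fixed-size type: every float occupies tp_basicsize bytes and
   # tp_itemsize is zero.
   float_sizes = (float.__basicsize__, float.__itemsize__)

   # Variable-sized type: a tuple needs tp_itemsize extra bytes per element.
   tuple_sizes = (tuple.__basicsize__, tuple.__itemsize__)

   # PyFloatObject adds a C double to the plain object header.
   extra_bytes = float.__basicsize__ - object.__basicsize__

On a typical build ``extra_bytes`` corresponds to the ``ob_fval`` field of
``PyFloatObject``, possibly with padding added by the compiler.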

.. note::

   If you want your type to be subclassable from Python, and your type has the same
   :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
   inheritance.  A Python subclass of your type will have to list your type first
   in its :attr:`~class.__bases__`, or else it will not be able to call your type's
   :meth:`__new__` method without getting an error.  You can avoid this problem by
   ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
   base type does.  Most of the time, this will be true anyway, because either your
   base type will be :class:`object`, or else you will be adding data members to
   your base type, and therefore increasing its size.

We set the class flags to :const:`Py_TPFLAGS_DEFAULT`. ::

   .tp_flags = Py_TPFLAGS_DEFAULT,

All types should include this constant in their flags.  It enables all of the
members defined until at least Python 3.3.  If you need further members,
you will need to OR the corresponding flags.

We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::

   .tp_doc = PyDoc_STR("Custom objects"),

To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new`
handler.  This is the equivalent of the Python method :meth:`__new__`, but
has to be specified explicitly.  In this case, we can just use the default
implementation provided by the API function :c:func:`PyType_GenericNew`. ::

   .tp_new = PyType_GenericNew,

Everything else in the file should be familiar, except for some code in
:c:func:`PyInit_custom`::

   if (PyType_Ready(&CustomType) < 0)
       return NULL;

This initializes the :class:`Custom` type, filling in a number of members
to the appropriate default values, including :attr:`ob_type` that we initially
set to ``NULL``. ::

   Py_INCREF(&CustomType);
   if (PyModule_AddObject(m, "Custom", (PyObject *) &CustomType) < 0) {
       Py_DECREF(&CustomType);
       Py_DECREF(m);
       return NULL;
   }

This adds the type to the module dictionary.  This allows us to create
:class:`Custom` instances by calling the :class:`Custom` class:

.. code-block:: pycon

   >>> import custom
   >>> mycustom = custom.Custom()

That's it!  All that remains is to build it; put the above code in a file called
:file:`custom.c` and:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[Extension("custom", ["custom.c"])])

in a file called :file:`setup.py`; then typing

.. code-block:: shell-session

   $ python setup.py build

at a shell should produce a file :file:`custom.so` in a subdirectory; move to
that directory and fire up Python --- you should be able to ``import custom`` and
play around with Custom objects.

That wasn't so hard, was it?

Of course, the current Custom type is pretty uninteresting. It has no data and
doesn't do anything. It can't even be subclassed.

.. note::
   While this documentation showcases the standard :mod:`distutils` module
   for building C extensions, :mod:`distutils` is deprecated and was removed
   in Python 3.12, so real-world projects should use the better-maintained
   ``setuptools`` library instead.  Documentation
   on how to do this is out of scope for this document and can be found in
   the `Python Packaging User's Guide <https://packaging.python.org/tutorials/distributing-packages/>`_.


Adding data and methods to the Basic example
============================================

Let's extend the basic example to add some data and methods.  Let's also make
the type usable as a base class. We'll create a new module, :mod:`custom2` that
adds these capabilities:

.. literalinclude:: ../includes/custom2.c


This version of the module has a number of changes.

We've added an extra include::

   #include <structmember.h>

This include provides declarations that we use to handle attributes, as
described a bit later.

The :class:`Custom` type now has three data attributes in its C struct,
*first*, *last*, and *number*.  The *first* and *last* variables are Python
strings containing first and last names.  The *number* attribute is a C integer.

The object structure is updated accordingly::

   typedef struct {
       PyObject_HEAD
       PyObject *first; /* first name */
       PyObject *last;  /* last name */
       int number;
   } CustomObject;

Because we now have data to manage, we have to be more careful about object
allocation and deallocation.  At a minimum, we need a deallocation method::

   static void
   Custom_dealloc(CustomObject *self)
   {
       Py_XDECREF(self->first);
       Py_XDECREF(self->last);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::

   .tp_dealloc = (destructor) Custom_dealloc,

This method first decrements the reference counts of the two Python attributes.
:c:func:`Py_XDECREF` correctly handles the case where its argument is
``NULL`` (which might happen here if ``tp_new`` failed midway).  It then
calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
(computed by ``Py_TYPE(self)``) to free the object's memory.  Note that
the object's type might not be :class:`CustomType`, because the object may
be an instance of a subclass.

.. note::
   The explicit cast to ``destructor`` above is needed because we defined
   ``Custom_dealloc`` to take a ``CustomObject *`` argument, but the ``tp_dealloc``
   function pointer expects to receive a ``PyObject *`` argument.  Otherwise,
   the compiler will emit a warning.  This is object-oriented polymorphism,
   in C!

We want to make sure that the first and last names are initialized to empty
strings, so we provide a ``tp_new`` implementation::

   static PyObject *
   Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       CustomObject *self;
       self = (CustomObject *) type->tp_alloc(type, 0);
       if (self != NULL) {
           self->first = PyUnicode_FromString("");
           if (self->first == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->last = PyUnicode_FromString("");
           if (self->last == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->number = 0;
       }
       return (PyObject *) self;
   }

and install it in the :c:member:`~PyTypeObject.tp_new` member::

   .tp_new = Custom_new,

The ``tp_new`` handler is responsible for creating (as opposed to initializing)
objects of the type.  It is exposed in Python as the :meth:`__new__` method.
It is not required to define a ``tp_new`` member, and indeed many extension
types will simply reuse :c:func:`PyType_GenericNew` as done in the first
version of the ``Custom`` type above.  In this case, we use the ``tp_new``
handler to initialize the ``first`` and ``last`` attributes to non-``NULL``
default values.

``tp_new`` is passed the type being instantiated (not necessarily ``CustomType``,
if a subclass is instantiated) and any arguments passed when the type was
called, and is expected to return the instance created.  ``tp_new`` handlers
always accept positional and keyword arguments, but they often ignore the
arguments, leaving the argument handling to initializer (a.k.a. ``tp_init``
in C or ``__init__`` in Python) methods.

.. note::
   ``tp_new`` shouldn't call ``tp_init`` explicitly, as the interpreter
   will do it itself.

The ``tp_new`` implementation calls the :c:member:`~PyTypeObject.tp_alloc`
slot to allocate memory::

   self = (CustomObject *) type->tp_alloc(type, 0);

Since memory allocation may fail, we must check the :c:member:`~PyTypeObject.tp_alloc`
result against ``NULL`` before proceeding.

.. note::
   We didn't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather
   :c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
   which is :class:`object` by default.  Most types use the default allocation
   strategy.

.. note::
   If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one
   that calls a base type's :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`),
   you must *not* try to determine what method to call using method resolution
   order at runtime.  Always statically determine what type you are going to
   call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
   ``type->tp_base->tp_new``.  If you do not do this, Python subclasses of your
   type that also inherit from other Python-defined classes may not work correctly.
   (Specifically, you may not be able to create instances of such subclasses
   without getting a :exc:`TypeError`.)

We also define an initialization function which accepts arguments to provide
initial values for our instance::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_XDECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_XDECREF(tmp);
       }
       return 0;
   }

by filling the :c:member:`~PyTypeObject.tp_init` slot. ::

   .tp_init = (initproc) Custom_init,

The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the
:meth:`__init__` method.  It is used to initialize an object after it's
created.  Initializers always accept positional and keyword arguments,
and they should return either ``0`` on success or ``-1`` on error.
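
The division of labour between ``tp_new`` and ``tp_init`` can be approximated
in pure Python.  This is only a sketch: the class below is a stand-in written
for illustration, not the C type itself, and it uses ``None`` to mean
"argument not supplied", which the C code detects through ``NULL`` pointers
instead.

.. code-block:: python

   class Custom:
       def __new__(cls, *args, **kwargs):
           # Like Custom_new: allocate the instance and give every field a
           # usable default.  The arguments are accepted but ignored.
           self = super().__new__(cls)
           self.first = ""
           self.last = ""
           self.number = 0
           return self

       def __init__(self, first=None, last=None, number=None):
           # Like Custom_init: overwrite only the fields that were supplied.
           if first is not None:
               self.first = first
           if last is not None:
               self.last = last
           if number is not None:
               self.number = number


   c = Custom("Ada", number=3)
   summary = (c.first, c.last, c.number)

Because ``last`` was not passed, it keeps the default assigned in
``__new__``, just as ``self->last`` keeps the empty string created by
``Custom_new``.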

Unlike the ``tp_new`` handler, there is no guarantee that ``tp_init``
is called at all (for example, the :mod:`pickle` module by default
doesn't call :meth:`__init__` on unpickled instances).  It can also be
called multiple times.  Anyone can call the :meth:`__init__` method on
our objects.  For this reason, we have to be extra careful when assigning
the new attribute values.  We might be tempted, for example, to assign the
``first`` member like this::

   if (first) {
       Py_XDECREF(self->first);
       Py_INCREF(first);
       self->first = first;
   }

But this would be risky.  Our type doesn't restrict the type of the
``first`` member, so it could be any kind of object.  It could have a
destructor that causes code to be executed that tries to access the
``first`` member; or that destructor could release the
:term:`global interpreter lock <GIL>` and let arbitrary code run in other
threads that accesses and modifies our object.

To be paranoid and protect ourselves against this possibility, we almost
always reassign members before decrementing their reference counts.  When
don't we have to do this?

* when we absolutely know that the reference count is greater than 1;

* when we know that deallocation of the object [#]_ will neither release
  the :term:`GIL` nor cause any calls back into our type's code;

* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc`
  handler on a type which doesn't support cyclic garbage collection [#]_.
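
The two situations that motivate this caution, an initializer that never runs
and one that runs more than once, can be reproduced from Python with the
built-in :class:`list` type, which is itself implemented in C.  A brief
illustration, assuming CPython:

.. code-block:: python

   # Calling __new__ directly runs tp_new only; tp_init is skipped.
   bare = list.__new__(list)

   # __init__ is an ordinary method, so callers may run tp_init again later.
   items = [1, 2, 3]
   items.__init__("ab")

``bare`` is still a valid empty list, and the second initialization replaces
the contents of ``items``.  An initializer such as ``Custom_init`` has to
cope with both cases in the same way.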

We want to expose our instance variables as attributes. There are a
number of ways to do that. The simplest way is to define member definitions::

   static PyMemberDef Custom_members[] = {
       {"first", T_OBJECT_EX, offsetof(CustomObject, first), 0,
        "first name"},
       {"last", T_OBJECT_EX, offsetof(CustomObject, last), 0,
        "last name"},
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::

   .tp_members = Custom_members,

Each member definition has a member name, type, offset, access flags and
documentation string.  See the :ref:`Generic-Attribute-Management` section
below for details.

A disadvantage of this approach is that it doesn't provide a way to restrict the
types of objects that can be assigned to the Python attributes.  We expect the
first and last names to be strings, but any Python objects can be assigned.
Further, the attributes can be deleted, setting the C pointers to ``NULL``.  Even
though we can make sure the members are initialized to non-``NULL`` values, the
members can be set to ``NULL`` if the attributes are deleted.

We define a single method, :meth:`Custom.name()`, that outputs the object's name as the
concatenation of the first and last names. ::

   static PyObject *
   Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
   {
       if (self->first == NULL) {
           PyErr_SetString(PyExc_AttributeError, "first");
           return NULL;
       }
       if (self->last == NULL) {
           PyErr_SetString(PyExc_AttributeError, "last");
           return NULL;
       }
       return PyUnicode_FromFormat("%S %S", self->first, self->last);
   }

The method is implemented as a C function that takes a :class:`Custom` (or
:class:`Custom` subclass) instance as the first argument.  Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method:

.. code-block:: python

   def name(self):
       return "%s %s" % (self.first, self.last)

Note that we have to check for the possibility that our :attr:`first` and
:attr:`last` members are ``NULL``.  This is because they can be deleted, in which
case they are set to ``NULL``.  It would be better to prevent deletion of these
attributes and to restrict the attribute values to be strings.  We'll see how to
do that in the next section.

Now that we've defined the method, we need to create an array of method
definitions::

   static PyMethodDef Custom_methods[] = {
       {"name", (PyCFunction) Custom_name, METH_NOARGS,
        "Return the name, combining the first and last name"
       },
       {NULL}  /* Sentinel */
   };

(note that we used the :const:`METH_NOARGS` flag to indicate that the method
is expecting no arguments other than *self*)

and assign it to the :c:member:`~PyTypeObject.tp_methods` slot::

   .tp_methods = Custom_methods,

Finally, we'll make our type usable as a base class for subclassing.  We've
written our methods carefully so far so that they don't make any assumptions
about the type of the object being created or used, so all we need to do is
to add the :const:`Py_TPFLAGS_BASETYPE` to our class flag definition::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
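
The effect of this flag can be seen on built-in types.  :class:`bool` is an
example of a type defined without ``Py_TPFLAGS_BASETYPE``, so attempting to
subclass it fails.  The sketch below assumes CPython, where a type's flags are
readable as ``__flags__``; the name ``BASETYPE`` is defined here only for
illustration and mirrors the value ``1 << 10`` used in the C headers.

.. code-block:: python

   BASETYPE = 1 << 10  # value of Py_TPFLAGS_BASETYPE in CPython's object.h

   int_allows_subclasses = bool(int.__flags__ & BASETYPE)
   bool_allows_subclasses = bool(bool.__flags__ & BASETYPE)

   try:
       class MyBool(bool):      # rejected: bool lacks the BASETYPE flag
           pass
   except TypeError as exc:
       error = str(exc)
   else:
       error = None

The first version of :class:`Custom`, which used only
``Py_TPFLAGS_DEFAULT``, would be rejected as a base class in the same way.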

We rename :c:func:`PyInit_custom` to :c:func:`PyInit_custom2`, update the
module name in the :c:type:`PyModuleDef` struct, and update the full class
name in the :c:type:`PyTypeObject` struct.

Finally, we update our :file:`setup.py` file to build the new module:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[
            Extension("custom", ["custom.c"]),
            Extension("custom2", ["custom2.c"]),
            ])


Providing finer control over data attributes
============================================

In this section, we'll provide finer control over how the :attr:`first` and
:attr:`last` attributes are set in the :class:`Custom` example. In the previous
version of our module, the instance variables :attr:`first` and :attr:`last`
could be set to non-string values or even deleted. We want to make sure that
these attributes always contain strings.

.. literalinclude:: ../includes/custom3.c


To provide greater control over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions.  Here are the functions for
getting and setting the :attr:`first` attribute::

   static PyObject *
   Custom_getfirst(CustomObject *self, void *closure)
   {
       Py_INCREF(self->first);
       return self->first;
   }

   static int
   Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
   {
       PyObject *tmp;
       if (value == NULL) {
           PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
           return -1;
       }
       if (!PyUnicode_Check(value)) {
           PyErr_SetString(PyExc_TypeError,
                           "The first attribute value must be a string");
           return -1;
       }
       tmp = self->first;
       Py_INCREF(value);
       self->first = value;
       Py_DECREF(tmp);
       return 0;
   }

The getter function is passed a :class:`Custom` object and a "closure", which is
a void pointer.  In this case, the closure is ignored.  (The closure supports an
advanced usage in which definition data is passed to the getter and setter. This
could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)

The setter function is passed the :class:`Custom` object, the new value, and the
closure.  The new value may be ``NULL``, in which case the attribute is being
deleted.  In our setter, we raise an error if the attribute is deleted or if its
new value is not a string.
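
In Python terms, the getter and setter pair behaves much like a
:class:`property`.  The class below is a hypothetical pure-Python stand-in,
written only to show the intended behaviour; the private name ``_first`` has
no counterpart in the C code, where the value lives in ``self->first``.

.. code-block:: python

   class Custom:
       def __init__(self):
           self._first = ""

       @property
       def first(self):
           # Counterpart of Custom_getfirst.
           return self._first

       @first.setter
       def first(self, value):
           # Counterpart of Custom_setfirst with a non-NULL value.
           if not isinstance(value, str):
               raise TypeError("The first attribute value must be a string")
           self._first = value

       @first.deleter
       def first(self):
           # Counterpart of Custom_setfirst being called with value == NULL.
           raise TypeError("Cannot delete the first attribute")


   c = Custom()
   c.first = "Ada"

Assigning a non-string to ``c.first`` or running ``del c.first`` raises
:exc:`TypeError`, which is what the C setter reports by returning ``-1``
with an exception set.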

We create an array of :c:type:`PyGetSetDef` structures::

   static PyGetSetDef Custom_getsetters[] = {
       {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
        "first name", NULL},
       {"last", (getter) Custom_getlast, (setter) Custom_setlast,
        "last name", NULL},
       {NULL}  /* Sentinel */
   };

and register it in the :c:member:`~PyTypeObject.tp_getset` slot::

   .tp_getset = Custom_getsetters,

The last item in a :c:type:`PyGetSetDef` structure is the "closure" mentioned
above.  In this case, we aren't using a closure, so we just pass ``NULL``.

We also remove the member definitions for these attributes::

   static PyMemberDef Custom_members[] = {
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only
allow strings [#]_ to be passed::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_DECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_DECREF(tmp);
       }
       return 0;
   }

With these changes, we can ensure that the ``first`` and ``last`` members are
never ``NULL`` so we can remove checks for ``NULL`` values in almost all cases.
This means that most of the :c:func:`Py_XDECREF` calls can be converted to
:c:func:`Py_DECREF` calls.  The only place we can't change these calls is in
the ``tp_dealloc`` implementation, where there is the possibility that the
initialization of these members failed in ``tp_new``.

We also rename the module initialization function and module name in the
initialization function, as we did before, and we add an extra definition to the
:file:`setup.py` file.


Supporting cyclic garbage collection
====================================

Python has a :term:`cyclic garbage collector (GC) <garbage collection>` that
can identify unneeded objects even when their reference counts are not zero.
This can happen when objects are involved in cycles.  For example, consider:

.. code-block:: pycon

   >>> l = []
   >>> l.append(l)
   >>> del l

In this example, we create a list that contains itself. When we delete it, it
still has a reference from itself. Its reference count doesn't drop to zero.
Fortunately, Python's cyclic garbage collector will eventually figure out that
the list is garbage and free it.

In the second version of the :class:`Custom` example, we allowed any kind of
object to be stored in the :attr:`first` or :attr:`last` attributes [#]_.
Besides, in the second and third versions, we allowed subclassing
:class:`Custom`, and subclasses may add arbitrary attributes.  For either of
those two reasons, :class:`Custom` objects can participate in cycles:

.. code-block:: pycon

   >>> import custom3
   >>> class Derived(custom3.Custom): pass
   ...
   >>> n = Derived()
   >>> n.some_attribute = n
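
What such a cycle means for the object's lifetime can be observed with a weak
reference.  The sketch below assumes CPython and uses a plain Python class as
a stand-in for the subclass of ``custom3.Custom``, so that it runs without
building the extension; the names ``probe``, ``alive_after_del`` and
``alive_after_collect`` are illustrative.

.. code-block:: python

   import gc
   import weakref


   class Derived:  # stand-in for a Python subclass of custom3.Custom
       pass


   gc.disable()                  # keep the collector from running on its own
   n = Derived()
   n.some_attribute = n          # the instance now refers to itself
   probe = weakref.ref(n)        # observes the object without owning it

   del n
   alive_after_del = probe() is not None      # the cycle keeps it alive

   gc.collect()                  # the cyclic GC finds and frees the cycle
   alive_after_collect = probe() is not None
   gc.enable()

Reference counting alone leaves the object alive after ``del n``; only the
explicit collection reclaims it.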

To allow a :class:`Custom` instance participating in a reference cycle to
be properly detected and collected by the cyclic GC, our :class:`Custom` type
needs to fill two additional slots and to enable a flag that enables these slots:

.. literalinclude:: ../includes/custom4.c


First, the traversal method lets the cyclic GC know about subobjects that could
participate in cycles::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       int vret;
       if (self->first) {
           vret = visit(self->first, arg);
           if (vret != 0)
               return vret;
       }
       if (self->last) {
           vret = visit(self->last, arg);
           if (vret != 0)
               return vret;
       }
       return 0;
   }

For each subobject that can participate in cycles, we need to call the
:c:func:`visit` function, which is passed to the traversal method. The
:c:func:`visit` function takes as arguments the subobject and the extra argument
*arg* passed to the traversal method.  It returns an integer value that must be
returned if it is non-zero.

Python provides a :c:func:`Py_VISIT` macro that automates calling visit
functions.  With :c:func:`Py_VISIT`, we can minimize the amount of boilerplate
in ``Custom_traverse``::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       Py_VISIT(self->first);
       Py_VISIT(self->last);
       return 0;
   }

.. note::
   The :c:member:`~PyTypeObject.tp_traverse` implementation must name its
   arguments exactly *visit* and *arg* in order to use :c:func:`Py_VISIT`.
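
The result of a traversal can be inspected from Python:
:func:`gc.get_referents` calls an object's ``tp_traverse`` slot and collects
every object handed to *visit*.  The following illustration, assuming CPython,
uses a built-in list as the container, since its C implementation visits each
element in much the same way ``Custom_traverse`` visits ``first`` and
``last``.

.. code-block:: python

   import gc

   first = object()
   last = object()
   container = [first, last]

   # Each object reported here was passed to the visit callback by the
   # list type's tp_traverse implementation.
   referents = gc.get_referents(container)

Once the type is built with a traversal function, calling
``gc.get_referents()`` on a :class:`Custom` instance should report its
``first`` and ``last`` objects in a similar way.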

Second, we need to provide a method for clearing any subobjects that can
participate in cycles::

   static int
   Custom_clear(CustomObject *self)
   {
       Py_CLEAR(self->first);
       Py_CLEAR(self->last);
       return 0;
   }

Notice the use of the :c:func:`Py_CLEAR` macro.  It is the recommended and safe
way to clear data attributes of arbitrary types while decrementing
their reference counts.  If you were to call :c:func:`Py_XDECREF` instead
on the attribute before setting it to ``NULL``, there is a possibility
that the attribute's destructor would call back into code that reads the
attribute again (*especially* if there is a reference cycle).

.. note::
   You could emulate :c:func:`Py_CLEAR` by writing::

      PyObject *tmp;
      tmp = self->first;
      self->first = NULL;
      Py_XDECREF(tmp);

   Nevertheless, it is much easier and less error-prone to always
   use :c:func:`Py_CLEAR` when deleting an attribute.  Don't
   try to micro-optimize at the expense of robustness!

The deallocator ``Custom_dealloc`` may call arbitrary code when clearing
attributes.  This means the cyclic GC can be triggered inside the function.
Since the GC assumes the reference count is not zero, we need to untrack the object
from the GC by calling :c:func:`PyObject_GC_UnTrack` before clearing members.
Here is our reimplemented deallocator using :c:func:`PyObject_GC_UnTrack`
and ``Custom_clear``::

   static void
   Custom_dealloc(CustomObject *self)
   {
       PyObject_GC_UnTrack(self);
       Custom_clear(self);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,

That's pretty much it.  If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
:c:member:`~PyTypeObject.tp_free` handlers, we'd need to modify them for cyclic
garbage collection.  Most extensions will use the versions automatically provided.
791
792
Subclassing other types
=======================

It is possible to create new extension types that are derived from existing
types. It is easiest to inherit from the built-in types, since an extension can
easily use the :c:type:`PyTypeObject` it needs. It can be difficult to share
these :c:type:`PyTypeObject` structures between extension modules.

In this example we will create a :class:`SubList` type that inherits from the
built-in :class:`list` type. The new type will be completely compatible with
regular lists, but will have an additional :meth:`increment` method that
increases an internal counter:

.. code-block:: pycon

   >>> import sublist
   >>> s = sublist.SubList(range(3))
   >>> s.extend(s)
   >>> print(len(s))
   6
   >>> print(s.increment())
   1
   >>> print(s.increment())
   2

.. literalinclude:: ../includes/sublist.c


As you can see, the source code closely resembles the :class:`Custom` examples in
previous sections. We will break down the main differences between them. ::

   typedef struct {
       PyListObject list;
       int state;
   } SubListObject;

The primary difference for derived type objects is that the base type's
object structure must be the first member.  The base type will already include
the :c:func:`PyObject_HEAD` at the beginning of its structure.

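The layout rule can be checked in plain C.  The sketch below does not use the
Python headers; ``BaseObject`` and ``DerivedObject`` are hypothetical
stand-ins for :c:type:`PyListObject` and ``SubListObject``, and the helper
names are likewise made up for illustration:

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Hypothetical stand-in for a base type's object structure. */
   typedef struct {
       long refcnt;
       long size;
   } BaseObject;

   /* The base structure comes first, followed by the subtype's own fields. */
   typedef struct {
       BaseObject base;
       int state;
   } DerivedObject;

   /* Code written for the base type only knows about BaseObject. */
   static long
   base_get_size(BaseObject *op)
   {
       return op->size;
   }

   /* Returns 1 if a DerivedObject can be used through a BaseObject pointer. */
   static int
   derived_is_usable_as_base(void)
   {
       DerivedObject d = {{1, 42}, 7};

       /* C places the first member at offset 0, so a pointer to the
          whole structure and a pointer to its first member coincide. */
       assert(offsetof(DerivedObject, base) == 0);
       return (void *) &d == (void *) &d.base
           && base_get_size((BaseObject *) &d) == 42
           && d.state == 7;
   }

If the base structure were placed after ``state`` instead, the cast in
``derived_is_usable_as_base`` would make ``base_get_size`` read the wrong
memory, which is why the base type's structure has to come first.
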
When a Python object is a :class:`SubList` instance, its ``PyObject *`` pointer
can be safely cast to both ``PyListObject *`` and ``SubListObject *``::

   static int
   SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
   {
       if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
           return -1;
       self->state = 0;
       return 0;
   }

We see above how to call through to the :meth:`__init__` method of the base
type.

This pattern is important when writing a type with custom
:c:member:`~PyTypeObject.tp_new` and :c:member:`~PyTypeObject.tp_dealloc`
members.  The :c:member:`~PyTypeObject.tp_new` handler should not actually
create the memory for the object with its :c:member:`~PyTypeObject.tp_alloc`,
but let the base class handle it by calling its own :c:member:`~PyTypeObject.tp_new`.

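As an illustration, a custom :c:member:`~PyTypeObject.tp_new` handler
following this rule might look like the sketch below.  ``SubList_new`` is a
hypothetical function written for this illustration; it assumes the
``SubListObject`` layout shown earlier and is not taken from the example
module:

.. code-block:: c

   #define PY_SSIZE_T_CLEAN
   #include <Python.h>

   typedef struct {
       PyListObject list;
       int state;
   } SubListObject;

   /* Hypothetical tp_new handler for a list subtype. */
   static PyObject *
   SubList_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       /* Let the base type create the object instead of calling
          type->tp_alloc directly.  Because ``type`` is passed through,
          the base handler allocates an object of the subtype's size. */
       SubListObject *self =
           (SubListObject *) PyList_Type.tp_new(type, args, kwds);
       if (self == NULL)
           return NULL;

       /* Only the fields added by the subtype are set up here. */
       self->state = 0;
       return (PyObject *) self;
   }

Such a handler would be installed with ``.tp_new = SubList_new`` in the type
definition.
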
The :c:type:`PyTypeObject` struct supports a :c:member:`~PyTypeObject.tp_base`
specifying the type's concrete base class.  Due to cross-platform compiler
issues, you can't fill that field directly with a reference to
:c:data:`PyList_Type`; it should be done later in the module initialization
function::

   PyMODINIT_FUNC
   PyInit_sublist(void)
   {
       PyObject* m;
       SubListType.tp_base = &PyList_Type;
       if (PyType_Ready(&SubListType) < 0)
           return NULL;

       m = PyModule_Create(&sublistmodule);
       if (m == NULL)
           return NULL;

       Py_INCREF(&SubListType);
       if (PyModule_AddObject(m, "SubList", (PyObject *) &SubListType) < 0) {
           Py_DECREF(&SubListType);
           Py_DECREF(m);
           return NULL;
       }

       return m;
   }

Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in.  When we are deriving from an
existing type, it is not necessary to fill out the :c:member:`~PyTypeObject.tp_new`
slot with :c:func:`PyType_GenericNew`, because the creation function from the
base type will be inherited.

After that, calling :c:func:`PyType_Ready` and adding the type object to the
module is the same as with the basic :class:`Custom` examples.


.. rubric:: Footnotes

.. [#] This is true when we know that the object is a basic type, like a string or a
   float.

.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler
   in this example, because our type doesn't support garbage collection.

.. [#] We now know that the first and last members are strings, so perhaps we
   could be less careful about decrementing their reference counts.  However,
   we accept instances of string subclasses.  Even though deallocating normal
   strings won't call back into our objects, we can't guarantee that deallocating
   an instance of a string subclass won't call back into our objects.

.. [#] Also, even with our attributes restricted to string instances, the user
   could pass arbitrary :class:`str` subclasses and therefore still create
   reference cycles.
