.. highlightlang:: c

.. _defining-new-types:

**********************************
Defining Extension Types: Tutorial
**********************************

.. sectionauthor:: Michael Hudson <mwh@python.net>
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
.. sectionauthor:: Jim Fulton <jim@zope.com>


Python allows the writer of a C extension module to define new types that
can be manipulated from Python code, much like the built-in :class:`str`
and :class:`list` types.  The code for all extension types follows a
pattern, but there are some details that you need to understand before you
can get started.  This document is a gentle introduction to the topic.


.. _dnt-basics:

The Basics
==========

The :term:`CPython` runtime sees all Python objects as variables of type
:c:type:`PyObject\*`, which serves as a "base type" for all Python objects.
The :c:type:`PyObject` structure itself only contains the object's
:term:`reference count` and a pointer to the object's "type object".
This is where the action is; the type object determines which (C) functions
get called by the interpreter when, for instance, an attribute gets looked up
on an object, a method called, or it is multiplied by another object.  These
C functions are called "type methods".

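Both header fields can be observed from Python without writing any C.  The
following is a small sketch, assuming a standard CPython build:
``sys.getrefcount`` reports the reference count (inflated by the temporary
reference held by its own argument, so only the difference is meaningful) and
``type()`` returns the type object that supplies the type methods.

```python
import sys

value = [1, 2, 3]

# The type pointer: every object knows its type object.
kind = type(value)

# The reference count: binding one more name adds exactly one reference.
before = sys.getrefcount(value)
alias = value
after = sys.getrefcount(value)

# Operators are looked up on the type object, not on the instance, so
# calling the type's method directly gives the same result as ``*``.
via_operator = value * 2
via_type_method = kind.__mul__(value, 2)
```

Comparing the two counts rather than reading an absolute number keeps the
sketch independent of how many internal references a given interpreter
version happens to hold.
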
So, if you want to define a new extension type, you need to create a new type
object.

This sort of thing can only be explained by example, so here's a minimal, but
complete, module that defines a new type named :class:`Custom` inside a C
extension module :mod:`custom`:

.. note::
   What we're showing here is the traditional way of defining *static*
   extension types.  It should be adequate for most uses.  The C API also
   allows defining heap-allocated extension types using the
   :c:func:`PyType_FromSpec` function, which isn't covered in this tutorial.

.. literalinclude:: ../includes/custom.c

Now that's quite a bit to take in at once, but hopefully bits will seem familiar
from the previous chapter.  This file defines three things:

#. What a :class:`Custom` **object** contains: this is the ``CustomObject``
   struct, which is allocated once for each :class:`Custom` instance.
#. How the :class:`Custom` **type** behaves: this is the ``CustomType`` struct,
   which defines a set of flags and function pointers that the interpreter
   inspects when specific operations are requested.
#. How to initialize the :mod:`custom` module: this is the ``PyInit_custom``
   function and the associated ``custommodule`` struct.

The first bit is::

   typedef struct {
       PyObject_HEAD
   } CustomObject;

This is what a Custom object will contain.  ``PyObject_HEAD`` is mandatory
at the start of each object struct and defines a field called ``ob_base``
of type :c:type:`PyObject`, containing a pointer to a type object and a
reference count (these can be accessed using the macros :c:macro:`Py_TYPE`
and :c:macro:`Py_REFCNT` respectively).  The reason for the macro is to
abstract away the layout and to enable additional fields in debug builds.

.. note::
   There is no semicolon above after the :c:macro:`PyObject_HEAD` macro.
   Be wary of adding one by accident: some compilers will complain.

Of course, objects generally store additional data besides the standard
``PyObject_HEAD`` boilerplate; for example, here is the definition for
standard Python floats::

   typedef struct {
       PyObject_HEAD
       double ob_fval;
   } PyFloatObject;

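The effect of the extra field is visible from Python through the
``__basicsize__`` attribute of a type, which reports the size of its instance
struct.  A quick check, assuming CPython (the header size differs between
builds, so only the difference is compared):

```python
import struct

# ``__basicsize__`` exposes the C-level size of a type's instance struct.
header_only = object.__basicsize__   # an object that is just PyObject_HEAD
float_size = float.__basicsize__     # PyObject_HEAD plus ``double ob_fval``

# The float struct is larger than the bare header by one C double.
extra_bytes = float_size - header_only
size_of_double = struct.calcsize("d")
```
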
The second bit is the definition of the type object. ::

   static PyTypeObject CustomType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       .tp_name = "custom.Custom",
       .tp_doc = "Custom objects",
       .tp_basicsize = sizeof(CustomObject),
       .tp_itemsize = 0,
       .tp_flags = Py_TPFLAGS_DEFAULT,
       .tp_new = PyType_GenericNew,
   };

.. note::
   We recommend using C99-style designated initializers as above, to
   avoid listing all the :c:type:`PyTypeObject` fields that you don't care
   about and also to avoid caring about the fields' declaration order.

The actual definition of :c:type:`PyTypeObject` in :file:`object.h` has
many more :ref:`fields <type-structs>` than the definition above.  The
remaining fields will be filled with zeros by the C compiler, and it's
common practice to not specify them explicitly unless you need them.

We're going to pick it apart, one field at a time::

   PyVarObject_HEAD_INIT(NULL, 0)

This line is mandatory boilerplate to initialize the ``ob_base``
field mentioned above. ::

   .tp_name = "custom.Custom",

The name of our type.  This will appear in the default textual representation of
our objects and in some error messages, for example:

.. code-block:: pycon

   >>> "" + custom.Custom()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: can only concatenate str (not "custom.Custom") to str

Note that the name is a dotted name that includes both the module name and the
name of the type within the module. The module in this case is :mod:`custom` and
the type is :class:`Custom`, so we set the type name to :class:`custom.Custom`.
Using the real dotted import path is important to make your type compatible
with the :mod:`pydoc` and :mod:`pickle` modules. ::

   .tp_basicsize = sizeof(CustomObject),
   .tp_itemsize = 0,

This is so that Python knows how much memory to allocate when creating
new :class:`Custom` instances.  :c:member:`~PyTypeObject.tp_itemsize` is
only used for variable-sized objects and should otherwise be zero.

.. note::

   If you want your type to be subclassable from Python, and your type has the same
   :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
   inheritance.  A Python subclass of your type will have to list your type first
   in its :attr:`~class.__bases__`, or else it will not be able to call your type's
   :meth:`__new__` method without getting an error.  You can avoid this problem by
   ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
   base type does.  Most of the time, this will be true anyway, because either your
   base type will be :class:`object`, or else you will be adding data members to
   your base type, and therefore increasing its size.

We set the class flags to :const:`Py_TPFLAGS_DEFAULT`. ::

   .tp_flags = Py_TPFLAGS_DEFAULT,

All types should include this constant in their flags.  It enables all of the
members defined until at least Python 3.3.  If you need further members,
you will need to OR the corresponding flags.

We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::

   .tp_doc = "Custom objects",

To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new`
handler.  This is the equivalent of the Python method :meth:`__new__`, but
has to be specified explicitly.  In this case, we can just use the default
implementation provided by the API function :c:func:`PyType_GenericNew`. ::

   .tp_new = PyType_GenericNew,

Everything else in the file should be familiar, except for some code in
:c:func:`PyInit_custom`::

   if (PyType_Ready(&CustomType) < 0)
       return NULL;

This initializes the :class:`Custom` type, filling in a number of members
to the appropriate default values, including :attr:`ob_type` that we initially
set to *NULL*. ::

   PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);

This adds the type to the module dictionary.  This allows us to create
:class:`Custom` instances by calling the :class:`Custom` class:

.. code-block:: pycon

   >>> import custom
   >>> mycustom = custom.Custom()

That's it!  All that remains is to build it; put the above code in a file called
:file:`custom.c` and:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[Extension("custom", ["custom.c"])])

in a file called :file:`setup.py`; then typing

.. code-block:: shell-session

   $ python setup.py build

at a shell should produce a file :file:`custom.so` in a subdirectory; move to
that directory and fire up Python --- you should be able to ``import custom`` and
play around with Custom objects.

That wasn't so hard, was it?

Of course, the current Custom type is pretty uninteresting. It has no data and
doesn't do anything. It can't even be subclassed.

.. note::
   While this documentation showcases the standard :mod:`distutils` module
   for building C extensions, it is recommended in real-world use cases to
   use the newer and better-maintained ``setuptools`` library.  Documentation
   on how to do this is out of scope for this document and can be found in
   the `Python Packaging User's Guide <https://packaging.python.org/tutorials/distributing-packages/>`_.
   Note that :mod:`distutils` was deprecated in Python 3.10 and removed in
   Python 3.12.


Adding data and methods to the Basic example
============================================

Let's extend the basic example to add some data and methods.  Let's also make
the type usable as a base class. We'll create a new module, :mod:`custom2`, that
adds these capabilities:

.. literalinclude:: ../includes/custom2.c


This version of the module has a number of changes.

We've added an extra include::

   #include <structmember.h>

This include provides declarations that we use to handle attributes, as
described a bit later.

The :class:`Custom` type now has three data attributes in its C struct,
*first*, *last*, and *number*.  The *first* and *last* variables are Python
strings containing first and last names.  The *number* attribute is a C integer.

The object structure is updated accordingly::

   typedef struct {
       PyObject_HEAD
       PyObject *first; /* first name */
       PyObject *last;  /* last name */
       int number;
   } CustomObject;

Because we now have data to manage, we have to be more careful about object
allocation and deallocation.  At a minimum, we need a deallocation method::

   static void
   Custom_dealloc(CustomObject *self)
   {
       Py_XDECREF(self->first);
       Py_XDECREF(self->last);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::

   .tp_dealloc = (destructor) Custom_dealloc,

This method first decrements the reference counts of the two Python attributes.
:c:func:`Py_XDECREF` correctly handles the case where its argument is
*NULL* (which might happen here if ``tp_new`` failed midway).  It then
calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
(computed by ``Py_TYPE(self)``) to free the object's memory.  Note that
the object's type might not be :class:`CustomType`, because the object may
be an instance of a subclass.

.. note::
   The explicit cast to ``destructor`` above is needed because we defined
   ``Custom_dealloc`` to take a ``CustomObject *`` argument, but the ``tp_dealloc``
   function pointer expects to receive a ``PyObject *`` argument.  Otherwise,
   the compiler will emit a warning.  This is object-oriented polymorphism,
   in C!

We want to make sure that the first and last names are initialized to empty
strings, so we provide a ``tp_new`` implementation::

   static PyObject *
   Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       CustomObject *self;
       self = (CustomObject *) type->tp_alloc(type, 0);
       if (self != NULL) {
           self->first = PyUnicode_FromString("");
           if (self->first == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->last = PyUnicode_FromString("");
           if (self->last == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->number = 0;
       }
       return (PyObject *) self;
   }

and install it in the :c:member:`~PyTypeObject.tp_new` member::

   .tp_new = Custom_new,

The ``tp_new`` handler is responsible for creating (as opposed to initializing)
objects of the type.  It is exposed in Python as the :meth:`__new__` method.
It is not required to define a ``tp_new`` member, and indeed many extension
types will simply reuse :c:func:`PyType_GenericNew` as done in the first
version of the ``Custom`` type above.  In this case, we use the ``tp_new``
handler to initialize the ``first`` and ``last`` attributes to non-*NULL*
default values.

``tp_new`` is passed the type being instantiated (not necessarily ``CustomType``,
if a subclass is instantiated) and any arguments passed when the type was
called, and is expected to return the instance created.  ``tp_new`` handlers
always accept positional and keyword arguments, but they often ignore the
arguments, leaving the argument handling to initializer (a.k.a. ``tp_init``
in C or ``__init__`` in Python) methods.

.. note::
   ``tp_new`` shouldn't call ``tp_init`` explicitly, as the interpreter
   will do it itself.

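The division of labour between the two handlers mirrors ``__new__`` and
``__init__`` in pure Python.  The sketch below is only an analogue of the C
code, not part of the extension; the ``calls`` list and the ``Derived`` class
are hypothetical names introduced to record what happens.

```python
calls = []  # records which hook ran and for which type


class Custom:
    def __new__(cls, *args, **kwargs):
        # Creation: allocate the instance and give every attribute a
        # safe default, ignoring the constructor arguments.
        self = super().__new__(cls)
        self.first = ""
        self.last = ""
        self.number = 0
        calls.append(("new", cls.__name__))
        return self

    def __init__(self, first="", last="", number=0):
        # Initialization: interpret the constructor arguments.
        self.first, self.last, self.number = first, last, number
        calls.append(("init", type(self).__name__))


class Derived(Custom):
    pass


# Calling the type runs __new__ first, then __init__, with no explicit
# call from one to the other.
obj = Derived("Ada", "Lovelace", 1)
```

Here ``__new__`` receives ``Derived`` rather than ``Custom``, which is the
Python-level counterpart of ``tp_new`` being passed the type that is actually
being instantiated.
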
The ``tp_new`` implementation calls the :c:member:`~PyTypeObject.tp_alloc`
slot to allocate memory::

   self = (CustomObject *) type->tp_alloc(type, 0);

Since memory allocation may fail, we must check the :c:member:`~PyTypeObject.tp_alloc`
result against *NULL* before proceeding.

.. note::
   We didn't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather
   :c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
   which is :class:`object` by default.  Most types use the default allocation
   strategy.

.. note::
   If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one
   that calls a base type's :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`),
   you must *not* try to determine what method to call using method resolution
   order at runtime.  Always statically determine what type you are going to
   call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
   ``type->tp_base->tp_new``.  If you do not do this, Python subclasses of your
   type that also inherit from other Python-defined classes may not work correctly.
   (Specifically, you may not be able to create instances of such subclasses
   without getting a :exc:`TypeError`.)

We also define an initialization function which accepts arguments to provide
initial values for our instance::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_XDECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_XDECREF(tmp);
       }
       return 0;
   }

by filling the :c:member:`~PyTypeObject.tp_init` slot. ::

   .tp_init = (initproc) Custom_init,

The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the
:meth:`__init__` method.  It is used to initialize an object after it's
created.  Initializers always accept positional and keyword arguments,
and they should return either ``0`` on success or ``-1`` on error.

Unlike the ``tp_new`` handler, there is no guarantee that ``tp_init``
is called at all (for example, the :mod:`pickle` module by default
doesn't call :meth:`__init__` on unpickled instances).  It can also be
called multiple times.  Anyone can call the :meth:`__init__` method on
our objects.  For this reason, we have to be extra careful when assigning
the new attribute values.  We might be tempted, for example, to assign the
``first`` member like this::

   if (first) {
       Py_XDECREF(self->first);
       Py_INCREF(first);
       self->first = first;
   }

But this would be risky.  Our type doesn't restrict the type of the
``first`` member, so it could be any kind of object.  It could have a
destructor that causes code to be executed that tries to access the
``first`` member; or that destructor could release the
:term:`global interpreter lock` and let arbitrary code run in other
threads that accesses and modifies our object.

To be paranoid and protect ourselves against this possibility, we almost
always reassign members before decrementing their reference counts.  When
don't we have to do this?

* when we absolutely know that the reference count is greater than 1;

* when we know that deallocation of the object [#]_ will neither release
  the :term:`GIL` nor cause any calls back into our type's code;

* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc`
  handler on a type which doesn't support cyclic garbage collection [#]_.

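The re-entrancy described above can be reproduced from pure Python.  This is
an illustration only, assuming CPython's immediate, reference-count-driven
finalization; ``Holder`` and ``Noisy`` are hypothetical names.  The finalizer
of the old value looks back at the object that held it, which is exactly the
situation in which the C code must already have stored the new value.

```python
seen = []  # what the finalizer observed


class Holder:
    pass


class Noisy:
    """Value whose finalizer looks back at the object that held it."""

    def __init__(self, holder):
        self.holder = holder

    def __del__(self):
        seen.append(self.holder.first)


holder = Holder()
holder.first = Noisy(holder)

# Replacing the attribute drops the last reference to the Noisy
# instance, so on CPython its finalizer runs during this assignment.
holder.first = "replacement"
```

The interpreter stores the new value before releasing the old one, so the
finalizer sees a valid attribute.  The "reassign first, decrement afterwards"
pattern in ``Custom_init`` gives the C member the same guarantee.
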
We want to expose our instance variables as attributes. There are a
number of ways to do that. The simplest way is to define member definitions::

   static PyMemberDef Custom_members[] = {
       {"first", T_OBJECT_EX, offsetof(CustomObject, first), 0,
        "first name"},
       {"last", T_OBJECT_EX, offsetof(CustomObject, last), 0,
        "last name"},
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::

   .tp_members = Custom_members,

Each member definition has a member name, type, offset, access flags and
documentation string.  See the :ref:`Generic-Attribute-Management` section
below for details.

A disadvantage of this approach is that it doesn't provide a way to restrict the
types of objects that can be assigned to the Python attributes.  We expect the
first and last names to be strings, but any Python objects can be assigned.
Further, the attributes can be deleted, setting the C pointers to *NULL*.  Even
though we can make sure the members are initialized to non-*NULL* values, the
members can be set to *NULL* if the attributes are deleted.

We define a single method, :meth:`Custom.name()`, that returns the object's name
as the concatenation of the first and last names. ::

   static PyObject *
   Custom_name(CustomObject *self)
   {
       if (self->first == NULL) {
           PyErr_SetString(PyExc_AttributeError, "first");
           return NULL;
       }
       if (self->last == NULL) {
           PyErr_SetString(PyExc_AttributeError, "last");
           return NULL;
       }
       return PyUnicode_FromFormat("%S %S", self->first, self->last);
   }

The method is implemented as a C function that takes a :class:`Custom` (or
:class:`Custom` subclass) instance as the first argument.  Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method:

.. code-block:: python

   def name(self):
       return "%s %s" % (self.first, self.last)

Note that we have to check for the possibility that our :attr:`first` and
:attr:`last` members are *NULL*.  This is because they can be deleted, in which
case they are set to *NULL*.  It would be better to prevent deletion of these
attributes and to restrict the attribute values to be strings.  We'll see how to
do that in the next section.

Now that we've defined the method, we need to create an array of method
definitions::

   static PyMethodDef Custom_methods[] = {
       {"name", (PyCFunction) Custom_name, METH_NOARGS,
        "Return the name, combining the first and last name"
       },
       {NULL}  /* Sentinel */
   };

(Note that we used the :const:`METH_NOARGS` flag to indicate that the method
is expecting no arguments other than *self*.)

and assign it to the :c:member:`~PyTypeObject.tp_methods` slot::

   .tp_methods = Custom_methods,

Finally, we'll make our type usable as a base class for subclassing.  We've
written our methods carefully so far so that they don't make any assumptions
about the type of the object being created or used, so all we need to do is
to add the :const:`Py_TPFLAGS_BASETYPE` to our class flag definition::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,

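The effect of this flag can be seen on built-in types, which are defined the
same way.  In CPython, ``bool`` does not set it while ``list`` does; the
following sketch (with hypothetical class names ``MyBool`` and ``MyList``)
shows the difference from the Python side.

```python
# ``bool`` is implemented in C without Py_TPFLAGS_BASETYPE, so the
# interpreter refuses to derive from it.
try:
    class MyBool(bool):
        pass
except TypeError as exc:
    bool_error = str(exc)
else:
    bool_error = None


# ``list`` sets the flag, so subclassing works as usual.
class MyList(list):
    pass


items = MyList([1, 2, 3])
```

A ``Custom`` type built without the flag behaves like ``bool`` here: any
attempt to subclass it raises :exc:`TypeError`.
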
We rename :c:func:`PyInit_custom` to :c:func:`PyInit_custom2`, update the
module name in the :c:type:`PyModuleDef` struct, and update the full class
name in the :c:type:`PyTypeObject` struct.

Finally, we update our :file:`setup.py` file to build the new module:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[
            Extension("custom", ["custom.c"]),
            Extension("custom2", ["custom2.c"]),
            ])


Providing finer control over data attributes
============================================

In this section, we'll provide finer control over how the :attr:`first` and
:attr:`last` attributes are set in the :class:`Custom` example. In the previous
version of our module, the instance variables :attr:`first` and :attr:`last`
could be set to non-string values or even deleted. We want to make sure that
these attributes always contain strings.

.. literalinclude:: ../includes/custom3.c


To provide greater control over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions.  Here are the functions for
getting and setting the :attr:`first` attribute::

   static PyObject *
   Custom_getfirst(CustomObject *self, void *closure)
   {
       Py_INCREF(self->first);
       return self->first;
   }

   static int
   Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
   {
       PyObject *tmp;
       if (value == NULL) {
           PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
           return -1;
       }
       if (!PyUnicode_Check(value)) {
           PyErr_SetString(PyExc_TypeError,
                           "The first attribute value must be a string");
           return -1;
       }
       tmp = self->first;
       Py_INCREF(value);
       self->first = value;
       Py_DECREF(tmp);
       return 0;
   }

The getter function is passed a :class:`Custom` object and a "closure", which is
a void pointer.  In this case, the closure is ignored.  (The closure supports an
advanced usage in which definition data is passed to the getter and setter. This
could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)

The setter function is passed the :class:`Custom` object, the new value, and the
closure.  The new value may be *NULL*, in which case the attribute is being
deleted.  In our setter, we raise an error if the attribute is deleted or if its
new value is not a string.

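For comparison, the same policy can be written in pure Python with a
``property``.  This is only an analogue of the C getter and setter, assuming
the same error messages; the ``_first`` storage attribute and the ``attempt``
helper are hypothetical names used for the demonstration.

```python
class Custom:
    def __init__(self, first=""):
        self._first = ""      # always a str, like the C member after tp_new
        self.first = first    # goes through the setter below

    @property
    def first(self):
        return self._first

    @first.setter
    def first(self, value):
        if not isinstance(value, str):
            raise TypeError("The first attribute value must be a string")
        self._first = value

    @first.deleter
    def first(self):
        raise TypeError("Cannot delete the first attribute")


def attempt(action):
    """Run *action* and return the TypeError message, or None on success."""
    try:
        action()
    except TypeError as exc:
        return str(exc)
    return None


c = Custom("Ada")
bad_assign = attempt(lambda: setattr(c, "first", 42))
bad_delete = attempt(lambda: delattr(c, "first"))
```

In C a single setter receives *NULL* for deletion, whereas the Python
``property`` splits assignment and deletion into two separate hooks.
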
We create an array of :c:type:`PyGetSetDef` structures::

   static PyGetSetDef Custom_getsetters[] = {
       {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
        "first name", NULL},
       {"last", (getter) Custom_getlast, (setter) Custom_setlast,
        "last name", NULL},
       {NULL}  /* Sentinel */
   };

and register it in the :c:member:`~PyTypeObject.tp_getset` slot::

   .tp_getset = Custom_getsetters,

The last item in a :c:type:`PyGetSetDef` structure is the "closure" mentioned
above.  In this case, we aren't using a closure, so we just pass *NULL*.

We also remove the member definitions for these attributes::

   static PyMemberDef Custom_members[] = {
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only
allow strings [#]_ to be passed::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_DECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_DECREF(tmp);
       }
       return 0;
   }

With these changes, we can ensure that the ``first`` and ``last`` members are
never *NULL* so we can remove checks for *NULL* values in almost all cases.
This means that most of the :c:func:`Py_XDECREF` calls can be converted to
:c:func:`Py_DECREF` calls.  The only place we can't change these calls is in
the ``tp_dealloc`` implementation, where there is the possibility that the
initialization of these members failed in ``tp_new``.

We also rename the module initialization function and module name in the
initialization function, as we did before, and we add an extra definition to the
:file:`setup.py` file.


Supporting cyclic garbage collection
====================================

Python has a :term:`cyclic garbage collector (GC) <garbage collection>` that
can identify unneeded objects even when their reference counts are not zero.
This can happen when objects are involved in cycles.  For example, consider:

.. code-block:: pycon

   >>> l = []
   >>> l.append(l)
   >>> del l

In this example, we create a list that contains itself. When we delete it, it
still has a reference from itself. Its reference count doesn't drop to zero.
Fortunately, Python's cyclic garbage collector will eventually figure out that
the list is garbage and free it.

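The behaviour just described can be checked with the :mod:`gc` and
:mod:`weakref` modules.  The sketch below assumes CPython and uses a small
hypothetical ``Node`` class instead of a list, because lists cannot be weakly
referenced; automatic collection is paused so that only the explicit
``gc.collect()`` call can reclaim the cycle.

```python
import gc
import weakref


class Node:
    pass


was_enabled = gc.isenabled()
gc.disable()  # keep an automatic collection from interfering
try:
    node = Node()
    node.myself = node           # the object now refers to itself
    probe = weakref.ref(node)    # lets us check liveness without a strong ref

    del node
    alive_after_del = probe() is not None

    gc.collect()                 # run the cyclic collector explicitly
    alive_after_collect = probe() is not None
finally:
    if was_enabled:
        gc.enable()
```

After ``del node`` the object is unreachable but still alive, because its own
attribute keeps the reference count above zero; only the collector frees it.
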
In the second version of the :class:`Custom` example, we allowed any kind of
object to be stored in the :attr:`first` or :attr:`last` attributes [#]_.
Besides, in the second and third versions, we allowed subclassing
:class:`Custom`, and subclasses may add arbitrary attributes.  For either of
those two reasons, :class:`Custom` objects can participate in cycles:

.. code-block:: pycon

   >>> import custom3
   >>> class Derived(custom3.Custom): pass
   ...
   >>> n = Derived()
   >>> n.some_attribute = n

To allow a :class:`Custom` instance participating in a reference cycle to
be properly detected and collected by the cyclic GC, our :class:`Custom` type
needs to fill two additional slots and to enable a flag that enables these slots:

.. literalinclude:: ../includes/custom4.c


First, the traversal method lets the cyclic GC know about subobjects that could
participate in cycles::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       int vret;
       if (self->first) {
           vret = visit(self->first, arg);
           if (vret != 0)
               return vret;
       }
       if (self->last) {
           vret = visit(self->last, arg);
           if (vret != 0)
               return vret;
       }
       return 0;
   }

For each subobject that can participate in cycles, we need to call the
:c:func:`visit` function, which is passed to the traversal method. The
:c:func:`visit` function takes as arguments the subobject and the extra argument
*arg* passed to the traversal method.  It returns an integer value that must be
returned if it is non-zero.

Python provides a :c:func:`Py_VISIT` macro that automates calling visit
functions.  With :c:func:`Py_VISIT`, we can minimize the amount of boilerplate
in ``Custom_traverse``::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       Py_VISIT(self->first);
       Py_VISIT(self->last);
       return 0;
   }

.. note::
   The :c:member:`~PyTypeObject.tp_traverse` implementation must name its
   arguments exactly *visit* and *arg* in order to use :c:func:`Py_VISIT`.

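A traversal function can be exercised from Python with
``gc.get_referents()``, which returns the objects visited by the C-level
``tp_traverse`` of its arguments.  The sketch below uses a built-in list as a
stand-in for a :class:`Custom` instance, since no extension module is built
here; ``first`` and ``last`` are just illustrative variable names.

```python
import gc

first = object()
last = object()
container = [first, last]

# gc.get_referents() runs the container type's traversal function and
# collects every object handed to the visit callback.
referents = gc.get_referents(container)
visited_ids = {id(obj) for obj in referents}
```

Applied to an instance of the ``custom4`` type, the same call would be
expected to report the objects stored in ``first`` and ``last``, because those
are the members passed to :c:func:`Py_VISIT`.
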
732Second, we need to provide a method for clearing any subobjects that can
733participate in cycles::
734
735   static int
736   Custom_clear(CustomObject *self)
737   {
738       Py_CLEAR(self->first);
739       Py_CLEAR(self->last);
740       return 0;
741   }
742
743Notice the use of the :c:func:`Py_CLEAR` macro.  It is the recommended and safe
744way to clear data attributes of arbitrary types while decrementing
745their reference counts.  If you were to call :c:func:`Py_XDECREF` instead
746on the attribute before setting it to *NULL*, there is a possibility
747that the attribute's destructor would call back into code that reads the
748attribute again (*especially* if there is a reference cycle).
749
750.. note::
751   You could emulate :c:func:`Py_CLEAR` by writing::
752
753      PyObject *tmp;
754      tmp = self->first;
755      self->first = NULL;
756      Py_XDECREF(tmp);
757
758   Nevertheless, it is much easier and less error-prone to always
759   use :c:func:`Py_CLEAR` when deleting an attribute.  Don't
760   try to micro-optimize at the expense of robustness!
761
762The deallocator ``Custom_dealloc`` may call arbitrary code when clearing
763attributes.  It means the circular GC can be triggered inside the function.
764Since the GC assumes reference count is not zero, we need to untrack the object
765from the GC by calling :c:func:`PyObject_GC_UnTrack` before clearing members.
766Here is our reimplemented deallocator using :c:func:`PyObject_GC_UnTrack`
767and ``Custom_clear``::
768
769   static void
770   Custom_dealloc(CustomObject *self)
771   {
772       PyObject_GC_UnTrack(self);
773       Custom_clear(self);
774       Py_TYPE(self)->tp_free((PyObject *) self);
775   }
776
Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,

That's pretty much it.  If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
:c:member:`~PyTypeObject.tp_free` handlers, we'd need to modify them for cyclic
garbage collection.  Most extensions will use the versions automatically provided.


Subclassing other types
=======================

It is possible to create new extension types that are derived from existing
types. It is easiest to inherit from the built-in types, since an extension can
easily use the :c:type:`PyTypeObject` it needs. It can be difficult to share
these :c:type:`PyTypeObject` structures between extension modules.

In this example we will create a :class:`SubList` type that inherits from the
built-in :class:`list` type. The new type will be completely compatible with
regular lists, but will have an additional :meth:`increment` method that
increases an internal counter:

.. code-block:: pycon

   >>> import sublist
   >>> s = sublist.SubList(range(3))
   >>> s.extend(s)
   >>> print(len(s))
   6
   >>> print(s.increment())
   1
   >>> print(s.increment())
   2

.. literalinclude:: ../includes/sublist.c


As you can see, the source code closely resembles the :class:`Custom` examples in
previous sections. We will break down the main differences between them. ::

   typedef struct {
       PyListObject list;
       int state;
   } SubListObject;

The primary difference for derived type objects is that the base type's
object structure must be the first member.  The base type will already include
the :c:func:`PyObject_HEAD` at the beginning of its structure.

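The layout rule can be checked in plain C.  The sketch below does not use the
Python headers: ``ToyHead``, ``ToyList`` and ``ToySubList`` are hypothetical,
heavily simplified stand-ins for ``PyObject``, ``PyListObject`` and
``SubListObject``, assumed only for this illustration:

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Simplified stand-ins; the real structures have more fields. */
   typedef struct {
       long refcnt;
       const char *type_name;
   } ToyHead;

   typedef struct {
       ToyHead head;     /* common header comes first in the base type */
       size_t length;
   } ToyList;

   typedef struct {
       ToyList list;     /* base structure must be the first member */
       int state;
   } ToySubList;

   /* Code written for the base type only knows about ToyList. */
   static size_t
   toy_list_length(ToyHead *op)
   {
       return ((ToyList *)op)->length;
   }

   /* Code written for the derived type sees the extra field as well. */
   static int
   toy_sublist_increment(ToyHead *op)
   {
       ToySubList *self = (ToySubList *)op;
       self->state++;
       return self->state;
   }

   /* Returns 1 if both nested structures start at offset zero. */
   static int
   toy_layout_is_compatible(void)
   {
       return offsetof(ToySubList, list) == 0
           && offsetof(ToyList, head) == 0;
   }

Because C guarantees that a pointer to a structure, suitably converted, points
to its first member, the same address is valid as a ``ToyHead *``, a
``ToyList *`` and a ``ToySubList *``.  If ``state`` were declared before
``list``, ``toy_list_length`` would read the wrong memory.
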
When a Python object is a :class:`SubList` instance, its ``PyObject *`` pointer
can be safely cast to both ``PyListObject *`` and ``SubListObject *``::

   static int
   SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
   {
       if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
           return -1;
       self->state = 0;
       return 0;
   }

We see above how to call through to the :attr:`__init__` method of the base
type.

This pattern is important when writing a type with custom
:c:member:`~PyTypeObject.tp_new` and :c:member:`~PyTypeObject.tp_dealloc`
members.  The :c:member:`~PyTypeObject.tp_new` handler should not actually
allocate the memory for the object with its :c:member:`~PyTypeObject.tp_alloc`,
but let the base class handle it by calling its own :c:member:`~PyTypeObject.tp_new`.

The :c:type:`PyTypeObject` struct has a :c:member:`~PyTypeObject.tp_base`
member specifying the type's concrete base class.  Due to cross-platform compiler
issues, you can't fill that field directly with a reference to
:c:type:`PyList_Type`; it should be done later in the module initialization
function::

   PyMODINIT_FUNC
   PyInit_sublist(void)
   {
       PyObject* m;
       /* Set the base type at run time, before PyType_Ready(). */
       SubListType.tp_base = &PyList_Type;
       if (PyType_Ready(&SubListType) < 0)
           return NULL;

       m = PyModule_Create(&sublistmodule);
       if (m == NULL)
           return NULL;

       /* PyModule_AddObject() steals the reference only on success. */
       Py_INCREF(&SubListType);
       if (PyModule_AddObject(m, "SubList", (PyObject *) &SubListType) < 0) {
           Py_DECREF(&SubListType);
           Py_DECREF(m);
           return NULL;
       }
       return m;
   }

Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in.  When we are deriving from an
existing type, it is not necessary to fill out the :c:member:`~PyTypeObject.tp_alloc`
slot with :c:func:`PyType_GenericAlloc` -- the allocation function from the base
type will be inherited.

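The two steps just described (assign the base in the initialization function,
then let the ready step inherit the allocator) can be modelled in a few lines
of plain C.  This is a deliberately simplified sketch: ``ToyType``,
``toy_type_ready`` and the other names are hypothetical, and the real
:c:func:`PyType_Ready` does far more than copy one slot:

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdlib.h>

   /* Toy type object with a single inheritable slot; NOT PyTypeObject. */
   typedef struct ToyType ToyType;
   typedef void *(*toy_allocfunc)(const ToyType *type);

   struct ToyType {
       const char *name;
       size_t basicsize;
       ToyType *base;        /* plays the role of tp_base */
       toy_allocfunc alloc;  /* plays the role of tp_alloc */
       int ready;
   };

   /* Generic allocator: zero-filled memory of the type's basic size. */
   static void *
   toy_generic_alloc(const ToyType *type)
   {
       assert(type->basicsize > 0);
       return calloc(1, type->basicsize);
   }

   static ToyType ToyList_Type = {"list", 32, NULL, toy_generic_alloc, 0};

   /* The derived type leaves both base and alloc unset in its definition. */
   static ToyType ToySubList_Type = {"SubList", 40, NULL, NULL, 0};

   /* Simplified "ready" step: inherit an unset alloc slot from the base. */
   static int
   toy_type_ready(ToyType *type)
   {
       if (type->ready)
           return 0;
       if (type->base != NULL) {
           if (toy_type_ready(type->base) < 0)
               return -1;
           if (type->alloc == NULL)
               type->alloc = type->base->alloc;
       }
       if (type->alloc == NULL)
           return -1;        /* nothing to allocate instances with */
       type->ready = 1;
       return 0;
   }

   /* Mirrors the init function: set the base first, then ready the type. */
   static int
   toy_module_init(void)
   {
       ToySubList_Type.base = &ToyList_Type;
       return toy_type_ready(&ToySubList_Type);
   }

After ``toy_module_init()`` succeeds, ``ToySubList_Type.alloc`` points to
``toy_generic_alloc`` even though the derived type never named it.  Readying
the type before assigning ``base`` would fail in this model, which is why the
assignment has to come first.
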
After that, calling :c:func:`PyType_Ready` and adding the type object to the
module is the same as with the basic :class:`Custom` examples.


.. rubric:: Footnotes

.. [#] This is true when we know that the object is a basic type, like a string or a
   float.

.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler
   in this example, because our type doesn't support garbage collection.

.. [#] We now know that the first and last members are strings, so perhaps we
   could be less careful about decrementing their reference counts; however,
   we accept instances of string subclasses.  Even though deallocating normal
   strings won't call back into our objects, we can't guarantee that deallocating
   an instance of a string subclass won't call back into our objects.

.. [#] Also, even with our attributes restricted to string instances, the user
   could pass arbitrary :class:`str` subclasses and therefore still create
   reference cycles.
