.. highlightlang:: c


.. _defining-new-types:

******************
Defining New Types
******************
9
10.. sectionauthor:: Michael Hudson <mwh@python.net>
11.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
12.. sectionauthor:: Jim Fulton <jim@zope.com>
13
14
15As mentioned in the last chapter, Python allows the writer of an extension
16module to define new types that can be manipulated from Python code, much like
17strings and lists in core Python.
18
19This is not hard; the code for all extension types follows a pattern, but there
20are some details that you need to understand before you can get started.
21
22.. note::
23
   The way new types are defined changed dramatically (and for the better) in
   Python 2.2.  This document describes how to define new types for Python 2.2 and
   later.  If you need to support older versions of Python, you will need to refer
   to `older versions of this documentation
   <https://www.python.org/doc/versions/>`_.
29
30
31.. _dnt-basics:
32
The Basics
==========
35
36The Python runtime sees all Python objects as variables of type
37:c:type:`PyObject\*`.  A :c:type:`PyObject` is not a very magnificent object - it
38just contains the refcount and a pointer to the object's "type object".  This is
39where the action is; the type object determines which (C) functions get called
40when, for instance, an attribute gets looked up on an object or it is multiplied
41by another object.  These C functions are called "type methods".
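
Both header fields can be observed from Python itself.  The following is only
an illustrative sketch and not part of the ``noddy`` module: ``type()`` returns
the object reached through the type pointer, and ``sys.getrefcount()`` (a
CPython-specific function) reports the reference count.  The helper name
``describe`` is made up for this example.

.. code-block:: python

   import sys


   def describe(obj):
       """Return the (type object, reference count) pair for obj."""
       # type(obj) follows the pointer to the object's "type object".
       # sys.getrefcount() also counts the temporary reference held by its
       # own argument, so the number is usually higher than expected.
       return type(obj), sys.getrefcount(obj)


   value = []
   kind, count = describe(value)

Here ``kind`` is the ``list`` type object; the exact value of ``count`` depends
on the interpreter and should not be relied upon.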
42
43So, if you want to define a new object type, you need to create a new type
44object.
45
46This sort of thing can only be explained by example, so here's a minimal, but
47complete, module that defines a new type:
48
49.. literalinclude:: ../includes/noddy.c
50
51
52Now that's quite a bit to take in at once, but hopefully bits will seem familiar
53from the last chapter.
54
55The first bit that will be new is::
56
57   typedef struct {
58       PyObject_HEAD
59   } noddy_NoddyObject;
60
61This is what a Noddy object will contain---in this case, nothing more than every
62Python object contains, namely a refcount and a pointer to a type object.  These
63are the fields the ``PyObject_HEAD`` macro brings in.  The reason for the macro
64is to standardize the layout and to enable special debugging fields in debug
65builds.  Note that there is no semicolon after the ``PyObject_HEAD`` macro; one
66is included in the macro definition.  Be wary of adding one by accident; it's
67easy to do from habit, and your compiler might not complain, but someone else's
68probably will!  (On Windows, MSVC is known to call this an error and refuse to
69compile the code.)
70
71For contrast, let's take a look at the corresponding definition for standard
72Python integers::
73
74   typedef struct {
75       PyObject_HEAD
76       long ob_ival;
77   } PyIntObject;
78
79Moving on, we come to the crunch --- the type object. ::
80
81   static PyTypeObject noddy_NoddyType = {
82       PyObject_HEAD_INIT(NULL)
83       0,                         /*ob_size*/
84       "noddy.Noddy",             /*tp_name*/
85       sizeof(noddy_NoddyObject), /*tp_basicsize*/
86       0,                         /*tp_itemsize*/
87       0,                         /*tp_dealloc*/
88       0,                         /*tp_print*/
89       0,                         /*tp_getattr*/
90       0,                         /*tp_setattr*/
91       0,                         /*tp_compare*/
92       0,                         /*tp_repr*/
93       0,                         /*tp_as_number*/
94       0,                         /*tp_as_sequence*/
95       0,                         /*tp_as_mapping*/
96       0,                         /*tp_hash */
97       0,                         /*tp_call*/
98       0,                         /*tp_str*/
99       0,                         /*tp_getattro*/
100       0,                         /*tp_setattro*/
101       0,                         /*tp_as_buffer*/
102       Py_TPFLAGS_DEFAULT,        /*tp_flags*/
103       "Noddy objects",           /* tp_doc */
104   };
105
Now if you go and look up the definition of :c:type:`PyTypeObject` in
:file:`object.h` you'll see that it has many more fields than the definition
above.  The remaining fields will be filled with zeros by the C compiler, and
it's common practice to not specify them explicitly unless you need them.
110
111This is so important that we're going to pick the top of it apart still
112further::
113
114   PyObject_HEAD_INIT(NULL)
115
116This line is a bit of a wart; what we'd like to write is::
117
118   PyObject_HEAD_INIT(&PyType_Type)
119
120as the type of a type object is "type", but this isn't strictly conforming C and
121some compilers complain.  Fortunately, this member will be filled in for us by
122:c:func:`PyType_Ready`. ::
123
124   0,                          /* ob_size */
125
126The :attr:`ob_size` field of the header is not used; its presence in the type
127structure is a historical artifact that is maintained for binary compatibility
128with extension modules compiled for older versions of Python.  Always set this
129field to zero. ::
130
131   "noddy.Noddy",              /* tp_name */
132
133The name of our type.  This will appear in the default textual representation of
134our objects and in some error messages, for example::
135
136   >>> "" + noddy.new_noddy()
137   Traceback (most recent call last):
138     File "<stdin>", line 1, in ?
139   TypeError: cannot add type "noddy.Noddy" to string
140
141Note that the name is a dotted name that includes both the module name and the
142name of the type within the module. The module in this case is :mod:`noddy` and
143the type is :class:`Noddy`, so we set the type name to :class:`noddy.Noddy`.
144One side effect of using an undotted name is that the pydoc documentation tool
145will not list the new type in the module documentation. ::
146
147   sizeof(noddy_NoddyObject),  /* tp_basicsize */
148
149This is so that Python knows how much memory to allocate when you call
150:c:func:`PyObject_New`.
151
152.. note::
153
154   If you want your type to be subclassable from Python, and your type has the same
155   :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
156   inheritance.  A Python subclass of your type will have to list your type first
157   in its :attr:`~class.__bases__`, or else it will not be able to call your type's
158   :meth:`__new__` method without getting an error.  You can avoid this problem by
159   ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
160   base type does.  Most of the time, this will be true anyway, because either your
161   base type will be :class:`object`, or else you will be adding data members to
162   your base type, and therefore increasing its size.
163
164::
165
166   0,                          /* tp_itemsize */
167
This has to do with variable-length objects like tuples and strings. Ignore this
for now.
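
The fields discussed so far have Python-visible counterparts, which makes them
easy to inspect on existing types.  The sketch below assumes CPython, where a
statically defined C type derives ``__module__`` and ``__name__`` from its
dotted ``tp_name``; the helper name ``type_summary`` is hypothetical.

.. code-block:: python

   import itertools


   def type_summary(tp):
       """Collect the Python-visible counterparts of a few type object fields."""
       return {
           "module": tp.__module__,        # tp_name, before the last dot
           "name": tp.__name__,            # tp_name, after the last dot
           "basicsize": tp.__basicsize__,  # tp_basicsize, in bytes
           "itemsize": tp.__itemsize__,    # tp_itemsize, 0 for fixed-size objects
       }


   count_info = type_summary(itertools.count)
   tuple_info = type_summary(tuple)
   object_info = type_summary(object)

The actual sizes vary between platforms and Python versions, so only compare
them with each other rather than against fixed numbers.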
170
171Skipping a number of type methods that we don't provide, we set the class flags
172to :const:`Py_TPFLAGS_DEFAULT`. ::
173
174   Py_TPFLAGS_DEFAULT,        /*tp_flags*/
175
176All types should include this constant in their flags.  It enables all of the
177members defined by the current version of Python.
178
179We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::
180
181   "Noddy objects",           /* tp_doc */
182
183Now we get into the type methods, the things that make your objects different
184from the others.  We aren't going to implement any of these in this version of
185the module.  We'll expand this example later to have more interesting behavior.
186
For now, all we want to be able to do is to create new :class:`Noddy` objects.
To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new` implementation.
In this case, we can just use the default implementation provided by the API
function :c:func:`PyType_GenericNew`.  We'd like to just assign this to the
:c:member:`~PyTypeObject.tp_new` slot, but for portability's sake we can't: on some platforms or
compilers, a structure member cannot be statically initialized with a function
defined in another C module.  Instead, we assign the :c:member:`~PyTypeObject.tp_new` slot
in the module initialization function just before calling
:c:func:`PyType_Ready`::
196
197   noddy_NoddyType.tp_new = PyType_GenericNew;
198   if (PyType_Ready(&noddy_NoddyType) < 0)
199       return;
200
All the other type methods are *NULL*; we'll go over them in a later section.
203
204Everything else in the file should be familiar, except for some code in
205:c:func:`initnoddy`::
206
207   if (PyType_Ready(&noddy_NoddyType) < 0)
208       return;
209
This initializes the :class:`Noddy` type, filling in a number of members,
including :attr:`ob_type` that we initially set to *NULL*. ::
212
213   PyModule_AddObject(m, "Noddy", (PyObject *)&noddy_NoddyType);
214
215This adds the type to the module dictionary.  This allows us to create
216:class:`Noddy` instances by calling the :class:`Noddy` class::
217
218   >>> import noddy
219   >>> mynoddy = noddy.Noddy()
220
221That's it!  All that remains is to build it; put the above code in a file called
222:file:`noddy.c` and ::
223
224   from distutils.core import setup, Extension
225   setup(name="noddy", version="1.0",
226         ext_modules=[Extension("noddy", ["noddy.c"])])
227
228in a file called :file:`setup.py`; then typing
229
230.. code-block:: shell-session
231
232   $ python setup.py build
233
234at a shell should produce a file :file:`noddy.so` in a subdirectory; move to
235that directory and fire up Python --- you should be able to ``import noddy`` and
236play around with Noddy objects.
237
238That wasn't so hard, was it?
239
240Of course, the current Noddy type is pretty uninteresting. It has no data and
241doesn't do anything. It can't even be subclassed.
242
243
Adding data and methods to the Basic example
--------------------------------------------

Let's extend the basic example to add some data and methods.  Let's also make
the type usable as a base class. We'll create a new module, :mod:`noddy2`, that
adds these capabilities:
250
251.. literalinclude:: ../includes/noddy2.c
252
253
254This version of the module has a number of changes.
255
256We've added an extra include::
257
258   #include <structmember.h>
259
260This include provides declarations that we use to handle attributes, as
261described a bit later.
262
263The name of the :class:`Noddy` object structure has been shortened to
264:class:`Noddy`.  The type object name has been shortened to :class:`NoddyType`.
265
266The  :class:`Noddy` type now has three data attributes, *first*, *last*, and
267*number*.  The *first* and *last* variables are Python strings containing first
268and last names. The *number* attribute is an integer.
269
270The object structure is updated accordingly::
271
272   typedef struct {
273       PyObject_HEAD
274       PyObject *first;
275       PyObject *last;
276       int number;
277   } Noddy;
278
279Because we now have data to manage, we have to be more careful about object
280allocation and deallocation.  At a minimum, we need a deallocation method::
281
282   static void
283   Noddy_dealloc(Noddy* self)
284   {
285       Py_XDECREF(self->first);
286       Py_XDECREF(self->last);
287       self->ob_type->tp_free((PyObject*)self);
288   }
289
290which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::
291
292   (destructor)Noddy_dealloc, /*tp_dealloc*/
293
294This method decrements the reference counts of the two Python attributes. We use
295:c:func:`Py_XDECREF` here because the :attr:`first` and :attr:`last` members
296could be *NULL*.  It then calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
297to free the object's memory.  Note that the object's type might not be
298:class:`NoddyType`, because the object may be an instance of a subclass.
299
300We want to make sure that the first and last names are initialized to empty
301strings, so we provide a new method::
302
303   static PyObject *
304   Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
305   {
306       Noddy *self;
307
308       self = (Noddy *)type->tp_alloc(type, 0);
309       if (self != NULL) {
310           self->first = PyString_FromString("");
311           if (self->first == NULL)
312             {
313               Py_DECREF(self);
314               return NULL;
315             }
316
317           self->last = PyString_FromString("");
318           if (self->last == NULL)
319             {
320               Py_DECREF(self);
321               return NULL;
322             }
323
324           self->number = 0;
325       }
326
327       return (PyObject *)self;
328   }
329
330and install it in the :c:member:`~PyTypeObject.tp_new` member::
331
332   Noddy_new,                 /* tp_new */
333
334The new member is responsible for creating (as opposed to initializing) objects
335of the type.  It is exposed in Python as the :meth:`__new__` method.  See the
336paper titled "Unifying types and classes in Python" for a detailed discussion of
the :meth:`__new__` method.  One reason to implement a new method is to ensure
338the initial values of instance variables.  In this case, we use the new method
339to make sure that the initial values of the members :attr:`first` and
340:attr:`last` are not *NULL*. If we didn't care whether the initial values were
341*NULL*, we could have used :c:func:`PyType_GenericNew` as our new method, as we
342did before.  :c:func:`PyType_GenericNew` initializes all of the instance variable
343members to *NULL*.
344
345The new method is a static method that is passed the type being instantiated and
346any arguments passed when the type was called, and that returns the new object
347created. New methods always accept positional and keyword arguments, but they
348often ignore the arguments, leaving the argument handling to initializer
349methods. Note that if the type supports subclassing, the type passed may not be
the type being defined.  The new method calls the :c:member:`~PyTypeObject.tp_alloc` slot to allocate
memory. We don't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather,
352:c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
353which is :class:`object` by default.  Most types use the default allocation.
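
A rough Python analogue may help to picture what the new method does.  This is
a sketch of the same idea at the Python level, not a translation of
``Noddy_new``; the subclass name ``LoudNoddy`` is invented for the example.

.. code-block:: python

   class Noddy(object):
       def __new__(cls, *args, **kwds):
           # cls plays the role of the "type" argument of Noddy_new: it is
           # the class actually being instantiated, possibly a subclass.
           self = super(Noddy, cls).__new__(cls)
           self.first = ""
           self.last = ""
           self.number = 0
           return self


   class LoudNoddy(Noddy):
       def __init__(self, *args, **kwds):
           # Deliberately does not call any base-class initializer.
           pass


   plain = Noddy()
   loud = LoudNoddy("ignored")

Even though ``LoudNoddy`` ignores its arguments and never runs a base
initializer, ``loud`` is a ``LoudNoddy`` instance whose ``first``, ``last`` and
``number`` attributes already hold their defaults.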
354
355.. note::
356
357   If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one that calls a base type's
358   :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`), you must *not* try to determine what method
359   to call using method resolution order at runtime.  Always statically determine
360   what type you are going to call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
361   ``type->tp_base->tp_new``.  If you do not do this, Python subclasses of your
362   type that also inherit from other Python-defined classes may not work correctly.
363   (Specifically, you may not be able to create instances of such subclasses
364   without getting a :exc:`TypeError`.)
365
366We provide an initialization function::
367
368   static int
369   Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
370   {
371       PyObject *first=NULL, *last=NULL, *tmp;
372
373       static char *kwlist[] = {"first", "last", "number", NULL};
374
375       if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
376                                         &first, &last,
377                                         &self->number))
378           return -1;
379
380       if (first) {
381           tmp = self->first;
382           Py_INCREF(first);
383           self->first = first;
384           Py_XDECREF(tmp);
385       }
386
387       if (last) {
388           tmp = self->last;
389           Py_INCREF(last);
390           self->last = last;
391           Py_XDECREF(tmp);
392       }
393
394       return 0;
395   }
396
397by filling the :c:member:`~PyTypeObject.tp_init` slot. ::
398
399   (initproc)Noddy_init,         /* tp_init */
400
401The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the :meth:`__init__` method. It
402is used to initialize an object after it's created. Unlike the new method, we
403can't guarantee that the initializer is called.  The initializer isn't called
404when unpickling objects and it can be overridden.  Our initializer accepts
405arguments to provide initial values for our instance. Initializers always accept
406positional and keyword arguments.
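
The difference between the two slots can be seen from Python.  The following
sketch uses a hypothetical ``Person`` class and relies on the way CPython's
``copy`` module rebuilds ordinary instances (through ``__new__`` plus restored
state, without running ``__init__``).

.. code-block:: python

   import copy


   class Person(object):
       init_calls = 0                  # how often the initializer has run

       def __new__(cls, *args, **kwds):
           self = super(Person, cls).__new__(cls)
           self.first = ""             # default guaranteed by the new method
           return self

       def __init__(self, first=None):
           Person.init_calls += 1
           if first is not None:
               self.first = first


   bare = Person.__new__(Person)       # created but never initialized
   person = Person("Ada")              # new method, then initializer
   person.__init__("Grace")            # nothing prevents a second call
   clone = copy.copy(person)           # rebuilt without the initializer

After this runs, the initializer has been called twice for ``person`` and not
at all for ``bare`` or ``clone``, yet every instance has a usable ``first``
attribute.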
407
Initializers can be called multiple times.  Anyone can call the :meth:`__init__`
method on our objects.  For this reason, we have to be extra careful when
assigning the new values.  We might be tempted, for example, to assign the
:attr:`first` member like this::
412
413   if (first) {
414       Py_XDECREF(self->first);
415       Py_INCREF(first);
416       self->first = first;
417   }
418
419But this would be risky.  Our type doesn't restrict the type of the
420:attr:`first` member, so it could be any kind of object.  It could have a
421destructor that causes code to be executed that tries to access the
422:attr:`first` member.  To be paranoid and protect ourselves against this
423possibility, we almost always reassign members before decrementing their
424reference counts.  When don't we have to do this?
425
426* when we absolutely know that the reference count is greater than 1
427
428* when we know that deallocation of the object [#]_ will not cause any calls
429  back into our type's code
430
* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc` handler when
  garbage collection is not supported [#]_
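
The ordering problem can be reproduced with a toy model.  Everything in the
sketch below (``ToyObject``, ``incref``, ``decref``, ``Holder``) is a
hypothetical stand-in written in Python to mimic manual reference counting; it
is not the real C API.  The destructor records the reference count of whatever
``holder.first`` points to at the moment it runs.

.. code-block:: python

   class ToyObject(object):
       """Hypothetical stand-in for a reference-counted C object."""

       def __init__(self, on_free=None):
           self.refcnt = 1             # the creator owns one reference
           self.on_free = on_free      # plays the role of a destructor


   def incref(obj):
       obj.refcnt += 1


   def decref(obj):
       obj.refcnt -= 1
       if obj.refcnt == 0 and obj.on_free is not None:
           obj.on_free()               # arbitrary code may run here


   class Holder(object):
       """Plays the role of the Noddy struct with its ``first`` member."""
       first = None


   def set_first_unsafe(holder, value):
       decref(holder.first)            # destructor still sees the old member
       incref(value)
       holder.first = value


   def set_first_safe(holder, value):
       tmp = holder.first
       incref(value)
       holder.first = value            # member is valid before anything runs
       decref(tmp)


   def refcount_seen_by_destructor(setter):
       """Return the refcount of holder.first as seen from the destructor."""
       seen = []
       holder = Holder()
       holder.first = ToyObject(on_free=lambda: seen.append(holder.first.refcnt))
       setter(holder, ToyObject())
       return seen[0]

With ``set_first_unsafe`` the destructor finds a member whose count has already
dropped to zero, which in C would be a pointer to an object that is being
destroyed.  With ``set_first_safe`` it finds the new, fully owned value.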
433
434We want to expose our instance variables as attributes. There are a
435number of ways to do that. The simplest way is to define member definitions::
436
437   static PyMemberDef Noddy_members[] = {
438       {"first", T_OBJECT_EX, offsetof(Noddy, first), 0,
439        "first name"},
440       {"last", T_OBJECT_EX, offsetof(Noddy, last), 0,
441        "last name"},
442       {"number", T_INT, offsetof(Noddy, number), 0,
443        "noddy number"},
444       {NULL}  /* Sentinel */
445   };
446
447and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::
448
449   Noddy_members,             /* tp_members */
450
451Each member definition has a member name, type, offset, access flags and
452documentation string. See the :ref:`Generic-Attribute-Management` section below for
453details.
454
455A disadvantage of this approach is that it doesn't provide a way to restrict the
456types of objects that can be assigned to the Python attributes.  We expect the
457first and last names to be strings, but any Python objects can be assigned.
458Further, the attributes can be deleted, setting the C pointers to *NULL*.  Even
459though we can make sure the members are initialized to non-*NULL* values, the
460members can be set to *NULL* if the attributes are deleted.
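
The same two weaknesses can be observed in pure Python with ``__slots__``,
which CPython backs with member descriptors much like the ``T_OBJECT_EX``
entries above.  This is an illustrative sketch; ``SlotNoddy`` and
``read_first`` are names invented for the example.

.. code-block:: python

   class SlotNoddy(object):
       # Each slot is served by a member descriptor stored on the class.
       __slots__ = ("first", "last")

       def __init__(self):
           self.first = ""
           self.last = ""


   def read_first(obj):
       """Return obj.first, or the string "AttributeError" if it is unset."""
       try:
           return obj.first
       except AttributeError:
           return "AttributeError"


   noddy = SlotNoddy()
   noddy.first = 42                    # no type check: any object is accepted
   after_assignment = read_first(noddy)
   del noddy.first                     # the underlying pointer is cleared
   after_deletion = read_first(noddy)

Assigning a non-string succeeds, and after the deletion the attribute lookup
fails even though the initializer had set a value.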
461
We define a single method, :meth:`name`, that outputs the object's name as the
concatenation of the first and last names. ::
464
465   static PyObject *
466   Noddy_name(Noddy* self)
467   {
468       static PyObject *format = NULL;
469       PyObject *args, *result;
470
471       if (format == NULL) {
472           format = PyString_FromString("%s %s");
473           if (format == NULL)
474               return NULL;
475       }
476
477       if (self->first == NULL) {
478           PyErr_SetString(PyExc_AttributeError, "first");
479           return NULL;
480       }
481
482       if (self->last == NULL) {
483           PyErr_SetString(PyExc_AttributeError, "last");
484           return NULL;
485       }
486
487       args = Py_BuildValue("OO", self->first, self->last);
488       if (args == NULL)
489           return NULL;
490
491       result = PyString_Format(format, args);
492       Py_DECREF(args);
493
494       return result;
495   }
496
497The method is implemented as a C function that takes a :class:`Noddy` (or
498:class:`Noddy` subclass) instance as the first argument.  Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method::
503
504   def name(self):
505      return "%s %s" % (self.first, self.last)
506
507Note that we have to check for the possibility that our :attr:`first` and
508:attr:`last` members are *NULL*.  This is because they can be deleted, in which
509case they are set to *NULL*.  It would be better to prevent deletion of these
510attributes and to restrict the attribute values to be strings.  We'll see how to
511do that in the next section.
512
513Now that we've defined the method, we need to create an array of method
514definitions::
515
516   static PyMethodDef Noddy_methods[] = {
517       {"name", (PyCFunction)Noddy_name, METH_NOARGS,
518        "Return the name, combining the first and last name"
519       },
520       {NULL}  /* Sentinel */
521   };
522
523and assign them to the :c:member:`~PyTypeObject.tp_methods` slot::
524
525   Noddy_methods,             /* tp_methods */
526
527Note that we used the :const:`METH_NOARGS` flag to indicate that the method is
528passed no arguments.
529
530Finally, we'll make our type usable as a base class.  We've written our methods
531carefully so far so that they don't make any assumptions about the type of the
532object being created or used, so all we need to do is to add the
533:const:`Py_TPFLAGS_BASETYPE` to our class flag definition::
534
535   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
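
The effect of this flag is visible on existing built-in types.  The sketch
below assumes that ``Py_TPFLAGS_BASETYPE`` is bit 10 of the flags word, as
defined in CPython's :file:`object.h`; the helper names are hypothetical.

.. code-block:: python

   # Assumption: Py_TPFLAGS_BASETYPE is (1 << 10) in CPython's object.h.
   PY_TPFLAGS_BASETYPE = 1 << 10


   def allows_subclassing(tp):
       """Report whether tp has the Py_TPFLAGS_BASETYPE bit set."""
       return bool(tp.__flags__ & PY_TPFLAGS_BASETYPE)


   def try_subclass(tp):
       """Attempt to derive a class from tp; return True on success."""
       try:
           type("Derived", (tp,), {})
       except TypeError:
           return False
       return True

``int`` sets the flag and can be subclassed, whereas ``bool`` does not set it
and rejects any attempt to derive from it.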
536
537We rename :c:func:`initnoddy` to :c:func:`initnoddy2` and update the module name
538passed to :c:func:`Py_InitModule3`.
539
540Finally, we update our :file:`setup.py` file to build the new module::
541
542   from distutils.core import setup, Extension
543   setup(name="noddy", version="1.0",
544         ext_modules=[
545            Extension("noddy", ["noddy.c"]),
546            Extension("noddy2", ["noddy2.c"]),
547            ])
548
549
Providing finer control over data attributes
--------------------------------------------
552
553In this section, we'll provide finer control over how the :attr:`first` and
554:attr:`last` attributes are set in the :class:`Noddy` example. In the previous
555version of our module, the instance variables :attr:`first` and :attr:`last`
556could be set to non-string values or even deleted. We want to make sure that
557these attributes always contain strings.
558
559.. literalinclude:: ../includes/noddy3.c
560
561
To provide greater control over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions.  Here are the functions for
getting and setting the :attr:`first` attribute::
565
   static PyObject *
   Noddy_getfirst(Noddy *self, void *closure)
567   {
568       Py_INCREF(self->first);
569       return self->first;
570   }
571
572   static int
573   Noddy_setfirst(Noddy *self, PyObject *value, void *closure)
574   {
575     if (value == NULL) {
576       PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
577       return -1;
578     }
579
580     if (! PyString_Check(value)) {
581       PyErr_SetString(PyExc_TypeError,
582                       "The first attribute value must be a string");
583       return -1;
584     }
585
586     Py_DECREF(self->first);
587     Py_INCREF(value);
588     self->first = value;
589
590     return 0;
591   }
592
The getter function is passed a :class:`Noddy` object and a "closure", which is
a void pointer. In this case, the closure is ignored. (The closure supports an
595advanced usage in which definition data is passed to the getter and setter. This
596could, for example, be used to allow a single set of getter and setter functions
597that decide the attribute to get or set based on data in the closure.)
598
599The setter function is passed the :class:`Noddy` object, the new value, and the
600closure. The new value may be *NULL*, in which case the attribute is being
601deleted.  In our setter, we raise an error if the attribute is deleted or if the
602attribute value is not a string.
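
At the Python level, the closest counterpart is a ``property``.  The sketch
below is an approximation rather than a translation: the factory argument
``name`` stands in for the C closure so that one set of accessors serves both
attributes, and deletion is handled by a separate deleter instead of a *NULL*
value.  ``string_attribute`` and ``error_of`` are hypothetical helper names.

.. code-block:: python

   def string_attribute(name):
       """Build a property that only accepts strings and cannot be deleted."""
       storage = "_" + name            # where the value is actually kept

       def getter(self):
           return getattr(self, storage)

       def setter(self, value):
           if not isinstance(value, str):
               raise TypeError("The %s attribute value must be a string" % name)
           setattr(self, storage, value)

       def deleter(self):
           raise TypeError("Cannot delete the %s attribute" % name)

       return property(getter, setter, deleter, "%s name" % name)


   class Noddy(object):
       first = string_attribute("first")
       last = string_attribute("last")

       def __init__(self, first="", last=""):
           self.first = first
           self.last = last


   def error_of(action):
       """Run action() and return the type of the exception raised, if any."""
       try:
           action()
       except Exception as exc:
           return type(exc)
       return None


   noddy = Noddy("Ada", "Lovelace")

Assigning a non-string to ``noddy.first`` or deleting ``noddy.last`` raises
:exc:`TypeError`, and the stored values are left unchanged.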
603
604We create an array of :c:type:`PyGetSetDef` structures::
605
606   static PyGetSetDef Noddy_getseters[] = {
607       {"first",
608        (getter)Noddy_getfirst, (setter)Noddy_setfirst,
609        "first name",
610        NULL},
611       {"last",
612        (getter)Noddy_getlast, (setter)Noddy_setlast,
613        "last name",
614        NULL},
615       {NULL}  /* Sentinel */
616   };
617
and install it in the :c:member:`~PyTypeObject.tp_getset` slot::
619
620   Noddy_getseters,           /* tp_getset */
621
622to register our attribute getters and setters.
623
624The last item in a :c:type:`PyGetSetDef` structure is the closure mentioned
625above. In this case, we aren't using the closure, so we just pass *NULL*.
626
627We also remove the member definitions for these attributes::
628
629   static PyMemberDef Noddy_members[] = {
630       {"number", T_INT, offsetof(Noddy, number), 0,
631        "noddy number"},
632       {NULL}  /* Sentinel */
633   };
634
635We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only allow strings [#]_ to
636be passed::
637
638   static int
639   Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
640   {
641       PyObject *first=NULL, *last=NULL, *tmp;
642
643       static char *kwlist[] = {"first", "last", "number", NULL};
644
645       if (! PyArg_ParseTupleAndKeywords(args, kwds, "|SSi", kwlist,
646                                         &first, &last,
647                                         &self->number))
648           return -1;
649
650       if (first) {
651           tmp = self->first;
652           Py_INCREF(first);
653           self->first = first;
654           Py_DECREF(tmp);
655       }
656
657       if (last) {
658           tmp = self->last;
659           Py_INCREF(last);
660           self->last = last;
661           Py_DECREF(tmp);
662       }
663
664       return 0;
665   }
666
With these changes, we can ensure that the :attr:`first` and :attr:`last`
members are never *NULL*, so we can remove checks for *NULL* values in almost all
669cases. This means that most of the :c:func:`Py_XDECREF` calls can be converted to
670:c:func:`Py_DECREF` calls. The only place we can't change these calls is in the
671deallocator, where there is the possibility that the initialization of these
672members failed in the constructor.
673
674We also rename the module initialization function and module name in the
675initialization function, as we did before, and we add an extra definition to the
676:file:`setup.py` file.
677
678
Supporting cyclic garbage collection
------------------------------------
681
682Python has a cyclic-garbage collector that can identify unneeded objects even
683when their reference counts are not zero. This can happen when objects are
684involved in cycles.  For example, consider::
685
686   >>> l = []
687   >>> l.append(l)
688   >>> del l
689
690In this example, we create a list that contains itself. When we delete it, it
691still has a reference from itself. Its reference count doesn't drop to zero.
692Fortunately, Python's cyclic-garbage collector will eventually figure out that
693the list is garbage and free it.
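
This behaviour can be checked with the :mod:`gc` and :mod:`weakref` modules.
The sketch below assumes CPython; ``Node`` and the two helper functions are
names invented for the example.  Automatic collection is switched off
temporarily so that the cycle is only reclaimed by the explicit call.

.. code-block:: python

   import gc
   import weakref


   class Node(object):
       """Minimal container; instances of Python classes take part in GC."""


   def make_cycle():
       """Build node -> list -> node and return only a weak reference."""
       node = Node()
       node.first = [node]
       return weakref.ref(node)


   def cycle_lifetime():
       """Return (alive before collection, alive after collection)."""
       was_enabled = gc.isenabled()
       gc.disable()                    # keep the collector from running early
       try:
           ref = make_cycle()
           alive_before = ref() is not None
           gc.collect()
           alive_after = ref() is not None
       finally:
           if was_enabled:
               gc.enable()
       return alive_before, alive_after

The object survives the end of ``make_cycle`` because the cycle keeps its
reference count above zero, and it disappears once the collector has run.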
694
695In the second version of the :class:`Noddy` example, we allowed any kind of
696object to be stored in the :attr:`first` or :attr:`last` attributes. [#]_ This
697means that :class:`Noddy` objects can participate in cycles::
698
699   >>> import noddy2
700   >>> n = noddy2.Noddy()
701   >>> l = [n]
702   >>> n.first = l
703
704This is pretty silly, but it gives us an excuse to add support for the
705cyclic-garbage collector to the :class:`Noddy` example.  To support cyclic
706garbage collection, types need to fill two slots and set a class flag that
707enables these slots:
708
709.. literalinclude:: ../includes/noddy4.c
710
711
712The traversal method provides access to subobjects that could participate in
713cycles::
714
715   static int
716   Noddy_traverse(Noddy *self, visitproc visit, void *arg)
717   {
718       int vret;
719
720       if (self->first) {
721           vret = visit(self->first, arg);
722           if (vret != 0)
723               return vret;
724       }
725       if (self->last) {
726           vret = visit(self->last, arg);
727           if (vret != 0)
728               return vret;
729       }
730
731       return 0;
732   }
733
734For each subobject that can participate in cycles, we need to call the
735:c:func:`visit` function, which is passed to the traversal method. The
736:c:func:`visit` function takes as arguments the subobject and the extra argument
737*arg* passed to the traversal method.  It returns an integer value that must be
738returned if it is non-zero.
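
The result of a traversal can be inspected from Python with
``gc.get_referents()``, which reports the objects that a container's traversal
method visits.  This is a small sketch using built-in lists rather than the
:class:`Noddy` type; the helper name is hypothetical.

.. code-block:: python

   import gc


   def referents_by_identity(container):
       """Return the ids of the objects reported by the container's traversal."""
       return [id(obj) for obj in gc.get_referents(container)]


   first, last = object(), object()
   pair = [first, last]

   loop = []
   loop.append(loop)                   # a list that contains itself

Both ``first`` and ``last`` are reported for ``pair``, and ``loop`` reports
itself, which is exactly the information the collector needs to detect the
cycle.  The order in which referents are reported is an implementation detail.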
739
740Python 2.4 and higher provide a :c:func:`Py_VISIT` macro that automates calling
741visit functions.  With :c:func:`Py_VISIT`, :c:func:`Noddy_traverse` can be
742simplified::
743
744   static int
745   Noddy_traverse(Noddy *self, visitproc visit, void *arg)
746   {
747       Py_VISIT(self->first);
748       Py_VISIT(self->last);
749       return 0;
750   }
751
752.. note::
753
754   Note that the :c:member:`~PyTypeObject.tp_traverse` implementation must name its arguments exactly
755   *visit* and *arg* in order to use :c:func:`Py_VISIT`.  This is to encourage
756   uniformity across these boring implementations.
757
758We also need to provide a method for clearing any subobjects that can
759participate in cycles.  We implement the method and reimplement the deallocator
760to use it::
761
762   static int
763   Noddy_clear(Noddy *self)
764   {
765       PyObject *tmp;
766
767       tmp = self->first;
768       self->first = NULL;
769       Py_XDECREF(tmp);
770
771       tmp = self->last;
772       self->last = NULL;
773       Py_XDECREF(tmp);
774
775       return 0;
776   }
777
778   static void
779   Noddy_dealloc(Noddy* self)
780   {
781       Noddy_clear(self);
782       self->ob_type->tp_free((PyObject*)self);
783   }
784
785Notice the use of a temporary variable in :c:func:`Noddy_clear`. We use the
786temporary variable so that we can set each member to *NULL* before decrementing
787its reference count.  We do this because, as was discussed earlier, if the
788reference count drops to zero, we might cause code to run that calls back into
789the object.  In addition, because we now support garbage collection, we also
790have to worry about code being run that triggers garbage collection.  If garbage
791collection is run, our :c:member:`~PyTypeObject.tp_traverse` handler could get called. We can't
792take a chance of having :c:func:`Noddy_traverse` called when a member's reference
793count has dropped to zero and its value hasn't been set to *NULL*.
794
Python 2.4 and higher provide a :c:func:`Py_CLEAR` macro that automates the careful
796decrementing of reference counts.  With :c:func:`Py_CLEAR`, the
797:c:func:`Noddy_clear` function can be simplified::
798
799   static int
800   Noddy_clear(Noddy *self)
801   {
802       Py_CLEAR(self->first);
803       Py_CLEAR(self->last);
804       return 0;
805   }
806
807Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::
808
809   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /*tp_flags*/
810
811That's pretty much it.  If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
812:c:member:`~PyTypeObject.tp_free` slots, we'd need to modify them for cyclic-garbage collection.
813Most extensions will use the versions automatically provided.
814
815
Subclassing other types
-----------------------

It is possible to create new extension types that are derived from existing
types. It is easiest to inherit from the built-in types, since an extension can
easily use the :class:`PyTypeObject` it needs. It can be difficult to share
these :class:`PyTypeObject` structures between extension modules.

In this example we will create a :class:`Shoddy` type that inherits from the
built-in :class:`list` type. The new type will be completely compatible with
regular lists, but will have an additional :meth:`increment` method that
increases an internal counter. ::

   >>> import shoddy
   >>> s = shoddy.Shoddy(range(3))
   >>> s.extend(s)
   >>> print len(s)
   6
   >>> print s.increment()
   1
   >>> print s.increment()
   2

.. literalinclude:: ../includes/shoddy.c


As you can see, the source code closely resembles the :class:`Noddy` examples in
previous sections. We will break down the main differences between them. ::

   typedef struct {
       PyListObject list;
       int state;
   } Shoddy;

The primary difference for derived type objects is that the base type's object
structure must be the first value. The base type will already include the
:c:func:`PyObject_HEAD` at the beginning of its structure.

When a Python object is a :class:`Shoddy` instance, its *PyObject\** pointer can
be safely cast to both *PyListObject\** and *Shoddy\**. ::

   static int
   Shoddy_init(Shoddy *self, PyObject *args, PyObject *kwds)
   {
       if (PyList_Type.tp_init((PyObject *)self, args, kwds) < 0)
           return -1;
       self->state = 0;
       return 0;
   }

In the :attr:`__init__` method for our type, we can see how to call through to
the :attr:`__init__` method of the base type.

This pattern is important when writing a type with custom :attr:`new` and
:attr:`dealloc` methods. The :attr:`new` method should not actually create the
memory for the object with :c:member:`~PyTypeObject.tp_alloc`; that will be handled by the base
class when calling its :c:member:`~PyTypeObject.tp_new`.

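The layout rule and the call-through pattern described above can be modeled in
plain C.  This is a sketch under stated assumptions: ``ToyHead``, ``ToyList``
and ``ToyShoddy`` are hypothetical stand-ins for :c:func:`PyObject_HEAD`,
``PyListObject`` and ``Shoddy``, and the ``toy_`` functions are invented for
illustration.  It relies only on the C guarantee that a pointer to a structure
can be converted to a pointer to its first member.

```c
#include <stddef.h>

typedef struct { int refcnt; } ToyHead;                 /* stands in for PyObject_HEAD */
typedef struct { ToyHead head; int length; } ToyList;   /* stands in for PyListObject */
typedef struct { ToyList list; int state; } ToyShoddy;  /* mirrors the Shoddy layout */

/* A "base type" function that only knows about ToyList. */
static int toy_list_length(ToyList *l)
{
    return l->length;
}

/* A "base initializer", loosely analogous to PyList_Type.tp_init. */
static int toy_list_init(ToyList *l, int length)
{
    if (length < 0)
        return -1;
    l->head.refcnt = 1;
    l->length = length;
    return 0;
}

/* Derived initializer: delegate to the base first, then set our own field. */
static int toy_shoddy_init(ToyShoddy *s, int length)
{
    if (toy_list_init((ToyList *)s, length) < 0)
        return -1;
    s->state = 0;
    return 0;
}
```

Because ``list`` is the first member, ``offsetof(ToyShoddy, list)`` is zero and
a ``ToyShoddy *`` can be handed to any function expecting a ``ToyList *``.
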
When filling out the :c:type:`PyTypeObject` for the :class:`Shoddy` type, you see
a slot for :c:member:`~PyTypeObject.tp_base`. Due to cross-platform compiler issues, you can't
fill that field directly with the address of :c:data:`PyList_Type`; it can be done
later in the module's :c:func:`init` function. ::

   PyMODINIT_FUNC
   initshoddy(void)
   {
       PyObject *m;

       ShoddyType.tp_base = &PyList_Type;
       if (PyType_Ready(&ShoddyType) < 0)
           return;

       m = Py_InitModule3("shoddy", NULL, "Shoddy module");
       if (m == NULL)
           return;

       Py_INCREF(&ShoddyType);
       PyModule_AddObject(m, "Shoddy", (PyObject *) &ShoddyType);
   }

Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in. When we are deriving a new type, it is not
necessary to fill out the :c:member:`~PyTypeObject.tp_new` slot with :c:func:`PyType_GenericNew`;
the creation and allocation functions from the base type will be inherited.

After that, calling :c:func:`PyType_Ready` and adding the type object to the
module is the same as with the basic :class:`Noddy` examples.


.. _dnt-type-methods:

Type Methods
============

This section aims to give a quick fly-by on the various type methods you can
implement and what they do.

Here is the definition of :c:type:`PyTypeObject`, with some fields only used in
debug builds omitted:

.. literalinclude:: ../includes/typestruct.h


Now that's a *lot* of methods.  Don't worry too much though - if you have a type
you want to define, the chances are very good that you will only implement a
handful of these.

As you probably expect by now, we're going to go over this and give more
information about the various handlers.  We won't go in the order they are
defined in the structure, because there is a lot of historical baggage that
impacts the ordering of the fields; be sure your type initialization keeps the
fields in the right order!  It's often easiest to find an example that includes
all the fields you need (even if they're initialized to ``0``) and then change
the values to suit your new type. ::

   char *tp_name; /* For printing */

The name of the type - as mentioned in the last section, this will appear in
various places, almost entirely for diagnostic purposes. Try to choose something
that will be helpful in such a situation! ::

   int tp_basicsize, tp_itemsize; /* For allocation */

These fields tell the runtime how much memory to allocate when new objects of
this type are created.  Python has some built-in support for variable length
structures (think: strings, lists) which is where the :c:member:`~PyTypeObject.tp_itemsize` field
comes in.  This will be dealt with later. ::

   char *tp_doc;

Here you can put a string (or its address) that you want returned when the
Python script references ``obj.__doc__`` to retrieve the doc string.

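As a rough illustration of how the two allocation fields described above fit
together, the following plain-C sketch computes the size of a variable-length
object as a fixed part plus a per-item part.  ``ToyVarObject`` and the ``toy_``
names are hypothetical, and the arithmetic is a simplification: the real
allocator may also round the result up or reserve extra room.

```c
#include <stddef.h>

/* Toy variable-length object: a fixed header followed by n items. */
typedef struct {
    int  refcnt;
    long n_items;
    long items[1];     /* the actual item count is chosen at allocation time */
} ToyVarObject;

/* Analogous to tp_basicsize: the bytes needed before the first item. */
static const size_t toy_basicsize = offsetof(ToyVarObject, items);

/* Analogous to tp_itemsize: the bytes needed for each item. */
static const size_t toy_itemsize = sizeof(long);

/* Bytes to request for an object holding n items (rounding ignored). */
static size_t toy_alloc_size(size_t n)
{
    return toy_basicsize + n * toy_itemsize;
}
```

A type whose instances all have the same size corresponds to an item size of
zero, in which case only the fixed part matters.
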
Now we come to the basic type methods---the ones most extension types will
implement.


Finalization and De-allocation
------------------------------

.. index::
   single: object; deallocation
   single: deallocation, object
   single: object; finalization
   single: finalization, of objects

::

   destructor tp_dealloc;

This function is called when the reference count of the instance of your type is
reduced to zero and the Python interpreter wants to reclaim it.  If your type
has memory to free or other clean-up to perform, put it here.  The object itself
needs to be freed here as well.  Here is an example of this function::

   static void
   newdatatype_dealloc(newdatatypeobject * obj)
   {
       free(obj->obj_UnderlyingDatatypePtr);
       obj->ob_type->tp_free(obj);
   }

.. index::
   single: PyErr_Fetch()
   single: PyErr_Restore()

One important requirement of the deallocator function is that it leaves any
pending exceptions alone.  This is important since deallocators are frequently
called as the interpreter unwinds the Python stack; when the stack is unwound
due to an exception (rather than normal returns), nothing is done to protect the
deallocators from seeing that an exception has already been set.  Any actions
which a deallocator performs which may cause additional Python code to be
executed may detect that an exception has been set.  This can lead to misleading
errors from the interpreter.  The proper way to protect against this is to save
a pending exception before performing the unsafe action, and restore it when
done.  This can be done using the :c:func:`PyErr_Fetch` and
:c:func:`PyErr_Restore` functions::

   static void
   my_dealloc(PyObject *obj)
   {
       MyObject *self = (MyObject *) obj;
       PyObject *cbresult;

       if (self->my_callback != NULL) {
           PyObject *err_type, *err_value, *err_traceback;
           int have_error = PyErr_Occurred() ? 1 : 0;

           if (have_error)
               PyErr_Fetch(&err_type, &err_value, &err_traceback);

           cbresult = PyObject_CallObject(self->my_callback, NULL);
           if (cbresult == NULL)
               PyErr_WriteUnraisable(self->my_callback);
           else
               Py_DECREF(cbresult);

           if (have_error)
               PyErr_Restore(err_type, err_value, err_traceback);

           Py_DECREF(self->my_callback);
       }
       obj->ob_type->tp_free((PyObject*)self);
   }


Object Presentation
-------------------

.. index::
   builtin: repr
   builtin: str

In Python, there are three ways to generate a textual representation of an
object: the :func:`repr` function (or equivalent back-tick syntax), the
:func:`str` function, and the :keyword:`print` statement.  For most objects, the
:keyword:`print` statement is equivalent to the :func:`str` function, but it is
possible to special-case printing to a :c:type:`FILE\*` if necessary; this should
only be done if efficiency is identified as a problem and profiling suggests
that creating a temporary string object to be written to a file is too
expensive.

These handlers are all optional, and most types need to implement at most the
:c:member:`~PyTypeObject.tp_str` and :c:member:`~PyTypeObject.tp_repr` handlers. ::

   reprfunc tp_repr;
   reprfunc tp_str;
   printfunc tp_print;

The :c:member:`~PyTypeObject.tp_repr` handler should return a string object containing a
representation of the instance for which it is called.  Here is a simple
example::

   static PyObject *
   newdatatype_repr(newdatatypeobject * obj)
   {
       return PyString_FromFormat("Repr-ified_newdatatype{{size:%d}}",
                                  obj->obj_UnderlyingDatatypePtr->size);
   }

If no :c:member:`~PyTypeObject.tp_repr` handler is specified, the interpreter will supply a
representation that uses the type's :c:member:`~PyTypeObject.tp_name` and a uniquely-identifying
value for the object.

1061described above is to :func:`repr`; that is, it is called when Python code calls
1062:func:`str` on an instance of your object.  Its implementation is very similar
1063to the :c:member:`~PyTypeObject.tp_repr` function, but the resulting string is intended for human
1064consumption.  If :c:member:`~PyTypeObject.tp_str` is not specified, the :c:member:`~PyTypeObject.tp_repr` handler is
1065used instead.
1066
1067Here is a simple example::
1068
1069   static PyObject *
1070   newdatatype_str(newdatatypeobject * obj)
1071   {
1072       return PyString_FromFormat("Stringified_newdatatype{{size:\%d}}",
1073                                  obj->obj_UnderlyingDatatypePtr->size);
1074   }
1075
The print function will be called whenever Python needs to "print" an instance
of the type.  For example, if 'node' is an instance of type TreeNode, then the
print function is called when Python code calls::

   print node

The handler takes a *flags* argument.  The only flag defined is
:const:`Py_PRINT_RAW`, which suggests that you print without string quotes and
possibly without interpreting escape sequences.

The print function receives a C :c:type:`FILE\*` as an argument. You will likely
want to write to that file.

Here is a sample print function::

   static int
   newdatatype_print(newdatatypeobject *obj, FILE *fp, int flags)
   {
       if (flags & Py_PRINT_RAW) {
           fprintf(fp, "<{newdatatype object--size: %d}>",
                   obj->obj_UnderlyingDatatypePtr->size);
       }
       else {
           fprintf(fp, "\"<{newdatatype object--size: %d}>\"",
                   obj->obj_UnderlyingDatatypePtr->size);
       }
       return 0;
   }


Attribute Management
--------------------

For every object which can support attributes, the corresponding type must
provide the functions that control how the attributes are resolved.  There needs
to be a function which can retrieve attributes (if any are defined), and another
to set attributes (if setting attributes is allowed).  Removing an attribute is
a special case, for which the new value passed to the handler is *NULL*.

Python supports two pairs of attribute handlers; a type that supports attributes
only needs to implement the functions for one pair.  The difference is that one
pair takes the name of the attribute as a :c:type:`char\*`, while the other
accepts a :c:type:`PyObject\*`.  Each type can use whichever pair makes more
sense for the implementation's convenience. ::

   getattrfunc  tp_getattr;        /* char * version */
   setattrfunc  tp_setattr;
   /* ... */
   getattrofunc tp_getattro;       /* PyObject * version */
   setattrofunc tp_setattro;

If accessing attributes of an object is always a simple operation (this will be
explained shortly), there are generic implementations which can be used to
provide the :c:type:`PyObject\*` version of the attribute management functions.
The actual need for type-specific attribute handlers almost completely
disappeared starting with Python 2.2, though there are many examples which have
not been updated to use some of the new generic mechanism that is available.


.. _generic-attribute-management:

Generic Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. versionadded:: 2.2

Most extension types only use *simple* attributes.  So, what makes the
attributes simple?  There are only a couple of conditions that must be met:

#. The names of the attributes must be known when :c:func:`PyType_Ready` is
   called.

#. No special processing is needed to record that an attribute was looked up or
   set, nor do actions need to be taken based on the value.

Note that this list does not place any restrictions on the values of the
attributes, when the values are computed, or how relevant data is stored.

When :c:func:`PyType_Ready` is called, it uses three tables referenced by the
type object to create :term:`descriptor`\s which are placed in the dictionary of the
type object.  Each descriptor controls access to one attribute of the instance
object.  Each of the tables is optional; if all three are *NULL*, instances of
the type will only have attributes that are inherited from their base type, and
the type should leave the :c:member:`~PyTypeObject.tp_getattro` and
:c:member:`~PyTypeObject.tp_setattro` fields *NULL* as well, allowing the base type to
handle attributes.

The tables are declared as three fields of the type object::

   struct PyMethodDef *tp_methods;
   struct PyMemberDef *tp_members;
   struct PyGetSetDef *tp_getset;

If :c:member:`~PyTypeObject.tp_methods` is not *NULL*, it must refer to an array of
:c:type:`PyMethodDef` structures.  Each entry in the table is an instance of this
structure::

   typedef struct PyMethodDef {
       const char  *ml_name;       /* method name */
       PyCFunction  ml_meth;       /* implementation function */
       int          ml_flags;      /* flags */
       const char  *ml_doc;        /* docstring */
   } PyMethodDef;

One entry should be defined for each method provided by the type; no entries are
needed for methods inherited from a base type.  One additional entry is needed
at the end; it is a sentinel that marks the end of the array.  The
:attr:`ml_name` field of the sentinel must be *NULL*.

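The role of the sentinel can be shown with a plain-C model of a name lookup over
such a table.  This is a sketch only: ``DemoMethodDef``, ``demo_methods`` and
``demo_find_method`` are hypothetical names, and the entries hold simple
``int (*)(int)`` functions rather than real ``PyCFunction`` pointers.

```c
#include <stddef.h>
#include <string.h>

typedef int (*demo_func)(int);

typedef struct {
    const char *name;   /* NULL marks the sentinel entry */
    demo_func   meth;
} DemoMethodDef;

static int demo_twice(int x)  { return 2 * x; }
static int demo_negate(int x) { return -x; }

static const DemoMethodDef demo_methods[] = {
    {"twice",  demo_twice},
    {"negate", demo_negate},
    {NULL, NULL}        /* sentinel */
};

/* Walk the table until the sentinel; return the match or NULL. */
static demo_func demo_find_method(const DemoMethodDef *table, const char *name)
{
    const DemoMethodDef *def;

    for (def = table; def->name != NULL; def++) {
        if (strcmp(def->name, name) == 0)
            return def->meth;
    }
    return NULL;
}
```

Since the table carries no length, the loop has nothing but the *NULL* name to
stop on, which is why omitting the sentinel leads to reads past the end of the
array.
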
.. XXX Need to refer to some unified discussion of the structure fields, shared
   with the next section.

The second table is used to define attributes which map directly to data stored
in the instance.  A variety of primitive C types are supported, and access may
be read-only or read-write.  The structures in the table are defined as::

   typedef struct PyMemberDef {
       char *name;
       int   type;
       int   offset;
       int   flags;
       char *doc;
   } PyMemberDef;

For each entry in the table, a :term:`descriptor` will be constructed and added to the
type which will be able to extract a value from the instance structure.  The
:attr:`type` field should contain one of the type codes defined in the
:file:`structmember.h` header; the value will be used to determine how to
convert Python values to and from C values.  The :attr:`flags` field is used to
store flags which control how the attribute can be accessed.

.. XXX Need to move some of this to a shared section!

The following flag constants are defined in :file:`structmember.h`; they may be
combined using bitwise-OR.

+---------------------------+----------------------------------------------+
| Constant                  | Meaning                                      |
+===========================+==============================================+
| :const:`READONLY`         | Never writable.                              |
+---------------------------+----------------------------------------------+
| :const:`RO`               | Shorthand for :const:`READONLY`.             |
+---------------------------+----------------------------------------------+
| :const:`READ_RESTRICTED`  | Not readable in restricted mode.             |
+---------------------------+----------------------------------------------+
| :const:`WRITE_RESTRICTED` | Not writable in restricted mode.             |
+---------------------------+----------------------------------------------+
| :const:`RESTRICTED`       | Not readable or writable in restricted mode. |
+---------------------------+----------------------------------------------+

.. index::
   single: READONLY
   single: RO
   single: READ_RESTRICTED
   single: WRITE_RESTRICTED
   single: RESTRICTED

An interesting advantage of using the :c:member:`~PyTypeObject.tp_members` table to build
descriptors that are used at runtime is that any attribute defined this way can
have an associated doc string simply by providing the text in the table.  An
application can use the introspection API to retrieve the descriptor from the
class object, and get the doc string using its :attr:`__doc__` attribute.

As with the :c:member:`~PyTypeObject.tp_methods` table, a sentinel entry with a :attr:`name` value
of *NULL* is required.

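How a table of names, type codes, offsets and flags is enough to read and write
fields of an instance can be sketched in plain C.  Everything here is
hypothetical: ``ToyMemberDef``, the ``TOY_`` constants and the ``toy_``
functions only imitate the idea and are not the :file:`structmember.h`
definitions.

```c
#include <stddef.h>
#include <string.h>

enum { TOY_T_INT, TOY_T_DOUBLE };   /* hypothetical type codes */
enum { TOY_READONLY = 1 };          /* hypothetical access flag */

typedef struct {
    const char *name;               /* NULL marks the sentinel entry */
    int         type;
    size_t      offset;
    int         flags;
} ToyMemberDef;

typedef struct {
    int    number;
    double ratio;
} ToyInstance;

static const ToyMemberDef toy_members[] = {
    {"number", TOY_T_INT,    offsetof(ToyInstance, number), 0},
    {"ratio",  TOY_T_DOUBLE, offsetof(ToyInstance, ratio),  TOY_READONLY},
    {NULL, 0, 0, 0}                 /* sentinel */
};

static const ToyMemberDef *toy_find_member(const char *name)
{
    const ToyMemberDef *m;

    for (m = toy_members; m->name != NULL; m++) {
        if (strcmp(m->name, name) == 0)
            return m;
    }
    return NULL;
}

/* Read a member as a double; returns 0 on success, -1 if the name is unknown. */
static int toy_member_get(const ToyInstance *inst, const char *name, double *out)
{
    const ToyMemberDef *m = toy_find_member(name);
    const char *addr;

    if (m == NULL)
        return -1;
    addr = (const char *)inst + m->offset;
    if (m->type == TOY_T_INT)
        *out = (double)*(const int *)addr;
    else
        *out = *(const double *)addr;
    return 0;
}

/* Store an int; returns -1 for unknown, read-only, or non-int members. */
static int toy_member_set_int(ToyInstance *inst, const char *name, int value)
{
    const ToyMemberDef *m = toy_find_member(name);

    if (m == NULL || (m->flags & TOY_READONLY) || m->type != TOY_T_INT)
        return -1;
    *(int *)((char *)inst + m->offset) = value;
    return 0;
}
```

The type code selects the conversion, the offset locates the field, and the
flag decides whether a write is allowed, which is the division of labor the
member table describes.
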
.. XXX Descriptors need to be explained in more detail somewhere, but not here.

   Descriptor objects have two handler functions which correspond to the
   :c:member:`~PyTypeObject.tp_getattro` and :c:member:`~PyTypeObject.tp_setattro` handlers.  The
   :meth:`__get__` handler is a function which is passed the descriptor,
   instance, and type objects, and returns the value of the attribute, or it
   returns *NULL* and sets an exception.  The :meth:`__set__` handler is
   passed the descriptor, instance, type, and new value;


Type-specific Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For simplicity, only the :c:type:`char\*` version will be demonstrated here; the
type of the name parameter is the only difference between the :c:type:`char\*`
and :c:type:`PyObject\*` flavors of the interface. This example effectively does
the same thing as the generic example above, but does not use the generic
support added in Python 2.2.  The value in showing this is two-fold: it
demonstrates how basic attribute management can be done in a way that is
portable to older versions of Python, and explains how the handler functions are
called, so that if you do need to extend their functionality, you'll understand
what needs to be done.

The :c:member:`~PyTypeObject.tp_getattr` handler is called when the object requires an attribute
look-up.  It is called in the same situations where the :meth:`__getattr__`
method of a class would be called.

A likely way to handle this is (1) to implement a set of functions (such as
:c:func:`newdatatype_getSize` and :c:func:`newdatatype_setSize` in the example
below), (2) provide a method table listing these functions, and (3) provide a
getattr function that returns the result of a lookup in that table.  The method
table uses the same structure as the :c:member:`~PyTypeObject.tp_methods` field of the type
object.

Here is an example::

   static PyMethodDef newdatatype_methods[] = {
       {"getSize", (PyCFunction)newdatatype_getSize, METH_VARARGS,
        "Return the current size."},
       {"setSize", (PyCFunction)newdatatype_setSize, METH_VARARGS,
        "Set the size."},
       {NULL, NULL, 0, NULL}           /* sentinel */
   };

   static PyObject *
   newdatatype_getattr(newdatatypeobject *obj, char *name)
   {
       return Py_FindMethod(newdatatype_methods, (PyObject *)obj, name);
   }

The :c:member:`~PyTypeObject.tp_setattr` handler is called when the :meth:`__setattr__` or
:meth:`__delattr__` method of a class instance would be called.  When an
attribute should be deleted, the third parameter will be *NULL*.  Here is an
example that simply raises an exception; if this were really all you wanted, the
:c:member:`~PyTypeObject.tp_setattr` handler should be set to *NULL*. ::

   static int
   newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v)
   {
       (void)PyErr_Format(PyExc_RuntimeError, "Read-only attribute: %s", name);
       return -1;
   }


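A handler that does more than raise would typically branch on whether the new
value is *NULL*.  The plain-C sketch below models that control flow only;
``ToyRecord`` and ``toy_setattr`` are hypothetical, and a C string stands in
for the ``PyObject *`` value.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *label;     /* NULL means the attribute is not set */
} ToyRecord;

/* Returns 0 on success, -1 on failure (unknown name or nothing to delete). */
static int toy_setattr(ToyRecord *obj, const char *name, const char *value)
{
    if (strcmp(name, "label") != 0)
        return -1;                 /* no such attribute */
    if (value == NULL) {           /* a NULL value is a deletion request */
        if (obj->label == NULL)
            return -1;             /* nothing to delete */
        obj->label = NULL;
        return 0;
    }
    obj->label = value;            /* ordinary assignment */
    return 0;
}
```
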
Object Comparison
-----------------

::

   cmpfunc tp_compare;

The :c:member:`~PyTypeObject.tp_compare` handler is called when comparisons are needed and the
object does not implement the specific rich comparison method which matches the
requested comparison.  (It is always used if defined and the
:c:func:`PyObject_Compare` or :c:func:`PyObject_Cmp` functions are used, or if
:func:`cmp` is used from Python.) It is analogous to the :meth:`__cmp__` method.
This function should return ``-1`` if *obj1* is less than *obj2*, ``0`` if they
are equal, and ``1`` if *obj1* is greater than *obj2*. (It was previously
allowed to return arbitrary negative or positive integers for less than and
greater than, respectively; as of Python 2.2, this is no longer allowed.  In the
future, other return values may be assigned a different meaning.)

A :c:member:`~PyTypeObject.tp_compare` handler may raise an exception.  In this case it should
return a negative value.  The caller has to test for the exception using
:c:func:`PyErr_Occurred`.

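Because ``-1`` can mean either "less than" or "an exception was raised", the
caller has to consult a second piece of state.  The following plain-C model
shows that pattern; ``toy_error_set`` is a hypothetical stand-in for the
interpreter's error indicator, and ``ToyItem`` and the ``toy_`` functions are
invented for illustration.

```c
#include <stddef.h>

static int toy_error_set = 0;      /* stands in for the error indicator */

typedef struct { int valid; int size; } ToyItem;

/* Three-way comparison in the style of tp_compare: -1, 0, or 1.
   On failure it sets the error flag and returns -1. */
static int toy_compare(const ToyItem *a, const ToyItem *b)
{
    if (!a->valid || !b->valid) {
        toy_error_set = 1;
        return -1;
    }
    return (a->size > b->size) - (a->size < b->size);
}

/* Caller side: -1 alone is ambiguous, so check the error flag as well.
   Returns 1 if a < b, 0 if not, and -1 if the comparison failed. */
static int toy_less_than(const ToyItem *a, const ToyItem *b)
{
    int c;

    toy_error_set = 0;             /* model: no error pending beforehand */
    c = toy_compare(a, b);
    if (c == -1 && toy_error_set)
        return -1;
    return c < 0;
}
```
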
Here is a sample implementation::

   static int
   newdatatype_compare(newdatatypeobject * obj1, newdatatypeobject * obj2)
   {
       int result;

       if (obj1->obj_UnderlyingDatatypePtr->size <
           obj2->obj_UnderlyingDatatypePtr->size) {
           result = -1;
       }
       else if (obj1->obj_UnderlyingDatatypePtr->size >
                obj2->obj_UnderlyingDatatypePtr->size) {
           result = 1;
       }
       else {
           result = 0;
       }
       return result;
   }


Abstract Protocol Support
-------------------------

Python supports a variety of *abstract* 'protocols'; the specific interfaces
provided to use these protocols are documented in :ref:`abstract`.


A number of these abstract interfaces were defined early in the development of
the Python implementation.  In particular, the number, mapping, and sequence
protocols have been part of Python since the beginning.  Other protocols have
been added over time.  For protocols which depend on several handler routines
from the type implementation, the older protocols have been defined as optional
blocks of handlers referenced by the type object.  For newer protocols there are
additional slots in the main type object, with a flag bit being set to indicate
that the slots are present and should be checked by the interpreter.  (The flag
bit does not indicate that the slot values are non-*NULL*. The flag may be set
to indicate the presence of a slot, but a slot may still be unfilled.) ::

   PyNumberMethods   *tp_as_number;
   PySequenceMethods *tp_as_sequence;
   PyMappingMethods  *tp_as_mapping;

If you wish your object to be able to act like a number, a sequence, or a
mapping object, then you place the address of a structure that implements the C
type :c:type:`PyNumberMethods`, :c:type:`PySequenceMethods`, or
:c:type:`PyMappingMethods`, respectively, in the corresponding field. It is up to
you to fill in this structure with appropriate values. You can find examples of
the use of each of these in the :file:`Objects` directory of the Python source
distribution. ::

   hashfunc tp_hash;

This function, if you choose to provide it, should return a hash number for an
instance of your data type. Here is a moderately pointless example::

   static long
   newdatatype_hash(newdatatypeobject *obj)
   {
       long result;
       result = obj->obj_UnderlyingDatatypePtr->size;
       result = result * 3;
       return result;
   }

::

   ternaryfunc tp_call;

This function is called when an instance of your data type is "called", for
example, if ``obj1`` is an instance of your data type and the Python script
contains ``obj1('hello')``, the :c:member:`~PyTypeObject.tp_call` handler is invoked.

This function takes three arguments:

#. *arg1* is the instance of the data type which is the subject of the call. If
   the call is ``obj1('hello')``, then *arg1* is ``obj1``.

#. *arg2* is a tuple containing the arguments to the call.  You can use
   :c:func:`PyArg_ParseTuple` to extract the arguments.

#. *arg3* is a dictionary of keyword arguments that were passed. If this is
   non-*NULL* and you support keyword arguments, use
   :c:func:`PyArg_ParseTupleAndKeywords` to extract the arguments.  If you do not
   want to support keyword arguments and this is non-*NULL*, raise a
   :exc:`TypeError` with a message saying that keyword arguments are not supported.

Here is a desultory example of the implementation of the call function. ::

   /* Implement the call function.
    *    obj is the instance receiving the call.
    *    args is a tuple containing the arguments to the call, in this
    *         case 3 strings.
    *    other is the dictionary of keyword arguments; it is ignored here.
    */
   static PyObject *
   newdatatype_call(newdatatypeobject *obj, PyObject *args, PyObject *other)
   {
       PyObject *result;
       char *arg1;
       char *arg2;
       char *arg3;

       if (!PyArg_ParseTuple(args, "sss:call", &arg1, &arg2, &arg3)) {
           return NULL;
       }
       result = PyString_FromFormat(
           "Returning -- value: [%d] arg1: [%s] arg2: [%s] arg3: [%s]\n",
           obj->obj_UnderlyingDatatypePtr->size,
           arg1, arg2, arg3);
       if (result == NULL) {
           return NULL;
       }
       printf("%s", PyString_AS_STRING(result));
       return result;
   }

.. XXX some fields need to be added here...

::

   /* Added in release 2.2 */
   /* Iterators */
   getiterfunc tp_iter;
   iternextfunc tp_iternext;

These functions provide support for the iterator protocol.  Any object which
wishes to support iteration over its contents (which may be generated during
iteration) must implement the ``tp_iter`` handler.  Objects which are returned
by a ``tp_iter`` handler must implement both the ``tp_iter`` and ``tp_iternext``
handlers. Both handlers take exactly one parameter, the instance for which they
are being called, and return a new reference.  In the case of an error, they
should set an exception and return *NULL*.

For an object which represents an iterable collection, the ``tp_iter`` handler
must return an iterator object.  The iterator object is responsible for
maintaining the state of the iteration.  For collections which can support
multiple iterators which do not interfere with each other (as lists and tuples
do), a new iterator should be created and returned.  Objects which can only be
iterated over once (usually due to side effects of iteration) should implement
this handler by returning a new reference to themselves, and should also
implement the ``tp_iternext`` handler.  File objects are an example of such an
iterator.

Iterator objects should implement both handlers.  The ``tp_iter`` handler should
return a new reference to the iterator (this is the same as the ``tp_iter``
handler for objects which can only be iterated over destructively).  The
``tp_iternext`` handler should return a new reference to the next object in the
iteration if there is one.  If the iteration has reached the end, it may return
*NULL* without setting an exception or it may set :exc:`StopIteration`; avoiding
the exception can yield slightly better performance.  If an actual error occurs,
it should set an exception and return *NULL*.


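The convention that *NULL* without an exception means "exhausted" while *NULL*
with an exception means "failed" can be modeled in plain C.  This is a sketch
under stated assumptions: ``ToyIter``, ``toy_iternext``, ``toy_sum`` and the
``toy_error_set`` flag are hypothetical and only imitate the shape of the
protocol, with no reference counting involved.

```c
#include <stddef.h>

static int toy_error_set = 0;    /* stands in for the error indicator */

typedef struct {
    const int *data;
    size_t     len;
    size_t     pos;              /* iteration state lives in the iterator */
} ToyIter;

/* Like tp_iternext: returns the next item, or NULL when there is none.
   Exhaustion leaves the error flag clear; a real failure sets it. */
static const int *toy_iternext(ToyIter *it)
{
    if (it->data == NULL) {      /* broken iterator: a genuine error */
        toy_error_set = 1;
        return NULL;
    }
    if (it->pos >= it->len)
        return NULL;             /* normal end of iteration */
    return &it->data[it->pos++];
}

/* Consumer loop: sum all items; returns 0 on success, -1 on error. */
static int toy_sum(ToyIter *it, long *total)
{
    const int *item;

    toy_error_set = 0;
    *total = 0;
    while ((item = toy_iternext(it)) != NULL)
        *total += *item;
    return toy_error_set ? -1 : 0;
}
```

Two ``ToyIter`` values over the same array advance independently because each
carries its own ``pos``, which mirrors the advice to return a fresh iterator for
collections that support multiple concurrent iterations.
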
.. _weakref-support:

Weak Reference Support
----------------------

One of the goals of Python's weak-reference implementation is to allow any type
to participate in the weak reference mechanism without incurring the overhead on
those objects which do not benefit by weak referencing (such as numbers).

For an object to be weakly referencable, the extension must include a
:c:type:`PyObject\*` field in the instance structure for the use of the weak
reference mechanism; it must be initialized to *NULL* by the object's
constructor.  It must also set the :c:member:`~PyTypeObject.tp_weaklistoffset` field of the
corresponding type object to the offset of the field. For example, the instance
type is defined with the following structure::

   typedef struct {
       PyObject_HEAD
       PyClassObject *in_class;       /* The class object */
       PyObject      *in_dict;        /* A dictionary */
       PyObject      *in_weakreflist; /* List of weak references */
   } PyInstanceObject;

The statically-declared type object for instances is defined this way::

   PyTypeObject PyInstance_Type = {
       PyObject_HEAD_INIT(&PyType_Type)
       0,
       "module.instance",

       /* Lots of stuff omitted for brevity... */

       Py_TPFLAGS_DEFAULT,                         /* tp_flags */
       0,                                          /* tp_doc */
       0,                                          /* tp_traverse */
       0,                                          /* tp_clear */
       0,                                          /* tp_richcompare */
       offsetof(PyInstanceObject, in_weakreflist), /* tp_weaklistoffset */
   };

The type constructor is responsible for initializing the weak reference list to
*NULL*::

   static PyObject *
   instance_new() {
       /* Other initialization stuff omitted for brevity */

       self->in_weakreflist = NULL;

       return (PyObject *) self;
   }

The only further addition is that the destructor needs to call the weak
reference manager to clear any weak references.  This is only required if the
weak reference list is non-*NULL*::

   static void
   instance_dealloc(PyInstanceObject *inst)
   {
       /* Allocate temporaries if needed, but do not begin
          destruction just yet.
        */

       if (inst->in_weakreflist != NULL)
           PyObject_ClearWeakRefs((PyObject *) inst);

       /* Proceed with object destruction normally. */
   }


More Suggestions
----------------

Remember that you can omit most of these functions, in which case you provide
``0`` as a value.  There are type definitions for each of the functions you must
provide.  They are in :file:`object.h` in the Python include directory that
comes with the source distribution of Python.

To learn how to implement a specific method for your new data type, download
and unpack the Python source distribution.  Go to the :file:`Objects`
directory, then search the C source files for ``tp_`` plus the function you
want (for example, ``tp_print`` or ``tp_compare``).  You will find examples of
the function you want to implement.

When you need to verify that an object is an instance of the type you are
implementing, use the :c:func:`PyObject_TypeCheck` function.  A sample of its
use might be something like the following::

   if (!PyObject_TypeCheck(some_object, &MyType)) {
       PyErr_SetString(PyExc_TypeError, "arg #1 not a mything");
       return NULL;
   }

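Note that :c:func:`PyObject_TypeCheck` accepts instances of subtypes as well as
exact instances of the given type.  The following plain C sketch is a rough
model of that behaviour, assuming a simple single-inheritance chain; it is not
CPython's implementation.  The names ``demo_type``, ``demo_object`` and
``demo_type_check`` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not CPython's definitions. */
typedef struct demo_type {
    const char       *name;
    struct demo_type *base;   /* plays the role of tp_base; NULL at the root */
} demo_type;

typedef struct {
    demo_type *type;          /* plays the role of ob_type */
} demo_object;

/* Return 1 if obj's type is target or derives from it, otherwise 0. */
static int
demo_type_check(const demo_object *obj, const demo_type *target)
{
    const demo_type *t;

    assert(obj != NULL && target != NULL);

    /* Walk from the object's own type up through its base types. */
    for (t = obj->type; t != NULL; t = t->base) {
        if (t == target)
            return 1;
    }
    return 0;
}
```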
.. rubric:: Footnotes

.. [#] This is true when we know that the object is a basic type, like a string or a
   float.

.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler in this example, because our
   type doesn't support garbage collection. Even if a type supports garbage
   collection, there are calls that can be made to "untrack" the object from
   garbage collection; however, these calls are advanced and not covered here.

.. [#] We now know that the first and last members are strings, so perhaps we could be
   less careful about decrementing their reference counts; however, we accept
   instances of string subclasses. Even though deallocating normal strings won't
   call back into our objects, we can't guarantee that deallocating an instance of
   a string subclass won't call back into our objects.

.. [#] Even in the third version, we aren't guaranteed to avoid cycles.  Instances of
   string subclasses are allowed, and string subclasses could allow cycles even if
   normal strings don't.
