.. highlight:: c

.. _new-types-topics:

*****************************************
Defining Extension Types: Assorted Topics
*****************************************

.. _dnt-type-methods:

This section aims to give a quick fly-by on the various type methods you can
implement and what they do.

Here is the definition of :c:type:`PyTypeObject`, with some fields only used in
:ref:`debug builds <debug-build>` omitted:

.. literalinclude:: ../includes/typestruct.h


Now that's a *lot* of methods. Don't worry too much though -- if you have
a type you want to define, the chances are very good that you will only
implement a handful of these.

As you probably expect by now, we're going to go over this and give more
information about the various handlers. We won't go in the order they are
defined in the structure, because there is a lot of historical baggage that
impacts the ordering of the fields. It's often easiest to find an example
that includes the fields you need and then change the values to suit your new
type. ::

   const char *tp_name; /* For printing */

The name of the type -- as mentioned in the previous chapter, this will appear in
various places, almost entirely for diagnostic purposes. Try to choose something
that will be helpful in such a situation! ::

   Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */

These fields tell the runtime how much memory to allocate when new objects of
this type are created. Python has some built-in support for variable length
structures (think: strings, tuples) which is where the :c:member:`~PyTypeObject.tp_itemsize` field
comes in. This will be dealt with later. ::

   const char *tp_doc;

Here you can put a string (or its address) that you want returned when the
Python script references ``obj.__doc__`` to retrieve the doc string.

Now we come to the basic type methods -- the ones most extension types will
implement.

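To give a rough feel for how the two allocation fields described above
interact, here is a standalone sketch in plain C. It is not CPython code:
``demo_var_size`` is a hypothetical helper, and rounding the total up to
pointer alignment is an assumption made for illustration only.

```c
#include <stddef.h>

/* Hypothetical helper, not a CPython function: estimates the number of
 * bytes needed for an object made of a fixed part (basicsize) followed by
 * nitems variable-length items of itemsize bytes each. */
static size_t
demo_var_size(size_t basicsize, size_t itemsize, size_t nitems)
{
    size_t align = sizeof(void *);               /* assumed alignment */
    size_t raw = basicsize + nitems * itemsize;  /* fixed part + items */

    /* Round up to the next multiple of the alignment. */
    return (raw + align - 1) / align * align;
}
```

Under these assumptions, a fixed-size type (item size of zero) needs the same
amount of memory however many items are requested, while a variable-size type
grows by one item size per element.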


Finalization and De-allocation
------------------------------

.. index::
   single: object; deallocation
   single: deallocation, object
   single: object; finalization
   single: finalization, of objects

::

   destructor tp_dealloc;

This function is called when the reference count of the instance of your type is
reduced to zero and the Python interpreter wants to reclaim it. If your type
has memory to free or other clean-up to perform, you can put it here. The
object itself needs to be freed here as well. Here is an example of this
function::

   static void
   newdatatype_dealloc(newdatatypeobject *obj)
   {
       free(obj->obj_UnderlyingDatatypePtr);
       Py_TYPE(obj)->tp_free((PyObject *)obj);
   }

If your type supports garbage collection, the destructor should call
:c:func:`PyObject_GC_UnTrack` before clearing any member fields::

   static void
   newdatatype_dealloc(newdatatypeobject *obj)
   {
       PyObject_GC_UnTrack(obj);
       Py_CLEAR(obj->other_obj);
       ...
       Py_TYPE(obj)->tp_free((PyObject *)obj);
   }

.. index::
   single: PyErr_Fetch (C function)
   single: PyErr_Restore (C function)

One important requirement of the deallocator function is that it leaves any
pending exceptions alone. This is important since deallocators are frequently
called as the interpreter unwinds the Python stack; when the stack is unwound
due to an exception (rather than normal returns), nothing is done to protect the
deallocators from seeing that an exception has already been set. Any action
that a deallocator performs and that may cause additional Python code to be
executed may detect that an exception has been set. This can lead to misleading
errors from the interpreter. The proper way to protect against this is to save
a pending exception before performing the unsafe action, and restore it when
done.
This can be done using the :c:func:`PyErr_Fetch` and
:c:func:`PyErr_Restore` functions (on Python 3.12 and later, both are
deprecated in favor of :c:func:`PyErr_GetRaisedException` and
:c:func:`PyErr_SetRaisedException`)::

   static void
   my_dealloc(PyObject *obj)
   {
       MyObject *self = (MyObject *) obj;
       PyObject *cbresult;

       if (self->my_callback != NULL) {
           PyObject *err_type, *err_value, *err_traceback;

           /* This saves the current exception state */
           PyErr_Fetch(&err_type, &err_value, &err_traceback);

           cbresult = PyObject_CallNoArgs(self->my_callback);
           if (cbresult == NULL)
               PyErr_WriteUnraisable(self->my_callback);
           else
               Py_DECREF(cbresult);

           /* This restores the saved exception state */
           PyErr_Restore(err_type, err_value, err_traceback);

           Py_DECREF(self->my_callback);
       }
       Py_TYPE(obj)->tp_free((PyObject*)self);
   }

.. note::
   There are limitations to what you can safely do in a deallocator function.
   First, if your type supports garbage collection (using :c:member:`~PyTypeObject.tp_traverse`
   and/or :c:member:`~PyTypeObject.tp_clear`), some of the object's members may already have been
   cleared or finalized by the time :c:member:`~PyTypeObject.tp_dealloc` is called. Second, in
   :c:member:`~PyTypeObject.tp_dealloc`, your object is in an unstable state: its reference
   count is equal to zero. Any call to a non-trivial object or API (as in the
   example above) might end up calling :c:member:`~PyTypeObject.tp_dealloc` again, causing a
   double free and a crash.

   Starting with Python 3.4, it is recommended not to put any complex
   finalization code in :c:member:`~PyTypeObject.tp_dealloc`, and instead use the new
   :c:member:`~PyTypeObject.tp_finalize` type method.

   .. seealso::
      :pep:`442` explains the new finalization scheme.

.. index::
   single: string; object representation
   pair: built-in function; repr

Object Presentation
-------------------

In Python, there are two ways to generate a textual representation of an object:
the :func:`repr` function, and the :func:`str` function. (The :func:`print`
function just calls :func:`str`.) These handlers are both optional.

::

   reprfunc tp_repr;
   reprfunc tp_str;

The :c:member:`~PyTypeObject.tp_repr` handler should return a string object containing a
representation of the instance for which it is called. Here is a simple
example::

   static PyObject *
   newdatatype_repr(newdatatypeobject *obj)
   {
       return PyUnicode_FromFormat("Repr-ified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }

If no :c:member:`~PyTypeObject.tp_repr` handler is specified, the interpreter will supply a
representation that uses the type's :c:member:`~PyTypeObject.tp_name` and a uniquely identifying
value for the object.

The :c:member:`~PyTypeObject.tp_str` handler is to :func:`str` what the :c:member:`~PyTypeObject.tp_repr` handler
described above is to :func:`repr`; that is, it is called when Python code calls
:func:`str` on an instance of your object. Its implementation is very similar
to the :c:member:`~PyTypeObject.tp_repr` function, but the resulting string is intended for human
consumption. If :c:member:`~PyTypeObject.tp_str` is not specified, the :c:member:`~PyTypeObject.tp_repr` handler is
used instead.

Here is a simple example::

   static PyObject *
   newdatatype_str(newdatatypeobject *obj)
   {
       return PyUnicode_FromFormat("Stringified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }



Attribute Management
--------------------

For every object which can support attributes, the corresponding type must
provide the functions that control how the attributes are resolved.
There needs
to be a function which can retrieve attributes (if any are defined), and another
to set attributes (if setting attributes is allowed). Removing an attribute is
a special case, for which the new value passed to the handler is ``NULL``.

Python supports two pairs of attribute handlers; a type that supports attributes
only needs to implement the functions for one pair. The difference is that one
pair takes the name of the attribute as a :c:expr:`char\*`, while the other
accepts a :c:expr:`PyObject*`. Each type can use whichever pair makes more
sense for the implementation's convenience. ::

   getattrfunc  tp_getattr;        /* char * version */
   setattrfunc  tp_setattr;
   /* ... */
   getattrofunc tp_getattro;       /* PyObject * version */
   setattrofunc tp_setattro;

If accessing attributes of an object is always a simple operation (this will be
explained shortly), there are generic implementations which can be used to
provide the :c:expr:`PyObject*` version of the attribute management functions.
The actual need for type-specific attribute handlers almost completely
disappeared starting with Python 2.2, though there are many examples which have
not been updated to use the generic mechanism that is now available.


.. _generic-attribute-management:

Generic Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Most extension types only use *simple* attributes. So, what makes the
attributes simple? There are only a couple of conditions that must be met:

#. The names of the attributes must be known when :c:func:`PyType_Ready` is
   called.

#. No special processing is needed to record that an attribute was looked up or
   set, nor do actions need to be taken based on the value.

Note that this list does not place any restrictions on the values of the
attributes, when the values are computed, or how relevant data is stored.

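The two conditions above are what make a table-driven implementation possible.
As a rough illustration, the following standalone sketch (plain C, not Python
C API code) resolves attributes from a static table of names and byte offsets.
Every name in it (``demo_object``, ``demo_member``, ``demo_members``,
``demo_get_int``) is hypothetical and exists only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical instance layout; stands in for an extension object struct. */
typedef struct {
    int width;
    int height;
} demo_object;

/* Hypothetical table entry: an attribute name and where its value lives. */
typedef struct {
    const char *name;   /* NULL marks the end of the table */
    size_t offset;      /* byte offset of the field inside demo_object */
} demo_member;

/* All attribute names are fixed up front (the first condition). */
static const demo_member demo_members[] = {
    {"width",  offsetof(demo_object, width)},
    {"height", offsetof(demo_object, height)},
    {NULL, 0}  /* sentinel */
};

/* Generic lookup with no per-attribute logic (the second condition).
 * Stores the value in *out and returns 0, or returns -1 if the name
 * is not in the table. */
static int
demo_get_int(const demo_object *obj, const char *name, int *out)
{
    for (const demo_member *m = demo_members; m->name != NULL; m++) {
        if (strcmp(m->name, name) == 0) {
            memcpy(out, (const char *)obj + m->offset, sizeof(int));
            return 0;
        }
    }
    return -1;
}
```

Because the lookup only consults the table, adding an attribute means adding a
table row rather than writing new handler code.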

When :c:func:`PyType_Ready` is called, it uses three tables referenced by the
type object to create :term:`descriptor`\s which are placed in the dictionary of the
type object. Each descriptor controls access to one attribute of the instance
object. Each of the tables is optional; if all three are ``NULL``, instances of
the type will only have attributes that are inherited from their base type, and
the type should leave the :c:member:`~PyTypeObject.tp_getattro` and :c:member:`~PyTypeObject.tp_setattro` fields ``NULL`` as
well, allowing the base type to handle attributes.

The tables are declared as three fields of the type object::

   struct PyMethodDef *tp_methods;
   struct PyMemberDef *tp_members;
   struct PyGetSetDef *tp_getset;

If :c:member:`~PyTypeObject.tp_methods` is not ``NULL``, it must refer to an array of
:c:type:`PyMethodDef` structures. Each entry in the table is an instance of this
structure::

   typedef struct PyMethodDef {
       const char  *ml_name;       /* method name */
       PyCFunction  ml_meth;       /* implementation function */
       int          ml_flags;      /* flags */
       const char  *ml_doc;        /* docstring */
   } PyMethodDef;

One entry should be defined for each method provided by the type; no entries are
needed for methods inherited from a base type. One additional entry is needed
at the end; it is a sentinel that marks the end of the array. The
:c:member:`~PyMethodDef.ml_name` field of the sentinel must be ``NULL``.

The second table is used to define attributes which map directly to data stored
in the instance. A variety of primitive C types are supported, and access may
be read-only or read-write.
The structures in the table are defined as::

   typedef struct PyMemberDef {
       const char *name;
       int         type;
       Py_ssize_t  offset;
       int         flags;
       const char *doc;
   } PyMemberDef;

For each entry in the table, a :term:`descriptor` will be constructed and added to the
type which will be able to extract a value from the instance structure. The
:c:member:`~PyMemberDef.type` field should contain a type code like :c:macro:`Py_T_INT` or
:c:macro:`Py_T_DOUBLE`; the value will be used to determine how to
convert Python values to and from C values. The :c:member:`~PyMemberDef.flags` field is used to
store flags which control how the attribute can be accessed: you can set it to
:c:macro:`Py_READONLY` to prevent Python code from setting it.

An interesting advantage of using the :c:member:`~PyTypeObject.tp_members` table to build
descriptors that are used at runtime is that any attribute defined this way can
have an associated doc string simply by providing the text in the table. An
application can use the introspection API to retrieve the descriptor from the
class object, and get the doc string using its :attr:`~type.__doc__` attribute.

As with the :c:member:`~PyTypeObject.tp_methods` table, a sentinel entry is required; here it is
the :c:member:`~PyMemberDef.name` field that must be ``NULL``.

.. XXX Descriptors need to be explained in more detail somewhere, but not here.

   Descriptor objects have two handler functions which correspond to the
   \member{tp_getattro} and \member{tp_setattro} handlers. The
   \method{__get__()} handler is a function which is passed the descriptor,
   instance, and type objects, and returns the value of the attribute, or it
   returns \NULL{} and sets an exception. The \method{__set__()} handler is
   passed the descriptor, instance, type, and new value;


Type-specific Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For simplicity, only the :c:expr:`char\*` version will be demonstrated here; the
type of the name parameter is the only difference between the :c:expr:`char\*`
and :c:expr:`PyObject*` flavors of the interface. This example effectively does
the same thing as the generic example above, but does not use the generic
support added in Python 2.2. It explains how the handler functions are
called, so that if you do need to extend their functionality, you'll understand
what needs to be done.

The :c:member:`~PyTypeObject.tp_getattr` handler is called when the object requires an attribute
look-up. It is called in the same situations where the :meth:`~object.__getattr__`
method of a class would be called.

Here is an example::

   static PyObject *
   newdatatype_getattr(newdatatypeobject *obj, char *name)
   {
       if (strcmp(name, "data") == 0)
       {
           return PyLong_FromLong(obj->data);
       }

       PyErr_Format(PyExc_AttributeError,
                    "'%.100s' object has no attribute '%.400s'",
                    Py_TYPE(obj)->tp_name, name);
       return NULL;
   }

The :c:member:`~PyTypeObject.tp_setattr` handler is called when the :meth:`~object.__setattr__` or
:meth:`~object.__delattr__` method of a class instance would be called. When an
attribute should be deleted, the third parameter will be ``NULL``. Here is an
example that simply raises an exception; if this were really all you wanted, the
:c:member:`~PyTypeObject.tp_setattr` handler should be set to ``NULL``.
::

   static int
   newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v)
   {
       PyErr_Format(PyExc_RuntimeError, "Read-only attribute: %s", name);
       return -1;
   }

Object Comparison
-----------------

::

   richcmpfunc tp_richcompare;

The :c:member:`~PyTypeObject.tp_richcompare` handler is called when comparisons are needed. It is
analogous to the :ref:`rich comparison methods <richcmpfuncs>`, like
:meth:`!__lt__`, and also called by :c:func:`PyObject_RichCompare` and
:c:func:`PyObject_RichCompareBool`.

This function is called with two Python objects and the operator as arguments,
where the operator is one of ``Py_EQ``, ``Py_NE``, ``Py_LE``, ``Py_GE``,
``Py_LT`` or ``Py_GT``. It should compare the two objects with respect to the
specified operator and return ``Py_True`` or ``Py_False`` if the comparison is
successful, ``Py_NotImplemented`` to indicate that comparison is not
implemented and the other object's comparison method should be tried, or ``NULL``
if an exception was set.

Here is a sample implementation, for a datatype whose instances are considered
equal if the sizes stored behind their internal pointers are equal::

   static PyObject *
   newdatatype_richcmp(newdatatypeobject *obj1, newdatatypeobject *obj2, int op)
   {
       PyObject *result;
       int c, size1, size2;

       /* code to make sure that both arguments are of type
          newdatatype omitted */

       size1 = obj1->obj_UnderlyingDatatypePtr->size;
       size2 = obj2->obj_UnderlyingDatatypePtr->size;

       switch (op) {
       case Py_LT: c = size1 <  size2; break;
       case Py_LE: c = size1 <= size2; break;
       case Py_EQ: c = size1 == size2; break;
       case Py_NE: c = size1 != size2; break;
       case Py_GT: c = size1 >  size2; break;
       case Py_GE: c = size1 >= size2; break;
       }
       result = c ? Py_True : Py_False;
       Py_INCREF(result);
       return result;
   }


Abstract Protocol Support
-------------------------

Python supports a variety of *abstract* 'protocols;' the specific interfaces
provided to use these protocols are documented in :ref:`abstract`.


A number of these abstract interfaces were defined early in the development of
the Python implementation. In particular, the number, mapping, and sequence
protocols have been part of Python since the beginning. Other protocols have
been added over time. For protocols which depend on several handler routines
from the type implementation, the older protocols have been defined as optional
blocks of handlers referenced by the type object. For newer protocols there are
additional slots in the main type object, with a flag bit being set to indicate
that the slots are present and should be checked by the interpreter. (The flag
bit does not indicate that the slot values are non-``NULL``. The flag may be set
to indicate the presence of a slot, but a slot may still be unfilled.) ::

   PyNumberMethods   *tp_as_number;
   PySequenceMethods *tp_as_sequence;
   PyMappingMethods  *tp_as_mapping;

If you wish your object to be able to act like a number, a sequence, or a
mapping object, then you place in the corresponding field the address of a
structure of the C type :c:type:`PyNumberMethods`, :c:type:`PySequenceMethods`, or
:c:type:`PyMappingMethods`, respectively. It is up to you to fill in this
structure with appropriate values. You can find examples of the use of each of
these in the :file:`Objects` directory of the Python source distribution. ::

   hashfunc tp_hash;

This function, if you choose to provide it, should return a hash number for an
instance of your data type.
Here is a simple example::

   static Py_hash_t
   newdatatype_hash(newdatatypeobject *obj)
   {
       Py_hash_t result;
       result = obj->some_size + 32767 * obj->some_number;
       if (result == -1)
           result = -2;
       return result;
   }

:c:type:`Py_hash_t` is a signed integer type with a platform-varying width.
Returning ``-1`` from :c:member:`~PyTypeObject.tp_hash` indicates an error,
which is why you should be careful to avoid returning it when hash computation
is successful, as seen above.

::

   ternaryfunc tp_call;

This function is called when an instance of your data type is "called". For
example, if ``obj1`` is an instance of your data type and the Python script
contains ``obj1('hello')``, the :c:member:`~PyTypeObject.tp_call` handler is invoked.

This function takes three arguments:

#. *self* is the instance of the data type which is the subject of the call.
   If the call is ``obj1('hello')``, then *self* is ``obj1``.

#. *args* is a tuple containing the arguments to the call. You can use
   :c:func:`PyArg_ParseTuple` to extract the arguments.

#. *kwds* is a dictionary of keyword arguments that were passed. If this is
   non-``NULL`` and you support keyword arguments, use
   :c:func:`PyArg_ParseTupleAndKeywords` to extract the arguments. If you
   do not want to support keyword arguments and this is non-``NULL``, raise a
   :exc:`TypeError` with a message saying that keyword arguments are not supported.

Here is a toy ``tp_call`` implementation::

   static PyObject *
   newdatatype_call(newdatatypeobject *obj, PyObject *args, PyObject *kwds)
   {
       PyObject *result;
       const char *arg1;
       const char *arg2;
       const char *arg3;

       if (!PyArg_ParseTuple(args, "sss:call", &arg1, &arg2, &arg3)) {
           return NULL;
       }
       result = PyUnicode_FromFormat(
           "Returning -- value: [%d] arg1: [%s] arg2: [%s] arg3: [%s]\n",
           obj->obj_UnderlyingDatatypePtr->size,
           arg1, arg2, arg3);
       return result;
   }

::

   /* Iterators */
   getiterfunc tp_iter;
   iternextfunc tp_iternext;

These functions provide support for the iterator protocol. Both handlers
take exactly one parameter, the instance for which they are being called,
and return a new reference. In the case of an error, they should set an
exception and return ``NULL``. :c:member:`~PyTypeObject.tp_iter` corresponds
to the Python :meth:`~object.__iter__` method, while :c:member:`~PyTypeObject.tp_iternext`
corresponds to the Python :meth:`~iterator.__next__` method.

Any :term:`iterable` object must implement the :c:member:`~PyTypeObject.tp_iter`
handler, which must return an :term:`iterator` object. Here the same guidelines
apply as for Python classes:

* For collections (such as lists and tuples) which can support multiple
  independent iterators, a new iterator should be created and returned by
  each call to :c:member:`~PyTypeObject.tp_iter`.
* Objects which can only be iterated over once (usually due to side effects of
  iteration, such as file objects) can implement :c:member:`~PyTypeObject.tp_iter`
  by returning a new reference to themselves -- and should also therefore
  implement the :c:member:`~PyTypeObject.tp_iternext` handler.

Any :term:`iterator` object should implement both :c:member:`~PyTypeObject.tp_iter`
and :c:member:`~PyTypeObject.tp_iternext`.
An iterator's
:c:member:`~PyTypeObject.tp_iter` handler should return a new reference
to the iterator. Its :c:member:`~PyTypeObject.tp_iternext` handler should
return a new reference to the next object in the iteration, if there is one.
If the iteration has reached the end, :c:member:`~PyTypeObject.tp_iternext`
may return ``NULL`` without setting an exception, or it may set
:exc:`StopIteration` *in addition* to returning ``NULL``; avoiding
the exception can yield slightly better performance. If an actual error
occurs, :c:member:`~PyTypeObject.tp_iternext` should always set an exception
and return ``NULL``.


.. _weakref-support:

Weak Reference Support
----------------------

One of the goals of Python's weak reference implementation is to allow any type
to participate in the weak reference mechanism without incurring the overhead on
performance-critical objects (such as numbers).

.. seealso::
   Documentation for the :mod:`weakref` module.

For an object to be weakly referenceable, the extension type must set the
``Py_TPFLAGS_MANAGED_WEAKREF`` bit of the :c:member:`~PyTypeObject.tp_flags`
field. The legacy :c:member:`~PyTypeObject.tp_weaklistoffset` field should
be left as zero.

Concretely, here is how the statically declared type object would look::

   static PyTypeObject TrivialType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       /* ... other members omitted for brevity ... */
       .tp_flags = Py_TPFLAGS_MANAGED_WEAKREF | ...,
   };


The only further addition is that ``tp_dealloc`` needs to clear any weak
references (by calling :c:func:`PyObject_ClearWeakRefs`)::

   static void
   Trivial_dealloc(TrivialObject *self)
   {
       /* Clear weakrefs first before calling any destructors */
       PyObject_ClearWeakRefs((PyObject *) self);
       /* ... remainder of destruction code omitted for brevity ... */
       Py_TYPE(self)->tp_free((PyObject *) self);
   }


More Suggestions
----------------

In order to learn how to implement any specific method for your new data type,
get the :term:`CPython` source code. Go to the :file:`Objects` directory,
then search the C source files for ``tp_`` plus the function you want
(for example, ``tp_richcompare``). You will find examples of the function
you want to implement.

When you need to verify that an object is a concrete instance of the type you
are implementing, use the :c:func:`PyObject_TypeCheck` function. A sample of
its use might be something like the following::

   if (!PyObject_TypeCheck(some_object, &MyType)) {
       PyErr_SetString(PyExc_TypeError, "arg #1 not a mything");
       return NULL;
   }

.. seealso::
   Download CPython source releases.
      https://www.python.org/downloads/source/

   The CPython project on GitHub, where the CPython source code is developed.
      https://github.com/python/cpython