.. highlight:: c

.. _new-types-topics:

*****************************************
Defining Extension Types: Assorted Topics
*****************************************

.. _dnt-type-methods:

This section aims to give a quick fly-by on the various type methods you can
implement and what they do.

Here is the definition of :c:type:`PyTypeObject`, with some fields only used in
debug builds omitted:

.. literalinclude:: ../includes/typestruct.h


Now that's a *lot* of methods. Don't worry too much though -- if you have
a type you want to define, the chances are very good that you will only
implement a handful of these.

As you probably expect by now, we're going to go over this and give more
information about the various handlers. We won't go in the order they are
defined in the structure, because there is a lot of historical baggage that
impacts the ordering of the fields. It's often easiest to find an example
that includes the fields you need and then change the values to suit your new
type. ::

   const char *tp_name; /* For printing */

The name of the type -- as mentioned in the previous chapter, this will appear in
various places, almost entirely for diagnostic purposes. Try to choose something
that will be helpful in such a situation! ::

   Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */

These fields tell the runtime how much memory to allocate when new objects of
this type are created. Python has some built-in support for variable length
structures (think: strings, tuples) which is where the :c:member:`~PyTypeObject.tp_itemsize` field
comes in. This will be dealt with later. ::

   const char *tp_doc;

Here you can put a string (or its address) that you want returned when the
Python script references ``obj.__doc__`` to retrieve the doc string.

Now we come to the basic type methods -- the ones most extension types will
implement.
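
Before moving on, the allocation arithmetic behind ``tp_basicsize`` and
``tp_itemsize`` described above can be sketched in plain C. This is only an
illustration, not the interpreter's allocator: the ``size_fields`` struct and
the ``instance_size`` function are hypothetical stand-ins, and rounding the
result up to the size of a pointer is an assumption made for this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two size fields of a type object;
   the real fields are tp_basicsize and tp_itemsize. */
typedef struct {
    size_t basicsize;   /* bytes in the fixed part of every instance */
    size_t itemsize;    /* bytes per variable-length item; 0 for fixed-size types */
} size_fields;

/* Approximate number of bytes needed for an instance holding nitems items.
   Rounding up to the size of a pointer is an assumption of this sketch. */
static size_t
instance_size(const size_fields *type, size_t nitems)
{
    size_t raw = type->basicsize + nitems * type->itemsize;
    size_t align = sizeof(void *);

    return (raw + align - 1) / align * align;
}
```

With a fixed part of 24 bytes and 8 bytes per item, an instance holding three
items needs 24 + 3 * 8 = 48 bytes. A type whose item size is zero always needs
just its basic size, however many items are requested.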


Finalization and De-allocation
------------------------------

.. index::
   single: object; deallocation
   single: deallocation, object
   single: object; finalization
   single: finalization, of objects

::

   destructor tp_dealloc;

This function is called when the reference count of the instance of your type is
reduced to zero and the Python interpreter wants to reclaim it. If your type
has memory to free or other clean-up to perform, you can put it here. The
object itself needs to be freed here as well. Here is an example of this
function::

   static void
   newdatatype_dealloc(newdatatypeobject *obj)
   {
       /* Release the resources owned by the object, then the object itself */
       free(obj->obj_UnderlyingDatatypePtr);
       Py_TYPE(obj)->tp_free(obj);
   }

.. index::
   single: PyErr_Fetch()
   single: PyErr_Restore()

One important requirement of the deallocator function is that it leaves any
pending exceptions alone. This is important since deallocators are frequently
called as the interpreter unwinds the Python stack; when the stack is unwound
due to an exception (rather than normal returns), nothing is done to protect the
deallocators from seeing that an exception has already been set. Any action
a deallocator performs that may cause additional Python code to be
executed may detect that an exception has been set. This can lead to misleading
errors from the interpreter. The proper way to protect against this is to save
a pending exception before performing the unsafe action, and restore it when
done.
This can be done using the :c:func:`PyErr_Fetch` and
:c:func:`PyErr_Restore` functions::

   static void
   my_dealloc(PyObject *obj)
   {
       MyObject *self = (MyObject *) obj;
       PyObject *cbresult;

       if (self->my_callback != NULL) {
           PyObject *err_type, *err_value, *err_traceback;

           /* This saves the current exception state */
           PyErr_Fetch(&err_type, &err_value, &err_traceback);

           cbresult = PyObject_CallObject(self->my_callback, NULL);
           if (cbresult == NULL)
               PyErr_WriteUnraisable(self->my_callback);
           else
               Py_DECREF(cbresult);

           /* This restores the saved exception state */
           PyErr_Restore(err_type, err_value, err_traceback);

           Py_DECREF(self->my_callback);
       }
       Py_TYPE(obj)->tp_free((PyObject*)self);
   }

.. note::
   There are limitations to what you can safely do in a deallocator function.
   First, if your type supports garbage collection (using :c:member:`~PyTypeObject.tp_traverse`
   and/or :c:member:`~PyTypeObject.tp_clear`), some of the object's members may already have been
   cleared or finalized by the time :c:member:`~PyTypeObject.tp_dealloc` is called. Second, in
   :c:member:`~PyTypeObject.tp_dealloc`, your object is in an unstable state: its reference
   count is equal to zero. Any call to a non-trivial object or API (as in the
   example above) might end up calling :c:member:`~PyTypeObject.tp_dealloc` again, causing a
   double free and a crash.

   Starting with Python 3.4, it is recommended not to put any complex
   finalization code in :c:member:`~PyTypeObject.tp_dealloc`, and instead use the new
   :c:member:`~PyTypeObject.tp_finalize` type method.

   .. seealso::
      :pep:`442` explains the new finalization scheme.


.. index::
   single: string; object representation
   builtin: repr

Object Presentation
-------------------

In Python, there are two ways to generate a textual representation of an object:
the :func:`repr` function, and the :func:`str` function. (The :func:`print`
function just calls :func:`str`.) These handlers are both optional.

::

   reprfunc tp_repr;
   reprfunc tp_str;

The :c:member:`~PyTypeObject.tp_repr` handler should return a string object containing a
representation of the instance for which it is called. Here is a simple
example::

   static PyObject *
   newdatatype_repr(newdatatypeobject *obj)
   {
       return PyUnicode_FromFormat("Repr-ified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }

If no :c:member:`~PyTypeObject.tp_repr` handler is specified, the interpreter will supply a
representation that uses the type's :c:member:`~PyTypeObject.tp_name` and a uniquely-identifying
value for the object.

The :c:member:`~PyTypeObject.tp_str` handler is to :func:`str` what the :c:member:`~PyTypeObject.tp_repr` handler
described above is to :func:`repr`; that is, it is called when Python code calls
:func:`str` on an instance of your object. Its implementation is very similar
to the :c:member:`~PyTypeObject.tp_repr` function, but the resulting string is intended for human
consumption. If :c:member:`~PyTypeObject.tp_str` is not specified, the :c:member:`~PyTypeObject.tp_repr` handler is
used instead.

Here is a simple example::

   static PyObject *
   newdatatype_str(newdatatypeobject *obj)
   {
       return PyUnicode_FromFormat("Stringified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }



Attribute Management
--------------------

For every object which can support attributes, the corresponding type must
provide the functions that control how the attributes are resolved.
There needs
to be a function which can retrieve attributes (if any are defined), and another
to set attributes (if setting attributes is allowed). Removing an attribute is
a special case, for which the new value passed to the handler is ``NULL``.

Python supports two pairs of attribute handlers; a type that supports attributes
only needs to implement the functions for one pair. The difference is that one
pair takes the name of the attribute as a :c:type:`char\*`, while the other
accepts a :c:type:`PyObject\*`. Each type can use whichever pair makes more
sense for the implementation's convenience. ::

   getattrfunc  tp_getattr;        /* char * version */
   setattrfunc  tp_setattr;
   /* ... */
   getattrofunc tp_getattro;       /* PyObject * version */
   setattrofunc tp_setattro;

If accessing attributes of an object is always a simple operation (this will be
explained shortly), there are generic implementations which can be used to
provide the :c:type:`PyObject\*` version of the attribute management functions.
The actual need for type-specific attribute handlers almost completely
disappeared starting with Python 2.2, though there are many examples which have
not been updated to use the generic mechanism that is now available.


.. _generic-attribute-management:

Generic Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Most extension types only use *simple* attributes. So, what makes the
attributes simple? There are only a couple of conditions that must be met:

#. The names of the attributes must be known when :c:func:`PyType_Ready` is
   called.

#. No special processing is needed to record that an attribute was looked up or
   set, nor do actions need to be taken based on the value.

Note that this list does not place any restrictions on the values of the
attributes, when the values are computed, or how relevant data is stored.

When :c:func:`PyType_Ready` is called, it uses three tables referenced by the
type object to create :term:`descriptor`\s which are placed in the dictionary of the
type object. Each descriptor controls access to one attribute of the instance
object. Each of the tables is optional; if all three are ``NULL``, instances of
the type will only have attributes that are inherited from their base type, and
the type should leave the :c:member:`~PyTypeObject.tp_getattro` and :c:member:`~PyTypeObject.tp_setattro` fields ``NULL`` as
well, allowing the base type to handle attributes.

The tables are declared as three fields of the type object::

   struct PyMethodDef *tp_methods;
   struct PyMemberDef *tp_members;
   struct PyGetSetDef *tp_getset;

If :c:member:`~PyTypeObject.tp_methods` is not ``NULL``, it must refer to an array of
:c:type:`PyMethodDef` structures. Each entry in the table is an instance of this
structure::

   typedef struct PyMethodDef {
       const char  *ml_name;       /* method name */
       PyCFunction  ml_meth;       /* implementation function */
       int          ml_flags;      /* flags */
       const char  *ml_doc;        /* docstring */
   } PyMethodDef;

One entry should be defined for each method provided by the type; no entries are
needed for methods inherited from a base type. One additional entry is needed
at the end; it is a sentinel that marks the end of the array. The
:attr:`ml_name` field of the sentinel must be ``NULL``.

The second table is used to define attributes which map directly to data stored
in the instance. A variety of primitive C types are supported, and access may
be read-only or read-write.
The structures in the table are defined as::

   typedef struct PyMemberDef {
       const char *name;
       int         type;
       Py_ssize_t  offset;
       int         flags;
       const char *doc;
   } PyMemberDef;

For each entry in the table, a :term:`descriptor` will be constructed and added to the
type which will be able to extract a value from the instance structure. The
:attr:`type` field should contain one of the type codes defined in the
:file:`structmember.h` header; the value will be used to determine how to
convert Python values to and from C values. The :attr:`flags` field is used to
store flags which control how the attribute can be accessed.

The following flag constants are defined in :file:`structmember.h`; they may be
combined using bitwise-OR.

+---------------------------+----------------------------------------------+
| Constant                  | Meaning                                      |
+===========================+==============================================+
| :const:`READONLY`         | Never writable.                              |
+---------------------------+----------------------------------------------+
| :const:`READ_RESTRICTED`  | Not readable in restricted mode.             |
+---------------------------+----------------------------------------------+
| :const:`WRITE_RESTRICTED` | Not writable in restricted mode.             |
+---------------------------+----------------------------------------------+
| :const:`RESTRICTED`       | Not readable or writable in restricted mode. |
+---------------------------+----------------------------------------------+

.. index::
   single: READONLY
   single: READ_RESTRICTED
   single: WRITE_RESTRICTED
   single: RESTRICTED

An interesting advantage of using the :c:member:`~PyTypeObject.tp_members` table to build
descriptors that are used at runtime is that any attribute defined this way can
have an associated doc string simply by providing the text in the table.
An application can use the introspection API to retrieve the descriptor from the
class object, and get the doc string using its :attr:`__doc__` attribute.

As with the :c:member:`~PyTypeObject.tp_methods` table, a sentinel entry with a :attr:`name` value
of ``NULL`` is required.

.. XXX Descriptors need to be explained in more detail somewhere, but not here.

   Descriptor objects have two handler functions which correspond to the
   \member{tp_getattro} and \member{tp_setattro} handlers. The
   \method{__get__()} handler is a function which is passed the descriptor,
   instance, and type objects, and returns the value of the attribute, or it
   returns \NULL{} and sets an exception. The \method{__set__()} handler is
   passed the descriptor, instance, type, and new value;


Type-specific Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For simplicity, only the :c:type:`char\*` version will be demonstrated here; the
type of the name parameter is the only difference between the :c:type:`char\*`
and :c:type:`PyObject\*` flavors of the interface. This example effectively does
the same thing as the generic example above, but does not use the generic
support added in Python 2.2. It explains how the handler functions are
called, so that if you do need to extend their functionality, you'll understand
what needs to be done.

The :c:member:`~PyTypeObject.tp_getattr` handler is called when the object requires an attribute
look-up. It is called in the same situations where the :meth:`__getattr__`
method of a class would be called.

Here is an example::

   static PyObject *
   newdatatype_getattr(newdatatypeobject *obj, char *name)
   {
       if (strcmp(name, "data") == 0)
       {
           return PyLong_FromLong(obj->data);
       }

       /* Unknown attribute: report it using the name of the object's type */
       PyErr_Format(PyExc_AttributeError,
                    "'%.50s' object has no attribute '%.400s'",
                    Py_TYPE(obj)->tp_name, name);
       return NULL;
   }

The :c:member:`~PyTypeObject.tp_setattr` handler is called when the :meth:`__setattr__` or
:meth:`__delattr__` method of a class instance would be called. When an
attribute should be deleted, the third parameter will be ``NULL``. Here is an
example that simply raises an exception; if this were really all you wanted, the
:c:member:`~PyTypeObject.tp_setattr` handler should be set to ``NULL``. ::

   static int
   newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v)
   {
       PyErr_Format(PyExc_RuntimeError, "Read-only attribute: %s", name);
       return -1;
   }

Object Comparison
-----------------

::

   richcmpfunc tp_richcompare;

The :c:member:`~PyTypeObject.tp_richcompare` handler is called when comparisons are needed. It is
analogous to the :ref:`rich comparison methods <richcmpfuncs>`, like
:meth:`__lt__`, and also called by :c:func:`PyObject_RichCompare` and
:c:func:`PyObject_RichCompareBool`.

This function is called with two Python objects and the operator as arguments,
where the operator is one of ``Py_EQ``, ``Py_NE``, ``Py_LE``, ``Py_GE``,
``Py_LT`` or ``Py_GT``. It should compare the two objects with respect to the
specified operator and return ``Py_True`` or ``Py_False`` if the comparison is
successful, ``Py_NotImplemented`` to indicate that comparison is not
implemented and the other object's comparison method should be tried, or ``NULL``
if an exception was set.

Here is a sample implementation, for a datatype whose instances are compared by
the ``size`` value reached through an internal pointer::

   static PyObject *
   newdatatype_richcmp(PyObject *obj1, PyObject *obj2, int op)
   {
       PyObject *result;
       int c, size1, size2;

       /* code to make sure that both arguments are of type
          newdatatype omitted */

       size1 = ((newdatatypeobject *) obj1)->obj_UnderlyingDatatypePtr->size;
       size2 = ((newdatatypeobject *) obj2)->obj_UnderlyingDatatypePtr->size;

       switch (op) {
       case Py_LT: c = size1 <  size2; break;
       case Py_LE: c = size1 <= size2; break;
       case Py_EQ: c = size1 == size2; break;
       case Py_NE: c = size1 != size2; break;
       case Py_GT: c = size1 >  size2; break;
       case Py_GE: c = size1 >= size2; break;
       default:
           /* Not a valid comparison operator */
           PyErr_BadArgument();
           return NULL;
       }
       result = c ? Py_True : Py_False;
       Py_INCREF(result);
       return result;
   }


Abstract Protocol Support
-------------------------

Python supports a variety of *abstract* 'protocols;' the specific interfaces
provided to use these protocols are documented in :ref:`abstract`.


A number of these abstract interfaces were defined early in the development of
the Python implementation. In particular, the number, mapping, and sequence
protocols have been part of Python since the beginning. Other protocols have
been added over time. For protocols which depend on several handler routines
from the type implementation, the older protocols have been defined as optional
blocks of handlers referenced by the type object. For newer protocols there are
additional slots in the main type object, with a flag bit being set to indicate
that the slots are present and should be checked by the interpreter. (The flag
bit does not indicate that the slot values are non-``NULL``. The flag may be set
to indicate the presence of a slot, but a slot may still be unfilled.)
::

   PyNumberMethods   *tp_as_number;
   PySequenceMethods *tp_as_sequence;
   PyMappingMethods  *tp_as_mapping;

If you wish your object to be able to act like a number, a sequence, or a
mapping object, then you place the address of a :c:type:`PyNumberMethods`,
:c:type:`PySequenceMethods`, or :c:type:`PyMappingMethods` structure,
respectively, in the corresponding field. It is up to you to fill in this
structure with appropriate values. You can find examples of the use of each of
these in the :file:`Objects` directory of the Python source distribution. ::

   hashfunc tp_hash;

This function, if you choose to provide it, should return a hash number for an
instance of your data type. Here is a simple example::

   static Py_hash_t
   newdatatype_hash(newdatatypeobject *obj)
   {
       Py_hash_t result;
       result = obj->some_size + 32767 * obj->some_number;
       if (result == -1)
           result = -2;
       return result;
   }

:c:type:`Py_hash_t` is a signed integer type with a platform-varying width.
Returning ``-1`` from :c:member:`~PyTypeObject.tp_hash` indicates an error,
which is why you should be careful to avoid returning it when hash computation
is successful, as seen above.

::

   ternaryfunc tp_call;

This function is called when an instance of your data type is "called". For
example, if ``obj1`` is an instance of your data type and the Python script
contains ``obj1('hello')``, the :c:member:`~PyTypeObject.tp_call` handler is invoked.

This function takes three arguments:

#. *self* is the instance of the data type which is the subject of the call.
   If the call is ``obj1('hello')``, then *self* is ``obj1``.

#. *args* is a tuple containing the arguments to the call. You can use
   :c:func:`PyArg_ParseTuple` to extract the arguments.

#. *kwds* is a dictionary of keyword arguments that were passed.
   If this is non-``NULL`` and you support keyword arguments, use
   :c:func:`PyArg_ParseTupleAndKeywords` to extract the arguments. If you
   do not want to support keyword arguments and this is non-``NULL``, raise a
   :exc:`TypeError` with a message saying that keyword arguments are not supported.

Here is a toy ``tp_call`` implementation::

   static PyObject *
   newdatatype_call(newdatatypeobject *self, PyObject *args, PyObject *kwds)
   {
       PyObject *result;
       const char *arg1;
       const char *arg2;
       const char *arg3;

       /* Expect exactly three string arguments */
       if (!PyArg_ParseTuple(args, "sss:call", &arg1, &arg2, &arg3)) {
           return NULL;
       }
       result = PyUnicode_FromFormat(
           "Returning -- value: [%d] arg1: [%s] arg2: [%s] arg3: [%s]\n",
           self->obj_UnderlyingDatatypePtr->size,
           arg1, arg2, arg3);
       return result;
   }

::

   /* Iterators */
   getiterfunc tp_iter;
   iternextfunc tp_iternext;

These functions provide support for the iterator protocol. Both handlers
take exactly one parameter, the instance for which they are being called,
and return a new reference. In the case of an error, they should set an
exception and return ``NULL``. :c:member:`~PyTypeObject.tp_iter` corresponds
to the Python :meth:`__iter__` method, while :c:member:`~PyTypeObject.tp_iternext`
corresponds to the Python :meth:`~iterator.__next__` method.

Any :term:`iterable` object must implement the :c:member:`~PyTypeObject.tp_iter`
handler, which must return an :term:`iterator` object. Here the same guidelines
apply as for Python classes:

* For collections (such as lists and tuples) which can support multiple
  independent iterators, a new iterator should be created and returned by
  each call to :c:member:`~PyTypeObject.tp_iter`.
* Objects which can only be iterated over once (usually due to side effects of
  iteration, such as file objects) can implement :c:member:`~PyTypeObject.tp_iter`
  by returning a new reference to themselves -- and should also therefore
  implement the :c:member:`~PyTypeObject.tp_iternext` handler.

Any :term:`iterator` object should implement both :c:member:`~PyTypeObject.tp_iter`
and :c:member:`~PyTypeObject.tp_iternext`. An iterator's
:c:member:`~PyTypeObject.tp_iter` handler should return a new reference
to the iterator. Its :c:member:`~PyTypeObject.tp_iternext` handler should
return a new reference to the next object in the iteration, if there is one.
If the iteration has reached the end, :c:member:`~PyTypeObject.tp_iternext`
may return ``NULL`` without setting an exception, or it may set
:exc:`StopIteration` *in addition* to returning ``NULL``; avoiding
the exception can yield slightly better performance. If an actual error
occurs, :c:member:`~PyTypeObject.tp_iternext` should always set an exception
and return ``NULL``.


.. _weakref-support:

Weak Reference Support
----------------------

One of the goals of Python's weak reference implementation is to allow any type
to participate in the weak reference mechanism without incurring the overhead on
performance-critical objects (such as numbers).

.. seealso::
   Documentation for the :mod:`weakref` module.

For an object to be weakly referenceable, the extension type must do two things:

#. Include a :c:type:`PyObject\*` field in the C object structure dedicated to
   the weak reference mechanism. The object's constructor should leave it
   ``NULL`` (which is automatic when using the default
   :c:member:`~PyTypeObject.tp_alloc`).

#. Set the :c:member:`~PyTypeObject.tp_weaklistoffset` type member
   to the offset of the aforementioned field in the C object structure,
   so that the interpreter knows how to access and modify that field.

Concretely, here is how a trivial object structure would be augmented
with the required field::

   typedef struct {
       PyObject_HEAD
       PyObject *weakreflist;  /* List of weak references */
   } TrivialObject;

And the corresponding member in the statically-declared type object::

   static PyTypeObject TrivialType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       /* ... other members omitted for brevity ... */
       .tp_weaklistoffset = offsetof(TrivialObject, weakreflist),
   };

The only further addition is that ``tp_dealloc`` needs to clear any weak
references (by calling :c:func:`PyObject_ClearWeakRefs`) if the field is
non-``NULL``::

   static void
   Trivial_dealloc(TrivialObject *self)
   {
       /* Clear weakrefs first before calling any destructors */
       if (self->weakreflist != NULL)
           PyObject_ClearWeakRefs((PyObject *) self);
       /* ... remainder of destruction code omitted for brevity ... */
       Py_TYPE(self)->tp_free((PyObject *) self);
   }


More Suggestions
----------------

In order to learn how to implement any specific method for your new data type,
get the :term:`CPython` source code. Go to the :file:`Objects` directory,
then search the C source files for ``tp_`` plus the function you want
(for example, ``tp_richcompare``). You will find examples of the function
you want to implement.

When you need to verify that an object is a concrete instance of the type you
are implementing, use the :c:func:`PyObject_TypeCheck` function. A sample of
its use might be something like the following::

   if (!PyObject_TypeCheck(some_object, &MyType)) {
       PyErr_SetString(PyExc_TypeError, "arg #1 not a mything");
       return NULL;
   }

.. seealso::
   Download CPython source releases.
      https://www.python.org/downloads/source/

   The CPython project on GitHub, where the CPython source code is developed.
      https://github.com/python/cpython