.. highlightlang:: c

.. _arg-parsing:

Parsing arguments and building values
=====================================

These functions are useful when creating your own extension functions and
methods. Additional information and examples are available in
:ref:`extending-index`.

The first three of these functions described, :c:func:`PyArg_ParseTuple`,
:c:func:`PyArg_ParseTupleAndKeywords`, and :c:func:`PyArg_Parse`, all use
*format strings* which are used to tell the function about the expected
arguments. The format strings use the same syntax for each of these
functions.

A format string consists of zero or more "format units." A format unit
describes one Python object; it is usually a single character or a
parenthesized sequence of format units. With a few exceptions, a format unit
that is not a parenthesized sequence normally corresponds to a single address
argument to these functions. In the following description, the quoted form is
the format unit; the entry in (round) parentheses is the Python object type
that matches the format unit; and the entry in [square] brackets is the type
of the C variable(s) whose address should be passed.

These formats allow accessing an object as a contiguous chunk of memory.
You don't have to provide raw storage for the returned unicode or bytes
area. Also, you won't have to release any memory yourself, except with the
``es``, ``es#``, ``et`` and ``et#`` formats.

``s`` (string or Unicode) [const char \*]
   Convert a Python string or Unicode object to a C pointer to a character
   string. You must not provide storage for the string itself; a pointer to
   an existing string is stored into the character pointer variable whose
   address you pass. The C string is NUL-terminated. The Python string must
   not contain embedded NUL bytes; if it does, a :exc:`TypeError` exception is
   raised. Unicode objects are converted to C strings using the default
   encoding. If this conversion fails, a :exc:`UnicodeError` is raised.

``s#`` (string, Unicode or any read buffer compatible object) [const char \*, int (or :c:type:`Py_ssize_t`, see below)]
   This variant on ``s`` stores into two C variables, the first one a pointer
   to a character string, the second one its length. In this case the Python
   string may contain embedded null bytes. Unicode objects pass back a
   pointer to the default encoded string version of the object if such a
   conversion is possible. All other read-buffer compatible objects pass back
   a reference to the raw internal data representation.

   Starting with Python 2.5 the type of the length argument can be controlled
   by defining the macro :c:macro:`PY_SSIZE_T_CLEAN` before including
   :file:`Python.h`. If the macro is defined, length is a :c:type:`Py_ssize_t`
   rather than an int.

``s*`` (string, Unicode, or any buffer compatible object) [Py_buffer]
   Similar to ``s#``, this code fills a Py_buffer structure provided by the
   caller. The buffer gets locked, so that the caller can subsequently use
   the buffer even inside a ``Py_BEGIN_ALLOW_THREADS`` block; the caller is
   responsible for calling ``PyBuffer_Release`` with the structure after it
   has processed the data.

   .. versionadded:: 2.6

``z`` (string, Unicode or ``None``) [const char \*]
   Like ``s``, but the Python object may also be ``None``, in which case the C
   pointer is set to *NULL*.

``z#`` (string, Unicode, ``None`` or any read buffer compatible object) [const char \*, int]
   This is to ``s#`` as ``z`` is to ``s``.

``z*`` (string, Unicode, ``None`` or any buffer compatible object) [Py_buffer]
   This is to ``s*`` as ``z`` is to ``s``.

   .. versionadded:: 2.6

``u`` (Unicode) [Py_UNICODE \*]
   Convert a Python Unicode object to a C pointer to a NUL-terminated buffer
   of 16-bit Unicode (UTF-16) data. As with ``s``, there is no need to
   provide storage for the Unicode data buffer; a pointer to the existing
   Unicode data is stored into the :c:type:`Py_UNICODE` pointer variable whose
   address you pass.

``u#`` (Unicode) [Py_UNICODE \*, int]
   This variant on ``u`` stores into two C variables, the first one a pointer
   to a Unicode data buffer, the second one its length. Non-Unicode objects
   are handled by interpreting their read-buffer pointer as a pointer to a
   :c:type:`Py_UNICODE` array.

``es`` (string, Unicode or character buffer compatible object) [const char \*encoding, char \*\*buffer]
   This variant on ``s`` is used for encoding Unicode and objects convertible
   to Unicode into a character buffer. It only works for encoded data without
   embedded NUL bytes.

   This format requires two arguments. The first is only used as input, and
   must be a :c:type:`const char\*` which points to the name of an encoding as
   a NUL-terminated string, or *NULL*, in which case the default encoding is
   used. An exception is raised if the named encoding is not known to Python.
   The second argument must be a :c:type:`char\*\*`; the value of the pointer
   it references will be set to a buffer with the contents of the argument
   text. The text will be encoded in the encoding specified by the first
   argument.

   :c:func:`PyArg_ParseTuple` will allocate a buffer of the needed size, copy
   the encoded data into this buffer and adjust *\*buffer* to reference the
   newly allocated storage. The caller is responsible for calling
   :c:func:`PyMem_Free` to free the allocated buffer after use.

``et`` (string, Unicode or character buffer compatible object) [const char \*encoding, char \*\*buffer]
   Same as ``es`` except that 8-bit string objects are passed through without
   recoding them. Instead, the implementation assumes that the string object
   uses the encoding passed in as parameter.

``es#`` (string, Unicode or character buffer compatible object) [const char \*encoding, char \*\*buffer, int \*buffer_length]
   This variant on ``s#`` is used for encoding Unicode and objects convertible
   to Unicode into a character buffer. Unlike the ``es`` format, this variant
   allows input data which contains NUL characters.

   It requires three arguments. The first is only used as input, and must be
   a :c:type:`const char\*` which points to the name of an encoding as a
   NUL-terminated string, or *NULL*, in which case the default encoding is
   used. An exception is raised if the named encoding is not known to Python.
   The second argument must be a :c:type:`char\*\*`; the value of the pointer
   it references will be set to a buffer with the contents of the argument
   text. The text will be encoded in the encoding specified by the first
   argument. The third argument must be a pointer to an integer; the
   referenced integer will be set to the number of bytes in the output buffer.

   There are two modes of operation:

   If *\*buffer* points to a *NULL* pointer, the function will allocate a
   buffer of the needed size, copy the encoded data into this buffer and set
   *\*buffer* to reference the newly allocated storage. The caller is
   responsible for calling :c:func:`PyMem_Free` to free the allocated buffer
   after use.

   If *\*buffer* points to a non-*NULL* pointer (an already allocated buffer),
   :c:func:`PyArg_ParseTuple` will use this location as the buffer and
   interpret the initial value of *\*buffer_length* as the buffer size. It
   will then copy the encoded data into the buffer and NUL-terminate it. If
   the buffer is not large enough, a :exc:`TypeError` will be set.
   Note: starting from Python 3.6 a :exc:`ValueError` will be set.

   In both cases, *\*buffer_length* is set to the length of the encoded data
   without the trailing NUL byte.

``et#`` (string, Unicode or character buffer compatible object) [const char \*encoding, char \*\*buffer, int \*buffer_length]
   Same as ``es#`` except that string objects are passed through without
   recoding them. Instead, the implementation assumes that the string object
   uses the encoding passed in as parameter.

``b`` (integer) [unsigned char]
   Convert a nonnegative Python integer to an unsigned tiny int, stored in a C
   :c:type:`unsigned char`.

``B`` (integer) [unsigned char]
   Convert a Python integer to a tiny int without overflow checking, stored in
   a C :c:type:`unsigned char`.

   .. versionadded:: 2.3

``h`` (integer) [short int]
   Convert a Python integer to a C :c:type:`short int`.

``H`` (integer) [unsigned short int]
   Convert a Python integer to a C :c:type:`unsigned short int`, without
   overflow checking.

   .. versionadded:: 2.3

``i`` (integer) [int]
   Convert a Python integer to a plain C :c:type:`int`.

``I`` (integer) [unsigned int]
   Convert a Python integer to a C :c:type:`unsigned int`, without overflow
   checking.

   .. versionadded:: 2.3

``l`` (integer) [long int]
   Convert a Python integer to a C :c:type:`long int`.

``k`` (integer) [unsigned long]
   Convert a Python integer or long integer to a C :c:type:`unsigned long`
   without overflow checking.

   .. versionadded:: 2.3

``L`` (integer) [PY_LONG_LONG]
   Convert a Python integer to a C :c:type:`long long`. This format is only
   available on platforms that support :c:type:`long long` (or :c:type:`__int64`
   on Windows).

``K`` (integer) [unsigned PY_LONG_LONG]
   Convert a Python integer or long integer to a C :c:type:`unsigned long long`
   without overflow checking. This format is only available on platforms that
   support :c:type:`unsigned long long` (or :c:type:`unsigned __int64` on
   Windows).

   .. versionadded:: 2.3

``n`` (integer) [Py_ssize_t]
   Convert a Python integer or long integer to a C :c:type:`Py_ssize_t`.

   .. versionadded:: 2.5

``c`` (string of length 1) [char]
   Convert a Python character, represented as a string of length 1, to a C
   :c:type:`char`.

``f`` (float) [float]
   Convert a Python floating point number to a C :c:type:`float`.

``d`` (float) [double]
   Convert a Python floating point number to a C :c:type:`double`.

``D`` (complex) [Py_complex]
   Convert a Python complex number to a C :c:type:`Py_complex` structure.

``O`` (object) [PyObject \*]
   Store a Python object (without any conversion) in a C object pointer. The
   C program thus receives the actual object that was passed. The object's
   reference count is not increased. The pointer stored is not *NULL*.

``O!`` (object) [*typeobject*, PyObject \*]
   Store a Python object in a C object pointer. This is similar to ``O``, but
   takes two C arguments: the first is the address of a Python type object,
   the second is the address of the C variable (of type :c:type:`PyObject\*`)
   into which the object pointer is stored. If the Python object does not
   have the required type, :exc:`TypeError` is raised.

``O&`` (object) [*converter*, *anything*]
   Convert a Python object to a C variable through a *converter* function.
   This takes two arguments: the first is a function, the second is the
   address of a C variable (of arbitrary type), converted to :c:type:`void \*`.
   The *converter* function in turn is called as follows::

      status = converter(object, address);

   where *object* is the Python object to be converted and *address* is the
   :c:type:`void\*` argument that was passed to the :c:func:`PyArg_Parse\*`
   function. The returned *status* should be ``1`` for a successful
   conversion and ``0`` if the conversion has failed. When the conversion
   fails, the *converter* function should raise an exception and leave the
   content of *address* unmodified.

``S`` (string) [PyStringObject \*]
   Like ``O`` but requires that the Python object is a string object. Raises
   :exc:`TypeError` if the object is not a string object. The C variable may
   also be declared as :c:type:`PyObject\*`.

``U`` (Unicode string) [PyUnicodeObject \*]
   Like ``O`` but requires that the Python object is a Unicode object. Raises
   :exc:`TypeError` if the object is not a Unicode object. The C variable may
   also be declared as :c:type:`PyObject\*`.

``t#`` (read-only character buffer) [char \*, int]
   Like ``s#``, but accepts any object which implements the read-only buffer
   interface. The :c:type:`char\*` variable is set to point to the first byte
   of the buffer, and the :c:type:`int` is set to the length of the buffer.
   Only single-segment buffer objects are accepted; :exc:`TypeError` is raised
   for all others.

``w`` (read-write character buffer) [char \*]
   Similar to ``s``, but accepts any object which implements the read-write
   buffer interface. The caller must determine the length of the buffer by
   other means, or use ``w#`` instead. Only single-segment buffer objects are
   accepted; :exc:`TypeError` is raised for all others.

``w#`` (read-write character buffer) [char \*, Py_ssize_t]
   Like ``s#``, but accepts any object which implements the read-write buffer
   interface. The :c:type:`char \*` variable is set to point to the first byte
   of the buffer, and the :c:type:`Py_ssize_t` is set to the length of the
   buffer. Only single-segment buffer objects are accepted; :exc:`TypeError`
   is raised for all others.

``w*`` (read-write byte-oriented buffer) [Py_buffer]
   This is to ``w`` what ``s*`` is to ``s``.

   .. versionadded:: 2.6

``(items)`` (tuple) [*matching-items*]
   The object must be a Python sequence whose length is the number of format
   units in *items*. The C arguments must correspond to the individual format
   units in *items*. Format units for sequences may be nested.

   .. note::

      Prior to Python version 1.5.2, this format specifier only accepted a
      tuple containing the individual parameters, not an arbitrary sequence.
      Code which previously caused :exc:`TypeError` to be raised here may now
      proceed without an exception. This is not expected to be a problem for
      existing code.

It is possible to pass Python long integers where integers are requested;
however no proper range checking is done --- the most significant bits are
silently truncated when the receiving field is too small to receive the value
(actually, the semantics are inherited from downcasts in C --- your mileage
may vary).

A few other characters have a meaning in a format string. These may not occur
inside nested parentheses. They are:

``|``
   Indicates that the remaining arguments in the Python argument list are
   optional. The C variables corresponding to optional arguments should be
   initialized to their default value --- when an optional argument is not
   specified, :c:func:`PyArg_ParseTuple` does not touch the contents of the
   corresponding C variable(s).

``:``
   The list of format units ends here; the string after the colon is used as
   the function name in error messages (the "associated value" of the
   exception that :c:func:`PyArg_ParseTuple` raises).

``;``
   The list of format units ends here; the string after the semicolon is used
   as the error message *instead* of the default error message. ``:`` and
   ``;`` mutually exclude each other.
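
As a minimal sketch of how format units and the special characters fit
together, the hypothetical helper below (``parse_repeat_args`` and its
parameter names are illustrative, not part of the API) parses a required
string, a required integer and an optional integer. The optional value is
initialized before the call because it is left untouched when the argument is
omitted::

   #define PY_SSIZE_T_CLEAN
   #include <Python.h>

   /* Hypothetical helper: parse (name, count[, scale]) from *args*.
      Returns nonzero on success, 0 on failure with an exception set. */
   static int
   parse_repeat_args(PyObject *args, const char **name, int *count, int *scale)
   {
       *scale = 1;  /* default for the optional argument after "|" */

       /* The text after ":" is only used as the name in error messages. */
       return PyArg_ParseTuple(args, "si|i:repeat", name, count, scale);
   }

The pointer stored through *name* refers to memory owned by the argument
object, so it should only be used while *args* is still alive.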

Note that any Python object references which are provided to the caller are
*borrowed* references; do not decrement their reference count!

Additional arguments passed to these functions must be addresses of variables
whose type is determined by the format string; these are used to store values
from the input tuple. There are a few cases, as described in the list of
format units above, where these parameters are used as input values; they
should match what is specified for the corresponding format unit in that case.

For the conversion to succeed, the *arg* object must match the format and the
format must be exhausted. On success, the :c:func:`PyArg_Parse\*` functions
return true, otherwise they return false and raise an appropriate exception.
When the :c:func:`PyArg_Parse\*` functions fail due to conversion failure in
one of the format units, the variables at the addresses corresponding to that
and the following format units are left untouched.


.. c:function:: int PyArg_ParseTuple(PyObject *args, const char *format, ...)

   Parse the parameters of a function that takes only positional parameters
   into local variables. Returns true on success; on failure, it returns
   false and raises the appropriate exception.


.. c:function:: int PyArg_VaParse(PyObject *args, const char *format, va_list vargs)

   Identical to :c:func:`PyArg_ParseTuple`, except that it accepts a va_list
   rather than a variable number of arguments.


.. c:function:: int PyArg_ParseTupleAndKeywords(PyObject *args, PyObject *kw, const char *format, char *keywords[], ...)

   Parse the parameters of a function that takes both positional and keyword
   parameters into local variables. Returns true on success; on failure, it
   returns false and raises the appropriate exception.


.. c:function:: int PyArg_VaParseTupleAndKeywords(PyObject *args, PyObject *kw, const char *format, char *keywords[], va_list vargs)

   Identical to :c:func:`PyArg_ParseTupleAndKeywords`, except that it accepts a
   va_list rather than a variable number of arguments.


.. c:function:: int PyArg_Parse(PyObject *args, const char *format, ...)

   Function used to deconstruct the argument lists of "old-style" functions
   --- these are functions which use the :const:`METH_OLDARGS` parameter
   parsing method. This is not recommended for use in parameter parsing in
   new code, and most code in the standard interpreter has been modified to no
   longer use this for that purpose. It does remain a convenient way to
   decompose other tuples, however, and may continue to be used for that
   purpose.


.. c:function:: int PyArg_UnpackTuple(PyObject *args, const char *name, Py_ssize_t min, Py_ssize_t max, ...)

   A simpler form of parameter retrieval which does not use a format string to
   specify the types of the arguments. Functions which use this method to
   retrieve their parameters should be declared as :const:`METH_VARARGS` in
   function or method tables. The tuple containing the actual parameters
   should be passed as *args*; it must actually be a tuple. The length of the
   tuple must be at least *min* and no more than *max*; *min* and *max* may be
   equal. Additional arguments must be passed to the function, each of which
   should be a pointer to a :c:type:`PyObject\*` variable; these will be filled
   in with the values from *args*; they will contain borrowed references. The
   variables which correspond to optional parameters not given by *args* will
   not be filled in; these should be initialized by the caller. This function
   returns true on success and false if *args* is not a tuple or contains the
   wrong number of elements; an exception will be set if there was a failure.

   This is an example of the use of this function, taken from the sources for
   the :mod:`_weakref` helper module for weak references::

      static PyObject *
      weakref_ref(PyObject *self, PyObject *args)
      {
          PyObject *object;
          PyObject *callback = NULL;
          PyObject *result = NULL;

          if (PyArg_UnpackTuple(args, "ref", 1, 2, &object, &callback)) {
              result = PyWeakref_NewRef(object, callback);
          }
          return result;
      }

   The call to :c:func:`PyArg_UnpackTuple` in this example is entirely
   equivalent to this call to :c:func:`PyArg_ParseTuple`::

      PyArg_ParseTuple(args, "O|O:ref", &object, &callback)

   .. versionadded:: 2.2

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *min* and *max*. This might
      require changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* Py_BuildValue(const char *format, ...)

   Create a new value based on a format string similar to those accepted by
   the :c:func:`PyArg_Parse\*` family of functions and a sequence of values.
   Returns the value or *NULL* in the case of an error; an exception will be
   raised if *NULL* is returned.

   :c:func:`Py_BuildValue` does not always build a tuple. It builds a tuple
   only if its format string contains two or more format units. If the format
   string is empty, it returns ``None``; if it contains exactly one format
   unit, it returns whatever object is described by that format unit. To
   force it to return a tuple of size ``0`` or one, parenthesize the format
   string.

   When memory buffers are passed as parameters to supply data to build
   objects, as for the ``s`` and ``s#`` formats, the required data is copied.
   Buffers provided by the caller are never referenced by the objects created
   by :c:func:`Py_BuildValue`. In other words, if your code invokes
   :c:func:`malloc` and passes the allocated memory to :c:func:`Py_BuildValue`,
   your code is responsible for calling :c:func:`free` for that memory once
   :c:func:`Py_BuildValue` returns.

   In the following description, the quoted form is the format unit; the entry
   in (round) parentheses is the Python object type that the format unit will
   return; and the entry in [square] brackets is the type of the C value(s) to
   be passed.

   The characters space, tab, colon and comma are ignored in format strings
   (but not within format units such as ``s#``). This can be used to make
   long format strings a tad more readable.

   ``s`` (string) [char \*]
      Convert a null-terminated C string to a Python object. If the C string
      pointer is *NULL*, ``None`` is used.

   ``s#`` (string) [char \*, int]
      Convert a C string and its length to a Python object. If the C string
      pointer is *NULL*, the length is ignored and ``None`` is returned.

   ``z`` (string or ``None``) [char \*]
      Same as ``s``.

   ``z#`` (string or ``None``) [char \*, int]
      Same as ``s#``.

   ``u`` (Unicode string) [Py_UNICODE \*]
      Convert a null-terminated buffer of Unicode (UCS-2 or UCS-4) data to a
      Python Unicode object. If the Unicode buffer pointer is *NULL*,
      ``None`` is returned.

   ``u#`` (Unicode string) [Py_UNICODE \*, int]
      Convert a Unicode (UCS-2 or UCS-4) data buffer and its length to a
      Python Unicode object. If the Unicode buffer pointer is *NULL*, the
      length is ignored and ``None`` is returned.

   ``i`` (integer) [int]
      Convert a plain C :c:type:`int` to a Python integer object.

   ``b`` (integer) [char]
      Convert a plain C :c:type:`char` to a Python integer object.

   ``h`` (integer) [short int]
      Convert a plain C :c:type:`short int` to a Python integer object.

   ``l`` (integer) [long int]
      Convert a C :c:type:`long int` to a Python integer object.

   ``B`` (integer) [unsigned char]
      Convert a C :c:type:`unsigned char` to a Python integer object.

   ``H`` (integer) [unsigned short int]
      Convert a C :c:type:`unsigned short int` to a Python integer object.

   ``I`` (integer/long) [unsigned int]
      Convert a C :c:type:`unsigned int` to a Python integer object or a Python
      long integer object, if it is larger than ``sys.maxint``.

   ``k`` (integer/long) [unsigned long]
      Convert a C :c:type:`unsigned long` to a Python integer object or a
      Python long integer object, if it is larger than ``sys.maxint``.

   ``L`` (long) [PY_LONG_LONG]
      Convert a C :c:type:`long long` to a Python long integer object. Only
      available on platforms that support :c:type:`long long`.

   ``K`` (long) [unsigned PY_LONG_LONG]
      Convert a C :c:type:`unsigned long long` to a Python long integer object.
      Only available on platforms that support :c:type:`unsigned long long`.

   ``n`` (int) [Py_ssize_t]
      Convert a C :c:type:`Py_ssize_t` to a Python integer or long integer.

      .. versionadded:: 2.5

   ``c`` (string of length 1) [char]
      Convert a C :c:type:`int` representing a character to a Python string of
      length 1.

   ``d`` (float) [double]
      Convert a C :c:type:`double` to a Python floating point number.

   ``f`` (float) [float]
      Same as ``d``.

   ``D`` (complex) [Py_complex \*]
      Convert a C :c:type:`Py_complex` structure to a Python complex number.

   ``O`` (object) [PyObject \*]
      Pass a Python object untouched (except for its reference count, which is
      incremented by one). If the object passed in is a *NULL* pointer, it is
      assumed that this was caused because the call producing the argument
      found an error and set an exception. Therefore, :c:func:`Py_BuildValue`
      will return *NULL* but won't raise an exception. If no exception has
      been raised yet, :exc:`SystemError` is set.

   ``S`` (object) [PyObject \*]
      Same as ``O``.

   ``N`` (object) [PyObject \*]
      Same as ``O``, except it doesn't increment the reference count on the
      object. Useful when the object is created by a call to an object
      constructor in the argument list.

   ``O&`` (object) [*converter*, *anything*]
      Convert *anything* to a Python object through a *converter* function.
      The function is called with *anything* (which should be compatible with
      :c:type:`void \*`) as its argument and should return a "new" Python
      object, or *NULL* if an error occurred.

   ``(items)`` (tuple) [*matching-items*]
      Convert a sequence of C values to a Python tuple with the same number of
      items.

   ``[items]`` (list) [*matching-items*]
      Convert a sequence of C values to a Python list with the same number of
      items.

   ``{items}`` (dictionary) [*matching-items*]
      Convert a sequence of C values to a Python dictionary. Each pair of
      consecutive C values adds one item to the dictionary, serving as key and
      value, respectively.

   If there is an error in the format string, the :exc:`SystemError` exception
   is set and *NULL* returned.

.. c:function:: PyObject* Py_VaBuildValue(const char *format, va_list vargs)

   Identical to :c:func:`Py_BuildValue`, except that it accepts a va_list
   rather than a variable number of arguments.
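
The container format units and the tuple rule of :c:func:`Py_BuildValue`
described above can be sketched as follows. The helper names
``build_summary`` and ``build_one_tuple`` and the dictionary keys are
hypothetical, chosen only for illustration; the colons and commas in the
format string are ignored and serve readability only::

   #define PY_SSIZE_T_CLEAN
   #include <Python.h>

   /* Hypothetical helper: build {"shape": (rows, cols), "tags": [a, b]}.
      Returns a new reference, or NULL with an exception set. */
   static PyObject *
   build_summary(int rows, int cols, const char *tag_a, const char *tag_b)
   {
       return Py_BuildValue("{s:(ii),s:[ss]}",
                            "shape", rows, cols,
                            "tags", tag_a, tag_b);
   }

   /* A single format unit yields a bare object; parenthesizing the
      format string forces a tuple of size one instead. */
   static PyObject *
   build_one_tuple(int value)
   {
       return Py_BuildValue("(i)", value);
   }

Because the string data is copied, the caller keeps ownership of *tag_a* and
*tag_b* and may release them once the call returns.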