
:mod:`stringprep` --- Internet String Preparation
=================================================

.. module:: stringprep
   :synopsis: String preparation, as per RFC 3454
.. moduleauthor:: Martin v. Löwis <martin@v.loewis.de>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>


.. versionadded:: 2.3

When identifying things (such as host names) on the Internet, it is often
necessary to compare such identifications for "equality". Exactly how this
comparison is executed may depend on the application domain, e.g. whether it
should be case-insensitive or not. It may also be necessary to restrict the
possible identifications, to allow only identifications consisting of
"printable" characters.

:rfc:`3454` defines a procedure for "preparing" Unicode strings in internet
protocols. Before passing strings onto the wire, they are processed with the
preparation procedure, after which they have a certain normalized form. The RFC
defines a set of tables, which can be combined into profiles. Each profile must
define which tables it uses, and what other optional parts of the ``stringprep``
procedure are part of the profile. One example of a ``stringprep`` profile is
``nameprep``, which is used for internationalized domain names.

The module :mod:`stringprep` only exposes the tables from :rfc:`3454`. Because
these tables would be very large if represented as dictionaries or lists, the
module uses the Unicode character database internally. The module source code
itself was generated using the ``mkstringprep.py`` utility.

As a result, these tables are exposed as functions, not as data structures.
There are two kinds of tables in the RFC: sets and mappings. For a set,
:mod:`stringprep` provides the "characteristic function", i.e. a function that
returns true if the parameter is part of the set. For mappings, it provides the
mapping function: given the key, it returns the associated value. Below is a
list of all functions available in the module.


.. function:: in_table_a1(code)

   Determine whether *code* is in table A.1 (Unassigned code points in Unicode 3.2).


.. function:: in_table_b1(code)

   Determine whether *code* is in table B.1 (Commonly mapped to nothing).


.. function:: map_table_b2(code)

   Return the mapped value for *code* according to table B.2 (Mapping for
   case-folding used with NFKC).


.. function:: map_table_b3(code)

   Return the mapped value for *code* according to table B.3 (Mapping for
   case-folding used with no normalization).


.. function:: in_table_c11(code)

   Determine whether *code* is in table C.1.1 (ASCII space characters).


.. function:: in_table_c12(code)

   Determine whether *code* is in table C.1.2 (Non-ASCII space characters).


.. function:: in_table_c11_c12(code)

   Determine whether *code* is in table C.1 (Space characters, union of C.1.1 and
   C.1.2).


.. function:: in_table_c21(code)

   Determine whether *code* is in table C.2.1 (ASCII control characters).


.. function:: in_table_c22(code)

   Determine whether *code* is in table C.2.2 (Non-ASCII control characters).


.. function:: in_table_c21_c22(code)

   Determine whether *code* is in table C.2 (Control characters, union of C.2.1 and
   C.2.2).


.. function:: in_table_c3(code)

   Determine whether *code* is in table C.3 (Private use).


.. function:: in_table_c4(code)

   Determine whether *code* is in table C.4 (Non-character code points).


.. function:: in_table_c5(code)

   Determine whether *code* is in table C.5 (Surrogate codes).


.. function:: in_table_c6(code)

   Determine whether *code* is in table C.6 (Inappropriate for plain text).


.. function:: in_table_c7(code)

   Determine whether *code* is in table C.7 (Inappropriate for canonical
   representation).


.. function:: in_table_c8(code)

   Determine whether *code* is in table C.8 (Change display properties or are
   deprecated).


.. function:: in_table_c9(code)

   Determine whether *code* is in table C.9 (Tagging characters).


.. function:: in_table_d1(code)

   Determine whether *code* is in table D.1 (Characters with bidirectional property
   "R" or "AL").


.. function:: in_table_d2(code)

   Determine whether *code* is in table D.2 (Characters with bidirectional property
   "L").

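
The following is a minimal sketch of how the set and mapping functions might be
combined into a profile-like preparation step. It assumes that each function is
called with a single-character string. The helper name ``prepare_sketch`` and
its choice of tables are hypothetical and are not part of the module; the sketch
omits the bidirectional checks based on tables D.1 and D.2 and should not be
read as a complete ``nameprep`` implementation.

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 data, the version RFC 3454 builds on


def prepare_sketch(text):
    """Hypothetical helper: map, normalize, then reject prohibited characters."""
    # Step 1 (map): drop characters from table B.1 and case-fold with table B.2.
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            continue  # "commonly mapped to nothing"
        mapped.append(stringprep.map_table_b2(ch))

    # Step 2 (normalize): apply NFKC using the Unicode 3.2 database.
    normalized = ucd_3_2_0.normalize("NFKC", "".join(mapped))

    # Step 3 (prohibit): this particular selection of tables is an assumption
    # made for illustration; a real profile states which tables it prohibits.
    prohibited_checks = (
        stringprep.in_table_a1,
        stringprep.in_table_c12,
        stringprep.in_table_c22,
        stringprep.in_table_c3,
        stringprep.in_table_c4,
        stringprep.in_table_c5,
    )
    for ch in normalized:
        if any(check(ch) for check in prohibited_checks):
            raise ValueError("prohibited code point U+%04X" % ord(ch))
    return normalized


if __name__ == "__main__":
    # A set is queried through its characteristic function.
    print(stringprep.in_table_c11(" "))
    # A mapping is queried through its mapping function.
    print(stringprep.map_table_b3("\u00df"))
    # The soft hyphen (U+00AD) is in table B.1 and is dropped.
    print(prepare_sketch("Exa\u00admple"))
```

Because the tables are exposed as functions rather than data structures, a
profile of this kind reduces to a sequence of per-character function calls.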