:mod:`stringprep` --- Internet String Preparation
=================================================

.. module:: stringprep
   :synopsis: String preparation, as per RFC 3454

.. moduleauthor:: Martin v. Löwis <martin@v.loewis.de>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/stringprep.py`

--------------

When identifying things (such as host names) on the internet, it is often
necessary to compare such identifications for "equality". Exactly how this
comparison is performed may depend on the application domain, e.g. whether it
should be case-insensitive or not. It may also be necessary to restrict the
possible identifications, for example to allow only identifications consisting
of "printable" characters.

:rfc:`3454` defines a procedure for "preparing" Unicode strings in internet
protocols. Before strings are passed onto the wire, they are processed with the
preparation procedure, after which they have a certain normalized form. The RFC
defines a set of tables, which can be combined into profiles. Each profile must
define which tables it uses, and what other optional parts of the ``stringprep``
procedure are part of the profile. One example of a ``stringprep`` profile is
``nameprep``, which is used for internationalized domain names.

The module :mod:`stringprep` only exposes the tables from :rfc:`3454`. Because
these tables would be very large if represented as dictionaries or lists, the
module uses the Unicode character database internally. The module source code
itself was generated using the ``mkstringprep.py`` utility.

As a result, these tables are exposed as functions, not as data structures.
There are two kinds of tables in the RFC: sets and mappings. For a set,
:mod:`stringprep` provides the "characteristic function", i.e. a function that
returns ``True`` if the parameter is part of the set. For mappings, it provides
the mapping function: given the key, it returns the associated value. Below is
a list of all functions available in the module.


.. function:: in_table_a1(code)

   Determine whether *code* is in table A.1 (Unassigned code points in Unicode
   3.2).


.. function:: in_table_b1(code)

   Determine whether *code* is in table B.1 (Commonly mapped to nothing).


.. function:: map_table_b2(code)

   Return the mapped value for *code* according to table B.2 (Mapping for
   case-folding used with NFKC).


.. function:: map_table_b3(code)

   Return the mapped value for *code* according to table B.3 (Mapping for
   case-folding used with no normalization).


.. function:: in_table_c11(code)

   Determine whether *code* is in table C.1.1 (ASCII space characters).


.. function:: in_table_c12(code)

   Determine whether *code* is in table C.1.2 (Non-ASCII space characters).


.. function:: in_table_c11_c12(code)

   Determine whether *code* is in table C.1 (Space characters, union of C.1.1
   and C.1.2).


.. function:: in_table_c21(code)

   Determine whether *code* is in table C.2.1 (ASCII control characters).


.. function:: in_table_c22(code)

   Determine whether *code* is in table C.2.2 (Non-ASCII control characters).


.. function:: in_table_c21_c22(code)

   Determine whether *code* is in table C.2 (Control characters, union of C.2.1
   and C.2.2).


.. function:: in_table_c3(code)

   Determine whether *code* is in table C.3 (Private use).


.. function:: in_table_c4(code)

   Determine whether *code* is in table C.4 (Non-character code points).


.. function:: in_table_c5(code)

   Determine whether *code* is in table C.5 (Surrogate codes).


.. function:: in_table_c6(code)

   Determine whether *code* is in table C.6 (Inappropriate for plain text).


.. function:: in_table_c7(code)

   Determine whether *code* is in table C.7 (Inappropriate for canonical
   representation).


.. function:: in_table_c8(code)

   Determine whether *code* is in table C.8 (Change display properties or are
   deprecated).


.. function:: in_table_c9(code)

   Determine whether *code* is in table C.9 (Tagging characters).


.. function:: in_table_d1(code)

   Determine whether *code* is in table D.1 (Characters with bidirectional
   property "R" or "AL").


.. function:: in_table_d2(code)

   Determine whether *code* is in table D.2 (Characters with bidirectional
   property "L").

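
As an illustration of how the table functions might be combined, the following
is a minimal sketch of a simplified, nameprep-like preparation step. The helper
name ``prepare`` and its choice of prohibited tables are assumptions made for
this example and are not part of the module. The sketch normalizes with the
current Unicode database rather than Unicode 3.2, and it omits the
bidirectional checks (tables D.1 and D.2) that a complete profile would
perform. Each table function is called with a single-character string.

```python
import stringprep
import unicodedata


def prepare(label):
    """Hypothetical helper: a simplified, nameprep-like pass over *label*."""
    # Map step: drop characters listed in table B.1 ("commonly mapped to
    # nothing") and case-fold every remaining character with table B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)
    )

    # Normalize step: NFKC. Assumption: the current Unicode database is
    # used here instead of the Unicode 3.2 data that RFC 3454 refers to.
    normalized = unicodedata.normalize("NFKC", mapped)

    # Prohibit step: reject characters found in a selection of the C tables.
    # Which tables apply is a decision of the profile; this selection is
    # only an example.
    for ch in normalized:
        if (
            stringprep.in_table_c12(ch)      # non-ASCII space characters
            or stringprep.in_table_c21_c22(ch)  # control characters
            or stringprep.in_table_c3(ch)    # private use
            or stringprep.in_table_c4(ch)    # non-character code points
            or stringprep.in_table_c5(ch)    # surrogate codes
        ):
            raise ValueError(f"prohibited character U+{ord(ch):04X}")

    return normalized


print(prepare("Stra\u00dfe"))  # strasse
```

In this sketch, a soft hyphen (U+00AD, listed in table B.1) is silently
removed, while a private-use character such as U+E000 causes ``ValueError`` to
be raised.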