tokenize.rst - OpenGrok cross reference for /external/python/cpython3/Doc/library/tokenize.rst

1 :mod:`tokenize` --- Tokenizer for Python source
12 --------------
16 as well, making it useful for implementing "pretty-printers", including
17 colorizers for on-screen displays.
26 ----------------
37    The generator produces 5-tuples with these members: the token type; the
38    token string; a 2-tuple ``(srow, scol)`` of ints specifying the row and
39    column where the token begins in the source; a 2-tuple ``(erow, ecol)`` of
58    UTF-8 BOM or encoding cookie, according to :pep:`263`.
87     lossless and round-trips are assured.  The guarantee applies only to the
96 :func:`.tokenize` needs to detect the encoding of source files it tokenizes. The
101     The :func:`detect_encoding` function is used to detect the encoding that
109     It detects the encoding from the presence of a UTF-8 BOM or an encoding
112     ``'utf-8-sig'`` will be returned as an encoding.
114     If no encoding is specified, then the default of ``'utf-8'`` will be
118     :func:`detect_encoding` to detect the file encoding.
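
The encoding rules listed above (UTF-8 BOM, :pep:`263` cookie, ``'utf-8'`` default) can be checked with a small sketch. The byte strings and variable names below are made up for illustration; only ``detect_encoding`` itself comes from the module, and it returns the encoding name together with the raw lines it had to read.

```python
import io
from tokenize import detect_encoding

# Hypothetical inputs: one with a UTF-8 BOM, one with a PEP 263 cookie,
# and one with neither.
bom_source = b"\xef\xbb\xbfx = 1\n"
cookie_source = b"# -*- coding: latin-1 -*-\nx = 1\n"
plain_source = b"x = 1\n"

# detect_encoding() takes a readline callable that yields bytes.
bom_encoding, bom_lines = detect_encoding(io.BytesIO(bom_source).readline)
cookie_encoding, cookie_lines = detect_encoding(io.BytesIO(cookie_source).readline)
plain_encoding, plain_lines = detect_encoding(io.BytesIO(plain_source).readline)

print(bom_encoding)     # BOM present
print(cookie_encoding)  # cookie present (the name is normalized)
print(plain_encoding)   # neither present, so the default applies
```

The function reads at most two lines, so the returned list lets a caller replay whatever was already consumed from the stream.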
142 Note that unclosed single-quoted strings do not cause an error to be
147 .. _tokenize-cli:
149 Command-Line Usage
150 ------------------
157 .. code-block:: sh
159    python -m tokenize [-e] [filename.py]
165 .. cmdoption:: -h, --help
169 .. cmdoption:: -e, --exact
177 ------------------
189         >>> s = 'print(+21.3e-5*-.1234/81.7)'
191         "print (+Decimal ('21.3e-5')*-Decimal ('.1234')/Decimal ('81.7'))"
194         Known cases are "e-007" (Windows) and "e-07" (not Windows).  Since
196         rest of the output should be platform-independent.
199         -3.21716034272e-0...7
205         -3.217160342717258261933904529E-7
208         g = tokenize(BytesIO(s.encode('utf-8')).readline)  # tokenize the string
219         return untokenize(result).decode('utf-8')
.. code-block:: shell-session

    $ python -m tokenize hello.py
    0,0-0,0:            ENCODING       'utf-8'
    1,0-1,3:            NAME           'def'
    1,4-1,13:           NAME           'say_hello'
    1,13-1,14:          OP             '('
    1,14-1,15:          OP             ')'
    1,15-1,16:          OP             ':'
    1,16-1,17:          NEWLINE        '\n'
    2,0-2,4:            INDENT         '    '
    2,4-2,9:            NAME           'print'
    2,9-2,10:           OP             '('
    2,10-2,25:          STRING         '"Hello, World!"'
    2,25-2,26:          OP             ')'
    2,26-2,27:          NEWLINE        '\n'
    3,0-3,1:            NL             '\n'
    4,0-4,0:            DEDENT         ''
    4,0-4,9:            NAME           'say_hello'
    4,9-4,10:           OP             '('
    4,10-4,11:          OP             ')'
    4,11-4,12:          NEWLINE        '\n'
    5,0-5,0:            ENDMARKER      ''

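
A listing of this shape can also be produced from Python. The sketch below is not the command-line implementation: the ``source`` bytes are an assumption, reconstructed from the tokens in the listing rather than copied from ``hello.py``, and the column widths are chosen only to resemble the output above. Each token is a 5-tuple whose ``start`` and ``end`` members are the ``(srow, scol)`` and ``(erow, ecol)`` pairs.

```python
import io
import tokenize

# Assumed contents of hello.py, inferred from the token listing.
source = b'def say_hello():\n    print("Hello, World!")\n\nsay_hello()\n'

# tokenize.tokenize() expects a readline callable that returns bytes.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

for tok in tokens:
    (srow, scol), (erow, ecol) = tok.start, tok.end
    position = f"{srow},{scol}-{erow},{ecol}:"
    # tok_name maps the numeric token type to its name (e.g. NAME, OP).
    print(f"{position:20}{tokenize.tok_name[tok.type]:15}{tok.string!r}")
```
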
The exact token type names can be displayed using the :option:`-e` option:

.. code-block:: shell-session

    $ python -m tokenize -e hello.py
    0,0-0,0:            ENCODING       'utf-8'
    1,0-1,3:            NAME           'def'
    1,4-1,13:           NAME           'say_hello'
    1,13-1,14:          LPAR           '('
    1,14-1,15:          RPAR           ')'
    1,15-1,16:          COLON          ':'
    1,16-1,17:          NEWLINE        '\n'
    2,0-2,4:            INDENT         '    '
    2,4-2,9:            NAME           'print'
    2,9-2,10:           LPAR           '('
    2,10-2,25:          STRING         '"Hello, World!"'
    2,25-2,26:          RPAR           ')'
    2,26-2,27:          NEWLINE        '\n'
    3,0-3,1:            NL             '\n'
    4,0-4,0:            DEDENT         ''
    4,0-4,9:            NAME           'say_hello'
    4,9-4,10:           LPAR           '('
    4,10-4,11:          RPAR           ')'
    4,11-4,12:          NEWLINE        '\n'
    5,0-5,0:            ENDMARKER      ''
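
The difference between the generic ``OP`` type and the exact names shown with ``-e`` is also visible from Python. This is a minimal sketch that assumes the ``exact_type`` attribute of the named tuples yielded by ``tokenize.tokenize()``, which the lines above do not mention. The one-line ``source`` and the ``ops`` list are illustrative names only.

```python
import io
import tokenize

# Hypothetical one-line input containing two operator tokens.
source = b"say_hello()\n"

ops = []
for tok in tokenize.tokenize(io.BytesIO(source).readline):
    if tok.type == tokenize.OP:
        generic_name = tokenize.tok_name[tok.type]      # always 'OP' here
        exact_name = tokenize.tok_name[tok.exact_type]  # e.g. 'LPAR'
        ops.append((tok.string, generic_name, exact_name))

for string, generic_name, exact_name in ops:
    print(f"{string!r}: {generic_name} -> {exact_name}")
```

For non-operator tokens ``exact_type`` is the same as ``type``, so filtering on ``OP`` only keeps the output short.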