1 :mod:`tokenize` --- Tokenizer for Python source
12 --------------
16 as well, making it useful for implementing "pretty-printers", including
17 colorizers for on-screen displays.
26 ----------------
37 The generator produces 5-tuples with these members: the token type; the
38 token string; a 2-tuple ``(srow, scol)`` of ints specifying the row and
39 column where the token begins in the source; a 2-tuple ``(erow, ecol)`` of
58 UTF-8 BOM or encoding cookie, according to :pep:`263`.
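As an illustration of the 5-tuples and of the encoding detection described above, the following minimal sketch tokenizes a one-line source held in memory. The helper name ``dump_tokens`` and the sample source are assumptions made for this example; they are not part of the module.

```python
from io import BytesIO
from token import tok_name
from tokenize import tokenize


def dump_tokens(source: bytes):
    """Return (type name, string, start, end) for every token in source.

    Hypothetical helper for illustration only.
    """
    rows = []
    # tokenize() expects a readline callable that returns bytes.
    for tok_type, tok_string, start, end, _line in tokenize(BytesIO(source).readline):
        rows.append((tok_name[tok_type], tok_string, start, end))
    return rows


rows = dump_tokens(b"x = 1\n")
for row in rows:
    print(row)
```

The first row should be the ``ENCODING`` token, here reporting ``'utf-8'`` because the sample has neither a BOM nor an encoding cookie. Rows are numbered from 1 and columns from 0.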
87 lossless and round-trips are assured. The guarantee applies only to the
96 :func:`.tokenize` needs to detect the encoding of source files it tokenizes. The
101 The :func:`detect_encoding` function is used to detect the encoding that
109 It detects the encoding from the presence of a UTF-8 BOM or an encoding
112 ``'utf-8-sig'`` will be returned as an encoding.
114 If no encoding is specified, then the default of ``'utf-8'`` will be
118 :func:`detect_encoding` to detect the file encoding.
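A short sketch of how :func:`detect_encoding` might be called directly, assuming the source is available as bytes in memory. The wrapper name ``sniff_encoding`` is hypothetical and exists only for this example.

```python
from io import BytesIO
from tokenize import detect_encoding


def sniff_encoding(source: bytes) -> str:
    """Return the encoding name reported for source (illustrative helper)."""
    # detect_encoding() takes a readline callable and returns a tuple of
    # (encoding name, list of the raw lines it had to read).
    encoding, _lines_read = detect_encoding(BytesIO(source).readline)
    return encoding


with_cookie = sniff_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n")
with_bom = sniff_encoding(b"\xef\xbb\xbfx = 1\n")
plain = sniff_encoding(b"x = 1\n")
print(with_cookie, with_bom, plain)
```

The three samples cover the cases mentioned above: an encoding cookie, a UTF-8 BOM (reported as ``'utf-8-sig'``), and neither (the ``'utf-8'`` default). The cookie name is normalized, so ``latin-1`` is reported as ``'iso-8859-1'``.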
142 Note that unclosed single-quoted strings do not cause an error to be
147 .. _tokenize-cli:
149 Command-Line Usage
150 ------------------
157 .. code-block:: sh
159 python -m tokenize [-e] [filename.py]
165 .. cmdoption:: -h, --help
169 .. cmdoption:: -e, --exact
177 ------------------
189 >>> s = 'print(+21.3e-5*-.1234/81.7)'
191 "print (+Decimal ('21.3e-5')*-Decimal ('.1234')/Decimal ('81.7'))"
194 Known cases are "e-007" (Windows) and "e-07" (not Windows). Since
196 rest of the output should be platform-independent.
199 -3.21716034272e-0...7
205 -3.217160342717258261933904529E-7
208 g = tokenize(BytesIO(s.encode('utf-8')).readline) # tokenize the string
219 return untokenize(result).decode('utf-8')
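The lines above are fragments of a function that rewrites float literals as ``Decimal`` calls. A self-contained sketch of that idea, reconstructed under assumptions, could look like the following. The function name ``wrap_floats_in_decimal`` and the loop body are illustrative guesses; only the ``tokenize`` and ``untokenize`` calls are taken from the fragments.

```python
from io import BytesIO
from tokenize import NAME, NUMBER, OP, STRING, tokenize, untokenize


def wrap_floats_in_decimal(s: str) -> str:
    """Replace each float literal in s with Decimal('<literal>').

    Illustrative reconstruction, not the original function.
    """
    result = []
    g = tokenize(BytesIO(s.encode('utf-8')).readline)  # tokenize the string
    for toknum, tokval, _start, _end, _line in g:
        if toknum == NUMBER and '.' in tokval:
            # Swap the NUMBER token for the tokens of Decimal('<literal>').
            result.extend([
                (NAME, 'Decimal'),
                (OP, '('),
                (STRING, repr(tokval)),
                (OP, ')'),
            ])
        else:
            result.append((toknum, tokval))
    # With 2-tuples, untokenize() chooses the spacing between tokens itself.
    return untokenize(result).decode('utf-8')


converted = wrap_floats_in_decimal('print(+21.3e-5*-.1234/81.7)')
print(converted)
```

Because only the token type and string are passed to ``untokenize``, the spacing of the result may differ from the input, as in the sample output shown earlier.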
232 .. code-block:: shell-session
234 $ python -m tokenize hello.py
235 0,0-0,0: ENCODING 'utf-8'
236 1,0-1,3: NAME 'def'
237 1,4-1,13: NAME 'say_hello'
238 1,13-1,14: OP '('
239 1,14-1,15: OP ')'
240 1,15-1,16: OP ':'
241 1,16-1,17: NEWLINE '\n'
242 2,0-2,4: INDENT '    '
243 2,4-2,9: NAME 'print'
244 2,9-2,10: OP '('
245 2,10-2,25: STRING '"Hello, World!"'
246 2,25-2,26: OP ')'
247 2,26-2,27: NEWLINE '\n'
248 3,0-3,1: NL '\n'
249 4,0-4,0: DEDENT ''
250 4,0-4,9: NAME 'say_hello'
251 4,9-4,10: OP '('
252 4,10-4,11: OP ')'
253 4,11-4,12: NEWLINE '\n'
254 5,0-5,0: ENDMARKER ''
256 The exact token type names can be displayed using the :option:`-e` option:
258 .. code-block:: shell-session
260 $ python -m tokenize -e hello.py
261 0,0-0,0: ENCODING 'utf-8'
262 1,0-1,3: NAME 'def'
263 1,4-1,13: NAME 'say_hello'
264 1,13-1,14: LPAR '('
265 1,14-1,15: RPAR ')'
266 1,15-1,16: COLON ':'
267 1,16-1,17: NEWLINE '\n'
268 2,0-2,4: INDENT '    '
269 2,4-2,9: NAME 'print'
270 2,9-2,10: LPAR '('
271 2,10-2,25: STRING '"Hello, World!"'
272 2,25-2,26: RPAR ')'
273 2,26-2,27: NEWLINE '\n'
274 3,0-3,1: NL '\n'
275 4,0-4,0: DEDENT ''
276 4,0-4,9: NAME 'say_hello'
277 4,9-4,10: LPAR '('
278 4,10-4,11: RPAR ')'
279 4,11-4,12: NEWLINE '\n'
280 5,0-5,0: ENDMARKER ''
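The difference between the two listings can also be seen from Python code, since each token carries both a generic ``type`` and an ``exact_type``. This is a minimal sketch: the helper name ``token_kinds`` is hypothetical, and the contents of ``hello.py`` are assumed from the positions shown in the output above.

```python
from io import BytesIO
from token import tok_name
from tokenize import tokenize

# Assumed contents of hello.py, inferred from the listings above.
HELLO_SOURCE = b'def say_hello():\n    print("Hello, World!")\n\nsay_hello()\n'


def token_kinds(source: bytes):
    """Return (generic type name, exact type name, string) for each token."""
    kinds = []
    for tok in tokenize(BytesIO(source).readline):
        # For operators, type is OP while exact_type names the operator;
        # for every other token the two values are the same.
        kinds.append((tok_name[tok.type], tok_name[tok.exact_type], tok.string))
    return kinds


kinds = token_kinds(HELLO_SOURCE)
for generic, exact, string in kinds:
    print(f"{generic:<10} {exact:<10} {string!r}")
```

Only the operator tokens should differ between the two columns, for example ``OP`` versus ``LPAR`` for ``'('``.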