tokenize.py - OpenGrok cross reference for /external/python/cpython2/Lib/lib2to3/pgen2/tokenize.py

Lines Matching +full:detect +full:- +full:newline
7 text into Python tokens.  It accepts a readline-like method which is called
9 5-tuples with these members:
13     the starting (row, column) indices of the token (a 2-tuple of ints)
14     the ending (row, column) indices of the token (a 2-tuple of ints)
28 __author__ = 'Ka-Ping Yee <ping@lfw.org>'
55 Name = r'[a-zA-Z_]\w*'
58 Hexnumber = r'0[xX][\da-fA-F]*[lL]?'
59 Octnumber = r'0[oO]?[0-7]*[lL]?'
60 Decnumber = r'[1-9]\d*[lL]?'
62 Exponent = r'[eE][-+]?\d+'
78 # Single-line ' or " string.
82 # Because of leftmost-then-longest match semantics, be sure to put the
86                  r"//=?", r"->",
87                  r"[+\-*/%&@|^=<>]=?",
157     print "%d,%d-%d,%d:\t%s\t%s" % \
166     the same interface as the readline() method of built-in file objects.
193         col_offset = col - self.prev_col
206             if tok_type in (NEWLINE, NL):
218         if toknum in (NEWLINE, NL):
232             elif toknum in (NEWLINE, NL):
235                 toks_append(indents[-1])
239 cookie_re = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)')
245     enc = orig_enc[:12].lower().replace("_", "-")
246     if enc == "utf-8" or enc.startswith("utf-8-"):
247         return "utf-8"
248     if enc in ("latin-1", "iso-8859-1", "iso-latin-1") or \
249        enc.startswith(("latin-1-", "iso-8859-1-", "iso-latin-1-")):
250         return "iso-8859-1"
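The matched lines 245-250 show most of the encoding-name normalization. A minimal, self-contained sketch of that logic follows. The function name `normal_name` and the final fall-through `return orig_enc` are assumptions, since neither the definition line nor the last line of the function appears among the matches.

```python
def normal_name(orig_enc):
    """Normalize an encoding name the way lines 245-250 do (hypothetical name)."""
    # Only the first 12 characters are inspected; underscores count as hyphens.
    enc = orig_enc[:12].lower().replace("_", "-")
    if enc == "utf-8" or enc.startswith("utf-8-"):
        return "utf-8"
    if enc in ("latin-1", "iso-8859-1", "iso-latin-1") or \
       enc.startswith(("latin-1-", "iso-8859-1-", "iso-latin-1-")):
        return "iso-8859-1"
    # Assumption: names that match neither family are returned unchanged.
    return orig_enc


print(normal_name("UTF_8"))
print(normal_name("Latin_1"))
print(normal_name("ascii"))
```

Under these assumptions, spelling variants such as `UTF_8` or `Latin_1` collapse to one canonical name before any codec lookup takes place.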
255     The detect_encoding() function is used to detect the encoding that should
263     It detects the encoding from the presence of a utf-8 bom or an encoding
264     cookie as specified in pep-0263. If both a bom and a cookie are present, but
266     charset, raise a SyntaxError.  Note that if a utf-8 bom is found,
267     'utf-8-sig' is returned.
269     If no encoding is specified, then the default of 'utf-8' will be returned.
273     default = 'utf-8'
296             if codec.name != 'utf-8':
298                 raise SyntaxError('encoding problem: utf-8')
299             encoding += '-sig'
306         default = 'utf-8-sig'
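The docstring fragments on lines 255-269 describe three outcomes: a default of 'utf-8', a name taken from a PEP 263 cookie, and 'utf-8-sig' when a BOM is present. The sketch below exercises those cases. It uses the `tokenize` module from the Python 3 standard library as a stand-in, on the assumption that its `detect_encoding()` behaves like the lib2to3 copy listed here. The helper name `sniff` is hypothetical.

```python
import io
import tokenize  # Python 3 standard library; assumed to mirror the lib2to3 copy


def sniff(source_bytes):
    """Return only the encoding name reported by detect_encoding()."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


print(sniff(b"x = 1\n"))                             # no BOM, no cookie
print(sniff(b"# -*- coding: latin-1 -*-\nx = 1\n"))  # cookie only
print(sniff(b"\xef\xbb\xbfx = 1\n"))                 # UTF-8 BOM only

# A BOM combined with a non-UTF-8 cookie is a conflict and is rejected.
try:
    sniff(b"\xef\xbb\xbf# coding: latin-1\nx = 1\n")
except SyntaxError as exc:
    print("rejected:", exc)
```

The function takes a `readline` callable that returns bytes, so an in-memory `io.BytesIO` is enough to try it without touching the filesystem.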
333     Round-trip invariant for full input:
336     Round-trip invariant for limited input:
351     readline() method of built-in file objects. Each call to the function
356     The generator produces 5-tuples with these members: the token type; the
357     token string; a 2-tuple (srow, scol) of ints specifying the row and
358     column where the token begins in the source; a 2-tuple (erow, ecol) of
379                 raise TokenError, ("EOF in multi-line string", strstart)
387             elif needcont and line[-2:] != '\\\n' and line[-3:] != '\\\r\n':
422             if column > indents[-1]:           # count indents or dedents
425             while column < indents[-1]:
430                 indents = indents[:-1]
435                 raise TokenError, ("EOF in multi-line statement", (lnum, 0))
449                     newline = NEWLINE
451                         newline = NL
452                     yield (newline, token, spos, epos, line)
471                     if token[-1] == '\n':                  # continued string
488                     elif initial in ')]}': parenlev = parenlev - 1
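The generator described on lines 351-358 yields 5-tuples, and lines 449-452 choose between NEWLINE and NL for a line ending. The following sketch prints the token stream for a two-line input. It relies on `tokenize.generate_tokens` from the Python 3 standard library, on the assumption that it follows the same 5-tuple layout as the lib2to3 version shown in this listing. The helper name `token_rows` is hypothetical.

```python
import io
import tokenize  # Python 3 standard library, used here as a stand-in


def token_rows(source):
    """Return (type name, string, start, end) for each token in source."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok_type], string, start, end)
        for tok_type, string, start, end, _line
        in tokenize.generate_tokens(readline)
    ]


rows = token_rows("# note\nx = 1\n")
for row in rows:
    print(row)
```

In this input the comment-only first line ends with an NL token, while the line holding the statement `x = 1` ends with NEWLINE. Start and end positions are (row, column) pairs with rows counted from 1 and columns from 0.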