"""Filename globbing utility."""

import contextlib
import os
import re
import fnmatch
import itertools
import stat
import sys

__all__ = ["glob", "iglob", "escape"]

def glob(pathname, *, root_dir=None, dir_fd=None, recursive=False):
    """Return a list of paths matching a pathname pattern.

    The pattern may contain simple shell-style wildcards a la
    fnmatch. However, unlike fnmatch, filenames starting with a
    dot are special cases that are not matched by '*' and '?'
    patterns.

    If recursive is true, the pattern '**' will match any files and
    zero or more directories and subdirectories.
    """
    return list(iglob(pathname, root_dir=root_dir, dir_fd=dir_fd, recursive=recursive))

def iglob(pathname, *, root_dir=None, dir_fd=None, recursive=False):
    """Return an iterator which yields the paths matching a pathname pattern.

    The pattern may contain simple shell-style wildcards a la
    fnmatch. However, unlike fnmatch, filenames starting with a
    dot are special cases that are not matched by '*' and '?'
    patterns.

    If recursive is true, the pattern '**' will match any files and
    zero or more directories and subdirectories.
    """
    sys.audit("glob.glob", pathname, recursive)
    sys.audit("glob.glob/2", pathname, recursive, root_dir, dir_fd)
    if root_dir is not None:
        root_dir = os.fspath(root_dir)
    else:
        root_dir = pathname[:0]
    it = _iglob(pathname, root_dir, dir_fd, recursive, False)
    if not pathname or recursive and _isrecursive(pathname[:2]):
        try:
            s = next(it)  # skip empty string
            if s:
                it = itertools.chain((s,), it)
        except StopIteration:
            pass
    return it

def _iglob(pathname, root_dir, dir_fd, recursive, dironly):
    dirname, basename = os.path.split(pathname)
    if not has_magic(pathname):
        assert not dironly
        if basename:
            if _lexists(_join(root_dir, pathname), dir_fd):
                yield pathname
        else:
            # Patterns ending with a slash should match only directories
            if _isdir(_join(root_dir, dirname), dir_fd):
                yield pathname
        return
    if not dirname:
        if recursive and _isrecursive(basename):
            yield from _glob2(root_dir, basename, dir_fd, dironly)
        else:
            yield from _glob1(root_dir, basename, dir_fd, dironly)
        return
    # `os.path.split()` returns the argument itself as a dirname if it is a
    # drive or UNC path.  Prevent an infinite recursion if a drive or UNC path
    # contains magic characters (e.g. r'\\?\C:').
    if dirname != pathname and has_magic(dirname):
        dirs = _iglob(dirname, root_dir, dir_fd, recursive, True)
    else:
        dirs = [dirname]
    if has_magic(basename):
        if recursive and _isrecursive(basename):
            glob_in_dir = _glob2
        else:
            glob_in_dir = _glob1
    else:
        glob_in_dir = _glob0
    for dirname in dirs:
        for name in glob_in_dir(_join(root_dir, dirname), basename, dir_fd, dironly):
            yield os.path.join(dirname, name)

# These 2 helper functions non-recursively glob inside a literal directory.
# They return a list of basenames.  _glob1 accepts a pattern while _glob0
# takes a literal basename (so it only has to check for its existence).

def _glob1(dirname, pattern, dir_fd, dironly):
    names = _listdir(dirname, dir_fd, dironly)
    if not _ishidden(pattern):
        names = (x for x in names if not _ishidden(x))
    return fnmatch.filter(names, pattern)

def _glob0(dirname, basename, dir_fd, dironly):
    if basename:
        if _lexists(_join(dirname, basename), dir_fd):
            return [basename]
    else:
        # `os.path.split()` returns an empty basename for paths ending with a
        # directory separator.  'q*x/' should match only directories.
        if _isdir(dirname, dir_fd):
            return [basename]
    return []

# Following functions are not public but can be used by third-party code.

def glob0(dirname, pattern):
    return _glob0(dirname, pattern, None, False)

def glob1(dirname, pattern):
    return _glob1(dirname, pattern, None, False)

# This helper function recursively yields relative pathnames inside a literal
# directory.

def _glob2(dirname, pattern, dir_fd, dironly):
    assert _isrecursive(pattern)
    yield pattern[:0]
    yield from _rlistdir(dirname, dir_fd, dironly)

# If dironly is false, yields all file names inside a directory.
# If dironly is true, yields only directory names.
def _iterdir(dirname, dir_fd, dironly):
    try:
        fd = None
        fsencode = None
        if dir_fd is not None:
            if dirname:
                fd = arg = os.open(dirname, _dir_open_flags, dir_fd=dir_fd)
            else:
                arg = dir_fd
            if isinstance(dirname, bytes):
                fsencode = os.fsencode
        elif dirname:
            arg = dirname
        elif isinstance(dirname, bytes):
            arg = bytes(os.curdir, 'ASCII')
        else:
            arg = os.curdir
        try:
            with os.scandir(arg) as it:
                for entry in it:
                    try:
                        if not dironly or entry.is_dir():
                            if fsencode is not None:
                                yield fsencode(entry.name)
                            else:
                                yield entry.name
                    except OSError:
                        pass
        finally:
            if fd is not None:
                os.close(fd)
    except OSError:
        return

def _listdir(dirname, dir_fd, dironly):
    with contextlib.closing(_iterdir(dirname, dir_fd, dironly)) as it:
        return list(it)

# Recursively yields relative pathnames inside a literal directory.
def _rlistdir(dirname, dir_fd, dironly):
    names = _listdir(dirname, dir_fd, dironly)
    for x in names:
        if not _ishidden(x):
            yield x
            path = _join(dirname, x) if dirname else x
            for y in _rlistdir(path, dir_fd, dironly):
                yield _join(x, y)


def _lexists(pathname, dir_fd):
    # Same as os.path.lexists(), but with dir_fd
    if dir_fd is None:
        return os.path.lexists(pathname)
    try:
        os.lstat(pathname, dir_fd=dir_fd)
    except (OSError, ValueError):
        return False
    else:
        return True

def _isdir(pathname, dir_fd):
    # Same as os.path.isdir(), but with dir_fd
    if dir_fd is None:
        return os.path.isdir(pathname)
    try:
        st = os.stat(pathname, dir_fd=dir_fd)
    except (OSError, ValueError):
        return False
    else:
        return stat.S_ISDIR(st.st_mode)

def _join(dirname, basename):
    # It is common for dirname or basename to be empty
    if not dirname or not basename:
        return dirname or basename
    return os.path.join(dirname, basename)

magic_check = re.compile('([*?[])')
magic_check_bytes = re.compile(b'([*?[])')

def has_magic(s):
    if isinstance(s, bytes):
        match = magic_check_bytes.search(s)
    else:
        match = magic_check.search(s)
    return match is not None

def _ishidden(path):
    return path[0] in ('.', b'.'[0])

def _isrecursive(pattern):
    if isinstance(pattern, bytes):
        return pattern == b'**'
    else:
        return pattern == '**'

def escape(pathname):
    """Escape all special characters.
    """
    # Escaping is done by wrapping any of "*?[" between square brackets.
    # Metacharacters do not work in the drive part and shouldn't be escaped.
    drive, pathname = os.path.splitdrive(pathname)
    if isinstance(pathname, bytes):
        pathname = magic_check_bytes.sub(br'[\1]', pathname)
    else:
        pathname = magic_check.sub(r'[\1]', pathname)
    return drive + pathname


_dir_open_flags = os.O_RDONLY | getattr(os, 'O_DIRECTORY', 0)