Lines Matching full:that

20 exploration is needed to discover, is that it is complex.  There are
21 many rules, special cases, and implementation alternatives that all
24 tool that we will make extensive use of is "divide and conquer". For
39 of elements: "slashes" that are sequences of one or more "`/`"
40 characters, and "components" that are sequences of one or more
41 non-"`/`" characters. These form two kinds of paths. Those that
50 component, but that isn't always accurate: a pathname can lack both
60 it must identify a directory that already exists, otherwise an error
66 pathname that is just slashes have a final component. If it does
73 tempting to consider that to have an empty final component. In many
74 ways that would lead to correct results, but not always. In
79 > A pathname that contains at least one non- <slash> character and
80 > that ends with one or more trailing <slash> characters shall not
83 > directory entry that is to be created for a directory immediately
89 checking that the trailing slash is not used where it isn't
94 changes that affect that lookup. One fairly extreme case is that if
96 "a/b/..", that process might successfully resolve on "a/c".
100 "dcache" and an understanding of that is central to understanding
110 contains further information about the object in that parent with
111 the given name. The inode pointer can be `NULL` indicating that the
113 dentry of a directory to the dentries of the children, that linkage is
117 that will be particularly relevant is that it is closely integrated
118 with the mount table that records which filesystem is mounted where.
125 Some filesystems ensure that the information in the dcache is always
128 without checking with the filesystem, and means that the VFS can
132 Other filesystems don't provide that guarantee because they cannot.
133 These are typically filesystems that are shared across a network,
148 you ignore all the places that only run when "`LOOKUP_RCU`"
165 reference count. The special-sauce of this primitive is that the
169 Holding a reference on a dentry ensures that the dentry won't suddenly
195 `d_lock` is a synonym for the spinlock that is part of `d_lockref` above.
202 each candidate dentry that it finds in the hash table and then checks
203 that the parent and name are correct. So it doesn't lock the parent
217 accessing that slot in a hash table, and searching the linked list
218 that is found there.
223 happened to be looking at a dentry that was moved in this way,
229 `rename_lock` is a seqlock that is updated whenever any dentry is
230 renamed. If `d_lookup` finds that a rename happened while it
236 `i_mutex` is a mutex that serializes all changes to a particular
237 directory. This ensures that, for example, an `unlink()` and a `rename()`
239 stable while the filesystem is asked to look up a name that is not
242 This has a complementary role to that of `d_lock`: `i_mutex` on a
243 directory protects all of the names in that directory, while `d_lock`
254 falls back to `lookup_slow()` which takes `i_mutex`, checks again that
261 that the required exclusion can be achieved. How path lookup chooses
268 Per-CPU here means that incrementing the count is cheap as it only
273 `mnt_count` doesn't ensure that the mount remains in the namespace and,
275 does, however, ensure that the `mount` data structure remains coherent,
286 crossing a mount point to check that the crossing was safe. That is,
287 the value in the seqlock is read, then the code finds the mount that
317 all the way back to [First Edition Unix] - of the function that
334 that is the "next" component in the pathname.
347 filesystem. Often that reference won't be needed, so this field is
349 is requested. Keeping a reference in the `nameidata` ensures that
361 escape that subtree. It works a bit like a local `chroot()`.
367 > Given a path (`name`) and a nameidata structure (`nd`), check that the
369 > over one component while updating `last_type` and `last`. If that
377 filesystem to revalidate the result if it is that sort of filesystem.
378 If that doesn't get a good result, it calls "`lookup_slow()`" which
392 seem obvious, but is worth pointing out so that we will recognize its
400 not call `walk_component()` that last time. Handling that final
422 implementation of `lookup_slow()` which skips that step. This is
423 important when unmounting a filesystem that is inaccessible, such as
435 the possibility that the final component is not `LAST_NORM`. If the
439 won't try to create that name. They also check for trailing slashes
450 On filesystems that require it, the lookup routines will call the
451 `->d_revalidate()` dentry method to ensure that the cached information
453 from a server. In some cases it may find that there has been change
454 further up the path and that something that was thought to be valid
461 lookup a name can trigger changes to how that lookup should be
470 to three different flags that might be set in `dentry->d_flags`:
474 If this flag has been set, then the filesystem has requested that the
479 unmounted, the `d_manage()` function will usually wait for that
486 processing. That server process can identify itself to the `autofs`
492 This flag is set on every dentry that is mounted on. As Linux
493 supports multiple filesystem namespaces, it is possible that the
510 report that there was an error, that there was nothing to mount, or
516 There is no new locking of import here and it is important that no
530 We noted that REF-walk is complex because there are numerous details
542 thread from changing the data structures that a given thread is
545 same time, this can be very costly. Even when using locks that permit
548 goal when reading a shared data structure that no other process is
558 other parts it is important that RCU-walk can quickly fall back to
565 notices that something has changed or is changing, or if something
570 `vfsmount` and `dentry`, and ensuring that these are still valid -
571 that a path walk with REF-walk would have found the same entries.
572 This is an invariant that RCU-walk must guarantee. It can only make
573 decisions, such as selecting the next step, that are decisions which
580 This pattern of "try RCU-walk, if that fails try REF-walk" can be
588 that fails with the error `ECHILD` they are called again with no
591 `LOOKUP_RCU`) to ensure that entries found in the cache are forcibly
593 determines that they are too old to trust.
595 The `LOOKUP_RCU` attempt may drop that flag internally and switch to
597 that trip up RCU-walk are much more likely to be near the leaves and
598 so it is very unlikely that there will be much, if any, benefit from
605 `rcu_read_lock()` is held for the entire time that RCU-walk is walking
606 down a path. The particular guarantee it provides is that the key
611 is the only guarantee that RCU provides; everything else is done using
623 To preserve the invariant mentioned above (that RCU-walk may only make
624 decisions that REF-walk could have made), it must make the checks at
625 or near the same places that REF-walk holds the references. So, when
632 However, there is a little bit more to seqlocks than that. If
637 use `read_seqcount_retry()` to validate that copy.
640 imposes a memory barrier so that no memory-read instruction from
652 sufficient to catch any problem that could occur at this point.
654 With that little refresher on seqlocks out of the way we can look at
660 ensure that crossing a mount point is performed safely. RCU-walk uses
661 it for that too, but for quite a bit more.
670 that any "mount" or "unmount" happens.
680 If RCU-walk finds that `mount_lock` hasn't changed then it can be sure
681 that, had REF-walk taken counted references on each vfsmount, the
702 check if we have landed on a mount point and, if so, must find that
705 starting point of the path lookup was in part of the filesystem that
716 `lookup_fast()` is the only lookup routine that is used in RCU-mode,
718 `lookup_fast()` that we find the important "hand over hand" tracking
728 getting a counted reference to the new dentry before dropping that for
733 A mutex is a fairly heavyweight lock that can only be taken when it is
736 take `i_mutex` and modifies the directory in a way that RCU-walk needs
737 to notice, the result will be either that RCU-walk fails to find the
738 dentry that it is looking for, or it will find a dentry which
746 something that actually is there. When RCU-walk fails to find
755 That "dropping down to REF-walk" typically involves a call to
768 Other reasons for dropping out of RCU-walk that do not trigger a call
769 to `unlazy_walk()` are when some inconsistency is found that cannot be
776 takes a reference on each of the pointers that it holds (vfsmount,
777 dentry, and possibly some symbolic links) and then verifies that the
783 incrementing a counter. That works to take a second reference if you
792 `mount_lock` is then used to validate the reference. If that
793 validation fails, it may *not* be safe to just drop that reference in
796 finds that the reference it got might not be safe, checks the
813 In this case an extra "`MAY_NOT_BLOCK`" flag is passed so that it
837 the big picture, there are a couple of related patterns that are worth
840 The first is "try quickly and check, if that fails try slowly". We
841 can see that in the high-level approach of first trying RCU-walk and
847 The second pattern is "try quickly and check, if that fails try
854 "try quickly _and carefully,_ then check". The fact that checking is
855 needed is a reminder that the system is dynamic and only a limited
864 There are several basic issues that we will examine to understand the
874 There are only two sorts of filesystem objects that can usefully
882 a component name refers to a symbolic link, then that component is
883 replaced by the body of the link and, if that body starts with a '/',
920 further limit of eight on the maximum depth of recursion, but that was
924 The `nameidata` structure that we met in an earlier article contains a
925 small stack that can be used to store the remaining part of up to two
928 lookup will never exceed that stack as, once the 40th symlink is
931 It might seem that the name remnants are all that needs to be stored on
932 this stack, but we need a bit more. To see that, we need to move on to
941 able to find and temporarily hold onto these cached entries, so that
953 pathname in a symlink can be seen as the content of that symlink and
957 that the filesystem will allocate some temporary memory and copy or
958 construct the symlink content into that memory whenever it is needed.
962 on the dentry. This means that the mechanisms that pathname lookup
970 on an inode does not imply any reference on cached pages of that
971 inode, and even an `rcu_read_lock()` is not sufficient to ensure that
974 significantly, needs to release that reference when it is finished
979 but that isn't necessarily a big cost and it is better than dropping
980 out of RCU-walk mode completely. Even filesystems that allocate
989 RCU-walk mode as the rewrite is not quite complete. It is likely that
993 looked at previously, `->follow_link()` would need to be careful that
997 code is ready to release the reference when that does happen.
1000 complexity. It requires a reference to the inode so that the
1001 `i_op->put_link()` inode operation can be called. In REF-walk, that
1006 we also need the seq number for the dentry so we can confirm that
1010 provides an opaque "cookie" that must be passed to `->put_link()` so that it
1022 - the `cookie` that tells `->put_link()` what to put.
1024 This means that each entry in the symlink stack needs to hold five
1031 Note that, in a given stack frame, the path remnant (`name`) is not
1032 part of the symlink that the other fields refer to. It is the remnant
1033 to be followed once that symlink has been fully parsed.
1041 symlink, or is restored from the stack, so that much of the loop
1049 called; it then gets the link from the filesystem. Providing that
1059 the symlink-just-found to avoid leaving empty path remnants that would
1064 `walk_component()` is also the last piece of code that needs to look at the
1065 old symlink as it walks that last component. So it is quite
1086 so `NULL` is returned to indicate that the symlink can be released and
1089 The other case involves things in `/proc` that look like symlinks but
1096 something that looks like a symlink. It is really a reference to the
1098 objects you get a name that might refer to the same file - unless it
1102 `nameidata` in place to point to that target. `->follow_link()` then
1113 For some callers, this is all they need; they want to create that
1116 apply special handling to the last component of that symlink, rather
1119 successive symlinks until one is found that doesn't point to another
1123 `path_lookupat()` using a loop that calls `link_path_walk()`, and then
1125 that needs to be followed, then `trailing_symlink()` is called to set
1130 The various functions that examine the final component and possibly
1131 report that it is a symlink are `lookup_last()`, `mountpoint_last()`
1133 `walk_component()` of returning `1` if a symlink was found that needs
1157 If that doesn't work, only then is the lookup restarted from the top.
1168 so does `do_last()` so that `trailing_symlink()` gets called and the
1169 open process continues on the symlink that was found.
1174 We previously said of RCU-walk that it would "take no locks, increment
1175 no counts, leave no footprints." We have since seen that some
1181 footprints in a way that doesn't affect directories is in updating access times.
1188 update the atime on that symlink.
1193 subject. The [clearest statement] is that, if a particular implementation
1195 documented "except that any changes caused by pathname resolution need
1196 not be documented". This seems to imply that POSIX doesn't really
1201 An examination of history shows that prior to [Linux 1.3.87], the ext2
1203 Unfortunately we have no record of why that behavior was changed.
1205 In any case, access time must now be updated and that operation can be
1210 limits the updates of `atime` to once per day on files that aren't
1225 the various flags that can be stored in the `nameidata` to guide the
1239 `LOOKUP_PARENT` indicates that the final component hasn't been reached
1243 `LOOKUP_ROOT` indicates that the `root` field in the `nameidata` was
1247 `LOOKUP_JUMPED` means that the current dentry was chosen not because
1260 considered. Others are only checked for when considering that final
1263 `LOOKUP_AUTOMOUNT` ensures that, if the final component is an automount
1274 `WALK_GET` that we already met, but it is used in a different way.
1276 `LOOKUP_DIRECTORY` insists that the final component is a directory.
1284 if it knows that it will be asked to open or create the file soon.
1293 than even a couple of releases ago. But that doesn't mean it is
1295 symlinks that are stored in the inode so, while it handles many ext4
1296 symlinks, it doesn't help with NFS, XFS, or Btrfs. That support