<head>
<style> p { max-width:50em} ol, ul {max-width: 40em}</style>
</head>

autofs - how it works
=====================

Purpose
-------

The goal of autofs is to provide on-demand mounting and race free
automatic unmounting of various other filesystems.  This provides two
key advantages:

1. There is no need to delay boot until all filesystems that
   might be needed are mounted.  Processes that try to access those
   slow filesystems might be delayed but other processes can
   continue freely.  This is particularly important for
   network filesystems (e.g. NFS) or filesystems stored on
   media with a media-changing robot.

2. The names and locations of filesystems can be stored in
   a remote database and can change at any time.  The content
   in that database at the time of access will be used to provide
   a target for the access.  The interpretation of names in the
   filesystem can even be programmatic rather than database-backed,
   allowing wildcards for example, and can vary based on the user who
   first accessed a name.

Context
-------

The "autofs" filesystem module is only one part of an autofs system.
There also needs to be a user-space program which looks up names
and mounts filesystems.  This will often be the "automount" program,
though other tools including "systemd" can make use of "autofs".
This document describes only the kernel module and the interactions
required with any user-space program.  Subsequent text refers to this
as the "automount daemon" or simply "the daemon".

"autofs" is a Linux kernel module which provides the "autofs"
filesystem type.  Several "autofs" filesystems can be mounted and they
can each be managed separately, or all managed by the same daemon.

Content
-------

An autofs filesystem can contain 3 sorts of objects: directories,
symbolic links and mount traps.  Mount traps are directories with
extra properties as described in the next section.

Objects can only be created by the automount daemon: symlinks are
created with a regular `symlink` system call, while directories and
mount traps are created with `mkdir`.  The determination of whether a
directory should be a mount trap or not is quite _ad hoc_, largely for
historical reasons, and is determined in part by the
*direct*/*indirect*/*offset* mount options, and the *maxproto* mount option.

If neither the *direct* nor the *offset* mount option is given (so the
mount is considered to be *indirect*), then the root directory is
always a regular directory; otherwise it is a mount trap when it is
empty and a regular directory when not empty.  Note that *direct* and
*offset* are treated identically so a concise summary is that the root
directory is a mount trap only if the filesystem is mounted *direct*
and the root is empty.

Directories created in the root directory are mount traps only if the
filesystem is mounted *indirect* and they are empty.

Directories further down the tree depend on the *maxproto* mount
option and particularly whether it is less than five or not.
When *maxproto* is five, no directories further down the
tree are ever mount traps; they are always regular directories.  When
the *maxproto* is four (or three), these directories are mount traps
precisely when they are empty.

So: non-empty (i.e. non-leaf) directories are never mount traps. Empty
directories are sometimes mount traps, and sometimes not, depending on
where in the tree they are (root, top level, or lower), the *maxproto*,
and whether the mount was *indirect* or not.

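The rules above can be condensed into a single predicate.  The following is only a minimal sketch of the decision as this section describes it, not the kernel's code; the names `dir_level` and `is_mount_trap` are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: this models the rules in the text, not kernel code. */
enum dir_level { LEVEL_ROOT, LEVEL_TOP, LEVEL_LOWER };

/*
 * direct_or_offset: true if *direct* or *offset* was given at mount time.
 * maxproto: the protocol limit from the mount options (3, 4 or 5 here).
 */
static bool is_mount_trap(enum dir_level level, bool is_empty,
                          bool direct_or_offset, int maxproto)
{
    assert(maxproto >= 3 && maxproto <= 5);

    if (!is_empty)
        return false;             /* non-empty directories are never traps */

    switch (level) {
    case LEVEL_ROOT:
        return direct_or_offset;  /* empty root: trap only when direct */
    case LEVEL_TOP:
        return !direct_or_offset; /* empty top-level: trap only when indirect */
    case LEVEL_LOWER:
        return maxproto < 5;      /* deeper: trap only for protocol 3 or 4 */
    }
    return false;
}
```

For example, `is_mount_trap(LEVEL_TOP, true, false, 5)` is true (an empty top-level directory of an indirect mount), while `is_mount_trap(LEVEL_LOWER, true, false, 5)` is false.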
Mount Traps
-----------

A core element of the implementation of autofs is the mount trap
mechanism provided by the Linux VFS.  Any directory provided by a
filesystem can be designated as a trap.  This involves two separate
features that work together to allow autofs to do its job.

**DCACHE_NEED_AUTOMOUNT**

If a dentry has the DCACHE_NEED_AUTOMOUNT flag set (which gets set if
the inode has S_AUTOMOUNT set, or can be set directly) then it is
(potentially) a mount trap.  Any access to this directory beyond a
"`stat`" will (normally) cause the `d_op->d_automount()` dentry operation
to be called. The task of this method is to find the filesystem that
should be mounted on the directory and to return it.  The VFS is
responsible for actually mounting the root of this filesystem on the
directory.

autofs doesn't find the filesystem itself but sends a message to the
automount daemon asking it to find and mount the filesystem.  The
autofs `d_automount` method then waits for the daemon to report that
everything is ready.  It will then return "`NULL`" indicating that the
mount has already happened.  The VFS doesn't try to mount anything but
follows down the mount that is already there.

This functionality is sufficient for some users of mount traps such
as NFS which creates traps so that mountpoints on the server can be
reflected on the client.  However it is not sufficient for autofs.  As
mounting onto a directory is considered to be "beyond a `stat`", the
automount daemon would not be able to mount a filesystem on the 'trap'
directory without some way to avoid getting caught in the trap.  For
that purpose there is another flag.

**DCACHE_MANAGE_TRANSIT**

If a dentry has DCACHE_MANAGE_TRANSIT set then two very different but
related behaviours are invoked, both using the `d_op->d_manage()`
dentry operation.

Firstly, before checking to see if any filesystem is mounted on the
directory, `d_manage()` will be called with the `rcu_walk` parameter set
to `false`.  It may return one of three things:

-  A return value of zero indicates that there is nothing special
   about this dentry and normal checks for mounts and automounts
   should proceed.

   autofs normally returns zero, but first waits for any
   expiry (automatic unmounting of the mounted filesystem) to
   complete.  This avoids races.

-  A return value of `-EISDIR` tells the VFS to ignore any mounts
   on the directory and to not consider calling `->d_automount()`.
   This effectively disables the **DCACHE_NEED_AUTOMOUNT** flag
   causing the directory not to be a mount trap after all.

   autofs returns this if it detects that the process performing the
   lookup is the automount daemon and that the mount has been
   requested but has not yet completed.  How it determines this is
   discussed later.  This allows the automount daemon not to get
   caught in the mount trap.

   There is a subtlety here.  It is possible that a second autofs
   filesystem can be mounted below the first and for both of them to
   be managed by the same daemon.  For the daemon to be able to mount
   something on the second it must be able to "walk" down past the
   first.  This means that `d_manage` cannot *always* return `-EISDIR` for
   the automount daemon.  It must only return it when a mount has
   been requested, but has not yet completed.

   `d_manage` also returns `-EISDIR` if the dentry shouldn't be a
   mount trap, either because it is a symbolic link or because it is
   not empty.

-  Any other negative value is treated as an error and returned
   to the caller.

   autofs can return

   - -ENOENT if the automount daemon failed to mount anything,
   - -ENOMEM if it ran out of memory,
   - -EINTR if a signal arrived while waiting for expiry to
     complete,
   - or any other error sent down by the automount daemon.


The second use case only occurs during an "RCU-walk" and so `rcu_walk`
will be set.

An RCU-walk is a fast and lightweight process for walking down a
filename path (i.e. it is like running on tip-toes).  RCU-walk cannot
cope with all situations so when it finds a difficulty it falls back
to "REF-walk", which is slower but more robust.

RCU-walk will never call `->d_automount`; the filesystems must already
be mounted or RCU-walk cannot handle the path.
To determine if a mount-trap is safe for RCU-walk mode it calls
`->d_manage()` with `rcu_walk` set to `true`.

In this case `d_manage()` must avoid blocking and should avoid taking
spinlocks if at all possible.  Its sole purpose is to determine if it
would be safe to follow down into any mounted directory and the only
reason that it might not be is if an expiry of the mount is
underway.

In the `rcu_walk` case, `d_manage()` cannot return `-EISDIR` to tell the
VFS that this is a directory that doesn't require `d_automount`.  If
`rcu_walk` sees a dentry with DCACHE_NEED_AUTOMOUNT set but nothing
mounted, it *will* fall back to REF-walk.  `d_manage()` cannot make the
VFS remain in RCU-walk mode, but can only tell it to get out of
RCU-walk mode by returning `-ECHILD`.

So `d_manage()`, when called with `rcu_walk` set, should return
`-ECHILD` if there is any reason to believe it is unsafe to enter the
mounted filesystem; otherwise it should return 0.

autofs will return `-ECHILD` if an expiry of the filesystem has been
initiated or is being considered, otherwise it returns 0.

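The return values described in this section can be summarised as a small user-space model.  This is only a sketch of the decision table, under the assumption that the relevant state can be reduced to four booleans; `struct trap_state` and `model_d_manage` are hypothetical names, and the real `d_manage()` may block or return other errors as listed above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state the text says autofs consults. */
struct trap_state {
    bool caller_is_daemon;       /* lookup comes from the daemon's process group */
    bool mount_pending;          /* a mount was requested but has not completed */
    bool is_symlink_or_nonempty; /* dentry should not act as a trap */
    bool expiring;               /* expiry initiated or being considered */
};

static int model_d_manage(const struct trap_state *s, bool rcu_walk)
{
    if (rcu_walk)
        /* Must not block and cannot use -EISDIR: only "safe" or "bail out". */
        return s->expiring ? -ECHILD : 0;

    /* REF-walk: the real code would first wait here for any expiry to finish. */
    if (s->caller_is_daemon && s->mount_pending)
        return -EISDIR;          /* let the daemon past its own trap */
    if (s->is_symlink_or_nonempty)
        return -EISDIR;          /* not a trap after all */
    return 0;                    /* proceed with normal mount checks */
}
```

Note that the daemon only gets `-EISDIR` while a mount is pending; at other times it receives 0 like any other process, which is what lets it walk past one autofs mount to reach another below it.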
Mountpoint expiry
-----------------

The VFS has a mechanism for automatically expiring unused mounts,
much as it can expire any unused dentry information from the dcache.
This is guided by the MNT_SHRINKABLE flag.  This only applies to
mounts that were created by `d_automount()` returning a filesystem to be
mounted.  As autofs doesn't return such a filesystem but leaves the
mounting to the automount daemon, it must involve the automount daemon
in unmounting as well.  This also means that autofs has more control
over expiry.

The VFS also supports "expiry" of mounts using the MNT_EXPIRE flag to
the `umount` system call.  Unmounting with MNT_EXPIRE will fail unless
a previous attempt had been made, and the filesystem has been inactive
and untouched since that previous attempt.  autofs does not depend on
this but has its own internal tracking of whether filesystems were
recently used.  This allows individual names in the autofs directory
to expire separately.

With version 4 of the protocol, the automount daemon can try to
unmount any filesystems mounted on the autofs filesystem or remove any
symbolic links or empty directories any time it likes.  If the unmount
or removal is successful the filesystem will be returned to the state
it was before the mount or creation, so that any access of the name
will trigger normal auto-mount processing.  In particular, `rmdir` and
`unlink` do not leave negative entries in the dcache as a normal
filesystem would, so an attempt to access a recently-removed object is
passed to autofs for handling.

With version 5, this is not safe except for unmounting from top-level
directories.  As lower-level directories are never mount traps, other
processes will see an empty directory as soon as the filesystem is
unmounted.  So it is generally safest to use the autofs expiry
protocol described below.

Normally the daemon only wants to remove entries which haven't been
used for a while.  For this purpose autofs maintains a "`last_used`"
time stamp on each directory or symlink.  For symlinks it genuinely
does record the last time the symlink was "used" or followed to find
out where it points to.  For directories the field is used slightly
differently.  The field is updated at mount time, during expire
checks if the directory is found to be in use (i.e. open file
descriptor or process working directory), and during path walks.  The
update done during path walks prevents frequent expire and immediate
mount of frequently accessed automounts.  However, when a GUI
continually accesses, or an application frequently scans, an autofs
directory tree, mounts that aren't actually being used can accumulate.
To cater for this case the "`strictexpire`" autofs mount option
can be used to avoid the "`last_used`" update on path walk, thereby
preventing this apparent inability to expire mounts that aren't
really in use.

The daemon is able to ask autofs if anything is due to be expired,
using an `ioctl` as discussed later.  For a *direct* mount, autofs
considers if the entire mount-tree can be unmounted or not.  For an
*indirect* mount, autofs considers each of the names in the top level
directory to determine if any of those can be unmounted and cleaned
up.

There is an option with indirect mounts to consider each of the leaves
that has been mounted on instead of considering the top-level names.
This was originally intended for compatibility with version 4 of autofs
and should be considered as deprecated for Sun Format automount maps.
However, it may be used again for amd format mount maps (which are
generally indirect maps) because the amd automounter allows for the
setting of an expire timeout for individual mounts. But there are
some difficulties in making the needed changes for this.

When autofs considers a directory it checks the `last_used` time and
compares it with the "timeout" value set when the filesystem was
mounted, though this check is ignored in some cases. It also checks if
the directory or anything below it is in use.  For symbolic links,
only the `last_used` time is ever considered.

If both checks support expiring the directory or symlink, an action
is taken.

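The two checks can be illustrated with a small predicate.  This is a sketch only: `obj_kind` and `expiry_candidate` are made-up names, plain `time_t` seconds stand in for whatever clock the kernel uses, the exact comparison (`>=`) is an assumption, and the cases where the timeout check is ignored are left out.

```c
#include <stdbool.h>
#include <time.h>

enum obj_kind { OBJ_DIRECTORY, OBJ_SYMLINK };  /* hypothetical */

/*
 * Returns true when the object looks eligible for expiry:
 * it has been idle for at least `timeout`, and (for directories)
 * neither it nor anything below it is in use.
 */
static bool expiry_candidate(enum obj_kind kind, time_t now, time_t last_used,
                             time_t timeout, bool in_use)
{
    bool idle_long_enough = (now - last_used) >= timeout;

    if (kind == OBJ_SYMLINK)
        return idle_long_enough;        /* only last_used matters */
    return idle_long_enough && !in_use; /* directories need both checks */
}
```

A directory last used 700 seconds ago with a 600 second timeout is a candidate only while nothing holds it open; a symlink with the same times is a candidate regardless.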
There are two ways to ask autofs to consider expiry.  The first is to
use the **AUTOFS_IOC_EXPIRE** ioctl.  This only works for indirect
mounts.  If it finds something in the root directory to expire it will
return the name of that thing.  Once a name has been returned the
automount daemon needs to unmount any filesystems mounted below the
name normally.  As described above, this is unsafe for non-toplevel
mounts in a version-5 autofs.  For this reason the current `automount(8)`
does not use this ioctl.

The second mechanism uses either the **AUTOFS_DEV_IOCTL_EXPIRE_CMD** or
the **AUTOFS_IOC_EXPIRE_MULTI** ioctl.  This will work for both direct and
indirect mounts.  If it selects an object to expire, it will notify
the daemon using the notification mechanism described below.  This
will block until the daemon acknowledges the expiry notification.
This implies that the "`EXPIRE`" ioctl must be sent from a different
thread than the one which handles notification.

While the ioctl is blocking, the entry is marked as "expiring" and
`d_manage` will block until the daemon affirms that the unmount has
completed (together with removing any directories that might have been
necessary), or has been aborted.

Communicating with autofs: detecting the daemon
-----------------------------------------------

There are several forms of communication between the automount daemon
and the filesystem.  As we have already seen, the daemon can create and
remove directories and symlinks using normal filesystem operations.
autofs knows whether a process requesting some operation is the daemon
or not based on its process-group id number (see getpgid(2)).

When an autofs filesystem is mounted the pgid of the mounting
process is recorded unless the "pgrp=" option is given, in which
case that number is recorded instead.  Any request arriving from a
process in that process group is considered to come from the daemon.
If the daemon ever has to be stopped and restarted a new pgid can be
provided through an ioctl as will be described below.

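Seen from user space, the test amounts to comparing process-group ids.  The sketch below only illustrates that comparison; the helper names are hypothetical, and the kernel performs the equivalent check internally on the calling task.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * recorded_pgrp: the value stored at mount time, i.e. the mounting
 * process's pgid or the number given with "pgrp=".
 */
static bool request_is_from_daemon(pid_t recorded_pgrp, pid_t caller_pgrp)
{
    return caller_pgrp == recorded_pgrp;
}

/* Would a request issued by the current process count as the daemon's? */
static bool current_process_is_daemon(pid_t recorded_pgrp)
{
    return request_is_from_daemon(recorded_pgrp, getpgrp());
}
```

One consequence is that helper processes forked by the daemon are also treated as "the daemon" for as long as they stay in its process group.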
Communicating with autofs: the event pipe
-----------------------------------------

When an autofs filesystem is mounted, the 'write' end of a pipe must
be passed using the 'fd=' mount option.  autofs will write
notification messages to this pipe for the daemon to respond to.
For version 5, the format of the message is:

        struct autofs_v5_packet {
                int proto_version;                /* Protocol version */
                int type;                        /* Type of packet */
                autofs_wqt_t wait_queue_token;
                __u32 dev;
                __u64 ino;
                __u32 uid;
                __u32 gid;
                __u32 pid;
                __u32 tgid;
                __u32 len;
                char name[NAME_MAX+1];
        };

where the type is one of

        autofs_ptype_missing_indirect
        autofs_ptype_expire_indirect
        autofs_ptype_missing_direct
        autofs_ptype_expire_direct

so messages can indicate that a name is missing (something tried to
access it but it isn't there) or that it has been selected for expiry.

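A daemon typically dispatches on the packet type.  The sketch below shows one way to map the four types onto two actions; the `PKT_*` and `ACT_*` names are local stand-ins invented for this example, and a real daemon would use the `autofs_ptype_*` constants from `<linux/auto_fs.h>` instead.

```c
#include <stdbool.h>

/* Local stand-ins for the four packet types named in the text. */
enum pkt_type {
    PKT_MISSING_INDIRECT,
    PKT_EXPIRE_INDIRECT,
    PKT_MISSING_DIRECT,
    PKT_EXPIRE_DIRECT
};

enum daemon_action { ACT_MOUNT, ACT_EXPIRE, ACT_UNKNOWN };

/* "missing" means something accessed the name: look it up and mount it.
 * "expire" means autofs selected the name: unmount and clean up. */
static enum daemon_action action_for(int type)
{
    switch (type) {
    case PKT_MISSING_INDIRECT:
    case PKT_MISSING_DIRECT:
        return ACT_MOUNT;
    case PKT_EXPIRE_INDIRECT:
    case PKT_EXPIRE_DIRECT:
        return ACT_EXPIRE;
    default:
        return ACT_UNKNOWN;
    }
}

/* Whether the packet refers to a direct mount. */
static bool is_direct(int type)
{
    return type == PKT_MISSING_DIRECT || type == PKT_EXPIRE_DIRECT;
}
```

Whichever action is taken, the daemon must still acknowledge the request so that processes blocked on the name can continue.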
The pipe will be set to "packet mode" (equivalent to passing
`O_DIRECT` to _pipe2(2)_) so that a read from the pipe will return at
most one packet, and any unread portion of a packet will be discarded.

The `wait_queue_token` is a unique number which can identify a
particular request to be acknowledged.  When a message is sent over
the pipe the affected dentry is marked as either "active" or
"expiring" and other accesses to it block until the message is
acknowledged using one of the ioctls below with the relevant
`wait_queue_token`.

Communicating with autofs: root directory ioctls
------------------------------------------------

The root directory of an autofs filesystem will respond to a number of
ioctls.  The process issuing the ioctl must have the CAP_SYS_ADMIN
capability, or must be the automount daemon.

The available ioctl commands are:

- **AUTOFS_IOC_READY**: a notification has been handled.  The argument
    to the ioctl command is the "wait_queue_token" number
    corresponding to the notification being acknowledged.
- **AUTOFS_IOC_FAIL**: similar to above, but indicates failure with
    the error code `ENOENT`.
- **AUTOFS_IOC_CATATONIC**: Causes the autofs to enter "catatonic"
    mode meaning that it stops sending notifications to the daemon.
    This mode is also entered if a write to the pipe fails.
- **AUTOFS_IOC_PROTOVER**:  This returns the protocol version in use.
- **AUTOFS_IOC_PROTOSUBVER**: Returns the protocol sub-version which
    is really a version number for the implementation.
- **AUTOFS_IOC_SETTIMEOUT**:  This passes a pointer to an unsigned
    long.  The value is used to set the timeout for expiry, and
    the current timeout value is stored back through the pointer.
- **AUTOFS_IOC_ASKUMOUNT**:  Returns, in the pointed-to `int`, 1 if
    the filesystem could be unmounted.  This is only a hint as
    the situation could change at any instant.  This call can be
    used to avoid a more expensive full unmount attempt.
- **AUTOFS_IOC_EXPIRE**: as described above, this asks if there is
    anything suitable to expire.  A pointer to a packet:

        struct autofs_packet_expire_multi {
                int proto_version;              /* Protocol version */
                int type;                       /* Type of packet */
                autofs_wqt_t wait_queue_token;
                int len;
                char name[NAME_MAX+1];
        };

     is required.  This is filled in with the name of something
     that can be unmounted or removed.  If nothing can be expired,
     `errno` is set to `EAGAIN`.  Even though a `wait_queue_token`
     is present in the structure, no "wait queue" is established
     and no acknowledgment is needed.
- **AUTOFS_IOC_EXPIRE_MULTI**:  This is similar to
     **AUTOFS_IOC_EXPIRE** except that it causes notification to be
     sent to the daemon, and it blocks until the daemon acknowledges.
     The argument is an integer which can contain a combination of
     the following flags.

     **AUTOFS_EXP_IMMEDIATE** causes `last_used` time to be ignored
     and objects are expired if they are not in use.

     **AUTOFS_EXP_FORCED** causes the in use status to be ignored
     and objects are expired even if they are in use. This assumes
     that the daemon has requested this because it is capable of
     performing the umount.

     **AUTOFS_EXP_LEAVES** will select a leaf rather than a top-level
     name to expire.  This is only safe when *maxproto* is 4.

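As an illustration of the acknowledgement step, the following sketch wraps **AUTOFS_IOC_READY** and **AUTOFS_IOC_FAIL** in one helper.  It assumes the user-space header `<linux/auto_fs.h>` is installed and that `root_fd` is a descriptor open on the root of the autofs filesystem; the function name `acknowledge` is made up for this example.

```c
#include <linux/auto_fs.h>
#include <sys/ioctl.h>

/*
 * Acknowledge the notification identified by `token`.
 * mounted_ok != 0 reports success (READY); otherwise the waiting
 * processes see ENOENT (FAIL).  Returns the ioctl() result:
 * 0 on success, -1 with errno set on error.
 */
static int acknowledge(int root_fd, unsigned long token, int mounted_ok)
{
    unsigned long cmd = mounted_ok ? AUTOFS_IOC_READY : AUTOFS_IOC_FAIL;

    return ioctl(root_fd, cmd, token);
}
```

The token is the `wait_queue_token` taken from the packet read from the event pipe; passing an invalid descriptor simply makes `ioctl` fail.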
Communicating with autofs: char-device ioctls
---------------------------------------------

It is not always possible to open the root of an autofs filesystem,
particularly a *direct* mounted filesystem.  If the automount daemon
is restarted there is no way for it to regain control of existing
mounts using any of the above communication channels.  To address this
need there is a "miscellaneous" character device (major 10, minor 235)
which can be used to communicate directly with the autofs filesystem.
It requires CAP_SYS_ADMIN for access.

The `ioctl`s that can be used on this device are described in a separate
document `autofs-mount-control.txt`, and are summarised briefly here.
Each ioctl is passed a pointer to an `autofs_dev_ioctl` structure:

        struct autofs_dev_ioctl {
                __u32 ver_major;
                __u32 ver_minor;
                __u32 size;             /* total size of data passed in
                                         * including this struct */
                __s32 ioctlfd;          /* automount command fd */

                /* Command parameters */
                union {
                        struct args_protover            protover;
                        struct args_protosubver         protosubver;
                        struct args_openmount           openmount;
                        struct args_ready               ready;
                        struct args_fail                fail;
                        struct args_setpipefd           setpipefd;
                        struct args_timeout             timeout;
                        struct args_requester           requester;
                        struct args_expire              expire;
                        struct args_askumount           askumount;
                        struct args_ismountpoint        ismountpoint;
                };

                char path[0];
        };

For the **OPEN_MOUNT** and **IS_MOUNTPOINT** commands, the target
filesystem is identified by the `path`.  All other commands identify
the filesystem by the `ioctlfd` which is a file descriptor open on the
root, and which can be returned by **OPEN_MOUNT**.

The `ver_major` and `ver_minor` are in/out parameters which check that
the requested version is supported, and report the maximum version
that the kernel module can support.

Commands are:

- **AUTOFS_DEV_IOCTL_VERSION_CMD**: does nothing, except validate and
    set version numbers.
- **AUTOFS_DEV_IOCTL_OPENMOUNT_CMD**: return an open file descriptor
    on the root of an autofs filesystem.  The filesystem is identified
    by name and device number, which is stored in `openmount.devid`.
    Device numbers for existing filesystems can be found in
    `/proc/self/mountinfo`.
- **AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD**: same as `close(ioctlfd)`.
- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in
    catatonic mode, this can provide the write end of a new pipe
    in `setpipefd.pipefd` to re-establish communication with a daemon.
    The process group of the calling process is used to identify the
    daemon.
- **AUTOFS_DEV_IOCTL_REQUESTER_CMD**: `path` should be a
    name within the filesystem that has been auto-mounted on.
    On successful return, `requester.uid` and `requester.gid` will be
    the UID and GID of the process which triggered that mount.
- **AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD**: Check if path is a
    mountpoint of a particular type - see separate documentation for
    details.
- **AUTOFS_DEV_IOCTL_PROTOVER_CMD**:
- **AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD**:
- **AUTOFS_DEV_IOCTL_READY_CMD**:
- **AUTOFS_DEV_IOCTL_FAIL_CMD**:
- **AUTOFS_DEV_IOCTL_CATATONIC_CMD**:
- **AUTOFS_DEV_IOCTL_TIMEOUT_CMD**:
- **AUTOFS_DEV_IOCTL_EXPIRE_CMD**:
- **AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD**:  These all have the same
    function as the similarly named **AUTOFS_IOC** ioctls, except
    that **FAIL** can be given an explicit error number in `fail.status`
    instead of assuming `ENOENT`, and this **EXPIRE** command
    corresponds to **AUTOFS_IOC_EXPIRE_MULTI**.

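Because `path` is a variable-length tail, the structure has to be allocated with room for the path, and `size` must cover both.  The sketch below shows one way to prepare such a parameter block in user space.  It assumes the header `<linux/auto_dev-ioctl.h>` is available; `alloc_dev_ioctl` is a hypothetical helper name, and setting `ioctlfd` to -1 for "no descriptor yet" is a convention assumed here rather than something stated above.

```c
#include <linux/auto_dev-ioctl.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate and initialise a parameter block, optionally followed by
 * a NUL-terminated path.  The caller frees the result with free().
 */
static struct autofs_dev_ioctl *alloc_dev_ioctl(const char *path)
{
    size_t path_len = path ? strlen(path) + 1 : 0;
    size_t size = sizeof(struct autofs_dev_ioctl) + path_len;
    struct autofs_dev_ioctl *param = calloc(1, size);

    if (!param)
        return NULL;

    /* Ask for the version this program was compiled against. */
    param->ver_major = AUTOFS_DEV_IOCTL_VERSION_MAJOR;
    param->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR;
    param->size = (__u32)size;   /* struct plus trailing path */
    param->ioctlfd = -1;         /* assumed "not yet opened" marker */
    if (path)
        memcpy(param->path, path, path_len);
    return param;
}
```

A block prepared this way could then be passed, together with a command such as **AUTOFS_DEV_IOCTL_VERSION_CMD**, to `ioctl` on a descriptor opened on the control device.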
Catatonic mode
--------------

As mentioned, an autofs mount can enter "catatonic" mode.  This
happens if a write to the notification pipe fails, or if it is
explicitly requested by an `ioctl`.

When entering catatonic mode, the pipe is closed and any pending
notifications are acknowledged with the error `ENOENT`.

Once in catatonic mode, attempts to access non-existent names will
result in `ENOENT` while attempts to access existing directories will
be treated in the same way as if they came from the daemon, so mount
traps will not fire.

When the filesystem is mounted a _uid_ and _gid_ can be given which
set the ownership of directories and symbolic links.  When the
filesystem is in catatonic mode, any process with a matching UID can
create directories or symlinks in the root directory, but not in other
directories.

Catatonic mode can only be left via the
**AUTOFS_DEV_IOCTL_OPENMOUNT_CMD** ioctl on `/dev/autofs`.

The "ignore" mount option
-------------------------

The "ignore" mount option can be used to provide a generic indicator
to applications that the mount entry should be ignored when displaying
mount information.

In other OSes that provide autofs and that provide a mount list to user
space based on the kernel mount list, a no-op mount option ("ignore" is
the one used on the most common OSes) is allowed so that autofs file
system users can optionally use it.

This is intended to be used by user space programs to exclude autofs
mounts from consideration when reading the mounts list.

autofs, name spaces, and shared mounts
--------------------------------------

With bind mounts and name spaces it is possible for an autofs
filesystem to appear at multiple places in one or more filesystem
name spaces.  For this to work sensibly, the autofs filesystem should
always be mounted "shared". e.g.

> `mount --make-shared /autofs/mount/point`

The automount daemon is only able to manage a single mount location for
an autofs filesystem and if mounts on that are not 'shared', other
locations will not behave as expected.  In particular, access to those
other locations will likely result in the `ELOOP` error:

> Too many levels of symbolic links