<head>
<style> p { max-width:50em} ol, ul {max-width: 40em}</style>
</head>

autofs - how it works
=====================

Purpose
-------

The goal of autofs is to provide on-demand mounting and race-free
automatic unmounting of various other filesystems. This provides two
key advantages:

1. There is no need to delay boot until all filesystems that
   might be needed are mounted. Processes that try to access those
   slow filesystems might be delayed but other processes can
   continue freely. This is particularly important for
   network filesystems (e.g. NFS) or filesystems stored on
   media with a media-changing robot.

2. The names and locations of filesystems can be stored in
   a remote database and can change at any time. The content
   in that database at the time of access will be used to provide
   a target for the access. The interpretation of names in the
   filesystem can even be programmatic rather than database-backed,
   allowing wildcards for example, and can vary based on the user who
   first accessed a name.

Context
-------

The "autofs" filesystem module is only one part of an autofs system.
There also needs to be a user-space program which looks up names
and mounts filesystems. This will often be the "automount" program,
though other tools including "systemd" can make use of "autofs".
This document describes only the kernel module and the interactions
required with any user-space program. Subsequent text refers to this
as the "automount daemon" or simply "the daemon".

"autofs" is a Linux kernel module which provides the "autofs"
filesystem type. Several "autofs" filesystems can be mounted and they
can each be managed separately, or all managed by the same daemon.

Content
-------

An autofs filesystem can contain 3 sorts of objects: directories,
symbolic links and mount traps. Mount traps are directories with
extra properties as described in the next section.

Objects can only be created by the automount daemon: symlinks are
created with a regular `symlink` system call, while directories and
mount traps are created with `mkdir`. The determination of whether a
directory should be a mount trap or not is quite _ad hoc_, largely for
historical reasons, and is determined in part by the
*direct*/*indirect*/*offset* mount options, and the *maxproto* mount option.

If neither the *direct* nor the *offset* mount option is given (so the
mount is considered to be *indirect*), then the root directory is
always a regular directory, otherwise it is a mount trap when it is
empty and a regular directory when not empty. Note that *direct* and
*offset* are treated identically so a concise summary is that the root
directory is a mount trap only if the filesystem is mounted *direct*
and the root is empty.

Directories created in the root directory are mount traps only if the
filesystem is mounted *indirect* and they are empty.

Directories further down the tree depend on the *maxproto* mount
option and particularly whether it is less than five or not.
When *maxproto* is five, no directories further down the
tree are ever mount traps, they are always regular directories. When
the *maxproto* is four (or three), these directories are mount traps
precisely when they are empty.

So: non-empty (i.e. non-leaf) directories are never mount traps. Empty
directories are sometimes mount traps, and sometimes not depending on
where in the tree they are (root, top level, or lower), the *maxproto*,
and whether the mount was *indirect* or not.

Mount Traps
-----------

A core element of the implementation of autofs is the Mount Traps
which are provided by the Linux VFS. Any directory provided by a
filesystem can be designated as a trap. This involves two separate
features that work together to allow autofs to do its job.
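
Before looking at the two features in turn, the placement rules from
the "Content" section above can be condensed into a small predicate.
This is only a sketch of the rules as stated in the text:
`is_mount_trap` and its parameters are hypothetical names, not
anything found in the kernel, where the decision is spread across the
mount options and the state of each directory.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper modelling the rules in the text.
 * depth 0 is the root, depth 1 is a directory created in the root,
 * depth 2 or more is anything further down the tree.
 * "indirect" is false for both *direct* and *offset* mounts.
 */
bool is_mount_trap(int depth, bool empty, bool indirect, int maxproto)
{
    assert(depth >= 0);

    if (!empty)
        return false;       /* non-empty directories are never traps */
    if (depth == 0)
        return !indirect;   /* root: only for direct/offset mounts */
    if (depth == 1)
        return indirect;    /* top level: only for indirect mounts */
    return maxproto < 5;    /* lower levels: only before protocol 5 */
}
```

For example, an empty top-level directory of an indirect mount is a
trap under this model, while an empty directory two levels down is a
trap only when *maxproto* is below five.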

**DCACHE_NEED_AUTOMOUNT**

If a dentry has the DCACHE_NEED_AUTOMOUNT flag set (which gets set if
the inode has S_AUTOMOUNT set, or can be set directly) then it is
(potentially) a mount trap. Any access to this directory beyond a
"`stat`" will (normally) cause the `d_op->d_automount()` dentry operation
to be called. The task of this method is to find the filesystem that
should be mounted on the directory and to return it. The VFS is
responsible for actually mounting the root of this filesystem on the
directory.

autofs doesn't find the filesystem itself but sends a message to the
automount daemon asking it to find and mount the filesystem. The
autofs `d_automount` method then waits for the daemon to report that
everything is ready. It will then return "`NULL`" indicating that the
mount has already happened. The VFS doesn't try to mount anything but
follows down the mount that is already there.

This functionality is sufficient for some users of mount traps such
as NFS which creates traps so that mountpoints on the server can be
reflected on the client. However it is not sufficient for autofs. As
mounting onto a directory is considered to be "beyond a `stat`", the
automount daemon would not be able to mount a filesystem on the 'trap'
directory without some way to avoid getting caught in the trap. For
that purpose there is another flag.

**DCACHE_MANAGE_TRANSIT**

If a dentry has DCACHE_MANAGE_TRANSIT set then two very different but
related behaviours are invoked, both using the `d_op->d_manage()`
dentry operation.

Firstly, before checking to see if any filesystem is mounted on the
directory, `d_manage()` will be called with the `rcu_walk` parameter set
to `false`. It may return one of three things:

-  A return value of zero indicates that there is nothing special
   about this dentry and normal checks for mounts and automounts
   should proceed.

   autofs normally returns zero, but first waits for any
   expiry (automatic unmounting of the mounted filesystem) to
   complete. This avoids races.

-  A return value of `-EISDIR` tells the VFS to ignore any mounts
   on the directory and to not consider calling `->d_automount()`.
   This effectively disables the **DCACHE_NEED_AUTOMOUNT** flag
   causing the directory not to be a mount trap after all.

   autofs returns this if it detects that the process performing the
   lookup is the automount daemon and that the mount has been
   requested but has not yet completed. How it determines this is
   discussed later. This allows the automount daemon not to get
   caught in the mount trap.

   There is a subtlety here. It is possible that a second autofs
   filesystem can be mounted below the first and for both of them to
   be managed by the same daemon. For the daemon to be able to mount
   something on the second it must be able to "walk" down past the
   first. This means that `d_manage` cannot *always* return `-EISDIR` for
   the automount daemon. It must only return it when a mount has
   been requested, but has not yet completed.

   `d_manage` also returns `-EISDIR` if the dentry shouldn't be a
   mount trap, either because it is a symbolic link or because it is
   not empty.

-  Any other negative value is treated as an error and returned
   to the caller.

   autofs can return

   - -ENOENT if the automount daemon failed to mount anything,
   - -ENOMEM if it ran out of memory,
   - -EINTR if a signal arrived while waiting for expiry to
     complete
   - or any other error sent down by the automount daemon.


The second use case only occurs during an "RCU-walk" and so `rcu_walk`
will be set.

An RCU-walk is a fast and lightweight process for walking down a
filename path (i.e. it is like running on tip-toes).
RCU-walk cannot
cope with all situations so when it finds a difficulty it falls back
to "REF-walk", which is slower but more robust.

RCU-walk will never call `->d_automount`; the filesystems must already
be mounted or RCU-walk cannot handle the path.
To determine if a mount-trap is safe for RCU-walk mode it calls
`->d_manage()` with `rcu_walk` set to `true`.

In this case `d_manage()` must avoid blocking and should avoid taking
spinlocks if at all possible. Its sole purpose is to determine if it
would be safe to follow down into any mounted directory and the only
reason that it might not be is if an expiry of the mount is
underway.

In the `rcu_walk` case, `d_manage()` cannot return -EISDIR to tell the
VFS that this is a directory that doesn't require d_automount. If
RCU-walk sees a dentry with DCACHE_NEED_AUTOMOUNT set but nothing
mounted, it *will* fall back to REF-walk. `d_manage()` cannot make the
VFS remain in RCU-walk mode, but can only tell it to get out of
RCU-walk mode by returning `-ECHILD`.

So `d_manage()`, when called with `rcu_walk` set, should return
-ECHILD if there is any reason to believe it is unsafe to enter the
mounted filesystem, and 0 otherwise.

autofs will return `-ECHILD` if an expiry of the filesystem has been
initiated or is being considered, otherwise it returns 0.


Mountpoint expiry
-----------------

The VFS has a mechanism for automatically expiring unused mounts,
much as it can expire any unused dentry information from the dcache.
This is guided by the MNT_SHRINKABLE flag. This only applies to
mounts that were created by `d_automount()` returning a filesystem to be
mounted. As autofs doesn't return such a filesystem but leaves the
mounting to the automount daemon, it must involve the automount daemon
in unmounting as well. This also means that autofs has more control
over expiry.
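
Because autofs keeps expiry under its own control, `d_manage()` is
where path walks and expiry meet. The return values described in the
previous section can be condensed into a small model. This is a sketch
of the text only, not the kernel's code: the `CTX_*` flags and
`model_d_manage` are hypothetical names, the order of the checks is an
assumption, and the blocking wait for expiry is reduced to a flag
saying whether a signal interrupted it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical description of one d_manage() call. */
enum {
    CTX_RCU_WALK      = 1 << 0, /* called with rcu_walk set */
    CTX_NOT_TRAP      = 1 << 1, /* symlink or non-empty directory */
    CTX_DAEMON        = 1 << 2, /* caller is the automount daemon */
    CTX_MOUNT_PENDING = 1 << 3, /* mount requested, not yet completed */
    CTX_EXPIRING      = 1 << 4, /* expiry initiated or being considered */
    CTX_INTERRUPTED   = 1 << 5, /* signal arrived while waiting */
    CTX_ALL           = (1 << 6) - 1
};

int model_d_manage(unsigned int ctx)
{
    assert((ctx & ~(unsigned int)CTX_ALL) == 0);

    /* RCU-walk: never block; only say whether it is safe to go on. */
    if (ctx & CTX_RCU_WALK)
        return (ctx & CTX_EXPIRING) ? -ECHILD : 0;

    /* REF-walk: not a trap at all, or the daemon doing its own mount. */
    if (ctx & CTX_NOT_TRAP)
        return -EISDIR;
    if ((ctx & CTX_DAEMON) && (ctx & CTX_MOUNT_PENDING))
        return -EISDIR;

    /* Otherwise wait for any expiry to finish before returning zero. */
    if ((ctx & CTX_EXPIRING) && (ctx & CTX_INTERRUPTED))
        return -EINTR;
    return 0;
}
```

Note that the daemon only gets `-EISDIR` while a mount is pending; with
`CTX_DAEMON` alone the model returns 0, which is what lets the daemon
walk down past one autofs filesystem to reach another.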

The VFS also supports "expiry" of mounts using the MNT_EXPIRE flag to
the `umount` system call. Unmounting with MNT_EXPIRE will fail unless
a previous attempt had been made, and the filesystem has been inactive
and untouched since that previous attempt. autofs does not depend on
this but has its own internal tracking of whether filesystems were
recently used. This allows individual names in the autofs directory
to expire separately.

With version 4 of the protocol, the automount daemon can try to
unmount any filesystems mounted on the autofs filesystem or remove any
symbolic links or empty directories any time it likes. If the unmount
or removal is successful the filesystem will be returned to the state
it was in before the mount or creation, so that any access of the name
will trigger normal auto-mount processing. In particular, `rmdir` and
`unlink` do not leave negative entries in the dcache as a normal
filesystem would, so an attempt to access a recently-removed object is
passed to autofs for handling.

With version 5, this is not safe except for unmounting from top-level
directories. As lower-level directories are never mount traps, other
processes will see an empty directory as soon as the filesystem is
unmounted. So it is generally safest to use the autofs expiry
protocol described below.

Normally the daemon only wants to remove entries which haven't been
used for a while. For this purpose autofs maintains a "`last_used`"
time stamp on each directory or symlink. For symlinks it genuinely
does record the last time the symlink was "used" or followed to find
out where it points to. For directories the field is used slightly
differently. The field is updated at mount time and during expire
checks if it is found to be in use (i.e. open file descriptor or
process working directory) and during path walks.
The update done
during path walks prevents frequent expire and immediate mount of
frequently accessed automounts. But in the case where a GUI continually
accesses, or an application frequently scans, an autofs directory tree
there can be an accumulation of mounts that aren't actually being
used. To cater for this case the "`strictexpire`" autofs mount option
can be used to avoid the "`last_used`" update on path walk thereby
preventing this apparent inability to expire mounts that aren't
really in use.

The daemon is able to ask autofs if anything is due to be expired,
using an `ioctl` as discussed later. For a *direct* mount, autofs
considers if the entire mount-tree can be unmounted or not. For an
*indirect* mount, autofs considers each of the names in the top level
directory to determine if any of those can be unmounted and cleaned
up.

There is an option with indirect mounts to consider each of the leaves
that has been mounted on instead of considering the top-level names.
This was originally intended for compatibility with version 4 of autofs
and should be considered as deprecated for Sun Format automount maps.
However, it may be used again for amd format mount maps (which are
generally indirect maps) because the amd automounter allows for the
setting of an expire timeout for individual mounts. But there are
some difficulties in making the needed changes for this.

When autofs considers a directory it checks the `last_used` time and
compares it with the "timeout" value set when the filesystem was
mounted, though this check is ignored in some cases. It also checks if
the directory or anything below it is in use. For symbolic links,
only the `last_used` time is ever considered.

If both appear to support expiring the directory or symlink, an action
is taken.

There are two ways to ask autofs to consider expiry.
The first is to
use the **AUTOFS_IOC_EXPIRE** ioctl. This only works for indirect
mounts. If it finds something in the root directory to expire it will
return the name of that thing. Once a name has been returned the
automount daemon needs to unmount any filesystems mounted below the
name normally. As described above, this is unsafe for non-toplevel
mounts in a version-5 autofs. For this reason the current `automount(8)`
does not use this ioctl.

The second mechanism uses either the **AUTOFS_DEV_IOCTL_EXPIRE_CMD** or
the **AUTOFS_IOC_EXPIRE_MULTI** ioctl. This will work for both direct and
indirect mounts. If it selects an object to expire, it will notify
the daemon using the notification mechanism described below. This
will block until the daemon acknowledges the expiry notification.
This implies that the "`EXPIRE`" ioctl must be sent from a different
thread than the one which handles notification.

While the ioctl is blocking, the entry is marked as "expiring" and
`d_manage` will block until the daemon affirms that the unmount has
completed (together with removing any directories that might have been
necessary), or has been aborted.

Communicating with autofs: detecting the daemon
-----------------------------------------------

There are several forms of communication between the automount daemon
and the filesystem. As we have already seen, the daemon can create and
remove directories and symlinks using normal filesystem operations.
autofs knows whether a process requesting some operation is the daemon
or not based on its process-group id number (see getpgid(2)).

When an autofs filesystem is mounted the pgid of the mounting
process is recorded unless the "pgrp=" option is given, in which
case that number is recorded instead. Any request arriving from a
process in that process group is considered to come from the daemon.
If the daemon ever has to be stopped and restarted a new pgid can be
provided through an ioctl as will be described below.

Communicating with autofs: the event pipe
-----------------------------------------

When an autofs filesystem is mounted, the 'write' end of a pipe must
be passed using the 'fd=' mount option. autofs will write
notification messages to this pipe for the daemon to respond to.
For version 5, the format of the message is:

        struct autofs_v5_packet {
                int proto_version;              /* Protocol version */
                int type;                       /* Type of packet */
                autofs_wqt_t wait_queue_token;
                __u32 dev;
                __u64 ino;
                __u32 uid;
                __u32 gid;
                __u32 pid;
                __u32 tgid;
                __u32 len;
                char name[NAME_MAX+1];
        };

where the type is one of

        autofs_ptype_missing_indirect
        autofs_ptype_expire_indirect
        autofs_ptype_missing_direct
        autofs_ptype_expire_direct

so messages can indicate that a name is missing (something tried to
access it but it isn't there) or that it has been selected for expiry.

The pipe will be set to "packet mode" (equivalent to passing
`O_DIRECT` to _pipe2(2)_) so that a read from the pipe will return at
most one packet, and any unread portion of a packet will be discarded.

The `wait_queue_token` is a unique number which can identify a
particular request to be acknowledged. When a message is sent over
the pipe the affected dentry is marked as either "active" or
"expiring" and other accesses to it block until the message is
acknowledged using one of the ioctls below with the relevant
`wait_queue_token`.

Communicating with autofs: root directory ioctls
------------------------------------------------

The root directory of an autofs filesystem will respond to a number of
ioctls. The process issuing the ioctl must have the CAP_SYS_ADMIN
capability, or must be the automount daemon.
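
As a rough illustration of how the event pipe and these ioctls fit
together, the sketch below reads one packet and acknowledges it with
**AUTOFS_IOC_READY** or **AUTOFS_IOC_FAIL**, both described in the
list that follows. It assumes the definitions from the kernel's
`<linux/auto_fs.h>` header and that the kernel writes one whole
`struct autofs_v5_packet` per message. The function names are
hypothetical, and `root_fd` is assumed to be a descriptor open on the
root of the autofs filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/auto_fs.h>

/* 1 for an expire packet, 0 for a missing packet, -1 for anything else. */
int is_expire_packet(int type)
{
    switch (type) {
    case autofs_ptype_expire_indirect:
    case autofs_ptype_expire_direct:
        return 1;
    case autofs_ptype_missing_indirect:
    case autofs_ptype_missing_direct:
        return 0;
    default:
        return -1;
    }
}

/* Read one packet; in packet mode a read never returns more than one. */
int read_packet(int pipe_fd, struct autofs_v5_packet *pkt)
{
    assert(pkt != NULL);
    ssize_t n = read(pipe_fd, pkt, sizeof(*pkt));
    return n == (ssize_t)sizeof(*pkt) ? 0 : -1;
}

/* Acknowledge a packet: READY if the daemon succeeded, FAIL otherwise. */
int acknowledge(int root_fd, const struct autofs_v5_packet *pkt, int ok)
{
    assert(pkt != NULL);
    return ioctl(root_fd, ok ? AUTOFS_IOC_READY : AUTOFS_IOC_FAIL,
                 (unsigned long)pkt->wait_queue_token);
}
```

A daemon would typically call `read_packet()` in a loop, mount or
unmount according to `is_expire_packet(pkt.type)`, and then call
`acknowledge()` so that the processes blocked on the dentry can
continue.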

The available ioctl commands are:

- **AUTOFS_IOC_READY**: a notification has been handled. The argument
    to the ioctl command is the "wait_queue_token" number
    corresponding to the notification being acknowledged.
- **AUTOFS_IOC_FAIL**: similar to above, but indicates failure with
    the error code `ENOENT`.
- **AUTOFS_IOC_CATATONIC**: Causes the autofs to enter "catatonic"
    mode meaning that it stops sending notifications to the daemon.
    This mode is also entered if a write to the pipe fails.
- **AUTOFS_IOC_PROTOVER**: This returns the protocol version in use.
- **AUTOFS_IOC_PROTOSUBVER**: Returns the protocol sub-version which
    is really a version number for the implementation.
- **AUTOFS_IOC_SETTIMEOUT**: This passes a pointer to an unsigned
    long. The value is used to set the timeout for expiry, and
    the current timeout value is stored back through the pointer.
- **AUTOFS_IOC_ASKUMOUNT**: Returns, in the pointed-to `int`, 1 if
    the filesystem could be unmounted. This is only a hint as
    the situation could change at any instant. This call can be
    used to avoid a more expensive full unmount attempt.
- **AUTOFS_IOC_EXPIRE**: as described above, this asks if there is
    anything suitable to expire. A pointer to a packet:

        struct autofs_packet_expire_multi {
                int proto_version;              /* Protocol version */
                int type;                       /* Type of packet */
                autofs_wqt_t wait_queue_token;
                int len;
                char name[NAME_MAX+1];
        };

    is required. This is filled in with the name of something
    that can be unmounted or removed. If nothing can be expired,
    `errno` is set to `EAGAIN`. Even though a `wait_queue_token`
    is present in the structure, no "wait queue" is established
    and no acknowledgment is needed.
- **AUTOFS_IOC_EXPIRE_MULTI**: This is similar to
    **AUTOFS_IOC_EXPIRE** except that it causes notification to be
    sent to the daemon, and it blocks until the daemon acknowledges.
    The argument is an integer which can contain three different flags.

    **AUTOFS_EXP_IMMEDIATE** causes `last_used` time to be ignored
    and objects are expired if they are not in use.

    **AUTOFS_EXP_FORCED** causes the in use status to be ignored
    and objects are expired even if they are in use. This assumes
    that the daemon has requested this because it is capable of
    performing the umount.

    **AUTOFS_EXP_LEAVES** will select a leaf rather than a top-level
    name to expire. This is only safe when *maxproto* is 4.

Communicating with autofs: char-device ioctls
---------------------------------------------

It is not always possible to open the root of an autofs filesystem,
particularly a *direct* mounted filesystem. If the automount daemon
is restarted there is no way for it to regain control of existing
mounts using any of the above communication channels. To address this
need there is a "miscellaneous" character device (major 10, minor 235)
which can be used to communicate directly with the autofs filesystem.
It requires CAP_SYS_ADMIN for access.

The `ioctl`s that can be used on this device are described in a separate
document `autofs-mount-control.txt`, and are summarised briefly here.
Each ioctl is passed a pointer to an `autofs_dev_ioctl` structure:

        struct autofs_dev_ioctl {
                __u32 ver_major;
                __u32 ver_minor;
                __u32 size;             /* total size of data passed in
                                         * including this struct */
                __s32 ioctlfd;          /* automount command fd */

                /* Command parameters */
                union {
                        struct args_protover            protover;
                        struct args_protosubver         protosubver;
                        struct args_openmount           openmount;
                        struct args_ready               ready;
                        struct args_fail                fail;
                        struct args_setpipefd           setpipefd;
                        struct args_timeout             timeout;
                        struct args_requester           requester;
                        struct args_expire              expire;
                        struct args_askumount           askumount;
                        struct args_ismountpoint        ismountpoint;
                };

                char path[0];
        };

For the **OPEN_MOUNT** and **IS_MOUNTPOINT** commands, the target
filesystem is identified by the `path`. All other commands identify
the filesystem by the `ioctlfd` which is a file descriptor open on the
root, and which can be returned by **OPEN_MOUNT**.

The `ver_major` and `ver_minor` are in/out parameters which check that
the requested version is supported, and report the maximum version
that the kernel module can support.

Commands are:

- **AUTOFS_DEV_IOCTL_VERSION_CMD**: does nothing, except validate and
    set version numbers.
- **AUTOFS_DEV_IOCTL_OPENMOUNT_CMD**: return an open file descriptor
    on the root of an autofs filesystem. The filesystem is identified
    by name and device number, which is stored in `openmount.devid`.
    Device numbers for existing filesystems can be found in
    `/proc/self/mountinfo`.
- **AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD**: same as `close(ioctlfd)`.
- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in
    catatonic mode, this can provide the write end of a new pipe
    in `setpipefd.pipefd` to re-establish communication with a daemon.
    The process group of the calling process is used to identify the
    daemon.
- **AUTOFS_DEV_IOCTL_REQUESTER_CMD**: `path` should be a
    name within the filesystem that has been auto-mounted on.
    On successful return, `requester.uid` and `requester.gid` will be
    the UID and GID of the process which triggered that mount.
- **AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD**: Check if `path` is a
    mountpoint of a particular type - see separate documentation for
    details.
- **AUTOFS_DEV_IOCTL_PROTOVER_CMD**:
- **AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD**:
- **AUTOFS_DEV_IOCTL_READY_CMD**:
- **AUTOFS_DEV_IOCTL_FAIL_CMD**:
- **AUTOFS_DEV_IOCTL_CATATONIC_CMD**:
- **AUTOFS_DEV_IOCTL_TIMEOUT_CMD**:
- **AUTOFS_DEV_IOCTL_EXPIRE_CMD**:
- **AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD**: These all have the same
    function as the similarly named **AUTOFS_IOC** ioctls, except
    that **FAIL** can be given an explicit error number in `fail.status`
    instead of assuming `ENOENT`, and this **EXPIRE** command
    corresponds to **AUTOFS_IOC_EXPIRE_MULTI**.

Catatonic mode
--------------

As mentioned, an autofs mount can enter "catatonic" mode. This
happens if a write to the notification pipe fails, or if it is
explicitly requested by an `ioctl`.

When entering catatonic mode, the pipe is closed and any pending
notifications are acknowledged with the error `ENOENT`.

Once in catatonic mode attempts to access non-existing names will
result in `ENOENT` while attempts to access existing directories will
be treated in the same way as if they came from the daemon, so mount
traps will not fire.

When the filesystem is mounted a _uid_ and _gid_ can be given which
set the ownership of directories and symbolic links. When the
filesystem is in catatonic mode, any process with a matching UID can
create directories or symlinks in the root directory, but not in other
directories.

Catatonic mode can only be left via the
**AUTOFS_DEV_IOCTL_OPENMOUNT_CMD** ioctl on `/dev/autofs`.
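
To make that recovery path more concrete, here is a minimal sketch of
how a restarted daemon might reattach to a catatonic mount through
`/dev/autofs`. It assumes the definitions in the kernel's
`<linux/auto_dev-ioctl.h>` header, where the ioctl request macros drop
the `_CMD` suffix used above (for example `AUTOFS_DEV_IOCTL_OPENMOUNT`).
`dev_ioctl_size` and `reattach` are hypothetical names, and the code has
not been exercised against a live mount, so treat it as an outline
rather than a reference implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/auto_dev-ioctl.h>

/* Total size of an autofs_dev_ioctl carrying `path` and its NUL byte. */
size_t dev_ioctl_size(const char *path)
{
    assert(path != NULL);
    return sizeof(struct autofs_dev_ioctl) + strlen(path) + 1;
}

/*
 * Open the autofs filesystem mounted at `path` with device number
 * `devid` (as listed in /proc/self/mountinfo) and give it the write
 * end of a new pipe.  Returns a descriptor on the root, or -1.
 */
int reattach(const char *path, unsigned int devid, int pipe_wr_fd)
{
    size_t size = dev_ioctl_size(path);
    struct autofs_dev_ioctl *param = malloc(size);
    int ctl, fd = -1;

    if (param == NULL)
        return -1;
    ctl = open("/dev/autofs", O_RDONLY);
    if (ctl < 0) {
        free(param);
        return -1;
    }

    /* OPENMOUNT identifies the filesystem by path and device number. */
    init_autofs_dev_ioctl(param);   /* sets version, size, ioctlfd = -1 */
    param->size = (__u32)size;
    memcpy(param->path, path, strlen(path) + 1);
    param->openmount.devid = devid;

    if (ioctl(ctl, AUTOFS_DEV_IOCTL_OPENMOUNT, param) == 0) {
        fd = param->ioctlfd;

        /* SETPIPEFD identifies it by ioctlfd; no path is needed. */
        init_autofs_dev_ioctl(param);
        param->ioctlfd = fd;
        param->setpipefd.pipefd = pipe_wr_fd;
        if (ioctl(ctl, AUTOFS_DEV_IOCTL_SETPIPEFD, param) < 0) {
            close(fd);
            fd = -1;
        }
    }

    close(ctl);
    free(param);
    return fd;
}
```

The caller needs CAP_SYS_ADMIN, and as described for
**AUTOFS_DEV_IOCTL_SETPIPEFD_CMD** the second step is only expected to
succeed while the filesystem is catatonic.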

The "ignore" mount option
-------------------------

The "ignore" mount option can be used to provide a generic indicator
to applications that the mount entry should be ignored when displaying
mount information.

In other OSes that provide autofs and that provide a mount list to user
space based on the kernel mount list, a no-op mount option ("ignore" is
the one used on the most common OSes) is allowed so that autofs file
system users can optionally use it.

This is intended to be used by user space programs to exclude autofs
mounts from consideration when reading the mounts list.

autofs, name spaces, and shared mounts
--------------------------------------

With bind mounts and name spaces it is possible for an autofs
filesystem to appear at multiple places in one or more filesystem
name spaces. For this to work sensibly, the autofs filesystem should
always be mounted "shared". e.g.

> `mount --make-shared /autofs/mount/point`

The automount daemon is only able to manage a single mount location for
an autofs filesystem and if mounts on that are not 'shared', other
locations will not behave as expected. In particular access to those
other locations will likely result in the `ELOOP` error

> Too many levels of symbolic links
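
A daemon that sets up its own mounts can apply the same propagation
change without running `mount --make-shared`. The sketch below uses
the `MS_SHARED` flag of mount(2); `make_shared` is a hypothetical
helper name, and the call needs CAP_SYS_ADMIN and an existing mount
point to succeed.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mount.h>

/*
 * Rough equivalent of `mount --make-shared <mountpoint>`.
 * Returns 0 on success, -1 with errno set on failure.
 */
int make_shared(const char *mountpoint)
{
    assert(mountpoint != NULL);
    /* For a propagation change only the target and the flags matter. */
    return mount(NULL, mountpoint, NULL, MS_SHARED, NULL);
}
```

Calling `make_shared("/autofs/mount/point")` after mounting the autofs
filesystem would correspond to the command shown above.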