BTRFS
=====

Btrfs is a copy-on-write filesystem for Linux aimed at
implementing advanced features while focusing on fault tolerance,
repair and easy administration. Initially developed by Oracle, Btrfs
is licensed under the GPL and open for contribution from anyone.

Linux has a wealth of filesystems to choose from, but we are facing a
number of challenges with scaling to the large storage subsystems that
are becoming common in today's data centers. Filesystems need to scale
in their ability to address and manage large storage, and also in
their ability to detect, repair and tolerate errors in the data stored
on disk. Btrfs is under heavy development, and is not suitable for
any uses other than benchmarking and review. The Btrfs disk format is
not yet finalized.

The main Btrfs features include:

    * Extent based file storage (2^64 max file size)
    * Space efficient packing of small files
    * Space efficient indexed directories
    * Dynamic inode allocation
    * Writable snapshots
    * Subvolumes (separate internal filesystem roots)
    * Object level mirroring and striping
    * Checksums on data and metadata (multiple algorithms available)
    * Compression
    * Integrated multiple device support, with several raid algorithms
    * Online filesystem check (not yet implemented)
    * Very fast offline filesystem check
    * Efficient incremental backup and FS mirroring (not yet implemented)
    * Online filesystem defragmentation


Mount Options
=============

When mounting a btrfs filesystem, the following options are accepted.
Options with (*) are default options and will not show in the mount options.

  alloc_start=<bytes>
        Debugging option to force all block allocations above a certain
        byte threshold on each block device. The value is specified in
        bytes, optionally with a K, M, or G suffix, case insensitive.
        Default is 1MB.

  noautodefrag(*)
  autodefrag
        Disable/enable auto defragmentation.
        Auto defragmentation detects small random writes into files and queues
        them up for the defrag process. It works best for small files and is
        not well suited for large database workloads.

  check_int
  check_int_data
  check_int_print_mask=<value>
        These debugging options control the behavior of the integrity checking
        module (the BTRFS_FS_CHECK_INTEGRITY config option is required).

        check_int enables the integrity checker module, which examines all
        block write requests to ensure on-disk consistency, at a large
        memory and CPU cost.

        check_int_data includes extent data in the integrity checks, and
        implies the check_int option.

        check_int_print_mask takes a bitmask of BTRFSIC_PRINT_MASK_* values
        as defined in fs/btrfs/check-integrity.c, to control the integrity
        checker module behavior.

        See comments at the top of fs/btrfs/check-integrity.c for more info.

  commit=<seconds>
        Set the interval of periodic commit, 30 seconds by default. Higher
        values defer data being synced to permanent storage, so more recent
        data can be lost when the system crashes. The upper bound is not
        enforced, but a warning is printed if it's more than 300 seconds
        (5 minutes).

  compress
  compress=<type>
  compress-force
  compress-force=<type>
        Control BTRFS file data compression. Type may be specified as "zlib",
        "lzo" or "no" (for no compression, used for remounting). If no type
        is specified, zlib is used. If compress-force is specified,
        all files will be compressed, whether or not they compress well.
        If compression is enabled, nodatacow and nodatasum are disabled.

  degraded
        Allow mounts to continue with missing devices. A read-write mount may
        fail with too many devices missing, for example if a stripe member
        is completely missing.

  device=<devicepath>
        Specify a device during mount so that ioctls on the control device
        can be avoided. Especially useful when trying to mount a multi-device
        setup as root.
        May be specified multiple times for multiple devices.

  nodiscard(*)
  discard
        Disable/enable discard mount option.
        Discard issues frequent commands to let the block device reclaim space
        freed by the filesystem.
        This is useful for SSD devices, thinly provisioned
        LUNs and virtual machine images, but may have a significant
        performance impact. (The fstrim command is also available to
        initiate batch trims from userspace.)

  noenospc_debug(*)
  enospc_debug
        Disable/enable debugging option to be more verbose in some ENOSPC
        conditions.

  fatal_errors=<action>
        Action to take when encountering a fatal error:
          "bug" - BUG() on a fatal error. This is the default.
          "panic" - panic() on a fatal error.

  noflushoncommit(*)
  flushoncommit
        The 'flushoncommit' mount option forces any data dirtied by a write in a
        prior transaction to commit as part of the current commit. This makes
        the committed state a fully consistent view of the file system from the
        application's perspective (i.e., it includes all completed file system
        operations). This was previously the behavior only when a snapshot is
        created.

  inode_cache
        Enable free inode number caching. Defaults to off due to an overflow
        problem when the free space CRCs don't fit inside a single page.

  max_inline=<bytes>
        Specify the maximum amount of space, in bytes, that can be inlined in
        a metadata B-tree leaf. The value is specified in bytes, optionally
        with a K, M, or G suffix, case insensitive. In practice, this value
        is limited by the root sector size, with some space unavailable due
        to leaf headers. For a 4k sectorsize, max inline data is ~3900 bytes.

  metadata_ratio=<value>
        Specify that 1 metadata chunk should be allocated after every <value>
        data chunks. Off by default.

  acl(*)
  noacl
        Enable/disable support for POSIX Access Control Lists (ACLs).
        See the acl(5) manual page for more information about ACLs.

  barrier(*)
  nobarrier
        Enable/disable the use of block layer write barriers. Write barriers
        ensure that certain IOs make it through the device cache and are on
        persistent storage. If used on a device with a volatile
        (non-battery-backed) write-back cache, the nobarrier option will lead
        to filesystem corruption on a system crash or power loss.

  datacow(*)
  nodatacow
        Enable/disable data copy-on-write for newly created files.
        Nodatacow implies nodatasum, and disables all compression.

  datasum(*)
  nodatasum
        Enable/disable data checksumming for newly created files.
        Datasum implies datacow.

  treelog(*)
  notreelog
        Enable/disable the tree logging used for fsync and O_SYNC writes.

  recovery
        Enable autorecovery attempts if a bad tree root is found at mount time.
        Currently this scans a list of several previous tree roots and tries to
        use the first readable.

  rescan_uuid_tree
        Force check and rebuild procedure of the UUID tree. This should not
        normally be needed.

  skip_balance
        Skip automatic resume of interrupted balance operation after mount.
        May be resumed with "btrfs balance resume".

  space_cache(*)
        Enable the on-disk freespace cache.
  nospace_cache
        Disable freespace cache loading without clearing the cache.
  clear_cache
        Force clearing and rebuilding of the disk space cache if something
        has gone wrong.

  ssd
  nossd
  ssd_spread
        Options to control SSD allocation schemes. By default, BTRFS will
        enable or disable SSD allocation heuristics depending on whether a
        rotational or nonrotational disk is in use. The ssd and nossd options
        can override this autodetection.

        The ssd_spread mount option attempts to allocate into big chunks
        of unused space, and may perform better on low-end SSDs.
        ssd_spread implies ssd, enabling all other SSD heuristics as well.

  subvol=<path>
        Mount subvolume at <path> rather than the root subvolume. <path> is
        relative to the top level subvolume.

  subvolid=<ID>
        Mount subvolume specified by an ID number rather than the root subvolume.
        This allows mounting of subvolumes which are not in the root of the mounted
        filesystem.
        You can use "btrfs subvolume list" to see subvolume ID numbers.

  subvolrootid=<objectid> (deprecated)
        Mount subvolume specified by <objectid> rather than the root subvolume.
        This allows mounting of subvolumes which are not in the root of the mounted
        filesystem.
        You can use "btrfs subvolume show" to see the object ID for a subvolume.

  thread_pool=<number>
        The number of worker threads to allocate. The default number is equal
        to the number of CPUs + 2, or 8, whichever is smaller.

  user_subvol_rm_allowed
        Allow subvolumes to be deleted by a non-root user. Use with caution.

MAILING LIST
============

There is a Btrfs mailing list hosted on vger.kernel.org. You can
find details on how to subscribe here:

http://vger.kernel.org/vger-lists.html#linux-btrfs

Mailing list archives are available from gmane:

http://dir.gmane.org/gmane.comp.file-systems.btrfs



IRC
===

Discussion of Btrfs also occurs on the #btrfs channel of the Freenode
IRC network.



UTILITIES
=========

Userspace tools for creating and manipulating Btrfs file systems are
available from the git repository at the following location:

 http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git
 git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

These include the following tools:

* mkfs.btrfs: create a filesystem

* btrfs: a single tool to manage the filesystems, refer to the manpage for more details

* 'btrfsck' or 'btrfs check': do a consistency check of the filesystem

Other tools for specific tasks:

* btrfs-convert: in-place conversion from ext2/3/4 filesystems

* btrfs-image: dump filesystem metadata for debugging
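
As an illustration of two value rules from the Mount Options section, the
following Python sketch converts a size written "in bytes, optionally with a
K, M, or G suffix, case insensitive" (as used by alloc_start and max_inline)
and computes the documented thread_pool default. It is not the kernel's
parser: the function names parse_size and default_thread_pool are
hypothetical, and treating K, M and G as powers of 1024 is an assumption of
this sketch.

```python
import os

# Multipliers for the documented K, M and G suffixes.
# Assumption of this sketch: each step is a power of 1024.
_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a string such as "64k" or "1M" to a byte count."""
    text = value.strip()
    if not text:
        raise ValueError("empty size value")
    # Look at the last character, ignoring case, to find a suffix.
    multiplier = _SUFFIXES.get(text[-1].upper())
    digits = text if multiplier is None else text[:-1]
    if not digits.isdigit():
        raise ValueError(f"invalid size value: {value!r}")
    return int(digits) * (multiplier or 1)


def default_thread_pool(cpu_count: int) -> int:
    """Documented default: number of CPUs + 2, or 8, whichever is smaller."""
    return min(cpu_count + 2, 8)


if __name__ == "__main__":
    # The documented alloc_start default of 1MB, expressed in bytes.
    print(parse_size("1M"))
    # thread_pool default for the machine running this script.
    print(default_thread_pool(os.cpu_count() or 1))
```

A script could use such helpers to sanity-check values before building a
string for "mount -o"; the kernel remains the authority on what it accepts.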