.. SPDX-License-Identifier: GPL-2.0

=========
IP Sysctl
=========

/proc/sys/net/ipv4/* Variables
==============================

ip_forward - BOOLEAN
	- 0 - disabled (default)
	- not 0 - enabled

	Forward packets between interfaces.

	This variable is special: changing it resets all configuration
	parameters to their default state (RFC1122 for hosts, RFC1812
	for routers).

ip_default_ttl - INTEGER
	Default value of TTL field (Time To Live) for outgoing (but not
	forwarded) IP packets. Should be between 1 and 255 inclusive.
	Default: 64 (as recommended by RFC1700)

ip_no_pmtu_disc - INTEGER
	Disable Path MTU Discovery. If enabled in mode 1 and a
	fragmentation-required ICMP is received, the PMTU to this
	destination will be set to the smaller of the old MTU to
	this destination and min_pmtu (see below). You will need
	to raise min_pmtu to the smallest interface MTU on your system
	manually if you want to avoid locally generated fragments.

	In mode 2 incoming Path MTU Discovery messages will be
	discarded. Outgoing frames are handled the same as in mode 1,
	implicitly setting IP_PMTUDISC_DONT on every created socket.

	Mode 3 is a hardened pmtu discover mode. The kernel will only
	accept fragmentation-needed errors if the underlying protocol
	can verify them besides a plain socket lookup. Current
	protocols for which pmtu events will be honored are TCP, SCTP
	and DCCP as they verify e.g. the sequence number or the
	association. This mode should not be enabled globally but is
	only intended to secure e.g. name servers in namespaces where
	TCP path mtu must still work but path MTU information of other
	protocols should be discarded. If enabled globally this mode
	could break other protocols.

	Possible values: 0-3

	Default: 0 (FALSE)

min_pmtu - INTEGER
	default 552 - minimum Path MTU.
	Unless this is changed manually,
	each cached pmtu will never be lower than this setting.

ip_forward_use_pmtu - BOOLEAN
	By default we don't trust protocol path MTUs while forwarding
	because they could be easily forged and can lead to unwanted
	fragmentation by the router.
	You only need to enable this if you have user-space software
	which tries to discover path mtus by itself and depends on the
	kernel honoring this information. This is normally not the
	case.

	Default: 0 (disabled)

	Possible values:

	- 0 - disabled
	- 1 - enabled

fwmark_reflect - BOOLEAN
	Controls the fwmark of kernel-generated IPv4 reply packets that are not
	associated with a socket (for example, TCP RSTs or ICMP echo replies).
	If unset, these packets have a fwmark of zero. If set, they have the
	fwmark of the packet they are replying to.

	Default: 0

fib_multipath_use_neigh - BOOLEAN
	Use status of existing neighbor entry when determining nexthop for
	multipath routes. If disabled, neighbor information is not used and
	packets could be directed to a failed nexthop. Only valid for kernels
	built with CONFIG_IP_ROUTE_MULTIPATH enabled.

	Default: 0 (disabled)

	Possible values:

	- 0 - disabled
	- 1 - enabled

fib_multipath_hash_policy - INTEGER
	Controls which hash policy to use for multipath routes. Only valid
	for kernels built with CONFIG_IP_ROUTE_MULTIPATH enabled.

	Default: 0 (Layer 3)

	Possible values:

	- 0 - Layer 3
	- 1 - Layer 4
	- 2 - Layer 3 or inner Layer 3 if present
	- 3 - Custom multipath hash. Fields used for multipath hash calculation
	  are determined by fib_multipath_hash_fields sysctl

fib_multipath_hash_fields - UNSIGNED INTEGER
	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
	fields used for multipath hash calculation are determined by this
	sysctl.

	This value is a bitmask which enables various fields for multipath hash
	calculation.

	Possible fields are:

	====== ============================
	0x0001 Source IP address
	0x0002 Destination IP address
	0x0004 IP protocol
	0x0008 Unused (Flow Label)
	0x0010 Source port
	0x0020 Destination port
	0x0040 Inner source IP address
	0x0080 Inner destination IP address
	0x0100 Inner IP protocol
	0x0200 Inner Flow Label
	0x0400 Inner source port
	0x0800 Inner destination port
	====== ============================

	Default: 0x0007 (source IP, destination IP and IP protocol)

fib_sync_mem - UNSIGNED INTEGER
	Amount of dirty memory from fib entries that can be backlogged before
	synchronize_rcu is forced.

	Default: 512kB   Minimum: 64kB   Maximum: 64MB

ip_forward_update_priority - INTEGER
	Whether to update SKB priority from "TOS" field in IPv4 header after it
	is forwarded. The new SKB priority is mapped from TOS field value
	according to an rt_tos2priority table (see e.g. man tc-prio).

	Default: 1 (Update priority.)

	Possible values:

	- 0 - Do not update priority.
	- 1 - Update priority.

route/max_size - INTEGER
	Maximum number of routes allowed in the kernel.  Increase
	this when using large numbers of interfaces and/or routes.

	From linux kernel 3.6 onwards, this is deprecated for ipv4
	as route cache is no longer used.

neigh/default/gc_thresh1 - INTEGER
	Minimum number of entries to keep.  Garbage collector will not
	purge entries if there are fewer than this number.

	Default: 128

neigh/default/gc_thresh2 - INTEGER
	Threshold when garbage collector becomes more aggressive about
	purging entries. Entries older than 5 seconds will be cleared
	when over this number.

	Default: 512

neigh/default/gc_thresh3 - INTEGER
	Maximum number of non-PERMANENT neighbor entries allowed.
	Increase
	this when using large numbers of interfaces and when communicating
	with large numbers of directly-connected peers.

	Default: 1024

neigh/default/unres_qlen_bytes - INTEGER
	The maximum number of bytes which may be used by packets
	queued for each unresolved address by other network layers.
	(added in linux 3.3)

	Setting a negative value is meaningless and will return an error.

	Default: SK_WMEM_MAX, (same as net.core.wmem_default).

		Exact value depends on architecture and kernel options,
		but should be enough to allow queuing 256 packets
		of medium size.

neigh/default/unres_qlen - INTEGER
	The maximum number of packets which may be queued for each
	unresolved address by other network layers.

	(deprecated in linux 3.3) : use unres_qlen_bytes instead.

	Prior to linux 3.3, the default value was 3, which may cause
	unexpected packet loss. The current default value is calculated
	according to default value of unres_qlen_bytes and true size of
	packet.

	Default: 101

neigh/default/interval_probe_time_ms - INTEGER
	The probe interval for neighbor entries with NTF_MANAGED flag,
	the min value is 1.

	Default: 5000

mtu_expires - INTEGER
	Time, in seconds, that cached PMTU information is kept.

min_adv_mss - INTEGER
	The advertised MSS depends on the first hop route MTU, but will
	never be lower than this setting.

fib_notify_on_flag_change - INTEGER
	Whether to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/
	RTM_F_TRAP/RTM_F_OFFLOAD_FAILED flags are changed.

	After installing a route to the kernel, user space receives an
	acknowledgment, which means the route was installed in the kernel,
	but not necessarily in hardware.
	It is also possible for a route already installed in hardware to change
	its action and therefore its flags.
	For example, a host route that is
	trapping packets can be "promoted" to perform decapsulation following
	the installation of an IPinIP/VXLAN tunnel.
	The notifications will indicate to user-space the state of the route.

	Default: 0 (Do not emit notifications.)

	Possible values:

	- 0 - Do not emit notifications.
	- 1 - Emit notifications.
	- 2 - Emit notifications only for RTM_F_OFFLOAD_FAILED flag change.

IP Fragmentation:

ipfrag_high_thresh - LONG INTEGER
	Maximum memory used to reassemble IP fragments.

ipfrag_low_thresh - LONG INTEGER
	(Obsolete since linux-4.17)
	Maximum memory used to reassemble IP fragments before the kernel
	begins to remove incomplete fragment queues to free up resources.
	The kernel still accepts new fragments for defragmentation.

ipfrag_time - INTEGER
	Time in seconds to keep an IP fragment in memory.

ipfrag_max_dist - INTEGER
	ipfrag_max_dist is a non-negative integer value which defines the
	maximum "disorder" which is allowed among fragments which share a
	common IP source address. Note that reordering of packets is
	not unusual, but if a large number of fragments arrive from a source
	IP address while a particular fragment queue remains incomplete, it
	probably indicates that one or more fragments belonging to that queue
	have been lost. When ipfrag_max_dist is positive, an additional check
	is done on fragments before they are added to a reassembly queue - if
	ipfrag_max_dist (or more) fragments have arrived from a particular IP
	address between additions to any IP fragment queue using that source
	address, it's presumed that one or more fragments in the queue are
	lost. The existing fragment queue will be dropped, and a new one
	started. An ipfrag_max_dist value of zero disables this check.

	Using a very small value, e.g.
	1 or 2, for ipfrag_max_dist can
	result in unnecessarily dropping fragment queues when normal
	reordering of packets occurs, which could lead to poor application
	performance. Using a very large value, e.g. 50000, increases the
	likelihood of incorrectly reassembling IP fragments that originate
	from different IP datagrams, which could result in data corruption.
	Default: 64

bc_forwarding - INTEGER
	bc_forwarding enables the feature described in rfc1812#section-5.3.5.2
	and rfc2644. It allows the router to forward directed broadcasts.
	To enable this feature, the 'all' entry and the input interface entry
	should be set to 1.
	Default: 0

INET peer storage
=================

inet_peer_threshold - INTEGER
	The approximate size of the storage.  Starting from this threshold
	entries will be discarded aggressively.  This threshold also determines
	entries' time-to-live and time intervals between garbage collection
	passes.  The more entries, the shorter the time-to-live and the
	GC interval.

inet_peer_minttl - INTEGER
	Minimum time-to-live of entries.  Should be enough to cover fragment
	time-to-live on the reassembling side.  This minimum time-to-live is
	guaranteed if the pool size is less than inet_peer_threshold.
	Measured in seconds.

inet_peer_maxttl - INTEGER
	Maximum time-to-live of entries.  Unused entries will expire after
	this period of time if there is no memory pressure on the pool (i.e.
	when the number of entries in the pool is very small).
	Measured in seconds.

TCP variables
=============

somaxconn - INTEGER
	Limit of socket listen() backlog, known in userspace as SOMAXCONN.
	Defaults to 4096. (Was 128 before linux-5.4)
	See also tcp_max_syn_backlog for additional tuning for TCP sockets.

tcp_abort_on_overflow - BOOLEAN
	If listening service is too slow to accept new connections,
	reset them. Default state is FALSE.
	It means that if overflow
	occurred due to a burst, the connection will recover. Enable this
	option _only_ if you are really sure that the listening daemon
	cannot be tuned to accept connections faster. Enabling this
	option can harm clients of your server.

tcp_adv_win_scale - INTEGER
	Count buffering overhead as bytes/2^tcp_adv_win_scale
	(if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale),
	if it is <= 0.

	Possible values are [-31, 31], inclusive.

	Default: 1

tcp_allowed_congestion_control - STRING
	Show/set the congestion control choices available to non-privileged
	processes. The list is a subset of those listed in
	tcp_available_congestion_control.

	Default is "reno" and the default setting (tcp_congestion_control).

tcp_app_win - INTEGER
	Reserve max(window/2^tcp_app_win, mss) of window for application
	buffer. Value 0 is special, it means that nothing is reserved.

	Possible values are [0, 31], inclusive.

	Default: 31

tcp_autocorking - BOOLEAN
	Enable TCP auto corking:
	When applications do consecutive small write()/sendmsg() system calls,
	we try to coalesce these small writes as much as possible, to lower
	the total number of sent packets. This is done if at least one prior
	packet for the flow is waiting in Qdisc queues or device transmit
	queue. Applications can still use TCP_CORK for optimal behavior
	when they know how/when to uncork their sockets.

	Default : 1

tcp_available_congestion_control - STRING
	Shows the available congestion control choices that are registered.
	More congestion control algorithms may be available as modules,
	but not loaded.

tcp_base_mss - INTEGER
	The initial value of search_low to be used by the packetization layer
	Path MTU discovery (MTU probing).  If MTU probing is enabled,
	this is the initial MSS used by the connection.
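
The two formulas given earlier for tcp_adv_win_scale and tcp_app_win can be
tried out with a small sketch. This is illustrative arithmetic only, not
kernel code: the function names are hypothetical, and the use of integer
shifts for the divisions by a power of two is an assumption.

```python
def buffering_overhead(nbytes, adv_win_scale):
    """Buffering overhead as described for tcp_adv_win_scale.

    Integer shifts stand in for the divisions by 2^n (an assumption).
    """
    if adv_win_scale > 0:
        return nbytes >> adv_win_scale
    return nbytes - (nbytes >> -adv_win_scale)


def app_win_reserve(window, mss, app_win):
    """Part of the window reserved for the application buffer (tcp_app_win)."""
    if app_win == 0:
        # Value 0 is special: nothing is reserved.
        return 0
    return max(window >> app_win, mss)


if __name__ == "__main__":
    # With tcp_adv_win_scale = 1, half of the buffer counts as overhead.
    print(buffering_overhead(131072, 1))
    # With tcp_app_win = 31 the shifted term is 0, so one MSS is reserved.
    print(app_win_reserve(65536, 1460, 31))
```

With the default tcp_app_win of 31 the reservation collapses to one MSS for
any realistic window, which is why the maximum with mss matters.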

tcp_mtu_probe_floor - INTEGER
	If MTU probing is enabled this caps the minimum MSS used for search_low
	for the connection.

	Default : 48

tcp_min_snd_mss - INTEGER
	TCP SYN and SYNACK messages usually advertise an ADVMSS option,
	as described in RFC 1122 and RFC 6691.

	If this ADVMSS option is smaller than tcp_min_snd_mss,
	it is silently capped to tcp_min_snd_mss.

	Default : 48 (at least 8 bytes of payload per segment)

tcp_congestion_control - STRING
	Set the congestion control algorithm to be used for new
	connections. The algorithm "reno" is always available, but
	additional choices may be available based on kernel configuration.
	Default is set as part of kernel configuration.
	For passive connections, the listener congestion control choice
	is inherited.

	[see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ]

tcp_dsack - BOOLEAN
	Allows TCP to send "duplicate" SACKs.

tcp_early_retrans - INTEGER
	Tail loss probe (TLP) converts RTOs occurring due to tail
	losses into fast recovery (draft-ietf-tcpm-rack). Note that
	TLP requires RACK to function properly (see tcp_recovery below)

	Possible values:

		- 0 disables TLP
		- 3 or 4 enables TLP

	Default: 3

tcp_ecn - INTEGER
	Control use of Explicit Congestion Notification (ECN) by TCP.
	ECN is used only when both ends of the TCP connection indicate
	support for it.  This feature is useful in avoiding losses due
	to congestion by allowing supporting routers to signal
	congestion before having to drop packets.

	Possible values are:

	= =====================================================
	0 Disable ECN.  Neither initiate nor accept ECN.
	1 Enable ECN when requested by incoming connections and
	  also request ECN on outgoing connection attempts.
	2 Enable ECN when requested by incoming connections
	  but do not request ECN on outgoing connections.
	= =====================================================

	Default: 2

tcp_ecn_fallback - BOOLEAN
	If the kernel detects that an ECN connection misbehaves, enable
	fallback to non-ECN. Currently, this knob implements the fallback
	from RFC3168, section 6.1.1.1., but additional detection
	mechanisms may be implemented under this knob in the future.
	The value is not used if tcp_ecn or per route (or congestion
	control) ECN settings are disabled.

	Default: 1 (fallback enabled)

tcp_fack - BOOLEAN
	This is a legacy option, it has no effect anymore.

tcp_fin_timeout - INTEGER
	The length of time an orphaned (no longer referenced by any
	application) connection will remain in the FIN_WAIT_2 state
	before it is aborted at the local end.  While a perfectly
	valid "receive only" state for an un-orphaned connection, an
	orphaned connection in FIN_WAIT_2 state could otherwise wait
	forever for the remote to close its end of the connection.

	Cf. tcp_max_orphans

	Default: 60 seconds

tcp_frto - INTEGER
	Enables Forward RTO-Recovery (F-RTO) defined in RFC5682.
	F-RTO is an enhanced recovery algorithm for TCP retransmission
	timeouts.  It is particularly beneficial in networks where the
	RTT fluctuates (e.g., wireless). F-RTO is a sender-side only
	modification. It does not require any support from the peer.

	By default it's enabled with a non-zero value. 0 disables F-RTO.

tcp_fwmark_accept - BOOLEAN
	If set, incoming connections to listening sockets that do not have a
	socket mark will set the mark of the accepting socket to the fwmark of
	the incoming SYN packet. This will cause all packets on that connection
	(starting from the first SYNACK) to be sent with that fwmark. The
	listening socket's mark is unchanged.
	Listening sockets that already
	have a fwmark set via setsockopt(SOL_SOCKET, SO_MARK, ...) are
	unaffected.

	Default: 0

tcp_invalid_ratelimit - INTEGER
	Limit the maximal rate for sending duplicate acknowledgments
	in response to incoming TCP packets that are for an existing
	connection but that are invalid due to any of these reasons:

	  (a) out-of-window sequence number,
	  (b) out-of-window acknowledgment number, or
	  (c) PAWS (Protection Against Wrapped Sequence numbers) check failure

	This can help mitigate simple "ack loop" DoS attacks, wherein
	a buggy or malicious middlebox or man-in-the-middle can
	rewrite TCP header fields in a manner that causes each endpoint
	to think that the other is sending invalid TCP segments, thus
	causing each side to send an unending stream of duplicate
	acknowledgments for invalid segments.

	Using 0 disables rate-limiting of dupacks in response to
	invalid segments; otherwise this value specifies the minimal
	space between sending such dupacks, in milliseconds.

	Default: 500 (milliseconds).

tcp_keepalive_time - INTEGER
	How often TCP sends out keepalive messages when keepalive is enabled.
	Default: 2 hours.

tcp_keepalive_probes - INTEGER
	How many keepalive probes TCP sends out, until it decides that the
	connection is broken. Default value: 9.

tcp_keepalive_intvl - INTEGER
	How frequently the probes are sent out. Multiplied by
	tcp_keepalive_probes, it gives the time after which an unresponsive
	connection is killed once probing has started. Default value:
	75 seconds, i.e. the connection will be aborted after ~11 minutes
	of retries.

tcp_l3mdev_accept - BOOLEAN
	Enables child sockets to inherit the L3 master device index.
	Enabling this option allows a "global" listen socket to work
	across L3 master domains (e.g., VRFs) with connected sockets
	derived from the listen socket to be bound to the L3 domain in
	which the packets originated. Only valid when the kernel was
	compiled with CONFIG_NET_L3_MASTER_DEV.

	Default: 0 (disabled)

tcp_low_latency - BOOLEAN
	This is a legacy option, it has no effect anymore.

tcp_max_orphans - INTEGER
	Maximal number of TCP sockets not attached to any user file handle,
	held by system. If this number is exceeded, orphaned connections are
	reset immediately and a warning is printed. This limit exists
	only to prevent simple DoS attacks; you _must_ not rely on this
	or lower the limit artificially, but rather increase it
	(probably, after increasing installed memory),
	if network conditions require more than the default value,
	and tune network services to linger and kill such states
	more aggressively. Remember: each orphan eats
	up to ~64K of unswappable memory.

tcp_max_syn_backlog - INTEGER
	Maximal number of remembered connection requests (SYN_RECV),
	which have not received an acknowledgment from the connecting client.

	This is a per-listener limit.

	The minimal value is 128 for low memory machines, and it will
	increase in proportion to the memory of the machine.

	If the server suffers from overload, try increasing this number.

	Remember to also check /proc/sys/net/core/somaxconn
	A SYN_RECV request socket consumes about 304 bytes of memory.

tcp_max_tw_buckets - INTEGER
	Maximal number of timewait sockets held by system simultaneously.
	If this number is exceeded, the time-wait socket is immediately
	destroyed and a warning is printed.
	This limit exists only to prevent
	simple DoS attacks, you _must_ not lower the limit artificially,
	but rather increase it (probably, after increasing installed memory),
	if network conditions require more than default value.

tcp_mem - vector of 3 INTEGERs: min, pressure, max
	min: below this number of pages TCP is not bothered about its
	memory appetite.

	pressure: when amount of memory allocated by TCP exceeds this number
	of pages, TCP moderates its memory consumption and enters memory
	pressure mode, which is exited when memory consumption falls
	under "min".

	max: number of pages allowed for queueing by all TCP sockets.

	Defaults are calculated at boot time from amount of available
	memory.

tcp_min_rtt_wlen - INTEGER
	The window length of the windowed min filter to track the minimum RTT.
	A shorter window lets a flow more quickly pick up new (higher)
	minimum RTT when it is moved to a longer path (e.g., due to traffic
	engineering). A longer window makes the filter more resistant to RTT
	inflations such as transient congestion. The unit is seconds.

	Possible values: 0 - 86400 (1 day)

	Default: 300

tcp_moderate_rcvbuf - BOOLEAN
	If set, TCP performs receive buffer auto-tuning, attempting to
	automatically size the buffer (no greater than tcp_rmem[2]) to
	match the size required by the path for full throughput.  Enabled by
	default.

tcp_mtu_probing - INTEGER
	Controls TCP Packetization-Layer Path MTU Discovery.  Takes three
	values:

	- 0 - Disabled
	- 1 - Disabled by default, enabled when an ICMP black hole detected
	- 2 - Always enabled, use initial MSS of tcp_base_mss.

tcp_probe_interval - UNSIGNED INTEGER
	Controls how often to start TCP Packetization-Layer Path MTU
	Discovery reprobe. The default is reprobing every 10 minutes as
	per RFC4821.
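
The tcp_mem thresholds described earlier in this section are counted in
pages, not bytes, which is easy to overlook when comparing them with the
byte-based buffer limits. The sketch below converts a min/pressure/max
vector to bytes. The function name and the sample value are hypothetical,
and the page size is passed in explicitly so the conversion can be checked
on any machine.

```python
import os


def tcp_mem_to_bytes(value, page_size):
    """Convert a 'min pressure max' string of page counts to bytes."""
    names = ("min", "pressure", "max")
    pages = [int(field) for field in value.split()]
    if len(pages) != 3:
        raise ValueError("expected three integers: min, pressure, max")
    return {name: count * page_size for name, count in zip(names, pages)}


if __name__ == "__main__":
    # Page size of the running system (available on Unix-like systems).
    page_size = os.sysconf("SC_PAGE_SIZE")
    sample = "1000 2000 3000"  # made-up vector, not a real default
    print(tcp_mem_to_bytes(sample, page_size))
```

On a live system the input string would normally come from reading
/proc/sys/net/ipv4/tcp_mem.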

tcp_probe_threshold - INTEGER
	Controls when TCP Packetization-Layer Path MTU Discovery probing
	will stop in respect to the width of search range in bytes. Default
	is 8 bytes.

tcp_no_metrics_save - BOOLEAN
	By default, TCP saves various connection metrics in the route cache
	when the connection closes, so that connections established in the
	near future can use these to set initial conditions.  Usually, this
	increases overall performance, but may sometimes cause performance
	degradation.  If set, TCP will not cache metrics on closing
	connections.

tcp_no_ssthresh_metrics_save - BOOLEAN
	Controls whether TCP saves ssthresh metrics in the route cache.

	Default is 1, which disables ssthresh metrics.

tcp_orphan_retries - INTEGER
	This value influences the timeout of a locally closed TCP connection,
	when RTO retransmissions remain unacknowledged.
	See tcp_retries2 for more details.

	The default value is 8.

	If your machine is a loaded WEB server,
	you should think about lowering this value, because such sockets
	may consume significant resources. Cf. tcp_max_orphans.

tcp_recovery - INTEGER
	This value is a bitmap to enable various experimental loss recovery
	features.

	=========   =============================================================
	RACK: 0x1   enables the RACK loss detection for fast detection of lost
		    retransmissions and tail drops. It also subsumes and disables
		    RFC6675 recovery for SACK connections.

	RACK: 0x2   makes RACK's reordering window static (min_rtt/4).

	RACK: 0x4   disables RACK's DUPACK threshold heuristic
	=========   =============================================================

	Default: 0x1

tcp_reflect_tos - BOOLEAN
	For listening sockets, reuse the DSCP value of the initial SYN message
	for outgoing packets.
	This allows both directions of a TCP
	stream to use the same DSCP value, assuming DSCP remains unchanged for
	the lifetime of the connection.

	This option affects both IPv4 and IPv6.

	Default: 0 (disabled)

tcp_reordering - INTEGER
	Initial reordering level of packets in a TCP stream.
	TCP stack can then dynamically adjust flow reordering level
	between this initial value and tcp_max_reordering

	Default: 3

tcp_max_reordering - INTEGER
	Maximal reordering level of packets in a TCP stream.
	300 is a fairly conservative value, but you might increase it
	if paths are using per packet load balancing (like bonding rr mode)

	Default: 300

tcp_retrans_collapse - BOOLEAN
	Bug-to-bug compatibility with some broken printers.
	On retransmit try to send bigger packets to work around bugs in
	certain TCP stacks.

tcp_retries1 - INTEGER
	This value influences the time after which TCP decides that
	something is wrong due to unacknowledged RTO retransmissions,
	and reports this suspicion to the network layer.
	See tcp_retries2 for more details.

	RFC 1122 recommends at least 3 retransmissions, which is the
	default.

tcp_retries2 - INTEGER
	This value influences the timeout of an alive TCP connection,
	when RTO retransmissions remain unacknowledged.
	Given a value of N, a hypothetical TCP connection following
	exponential backoff with an initial RTO of TCP_RTO_MIN would
	retransmit N times before killing the connection at the (N+1)th RTO.

	The default value of 15 yields a hypothetical timeout of 924.6
	seconds and is a lower bound for the effective timeout.
	TCP will effectively time out at the first RTO which exceeds the
	hypothetical timeout.

	RFC 1122 recommends at least 100 seconds for the timeout,
	which corresponds to a value of at least 8.
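
The hypothetical timeout quoted for tcp_retries2 can be reproduced with a
short calculation. The sketch below is illustrative only. It assumes that
TCP_RTO_MIN is 200 ms and that the backed-off RTO is capped at 120 seconds;
neither value is stated in this document, and the function name is
hypothetical.

```python
# Assumed constants (not given in the text above), in milliseconds.
TCP_RTO_MIN_MS = 200
TCP_RTO_MAX_MS = 120_000


def hypothetical_timeout_ms(retries2):
    """Sum the first retries2 + 1 exponentially backed-off RTOs."""
    total = 0
    rto = TCP_RTO_MIN_MS
    for _ in range(retries2 + 1):
        total += rto
        # Double the RTO after each expiry, up to the assumed cap.
        rto = min(rto * 2, TCP_RTO_MAX_MS)
    return total


if __name__ == "__main__":
    for n in (8, 15):
        print(n, hypothetical_timeout_ms(n) / 1000, "seconds")
```

Under these assumptions a value of 15 gives 924.6 seconds and a value of 8
gives 102.2 seconds, the smallest result above the 100 seconds recommended
by RFC 1122.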

tcp_rfc1337 - BOOLEAN
	If set, the TCP stack behaves conforming to RFC1337. If unset,
	we are not conforming to RFC, but prevent TCP TIME_WAIT
	assassination.

	Default: 0

tcp_rmem - vector of 3 INTEGERs: min, default, max
	min: Minimal size of receive buffer used by TCP sockets.
	It is guaranteed to each TCP socket, even under moderate memory
	pressure.

	Default: 4K

	default: initial size of receive buffer used by TCP sockets.
	This value overrides net.core.rmem_default used by other protocols.
	Default: 131072 bytes.
	This value results in initial window of 65535.

	max: maximal size of receive buffer allowed for automatically
	selected receiver buffers for TCP socket. This value does not override
	net.core.rmem_max.  Calling setsockopt() with SO_RCVBUF disables
	automatic tuning of that socket's receive buffer size, in which
	case this value is ignored.
	Default: between 131072 and 6MB, depending on RAM size.

tcp_sack - BOOLEAN
	Enable selective acknowledgments (SACKs).

tcp_comp_sack_delay_ns - LONG INTEGER
	TCP tries to reduce the number of SACKs sent, using a timer
	based on 5% of SRTT, capped by this sysctl, in nanoseconds.
	The default is 1ms, based on TSO autosizing period.

	Default : 1,000,000 ns (1 ms)

tcp_comp_sack_slack_ns - LONG INTEGER
	This sysctl controls the slack used when arming the
	timer used by SACK compression. This gives extra time
	for small RTT flows, and reduces system overhead by allowing
	opportunistic reduction of timer interrupts.

	Default : 100,000 ns (100 us)

tcp_comp_sack_nr - INTEGER
	Max number of SACKs that can be compressed.
	Using 0 disables SACK compression.

	Default : 44

tcp_slow_start_after_idle - BOOLEAN
	If set, provide RFC2861 behavior and time out the congestion
	window after an idle period.  An idle period is defined as
	the current RTO.
	If unset, the congestion window will not
	be timed out after an idle period.

	Default: 1

tcp_stdurg - BOOLEAN
	Use the Host requirements interpretation of the TCP urgent pointer field.
	Most hosts use the older BSD interpretation, so if you turn this on
	Linux might not communicate correctly with them.

	Default: FALSE

tcp_synack_retries - INTEGER
	Number of times SYNACKs for a passive TCP connection attempt will
	be retransmitted. Should not be higher than 255. Default value
	is 5, which corresponds to 31 seconds till the last retransmission
	with the current initial RTO of 1 second. With this the final timeout
	for a passive TCP connection will happen after 63 seconds.

tcp_syncookies - INTEGER
	Only valid when the kernel was compiled with CONFIG_SYN_COOKIES.
	Send out syncookies when the syn backlog queue of a socket
	overflows. This is to protect against the common 'SYN flood attack'.
	Default: 1

	Note that syncookies is a fallback facility.
	It MUST NOT be used to help highly loaded servers to stand
	against a legitimate connection rate. If you see SYN flood warnings
	in your logs, but investigation shows that they occur
	because of overload with legitimate connections, you should tune
	other parameters until this warning disappears.
	See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.

	Syncookies seriously violate the TCP protocol, do not allow
	the use of TCP extensions, and can result in serious degradation
	of some services (e.g. SMTP relaying), visible not to you
	but to the clients and relays contacting you. If you see
	SYN flood warnings in your logs while not really being flooded,
	your server is seriously misconfigured.

	If you want to test which effects syncookies have on your
	network connections you can set this knob to 2 to enable
	unconditional generation of syncookies.
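
When tuning tcp_synack_retries as suggested above, it helps to see how a
retry count translates into the time a half-open request can linger. The
sketch below reproduces the figures given for tcp_synack_retries, assuming
that the RTO simply doubles after each retransmission and that no upper
bound on the RTO is reached. The function name is hypothetical.

```python
def handshake_timing(retries, initial_rto=1):
    """Return (last_retransmission, final_timeout) in seconds.

    Assumes plain doubling of the RTO with no cap applied.
    """
    elapsed = 0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto  # the timer expires and the packet is retransmitted
        rto *= 2
    # The attempt is given up when the timer expires once more.
    return elapsed, elapsed + rto


if __name__ == "__main__":
    last, final = handshake_timing(5)
    print("last retransmission after", last, "s, final timeout after", final, "s")
```

For a retry count of 5 and an initial RTO of 1 second this gives 31 and 63
seconds, matching the values quoted for tcp_synack_retries.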

tcp_migrate_req - BOOLEAN
	The incoming connection is tied to a specific listening socket when
	the initial SYN packet is received during the three-way handshake.
	When a listener is closed, in-flight request sockets during the
	handshake and established sockets in the accept queue are aborted.

	If the listener has SO_REUSEPORT enabled, other listeners on the
	same port should have been able to accept such connections. This
	option makes it possible to migrate such child sockets to another
	listener after close() or shutdown().

	The BPF_SK_REUSEPORT_SELECT_OR_MIGRATE type of eBPF program should
	usually be used to define the policy to pick an alive listener.
	Otherwise, the kernel will randomly pick an alive listener only if
	this option is enabled.

	Note that migration between listeners with different settings may
	crash applications. Let's say migration happens from listener A to
	B, and only B has TCP_SAVE_SYN enabled. B cannot read SYN data from
	the requests migrated from A. To avoid such a situation, cancel
	migration by returning SK_DROP in that type of eBPF program, or
	disable this option.

	Default: 0

tcp_fastopen - INTEGER
	Enable TCP Fast Open (RFC7413) to send and accept data in the opening
	SYN packet.

	The client support is enabled by flag 0x1 (on by default). The client
	then must use sendmsg() or sendto() with the MSG_FASTOPEN flag,
	rather than connect() to send data in SYN.

	The server support is enabled by flag 0x2 (off by default). Then
	either enable for all listeners with another flag (0x400) or
	enable individual listeners via TCP_FASTOPEN socket option with
	the option value being the length of the syn-data backlog.

	The values (bitmap) are

	=====  ======== ======================================================
	  0x1  (client) enables sending data in the opening SYN on the client.
	  0x2  (server) enables the server support, i.e., allowing data in
	                a SYN packet to be accepted and passed to the
	                application before 3-way handshake finishes.
	  0x4  (client) send data in the opening SYN regardless of cookie
	                availability and without a cookie option.
	0x200  (server) accept data-in-SYN w/o any cookie option present.
	0x400  (server) enable all listeners to support Fast Open by
	                default without explicit TCP_FASTOPEN socket option.
	=====  ======== ======================================================

	Default: 0x1

	Note that additional client or server features are only
	effective if the basic support (0x1 and 0x2) are enabled respectively.

tcp_fastopen_blackhole_timeout_sec - INTEGER
	Initial time period in seconds to disable Fastopen on active TCP sockets
	when a TFO firewall blackhole issue happens.
	This time period will grow exponentially when more blackhole issues
	get detected right after Fastopen is re-enabled and will reset to
	initial value when the blackhole issue goes away.
	0 to disable the blackhole detection.

	By default, it is set to 0 (feature is disabled).

tcp_fastopen_key - list of comma separated 32-digit hexadecimal INTEGERs
	The list consists of a primary key and an optional backup key. The
	primary key is used for both creating and validating cookies, while the
	optional backup key is only used for validating cookies. The purpose of
	the backup key is to maximize TFO validation when keys are rotated.

	A randomly chosen primary key may be configured by the kernel if
	the tcp_fastopen sysctl is set to 0x400 (see above), or if the
	TCP_FASTOPEN setsockopt() optname is set and a key has not been
	previously configured via sysctl. If keys are configured via
	setsockopt() by using the TCP_FASTOPEN_KEY optname, then those
	per-socket keys will be used instead of any keys that are specified via
	sysctl.

	A key is specified as 4 8-digit hexadecimal integers which are separated
	by a '-' as: xxxxxxxx-xxxxxxxx-xxxxxxxx-xxxxxxxx. Leading zeros may be
	omitted. A primary and a backup key may be specified by separating them
	by a comma. If only one key is specified, it becomes the primary key and
	any previously configured backup keys are removed.

tcp_syn_retries - INTEGER
	Number of times initial SYNs for an active TCP connection attempt
	will be retransmitted. Should not be higher than 127. Default value
	is 6, which corresponds to 63 seconds till the last retransmission
	with the current initial RTO of 1 second. With this the final timeout
	for an active TCP connection attempt will happen after 127 seconds.

tcp_timestamps - INTEGER
	Enable timestamps as defined in RFC1323.

	- 0: Disabled.
	- 1: Enable timestamps as defined in RFC1323 and use random offset for
	  each connection rather than only using the current time.
	- 2: Like 1, but without random offsets.

	Default: 1

tcp_min_tso_segs - INTEGER
	Minimal number of segments per TSO frame.

	Since linux-3.12, TCP does an automatic sizing of TSO frames,
	depending on flow rate, instead of filling 64Kbytes packets.
	For specific usages, it's possible to force TCP to build big
	TSO frames. Note that TCP stack might split too big TSO packets
	if available window is too small.

	Default: 2

tcp_tso_rtt_log - INTEGER
	Adjustment of TSO packet sizes based on min_rtt

	Starting from linux-5.18, TCP autosizing can be tweaked
	for flows having small RTT.

	Old autosizing was splitting the pacing budget to send 1024 TSO
	per second.

	tso_packet_size = sk->sk_pacing_rate / 1024;

	With the new mechanism, we increase this TSO sizing using:

	distance = min_rtt_usec / (2^tcp_tso_rtt_log)
	tso_packet_size += gso_max_size >> distance;

	This means that flows between very close hosts can use bigger
	TSO packets, reducing their cpu costs.

	If you want to use the old autosizing, set this sysctl to 0.

	Default: 9 (2^9 = 512 usec)

tcp_pacing_ss_ratio - INTEGER
	sk->sk_pacing_rate is set by TCP stack using a ratio applied
	to current rate. (current_rate = cwnd * mss / srtt)
	If TCP is in slow start, tcp_pacing_ss_ratio is applied
	to let TCP probe for bigger speeds, assuming cwnd can be
	doubled every other RTT.

	Default: 200

tcp_pacing_ca_ratio - INTEGER
	sk->sk_pacing_rate is set by TCP stack using a ratio applied
	to current rate. (current_rate = cwnd * mss / srtt)
	If TCP is in congestion avoidance phase, tcp_pacing_ca_ratio
	is applied to conservatively probe for bigger throughput.

	Default: 120

tcp_tso_win_divisor - INTEGER
	This allows control over what percentage of the congestion window
	can be consumed by a single TSO frame.
	The setting of this parameter is a choice between burstiness and
	building larger TSO frames.

	Default: 3

tcp_tw_reuse - INTEGER
	Enable reuse of TIME-WAIT sockets for new connections when it is
	safe from protocol viewpoint.

	- 0 - disable
	- 1 - global enable
	- 2 - enable for loopback traffic only

	It should not be changed without advice/request of technical
	experts.

	Default: 2

tcp_window_scaling - BOOLEAN
	Enable window scaling as defined in RFC1323.

tcp_wmem - vector of 3 INTEGERs: min, default, max
	min: Amount of memory reserved for send buffers for TCP sockets.
	Each TCP socket has the right to use it from the moment it is created.

	Default: 4K

	default: initial size of send buffer used by TCP sockets. This
	value overrides net.core.wmem_default used by other protocols.

	It is usually lower than net.core.wmem_default.

	Default: 16K

	max: Maximal amount of memory allowed for automatically tuned
	send buffers for TCP sockets. This value does not override
	net.core.wmem_max. Calling setsockopt() with SO_SNDBUF disables
	automatic tuning of that socket's send buffer size, in which case
	this value is ignored.

	Default: between 64K and 4MB, depending on RAM size.

tcp_notsent_lowat - UNSIGNED INTEGER
	A TCP socket can control the amount of unsent bytes in its write queue,
	thanks to TCP_NOTSENT_LOWAT socket option. poll()/select()/epoll()
	reports POLLOUT events if the amount of unsent bytes is below a per
	socket value, and if the write queue is not full. sendmsg() will
	also not add new buffers if the limit is hit.

	This global variable controls the amount of unsent data for
	sockets not using TCP_NOTSENT_LOWAT. For these sockets, a change
	to the global variable has immediate effect.

	Default: UINT_MAX (0xFFFFFFFF)

tcp_workaround_signed_windows - BOOLEAN
	If set, assume no receipt of a window scaling option means the
	remote TCP is broken and treats the window as a signed quantity.
	If unset, assume the remote TCP is not broken even if we do
	not receive a window scaling option from them.

	Default: 0

tcp_thin_linear_timeouts - BOOLEAN
	Enable dynamic triggering of linear timeouts for thin streams.
	If set, a check is performed upon retransmission by timeout to
	determine if the stream is thin (less than 4 packets in flight).
	As long as the stream is found to be thin, up to 6 linear
	timeouts may be performed before exponential backoff mode is
	initiated.
	This improves retransmission latency for
	non-aggressive thin streams, often found to be time-dependent.
	For more information on thin streams, see
	Documentation/networking/tcp-thin.rst

	Default: 0

tcp_limit_output_bytes - INTEGER
	Controls TCP Small Queue limit per tcp socket.
	A TCP bulk sender tends to increase packets in flight until it
	gets loss notifications. With SNDBUF autotuning, this can
	result in a large amount of packets queued on the local machine
	(e.g.: qdiscs, CPU backlog, or device) hurting latency of other
	flows, for typical pfifo_fast qdiscs. tcp_limit_output_bytes
	limits the number of bytes on qdisc or device to reduce artificial
	RTT/cwnd and reduce bufferbloat.

	Default: 1048576 (16 * 65536)

tcp_challenge_ack_limit - INTEGER
	Limits number of Challenge ACK sent per second, as recommended
	in RFC 5961 (Improving TCP's Robustness to Blind In-Window Attacks).
	Note that this per netns rate limit can allow some side channel
	attacks and probably should not be enabled.
	TCP stack implements per TCP socket limits anyway.

	Default: INT_MAX (unlimited)

tcp_ehash_entries - INTEGER
	Show the number of hash buckets for TCP sockets in the current
	networking namespace.

	A negative value means the networking namespace does not own its
	hash buckets and shares the initial networking namespace's one.

tcp_child_ehash_entries - INTEGER
	Control the number of hash buckets for TCP sockets in the child
	networking namespace, which must be set before clone() or unshare().

	If the value is not 0, the kernel uses a value rounded up to 2^n
	as the actual hash bucket size. 0 is a special value, meaning
	the child networking namespace will share the initial networking
	namespace's hash buckets.

	Note that the child will use the global one in case the kernel
	fails to allocate enough memory. In addition, the global hash
	buckets are spread over available NUMA nodes, but the allocation
	of the child hash table depends on the current process's NUMA
	policy, which could result in performance differences.

	Note also that the default value of tcp_max_tw_buckets and
	tcp_max_syn_backlog depend on the hash bucket size.

	Possible values: 0, 2^n (n: 0 - 24 (16Mi))

	Default: 0

UDP variables
=============

udp_l3mdev_accept - BOOLEAN
	Enabling this option allows a "global" bound socket to work
	across L3 master domains (e.g., VRFs) with packets capable of
	being received regardless of the L3 domain in which they
	originated. Only valid when the kernel was compiled with
	CONFIG_NET_L3_MASTER_DEV.

	Default: 0 (disabled)

udp_mem - vector of 3 INTEGERs: min, pressure, max
	Number of pages allowed for queueing by all UDP sockets.

	min: Number of pages allowed for queueing by all UDP sockets.

	pressure: This value was introduced to follow format of tcp_mem.

	max: This value was introduced to follow format of tcp_mem.

	Default is calculated at boot time from amount of available memory.

udp_rmem_min - INTEGER
	Minimal size of receive buffer used by UDP sockets in moderation.
	Each UDP socket is able to use the size for receiving data, even if
	total pages of UDP sockets exceed udp_mem pressure. The unit is byte.

	Default: 4K

udp_wmem_min - INTEGER
	UDP does not have tx memory accounting and this tunable has no effect.
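
udp_mem is expressed in pages while udp_rmem_min is expressed in bytes, so
comparing the two needs a conversion by the system page size. A minimal
sketch of that conversion follows. ``parse_udp_mem`` and ``pages_to_bytes``
are hypothetical helper names, and the sample values used in the example
run are made up for illustration.

.. code-block:: python

    import os

    def pages_to_bytes(pages, page_size=None):
        """Convert a page count to bytes (defaults to this system's page size)."""
        if page_size is None:
            page_size = os.sysconf("SC_PAGE_SIZE")
        return pages * page_size

    def parse_udp_mem(text, page_size=None):
        """Parse a 'min pressure max' triple (in pages) into byte values."""
        fields = [int(field) for field in text.split()]
        if len(fields) != 3:
            raise ValueError("expected three integers: min, pressure, max")
        names = ("min", "pressure", "max")
        return {name: pages_to_bytes(value, page_size)
                for name, value in zip(names, fields)}

    if __name__ == "__main__":
        # Made-up sample triple, assuming 4096-byte pages.
        print(parse_udp_mem("1000 2000 3000", page_size=4096))
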

RAW variables
=============

raw_l3mdev_accept - BOOLEAN
	Enabling this option allows a "global" bound socket to work
	across L3 master domains (e.g., VRFs) with packets capable of
	being received regardless of the L3 domain in which they
	originated. Only valid when the kernel was compiled with
	CONFIG_NET_L3_MASTER_DEV.

	Default: 1 (enabled)

CIPSOv4 Variables
=================

cipso_cache_enable - BOOLEAN
	If set, enable additions to and lookups from the CIPSO label mapping
	cache. If unset, additions are ignored and lookups always result in a
	miss. However, regardless of the setting the cache is still
	invalidated when required, which means you can safely toggle this on
	and off and the cache will always be "safe".

	Default: 1

cipso_cache_bucket_size - INTEGER
	The CIPSO label cache consists of a fixed size hash table with each
	hash bucket containing a number of cache entries. This variable limits
	the number of entries in each hash bucket; the larger the value is, the
	more CIPSO label mappings that can be cached. When the number of
	entries in a given hash bucket reaches this limit adding new entries
	causes the oldest entry in the bucket to be removed to make room.

	Default: 10

cipso_rbm_optfmt - BOOLEAN
	Enable the "Optimized Tag 1 Format" as defined in section 3.4.2.6 of
	the CIPSO draft specification (see Documentation/netlabel for details).
	This means that when set the CIPSO tag will be padded with empty
	categories in order to make the packet data 32-bit aligned.

	Default: 0

cipso_rbm_structvalid - BOOLEAN
	If set, do a very strict check of the CIPSO option when
	ip_options_compile() is called. If unset, relax the checks done during
	ip_options_compile().
	Either way is "safe" as errors are caught elsewhere
	in the CIPSO processing code but setting this to 0 (False) should
	result in less work (i.e. it should be faster) but could cause problems
	with other implementations that require strict checking.

	Default: 0

IP Variables
============

ip_local_port_range - 2 INTEGERS
	Defines the local port range that is used by TCP and UDP to
	choose the local port. The first number is the first, the
	second the last local port number.
	If possible, it is better these numbers have different parity
	(one even and one odd value).
	Must be greater than or equal to ip_unprivileged_port_start.
	The default values are 32768 and 60999 respectively.

ip_local_reserved_ports - list of comma separated ranges
	Specify the ports which are reserved for known third-party
	applications. These ports will not be used by automatic port
	assignments (e.g. when calling connect() or bind() with port
	number 0). Explicit port allocation behavior is unchanged.

	The format used for both input and output is a comma separated
	list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and
	10). Writing to the file will clear all previously reserved
	ports and update the current list with the one given in the
	input.

	Note that ip_local_port_range and ip_local_reserved_ports
	settings are independent and both are considered by the kernel
	when determining which ports are available for automatic port
	assignments.

	You can reserve ports which are not in the current
	ip_local_port_range, e.g.::

	    $ cat /proc/sys/net/ipv4/ip_local_port_range
	    32000	60999
	    $ cat /proc/sys/net/ipv4/ip_local_reserved_ports
	    8080,9148

	although this is redundant. However such a setting is useful
	if later the port range is changed to a value that will
	include the reserved ports.
	Also keep in mind that overlapping
	of these ranges may affect the probability of selecting ephemeral
	ports which are right after a block of reserved ports.

	Default: Empty

ip_unprivileged_port_start - INTEGER
	This is a per-namespace sysctl. It defines the first
	unprivileged port in the network namespace. Privileged ports
	require root or CAP_NET_BIND_SERVICE in order to bind to them.
	To disable all privileged ports, set this to 0. They must not
	overlap with the ip_local_port_range.

	Default: 1024

ip_nonlocal_bind - BOOLEAN
	If set, allows processes to bind() to non-local IP addresses,
	which can be quite useful - but may break some applications.

	Default: 0

ip_autobind_reuse - BOOLEAN
	By default, bind() does not select the ports automatically even if
	the new socket and all sockets bound to the port have SO_REUSEADDR.
	ip_autobind_reuse allows bind() to reuse the port and this is useful
	when you use bind()+connect(), but may break some applications.
	The preferred solution is to use IP_BIND_ADDRESS_NO_PORT and this
	option should only be set by experts.

	Default: 0

ip_dynaddr - INTEGER
	If set non-zero, enables support for dynamic addresses.
	If set to a non-zero value larger than 1, a kernel log
	message will be printed when dynamic address rewriting
	occurs.

	Default: 0

ip_early_demux - BOOLEAN
	Optimize input packet processing down to one demux for
	certain kinds of local sockets. Currently we only do this
	for established TCP and connected UDP sockets.

	It may add an additional cost for pure routing workloads that
	reduces overall throughput, in such case you should disable it.

	Default: 1

ping_group_range - 2 INTEGERS
	Restrict ICMP_PROTO datagram sockets to users in the group range.
	The default is "1 0", meaning, that nobody (not even root) may
	create ping sockets. Setting it to "100 100" would grant permissions
	to the single group. "0 4294967294" would enable it for the world, "100
	4294967294" would enable it for the users, but not daemons.

tcp_early_demux - BOOLEAN
	Enable early demux for established TCP sockets.

	Default: 1

udp_early_demux - BOOLEAN
	Enable early demux for connected UDP sockets. Disable this if
	your system could experience more unconnected load.

	Default: 1

icmp_echo_ignore_all - BOOLEAN
	If set non-zero, then the kernel will ignore all ICMP ECHO
	requests sent to it.

	Default: 0

icmp_echo_enable_probe - BOOLEAN
	If set to one, then the kernel will respond to RFC 8335 PROBE
	requests sent to it.

	Default: 0

icmp_echo_ignore_broadcasts - BOOLEAN
	If set non-zero, then the kernel will ignore all ICMP ECHO and
	TIMESTAMP requests sent to it via broadcast/multicast.

	Default: 1

icmp_ratelimit - INTEGER
	Limit the maximal rates for sending ICMP packets whose type matches
	icmp_ratemask (see below) to specific targets.
	0 to disable any limiting,
	otherwise the minimal space between responses in milliseconds.
	Note that another sysctl, icmp_msgs_per_sec limits the number
	of ICMP packets sent on all targets.

	Default: 1000

icmp_msgs_per_sec - INTEGER
	Limit maximal number of ICMP packets sent per second from this host.
	Only messages whose type matches icmp_ratemask (see below) are
	controlled by this limit. For security reasons, the precise count
	of messages per second is randomized.

	Default: 1000

icmp_msgs_burst - INTEGER
	icmp_msgs_per_sec controls number of ICMP packets sent per second,
	while icmp_msgs_burst controls the burst size of these packets.
	For security reasons, the precise burst size is randomized.

	Default: 50

icmp_ratemask - INTEGER
	Mask made of ICMP types for which rates are being limited.

	Significant bits: IHGFEDCBA9876543210

	Default mask:     0000001100000011000 (6168)

	Bit definitions (see include/linux/icmp.h):

		= =========================
		0 Echo Reply
		3 Destination Unreachable [1]_
		4 Source Quench [1]_
		5 Redirect
		8 Echo Request
		B Time Exceeded [1]_
		C Parameter Problem [1]_
		D Timestamp Request
		E Timestamp Reply
		F Info Request
		G Info Reply
		H Address Mask Request
		I Address Mask Reply
		= =========================

	.. [1] These are rate limited by default (see default mask above)

icmp_ignore_bogus_error_responses - BOOLEAN
	Some routers violate RFC1122 by sending bogus responses to broadcast
	frames. Such violations are normally logged via a kernel warning.
	If this is set to TRUE, the kernel will not give such warnings, which
	will avoid log file clutter.

	Default: 1

icmp_errors_use_inbound_ifaddr - BOOLEAN

	If zero, icmp error messages are sent with the primary address of
	the exiting interface.

	If non-zero, the message will be sent with the primary address of
	the interface that received the packet that caused the icmp error.
	This is the behaviour many network administrators will expect from
	a router. And it can make debugging complicated network layouts
	much easier.

	Note that if no primary address exists for the interface selected,
	then the primary address of the first non-loopback interface that
	has one will be used regardless of this setting.

	Default: 0

igmp_max_memberships - INTEGER
	Change the maximum number of multicast groups we can subscribe to.
	Default: 20

	Theoretical maximum value is bounded by having to send a membership
	report in a single datagram (i.e. the report can't span multiple
	datagrams, or risk confusing the switch and leaving groups you don't
	intend to).

	The number of supported groups 'M' is bounded by the number of group
	report entries you can fit into a single datagram of 65535 bytes.

	M = (65536 - sizeof(ip header)) / sizeof(Group record)

	Group records are variable length, with a minimum of 12 bytes.
	So net.ipv4.igmp_max_memberships should not be set higher than:

	(65536-24) / 12 = 5459

	The value 5459 assumes no IP header options, so in practice
	this number may be lower.

igmp_max_msf - INTEGER
	Maximum number of addresses allowed in the source filter list for a
	multicast group.

	Default: 10

igmp_qrv - INTEGER
	Controls the IGMP query robustness variable (see RFC2236 8.1).

	Default: 2 (as specified by RFC2236 8.1)

	Minimum: 1 (as specified by RFC6636 4.5)

force_igmp_version - INTEGER
	- 0 - (default) No enforcement of an IGMP version, IGMPv1/v2 fallback
	  allowed. Will go back to IGMPv3 mode again once all IGMPv1/v2 Querier
	  Present timers expire.
	- 1 - Enforce the use of IGMP version 1. Will also reply with an IGMPv1
	  report if an IGMPv2/v3 query is received.
	- 2 - Enforce the use of IGMP version 2. Will fall back to IGMPv1 if an
	  IGMPv1 query message is received. Will reply with a report if an
	  IGMPv3 query is received.
	- 3 - Enforce the use of IGMP version 3. Reacts the same as the
	  default 0.

	.. note::

	   This is not the same as force_mld_version, because the IGMPv3
	   RFC3376 Security Considerations do not clearly state that other
	   version messages may be ignored completely, as MLDv2 RFC3810 does.
	   Keeping this value at the default 0 is therefore recommended.
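
The upper bound quoted for igmp_max_memberships above is plain integer
arithmetic, which the following sketch reproduces. ``max_group_records`` is
a hypothetical helper; its defaults are simply the figures used in the text
(65536, 24 and 12), not values read from the kernel.

.. code-block:: python

    def max_group_records(datagram_limit=65536, ip_header=24, record_size=12):
        """Number of minimum-size group records that fit in one datagram."""
        return (datagram_limit - ip_header) // record_size

    if __name__ == "__main__":
        print(max_group_records())

A larger IP header or larger group records only lower the result, which is
why the text treats 5459 as a ceiling.
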

``conf/interface/*``
	changes special settings per interface (where
	"interface" is the name of your network interface)

``conf/all/*``
	is special, changes the settings for all interfaces

log_martians - BOOLEAN
	Log packets with impossible addresses to kernel log.
	log_martians for the interface will be enabled if at least one of
	conf/{all,interface}/log_martians is set to TRUE,
	it will be disabled otherwise

accept_redirects - BOOLEAN
	Accept ICMP redirect messages.
	accept_redirects for the interface will be enabled if:

	- both conf/{all,interface}/accept_redirects are TRUE in the case
	  forwarding for the interface is enabled

	or

	- at least one of conf/{all,interface}/accept_redirects is TRUE in the
	  case forwarding for the interface is disabled

	accept_redirects for the interface will be disabled otherwise

	default:

	- TRUE (host)
	- FALSE (router)

forwarding - BOOLEAN
	Enable IP forwarding on this interface. This controls whether packets
	received _on_ this interface can be forwarded.

mc_forwarding - BOOLEAN
	Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE
	and a multicast routing daemon is required.
	conf/all/mc_forwarding must also be set to TRUE to enable multicast
	routing for the interface

medium_id - INTEGER
	Integer value used to differentiate the devices by the medium they
	are attached to. Two devices can have different id values when
	the broadcast packets are received only on one of them.
	The default value 0 means that the device is the only interface
	to its medium, value of -1 means that medium is not known.

	Currently, it is used to change the proxy_arp behavior:
	the proxy_arp feature is enabled for packets forwarded between
	two devices attached to different media.
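
The per-interface entries above combine the conf/all and conf/interface
values in different ways: log_martians uses OR, mc_forwarding needs both,
and accept_redirects switches between the two depending on forwarding. The
sketch below restates those rules as code. The function names are
hypothetical and the code mirrors the wording of this document only; it is
not taken from the kernel sources.

.. code-block:: python

    def log_martians_enabled(all_value, if_value):
        """Enabled if at least one of conf/{all,interface} is set."""
        return bool(all_value or if_value)

    def mc_forwarding_enabled(all_value, if_value):
        """conf/all must be set in addition to the interface value."""
        return bool(all_value and if_value)

    def accept_redirects_enabled(all_value, if_value, forwarding):
        """AND of both values when forwarding is enabled, OR otherwise."""
        if forwarding:
            return bool(all_value and if_value)
        return bool(all_value or if_value)

    if __name__ == "__main__":
        # Only the per-interface value is set: accepted on a host,
        # rejected once forwarding is enabled for the interface.
        print(accept_redirects_enabled(0, 1, forwarding=False))
        print(accept_redirects_enabled(0, 1, forwarding=True))
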

proxy_arp - BOOLEAN
	Do proxy arp.

	proxy_arp for the interface will be enabled if at least one of
	conf/{all,interface}/proxy_arp is set to TRUE,
	it will be disabled otherwise

proxy_arp_pvlan - BOOLEAN
	Private VLAN proxy arp.

	Basically allow proxy arp replies back to the same interface
	(from which the ARP request/solicitation was received).

	This is done to support (ethernet) switch features, like RFC
	3069, where the individual ports are NOT allowed to
	communicate with each other, but they are allowed to talk to
	the upstream router. As described in RFC 3069, it is possible
	to allow these hosts to communicate through the upstream
	router by proxy_arp'ing. It does not need to be used together
	with proxy_arp.

	This technology is known by different names:

	- In RFC 3069 it is called VLAN Aggregation.
	- Cisco and Allied Telesyn call it Private VLAN.
	- Hewlett-Packard call it Source-Port filtering or port-isolation.
	- Ericsson call it MAC-Forced Forwarding (RFC Draft).

shared_media - BOOLEAN
	Send(router) or accept(host) RFC1620 shared media redirects.
	Overrides secure_redirects.

	shared_media for the interface will be enabled if at least one of
	conf/{all,interface}/shared_media is set to TRUE,
	it will be disabled otherwise

	default TRUE

secure_redirects - BOOLEAN
	Accept ICMP redirect messages only to gateways listed in the
	interface's current gateway list. Even if disabled, RFC1122 redirect
	rules still apply.

	Overridden by shared_media.

	secure_redirects for the interface will be enabled if at least one of
	conf/{all,interface}/secure_redirects is set to TRUE,
	it will be disabled otherwise

	default TRUE

send_redirects - BOOLEAN
	Send redirects, if router.

	send_redirects for the interface will be enabled if at least one of
	conf/{all,interface}/send_redirects is set to TRUE,
	it will be disabled otherwise

	Default: TRUE

bootp_relay - BOOLEAN
	Accept packets with source address 0.b.c.d destined
	not to this host as local ones. It is supposed that a
	BOOTP relay daemon will catch and forward such packets.
	conf/all/bootp_relay must also be set to TRUE to enable BOOTP relay
	for the interface

	default FALSE

	Not Implemented Yet.

accept_source_route - BOOLEAN
	Accept packets with SRR option.
	conf/all/accept_source_route must also be set to TRUE to accept packets
	with SRR option on the interface

	default

	- TRUE (router)
	- FALSE (host)

accept_local - BOOLEAN
	Accept packets with local source addresses. In combination with
	suitable routing, this can be used to direct packets between two
	local interfaces over the wire and have them accepted properly.

	default FALSE

route_localnet - BOOLEAN
	Do not consider loopback addresses as martian source or destination
	while routing. This enables the use of 127/8 for local routing purposes.

	default FALSE

rp_filter - INTEGER
	- 0 - No source validation.
	- 1 - Strict mode as defined in RFC3704 Strict Reverse Path
	  Each incoming packet is tested against the FIB and if the interface
	  is not the best reverse path the packet check will fail.
	  By default failed packets are discarded.
	- 2 - Loose mode as defined in RFC3704 Loose Reverse Path
	  Each incoming packet's source address is also tested against the FIB
	  and if the source address is not reachable via any interface
	  the packet check will fail.

	Current recommended practice in RFC3704 is to enable strict mode
	to prevent IP spoofing from DDoS attacks.
	If using asymmetric routing
	or other complicated routing, then loose mode is recommended.

	The max value from conf/{all,interface}/rp_filter is used
	when doing source validation on the {interface}.

	Default value is 0. Note that some distributions enable it
	in startup scripts.

src_valid_mark - BOOLEAN
	- 0 - The fwmark of the packet is not included in reverse path
	  route lookup. This allows for asymmetric routing configurations
	  utilizing the fwmark in only one direction, e.g., transparent
	  proxying.

	- 1 - The fwmark of the packet is included in reverse path route
	  lookup. This permits rp_filter to function when the fwmark is
	  used for routing traffic in both directions.

	This setting also affects the utilization of fwmark when
	performing source address selection for ICMP replies, or
	determining addresses stored for the IPOPT_TS_TSANDADDR and
	IPOPT_RR IP options.

	The max value from conf/{all,interface}/src_valid_mark is used.

	Default value is 0.

arp_filter - BOOLEAN
	- 1 - Allows you to have multiple network interfaces on the same
	  subnet, and have the ARPs for each interface be answered
	  based on whether or not the kernel would route a packet from
	  the ARP'd IP out that interface (therefore you must use source
	  based routing for this to work). In other words it allows control
	  of which cards (usually 1) will respond to an arp request.

	- 0 - (default) The kernel can respond to arp requests with addresses
	  from other interfaces. This may seem wrong but it usually makes
	  sense, because it increases the chance of successful communication.
	  IP addresses are owned by the complete host on Linux, not by
	  particular interfaces. Only for more complex setups like
	  load-balancing, does this behaviour cause problems.

	arp_filter for the interface will be enabled if at least one of
	conf/{all,interface}/arp_filter is set to TRUE,
	it will be disabled otherwise

arp_announce - INTEGER
	Define different restriction levels for announcing the local
	source IP address from IP packets in ARP requests sent on
	interface:

	- 0 - (default) Use any local address, configured on any interface
	- 1 - Try to avoid local addresses that are not in the target's
	  subnet for this interface. This mode is useful when target
	  hosts reachable via this interface require the source IP
	  address in ARP requests to be part of their logical network
	  configured on the receiving interface. When we generate the
	  request we will check all our subnets that include the
	  target IP and will preserve the source address if it is from
	  such subnet. If there is no such subnet we select source
	  address according to the rules for level 2.
	- 2 - Always use the best local address for this target.
	  In this mode we ignore the source address in the IP packet
	  and try to select local address that we prefer for talks with
	  the target host. Such local address is selected by looking
	  for primary IP addresses on all our subnets on the outgoing
	  interface that include the target IP address. If no suitable
	  local address is found we select the first local address
	  we have on the outgoing interface or on all other interfaces,
	  with the hope we will receive reply for our request and
	  even sometimes no matter the source IP address we announce.

	The max value from conf/{all,interface}/arp_announce is used.

	Increasing the restriction level gives more chance for
	receiving answer from the resolved target while decreasing
	the level announces more valid sender's information.
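
For settings such as arp_announce (and rp_filter above), the effective
value is the larger of the conf/all and conf/interface entries. A minimal
sketch of that lookup follows. ``read_conf`` and ``effective_max`` are
hypothetical helpers, "eth0" is only an example interface name, and the
``root`` parameter exists so the logic can be tried against a scratch
directory instead of the live /proc tree.

.. code-block:: python

    from pathlib import Path

    CONF_ROOT = Path("/proc/sys/net/ipv4/conf")

    def read_conf(scope, name, root=CONF_ROOT):
        """Read one integer setting, e.g. scope='all', name='arp_announce'."""
        return int((Path(root) / scope / name).read_text().strip())

    def effective_max(name, interface, root=CONF_ROOT):
        """Return max(conf/all/<name>, conf/<interface>/<name>)."""
        return max(read_conf("all", name, root),
                   read_conf(interface, name, root))

    if __name__ == "__main__":
        try:
            print(effective_max("arp_announce", "eth0"))
        except OSError as err:
            # No such interface, or not running on Linux.
            print("could not read sysctl:", err)
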

arp_ignore - INTEGER
        Define different modes for sending replies in response to
        received ARP requests that resolve local target IP addresses:

        - 0 - (default): reply for any local target IP address, configured
          on any interface
        - 1 - reply only if the target IP address is a local address
          configured on the incoming interface
        - 2 - reply only if the target IP address is a local address
          configured on the incoming interface and both it and the
          sender's IP address are part of the same subnet on this interface
        - 3 - do not reply for local addresses configured with scope host,
          only resolutions for global and link addresses are replied
        - 4-7 - reserved
        - 8 - do not reply for any local address

        The max value from conf/{all,interface}/arp_ignore is used
        when an ARP request is received on the {interface}

arp_notify - BOOLEAN
        Define mode for notification of address and device changes.

        ==  ==========================================================
         0  (default): do nothing
         1  Generate gratuitous arp requests when device is brought up
            or hardware address changes.
        ==  ==========================================================

arp_accept - INTEGER
        Define behavior for accepting gratuitous ARP (garp) frames from devices
        that are not already present in the ARP table:

        - 0 - don't create new entries in the ARP table
        - 1 - create new entries in the ARP table
        - 2 - create new entries only if the source IP address is in the same
          subnet as an address configured on the interface that received the
          garp message.

        Both reply-type and request-type gratuitous ARP frames will trigger
        an ARP table update if this setting is on.

        If the ARP table already contains the IP address of the
        gratuitous arp frame, the arp table will be updated regardless
        of whether this setting is on or off.
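
The arp_accept modes above can be read as a small decision procedure. The
sketch below only models that description; it is not how the kernel
implements it. The function name, the string results and the way interface
addresses are passed (as "address/prefix" strings) are assumptions made for
illustration.

```python
import ipaddress


def garp_action(arp_accept, entry_exists, sender_ip, iface_addrs):
    """Return 'update', 'create' or 'ignore' for a gratuitous ARP frame.

    arp_accept   -- sysctl value (0, 1 or 2)
    entry_exists -- True if the sender IP is already in the ARP table
    sender_ip    -- source IP address of the garp frame, as a string
    iface_addrs  -- addresses on the receiving interface, e.g. "192.0.2.1/24"
    """
    # Existing entries are updated regardless of the setting.
    if entry_exists:
        return "update"
    if arp_accept == 1:
        return "create"
    if arp_accept == 2:
        sender = ipaddress.ip_address(sender_ip)
        # Create only if the sender is in a subnet configured on the interface.
        for addr in iface_addrs:
            if sender in ipaddress.ip_interface(addr).network:
                return "create"
    return "ignore"


if __name__ == "__main__":
    print(garp_action(2, False, "192.0.2.7", ["192.0.2.1/24"]))
    print(garp_action(2, False, "198.51.100.9", ["192.0.2.1/24"]))
```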

arp_evict_nocarrier - BOOLEAN
        Clears the ARP cache on NOCARRIER events. This option is important for
        wireless devices where the ARP cache should not be cleared when roaming
        between access points on the same network. In most cases this should
        remain as the default (1).

        - 1 - (default): Clear the ARP cache on NOCARRIER events
        - 0 - Do not clear ARP cache on NOCARRIER events

mcast_solicit - INTEGER
        The maximum number of multicast probes in INCOMPLETE state,
        when the associated hardware address is unknown. Defaults
        to 3.

ucast_solicit - INTEGER
        The maximum number of unicast probes in PROBE state, when
        the hardware address is being reconfirmed. Defaults to 3.

app_solicit - INTEGER
        The maximum number of probes to send to the user space ARP daemon
        via netlink before dropping back to multicast probes (see
        mcast_resolicit). Defaults to 0.

mcast_resolicit - INTEGER
        The maximum number of multicast probes after unicast and
        app probes in PROBE state. Defaults to 0.

disable_policy - BOOLEAN
        Disable IPSEC policy (SPD) for this interface

disable_xfrm - BOOLEAN
        Disable IPSEC encryption on this interface, whatever the policy

igmpv2_unsolicited_report_interval - INTEGER
        The interval in milliseconds in which the next unsolicited
        IGMPv1 or IGMPv2 report retransmit will take place.

        Default: 10000 (10 seconds)

igmpv3_unsolicited_report_interval - INTEGER
        The interval in milliseconds in which the next unsolicited
        IGMPv3 report retransmit will take place.

        Default: 1000 (1 second)

ignore_routes_with_linkdown - BOOLEAN
        Ignore routes whose link is down when performing a FIB lookup.

promote_secondaries - BOOLEAN
        When a primary IP address is removed from this interface
        promote a corresponding secondary IP address instead of
        removing all the corresponding secondary IP addresses.

drop_unicast_in_l2_multicast - BOOLEAN
        Drop any unicast IP packets that are received in link-layer
        multicast (or broadcast) frames.

        This behavior (for multicast) is actually a SHOULD in RFC
        1122, but is disabled by default for compatibility reasons.

        Default: off (0)

drop_gratuitous_arp - BOOLEAN
        Drop all gratuitous ARP frames, for example if there's a known
        good ARP proxy on the network and such frames need not be used
        (or in the case of 802.11, must not be used to prevent attacks.)

        Default: off (0)


tag - INTEGER
        Allows you to write a number, which can be used as required.

        Default value is 0.

xfrm4_gc_thresh - INTEGER
        (Obsolete since linux-4.14)
        The threshold at which we will start garbage collecting for IPv4
        destination cache entries. At twice this value the system will
        refuse new allocations.

igmp_link_local_mcast_reports - BOOLEAN
        Enable IGMP reports for link local multicast groups in the
        224.0.0.X range.

        Default: TRUE

Alexey Kuznetsov.
kuznet@ms2.inr.ac.ru

Updated by:

- Andi Kleen
  ak@muc.de
- Nicolas Delon
  delon.nicolas@wanadoo.fr


/proc/sys/net/ipv6/* Variables
==============================

IPv6 has no global variables such as tcp_*. tcp_* settings under ipv4/ also
apply to IPv6 [XXX?].

bindv6only - BOOLEAN
        Default value for IPV6_V6ONLY socket option,
        which restricts use of the IPv6 socket to IPv6 communication
        only.

        - TRUE: disable IPv4-mapped address feature
        - FALSE: enable IPv4-mapped address feature

        Default: FALSE (as specified in RFC3493)

flowlabel_consistency - BOOLEAN
        Protect the consistency (and uniqueness) of flow labels.
        You have to disable it to use the IPV6_FL_F_REFLECT flag on the
        flow label manager.

        - TRUE: enabled
        - FALSE: disabled

        Default: TRUE

auto_flowlabels - INTEGER
        Automatically generate flow labels based on a flow hash of the
        packet. This allows intermediate devices, such as routers, to
        identify packet flows for mechanisms like Equal Cost Multipath
        Routing (see RFC 6438).

        =  ===========================================================
        0  automatic flow labels are completely disabled
        1  automatic flow labels are enabled by default, they can be
           disabled on a per socket basis using the IPV6_AUTOFLOWLABEL
           socket option
        2  automatic flow labels are allowed, they may be enabled on a
           per socket basis using the IPV6_AUTOFLOWLABEL socket option
        3  automatic flow labels are enabled and enforced, they cannot
           be disabled by the socket option
        =  ===========================================================

        Default: 1

flowlabel_state_ranges - BOOLEAN
        Split the flow label number space into two ranges. 0-0x7FFFF is
        reserved for the IPv6 flow manager facility, 0x80000-0xFFFFF
        is reserved for stateless flow labels as described in RFC6437.

        - TRUE: enabled
        - FALSE: disabled

        Default: TRUE

flowlabel_reflect - INTEGER
        Control flow label reflection. Needed for Path MTU
        Discovery to work with Equal Cost Multipath Routing in anycast
        environments. See RFC 7690 and:
        https://tools.ietf.org/html/draft-wang-6man-flow-label-reflection-01

        This is a bitmask.

        - 1: enabled for established flows

          Note that this prevents automatic flowlabel changes, as done
          in "tcp: change IPv6 flow-label upon receiving spurious retransmission"
          and "tcp: Change txhash on every SYN and RTO retransmit"

        - 2: enabled for TCP RESET packets (no active listener)
          If set, a RST packet sent in response to a SYN packet on a closed
          port will reflect the incoming flow label.

        - 4: enabled for ICMPv6 echo reply messages.

        Default: 0

fib_multipath_hash_policy - INTEGER
        Controls which hash policy to use for multipath routes.

        Default: 0 (Layer 3)

        Possible values:

        - 0 - Layer 3 (source and destination addresses plus flow label)
        - 1 - Layer 4 (standard 5-tuple)
        - 2 - Layer 3 or inner Layer 3 if present
        - 3 - Custom multipath hash. Fields used for multipath hash calculation
          are determined by fib_multipath_hash_fields sysctl

fib_multipath_hash_fields - UNSIGNED INTEGER
        When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
        fields used for multipath hash calculation are determined by this
        sysctl.

        This value is a bitmask which enables various fields for multipath hash
        calculation.

        Possible fields are:

        ====== ============================
        0x0001 Source IP address
        0x0002 Destination IP address
        0x0004 IP protocol
        0x0008 Flow Label
        0x0010 Source port
        0x0020 Destination port
        0x0040 Inner source IP address
        0x0080 Inner destination IP address
        0x0100 Inner IP protocol
        0x0200 Inner Flow Label
        0x0400 Inner source port
        0x0800 Inner destination port
        ====== ============================

        Default: 0x0007 (source IP, destination IP and IP protocol)

anycast_src_echo_reply - BOOLEAN
        Controls the use of anycast addresses as source addresses for ICMPv6
        echo reply

        - TRUE: enabled
        - FALSE: disabled

        Default: FALSE

idgen_delay - INTEGER
        Controls the delay in seconds before retrying stable privacy
        address generation after a DAD conflict is detected.

        Default: 1 (as specified in RFC7217)

idgen_retries - INTEGER
        Controls the number of retries to generate a stable privacy
        address if a DAD conflict is detected.

        Default: 3 (as specified in RFC7217)

mld_qrv - INTEGER
        Controls the MLD query robustness variable (see RFC3810 9.1).

        Default: 2 (as specified by RFC3810 9.1)

        Minimum: 1 (as specified by RFC6636 4.5)

max_dst_opts_number - INTEGER
        Maximum number of non-padding TLVs allowed in a Destination
        options extension header. If this value is less than zero
        then unknown options are disallowed and the number of known
        TLVs allowed is the absolute value of this number.

        Default: 8

max_hbh_opts_number - INTEGER
        Maximum number of non-padding TLVs allowed in a Hop-by-Hop
        options extension header. If this value is less than zero
        then unknown options are disallowed and the number of known
        TLVs allowed is the absolute value of this number.

        Default: 8

max_dst_opts_length - INTEGER
        Maximum length allowed for a Destination options extension
        header.

        Default: INT_MAX (unlimited)

max_hbh_length - INTEGER
        Maximum length allowed for a Hop-by-Hop options extension
        header.

        Default: INT_MAX (unlimited)

skip_notify_on_dev_down - BOOLEAN
        Controls whether an RTM_DELROUTE message is generated for routes
        removed when a device is taken down or deleted. IPv4 does not
        generate this message; IPv6 does by default. Setting this sysctl
        to true skips the message, making IPv4 and IPv6 on par in relying
        on userspace caches to track link events and evict routes.

        Default: false (generate message)

nexthop_compat_mode - BOOLEAN
        New nexthop API provides a means for managing nexthops independent of
        prefixes. Backwards compatibility with old route format is enabled by
        default which means route dumps and notifications contain the new
        nexthop attribute but also the full, expanded nexthop definition.
        Further, updates or deletes of a nexthop configuration generate route
        notifications for each fib entry using the nexthop. Once a system
        understands the new API, this sysctl can be disabled to achieve full
        performance benefits of the new API by disabling the nexthop expansion
        and extraneous notifications.

        Default: true (backward compat mode)

fib_notify_on_flag_change - INTEGER
        Whether to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/
        RTM_F_TRAP/RTM_F_OFFLOAD_FAILED flags are changed.

        After installing a route to the kernel, user space receives an
        acknowledgment, which means the route was installed in the kernel,
        but not necessarily in hardware.
        It is also possible for a route already installed in hardware to change
        its action and therefore its flags.
        For example, a host route that is trapping packets can be "promoted"
        to perform decapsulation following the installation of an
        IPinIP/VXLAN tunnel.
        The notifications will indicate to user-space the state of the route.

        Default: 0 (Do not emit notifications.)

        Possible values:

        - 0 - Do not emit notifications.
        - 1 - Emit notifications.
        - 2 - Emit notifications only for RTM_F_OFFLOAD_FAILED flag change.

ioam6_id - INTEGER
        Define the IOAM id of this node. Uses only 24 bits out of 32 in total.

        Min: 0
        Max: 0xFFFFFF

        Default: 0xFFFFFF

ioam6_id_wide - LONG INTEGER
        Define the wide IOAM id of this node. Uses only 56 bits out of 64 in
        total. Can be different from ioam6_id.

        Min: 0
        Max: 0xFFFFFFFFFFFFFF

        Default: 0xFFFFFFFFFFFFFF

IPv6 Fragmentation:

ip6frag_high_thresh - INTEGER
        Maximum memory used to reassemble IPv6 fragments. When
        ip6frag_high_thresh bytes of memory are allocated for this purpose,
        the fragment handler will toss packets until ip6frag_low_thresh
        is reached.

ip6frag_low_thresh - INTEGER
        See ip6frag_high_thresh

ip6frag_time - INTEGER
        Time in seconds to keep an IPv6 fragment in memory.

``conf/default/*``:
        Change the interface-specific default settings.

        These settings are used when creating new interfaces.


``conf/all/*``:
        Change all the interface-specific settings.

        [XXX:  Other special features than forwarding?]

conf/all/disable_ipv6 - BOOLEAN
        Changing this value is the same as changing ``conf/default/disable_ipv6``
        setting and also all per-interface ``disable_ipv6`` settings to the same
        value.

        Reading this value does not have any particular meaning. It does not say
        whether IPv6 support is enabled or disabled.
        The returned value can be 1 even when some interface has
        ``disable_ipv6`` set to 0 and has configured IPv6 addresses.

conf/all/forwarding - BOOLEAN
        Enable global IPv6 forwarding between all interfaces.

        IPv4 and IPv6 work differently here; e.g. netfilter must be used
        to control which interfaces may forward packets and which not.

        This also sets all interfaces' Host/Router setting
        'forwarding' to the specified value. See below for details.

        This is referred to as global forwarding.

proxy_ndp - BOOLEAN
        Do proxy ndp.

fwmark_reflect - BOOLEAN
        Controls the fwmark of kernel-generated IPv6 reply packets that are not
        associated with a socket (for example, TCP RSTs or ICMPv6 echo replies).
        If unset, these packets have a fwmark of zero. If set, they have the
        fwmark of the packet they are replying to.

        Default: 0

``conf/interface/*``:
        Change special settings per interface.

        The functional behaviour for certain settings is different
        depending on whether local forwarding is enabled or not.

accept_ra - INTEGER
        Accept Router Advertisements; autoconfigure using them.

        It also determines whether or not to transmit Router
        Solicitations. If and only if the functional setting is to
        accept Router Advertisements, Router Solicitations will be
        transmitted.

        Possible values are:

        ==  ===========================================================
         0  Do not accept Router Advertisements.
         1  Accept Router Advertisements if forwarding is disabled.
         2  Overrule forwarding behaviour. Accept Router Advertisements
            even if forwarding is enabled.
        ==  ===========================================================

        Functional default:

        - enabled if local forwarding is disabled.
        - disabled if local forwarding is enabled.
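
The interaction between accept_ra and forwarding described in the table
above can be expressed compactly. The following is a sketch of that
description only, with a hypothetical function name; it does not reproduce
the kernel's code.

```python
def accepts_router_advertisements(accept_ra: int, forwarding: bool) -> bool:
    """Return True if Router Advertisements are accepted on the interface.

    Per the description above, Router Solicitations are transmitted if and
    only if this returns True.
    """
    if accept_ra == 0:
        return False            # never accept
    if accept_ra == 1:
        return not forwarding   # accept only while acting as a host
    if accept_ra == 2:
        return True             # overrule the forwarding behaviour
    raise ValueError("accept_ra must be 0, 1 or 2")


if __name__ == "__main__":
    for value in (0, 1, 2):
        for fwd in (False, True):
            print(value, fwd, accepts_router_advertisements(value, fwd))
```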

accept_ra_defrtr - BOOLEAN
        Learn default router in Router Advertisement.

        Functional default:

        - enabled if accept_ra is enabled.
        - disabled if accept_ra is disabled.

ra_defrtr_metric - UNSIGNED INTEGER
        Route metric for default route learned in Router Advertisement. This value
        will be assigned as metric for the default route learned via IPv6 Router
        Advertisement. Takes effect only if accept_ra_defrtr is enabled.

        Possible values: 1 to 0xFFFFFFFF

        Default: IP6_RT_PRIO_USER, i.e. 1024.

accept_ra_from_local - BOOLEAN
        Accept RA with source-address that is found on local machine
        if the RA is otherwise proper and able to be accepted.

        Default is to NOT accept these as it may be an un-intended
        network loop.

        Functional default:

        - enabled if accept_ra_from_local is enabled
          on a specific interface.
        - disabled if accept_ra_from_local is disabled
          on a specific interface.

accept_ra_min_hop_limit - INTEGER
        Minimum hop limit Information in Router Advertisement.

        Hop limit Information in Router Advertisement less than this
        variable shall be ignored.

        Default: 1

accept_ra_min_lft - INTEGER
        Minimum acceptable lifetime value in Router Advertisement.

        RA sections with a lifetime less than this value shall be
        ignored. Zero lifetimes stay unaffected.

        Default: 0

accept_ra_pinfo - BOOLEAN
        Learn Prefix Information in Router Advertisement.

        Functional default:

        - enabled if accept_ra is enabled.
        - disabled if accept_ra is disabled.

accept_ra_rt_info_min_plen - INTEGER
        Minimum prefix length of Route Information in RA.

        Route Information with a prefix smaller than this variable shall
        be ignored.

        Functional default:

        * 0 if accept_ra_rtr_pref is enabled.
        * -1 if accept_ra_rtr_pref is disabled.

accept_ra_rt_info_max_plen - INTEGER
        Maximum prefix length of Route Information in RA.

        Route Information with a prefix larger than this variable shall
        be ignored.

        Functional default:

        * 0 if accept_ra_rtr_pref is enabled.
        * -1 if accept_ra_rtr_pref is disabled.

accept_ra_rtr_pref - BOOLEAN
        Accept Router Preference in RA.

        Functional default:

        - enabled if accept_ra is enabled.
        - disabled if accept_ra is disabled.

accept_ra_mtu - BOOLEAN
        Apply the MTU value specified in RA option 5 (RFC4861). If
        disabled, the MTU specified in the RA will be ignored.

        Functional default:

        - enabled if accept_ra is enabled.
        - disabled if accept_ra is disabled.

accept_redirects - BOOLEAN
        Accept Redirects.

        Functional default:

        - enabled if local forwarding is disabled.
        - disabled if local forwarding is enabled.

accept_source_route - INTEGER
        Accept source routing (routing extension header).

        - >= 0: Accept only routing header type 2.
        - < 0: Do not accept routing header.

        Default: 0

autoconf - BOOLEAN
        Autoconfigure addresses using Prefix Information in Router
        Advertisements.

        Functional default:

        - enabled if accept_ra_pinfo is enabled.
        - disabled if accept_ra_pinfo is disabled.

dad_transmits - INTEGER
        The number of Duplicate Address Detection probes to send.

        Default: 1

forwarding - INTEGER
        Configure interface-specific Host/Router behaviour.

        .. note::

           It is recommended to have the same setting on all
           interfaces; mixed router/host scenarios are rather uncommon.

        Possible values are:

        - 0 Forwarding disabled
        - 1 Forwarding enabled

        **FALSE (0)**:

        By default, Host behaviour is assumed.
        This means:

        1. IsRouter flag is not set in Neighbour Advertisements.
        2. If accept_ra is TRUE (default), transmit Router
           Solicitations.
        3. If accept_ra is TRUE (default), accept Router
           Advertisements (and do autoconfiguration).
        4. If accept_redirects is TRUE (default), accept Redirects.

        **TRUE (1)**:

        If local forwarding is enabled, Router behaviour is assumed.
        This means exactly the reverse of the above:

        1. IsRouter flag is set in Neighbour Advertisements.
        2. Router Solicitations are not sent unless accept_ra is 2.
        3. Router Advertisements are ignored unless accept_ra is 2.
        4. Redirects are ignored.

        Default: 0 (disabled) if global forwarding is disabled (default),
        otherwise 1 (enabled).

hop_limit - INTEGER
        Default Hop Limit to set.

        Default: 64

mtu - INTEGER
        Default Maximum Transmission Unit

        Default: 1280 (IPv6 required minimum)

ip_nonlocal_bind - BOOLEAN
        If set, allows processes to bind() to non-local IPv6 addresses,
        which can be quite useful - but may break some applications.

        Default: 0

router_probe_interval - INTEGER
        Minimum interval (in seconds) between Router Probing described
        in RFC4191.

        Default: 60

router_solicitation_delay - INTEGER
        Number of seconds to wait after interface is brought up
        before sending Router Solicitations.

        Default: 1

router_solicitation_interval - INTEGER
        Number of seconds to wait between Router Solicitations.

        Default: 4

router_solicitations - INTEGER
        Number of Router Solicitations to send until assuming no
        routers are present.

        Default: 3

use_oif_addrs_only - BOOLEAN
        When enabled, the candidate source addresses for destinations
        routed via this interface are restricted to the set of addresses
        configured on this interface (vis.
        RFC 6724, section 4).

        Default: false

use_tempaddr - INTEGER
        Preference for Privacy Extensions (RFC3041).

        * <= 0 : disable Privacy Extensions
        * == 1 : enable Privacy Extensions, but prefer public
          addresses over temporary addresses.
        * > 1 : enable Privacy Extensions and prefer temporary
          addresses over public addresses.

        Default:

        * 0 (for most devices)
        * -1 (for point-to-point devices and loopback devices)

temp_valid_lft - INTEGER
        Valid lifetime (in seconds) for temporary addresses.

        Default: 172800 (2 days)

temp_prefered_lft - INTEGER
        Preferred lifetime (in seconds) for temporary addresses.

        Default: 86400 (1 day)

keep_addr_on_down - INTEGER
        Keep all IPv6 addresses on an interface down event. If set, static
        global addresses with no expiration time are not flushed.

        * >0 : enabled
        * 0 : system default
        * <0 : disabled

        Default: 0 (addresses are removed)

max_desync_factor - INTEGER
        Maximum value for DESYNC_FACTOR, which is a random value
        that ensures that clients don't synchronize with each
        other and generate new addresses at exactly the same time.
        Value is in seconds.

        Default: 600

regen_max_retry - INTEGER
        Number of attempts before giving up on generating
        valid temporary addresses.

        Default: 5

max_addresses - INTEGER
        Maximum number of autoconfigured addresses per interface. Setting
        to zero disables the limitation. It is not recommended to set this
        value too large (or to zero) because it would be an easy way to
        crash the kernel by allowing too many addresses to be created.

        Default: 16

disable_ipv6 - BOOLEAN
        Disable IPv6 operation. If accept_dad is set to 2, this value
        will be dynamically set to TRUE if DAD fails for the link-local
        address.

        Default: FALSE (enable IPv6 operation)

        When this value is changed from 1 to 0 (IPv6 is being enabled),
        it will dynamically create a link-local address on the given
        interface and start Duplicate Address Detection, if necessary.

        When this value is changed from 0 to 1 (IPv6 is being disabled),
        it will dynamically delete all addresses and routes on the given
        interface. From now on it will not be possible to add addresses/routes
        to the selected interface.

accept_dad - INTEGER
        Whether to accept DAD (Duplicate Address Detection).

        ==  ==============================================================
         0  Disable DAD
         1  Enable DAD (default)
         2  Enable DAD, and disable IPv6 operation if MAC-based duplicate
            link-local address has been found.
        ==  ==============================================================

        DAD operation and mode on a given interface will be selected according
        to the maximum value of conf/{all,interface}/accept_dad.

force_tllao - BOOLEAN
        Enable sending the target link-layer address option even when
        responding to a unicast neighbor solicitation.

        Default: FALSE

        Quoting from RFC 2461, section 4.4, Target link-layer address:

        "The option MUST be included for multicast solicitations in order to
        avoid infinite Neighbor Solicitation "recursion" when the peer node
        does not have a cache entry to return a Neighbor Advertisements
        message. When responding to unicast solicitations, the option can be
        omitted since the sender of the solicitation has the correct
        link-layer address; otherwise it would not have be able to send the
        unicast solicitation in the first place.
        However, including the link-layer
        address in this case adds little overhead and eliminates a potential
        race condition where the sender deletes the cached link-layer address
        prior to receiving a response to a previous solicitation."

ndisc_notify - BOOLEAN
        Define mode for notification of address and device changes.

        * 0 - (default): do nothing
        * 1 - Generate unsolicited neighbour advertisements when device is brought
          up or hardware address changes.

ndisc_tclass - INTEGER
        The IPv6 Traffic Class to use by default when sending IPv6 Neighbor
        Discovery (Router Solicitation, Router Advertisement, Neighbor
        Solicitation, Neighbor Advertisement, Redirect) messages.
        These 8 bits can be interpreted as 6 high order bits holding the DSCP
        value and 2 low order bits representing ECN (which you probably want
        to leave cleared).

        * 0 - (default)

ndisc_evict_nocarrier - BOOLEAN
        Clears the neighbor discovery table on NOCARRIER events. This option is
        important for wireless devices where the neighbor discovery cache should
        not be cleared when roaming between access points on the same network.
        In most cases this should remain as the default (1).

        - 1 - (default): Clear neighbor discovery cache on NOCARRIER events.
        - 0 - Do not clear neighbor discovery cache on NOCARRIER events.

mldv1_unsolicited_report_interval - INTEGER
        The interval in milliseconds in which the next unsolicited
        MLDv1 report retransmit will take place.

        Default: 10000 (10 seconds)

mldv2_unsolicited_report_interval - INTEGER
        The interval in milliseconds in which the next unsolicited
        MLDv2 report retransmit will take place.

        Default: 1000 (1 second)

force_mld_version - INTEGER
        * 0 - (default) No enforcement of a MLD version, MLDv1 fallback allowed
        * 1 - Enforce the use of MLD version 1
        * 2 - Enforce the use of MLD version 2

suppress_frag_ndisc - INTEGER
        Control RFC 6980 (Security Implications of IPv6 Fragmentation
        with IPv6 Neighbor Discovery) behavior:

        * 1 - (default) discard fragmented neighbor discovery packets
        * 0 - allow fragmented neighbor discovery packets

optimistic_dad - BOOLEAN
        Whether to perform Optimistic Duplicate Address Detection (RFC 4429).

        * 0: disabled (default)
        * 1: enabled

        Optimistic Duplicate Address Detection for the interface will be enabled
        if at least one of conf/{all,interface}/optimistic_dad is set to 1,
        it will be disabled otherwise.

use_optimistic - BOOLEAN
        If enabled, do not classify optimistic addresses as deprecated during
        source address selection. Preferred addresses will still be chosen
        before optimistic addresses, subject to other ranking in the source
        address selection algorithm.

        * 0: disabled (default)
        * 1: enabled

        This will be enabled if at least one of
        conf/{all,interface}/use_optimistic is set to 1, disabled otherwise.

stable_secret - IPv6 address
        This IPv6 address will be used as a secret to generate IPv6
        addresses for link-local addresses and autoconfigured
        ones. All addresses generated after setting this secret will
        be stable privacy ones by default. This can be changed via the
        addrgenmode setting of ip-link. conf/default/stable_secret is used
        as the secret for the namespace, the interface specific ones can
        overwrite that. Writes to conf/all/stable_secret are refused.

        It is recommended to generate this secret during installation
        of a system and keep it stable after that.

        By default the stable secret is unset.
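
Since stable_secret takes a value in IPv6 address notation, one way to
produce a suitable secret is to draw 128 random bits and print them as an
IPv6 address. The sketch below assumes Python's standard ``secrets`` and
``ipaddress`` modules; the function name is hypothetical, and writing the
result to conf/default/stable_secret (which normally requires root) is
deliberately left out.

```python
import ipaddress
import secrets


def new_stable_secret() -> str:
    """Return 128 cryptographically random bits in IPv6 address notation."""
    return str(ipaddress.IPv6Address(secrets.randbits(128)))


if __name__ == "__main__":
    # The output differs on every run; generate it once (e.g. at
    # installation time) and keep it stable afterwards.
    print(new_stable_secret())
```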

addr_gen_mode - INTEGER
        Defines how link-local and autoconf addresses are generated.

        =  =================================================================
        0  generate address based on EUI64 (default)
        1  do not generate a link-local address, use EUI64 for addresses
           generated from autoconf
        2  generate stable privacy addresses, using the secret from
           stable_secret (RFC7217)
        3  generate stable privacy addresses, using a random secret if unset
        =  =================================================================

drop_unicast_in_l2_multicast - BOOLEAN
        Drop any unicast IPv6 packets that are received in link-layer
        multicast (or broadcast) frames.

        By default this is turned off.

drop_unsolicited_na - BOOLEAN
        Drop all unsolicited neighbor advertisements, for example if there's
        a known good NA proxy on the network and such frames need not be used
        (or in the case of 802.11, must not be used to prevent attacks.)

        By default this is turned off.

accept_untracked_na - INTEGER
        Define behavior for accepting neighbor advertisements from devices that
        are absent in the neighbor cache:

        - 0 - (default) Do not accept unsolicited and untracked neighbor
          advertisements.

        - 1 - Add a new neighbor cache entry in STALE state for routers on
          receiving a neighbor advertisement (either solicited or unsolicited)
          with target link-layer address option specified if no neighbor entry
          is already present for the advertised IPv6 address. Without this knob,
          NAs received for untracked addresses (absent in neighbor cache) are
          silently ignored.

          This is as per router-side behavior documented in RFC9131.

          This has lower precedence than drop_unsolicited_na.

          This will optimize the return path for the initial off-link
          communication that is initiated by a directly connected host, by
          ensuring that the first-hop router which turns on this setting doesn't
          have to buffer the initial return packets to do neighbor-solicitation.
          The prerequisite is that the host is configured to send unsolicited
          neighbor advertisements on interface bringup. This setting should be
          used in conjunction with the ndisc_notify setting on the host to
          satisfy this prerequisite.

        - 2 - Extend option (1) to add a new neighbor cache entry only if the
          source IP address is in the same subnet as an address configured on
          the interface that received the neighbor advertisement.

enhanced_dad - BOOLEAN
        Include a nonce option in the IPv6 neighbor solicitation messages used for
        duplicate address detection per RFC7527. A received DAD NS will only signal
        a duplicate address if the nonce is different. This avoids any false
        detection of duplicates due to loopback of the NS messages that we send.
        The nonce option will be sent on an interface unless both of
        conf/{all,interface}/enhanced_dad are set to FALSE.

        Default: TRUE

``icmp/*``:
===========

ratelimit - INTEGER
        Limit the maximal rates for sending ICMPv6 messages.

        0 to disable any limiting,
        otherwise the minimal space between responses in milliseconds.

        Default: 1000

ratemask - list of comma separated ranges
        For ICMPv6 message types matching the ranges in the ratemask, limit
        the sending of the message according to ratelimit parameter.

        The format used for both input and output is a comma separated
        list of ranges (e.g. "0-127,129" for ICMPv6 message type 0 to 127 and
        129). Writing to the file will clear all previous ranges of ICMPv6
        message types and update the current list with the input.

	Refer to: https://www.iana.org/assignments/icmpv6-parameters/icmpv6-parameters.xhtml
	for numerical values of ICMPv6 message types, e.g. echo request is 128
	and echo reply is 129.

	Default: 0-1,3-127 (rate limit ICMPv6 errors except Packet Too Big)

echo_ignore_all - BOOLEAN
	If set non-zero, then the kernel will ignore all ICMP ECHO
	requests sent to it over the IPv6 protocol.

	Default: 0

echo_ignore_multicast - BOOLEAN
	If set non-zero, then the kernel will ignore all ICMP ECHO
	requests sent to it over the IPv6 protocol via multicast.

	Default: 0

echo_ignore_anycast - BOOLEAN
	If set non-zero, then the kernel will ignore all ICMP ECHO
	requests sent to it over the IPv6 protocol destined to an anycast
	address.

	Default: 0

xfrm6_gc_thresh - INTEGER
	(Obsolete since linux-4.14)
	The threshold at which we will start garbage collecting for IPv6
	destination cache entries. At twice this value the system will
	refuse new allocations.


IPv6 Update by:
Pekka Savola <pekkas@netcore.fi>
YOSHIFUJI Hideaki / USAGI Project <yoshfuji@linux-ipv6.org>


/proc/sys/net/bridge/* Variables:
=================================

bridge-nf-call-arptables - BOOLEAN
	- 1 : pass bridged ARP traffic to arptables' FORWARD chain.
	- 0 : disable this.

	Default: 1

bridge-nf-call-iptables - BOOLEAN
	- 1 : pass bridged IPv4 traffic to iptables' chains.
	- 0 : disable this.

	Default: 1

bridge-nf-call-ip6tables - BOOLEAN
	- 1 : pass bridged IPv6 traffic to ip6tables' chains.
	- 0 : disable this.

	Default: 1

bridge-nf-filter-vlan-tagged - BOOLEAN
	- 1 : pass bridged vlan-tagged ARP/IP/IPv6 traffic to {arp,ip,ip6}tables.
	- 0 : disable this.

	Default: 0

bridge-nf-filter-pppoe-tagged - BOOLEAN
	- 1 : pass bridged pppoe-tagged IP/IPv6 traffic to {ip,ip6}tables.
	- 0 : disable this.

	Default: 0

bridge-nf-pass-vlan-input-dev - BOOLEAN
	- 1: if bridge-nf-filter-vlan-tagged is enabled, try to find a vlan
	  interface on the bridge and set the netfilter input device to the
	  vlan. This allows use of e.g. "iptables -i br0.1" and makes the
	  REDIRECT target work with vlan-on-top-of-bridge interfaces. When no
	  matching vlan interface is found, or this switch is off, the input
	  device is set to the bridge interface.

	- 0: disable bridge netfilter vlan interface lookup.

	Default: 0

``/proc/sys/net/sctp/*`` Variables:
===================================

addip_enable - BOOLEAN
	Enable or disable extension of Dynamic Address Reconfiguration
	(ADD-IP) functionality specified in RFC5061. This extension provides
	the ability to dynamically add and remove new addresses for the SCTP
	associations.

	1: Enable extension.

	0: Disable extension.

	Default: 0

pf_enable - INTEGER
	Enable or disable the pf (potentially failed) state. Setting
	pf_retrans > path_max_retrans also disables the pf state, so either
	condition (pf_enable set to 0, or pf_retrans > path_max_retrans) is
	sufficient to disable it. Because pf_retrans and path_max_retrans can
	both be changed by userspace applications, a user who relies on
	pf_retrans > path_max_retrans to keep the pf state disabled may find
	it enabled again after an application changes either value. This
	variable therefore provides an explicit way to enable and disable
	the pf state. See:
	https://datatracker.ietf.org/doc/draft-ietf-tsvwg-sctp-failover for
	details.

	1: Enable pf.

	0: Disable pf.

	Default: 1

pf_expose - INTEGER
	Unset, enable, or disable exposure of the pf (potentially failed)
	state. Applications can control the exposure of the PF path state
	in the SCTP_PEER_ADDR_CHANGE event and the SCTP_GET_PEER_ADDR_INFO
	sockopt. When it is unset, no SCTP_PEER_ADDR_CHANGE event with
	SCTP_ADDR_PF state will be sent, and SCTP_PF-state transport info
	can be obtained via the SCTP_GET_PEER_ADDR_INFO sockopt. When it is
	enabled, a SCTP_PEER_ADDR_CHANGE event will be sent for a transport
	entering the SCTP_PF state, and SCTP_PF-state transport info can be
	obtained via the SCTP_GET_PEER_ADDR_INFO sockopt. When it is
	disabled, no SCTP_PEER_ADDR_CHANGE event will be sent, and trying to
	get SCTP_PF-state transport info via the SCTP_GET_PEER_ADDR_INFO
	sockopt returns -EACCES.

	0: Unset pf state exposure; compatible with old applications.

	1: Disable pf state exposure.

	2: Enable pf state exposure.

	Default: 0

addip_noauth_enable - BOOLEAN
	Dynamic Address Reconfiguration (ADD-IP) requires the use of
	authentication to protect the operations of adding or removing new
	addresses. This requirement is mandated so that unauthorized hosts
	would not be able to hijack associations. However, older
	implementations may not have implemented this requirement while
	allowing the ADD-IP extension. For reasons of interoperability,
	we provide this variable to control the enforcement of the
	authentication requirement.

	== ===============================================================
	1  Allow ADD-IP extension to be used without authentication. This
	   should only be set in a closed environment for interoperability
	   with older implementations.

	0  Enforce the authentication requirement
	== ===============================================================

	Default: 0

auth_enable - BOOLEAN
	Enable or disable Authenticated Chunks extension. This extension
	provides the ability to send and receive authenticated chunks and is
	required for secure operation of Dynamic Address Reconfiguration
	(ADD-IP) extension.

	- 1: Enable this extension.
	- 0: Disable this extension.

	Default: 0

prsctp_enable - BOOLEAN
	Enable or disable the Partial Reliability extension (RFC3758) which
	is used to notify peers that a given DATA should no longer be expected.

	- 1: Enable extension
	- 0: Disable

	Default: 1

max_burst - INTEGER
	The limit of the number of new packets that can be initially sent. It
	controls how bursty the generated traffic can be.

	Default: 4

association_max_retrans - INTEGER
	Set the maximum number of retransmissions that an association can
	attempt before deciding that the remote end is unreachable. If this
	value is exceeded, the association is terminated.

	Default: 10

max_init_retransmits - INTEGER
	The maximum number of retransmissions of INIT and COOKIE-ECHO chunks
	that an association will attempt before declaring the destination
	unreachable and terminating.

	Default: 8

path_max_retrans - INTEGER
	The maximum number of retransmissions that will be attempted on a given
	path. Once this threshold is exceeded, the path is considered
	unreachable, and new traffic will use a different path when the
	association is multihomed.

	Default: 5

pf_retrans - INTEGER
	The number of retransmissions that will be attempted on a given path
	before traffic is redirected to an alternate transport (should one
	exist).
	Note this is distinct from path_max_retrans, as a path that
	passes the pf_retrans threshold can still be used. It is only
	deprioritized when a transmission path is selected by the stack. This
	setting is primarily used to enable fast failover mechanisms without
	having to reduce path_max_retrans to a very low value. See:
	http://www.ietf.org/id/draft-nishida-tsvwg-sctp-failover-05.txt
	for details. Note also that a value of pf_retrans > path_max_retrans
	disables this feature. Since both pf_retrans and path_max_retrans can
	be changed by userspace applications, the separate pf_enable variable
	is provided to disable the pf state.

	Default: 0

ps_retrans - INTEGER
	Primary.Switchover.Max.Retrans (PSMR), a tunable parameter from
	section 5 "Primary Path Switchover" of RFC7829. The primary path
	will be changed to another active path when the path error counter on
	the old primary path exceeds PSMR, so that "the SCTP sender is allowed
	to continue data transmission on a new working path even when the old
	primary destination address becomes active again". Note that this
	feature is disabled by default, by initializing 'ps_retrans' per netns
	to 0xffff, and that its value can't be less than 'pf_retrans' when
	changed via sysctl.

	Default: 0xffff

rto_initial - INTEGER
	The initial round trip timeout value in milliseconds that will be used
	in calculating round trip times. This is the initial time interval
	for retransmissions.

	Default: 3000

rto_max - INTEGER
	The maximum value (in milliseconds) of the round trip timeout. This
	is the largest time interval that can elapse between retransmissions.

	Default: 60000

rto_min - INTEGER
	The minimum value (in milliseconds) of the round trip timeout. This
	is the smallest time interval that can elapse between retransmissions.

	Default: 1000

hb_interval - INTEGER
	The interval (in milliseconds) between HEARTBEAT chunks. These chunks
	are sent at the specified interval on idle paths to probe the state of
	a given path between 2 associations.

	Default: 30000

sack_timeout - INTEGER
	The amount of time (in milliseconds) that the implementation will wait
	to send a SACK.

	Default: 200

valid_cookie_life - INTEGER
	The default lifetime of the SCTP cookie (in milliseconds). The cookie
	is used during association establishment.

	Default: 60000

cookie_preserve_enable - BOOLEAN
	Enable or disable the ability to extend the lifetime of the SCTP cookie
	that is used during the establishment phase of an SCTP association.

	- 1: Enable cookie lifetime extension.
	- 0: Disable

	Default: 1

cookie_hmac_alg - STRING
	Select the hmac algorithm used when generating the cookie value sent by
	a listening sctp socket to a connecting client in the INIT-ACK chunk.
	Valid values are:

	* md5
	* sha1
	* none

	Ability to assign md5 or sha1 as the selected alg is predicated on the
	configuration of those algorithms at build time (CONFIG_CRYPTO_MD5 and
	CONFIG_CRYPTO_SHA1).

	Default: Dependent on configuration. MD5 if available, else SHA1 if
	available, else none.

rcvbuf_policy - INTEGER
	Determines if the receive buffer is attributed to the socket or to
	the association. SCTP supports the capability to create multiple
	associations on a single socket. When using this capability, it is
	possible that a single stalled association that's buffering a lot
	of data may block other associations from delivering their data by
	consuming all of the receive buffer space. To work around this,
	the rcvbuf_policy could be set to attribute the receive buffer space
	to each association instead of the socket.
	This prevents the described
	blocking.

	- 1: rcvbuf space is per association
	- 0: rcvbuf space is per socket

	Default: 0

sndbuf_policy - INTEGER
	Similar to rcvbuf_policy above, this applies to send buffer space.

	- 1: Send buffer is tracked per association
	- 0: Send buffer is tracked per socket.

	Default: 0

sctp_mem - vector of 3 INTEGERs: min, pressure, max
	Number of pages allowed for queueing by all SCTP sockets.

	min: Below this number of pages SCTP is not bothered about its
	memory appetite. When the amount of memory allocated by SCTP exceeds
	this number, SCTP starts to moderate memory usage.

	pressure: This value was introduced to follow the format of tcp_mem.

	max: Number of pages allowed for queueing by all SCTP sockets.

	Default is calculated at boot time from the amount of available memory.

sctp_rmem - vector of 3 INTEGERs: min, default, max
	Only the first value ("min") is used, "default" and "max" are
	ignored.

	min: Minimal size of receive buffer used by SCTP socket.
	It is guaranteed to each SCTP socket (but not association) even
	under moderate memory pressure.

	Default: 4K

sctp_wmem - vector of 3 INTEGERs: min, default, max
	Only the first value ("min") is used, "default" and "max" are
	ignored.

	min: Minimum size of send buffer that can be used by SCTP sockets.
	It is guaranteed to each SCTP socket (but not association) even
	under moderate memory pressure.

	Default: 4K

addr_scope_policy - INTEGER
	Control IPv4 address scoping - draft-stewart-tsvwg-sctp-ipv4-00

	- 0 - Disable IPv4 address scoping
	- 1 - Enable IPv4 address scoping
	- 2 - Follow draft but allow IPv4 private addresses
	- 3 - Follow draft but allow IPv4 link local addresses

	Default: 1

udp_port - INTEGER
	The listening port for the local UDP tunneling sock. Normally it's
	using the IANA-assigned UDP port number 9899 (sctp-tunneling).

	This UDP sock is used for processing the incoming UDP-encapsulated
	SCTP packets (from RFC6951), and shared by all applications in the
	same net namespace. This UDP sock will be closed when the value is
	set to 0.

	The value will also be used to set the src port of the UDP header
	for the outgoing UDP-encapsulated SCTP packets. For the dest port,
	please refer to 'encap_port' below.

	Default: 0

encap_port - INTEGER
	The default remote UDP encapsulation port.

	This value is used to set the dest port of the UDP header for the
	outgoing UDP-encapsulated SCTP packets by default. Users can also
	change the value for each sock/asoc/transport by using setsockopt.
	For further information, please refer to RFC6951.

	Note that when connecting to a remote server, the client should set
	this to the port that the UDP tunneling sock on the peer server is
	listening to and the local UDP tunneling sock on the client also
	must be started. On the server, it would get the encap_port from
	the incoming packet's source port.

	Default: 0

plpmtud_probe_interval - INTEGER
	The time interval (in milliseconds) for the PLPMTUD probe timer,
	which is configured to expire after this period to receive an
	acknowledgment to a probe packet. This is also the time interval
	between the probes for the current pmtu when the probe search
	is done.

	PLPMTUD will be disabled when 0 is set, and other values for it
	must be >= 5000.

	Default: 0

reconf_enable - BOOLEAN
	Enable or disable extension of Stream Reconfiguration functionality
	specified in RFC6525. This extension provides the ability to "reset"
	a stream, and it includes the Parameters of "Outgoing/Incoming SSN
	Reset", "SSN/TSN Reset" and "Add Outgoing/Incoming Streams".

	- 1: Enable extension.
	- 0: Disable extension.

	Default: 0

intl_enable - BOOLEAN
	Enable or disable extension of User Message Interleaving functionality
	specified in RFC8260. This extension allows the interleaving of user
	messages sent on different streams. With this feature enabled, I-DATA
	chunk will replace DATA chunk to carry user messages if also supported
	by the peer. Note that to use this feature, one needs to set this option
	to 1 and also needs to set socket options SCTP_FRAGMENT_INTERLEAVE to 2
	and SCTP_INTERLEAVING_SUPPORTED to 1.

	- 1: Enable extension.
	- 0: Disable extension.

	Default: 0

ecn_enable - BOOLEAN
	Control use of Explicit Congestion Notification (ECN) by SCTP.
	Like in TCP, ECN is used only when both ends of the SCTP connection
	indicate support for it. This feature is useful in avoiding losses
	due to congestion by allowing supporting routers to signal congestion
	before having to drop packets.

	1: Enable ecn.

	0: Disable ecn.

	Default: 1


``/proc/sys/net/core/*``
========================

	Please see: Documentation/admin-guide/sysctl/net.rst for descriptions of these entries.


``/proc/sys/net/unix/*``
========================

max_dgram_qlen - INTEGER
	The maximum length of dgram socket receive queue

	Default: 10

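
The variables in this document are files under the /proc/sys/net/ directories
named in the section headings, so they can be read like any other file. The
following Python sketch maps a dotted name of the kind accepted by sysctl(8)
to such a file and returns its fields. The helper names ``sysctl_path`` and
``read_sysctl`` are hypothetical, as is the ``root`` parameter, which exists
only so the helpers can be pointed at a different directory. The sketch
assumes that no path component itself contains a dot, which does not hold for
per-interface entries on interfaces such as br0.1.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to the corresponding file under root."""
    # Assumption: every dot separates two path components.
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the whitespace-separated fields of a sysctl file as strings."""
    return sysctl_path(name, root).read_text().split()
```

Under these assumptions, ``read_sysctl("net.unix.max_dgram_qlen")`` returns a
one-element list, while a vector such as ``net.sctp.sctp_mem`` yields three
fields (min, pressure, max).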