------------------------------------------------------------------------------
                       T H E  /proc   F I L E S Y S T E M
------------------------------------------------------------------------------
/proc/sys         Terrehon Bowden <terrehon@pacbell.net>        October 7 1999
                  Bodo Bauer <bb@ricochet.net>

2.4.x update      Jorge Nerin <comandante@zaralinux.com>      November 14 2000
move /proc/sys    Shen Feng <shen@cn.fujitsu.com>                 April 1 2009
------------------------------------------------------------------------------
Version 1.3                                              Kernel version 2.2.12
                                              Kernel version 2.4.0-test11-pre4
------------------------------------------------------------------------------
fixes/update part 1.1  Stefani Seibold <stefani@seibold.net>       June 9 2009

Table of Contents
-----------------

  0     Preface
  0.1   Introduction/Credits
  0.2   Legal Stuff

  1     Collecting System Information
  1.1   Process-Specific Subdirectories
  1.2   Kernel data
  1.3   IDE devices in /proc/ide
  1.4   Networking info in /proc/net
  1.5   SCSI info
  1.6   Parallel port info in /proc/parport
  1.7   TTY info in /proc/tty
  1.8   Miscellaneous kernel statistics in /proc/stat
  1.9   Ext4 file system parameters

  2     Modifying System Parameters

  3     Per-Process Parameters
  3.1   /proc/<pid>/oom_adj & /proc/<pid>/oom_score_adj - Adjust the oom-killer
                                                                score
  3.2   /proc/<pid>/oom_score - Display current oom-killer score
  3.3   /proc/<pid>/io - Display the IO accounting fields
  3.4   /proc/<pid>/coredump_filter - Core dump filtering settings
  3.5   /proc/<pid>/mountinfo - Information about mounts
  3.6   /proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm

  4     Configuring procfs
  4.1   Mount options

------------------------------------------------------------------------------
Preface
------------------------------------------------------------------------------

0.1 Introduction/Credits
------------------------

This documentation is  part of a soon (or  so we hope) to be  released book on
the SuSE  Linux distribution. As  there is  no complete documentation  for the
/proc file system and we've used  many freely available sources to write these
chapters, it  seems only fair  to give the work  back to the  Linux community.
This work is  based on the 2.2.*  kernel version and the  upcoming 2.4.*. I'm
afraid it's still far from complete, but we  hope it will be useful. As far as
we know, it is the first 'all-in-one' document about the /proc file system. It
is focused  on the Intel  x86 hardware,  so if you  are looking for  PPC, ARM,
SPARC, AXP, etc., features, you probably  won't find what you are looking for.
It also only covers IPv4 networking, not IPv6 nor other protocols - sorry. But
additions and patches  are welcome and will  be added to this  document if you
mail them to Bodo.

We'd like  to  thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of
other people for help compiling this documentation. We'd also like to extend a
special thank  you to Andi Kleen for documentation, which we relied on heavily
to create  this  document,  as well as the additional information he provided.
Thanks to  everybody  else  who contributed source or docs to the Linux kernel
and helped create a great piece of software... :)

If you  have  any comments, corrections or additions, please don't hesitate to
contact Bodo  Bauer  at  bb@ricochet.net.  We'll  be happy to add them to this
document.

The   latest   version    of   this   document   is    available   online   at
http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html

If the  above link  does not  work for  you, you  could try  the kernel mailing
list  at   linux-kernel@vger.kernel.org  and/or   try  to   reach  me   at
comandante@zaralinux.com.

0.2 Legal Stuff
---------------

We don't  guarantee  the  correctness  of this document, and if you come to us
complaining about  how  you  screwed  up  your  system  because  of  incorrect
documentation, we won't feel responsible...

------------------------------------------------------------------------------
CHAPTER 1: COLLECTING SYSTEM INFORMATION
------------------------------------------------------------------------------

------------------------------------------------------------------------------
In This Chapter
------------------------------------------------------------------------------
* Investigating  the  properties  of  the  pseudo  file  system  /proc and its
  ability to provide information on the running Linux system
* Examining /proc's structure
* Uncovering  various  information  about the kernel and the processes running
  on the system
------------------------------------------------------------------------------


The proc  file  system acts as an interface to internal data structures in the
kernel. It  can  be  used to obtain information about the system and to change
certain kernel parameters at runtime (sysctl).

First, we'll  take  a  look  at the read-only parts of /proc. In Chapter 2, we
show you how you can use /proc/sys to change settings.

1.1 Process-Specific Subdirectories
-----------------------------------

The directory  /proc  contains  (among other things) one subdirectory for each
process running on the system, which is named after the process ID (PID).

The link  self  points  to  the  process reading the file system. Each process
subdirectory has the entries listed in Table 1-1.


Table 1-1: Process-specific entries in /proc
..............................................................................
 File           Content
 clear_refs     Clears page referenced bits shown in smaps output
 cmdline        Command line arguments
 cpu            Current and last cpu in which it was executed   (2.4)(smp)
 cwd            Link to the current working directory
 environ        Values of environment variables
 exe            Link to the executable of this process
 fd             Directory, which contains all file descriptors
 maps           Memory maps to executables and library files    (2.4)
 mem            Memory held by this process
 root           Link to the root directory of this process
 stat           Process status
 statm          Process memory status information
 status         Process status in human readable form
 wchan          If CONFIG_KALLSYMS is set, a pre-decoded wchan
 pagemap        Page table
 stack          Report full stack trace, enable via CONFIG_STACKTRACE
 smaps          an extension based on maps, showing the memory consumption of
                each mapping
..............................................................................

For example, to get the status information of a process, all you have to do is
read the file /proc/PID/status:

  > cat /proc/self/status
  Name:   cat
  State:  R (running)
  Tgid:   5452
  Pid:    5452
  PPid:   743
  TracerPid:      0                                             (2.4)
  Uid:    501     501     501     501
  Gid:    100     100     100     100
  FDSize: 256
  Groups: 100 14 16
  VmPeak:     5004 kB
  VmSize:     5004 kB
  VmLck:         0 kB
  VmHWM:       476 kB
  VmRSS:       476 kB
  VmData:      156 kB
  VmStk:        88 kB
  VmExe:        68 kB
  VmLib:      1412 kB
  VmPTE:        20 kB
  VmSwap:        0 kB
  Threads:        1
  SigQ:   0/28578
  SigPnd: 0000000000000000
  ShdPnd: 0000000000000000
  SigBlk: 0000000000000000
  SigIgn: 0000000000000000
  SigCgt: 0000000000000000
  CapInh: 00000000fffffeff
  CapPrm: 0000000000000000
  CapEff: 0000000000000000
  CapBnd: ffffffffffffffff
  voluntary_ctxt_switches:        0
  nonvoluntary_ctxt_switches:     1

This shows you nearly the same information you would get if you viewed it with
the ps  command.  In  fact,  ps  uses  the  proc  file  system  to  obtain its
information.  But you get a more detailed  view of the  process by reading the
file /proc/PID/status. Its fields are described in Table 1-2.

The  statm  file  contains  more  detailed  information about the process
memory usage. Its seven fields are explained in Table 1-3.  The stat file
contains detailed information about the process itself.  Its fields are
explained in Table 1-4.

(for SMP CONFIG users)
To make accounting scalable, RSS-related information is handled
asynchronously, so the value may not be very precise. To get a precise
snapshot of a moment, read the /proc/<pid>/smaps file, which scans the page
table. It's slow but very precise.

Table 1-2: Contents of the status files (as of 2.6.30-rc7)
..............................................................................
 Field                       Content
 Name                        filename of the executable
 State                       state (R is running, S is sleeping, D is sleeping
                             in an uninterruptible wait, Z is zombie,
                             T is traced or stopped)
 Tgid                        thread group ID
 Pid                         process id
 PPid                        process id of the parent process
 TracerPid                   PID of process tracing this process (0 if not)
 Uid                         Real, effective, saved set, and file system UIDs
 Gid                         Real, effective, saved set, and file system GIDs
 FDSize                      number of file descriptor slots currently allocated
 Groups                      supplementary group list
 VmPeak                      peak virtual memory size
 VmSize                      total program size
 VmLck                       locked memory size
 VmHWM                       peak resident set size ("high water mark")
 VmRSS                       resident set size (memory portions held in RAM)
 VmData                      size of data segment
 VmStk                       size of stack segment
 VmExe                       size of text segment
 VmLib                       size of shared library code
 VmPTE                       size of page table entries
 VmSwap                      size of swap usage (the number of referred swapents)
 Threads                     number of threads
 SigQ                        number of signals queued/max. number for queue
 SigPnd                      bitmap of pending signals for the thread
 ShdPnd                      bitmap of shared pending signals for the process
 SigBlk                      bitmap of blocked signals
 SigIgn                      bitmap of ignored signals
 SigCgt                      bitmap of caught signals
 CapInh                      bitmap of inheritable capabilities
 CapPrm                      bitmap of permitted capabilities
 CapEff                      bitmap of effective capabilities
 CapBnd                      bitmap of capabilities bounding set
 Cpus_allowed                mask of CPUs on which this process may run
 Cpus_allowed_list           Same as previous, but in "list format"
 Mems_allowed                mask of memory nodes allowed to this process
 Mems_allowed_list           Same as previous, but in "list format"
 voluntary_ctxt_switches     number of voluntary context switches
 nonvoluntary_ctxt_switches  number of involuntary context switches
..............................................................................
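
As a minimal sketch of how a script might consume the fields of Table 1-2,
the Python fragment below splits status text into a dictionary. The names
parse_status and vm_kb are our own and not part of any kernel interface, and
SAMPLE is a shortened copy of the example output shown earlier; on a live
system you would read /proc/PID/status instead.

```python
# Shortened copy of the example status output; not read from a live system.
SAMPLE = """\
Name:   cat
State:  R (running)
Pid:    5452
Uid:    501     501     501     501
VmRSS:       476 kB
Threads:        1
"""

def parse_status(text):
    """Map each field name in status text to its value string."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # ignore anything that is not "Name: value"
            fields[name.strip()] = value.strip()
    return fields

def vm_kb(fields, name):
    """Return a Vm* field as an integer count of kB, or None if absent."""
    value = fields.get(name)
    return int(value.split()[0]) if value else None

info = parse_status(SAMPLE)
# Uid holds the real, effective, saved set, and file system UIDs, in order.
real, effective, saved, fs = info["Uid"].split()
print(info["Name"], info["State"], effective, vm_kb(info, "VmRSS"))
```

Values are kept as strings because their shape differs from field to field;
only the Vm* fields are converted, since they always end in "kB".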

Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
..............................................................................
 Field    Content
 size     total program size (pages)            (same as VmSize in status)
 resident size of memory portions (pages)       (same as VmRSS in status)
 shared   number of pages that are shared       (i.e. backed by a file)
 trs      number of pages that are 'code'       (not including libs; broken,
                                                        includes data segment)
 lrs      number of pages of library            (always 0 on 2.6)
 drs      number of pages of data/stack         (including libs; broken,
                                                        includes library text)
 dt       number of dirty pages                 (always 0 on 2.6)
..............................................................................
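
Because statm reports pages while status reports kB, comparing the two needs
a conversion. The sketch below shows one way to do it. The function name
parse_statm is our own, the sample line is hypothetical (its numbers were
chosen to agree with the status example above), and the 4096-byte page size
is an assumption; on a real system os.sysconf("SC_PAGE_SIZE") reports the
actual value.

```python
STATM_FIELDS = ("size", "resident", "shared", "trs", "lrs", "drs", "dt")

def parse_statm(text, page_size=4096):
    """Return (pages, kb): two dicts keyed by the field names of Table 1-3."""
    pages = dict(zip(STATM_FIELDS, (int(v) for v in text.split())))
    kb = {name: count * page_size // 1024 for name, count in pages.items()}
    return pages, kb

# Hypothetical statm contents; page_size defaults to an assumed 4096 bytes.
pages, kb = parse_statm("1251 119 94 17 0 39 0")
print(kb["size"], "kB total,", kb["resident"], "kB resident")
```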


Table 1-4: Contents of the stat files (as of 2.6.30-rc7)
..............................................................................
 Field          Content
  pid           process id
  tcomm         filename of the executable
  state         state (R is running, S is sleeping, D is sleeping in an
                uninterruptible wait, Z is zombie, T is traced or stopped)
  ppid          process id of the parent process
  pgrp          pgrp of the process
  sid           session id
  tty_nr        tty the process uses
  tty_pgrp      pgrp of the tty
  flags         task flags
  min_flt       number of minor faults
  cmin_flt      number of minor faults including children's
  maj_flt       number of major faults
  cmaj_flt      number of major faults including children's
  utime         user mode jiffies
  stime         kernel mode jiffies
  cutime        user mode jiffies including children's
  cstime        kernel mode jiffies including children's
  priority      priority level
  nice          nice level
  num_threads   number of threads
  it_real_value (obsolete, always 0)
  start_time    time the process started after system boot
  vsize         virtual memory size
  rss           resident set memory size
  rsslim        current limit in bytes on the rss
  start_code    address above which program text can run
  end_code      address below which program text can run
  start_stack   address of the start of the main process stack
  esp           current value of ESP
  eip           current value of EIP
  pending       bitmap of pending signals
  blocked       bitmap of blocked signals
  sigign        bitmap of ignored signals
  sigcatch      bitmap of caught signals
  wchan         address where process went to sleep
  0             (place holder)
  0             (place holder)
  exit_signal   signal to send to parent thread on exit
  task_cpu      which CPU the task is scheduled on
  rt_priority   realtime priority
  policy        scheduling policy (man sched_setscheduler)
  blkio_ticks   time spent waiting for block IO
  gtime         guest time of the task in jiffies
  cgtime        guest time of the task children in jiffies
  start_data    address above which program data+bss is placed
  end_data      address below which program data+bss is placed
  start_brk     address above which program heap can be expanded with brk()
..............................................................................
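
The stat file is a single line of space-separated values in the order of
Table 1-4. One detail the table does not show is that the kernel prints tcomm
inside parentheses, and the name itself may contain spaces or parentheses, so
a plain split() can misalign the fields. The sketch below splits at the last
closing parenthesis instead. The function name parse_stat and the sample line
are our own illustrations, and only the fields up to num_threads are named.

```python
# Field names from Table 1-4 that follow tcomm, up to num_threads.
AFTER_TCOMM = ("state", "ppid", "pgrp", "sid", "tty_nr", "tty_pgrp", "flags",
               "min_flt", "cmin_flt", "maj_flt", "cmaj_flt", "utime", "stime",
               "cutime", "cstime", "priority", "nice", "num_threads")

def parse_stat(text):
    """Parse the leading fields of stat text into a dict."""
    # Everything before the LAST ')' is "pid (tcomm"; the rest is plain fields.
    head, _, tail = text.rpartition(")")
    pid, _, tcomm = head.partition("(")
    result = {"pid": int(pid), "tcomm": tcomm}
    for name, value in zip(AFTER_TCOMM, tail.split()):
        result[name] = value if name == "state" else int(value)
    return result

# Hypothetical stat line for the cat process of the earlier example.
SAMPLE = "5452 (cat) R 743 5452 743 34816 5452 4194304 85 0 0 0 0 0 0 0 20 0 1 0 100"
st = parse_stat(SAMPLE)
print(st["pid"], st["tcomm"], st["state"], st["ppid"], st["num_threads"])
```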

The /proc/PID/maps file contains the currently mapped memory regions and
their access permissions.

The format is:

address           perms offset  dev   inode      pathname

08048000-08049000 r-xp 00000000 03:00 8312       /opt/test
08049000-0804a000 rw-p 00001000 03:00 8312       /opt/test
0804a000-0806b000 rw-p 00000000 00:00 0          [heap]
a7cb1000-a7cb2000 ---p 00000000 00:00 0
a7cb2000-a7eb2000 rw-p 00000000 00:00 0
a7eb2000-a7eb3000 ---p 00000000 00:00 0
a7eb3000-a7ed5000 rw-p 00000000 00:00 0          [stack:1001]
a7ed5000-a8008000 r-xp 00000000 03:00 4222       /lib/libc.so.6
a8008000-a800a000 r--p 00133000 03:00 4222       /lib/libc.so.6
a800a000-a800b000 rw-p 00135000 03:00 4222       /lib/libc.so.6
a800b000-a800e000 rw-p 00000000 00:00 0
a800e000-a8022000 r-xp 00000000 03:00 14462      /lib/libpthread.so.0
a8022000-a8023000 r--p 00013000 03:00 14462      /lib/libpthread.so.0
a8023000-a8024000 rw-p 00014000 03:00 14462      /lib/libpthread.so.0
a8024000-a8027000 rw-p 00000000 00:00 0
a8027000-a8043000 r-xp 00000000 03:00 8317       /lib/ld-linux.so.2
a8043000-a8044000 r--p 0001b000 03:00 8317       /lib/ld-linux.so.2
a8044000-a8045000 rw-p 0001c000 03:00 8317       /lib/ld-linux.so.2
aff35000-aff4a000 rw-p 00000000 00:00 0          [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0          [vdso]

where "address" is the range of the process's address space that the mapping
occupies, and "perms" is a set of permissions:

 r = read
 w = write
 x = execute
 s = shared
 p = private (copy on write)

"offset" is the offset into the mapping, "dev" is the device (major:minor), and
"inode" is the inode  on that device.  0 indicates that  no inode is associated
with the memory region, as the case would be with BSS (uninitialized data).
The "pathname" shows the name of the file associated with this mapping.  If the
mapping is not associated with a file:

 [heap]                   = the heap of the program
 [stack]                  = the stack of the main process
 [stack:1001]             = the stack of the thread with tid 1001
 [vdso]                   = the "virtual dynamic shared object",
                            the kernel system call handler
 [anon:<name>]            = an anonymous mapping that has been
                            named by userspace

 or if empty, the mapping is anonymous.
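
To make the column layout concrete, here is a small sketch that splits one
maps line into its six columns. The Mapping tuple and the function name
parse_maps_line are our own inventions for illustration; the two sample lines
are taken from the listing above.

```python
from collections import namedtuple

Mapping = namedtuple("Mapping", "start end perms offset dev inode pathname")

def parse_maps_line(line):
    """Split one line of maps output into a Mapping tuple."""
    # At most five splits, so a pathname keeps any spaces it contains.
    parts = line.split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    pathname = parts[5].strip() if len(parts) > 5 else ""  # "" = anonymous
    start, end = (int(x, 16) for x in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode),
                   pathname)

text = parse_maps_line(
    "08048000-08049000 r-xp 00000000 03:00 8312       /opt/test")
anon = parse_maps_line("a7cb2000-a7eb2000 rw-p 00000000 00:00 0")
print(text.pathname, text.end - text.start, "x" in text.perms)
print(repr(anon.pathname), (anon.end - anon.start) // 1024, "kB")
```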

The /proc/PID/task/TID/maps is a view of the virtual memory from the viewpoint
of the individual tasks of a process. In this file you will see a mapping marked
as [stack] if that task sees it as a stack. This is a key difference from the
content of /proc/PID/maps, where you will see all mappings that are being used
as stack by all of those tasks. Hence, for the example above, the task-level
map, i.e. /proc/PID/task/TID/maps for thread 1001 will look like this:

08048000-08049000 r-xp 00000000 03:00 8312       /opt/test
08049000-0804a000 rw-p 00001000 03:00 8312       /opt/test
0804a000-0806b000 rw-p 00000000 00:00 0          [heap]
a7cb1000-a7cb2000 ---p 00000000 00:00 0
a7cb2000-a7eb2000 rw-p 00000000 00:00 0
a7eb2000-a7eb3000 ---p 00000000 00:00 0
a7eb3000-a7ed5000 rw-p 00000000 00:00 0          [stack]
a7ed5000-a8008000 r-xp 00000000 03:00 4222       /lib/libc.so.6
a8008000-a800a000 r--p 00133000 03:00 4222       /lib/libc.so.6
a800a000-a800b000 rw-p 00135000 03:00 4222       /lib/libc.so.6
a800b000-a800e000 rw-p 00000000 00:00 0
a800e000-a8022000 r-xp 00000000 03:00 14462      /lib/libpthread.so.0
a8022000-a8023000 r--p 00013000 03:00 14462      /lib/libpthread.so.0
a8023000-a8024000 rw-p 00014000 03:00 14462      /lib/libpthread.so.0
a8024000-a8027000 rw-p 00000000 00:00 0
a8027000-a8043000 r-xp 00000000 03:00 8317       /lib/ld-linux.so.2
a8043000-a8044000 r--p 0001b000 03:00 8317       /lib/ld-linux.so.2
a8044000-a8045000 rw-p 0001c000 03:00 8317       /lib/ld-linux.so.2
aff35000-aff4a000 rw-p 00000000 00:00 0
ffffe000-fffff000 r-xp 00000000 00:00 0          [vdso]

The /proc/PID/smaps is an extension based on maps, showing the memory
consumption for each of the process's mappings. For each mapping there
is a series of lines such as the following:

08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash
Size:               1084 kB
Rss:                 892 kB
Pss:                 374 kB
Shared_Clean:        892 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:          892 kB
Anonymous:             0 kB
Swap:                  0 kB
SwapPss:               0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:              374 kB
Name:           name from userspace

The first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps.  The remaining lines show the size of the mapping
(Size), the amount of the mapping that is currently resident in RAM (Rss), the
process' proportional share of this mapping (Pss), and the number of clean and
dirty shared and private pages in the mapping.

The "proportional set size" (PSS) of a process is the count of pages it has
in memory, where each page is divided by the number of processes sharing it.
So if a process has 1000 pages all to itself, and 1000 shared with one other
process, its PSS will be 1500.
Note that even a page which is part of a MAP_SHARED mapping, but has only
a single pte mapped, i.e.  is currently used by only one process, is accounted
as private and not as shared.
"Referenced" indicates the amount of memory currently marked as referenced or
accessed.
"Anonymous" shows the amount of memory that does not belong to any file.  Even
a mapping associated with a file may contain anonymous pages: when the mapping
is MAP_PRIVATE and a page is modified, the file page is replaced by a private
anonymous copy.
"Swap" shows how much would-be-anonymous memory is also used, but out on
swap.
"SwapPss" shows proportional swap share of this mapping.
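
The PSS arithmetic can be written out as a short sketch. The first function
reproduces the 1000 + 1000/2 example from the text; the second adds up the
Pss: lines of smaps text to estimate the PSS of a whole process. Both function
names are our own, and SAMPLE is an abbreviated copy of the /bin/bash mapping
shown above.

```python
def pss_pages(private_pages, shared):
    """PSS in pages; shared is a list of (pages, number_of_sharers)."""
    return private_pages + sum(pages / sharers for pages, sharers in shared)

def total_pss_kb(smaps_text):
    """Add up the Pss: lines of smaps text; SwapPss: lines are not counted."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("Pss:"):
            total += int(line.split()[1])
    return total

# Abbreviated copy of the /bin/bash mapping from the example.
SAMPLE = """\
08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash
Size:               1084 kB
Rss:                 892 kB
Pss:                 374 kB
SwapPss:               0 kB
"""

# 1000 private pages plus 1000 pages shared between two processes.
print(pss_pages(1000, [(1000, 2)]))
print(total_pss_kb(SAMPLE), "kB")
```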

The "Name" field will only be present on a mapping that has been named by
userspace, and will show the name passed in by userspace.

This file is only present if the CONFIG_MMU kernel configuration option is
enabled.

The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
bits on both physical and virtual pages associated with a process.
To clear the bits for all the pages associated with the process
    > echo 1 > /proc/PID/clear_refs

To clear the bits for the anonymous pages associated with the process
    > echo 2 > /proc/PID/clear_refs

To clear the bits for the file mapped pages associated with the process
    > echo 3 > /proc/PID/clear_refs

To reset the peak resident set size ("high water mark") to the process's
current value:
    > echo 5 > /proc/PID/clear_refs

Any other value written to /proc/PID/clear_refs will have no effect.

The /proc/PID/pagemap gives the PFN, which can be used to find the pageflags
using /proc/kpageflags and the number of times a page is mapped using
/proc/kpagecount. For detailed explanation, see Documentation/vm/pagemap.txt.

1.2 Kernel data
---------------

Similar to  the  process entries, the kernel data files give information about
the running kernel. The files used to obtain this information are contained in
/proc and  are  listed  in Table 1-5. Not all of these will be present in your
system. Which files are there and which are missing depends  on the kernel
configuration and the loaded modules.

Table 1-5: Kernel info in /proc
..............................................................................
 File        Content
 apm         Advanced power management info
 buddyinfo   Kernel memory allocator information (see text)     (2.5)
 bus         Directory containing bus specific information
 cmdline     Kernel command line
 cpuinfo     Info about the CPU
 devices     Available devices (block and character)
 dma         Used DMA channels
 filesystems Supported filesystems
 driver      Various drivers grouped here, currently rtc        (2.4)
 execdomains Execdomains, related to security                   (2.4)
 fb          Frame Buffer devices                               (2.4)
 fs          File system parameters, currently nfs/exports      (2.4)
 ide         Directory containing info about the IDE subsystem
 interrupts  Interrupt usage
 iomem       Memory map                                         (2.4)
 ioports     I/O port usage
 irq         Masks for irq to cpu affinity                      (2.4)(smp?)
 isapnp      ISA PnP (Plug&Play) Info                           (2.4)
 kcore       Kernel core image (can be ELF or A.OUT(deprecated in 2.4))
 kmsg        Kernel messages
 ksyms       Kernel symbol table
 loadavg     Load average of last 1, 5 & 15 minutes
 locks       Kernel locks
 meminfo     Memory info
 misc        Miscellaneous
 modules     List of loaded modules
 mounts      Mounted filesystems
 net         Networking info (see text)
 pagetypeinfo Additional page allocator information (see text)  (2.5)
 partitions  Table of partitions known to the system
 pci         Deprecated info of PCI bus (new way -> /proc/bus/pci/,
             decoupled by lspci)                                (2.4)
 rtc         Real time clock
 scsi        SCSI info (see text)
 slabinfo    Slab pool info
 softirqs    softirq usage
 stat        Overall statistics
 swaps       Swap space utilization
 sys         See chapter 2
 sysvipc     Info of SysVIPC Resources (msg, sem, shm)          (2.4)
 tty         Info of tty drivers
 uptime      System uptime
 version     Kernel version
 video       bttv info of video resources                       (2.4)
 vmallocinfo Show vmalloced areas
..............................................................................

You can,  for  example,  check  which interrupts are currently in use and what
they are used for by looking in the file /proc/interrupts:

  > cat /proc/interrupts
             CPU0
    0:    8728810          XT-PIC  timer
    1:        895          XT-PIC  keyboard
    2:          0          XT-PIC  cascade
    3:     531695          XT-PIC  aha152x
    4:    2014133          XT-PIC  serial
    5:      44401          XT-PIC  pcnet_cs
    8:          2          XT-PIC  rtc
   11:          8          XT-PIC  i82365
   12:     182918          XT-PIC  PS/2 Mouse
   13:          1          XT-PIC  fpu
   14:    1232265          XT-PIC  ide0
   15:          7          XT-PIC  ide1
  NMI:          0

In 2.4.* a couple of lines were added to this file, LOC and ERR (this time the
output is from an SMP machine):

  > cat /proc/interrupts

             CPU0       CPU1
    0:    1243498    1214548    IO-APIC-edge  timer
    1:       8949       8958    IO-APIC-edge  keyboard
    2:          0          0          XT-PIC  cascade
    5:      11286      10161    IO-APIC-edge  soundblaster
    8:          1          0    IO-APIC-edge  rtc
    9:      27422      27407    IO-APIC-edge  3c503
   12:     113645     113873    IO-APIC-edge  PS/2 Mouse
   13:          0          0          XT-PIC  fpu
   14:      22491      24012    IO-APIC-edge  ide0
   15:       2183       2415    IO-APIC-edge  ide1
   17:      30564      30414   IO-APIC-level  eth0
   18:        177        164   IO-APIC-level  bttv
  NMI:    2457961    2457959
  LOC:    2457882    2457881
  ERR:       2155

NMI is incremented in this case because every timer interrupt generates an NMI
(Non Maskable Interrupt) which is used by the NMI Watchdog to detect lockups.

LOC is the local interrupt counter of the internal APIC of every CPU.

ERR is incremented in the case of errors in the IO-APIC bus (the bus that
connects the CPUs in an SMP system). This means that an error has been
detected; the IO-APIC automatically retries the transmission, so it should not
be a big problem, but you should read the SMP-FAQ.

In 2.6.2* /proc/interrupts was expanded again.  This time the goal was for
/proc/interrupts to display every IRQ vector in use by the system, not
just those considered 'most important'.  The new vectors are:

  THR -- interrupt raised when a machine check threshold counter
  (typically counting ECC corrected errors of memory or cache) exceeds
  a configurable threshold.  Only available on some systems.

  TRM -- a thermal event interrupt occurs when a temperature threshold
  has been exceeded for the CPU.  This interrupt may also be generated
  when the temperature drops back to normal.

  SPU -- a spurious interrupt is some interrupt that was raised then lowered
  by some IO device before it could be fully processed by the APIC.  Hence
  the APIC sees the interrupt but does not know what device it came from.
  For this case the APIC will generate the interrupt with an IRQ vector
  of 0xff. This might also be generated by chipset bugs.

  RES, CAL, TLB -- rescheduling, call and TLB flush interrupts are
  sent from one CPU to another per the needs of the OS.  Typically,
  their statistics are used by kernel developers and interested users to
  determine the occurrence of interrupts of the given type.

The above IRQ vectors are displayed only when relevant.  For example,
the threshold vector does not exist on x86_64 platforms.  Others are
suppressed when the system is a uniprocessor.  As of this writing, only
i386 and x86_64 platforms support the new IRQ vector displays.
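
The per-CPU layout of /proc/interrupts lends itself to simple scripted
analysis. The sketch below, with a function name of our own choosing, reads
the header to learn the number of CPUs and then collects the counters of each
row. SAMPLE is an excerpt of the SMP output shown earlier. Note that rows such
as ERR carry fewer counters than there are CPUs, which is why the code stops
at the first non-numeric field instead of assuming a fixed width.

```python
# Excerpt of the SMP example output; not read from a live system.
SAMPLE = """\
             CPU0       CPU1
    0:    1243498    1214548    IO-APIC-edge  timer
   17:      30564      30414   IO-APIC-level  eth0
  NMI:    2457961    2457959
  ERR:       2155
"""

def parse_interrupts(text):
    """Map each row label to (list of per-CPU counts, description)."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header row: one "CPUn" column per CPU
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table

for label, (counts, what) in parse_interrupts(SAMPLE).items():
    print(label, sum(counts), what)
```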

Of some interest is the introduction of the /proc/irq directory to 2.4.
It can be used to set IRQ to CPU affinity, which means that you can "hook" an
IRQ to only one CPU, or exclude a CPU from handling IRQs. The irq subdirectory
contains one subdirectory for each IRQ, and two files: default_smp_affinity and
prof_cpu_mask.

For example
  > ls /proc/irq/
  0  10  12  14  16  18  2  4  6  8  prof_cpu_mask
  1  11  13  15  17  19  3  5  7  9  default_smp_affinity
  > ls /proc/irq/0/
  smp_affinity

smp_affinity is a bitmask in which you can specify which CPUs can handle the
IRQ. You can set it by doing:

  > echo 1 > /proc/irq/10/smp_affinity

This means that only the first CPU will handle the IRQ, but you can also echo
5 which means that only the first and third CPU can handle the IRQ.
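
Since each bit of the mask stands for one CPU (bit 0 is the first CPU), the
conversion between a mask and a list of CPU numbers is mechanical. The two
helper functions below are our own illustration, not an existing tool. The
removal of commas is an assumption made for machines with more than 32 CPUs,
where the kernel prints the mask in comma-separated 32-bit groups.

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers (0-based) whose bits are set in a hex mask."""
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]

def mask_for_cpus(cpus):
    """Return the hex mask string that selects exactly the given CPUs."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

print(cpus_in_mask("1"))        # only the first CPU
print(cpus_in_mask("5"))        # binary 101: CPUs 0 and 2
print(mask_for_cpus([0, 3]))    # mask for the first and fourth CPU
```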

The contents of each smp_affinity file are the same by default:

  > cat /proc/irq/0/smp_affinity
  ffffffff

There is an alternate interface, smp_affinity_list, which allows specifying
a cpu range instead of a bitmask:

  > cat /proc/irq/0/smp_affinity_list
  1024-1031
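
A list such as the one above can be expanded into individual CPU numbers with
a few lines of code. This is only a sketch: the function name is ours, and it
assumes the usual list format of comma-separated single numbers and
first-last ranges.

```python
def parse_cpu_list(text):
    """Expand a list such as "0,2-3" into [0, 2, 3]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        # A part without "-" is a single CPU, so last falls back to first.
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus

cpus = parse_cpu_list("1024-1031")
print(len(cpus), cpus[0], cpus[-1])
```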

The default_smp_affinity mask applies to all non-active IRQs, which are the
IRQs which have not yet been allocated/activated, and hence which lack a
/proc/irq/[0-9]* directory.

The node file on an SMP system shows the node to which the device using the IRQ
reports itself as being attached. This hardware locality information does not
include information about any possible driver locality preference.

prof_cpu_mask specifies which CPUs are to be profiled by the system wide
profiler. Default value is ffffffff (all cpus if there are only 32 of them).

IRQ routing is handled by the IO-APIC, which distributes each IRQ round robin
among all the CPUs that are allowed to handle it. As usual the kernel has
more info than you and does a better job than you, so the defaults are the
best choice for almost everyone.  [Note this applies only to those IO-APICs
that support "Round Robin" interrupt distribution.]

There are  three  more  important subdirectories in /proc: net, scsi, and sys.
The general  rule  is  that  the  contents,  or  even  the  existence of these
directories, depend  on your kernel configuration. If SCSI is not enabled, the
directory scsi  may  not  exist. The same is true of net, which is there only
when networking support is present in the running kernel.
657
658The slabinfo  file  gives  information  about  memory usage at the slab level.
659Linux uses  slab  pools for memory management above page level in version 2.2.
660Commonly used  objects  have  their  own  slab  pool (such as network buffers,
661directory cache, and so on).
662
663..............................................................................
664
665> cat /proc/buddyinfo
666
667Node 0, zone      DMA      0      4      5      4      4      3 ...
668Node 0, zone   Normal      1      0      0      1    101      8 ...
669Node 0, zone  HighMem      2      0      0      1      1      0 ...
670
671External fragmentation is a problem under some workloads, and buddyinfo is a
672useful tool for helping diagnose these problems.  Buddyinfo will give you a
673clue as to how big an area you can safely allocate, or why a previous
674allocation failed.
675
Each column represents the number of free blocks of a certain order, where a
block of order n is 2^n contiguous pages.  In this case, there are 0 chunks of
2^0*PAGE_SIZE available in ZONE_DMA, 4 chunks of 2^1*PAGE_SIZE in ZONE_DMA,
101 chunks of 2^4*PAGE_SIZE available in ZONE_NORMAL, etc.
680
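The column arithmetic can be sketched in Python as follows. The sample text
repeats the first six columns of the output above (the elided columns are
left out), the helper names are made up for this example, and the 4 kB page
size is an assumption that does not hold on every architecture.

```python
PAGE_SIZE_KB = 4  # assumption: 4 kB pages; your system may differ

SAMPLE = """\
Node 0, zone      DMA      0      4      5      4      4      3
Node 0, zone   Normal      1      0      0      1    101      8
Node 0, zone  HighMem      2      0      0      1      1      0
"""


def parse_buddyinfo(text):
    """Map (node, zone) to the list of free block counts, indexed by order."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "Node":
            continue
        node = int(fields[1].rstrip(","))
        zone = fields[3]
        zones[(node, zone)] = [int(count) for count in fields[4:]]
    return zones


def free_kb(counts, page_size_kb=PAGE_SIZE_KB):
    """Total free memory in kB: each block of order n spans 2**n pages."""
    return sum(count * (2 ** order) * page_size_kb
               for order, count in enumerate(counts))


def largest_free_order(counts):
    """Highest order with at least one free block, or None if there is none."""
    orders = [order for order, count in enumerate(counts) if count > 0]
    return max(orders) if orders else None


zones = parse_buddyinfo(SAMPLE)
print(zones[(0, "DMA")][1])                       # free order-1 blocks in DMA
print(largest_free_order(zones[(0, "Normal")]))   # largest order still free
print(free_kb(zones[(0, "DMA")]))                 # kB covered by listed columns
```

The largest order with a non-zero count is the "clue as to how big an area
you can safely allocate" mentioned above.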
681More information relevant to external fragmentation can be found in
682pagetypeinfo.
683
684> cat /proc/pagetypeinfo
685Page block order: 9
686Pages per block:  512
687
688Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
689Node    0, zone      DMA, type    Unmovable      0      0      0      1      1      1      1      1      1      1      0
690Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
691Node    0, zone      DMA, type      Movable      1      1      2      1      2      1      1      0      1      0      2
692Node    0, zone      DMA, type      Reserve      0      0      0      0      0      0      0      0      0      1      0
693Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
694Node    0, zone    DMA32, type    Unmovable    103     54     77      1      1      1     11      8      7      1      9
695Node    0, zone    DMA32, type  Reclaimable      0      0      2      1      0      0      0      0      1      0      0
696Node    0, zone    DMA32, type      Movable    169    152    113     91     77     54     39     13      6      1    452
697Node    0, zone    DMA32, type      Reserve      1      2      2      2      2      0      1      1      1      1      0
698Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
699
700Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate
701Node 0, zone      DMA            2            0            5            1            0
702Node 0, zone    DMA32           41            6          967            2            0
703
Fragmentation avoidance in the kernel works by grouping pages of the same
migrate type into contiguous regions of memory called page blocks.
A page block is typically the size of the default hugepage size, e.g. 2MB on
X86-64. By keeping pages grouped based on their ability to move, the kernel
can reclaim pages within a page block to satisfy a high-order allocation.
709
The pagetypeinfo file begins with information on the size of a page block. It
then gives the same type of information as buddyinfo, except broken down
by migrate type, and finishes with details on how many page blocks of each
type exist.
714
715If min_free_kbytes has been tuned correctly (recommendations made by hugeadm
716from libhugetlbfs http://sourceforge.net/projects/libhugetlbfs/), one can
717make an estimate of the likely number of huge pages that can be allocated
718at a given point in time. All the "Movable" blocks should be allocatable
719unless memory has been mlock()'d. Some of the Reclaimable blocks should
720also be allocatable although a lot of filesystem metadata may have to be
721reclaimed to achieve this.
722
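As a rough illustration of that estimate, the following Python sketch adds up
the "Movable" column of the block summary shown above. The function names are
made up for this example, and the result is only an upper bound under the
assumptions stated in the text (page block size equal to the huge page size,
no mlock()'d memory).

```python
SAMPLE = """\
Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate
Node 0, zone      DMA            2            0            5            1            0
Node 0, zone    DMA32           41            6          967            2            0
"""


def parse_block_counts(text):
    """Return {(node, zone): {migrate_type: block_count}} from the summary."""
    types = []
    result = {}
    for line in text.splitlines():
        if line.startswith("Number of blocks type"):
            # The migrate type names follow the four words of the title.
            types = line.split()[4:]
        elif types and line.startswith("Node"):
            fields = line.split()
            node = int(fields[1].rstrip(","))
            counts = [int(value) for value in fields[4:]]
            result[(node, fields[3])] = dict(zip(types, counts))
    return result


def movable_blocks(table):
    """Sum the Movable page blocks over all nodes and zones."""
    return sum(counts.get("Movable", 0) for counts in table.values())


table = parse_block_counts(SAMPLE)
blocks = movable_blocks(table)
print(blocks, "movable page blocks")
print(blocks * 2, "MB if each block is 2MB")  # 2MB blocks are an assumption
```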
723..............................................................................
724
725meminfo:
726
727Provides information about distribution and utilization of memory.  This
728varies by architecture and compile options.  The following is from a
72916GB PIII, which has highmem enabled.  You may not have all of these fields.
730
731> cat /proc/meminfo
732
733The "Locked" indicates whether the mapping is locked in memory or not.
734
735
736MemTotal:     16344972 kB
737MemFree:      13634064 kB
738Buffers:          3656 kB
739Cached:        1195708 kB
740SwapCached:          0 kB
741Active:         891636 kB
742Inactive:      1077224 kB
743HighTotal:    15597528 kB
744HighFree:     13629632 kB
745LowTotal:       747444 kB
746LowFree:          4432 kB
747SwapTotal:           0 kB
748SwapFree:            0 kB
749Dirty:             968 kB
750Writeback:           0 kB
751AnonPages:      861800 kB
752Mapped:         280372 kB
753Slab:           284364 kB
754SReclaimable:   159856 kB
755SUnreclaim:     124508 kB
756PageTables:      24448 kB
757NFS_Unstable:        0 kB
758Bounce:              0 kB
759WritebackTmp:        0 kB
760CommitLimit:   7669796 kB
761Committed_AS:   100056 kB
762VmallocTotal:   112216 kB
763VmallocUsed:       428 kB
764VmallocChunk:   111088 kB
765
766    MemTotal: Total usable ram (i.e. physical ram minus a few reserved
767              bits and the kernel binary code)
768     MemFree: The sum of LowFree+HighFree
     Buffers: Relatively temporary storage for raw disk blocks;
              shouldn't get tremendously large (20MB or so)
771      Cached: in-memory cache for files read from the disk (the
772              pagecache).  Doesn't include SwapCached
773  SwapCached: Memory that once was swapped out, is swapped back in but
774              still also is in the swapfile (if memory is needed it
775              doesn't need to be swapped out AGAIN because it is already
776              in the swapfile. This saves I/O)
      Active: Memory that has been used more recently and is usually not
              reclaimed unless absolutely necessary.
779    Inactive: Memory which has been less recently used.  It is more
780              eligible to be reclaimed for other purposes
781   HighTotal:
    HighFree: Highmem is all memory above ~860MB of physical memory.
783              Highmem areas are for use by userspace programs, or
784              for the pagecache.  The kernel must use tricks to access
785              this memory, making it slower to access than lowmem.
786    LowTotal:
787     LowFree: Lowmem is memory which can be used for everything that
788              highmem can be used for, but it is also available for the
789              kernel's use for its own data structures.  Among many
790              other things, it is where everything from the Slab is
791              allocated.  Bad things happen when you're out of lowmem.
792   SwapTotal: total amount of swap space available
    SwapFree: Amount of swap space that is currently unused
795       Dirty: Memory which is waiting to get written back to the disk
796   Writeback: Memory which is actively being written back to the disk
797   AnonPages: Non-file backed pages mapped into userspace page tables
      Mapped: files which have been mmapped, such as libraries
799        Slab: in-kernel data structures cache
800SReclaimable: Part of Slab, that might be reclaimed, such as caches
801  SUnreclaim: Part of Slab, that cannot be reclaimed on memory pressure
802  PageTables: amount of memory dedicated to the lowest level of page
803              tables.
804NFS_Unstable: NFS pages sent to the server, but not yet committed to stable
805	      storage
806      Bounce: Memory used for block device "bounce buffers"
807WritebackTmp: Memory used by FUSE for temporary writeback buffers
 CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'),
              this is the total amount of memory currently available to
              be allocated on the system. This limit is only adhered to
              if strict overcommit accounting is enabled (mode 2 in
              'vm.overcommit_memory').
              The CommitLimit is calculated with the following formula:
              CommitLimit = ('vm.overcommit_ratio' / 100 * Physical RAM) + Swap
              For example, on a system with 1G of physical RAM and 7G
              of swap with a 'vm.overcommit_ratio' of 30, it would
              yield a CommitLimit of 7.3G.
818              For more details, see the memory overcommit documentation
819              in vm/overcommit-accounting.
820Committed_AS: The amount of memory presently allocated on the system.
821              The committed memory is a sum of all of the memory which
822              has been allocated by processes, even if it has not been
823              "used" by them as of yet. A process which malloc()'s 1G
824              of memory, but only touches 300M of it will only show up
825              as using 300M of memory even if it has the address space
826              allocated for the entire 1G. This 1G is memory which has
827              been "committed" to by the VM and can be used at any time
828              by the allocating application. With strict overcommit
829              enabled on the system (mode 2 in 'vm.overcommit_memory'),
830              allocations which would exceed the CommitLimit (detailed
831              above) will not be permitted. This is useful if one needs
832              to guarantee that processes will not fail due to lack of
833              memory once that memory has been successfully allocated.
834VmallocTotal: total size of vmalloc memory area
835 VmallocUsed: amount of vmalloc area which is used
836VmallocChunk: largest contiguous block of vmalloc area which is free
837
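Two of the relationships described above can be checked with a short Python
sketch: MemFree as the sum of LowFree and HighFree, and the CommitLimit
formula. The function names are made up for this example, the sample values
are copied from the output above, and the formula is the simplified one given
in the text, with the ratio treated as a percentage. The kernel may apply
further adjustments that are not modelled here.

```python
SAMPLE = """\
MemTotal:     16344972 kB
MemFree:      13634064 kB
HighFree:     13629632 kB
LowFree:          4432 kB
SwapTotal:           0 kB
"""


def parse_meminfo(text):
    """Return {field: value}; values are the numbers as printed (mostly kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            values[name.strip()] = int(parts[0])
    return values


def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """CommitLimit per the formula above; overcommit_ratio is a percentage."""
    return ram_kb * overcommit_ratio // 100 + swap_kb


info = parse_meminfo(SAMPLE)
print(info["MemFree"] == info["LowFree"] + info["HighFree"])

# The worked example from the text: 1G of RAM, 7G of swap, ratio of 30.
GIB_KB = 1024 * 1024
limit = commit_limit_kb(1 * GIB_KB, 7 * GIB_KB, 30)
print(round(limit / GIB_KB, 1), "G")
```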
838..............................................................................
839
840vmallocinfo:
841
Provides information about vmalloced/vmapped areas. One line per area,
containing the virtual address range of the area, size in bytes,
caller information of the creator, and optional information depending
on the kind of area:
846
847 pages=nr    number of pages
848 phys=addr   if a physical address was specified
849 ioremap     I/O mapping (ioremap() and friends)
850 vmalloc     vmalloc() area
851 vmap        vmap()ed pages
852 user        VM_USERMAP area
 vpages      buffer for page pointers was vmalloced (huge area)
854 N<node>=nr  (Only on NUMA kernels)
855             Number of pages allocated on memory node <node>
856
857> cat /proc/vmallocinfo
8580xffffc20000000000-0xffffc20000201000 2101248 alloc_large_system_hash+0x204 ...
859  /0x2c0 pages=512 vmalloc N0=128 N1=128 N2=128 N3=128
8600xffffc20000201000-0xffffc20000302000 1052672 alloc_large_system_hash+0x204 ...
861  /0x2c0 pages=256 vmalloc N0=64 N1=64 N2=64 N3=64
8620xffffc20000302000-0xffffc20000304000    8192 acpi_tb_verify_table+0x21/0x4f...
863  phys=7fee8000 ioremap
8640xffffc20000304000-0xffffc20000307000   12288 acpi_tb_verify_table+0x21/0x4f...
865  phys=7fee7000 ioremap
8660xffffc2000031d000-0xffffc2000031f000    8192 init_vdso_vars+0x112/0x210
8670xffffc2000031f000-0xffffc2000032b000   49152 cramfs_uncompress_init+0x2e ...
868  /0x80 pages=11 vmalloc N0=3 N1=3 N2=2 N3=3
8690xffffc2000033a000-0xffffc2000033d000   12288 sys_swapon+0x640/0xac0      ...
870  pages=2 vmalloc N1=2
8710xffffc20000347000-0xffffc2000034c000   20480 xt_alloc_table_info+0xfe ...
872  /0x130 [x_tables] pages=4 vmalloc N0=4
8730xffffffffa0000000-0xffffffffa000f000   61440 sys_init_module+0xc27/0x1d00 ...
874   pages=14 vmalloc N2=14
8750xffffffffa000f000-0xffffffffa0014000   20480 sys_init_module+0xc27/0x1d00 ...
876   pages=4 vmalloc N1=4
8770xffffffffa0014000-0xffffffffa0017000   12288 sys_init_module+0xc27/0x1d00 ...
878   pages=2 vmalloc N1=2
8790xffffffffa0017000-0xffffffffa0022000   45056 sys_init_module+0xc27/0x1d00 ...
880   pages=10 vmalloc N0=10
881
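The fields of one entry can be pulled apart with a small Python sketch. The
sample line is the first entry above with its wrapped halves joined back into
the single line the file uses per area; the function name and the layout of
the returned dictionary are made up for this example.

```python
SAMPLE_LINE = (
    "0xffffc20000000000-0xffffc20000201000 2101248 "
    "alloc_large_system_hash+0x204/0x2c0 pages=512 vmalloc "
    "N0=128 N1=128 N2=128 N3=128"
)


def parse_vmallocinfo_line(line):
    """Split one vmallocinfo entry into range, size, caller and options."""
    tokens = line.split()
    start_text, end_text = tokens[0].split("-")
    entry = {
        "start": int(start_text, 16),
        "end": int(end_text, 16),
        "size": int(tokens[1]),
        "caller": tokens[2] if len(tokens) > 2 else None,
        "options": {},   # key=value items such as pages=512 or N0=128
        "flags": [],     # bare words such as vmalloc, ioremap or [x_tables]
    }
    for token in tokens[3:]:
        key, sep, value = token.partition("=")
        if sep:
            entry["options"][key] = value
        else:
            entry["flags"].append(token)
    return entry


entry = parse_vmallocinfo_line(SAMPLE_LINE)

# The size column matches the length of the address range.
print(entry["end"] - entry["start"] == entry["size"])

# On a NUMA kernel the per-node counts add up to the pages= value.
per_node = sum(int(value) for key, value in entry["options"].items()
               if key.startswith("N"))
print(per_node == int(entry["options"]["pages"]))
```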
882..............................................................................
883
884softirqs:
885
886Provides counts of softirq handlers serviced since boot time, for each cpu.
887
888> cat /proc/softirqs
889                CPU0       CPU1       CPU2       CPU3
890      HI:          0          0          0          0
891   TIMER:      27166      27120      27097      27034
892  NET_TX:          0          0          0         17
893  NET_RX:         42          0          0         39
894   BLOCK:          0          0        107       1121
895 TASKLET:          0          0          0        290
896   SCHED:      27035      26983      26971      26746
897 HRTIMER:          0          0          0          0
898     RCU:       1678       1769       2178       2250
899
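To see system-wide totals rather than per-cpu counts, the rows can be summed.
The following Python sketch does this for a few rows of the sample output
above; the function names are made up for this example.

```python
SAMPLE = """\
                CPU0       CPU1       CPU2       CPU3
      HI:          0          0          0          0
   TIMER:      27166      27120      27097      27034
  NET_RX:         42          0          0         39
   BLOCK:          0          0        107       1121
"""


def parse_softirqs(text):
    """Return (cpu_names, {softirq_name: [count per cpu]})."""
    lines = [line for line in text.splitlines() if line.strip()]
    cpus = lines[0].split()
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        table[name.strip()] = [int(value) for value in rest.split()]
    return cpus, table


def totals(table):
    """Sum each softirq row across all CPUs."""
    return {name: sum(counts) for name, counts in table.items()}


cpus, table = parse_softirqs(SAMPLE)
for name, total in totals(table).items():
    print(name, total)
```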
900
9011.3 IDE devices in /proc/ide
902----------------------------
903
904The subdirectory /proc/ide contains information about all IDE devices of which
905the kernel  is  aware.  There is one subdirectory for each IDE controller, the
906file drivers  and a link for each IDE device, pointing to the device directory
907in the controller specific subtree.
908
909The file  drivers  contains general information about the drivers used for the
910IDE devices:
911
912  > cat /proc/ide/drivers
913  ide-cdrom version 4.53
914  ide-disk version 1.08
915
916More detailed  information  can  be  found  in  the  controller  specific
917subdirectories. These  are  named  ide0,  ide1  and  so  on.  Each  of  these
918directories contains the files shown in table 1-6.
919
920
921Table 1-6: IDE controller info in  /proc/ide/ide?
922..............................................................................
923 File    Content
924 channel IDE channel (0 or 1)
925 config  Configuration (only for PCI/IDE bridge)
926 mate    Mate name
927 model   Type/Chipset of IDE controller
928..............................................................................
929
930Each device  connected  to  a  controller  has  a separate subdirectory in the
931controllers directory.  The  files  listed in table 1-7 are contained in these
932directories.
933
934
935Table 1-7: IDE device information
936..............................................................................
937 File             Content
 cache            Size of the device's cache
 capacity         Capacity of the medium (in 512-byte blocks)
940 driver           driver and version
941 geometry         physical and logical geometry
942 identify         device identify block
943 media            media type
944 model            device identifier
945 settings         device setup
946 smart_thresholds IDE disk management thresholds
947 smart_values     IDE disk management values
948..............................................................................
949
950The most  interesting  file is settings. This file contains a nice overview of
951the drive parameters:
952
953  # cat /proc/ide/ide0/hda/settings
954  name                    value           min             max             mode
955  ----                    -----           ---             ---             ----
956  bios_cyl                526             0               65535           rw
957  bios_head               255             0               255             rw
958  bios_sect               63              0               63              rw
959  breada_readahead        4               0               127             rw
960  bswap                   0               0               1               r
961  file_readahead          72              0               2097151         rw
962  io_32bit                0               0               3               rw
963  keepsettings            0               0               1               rw
964  max_kb_per_request      122             1               127             rw
965  multcount               0               0               8               rw
966  nice1                   1               0               1               rw
967  nowerr                  0               0               1               rw
968  pio_mode                write-only      0               255             w
969  slow                    0               0               1               rw
970  unmaskirq               0               0               1               rw
971  using_dma               0               0               1               rw
972
973
9741.4 Networking info in /proc/net
975--------------------------------
976
977The subdirectory  /proc/net  follows  the  usual  pattern. Table 1-8 shows the
978additional values  you  get  for  IP  version 6 if you configure the kernel to
979support this. Table 1-9 lists the files and their meaning.
980
981
982Table 1-8: IPv6 info in /proc/net
983..............................................................................
984 File       Content
985 udp6       UDP sockets (IPv6)
986 tcp6       TCP sockets (IPv6)
987 raw6       Raw device statistics (IPv6)
988 igmp6      IP multicast addresses, which this host joined (IPv6)
989 if_inet6   List of IPv6 interface addresses
990 ipv6_route Kernel routing table for IPv6
991 rt6_stats  Global IPv6 routing tables statistics
992 sockstat6  Socket statistics (IPv6)
 snmp6      SNMP data (IPv6)
994..............................................................................
995
996
997Table 1-9: Network info in /proc/net
998..............................................................................
999 File          Content
1000 arp           Kernel  ARP table
1001 dev           network devices with statistics
 dev_mcast     the Layer2 multicast groups a device is listening to
1003               (interface index, label, number of references, number of bound
1004               addresses).
1005 dev_stat      network device status
1006 ip_fwchains   Firewall chain linkage
1007 ip_fwnames    Firewall chain names
1008 ip_masq       Directory containing the masquerading tables
1009 ip_masquerade Major masquerading table
1010 netstat       Network statistics
1011 raw           raw device statistics
1012 route         Kernel routing table
1013 rpc           Directory containing rpc info
1014 rt_cache      Routing cache
1015 snmp          SNMP data
1016 sockstat      Socket statistics
1017 tcp           TCP  sockets
1018 tr_rif        Token ring RIF routing table
1019 udp           UDP sockets
1020 unix          UNIX domain sockets
1021 wireless      Wireless interface data (Wavelan etc)
1022 igmp          IP multicast addresses, which this host joined
1023 psched        Global packet scheduler parameters.
1024 netlink       List of PF_NETLINK sockets
1025 ip_mr_vifs    List of multicast virtual interfaces
1026 ip_mr_cache   List of multicast routing cache
1027..............................................................................
1028
1029You can  use  this  information  to see which network devices are available in
1030your system and how much traffic was routed over those devices:
1031
1032  > cat /proc/net/dev
1033  Inter-|Receive                                                   |[...
1034   face |bytes    packets errs drop fifo frame compressed multicast|[...
1035      lo:  908188   5596     0    0    0     0          0         0 [...
1036    ppp0:15475140  20721   410    0    0   410          0         0 [...
1037    eth0:  614530   7085     0    0    0     0          0         1 [...
1038
1039  ...] Transmit
1040  ...] bytes    packets errs drop fifo colls carrier compressed
1041  ...]  908188     5596    0    0    0     0       0          0
1042  ...] 1375103    17405    0    0    0     0       0          0
1043  ...] 1703981     5535    0    0    0     3       0          0
1044
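A program can read the same counters by splitting each line at the colon.
The Python sketch below uses the sample values from above, with the receive
and transmit halves joined back into one line per interface. The function
name and the dictionary layout are made up for this example, and the field
names are taken from the column headers shown above.

```python
RX_FIELDS = ["bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed"]

SAMPLE = """\
    lo:  908188   5596     0    0    0     0          0         0   908188     5596    0    0    0     0       0          0
  ppp0:15475140  20721   410    0    0   410          0         0  1375103    17405    0    0    0     0       0          0
  eth0:  614530   7085     0    0    0     0          0         1  1703981     5535    0    0    0     3       0          0
"""


def parse_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} of integer counters."""
    stats = {}
    for line in text.splitlines():
        # Split at the colon first: a large counter may directly follow it.
        name, sep, rest = line.partition(":")
        values = rest.split()
        if not sep or len(values) != 16:
            continue  # header lines and anything unexpected are skipped
        numbers = [int(value) for value in values]
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:])),
        }
    return stats


stats = parse_net_dev(SAMPLE)
for name, counters in stats.items():
    print(name, counters["rx"]["bytes"], counters["tx"]["bytes"])
```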
1045In addition, each Channel Bond interface has its own directory.  For
1046example, the bond0 device will have a directory called /proc/net/bond0/.
1047It will contain information that is specific to that bond, such as the
current slaves of the bond, the link status of the slaves, and how
many times each slave's link has failed.
1050
10511.5 SCSI info
1052-------------
1053
1054If you  have  a  SCSI  host adapter in your system, you'll find a subdirectory
1055named after  the driver for this adapter in /proc/scsi. You'll also see a list
1056of all recognized SCSI devices in /proc/scsi:
1057
  > cat /proc/scsi/scsi
1059  Attached devices:
1060  Host: scsi0 Channel: 00 Id: 00 Lun: 00
1061    Vendor: IBM      Model: DGHS09U          Rev: 03E0
1062    Type:   Direct-Access                    ANSI SCSI revision: 03
1063  Host: scsi0 Channel: 00 Id: 06 Lun: 00
1064    Vendor: PIONEER  Model: CD-ROM DR-U06S   Rev: 1.04
1065    Type:   CD-ROM                           ANSI SCSI revision: 02
1066
1067
1068The directory  named  after  the driver has one file for each adapter found in
the system.  These  files  contain information about the controller, including
the IRQ used  and  the  IO  address range. The amount of information shown
depends on  the adapter you use. The example shows the output for an Adaptec
1072AHA-2940 SCSI adapter:
1073
1074  > cat /proc/scsi/aic7xxx/0
1075
1076  Adaptec AIC7xxx driver version: 5.1.19/3.2.4
1077  Compile Options:
1078    TCQ Enabled By Default : Disabled
1079    AIC7XXX_PROC_STATS     : Disabled
1080    AIC7XXX_RESET_DELAY    : 5
1081  Adapter Configuration:
1082             SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter
1083                             Ultra Wide Controller
1084      PCI MMAPed I/O Base: 0xeb001000
1085   Adapter SEEPROM Config: SEEPROM found and used.
1086        Adaptec SCSI BIOS: Enabled
1087                      IRQ: 10
1088                     SCBs: Active 0, Max Active 2,
1089                           Allocated 15, HW 16, Page 255
1090               Interrupts: 160328
1091        BIOS Control Word: 0x18b6
1092     Adapter Control Word: 0x005b
1093     Extended Translation: Enabled
1094  Disconnect Enable Flags: 0xffff
1095       Ultra Enable Flags: 0x0001
1096   Tag Queue Enable Flags: 0x0000
1097  Ordered Queue Tag Flags: 0x0000
1098  Default Tag Queue Depth: 8
1099      Tagged Queue By Device array for aic7xxx host instance 0:
1100        {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255}
1101      Actual queue depth per device for aic7xxx host instance 0:
1102        {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}
1103  Statistics:
1104  (scsi0:0:0:0)
1105    Device using Wide/Sync transfers at 40.0 MByte/sec, offset 8
1106    Transinfo settings: current(12/8/1/0), goal(12/8/1/0), user(12/15/1/0)
1107    Total transfers 160151 (74577 reads and 85574 writes)
1108  (scsi0:0:6:0)
1109    Device using Narrow/Sync transfers at 5.0 MByte/sec, offset 15
1110    Transinfo settings: current(50/15/0/0), goal(50/15/0/0), user(50/15/0/0)
1111    Total transfers 0 (0 reads and 0 writes)
1112
1113
11141.6 Parallel port info in /proc/parport
1115---------------------------------------
1116
1117The directory  /proc/parport  contains information about the parallel ports of
1118your system.  It  has  one  subdirectory  for  each port, named after the port
1119number (0,1,2,...).
1120
1121These directories contain the four files shown in Table 1-10.
1122
1123
1124Table 1-10: Files in /proc/parport
1125..............................................................................
1126 File      Content
1127 autoprobe Any IEEE-1284 device ID information that has been acquired.
1128 devices   list of the device drivers using that port. A + will appear by the
1129           name of the device currently using the port (it might not appear
1130           against any).
1131 hardware  Parallel port's base address, IRQ line and DMA channel.
1132 irq       IRQ that parport is using for that port. This is in a separate
1133           file to allow you to alter it by writing a new value in (IRQ
1134           number or none).
1135..............................................................................
1136
11371.7 TTY info in /proc/tty
1138-------------------------
1139
Information about  the  available  and actually used ttys can be found in the
directory /proc/tty. You'll  find  entries  for drivers and line disciplines in
this directory, as shown in Table 1-11.
1143
1144
1145Table 1-11: Files in /proc/tty
1146..............................................................................
1147 File          Content
1148 drivers       list of drivers and their usage
1149 ldiscs        registered line disciplines
 driver/serial usage statistics and status of single tty lines
1151..............................................................................
1152
To see  which  ttys  are  currently in use, you can simply look into the file
1154/proc/tty/drivers:
1155
1156  > cat /proc/tty/drivers
1157  pty_slave            /dev/pts      136   0-255 pty:slave
1158  pty_master           /dev/ptm      128   0-255 pty:master
1159  pty_slave            /dev/ttyp       3   0-255 pty:slave
1160  pty_master           /dev/pty        2   0-255 pty:master
1161  serial               /dev/cua        5   64-67 serial:callout
1162  serial               /dev/ttyS       4   64-67 serial
1163  /dev/tty0            /dev/tty0       4       0 system:vtmaster
1164  /dev/ptmx            /dev/ptmx       5       2 system
1165  /dev/console         /dev/console    5       1 system:console
1166  /dev/tty             /dev/tty        5       0 system:/dev/tty
1167  unknown              /dev/tty        4    1-63 console
1168
1169
11701.8 Miscellaneous kernel statistics in /proc/stat
1171-------------------------------------------------
1172
1173Various pieces   of  information about  kernel activity  are  available in the
1174/proc/stat file.  All  of  the numbers reported  in  this file are  aggregates
1175since the system first booted.  For a quick look, simply cat the file:
1176
1177  > cat /proc/stat
1178  cpu  2255 34 2290 22625563 6290 127 456 0 0
1179  cpu0 1132 34 1441 11311718 3675 127 438 0 0
1180  cpu1 1123 0 849 11313845 2614 0 18 0 0
1181  intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...]
1182  ctxt 1990473
1183  btime 1062191376
1184  processes 2915
1185  procs_running 1
1186  procs_blocked 0
1187  softirq 183433 0 21755 12 39 1137 231 21459 2263
1188
1189The very first  "cpu" line aggregates the  numbers in all  of the other "cpuN"
1190lines.  These numbers identify the amount of time the CPU has spent performing
1191different kinds of work.  Time units are in USER_HZ (typically hundredths of a
1192second).  The meanings of the columns are as follows, from left to right:
1193
1194- user: normal processes executing in user mode
1195- nice: niced processes executing in user mode
1196- system: processes executing in kernel mode
1197- idle: twiddling thumbs
1198- iowait: waiting for I/O to complete
1199- irq: servicing interrupts
1200- softirq: servicing softirqs
1201- steal: involuntary wait
1202- guest: running a normal guest
1203- guest_nice: running a niced guest
1204
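The columns can be turned into a rough utilisation figure with the Python
sketch below. It uses the sample output above, which carries one value fewer
than the ten columns listed, so the names are simply matched from left to
right. The function names are made up for this example. Leaving the guest
columns out of the total assumes that guest time is already counted in user
and nice time.

```python
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest", "guest_nice"]

SAMPLE = """\
cpu  2255 34 2290 22625563 6290 127 456 0 0
cpu0 1132 34 1441 11311718 3675 127 438 0 0
cpu1 1123 0 849 11313845 2614 0 18 0 0
ctxt 1990473
btime 1062191376
"""


def parse_cpu_times(text):
    """Return {"cpu": {...}, "cpu0": {...}, ...} with times in USER_HZ."""
    cpus = {}
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu"):
            values = [int(value) for value in fields[1:]]
            # zip() stops at the shorter list, so missing columns are omitted.
            cpus[fields[0]] = dict(zip(CPU_FIELDS, values))
    return cpus


def busy_fraction(times):
    """Fraction of time since boot not spent in idle or iowait."""
    counted = [name for name in times if name not in ("guest", "guest_nice")]
    total = sum(times[name] for name in counted)
    idle = times.get("idle", 0) + times.get("iowait", 0)
    return (total - idle) / total if total else 0.0


cpus = parse_cpu_times(SAMPLE)
for name, times in cpus.items():
    print(name, "busy fraction:", busy_fraction(times))
```

Because the counters are totals since boot, a monitoring tool would normally
take two readings some time apart and apply the same calculation to the
differences.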
1205The "intr" line gives counts of interrupts  serviced since boot time, for each
1206of the  possible system interrupts.   The first  column  is the  total of  all
1207interrupts serviced; each  subsequent column is the  total for that particular
1208interrupt.
1209
1210The "ctxt" line gives the total number of context switches across all CPUs.
1211
1212The "btime" line gives  the time at which the  system booted, in seconds since
1213the Unix epoch.
1214
1215The "processes" line gives the number  of processes and threads created, which
1216includes (but  is not limited  to) those  created by  calls to the  fork() and
1217clone() system calls.
1218
1219The "procs_running" line gives the total number of threads that are
1220running or ready to run (i.e., the total number of runnable threads).
1221
1222The   "procs_blocked" line gives  the  number of  processes currently blocked,
1223waiting for I/O to complete.
1224
1225The "softirq" line gives counts of softirqs serviced since boot time, for each
1226of the possible system softirqs. The first column is the total of all
1227softirqs serviced; each subsequent column is the total for that particular
1228softirq.
1229
1230
12311.9 Ext4 file system parameters
1232------------------------------
1233
1234Information about mounted ext4 file systems can be found in
1235/proc/fs/ext4.  Each mounted filesystem will have a directory in
/proc/fs/ext4 based on its device name (e.g., /proc/fs/ext4/hdc or
1237/proc/fs/ext4/dm-0).   The files in each per-device directory are shown
1238in Table 1-12, below.
1239
1240Table 1-12: Files in /proc/fs/ext4/<devname>
1241..............................................................................
1242 File            Content
1243 mb_groups       details of multiblock allocator buddy cache of free blocks
1244..............................................................................
1245
1.10 /proc/consoles
-------------------
1248Shows registered system console lines.
1249
1250To see which character device lines are currently used for the system console
1251/dev/console, you may simply look into the file /proc/consoles:
1252
1253  > cat /proc/consoles
1254  tty0                 -WU (ECp)       4:7
1255  ttyS0                -W- (Ep)        4:64
1256
1257The columns are:
1258
1259  device               name of the device
1260  operations           R = can do read operations
1261                       W = can do write operations
1262                       U = can do unblank
1263  flags                E = it is enabled
1264                       C = it is preferred console
1265                       B = it is primary boot console
1266                       p = it is used for printk buffer
1267                       b = it is not a TTY but a Braille device
1268                       a = it is safe to use when cpu is offline
1269  major:minor          major and minor number of the device separated by a colon
1270
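The column layout can be decoded with a short Python sketch like the one
below. The function name and the dictionary layout are made up for this
example, and the letter meanings are copied from the list above. The pattern
also tolerates blanks inside the parentheses and a missing major:minor
column, in case a kernel prints the line that way.

```python
import re

OPERATIONS = {"R": "read", "W": "write", "U": "unblank"}
FLAGS = {
    "E": "enabled",
    "C": "preferred console",
    "B": "primary boot console",
    "p": "used for printk buffer",
    "b": "Braille device",
    "a": "safe to use when cpu is offline",
}

LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+\(([^)]*)\)\s*(?:(\d+):(\d+))?")


def parse_console_line(line):
    """Decode one /proc/consoles line; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    device, operations, flags, major, minor = match.groups()
    return {
        "device": device,
        "operations": [OPERATIONS[c] for c in operations if c in OPERATIONS],
        "flags": [FLAGS[c] for c in flags if c in FLAGS],
        "major": int(major) if major is not None else None,
        "minor": int(minor) if minor is not None else None,
    }


for line in ("tty0                 -WU (ECp)       4:7",
             "ttyS0                -W- (Ep)        4:64"):
    print(parse_console_line(line))
```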
1271------------------------------------------------------------------------------
1272Summary
1273------------------------------------------------------------------------------
1274The /proc file system serves information about the running system. It not only
1275allows access to process data but also allows you to request the kernel status
1276by reading files in the hierarchy.
1277
The directory  structure  of /proc reflects the types of information it holds,
which makes it easy, if not always obvious, to find specific data.
1280------------------------------------------------------------------------------
1281
1282------------------------------------------------------------------------------
1283CHAPTER 2: MODIFYING SYSTEM PARAMETERS
1284------------------------------------------------------------------------------
1285
1286------------------------------------------------------------------------------
1287In This Chapter
1288------------------------------------------------------------------------------
1289* Modifying kernel parameters by writing into files found in /proc/sys
1290* Exploring the files which modify certain parameters
1291* Review of the /proc/sys file tree
1292------------------------------------------------------------------------------
1293
1294
1295A very  interesting part of /proc is the directory /proc/sys. This is not only
1296a source  of  information,  it also allows you to change parameters within the
1297kernel. Be  very  careful  when attempting this. You can optimize your system,
1298but you  can  also  cause  it  to  crash.  Never  alter kernel parameters on a
1299production system.  Set  up  a  development machine and test to make sure that
1300everything works  the  way  you want it to. You may have no alternative but to
1301reboot the machine once an error has been made.
1302
To change  a  value,  simply  echo  the new value into the file. You need to be
root to do this. You  can  create  your  own  boot script to perform this every
time your system boots.
1307
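The same write can be done from a program. The Python sketch below mirrors
what echoing a value into a file does; the function names are made up for
this example, and the demonstration writes to a scratch file with a
hypothetical name, so that running it changes nothing in the kernel. Pointing
it at a real file under /proc/sys requires root and deserves the same care as
described above.

```python
import os
import tempfile


def write_parameter(path, value):
    """Write value followed by a newline, as echoing it into path would."""
    with open(path, "w") as handle:
        handle.write(f"{value}\n")


def read_parameter(path):
    """Return the current contents of path without the trailing newline."""
    with open(path) as handle:
        return handle.read().strip()


# Demonstration against a scratch file rather than a file in /proc/sys.
with tempfile.TemporaryDirectory() as scratch:
    demo = os.path.join(scratch, "example_parameter")  # hypothetical name
    write_parameter(demo, 1)
    print(read_parameter(demo))
```

Reading the value back after writing it is a cheap way to confirm that the
new setting was accepted.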
1308The files  in /proc/sys can be used to fine tune and monitor miscellaneous and
1309general things  in  the operation of the Linux kernel. Since some of the files
1310can inadvertently  disrupt  your  system,  it  is  advisable  to  read  both
1311documentation and  source  before actually making adjustments. In any case, be
1312very careful  when  writing  to  any  of these files. The entries in /proc may
1313change slightly between the 2.1.* and the 2.2 kernel, so if there is any doubt
1314review the kernel documentation in the directory /usr/src/linux/Documentation.
1315This chapter  is  heavily  based  on the documentation included in the pre 2.2
1316kernels, and became part of it in version 2.2.1 of the Linux kernel.
1317
Please see the Documentation/sysctl/ directory for descriptions of these
entries.
1320
1321------------------------------------------------------------------------------
1322Summary
1323------------------------------------------------------------------------------
1324Certain aspects  of  kernel  behavior  can be modified at runtime, without the
1325need to  recompile  the kernel, or even to reboot the system. The files in the
1326/proc/sys tree  can  not only be read, but also modified. You can use the echo
command to write values into these files, thereby changing the default settings
1328of the kernel.
1329------------------------------------------------------------------------------
1330
1331------------------------------------------------------------------------------
1332CHAPTER 3: PER-PROCESS PARAMETERS
1333------------------------------------------------------------------------------
1334
3.1 /proc/<pid>/oom_adj & /proc/<pid>/oom_score_adj - Adjust the oom-killer score
---------------------------------------------------------------------------------
1337
These files can be used to adjust the badness heuristic used to select which
process gets killed in out of memory conditions.
1340
The badness heuristic assigns a value to each candidate task ranging from 0
(never kill) to 1000 (always kill) to determine which process is targeted.  The
units are roughly a proportion, along that range, of the allowed memory the
process may allocate from, based on an estimation of its current memory and
swap use.  For example, if a task is using all allowed memory, its badness
score will be 1000.  If it is using half of its allowed memory, its score will
be 500.
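
The proportion can be worked through with integer arithmetic.  This is only a
sketch of the scaling, not the kernel's actual computation; badness_score and
its two arguments (memory amounts in any one consistent unit) are hypothetical.

```shell
# badness_score USED ALLOWED
# Print USED as a proportion of ALLOWED on the 0..1000 scale.
# Both arguments are illustrative inputs in the same unit (e.g. kB).
badness_score() {
    echo $(( $1 * 1000 / $2 ))
}

badness_score 2048 2048    # all allowed memory in use: prints 1000
badness_score 1024 2048    # half of the allowed memory: prints 500
```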
1347
1348There is an additional factor included in the badness score: root
1349processes are given 3% extra memory over other tasks.
1350
1351The amount of "allowed" memory depends on the context in which the oom killer
1352was called.  If it is due to the memory assigned to the allocating task's cpuset
1353being exhausted, the allowed memory represents the set of mems assigned to that
1354cpuset.  If it is due to a mempolicy's node(s) being exhausted, the allowed
1355memory represents the set of mempolicy nodes.  If it is due to a memory
1356limit (or swap limit) being reached, the allowed memory is that configured
1357limit.  Finally, if it is due to the entire system being out of memory, the
1358allowed memory represents all allocatable resources.
1359
1360The value of /proc/<pid>/oom_score_adj is added to the badness score before it
1361is used to determine which task to kill.  Acceptable values range from -1000
1362(OOM_SCORE_ADJ_MIN) to +1000 (OOM_SCORE_ADJ_MAX).  This allows userspace to
1363polarize the preference for oom killing either by always preferring a certain
1364task or completely disabling it.  The lowest possible value, -1000, is
1365equivalent to disabling oom killing entirely for that task since it will always
1366report a badness score of 0.
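
The effect of the adjustment can be sketched as an addition followed by
clamping.  The clamping to 0..1000 is an assumption drawn from the score range
stated above, and adjusted_score is a hypothetical name; the kernel's own
calculation may differ in detail.

```shell
# adjusted_score BASE ADJ
# Add ADJ (-1000..1000) to BASE (0..1000) and keep the result within 0..1000.
# The clamping is an assumption based on the documented score range.
adjusted_score() {
    score=$(( $1 + $2 ))
    if [ "$score" -lt 0 ]; then
        score=0
    elif [ "$score" -gt 1000 ]; then
        score=1000
    fi
    echo "$score"
}
```

With this model, "adjusted_score 700 -1000" yields 0 for any base score, which
matches the statement that -1000 disables oom killing for the task.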
1367
1368Consequently, it is very simple for userspace to define the amount of memory to
1369consider for each task.  Setting a /proc/<pid>/oom_score_adj value of +500, for
1370example, is roughly equivalent to allowing the remainder of tasks sharing the
1371same system, cpuset, mempolicy, or memory controller resources to use at least
137250% more memory.  A value of -500, on the other hand, would be roughly
1373equivalent to discounting 50% of the task's allowed memory from being considered
1374as scoring against the task.
1375
1376For backwards compatibility with previous kernels, /proc/<pid>/oom_adj may also
1377be used to tune the badness score.  Its acceptable values range from -16
1378(OOM_ADJUST_MIN) to +15 (OOM_ADJUST_MAX) and a special value of -17
1379(OOM_DISABLE) to disable oom killing entirely for that task.  Its value is
1380scaled linearly with /proc/<pid>/oom_score_adj.
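
A rough sketch of the linear scaling follows.  It assumes that -17 maps to
-1000, that +15 is treated as the maximum of +1000, and that every other value
is multiplied by 1000/17 with the fraction dropped.  These details are
assumptions, not taken from the text above, and oom_adj_to_score_adj is a
hypothetical name.

```shell
# oom_adj_to_score_adj OOM_ADJ
# Assumed mapping: -17 (OOM_DISABLE) -> -1000, +15 (OOM_ADJUST_MAX) -> +1000,
# everything else scaled by 1000/17 with the fraction dropped.
oom_adj_to_score_adj() {
    if [ "$1" -eq 15 ]; then
        echo 1000
    else
        echo $(( $1 * 1000 / 17 ))
    fi
}
```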
1381
1382Writing to /proc/<pid>/oom_score_adj or /proc/<pid>/oom_adj will change the
1383other with its scaled value.
1384
1385The value of /proc/<pid>/oom_score_adj may be reduced no lower than the last
1386value set by a CAP_SYS_RESOURCE process. To reduce the value any lower
1387requires CAP_SYS_RESOURCE.
1388
NOTICE: /proc/<pid>/oom_adj is deprecated and will be removed; please see
Documentation/feature-removal-schedule.txt.
1391
Caveat: when a parent task is selected, the oom killer will sacrifice any
first-generation children with separate address spaces instead, if possible.
This keeps servers and important system daemons from being killed and loses the
minimal amount of work.
1396
1397
13983.2 /proc/<pid>/oom_score - Display current oom-killer score
1399-------------------------------------------------------------
1400
This file can be used to check the current score used by the oom-killer for
any given <pid>. Use it together with /proc/<pid>/oom_score_adj to tune which
process should be killed in an out-of-memory situation.
1404
1405
14063.3  /proc/<pid>/io - Display the IO accounting fields
1407-------------------------------------------------------
1408
This file contains IO statistics for each running process.
1410
1411Example
1412-------
1413
1414test:/tmp # dd if=/dev/zero of=/tmp/test.dat &
1415[1] 3828
1416
1417test:/tmp # cat /proc/3828/io
1418rchar: 323934931
1419wchar: 323929600
1420syscr: 632687
1421syscw: 632675
1422read_bytes: 0
1423write_bytes: 323932160
1424cancelled_write_bytes: 0
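
Because every line has the form "name: value", a single counter is easy to pull
out in a script.  The helper below is a small sketch; io_field is a
hypothetical name.

```shell
# io_field NAME
# Read "name: value" lines on stdin and print the value recorded for NAME.
io_field() {
    awk -F': *' -v name="$1" '$1 == name { print $2 }'
}
```

For example, "io_field write_bytes < /proc/3828/io" would print 323932160 for
the output shown above.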
1425
1426
1427Description
1428-----------
1429
1430rchar
1431-----
1432
1433I/O counter: chars read
1434The number of bytes which this task has caused to be read from storage. This
1435is simply the sum of bytes which this process passed to read() and pread().
1436It includes things like tty IO and it is unaffected by whether or not actual
1437physical disk IO was required (the read might have been satisfied from
pagecache).
1439
1440
1441wchar
1442-----
1443
1444I/O counter: chars written
The number of bytes which this task has caused, or will cause, to be written
to disk. Similar caveats apply here as with rchar.
1447
1448
1449syscr
1450-----
1451
1452I/O counter: read syscalls
1453Attempt to count the number of read I/O operations, i.e. syscalls like read()
1454and pread().
1455
1456
1457syscw
1458-----
1459
1460I/O counter: write syscalls
1461Attempt to count the number of write I/O operations, i.e. syscalls like
1462write() and pwrite().
1463
1464
1465read_bytes
1466----------
1467
1468I/O counter: bytes read
1469Attempt to count the number of bytes which this process really did cause to
1470be fetched from the storage layer. Done at the submit_bio() level, so it is
1471accurate for block-backed filesystems. <please add status regarding NFS and
1472CIFS at a later time>
1473
1474
1475write_bytes
1476-----------
1477
1478I/O counter: bytes written
1479Attempt to count the number of bytes which this process caused to be sent to
1480the storage layer. This is done at page-dirtying time.
1481
1482
1483cancelled_write_bytes
1484---------------------
1485
1486The big inaccuracy here is truncate. If a process writes 1MB to a file and
1487then deletes the file, it will in fact perform no writeout. But it will have
1488been accounted as having caused 1MB of write.
1489In other words: The number of bytes which this process caused to not happen,
1490by truncating pagecache. A task can cause "negative" IO too. If this task
1491truncates some dirty pagecache, some IO which another task has been accounted
1492for (in its write_bytes) will not be happening. We _could_ just subtract that
1493from the truncating task's write_bytes, but there is information loss in doing
1494that.
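
For a rough per-task view, userspace can do the subtraction itself while the
kernel keeps both counters separate.  The sketch below assumes input in the
"name: value" format of /proc/<pid>/io; net_write_bytes is a hypothetical name.

```shell
# net_write_bytes
# Read /proc/<pid>/io style lines on stdin and print
# write_bytes minus cancelled_write_bytes.
net_write_bytes() {
    written=0
    cancelled=0
    while IFS=': ' read -r name value; do
        case $name in
            write_bytes) written=$value ;;
            cancelled_write_bytes) cancelled=$value ;;
        esac
    done
    echo $(( $written - $cancelled ))
}
```

In the 1MB example above, a task that writes the file and then deletes it
before writeout would show 1048576 in both counters, and the helper prints 0.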
1495
1496
1497Note
1498----
1499
In its current implementation, this is a bit racy on 32-bit machines: if
process A reads process B's /proc/<pid>/io while process B is updating one of
those 64-bit counters, process A could see an intermediate result.
1503
1504
1505More information about this can be found within the taskstats documentation in
1506Documentation/accounting.
1507
15083.4 /proc/<pid>/coredump_filter - Core dump filtering settings
1509---------------------------------------------------------------
1510When a process is dumped, all anonymous memory is written to a core file as
1511long as the size of the core file isn't limited. But sometimes we don't want
1512to dump some memory segments, for example, huge shared memory. Conversely,
1513sometimes we want to save file-backed memory segments into a core file, not
1514only the individual files.
1515
1516/proc/<pid>/coredump_filter allows you to customize which memory segments
1517will be dumped when the <pid> process is dumped. coredump_filter is a bitmask
1518of memory types. If a bit of the bitmask is set, memory segments of the
1519corresponding memory type are dumped, otherwise they are not dumped.
1520
1521The following 7 memory types are supported:
1522  - (bit 0) anonymous private memory
1523  - (bit 1) anonymous shared memory
1524  - (bit 2) file-backed private memory
1525  - (bit 3) file-backed shared memory
  - (bit 4) ELF header pages in file-backed private memory areas (effective
            only if bit 2 is cleared)
1528  - (bit 5) hugetlb private memory
1529  - (bit 6) hugetlb shared memory
1530
1531  Note that MMIO pages such as frame buffer are never dumped and vDSO pages
1532  are always dumped regardless of the bitmask status.
1533
  Note that bits 0-4 don't affect any hugetlb memory. hugetlb memory is only
  affected by bits 5-6.
1536
The default value of coredump_filter is 0x23; this means all anonymous memory
1538segments and hugetlb private memory are dumped.
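
To see which memory types a given mask selects, the bits can be tested one by
one.  The helper below is a sketch using the bit assignments listed above;
decode_coredump_filter is a hypothetical name.

```shell
# decode_coredump_filter MASK
# List the memory types selected by MASK (e.g. 0x23), one per line.
decode_coredump_filter() {
    mask=$(( $1 ))
    bit=0
    for name in \
        "anonymous private" \
        "anonymous shared" \
        "file-backed private" \
        "file-backed shared" \
        "ELF headers" \
        "hugetlb private" \
        "hugetlb shared"
    do
        # Shift the mask right and test the lowest bit.
        if [ $(( ($mask >> $bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $name"
        fi
        bit=$(( $bit + 1 ))
    done
}
```

Running "decode_coredump_filter 0x23" prints bits 0, 1 and 5, matching the
description of the default value.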
1539
1540If you don't want to dump all shared memory segments attached to pid 1234,
1541write 0x21 to the process's proc file.
1542
1543  $ echo 0x21 > /proc/1234/coredump_filter
1544
1545When a new process is created, the process inherits the bitmask status from its
1546parent. It is useful to set up coredump_filter before the program runs.
1547For example:
1548
1549  $ echo 0x7 > /proc/self/coredump_filter
1550  $ ./some_program
1551
15523.5	/proc/<pid>/mountinfo - Information about mounts
1553--------------------------------------------------------
1554
1555This file contains lines of the form:
1556
155736 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
1558(1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11)
1559
1560(1) mount ID:  unique identifier of the mount (may be reused after umount)
1561(2) parent ID:  ID of parent (or of self for the top of the mount tree)
1562(3) major:minor:  value of st_dev for files on filesystem
1563(4) root:  root of the mount within the filesystem
1564(5) mount point:  mount point relative to the process's root
1565(6) mount options:  per mount options
1566(7) optional fields:  zero or more fields of the form "tag[:value]"
1567(8) separator:  marks the end of the optional fields
1568(9) filesystem type:  name of filesystem of the form "type[.subtype]"
1569(10) mount source:  filesystem specific information or "none"
1570(11) super options:  per super block options
1571
1572Parsers should ignore all unrecognised optional fields.  Currently the
1573possible optional fields are:
1574
1575shared:X  mount is shared in peer group X
1576master:X  mount is slave to peer group X
1577propagate_from:X  mount is slave and receives propagation from peer group X (*)
1578unbindable  mount is unbindable
1579
1580(*) X is the closest dominant peer group under the process's root.  If
1581X is the immediate master of the mount, or if there's no dominant peer
1582group under the same root, then only the "master:X" field is present
1583and not the "propagate_from:X" field.
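
Since the number of optional fields varies, a parser has to locate the "-"
separator before reading fields (9) to (11).  The following is a minimal
sketch of that approach; mountinfo_summary is a hypothetical name and the
choice of printed fields is arbitrary.

```shell
# mountinfo_summary
# Read mountinfo lines on stdin and print "mount-point fstype source",
# skipping however many optional fields precede the "-" separator.
mountinfo_summary() {
    awk '{
        sep = 0
        # Fields 1-6 are fixed; optional fields start at 7 and end at "-".
        for (i = 7; i <= NF; i++) {
            if ($i == "-") { sep = i; break }
        }
        # Skip lines without a separator rather than guess at their layout.
        if (sep && sep + 2 <= NF) print $5, $(sep + 1), $(sep + 2)
    }'
}
```

It could be run as "mountinfo_summary < /proc/self/mountinfo"; for the example
line above it prints "/mnt2 ext3 /dev/root".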
1584
1585For more information on mount propagation see:
1586
1587  Documentation/filesystems/sharedsubtree.txt
1588
1589
15903.6	/proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm
1591--------------------------------------------------------
These files provide a method to access a task's comm value. They also allow
a task to set its own comm value or that of one of its thread siblings. The
comm value is limited in size compared to the cmdline value, so writing
anything longer than the kernel's TASK_COMM_LEN (currently 16 chars) will
result in a truncated comm value.
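
The truncation can be observed from a shell.  The sketch below sets the comm
value of the shell process itself and reads it back using only builtins, so
that /proc/self refers to the shell rather than to a helper program.
set_own_comm is a hypothetical name.  The sketch assumes that the 16 chars
include a terminating NUL, so at most 15 characters are expected back.

```shell
# set_own_comm NAME
# Write NAME to this shell's own comm file and print the value read back.
# Only builtins are used, so /proc/self is the shell process itself.
set_own_comm() {
    printf '%s' "$1" > /proc/self/comm || return 1
    read -r current < /proc/self/comm || return 1
    echo "$current"
}
```

Calling it with a long name such as "a_name_longer_than_sixteen_chars" should
print a shortened value.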
1597
1598
1599------------------------------------------------------------------------------
CHAPTER 4: CONFIGURING PROCFS
1601------------------------------------------------------------------------------
1602
16034.1	Mount options
1604---------------------
1605
1606The following mount options are supported:
1607
1608	hidepid=	Set /proc/<pid>/ access mode.
	gid=		Set the group authorized to learn process information.
1610
1611hidepid=0 means classic mode - everybody may access all /proc/<pid>/ directories
1612(default).
1613
hidepid=1 means users may not access any /proc/<pid>/ directories but their
own.  Sensitive files like cmdline, sched*, status are now protected against
other users.  This makes it impossible to learn whether any user runs a
specific program (given the program doesn't reveal itself by its behaviour).
As an additional bonus, as /proc/<pid>/cmdline is inaccessible to other users,
poorly written programs passing sensitive information via program arguments are
now protected against local eavesdroppers.
1621
hidepid=2 means hidepid=1 plus all /proc/<pid>/ will be fully invisible to other
users.  This does not hide the fact that a process with a specific pid value
exists (it can be learned by other means, e.g. by "kill -0 $PID"), but it hides
the process's uid and gid, which could otherwise be learned by stat()'ing
/proc/<pid>/.  It greatly complicates an intruder's task of gathering
information about running processes: whether some daemon runs with elevated
privileges, whether another user runs some sensitive program, whether other
users run any program at all, etc.
1630
gid= defines a group authorized to learn process information otherwise
prohibited by hidepid=.  If you use some daemon like identd which needs to learn
information about processes, just add identd to this group.
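
As a sketch of how these options might be combined, the helper below builds an
option string for remounting /proc.  hidepid_opts is a hypothetical name, and
the numeric group id used in the usage note is a placeholder, not a value
defined by this document.

```shell
# hidepid_opts MODE [GID]
# Build a mount option string for remounting /proc.
# MODE is 0, 1 or 2; GID is an optional numeric group id.
hidepid_opts() {
    case $1 in
        0|1|2) ;;
        *) echo "hidepid must be 0, 1 or 2" >&2; return 1 ;;
    esac
    opts="remount,hidepid=$1"
    if [ -n "${2:-}" ]; then
        opts="$opts,gid=$2"
    fi
    echo "$opts"
}
```

As root, the result could be passed to mount, for example
mount -o "$(hidepid_opts 2 1001)" /proc, where 1001 stands in for the gid of
the group that should keep access.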
1634