memory.rst - OpenGrok cross reference for /Documentation/admin-guide/cgroup-v1/memory.rst

18       we call it "memory cgroup". When you see git-log and source code, you'll
30    Memory-hungry applications can be isolated and limited to a smaller
42 Current Status: linux-2.6.34-mmotm(development version of 2010/April)
46  - accounting anonymous pages, file caches, swap caches usage and limiting them.
47  - pages are linked to per-memcg LRU exclusively, and there is no global LRU.
48  - optionally, memory+swap usage can be accounted and limited.
49  - hierarchical accounting
50  - soft limit
51  - moving (recharging) the account when a task is moved is selectable.
52  - usage threshold notifier
53  - memory pressure notifier
54  - oom-killer disable knob and oom-notifier
55  - Root cgroup has no limit controls.
138 -----------
143 specific data structure (mem_cgroup) associated with it.
146 ---------------
150 		+--------------------+
153 		+--------------------+
156            +---------------+  |        +---------------+
159            +---------------+  |        +---------------+
161                               + --------------+
163            +---------------+           +------+--------+
164            | page          +---------->  page_cgroup|
166            +---------------+           +---------------+
182 If everything goes well, a page meta-data-structure called page_cgroup is
184 (*) page_cgroup structure is allocated at boot/memory-hotplug time.
187 ------------------------
195 inserted into inode (radix-tree). While it's mapped into the page tables of
199 unaccounted when it's removed from radix-tree. Even if RSS pages are fully
202 A swapped-in page is not accounted until it's mapped.
204 Note: The kernel does swapin-readahead and reads multiple swaps at once.
205 This means swapped-in pages may contain pages for other tasks than a task
206 causing page fault. So, we avoid accounting at swap-in I/O.
210 Note: we account only pages on the LRU because our purpose is to control the
211 amount of used pages; pages not on the LRU tend to be out of the VM's control.
214 --------------------------
220 the cgroup that brought it in -- this will happen on memory pressure).
226 When you do swapoff and make swapped-out pages of shmem(tmpfs) to
231 --------------------------------------
233 Swap Extension allows you to record charge for swap. A swapped-in page is
238  - memory.memsw.usage_in_bytes.
239  - memory.memsw.limit_in_bytes.
252 The global LRU(kswapd) can swap out arbitrary pages. Swap-out means
260 When a cgroup hits memory.memsw.limit_in_bytes, it's useless to do swap-out
261 in this cgroup. In that case, the cgroup routine does not swap out, and file
267 -----------
277 pages that are selected for reclaiming come from the per-cgroup LRU
291 -----------
299      mm->page_table_lock
300          pgdat->lru_lock
305   per-zone-per-cgroup LRU (cgroup's private LRU) is guarded only by
306   pgdat->lru_lock; it has no lock of its own.
309 -----------------------------------------------
317 it can be disabled system-wide by passing cgroup.memory=nokmem to the kernel
332 -----------------------------------------------
356 ----------------------
369     deployments where the total amount of memory per-cgroup is overcommitted.
371     box can still run out of non-reclaimable memory.
391 ------------------
399 -------------------------------------------------------------------
403 	# mount -t tmpfs none /sys/fs/cgroup
405 	# mount -t cgroup none /sys/fs/cgroup/memory -o memory
422   We can write "-1" to reset ``*.limit_in_bytes`` (unlimited).
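As a minimal sketch of the reset described above, the helper below turns a Python value into the text written to a ``*.limit_in_bytes`` file, mapping ``None`` to "-1". The names ``limit_text`` and ``write_limit`` and the ``limit_file`` argument are hypothetical and exist only for illustration; they are not part of the kernel interface.

```python
from typing import Optional


def limit_text(limit_bytes: Optional[int]) -> str:
    """Return the text to write to a *.limit_in_bytes file.

    None stands for "unlimited", which is requested by writing "-1".
    """
    if limit_bytes is None:
        return "-1"
    if limit_bytes < 0:
        raise ValueError("limit must be non-negative, or None for unlimited")
    return str(limit_bytes)


def write_limit(limit_file: str, limit_bytes: Optional[int]) -> None:
    """Write the limit to limit_file, a caller-supplied (hypothetical) path."""
    with open(limit_file, "w") as f:
        f.write(limit_text(limit_bytes))
```

Keeping the formatting step separate from the file write lets the conversion be checked without a mounted cgroup hierarchy.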
440 availability of memory on the system. The user is required to re-read
462 Page-fault scalability is also important. When measuring a parallel
463 page fault test, a multi-process test may be better than a multi-thread
470 -------------------
485 ------------------
496 ---------------------
517 ---------------
527   charged file caches. Some out-of-use page caches may keep charged until
538 -------------
542 per-memory cgroup local status
564 inactive_file	# of bytes of file-backed memory on inactive LRU list.
565 active_file	# of bytes of file-backed memory on active LRU list.
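The counters above are reported as name/value pairs, one per line. The following is a small sketch, assuming that layout, of how the file-backed totals could be read; ``parse_memory_stat`` and ``file_backed_bytes`` are hypothetical helper names, and the sample values are made up for illustration.

```python
def parse_memory_stat(text: str) -> dict:
    """Parse "name value" lines into a dict of integer counters."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, value = parts
        stats[name] = int(value)
    return stats


def file_backed_bytes(stats: dict) -> int:
    """Sum file-backed memory on the inactive and active LRU lists."""
    return stats.get("inactive_file", 0) + stats.get("active_file", 0)


# Illustrative input only; real values come from the cgroup's stat file.
sample = "inactive_file 8192\nactive_file 4096\n"
total = file_backed_bytes(parse_memory_stat(sample))
```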
611 --------------
622 -----------
634 ------------------
644 -------------
646 This is similar to numa_maps but operates on a per-memcg basis.  This is
647 useful for providing visibility into the numa locality information within
653 per-node page counts including "hierarchical_<counter>" which sums up all
689 ------------------------------------------------
723 Please note that the soft limit is a best-effort feature; it comes with
730 -------------
757 -------------
770       Charges are moved only when you move mm->owner, in other words,
784 --------------------------------------
791 +---+--------------------------------------------------------------------------+
796 +---+--------------------------------------------------------------------------+
805 +---+--------------------------------------------------------------------------+
808 --------
810 - All moving-charge operations are done under cgroup_mutex. It's not good
822 - create an eventfd using eventfd(2);
823 - open memory.usage_in_bytes or memory.memsw.usage_in_bytes;
824 - write string like "<event_fd> <fd of memory.usage_in_bytes> <threshold>" to
830 It's applicable for root and non-root cgroup.
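The registration steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: ``threshold_event_line`` and ``register_threshold`` are hypothetical names, ``os.eventfd`` needs Python 3.10 or later on Linux, and the file that receives the registration string is passed in as ``control_path`` because the excerpt does not name it.

```python
import os
from typing import Tuple


def threshold_event_line(event_fd: int, usage_fd: int, threshold: int) -> str:
    """Format "<event_fd> <fd of memory.usage_in_bytes> <threshold>"."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    return f"{event_fd} {usage_fd} {threshold}"


def register_threshold(usage_path: str, control_path: str,
                       threshold: int) -> Tuple[int, int]:
    """Create an eventfd and register it for a usage threshold.

    usage_path:   memory.usage_in_bytes or memory.memsw.usage_in_bytes.
    control_path: file that accepts the registration string (an assumption
                  supplied by the caller).
    Returns (event_fd, usage_fd); the caller closes both when done.
    """
    event_fd = os.eventfd(0)  # Python 3.10+, Linux only
    usage_fd = os.open(usage_path, os.O_RDONLY)
    try:
        with open(control_path, "w") as control:
            control.write(threshold_event_line(event_fd, usage_fd, threshold))
    except BaseException:
        os.close(usage_fd)
        os.close(event_fd)
        raise
    return event_fd, usage_fd
```

A caller would then block on ``os.read(event_fd, 8)``, which returns the 8-byte eventfd counter once a notification has been delivered.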
843  - create an eventfd using eventfd(2)
844  - open memory.oom_control file
845  - write string like "<event_fd> <fd of memory.oom_control>" to
851 You can disable the OOM-killer by writing "1" to the memory.oom_control file, as:
855 If the OOM-killer is disabled, tasks under the cgroup will hang/sleep
856 in the memory cgroup's OOM-waitqueue when they request accountable memory.
872 	- oom_kill_disable 0 or 1
873 	  (if 1, oom-killer is disabled)
874 	- under_oom	   0 or 1
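To show how the two fields listed above might be consumed, here is a short sketch that converts the contents of memory.oom_control into booleans. It assumes each field appears as a "name value" line; ``parse_oom_control`` is a hypothetical helper name, and any other lines in the file are ignored.

```python
KNOWN_FLAGS = ("oom_kill_disable", "under_oom")


def parse_oom_control(text: str) -> dict:
    """Map the 0/1 fields of memory.oom_control to booleans."""
    flags = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only the two documented fields with a 0 or 1 value.
        if len(parts) == 2 and parts[0] in KNOWN_FLAGS and parts[1] in ("0", "1"):
            flags[parts[0]] = parts[1] == "1"
    return flags
```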
895 resources that can be easily reconstructed or re-read from a disk.
898 about to run out of memory (OOM) or even the in-kernel OOM killer is on its
904 events are not pass-through. For example, you have three cgroups: A->B->C. Now
914  - "default": this is the default behavior specified above. This mode is the
918  - "hierarchy": events always propagate up to the root, similar to the default
923  - "local": events are pass-through, i.e. they only receive notifications when
932 specified by a comma-delimited string, e.g. "low,hierarchy" specifies
933 hierarchical, pass-through, notification for all ancestor memcgs. Notification
934 that is the default, non pass-through behavior, does not specify a mode.
935 "medium,local" specifies pass-through notification for the medium level.
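The level and mode combinations described above can be built with a small helper such as the one sketched here. ``pressure_spec`` is a hypothetical name; the mode names come from the list above, while the set of level names (including "critical") is an assumption, since only "low" and "medium" appear in this excerpt.

```python
from typing import Optional

# Level names are an assumption; the excerpt shows only "low" and "medium".
LEVELS = ("low", "medium", "critical")
MODES = ("default", "hierarchy", "local")


def pressure_spec(level: str, mode: Optional[str] = None) -> str:
    """Build the "<level[,mode]>" string for a pressure notification.

    With mode=None the mode is omitted, which selects the default,
    non pass-through behavior.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    if mode is None:
        return level
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{level},{mode}"
```

For instance, ``pressure_spec("low", "hierarchy")`` yields the "low,hierarchy" string used in the text, and ``pressure_spec("medium", "local")`` yields "medium,local".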
940 - create an eventfd using eventfd(2);
941 - open memory.pressure_level;
942 - write string as "<event_fd> <fd of memory.pressure_level> <level[,mode]>"
946 the specific level (or higher). Read/write operations to
964    (Expect a bunch of notifications, and eventually, the oom-killer will
970 1. Make per-cgroup scanner reclaim not-shared pages first
971 2. Teach controller to account for shared-pages