Lines Matching +full:use +full:- +full:minimum +full:- +full:ecc
33 -------------
44 * Memory – add error correction logic (ECC) to detect and correct errors;
47 Self-Monitoring, Analysis and Reporting Technology (SMART).
55 ---------------
57 Most mechanisms used on modern systems use technologies like Hamming
68 * **Correctable Error (CE)** - the error detection mechanism detected and
72 * **Uncorrected Error (UE)** - the number of errors exceeded the error
73 correction threshold, and the system was unable to auto-correct.
75 * **Fatal Error** - when a UE error happens on a critical component of the
79 * **Non-fatal Error** - when a UE error happens on an unused component,
87 The mechanism for handling non-fatal errors is usually complex and may
92 ------------------------------------
113 Locator: ChannelA-DIMM0
121 In the above example, a DDR4 SO-DIMM memory module is located at the
127 Unfortunately, not all systems use the same field to specify the memory
150 This kind of memory is called Error-correcting code memory (ECC memory).
153 labels on their system's board to use exactly the same BIOS, meaning that
156 ECC memory
157 ----------
159 As mentioned in the previous section, ECC memory has extra bits to be
172 ECC code used on write, producing a word with *data width* and a *syndrome*.
176 there was an error, and whether the ECC code was able to fix it.
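
The write/read flow described above (encode on write, recompute on read, compare to get a syndrome) can be illustrated with a toy Hamming(7,4) code. This is only a sketch under stated assumptions: real ECC memory uses wider codes than this, and the function names ``hamming74_encode`` and ``hamming74_decode`` are hypothetical, not part of any EDAC interface.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (bit positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome).

    A syndrome of 0 means no error was seen. A non-zero syndrome is the
    1-based position of a single flipped bit, which is corrected here.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 | (s2 << 1) | (s4 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


# Illustrative use: write a word, corrupt one bit, read it back.
stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # simulate a single-bit memory error
data, syndrome = hamming74_decode(stored)
```

A single flipped bit is located and repaired, which corresponds to the correctable case. This toy code cannot tell a double-bit error from a single-bit one; detecting that case needs an additional overall parity bit, which is not modelled here.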
187 mode called "Lock-Step", where it groups two memory modules together,
188 doing 128-bit reads/writes. That gives 16 bits for error correction, with
198 memory modules (or 4 memory modules, if the system is also on Lock-step
204 EDAC - Error Detection And Correction
210 was "out-of-tree" and maintained at http://bluesmoke.sourceforge.net.
218 -------
224 ------
240 -----------------------
245 This new device type allows for non-memory types of ECC hardware detectors
249 Some architectures have ECC detectors for L1, L2 and L3 caches,
257 ----------------
263 There are several add-in adapters that do **not** follow the PCI specification
280 ----------
293 -------
298 hardware-specific modules and have the dependencies load the necessary
310 ---------------
325 ----------------------------
328 are laid out in a Chip-Select Row (``csrowX``) and Channel table (``chX``).
331 .. [#f4] Nowadays, the term DIMM (Dual In-line Memory Module) is widely
333 packaging alternatives, like SO-DIMM, SIMM, etc. Throughout this document,
335 modules, even when they use a different kind of packaging.
343 for more than 2 channels, like Fully Buffered DIMMs (FB-DIMMs) memory
346 +------------+-----------------------+
348 +------------+-----------+-----------+
352 +------------+ | |
354 +------------+-----------+-----------+
356 +------------+ | |
358 +------------+-----------+-----------+
363 +---------+---------+
365 +---------+---------+
367 +---------+---------+
369 Labels for these slots are usually silk-screened on the motherboard.
391 |->mc0
392 |->mc1
393 |->mc2
401 |->csrow0
402 |->csrow2
403 |->csrow3
408 order to have dual-channel mode be operational. Since both csrow2 and
416 -------------------
423 Documentation/ABI/testing/sysfs-devices-edac
427 ----------------------------------
429 The recommended way to use the EDAC subsystem is to look at the information
485 - ``size`` - Total memory managed by this csrow attribute file
490 - ``dimm_ue_count`` - Uncorrectable Errors count attribute file
497 - ``dimm_ce_count`` - Correctable Errors count attribute file
503 monitored for non-zero values and report such information
506 - ``dimm_dev_type`` - Device type attribute file
512 - x1
513 - x2
514 - x4
515 - x8
517 - ``dimm_edac_mode`` - EDAC Mode of operation attribute file
522 - ``dimm_label`` - memory module label control file
536 - ``dimm_location`` - location of the memory module
543 - *csrow* and *channel* - used when the memory controller
544 doesn't identify a single DIMM - e. g. in ``rankX`` dir;
545 - *branch*, *channel*, *slot* - typically used on FB-DIMM memory
547 - *channel*, *slot* - used on Nehalem and newer Intel drivers.
549 - ``dimm_mem_type`` - Memory Type attribute file
555 - Registered-DDR
556 - Unbuffered-DDR
568 ----------------------
571 directories. As this API doesn't work properly for Rambus, FB-DIMMs and
579 - ``ue_count`` - Total Uncorrectable Errors count attribute file
587 - ``ce_count`` - Total Correctable Errors count attribute file
593 monitored for non-zero values and report such information
597 - ``size_mb`` - Total memory managed by this csrow attribute file
603 - ``mem_type`` - Memory Type attribute file
609 - Registered-DDR
610 - Unbuffered-DDR
613 - ``edac_mode`` - EDAC Mode of operation attribute file
619 - ``dev_type`` - Device type attribute file
625 - x1
626 - x2
627 - x4
628 - x8
631 - ``ch0_ce_count`` - Channel 0 CE Count attribute file
637 - ``ch0_ue_count`` - Channel 0 UE Count attribute file
643 - ``ch0_dimm_label`` - Channel 0 DIMM Label control file
659 - ``ch1_ce_count`` - Channel 1 CE Count attribute file
666 - ``ch1_ue_count`` - Channel 1 UE Count attribute file
673 - ``ch1_dimm_label`` - Channel 1 DIMM Label control file
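
As a rough illustration of how the per-csrow ``ce_count`` and ``ue_count`` attribute files listed above might be polled from user space, the sketch below walks ``mcX/csrowY`` directories and collects both counters. The helper name ``read_csrow_counters`` is hypothetical, the default root ``/sys/devices/system/edac/mc`` is an assumption about where the controllers are exposed, and the sketch only applies on systems whose driver still provides csrow directories.

```python
from pathlib import Path


def read_csrow_counters(mc_root="/sys/devices/system/edac/mc"):
    """Return {"mcX/csrowY": {"ce_count": int, "ue_count": int}}.

    mc_root is an assumed default; pass another directory to test the
    function against a fake tree.
    """
    counters = {}
    for csrow_dir in sorted(Path(mc_root).glob("mc*/csrow*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            attr = csrow_dir / name
            if attr.is_file():
                # Each attribute file holds a single decimal number.
                entry[name] = int(attr.read_text().strip())
        counters[f"{csrow_dir.parent.name}/{csrow_dir.name}"] = entry
    return counters
```

A monitoring script could call this periodically and flag any csrow whose counters are non-zero or have grown since the previous reading.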
689 --------------
700 +---------------------------------------+-------------+
704 +---------------------------------------+-------------+
706 +---------------------------------------+-------------+
708 +---------------------------------------+-------------+
710 +---------------------------------------+-------------+
713 +---------------------------------------+-------------+
715 +---------------------------------------+-------------+
717 +---------------------------------------+-------------+
719 +---------------------------------------+-------------+
721 +---------------------------------------+-------------+
722 | And then an optional, driver-specific | |
725 +---------------------------------------+-------------+
728 type, a notice of "no info" and then an optional, driver-specific error
733 ------------------------
743 -------------------
749 - ``check_pci_parity`` - Enable/Disable PCI Parity checking control file
764 - ``pci_parity_count`` - Parity Count
771 -----------------
773 - ``edac_mc_panic_on_ue`` - Panic on UE control file
777 occurs - it is indeterminate what was uncorrected and the operating
791 - ``edac_mc_log_ue`` - Log UE control file
807 - ``edac_mc_log_ce`` - Log CE control file
823 - ``edac_mc_poll_msec`` - Polling period control file
842 - ``panic_on_pci_parity`` - Panic on PCI PARITY Error
864 ----------------
878 /sys/devices/system/edac/test-instance
900 One out-of-tree driver uses controls here to allow
908 ---------
913 +----------------+
914 | test-instance0 |
915 +----------------+
927 ------
932 +-------------+
933 | test-block0 |
934 +-------------+
949 test-block-bits-0 for every POLL cycle this counter
951 test-block-bits-1 every 10 cycles, this counter is bumped once,
952 and test-block-bits-0 is set to 0
953 test-block-bits-2 every 100 cycles, this counter is bumped once,
954 and test-block-bits-1 is set to 0
955 test-block-bits-3 every 1000 cycles, this counter is bumped once,
956 and test-block-bits-2 is set to 0
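
The cascading behaviour of the four ``test-block-bits-N`` counters described above can be modelled as a decimal odometer. The following is a minimal simulation of that description, not the driver's code; the function name ``simulate_poll_cycles`` is hypothetical, and the rollover at 10 is inferred from the "every 10 / 100 / 1000 cycles" wording.

```python
def simulate_poll_cycles(cycles):
    """Model the test-block-bits-0..3 counters after `cycles` poll cycles."""
    bits = [0, 0, 0, 0]
    for _ in range(cycles):
        bits[0] += 1  # bits-0 is bumped on every poll cycle
        # When a counter reaches 10 it is reset to 0 and the next one is
        # bumped once. The last counter (bits-3) is left to grow here.
        for i in range(3):
            if bits[i] == 10:
                bits[i] = 0
                bits[i + 1] += 1
    return bits


# Illustrative use: counter state after 1234 poll cycles.
state = simulate_poll_cycles(1234)
```

Under this model, the list returned after N cycles holds the decimal digits of N, least significant first, for the first three counters.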
961 reset-counters writing ANY thing to this control will
966 Use of the ``test_device_edac`` driver should enable any others to create their own
974 --------------------------------------------------
987 The Xeon E7 processor families use a separate chip for the memory
999 The minimum known unit is the DIMM. There is no information about csrows.
1000 As the EDAC API maps the minimum unit to csrows, the driver sequentially
1030 - ``inject_addrmatch/*``:
1056 - ``inject_eccmask``:
1059 - ``inject_section``:
1060 specifies what ECC cache section will get the error::
1066 - ``inject_type``:
1069 bit 0 - repeat
1070 bit 1 - ecc
1071 bit 2 - parity
1073 - ``inject_enable``:
1097 …-a= 0 channel-b= 0 labels "-": NON_FATAL (addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome…
1146 ------------------------------------------
1149 (available from http://support.amd.com/en-us/search/tech-docs):
1172 Models 30h-3Fh Processors
1176 :Link: http://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf
1179 Models 60h-6Fh Processors
1183 :Link: http://support.amd.com/TechDocs/50742_15h_Models_60h-6Fh_BKDG.pdf
1186 Models 00h-0Fh Processors
1197 - 7 Dec 2005
1198 - 17 Jul 2007 Updated
1202 - 05 Aug 2009 Nehalem interface
1203 - 26 Oct 2016 Converted to ReST and cleanups at the Nehalem section
1207 - Doug Thompson, Dave Jiang, Dave Peterson et al,
1208 - Mauro Carvalho Chehab
1209 - Borislav Petkov
1210 - original author: Thayne Harbaugh