Linux and the Device Tree
-------------------------
The Linux usage model for device tree data

Author: Grant Likely <grant.likely@secretlab.ca>

This article describes how Linux uses the device tree.  An overview of
the device tree data format can be found on the device tree usage page
at devicetree.org[1].

[1] http://devicetree.org/Device_Tree_Usage

The "Open Firmware Device Tree", or simply Device Tree (DT), is a data
structure and language for describing hardware.  More specifically, it
is a description of hardware that is readable by an operating system
so that the operating system doesn't need to hard code details of the
machine.

Structurally, the DT is a tree, or acyclic graph with named nodes, and
nodes may have an arbitrary number of named properties encapsulating
arbitrary data.  A mechanism also exists to create arbitrary
links from one node to another outside of the natural tree structure.

Conceptually, a common set of usage conventions, called 'bindings',
is defined for how data should appear in the tree to describe typical
hardware characteristics including data busses, interrupt lines, GPIO
connections, and peripheral devices.

As much as possible, hardware is described using existing bindings to
maximize use of existing support code, but since property and node
names are simply text strings, it is easy to extend existing bindings
or create new ones by defining new nodes and properties.  Be wary,
however, of creating a new binding without first doing some homework
about what already exists.  There are currently two different,
incompatible, bindings for i2c busses that came about because the new
binding was created without first investigating how i2c devices were
already being enumerated in existing systems.

1. History
----------
The DT was originally created by Open Firmware as part of the
communication method for passing data from Open Firmware to a client
program (such as an operating system).  An operating system used the
Device Tree to discover the topology of the hardware at runtime, and
thereby support a majority of available hardware without hard coded
information (assuming drivers were available for all devices).

Since Open Firmware is commonly used on PowerPC and SPARC platforms,
the Linux support for those architectures has for a long time used the
Device Tree.

In 2005, when PowerPC Linux began a major cleanup and the merge of
32-bit and 64-bit support, the decision was made to require DT support
on all powerpc platforms, regardless of whether or not they used Open
Firmware.  To do this, a DT representation called the Flattened Device
Tree (FDT) was created which could be passed to the kernel as a binary
blob without requiring a real Open Firmware implementation.  U-Boot,
kexec, and other bootloaders were modified to support both passing a
Device Tree Binary (dtb) and modifying a dtb at boot time.  DT was
also added to the PowerPC boot wrapper (arch/powerpc/boot/*) so that
a dtb could be wrapped up with the kernel image to support booting
existing non-DT aware firmware.

Some time later, FDT infrastructure was generalized to be usable by
all architectures.  At the time of this writing, 6 mainlined
architectures (arm, microblaze, mips, powerpc, sparc, and x86) and 1
out of mainline (nios) have some level of DT support.

2. Data Model
-------------
If you haven't already read the Device Tree Usage[1] page,
then go read it now.  It's okay, I'll wait....

2.1 High Level View
-------------------
The most important thing to understand is that the DT is simply a data
structure that describes the hardware.  There is nothing magical about
it, and it doesn't magically make all hardware configuration problems
go away.  What it does do is provide a language for decoupling the
hardware configuration from the board and device driver support in the
Linux kernel (or any other operating system for that matter).  Using
it allows board and device support to become data driven; to make
setup decisions based on data passed into the kernel instead of on
per-machine hard coded selections.

Ideally, data driven platform setup should result in less code
duplication and make it easier to support a wide range of hardware
with a single kernel image.

Linux uses DT data for three major purposes:
1) platform identification,
2) runtime configuration, and
3) device population.

2.2 Platform Identification
---------------------------
First and foremost, the kernel will use data in the DT to identify the
specific machine.  In a perfect world, the specific platform shouldn't
matter to the kernel because all platform details would be described
perfectly by the device tree in a consistent and reliable manner.
Hardware is not perfect though, and so the kernel must identify the
machine during early boot so that it has the opportunity to run
machine-specific fixups.

In the majority of cases, the machine identity is irrelevant, and the
kernel will instead select setup code based on the machine's core
CPU or SoC.  On ARM for example, setup_arch() in
arch/arm/kernel/setup.c will call setup_machine_fdt() in
arch/arm/kernel/devicetree.c which searches through the machine_desc
table and selects the machine_desc which best matches the device tree
data.  It determines the best match by looking at the 'compatible'
property in the root device tree node, and comparing it with the
dt_compat list in struct machine_desc.

The 'compatible' property contains a sorted list of strings starting
with the exact name of the machine, followed by an optional list of
boards it is compatible with sorted from most compatible to least.  For
example, the root compatible properties for the TI BeagleBoard and its
successor, the BeagleBoard xM board might look like:

	compatible = "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3";
	compatible = "ti,omap3-beagleboard-xm", "ti,omap3450", "ti,omap3";

Here "ti,omap3-beagleboard-xm" specifies the exact model, and the
remaining entries claim that it is compatible with the OMAP 3450 SoC
and the omap3 family of SoCs in general.  You'll notice that the list
is sorted from most specific (exact board) to least specific (SoC
family).

Astute readers might point out that the Beagle xM could also claim
compatibility with the original Beagle board.  However, one should be
cautioned about doing so at the board level since there is typically a
high level of change from one board to another, even within the same
product line, and it is hard to nail down exactly what is meant when one
board claims to be compatible with another.  For the top level, it is
better to err on the side of caution and not claim one board is
compatible with another.  The notable exception would be when one
board is a carrier for another, such as a CPU module attached to a
carrier board.

One more note on compatible values.  Any string used in a compatible
property must be documented as to what it indicates.  Add
documentation for compatible strings in Documentation/devicetree/bindings.

Again on ARM, for each machine_desc, the kernel looks to see if
any of the dt_compat list entries appear in the compatible property.
If one does, then that machine_desc is a candidate for driving the
machine.  After searching the entire table of machine_descs,
setup_machine_fdt() returns the 'most compatible' machine_desc based
on which entry in the compatible property each machine_desc matches
against.  If no matching machine_desc is found, then it returns NULL.
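
As a rough illustration of that selection rule, here is a minimal
user-space sketch in C.  It is not the kernel's code: struct
toy_machine_desc, toy_match_score() and toy_select_machine() are
hypothetical names invented for this example, and the real struct
machine_desc carries far more than a name and a dt_compat list.  The
score is assumed to be the index of the first root 'compatible' entry
that a descriptor matches, so a lower score means a more specific match.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct machine_desc. */
struct toy_machine_desc {
	const char *name;
	const char *const *dt_compat;	/* NULL-terminated list of strings */
};

/*
 * Return the index of the first entry in the root 'compatible' list
 * that also appears in dt_compat, or INT_MAX if none does.
 */
static int toy_match_score(const char *const *root_compat,
			   const char *const *dt_compat)
{
	int i;
	size_t j;

	for (i = 0; root_compat[i]; i++)
		for (j = 0; dt_compat[j]; j++)
			if (strcmp(root_compat[i], dt_compat[j]) == 0)
				return i;
	return INT_MAX;
}

/* Scan the whole table and keep the candidate with the lowest score. */
static const struct toy_machine_desc *
toy_select_machine(const struct toy_machine_desc *table, size_t count,
		   const char *const *root_compat)
{
	const struct toy_machine_desc *best = NULL;
	int best_score = INT_MAX;
	size_t i;

	for (i = 0; i < count; i++) {
		int score = toy_match_score(root_compat, table[i].dt_compat);

		if (score < best_score) {
			best_score = score;
			best = &table[i];
		}
	}
	return best;	/* NULL when no descriptor matched at all */
}
```

Given one descriptor that lists only "ti,omap3" and another that lists
only "ti,omap3-beagleboard", the BeagleBoard root property from the
example above selects the second descriptor (index 0 beats index 2),
while the BeagleBoard xM property falls back to the first.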

The reasoning behind this scheme is the observation that in the majority
of cases, a single machine_desc can support a large number of boards
if they all use the same SoC, or same family of SoCs.  However,
invariably there will be some exceptions where a specific board will
require special setup code that is not useful in the generic case.
Special cases could be handled by explicitly checking for the
troublesome board(s) in generic setup code, but doing so very quickly
becomes ugly and/or unmaintainable if it is more than just a couple of
cases.

Instead, the compatible list allows a generic machine_desc to provide
support for a wide common set of boards by specifying "less
compatible" values in the dt_compat list.  In the example above,
generic board support can claim compatibility with "ti,omap3" or
"ti,omap3450".  If a bug was discovered on the original beagleboard
that required special workaround code during early boot, then a new
machine_desc could be added which implements the workarounds and only
matches on "ti,omap3-beagleboard".

PowerPC uses a slightly different scheme where it calls the .probe()
hook from each machine_desc, and the first one returning TRUE is used.
However, this approach does not take into account the priority of the
compatible list, and probably should be avoided for new architecture
support.

2.3 Runtime configuration
-------------------------
In most cases, a DT will be the sole method of communicating data from
firmware to the kernel, so it also gets used to pass in runtime and
configuration data like the kernel parameters string and the location
of an initrd image.

Most of this data is contained in the /chosen node, and when booting
Linux it will look something like this:

	chosen {
		bootargs = "console=ttyS0,115200 loglevel=8";
		initrd-start = <0xc8000000>;
		initrd-end = <0xc8200000>;
	};

The bootargs property contains the kernel arguments, and the initrd-*
properties define the address and size of an initrd blob.  Note that
initrd-end is the first address after the initrd image, so this doesn't
match the usual semantic of struct resource.  The chosen node may also
optionally contain an arbitrary number of additional properties for
platform-specific configuration data.
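
To make the end-address convention concrete, here is a small sketch of
the arithmetic.  struct toy_range and both helpers are hypothetical
names used only for illustration.  The sketch assumes the convention
stated above (initrd-end is one past the last byte) and contrasts it
with a range whose end field is inclusive, which is how struct resource
stores its end.

```c
#include <stdint.h>

/* Hypothetical range with an inclusive end; not the kernel's struct resource. */
struct toy_range {
	uint64_t start;	/* first byte of the region */
	uint64_t end;	/* last byte of the region (inclusive) */
};

/* Size of the initrd when initrd_end is the first address after it. */
static uint64_t toy_initrd_size(uint64_t initrd_start, uint64_t initrd_end)
{
	return initrd_end - initrd_start;
}

/* Convert the exclusive DT end address into an inclusive-end range. */
static struct toy_range toy_initrd_range(uint64_t initrd_start,
					 uint64_t initrd_end)
{
	struct toy_range r = { initrd_start, initrd_end - 1 };

	return r;
}
```

For the values in the chosen node above, the size works out to
0x200000 bytes (2 MiB) and the inclusive end address is 0xc81fffff.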

During early boot, the architecture setup code calls of_scan_flat_dt()
several times with different helper callbacks to parse device tree
data before paging is set up.  The of_scan_flat_dt() code scans through
the device tree and uses the helpers to extract information required
during early boot.  Typically the early_init_dt_scan_chosen() helper
is used to parse the chosen node including kernel parameters,
early_init_dt_scan_root() to initialize the DT address space model,
and early_init_dt_scan_memory() to determine the size and
location of usable RAM.

On ARM, the function setup_machine_fdt() is responsible for early
scanning of the device tree after selecting the correct machine_desc
that supports the board.
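
The scan-with-callback pattern can be sketched in plain C as follows.
This is a toy model, not the kernel API: struct toy_flat_node,
toy_scan_flat() and toy_scan_chosen() are made-up names, the "flat
tree" is just an array of nodes, and the real helpers take different
arguments.  The sketch assumes that a non-zero return value from the
callback ends the scan.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened node: a name, a depth and one string property. */
struct toy_flat_node {
	const char *name;
	int depth;		/* 0 is the root node */
	const char *bootargs;	/* NULL when the property is absent */
};

typedef int (*toy_scan_cb)(const struct toy_flat_node *node, void *data);

/*
 * Visit every node in order and hand it to the callback.  A non-zero
 * return value from the callback stops the scan and is passed back.
 */
static int toy_scan_flat(const struct toy_flat_node *nodes, size_t count,
			 toy_scan_cb cb, void *data)
{
	size_t i;
	int rc = 0;

	for (i = 0; i < count && !rc; i++)
		rc = cb(&nodes[i], data);
	return rc;
}

/* Helper in the spirit of early_init_dt_scan_chosen(): fetch bootargs. */
static int toy_scan_chosen(const struct toy_flat_node *node, void *data)
{
	const char **cmdline = data;

	if (node->depth != 1 || strcmp(node->name, "chosen") != 0)
		return 0;	/* not /chosen, keep scanning */
	if (node->bootargs)
		*cmdline = node->bootargs;
	return 1;		/* found /chosen, stop the scan */
}
```

Calling toy_scan_flat() once per helper mirrors the "several times with
different helper callbacks" structure: each pass walks the same data but
extracts a different piece of early boot information.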

2.4 Device population
---------------------
After the board has been identified, and after the early configuration data
has been parsed, then kernel initialization can proceed in the normal
way.  At some point in this process, unflatten_device_tree() is called
to convert the data into a more efficient runtime representation.
This is also when machine-specific setup hooks will get called, like
the machine_desc .init_early(), .init_irq() and .init_machine() hooks
on ARM.  The remainder of this section uses examples from the ARM
implementation, but all architectures will do pretty much the same
thing when using a DT.

As can be guessed by the names, .init_early() is used for any
machine-specific setup that needs to be executed early in the boot
process, and .init_irq() is used to set up interrupt handling.  Using a
DT doesn't materially change the behaviour of either of these functions.
If a DT is provided, then both .init_early() and .init_irq() are able
to call any of the DT query functions (of_* in include/linux/of*.h) to
get additional data about the platform.

The most interesting hook in the DT context is .init_machine() which
is primarily responsible for populating the Linux device model with
data about the platform.  Historically this has been implemented on
embedded platforms by defining a set of static clock structures,
platform_devices, and other data in the board support .c file, and
registering it en-masse in .init_machine().  When DT is used, then
instead of hard coding static devices for each platform, the list of
devices can be obtained by parsing the DT, and allocating device
structures dynamically.

The simplest case is when .init_machine() is only responsible for
registering a block of platform_devices.  A platform_device is a concept
used by Linux for memory or I/O mapped devices which cannot be detected
by hardware, and for 'composite' or 'virtual' devices (more on those
later).  While there is no 'platform device' terminology for the DT,
platform devices roughly correspond to device nodes at the root of the
tree and children of simple memory mapped bus nodes.

About now is a good time to lay out an example.  Here is part of the
device tree for the NVIDIA Tegra board.

/{
	compatible = "nvidia,harmony", "nvidia,tegra20";
	#address-cells = <1>;
	#size-cells = <1>;
	interrupt-parent = <&intc>;

	chosen { };
	aliases { };

	memory {
		device_type = "memory";
		reg = <0x00000000 0x40000000>;
	};

	soc {
		compatible = "nvidia,tegra20-soc", "simple-bus";
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		intc: interrupt-controller@50041000 {
			compatible = "nvidia,tegra20-gic";
			interrupt-controller;
			#interrupt-cells = <1>;
			reg = <0x50041000 0x1000>, < 0x50040100 0x0100 >;
		};

		serial@70006300 {
			compatible = "nvidia,tegra20-uart";
			reg = <0x70006300 0x100>;
			interrupts = <122>;
		};

		i2s1: i2s@70002800 {
			compatible = "nvidia,tegra20-i2s";
			reg = <0x70002800 0x100>;
			interrupts = <77>;
			codec = <&wm8903>;
		};

		i2c@7000c000 {
			compatible = "nvidia,tegra20-i2c";
			#address-cells = <1>;
			#size-cells = <0>;
			reg = <0x7000c000 0x100>;
			interrupts = <70>;

			wm8903: codec@1a {
				compatible = "wlf,wm8903";
				reg = <0x1a>;
				interrupts = <347>;
			};
		};
	};

	sound {
		compatible = "nvidia,harmony-sound";
		i2s-controller = <&i2s1>;
		i2s-codec = <&wm8903>;
	};
};

At .init_machine() time, Tegra board support code will need to look at
this DT and decide which nodes to create platform_devices for.
However, looking at the tree, it is not immediately obvious what kind
of device each node represents, or even if a node represents a device
at all.  The /chosen, /aliases, and /memory nodes are informational
nodes that don't describe devices (although arguably memory could be
considered a device).  The children of the /soc node are memory mapped
devices, but the codec@1a is an i2c device, and the sound node
represents not a device, but rather how other devices are connected
together to create the audio subsystem.  I know what each device is
because I'm familiar with the board design, but how does the kernel
know what to do with each node?

The trick is that the kernel starts at the root of the tree and looks
for nodes that have a 'compatible' property.  First, it is generally
assumed that any node with a 'compatible' property represents a device
of some kind, and second, it can be assumed that any node at the root
of the tree is either directly attached to the processor bus, or is a
miscellaneous system device that cannot be described any other way.
For each of these nodes, Linux allocates and registers a
platform_device, which in turn may get bound to a platform_driver.

Why is using a platform_device for these nodes a safe assumption?
Well, for the way that Linux models devices, just about all bus_types
assume that their devices are children of a bus controller.  For
example, each i2c_client is a child of an i2c_master.  Each spi_device
is a child of an SPI bus.  Similarly for USB, PCI, MDIO, etc.  The
same hierarchy is also found in the DT, where I2C device nodes only
ever appear as children of an I2C bus node.  Ditto for SPI, MDIO, USB,
etc.  The only devices which do not require a specific type of parent
device are platform_devices (and amba_devices, but more on that
later), which will happily live at the base of the Linux /sys/devices
tree.  Therefore, if a DT node is at the root of the tree, then it
is probably best registered as a platform_device.

Linux board support code calls of_platform_populate(NULL, NULL, NULL, NULL)
to kick off discovery of devices at the root of the tree.  The
parameters are all NULL because when starting from the root of the
tree, there is no need to provide a starting node (the first NULL), a
parent struct device (the last NULL), and we're not using a match
table (yet).  For a board that only needs to register devices,
.init_machine() can be completely empty except for the
of_platform_populate() call.

In the Tegra example, this accounts for the /soc and /sound nodes, but
what about the children of the SoC node?  Shouldn't they be registered
as platform devices too?  For Linux DT support, the generic behaviour
is for child devices to be registered by the parent's device driver at
driver .probe() time.  So, an i2c bus device driver will register an
i2c_client for each child node, an SPI bus driver will register
its spi_device children, and similarly for other bus_types.
According to that model, a driver could be written that binds to the
SoC node and simply registers platform_devices for each of its
children.  The board support code would allocate and register an SoC
device, a (theoretical) SoC device driver could bind to the SoC device,
and register platform_devices for /soc/interrupt-controller, /soc/serial,
/soc/i2s, and /soc/i2c in its .probe() hook.  Easy, right?

Actually, it turns out that registering children of some
platform_devices as more platform_devices is a common pattern, and the
device tree support code reflects that and makes the above example
simpler.  The second argument to of_platform_populate() is an
of_device_id table, and any node that matches an entry in that table
will also get its child nodes registered.  In the tegra case, the code
can look something like this:

static void __init harmony_init_machine(void)
{
	/* ... */
	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
}

"simple-bus" is defined in the ePAPR 1.0 specification as a compatible
value meaning a simple memory mapped bus, so the of_platform_populate()
code could be written to just assume simple-bus compatible nodes will
always be traversed.  However, we pass the match table in as an
argument so that board support code can always override the default
behaviour.
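
The traversal rule described in this section can be modelled with a
short user-space sketch.  Everything in it is hypothetical: struct
toy_node stands in for the kernel's device node, struct toy_devlist
stands in for device registration, and toy_populate() only imitates the
behaviour described above (register each child that has a 'compatible'
property, and descend into a node only when it matches the bus table).
It is not the implementation of of_platform_populate().

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_COMPAT	4
#define TOY_MAX_DEVS	16

/* Hypothetical in-memory node used only by this sketch. */
struct toy_node {
	const char *name;
	const char *compatible[TOY_MAX_COMPAT];	/* unused slots are NULL */
	const struct toy_node *child;		/* first child, or NULL */
	const struct toy_node *sibling;		/* next sibling, or NULL */
};

/* Records which nodes would have been turned into platform devices. */
struct toy_devlist {
	const char *names[TOY_MAX_DEVS];
	size_t count;
};

static void toy_register(struct toy_devlist *list, const struct toy_node *np)
{
	if (list->count < TOY_MAX_DEVS)
		list->names[list->count++] = np->name;
}

/* Does any compatible string of np appear in the NULL-terminated table? */
static int toy_is_compatible(const struct toy_node *np,
			     const char *const *table)
{
	size_t i, j;

	for (i = 0; i < TOY_MAX_COMPAT && np->compatible[i]; i++)
		for (j = 0; table[j]; j++)
			if (strcmp(np->compatible[i], table[j]) == 0)
				return 1;
	return 0;
}

/*
 * Register every child of 'parent' that has a compatible property and
 * recurse into children that match bus_table (which may be NULL).
 */
static void toy_populate(const struct toy_node *parent,
			 const char *const *bus_table,
			 struct toy_devlist *list)
{
	const struct toy_node *np;

	for (np = parent->child; np; np = np->sibling) {
		if (!np->compatible[0])
			continue;	/* e.g. chosen, aliases, memory */
		toy_register(list, np);
		if (bus_table && toy_is_compatible(np, bus_table))
			toy_populate(np, bus_table, list);
	}
}
```

Run against a model of the Tegra tree with a bus table containing
"simple-bus", the sketch registers soc, its four children and sound,
but not codec@1a, since the i2c node is not a simple bus and its
children are left to the i2c bus driver.  With a NULL bus table only
soc and sound are registered.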

[Need to add discussion of adding i2c/spi/etc child devices]

Appendix A: AMBA devices
------------------------

ARM Primecells are a certain kind of device attached to the ARM AMBA
bus which include some support for hardware detection and power
management.  In Linux, struct amba_device and the amba_bus_type are
used to represent Primecell devices.  However, the fiddly bit is that
not all devices on an AMBA bus are Primecells, and for Linux it is
typical for both amba_device and platform_device instances to be
siblings of the same bus segment.

When using the DT, this creates problems for of_platform_populate()
because it must decide whether to register each node as either a
platform_device or an amba_device.  This unfortunately complicates the
device creation model a little bit, but the solution turns out not to
be too invasive.  If a node is compatible with "arm,primecell", then
of_platform_populate() will register it as an amba_device instead of a
platform_device.
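
The decision itself is small enough to sketch.  The enum and
toy_device_kind() below are hypothetical names for illustration only,
and the Primecell marker string is passed in as a parameter rather than
hard coded, so the sketch makes no claim about the exact string or
about how the kernel structures this check internally.

```c
#include <stddef.h>
#include <string.h>

enum toy_dev_kind { TOY_PLATFORM_DEVICE, TOY_AMBA_DEVICE };

/*
 * Pick the device type for a node from its NULL-terminated compatible
 * list: an AMBA device if the Primecell marker is present, otherwise a
 * platform device.
 */
static enum toy_dev_kind toy_device_kind(const char *const *compatible,
					 const char *primecell_marker)
{
	size_t i;

	for (i = 0; compatible[i]; i++)
		if (strcmp(compatible[i], primecell_marker) == 0)
			return TOY_AMBA_DEVICE;
	return TOY_PLATFORM_DEVICE;
}
```

Because the check looks at the whole compatible list, a node can name
its specific peripheral first and still be recognised as a Primecell
through a later, more generic entry.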