ARM Trusted Firmware Design
===========================


.. section-numbering::
    :suffix: .

.. contents::

The ARM Trusted Firmware implements a subset of the Trusted Board Boot
Requirements (TBBR) Platform Design Document (PDD) [1]_ for ARM reference
platforms. The TBB sequence starts when the platform is powered on and runs up
to the stage where it hands off control to firmware running in the normal
world in DRAM. This is the cold boot path.

The ARM Trusted Firmware also implements the Power State Coordination Interface
PDD [2]_ as a runtime service. PSCI is the interface from normal world software
to firmware implementing power management use-cases (for example, secondary CPU
boot, hotplug and idle). Normal world software can access ARM Trusted Firmware
runtime services via the ARM SMC (Secure Monitor Call) instruction. The SMC
instruction must be used as mandated by the SMC Calling Convention [3]_.

The ARM Trusted Firmware implements a framework for configuring and managing
interrupts generated in either security state. The details of the interrupt
management framework and its design can be found in ARM Trusted Firmware
Interrupt Management Design guide [4]_.

The ARM Trusted Firmware also implements a library for setting up and managing
the translation tables. The details of this library can be found in
`Xlat_tables design`_.

The ARM Trusted Firmware can be built to support either AArch64 or AArch32
execution state.

Cold boot
---------

The cold boot path starts when the platform is physically turned on. If
``COLD_BOOT_SINGLE_CPU=0``, one of the CPUs released from reset is chosen as the
primary CPU, and the remaining CPUs are considered secondary CPUs. The primary
CPU is chosen through platform-specific means. The cold boot path is mainly
executed by the primary CPU, other than essential CPU initialization executed by
all CPUs.
The secondary CPUs are kept in a safe platform-specific state until
the primary CPU has performed enough initialization to boot them.

Refer to the `Reset Design`_ for more information on the effect of the
``COLD_BOOT_SINGLE_CPU`` platform build option.

The cold boot path in this implementation of the ARM Trusted Firmware
depends on the execution state.
For AArch64, it is divided into five steps (in order of execution):

- Boot Loader stage 1 (BL1) *AP Trusted ROM*
- Boot Loader stage 2 (BL2) *Trusted Boot Firmware*
- Boot Loader stage 3-1 (BL31) *EL3 Runtime Software*
- Boot Loader stage 3-2 (BL32) *Secure-EL1 Payload* (optional)
- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware*

For AArch32, it is divided into four steps (in order of execution):

- Boot Loader stage 1 (BL1) *AP Trusted ROM*
- Boot Loader stage 2 (BL2) *Trusted Boot Firmware*
- Boot Loader stage 3-2 (BL32) *EL3 Runtime Software*
- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware*

ARM development platforms (Fixed Virtual Platforms (FVPs) and Juno) implement a
combination of the following types of memory regions. Each bootloader stage uses
one or more of these memory regions.

- Regions accessible from both non-secure and secure states. For example,
  non-trusted SRAM, ROM and DRAM.
- Regions accessible from only the secure state. For example, trusted SRAM and
  ROM. The FVPs also implement the trusted DRAM which is statically
  configured. Additionally, the Base FVPs and Juno development platform
  configure the TrustZone Controller (TZC) to create a region in the DRAM
  which is accessible only from the secure state.

The sections below provide the following details:

- initialization and execution of the first three stages during cold boot
- specification of the EL3 Runtime Software (BL31 for AArch64 and BL32 for
  AArch32) entrypoint requirements for use by alternative Trusted Boot
  Firmware in place of the provided BL1 and BL2

BL1
~~~

This stage begins execution from the platform's reset vector at EL3. The reset
address is platform dependent but it is usually located in a Trusted ROM area.
The BL1 data section is copied to trusted SRAM at runtime.

On the ARM development platforms, BL1 code starts execution from the reset
vector defined by the constant ``BL1_RO_BASE``. The BL1 data section is copied
to the top of trusted SRAM as defined by the constant ``BL1_RW_BASE``.

The functionality implemented by this stage is as follows.

Determination of boot path
^^^^^^^^^^^^^^^^^^^^^^^^^^

Whenever a CPU is released from reset, BL1 needs to distinguish between a warm
boot and a cold boot. This is done using platform-specific mechanisms (see the
``plat_get_my_entrypoint()`` function in the `Porting Guide`_). In the case of a
warm boot, a CPU is expected to continue execution from a separate
entrypoint. In the case of a cold boot, the secondary CPUs are placed in a safe
platform-specific state (see the ``plat_secondary_cold_boot_setup()`` function in
the `Porting Guide`_) while the primary CPU executes the remaining cold boot path
as described in the following sections.

This step only applies when ``PROGRAMMABLE_RESET_ADDRESS=0``. Refer to the
`Reset Design`_ for more information on the effect of the
``PROGRAMMABLE_RESET_ADDRESS`` platform build option.

Architectural initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL1 performs minimal architectural initialization as follows.

- Exception vectors

  BL1 sets up simple exception vectors for both synchronous and asynchronous
  exceptions. The default behavior upon receiving an exception is to populate
  a status code in the general purpose register ``X0/R0`` and call the
  ``plat_report_exception()`` function (see the `Porting Guide`_). The status
  code is one of:

  For AArch64:

  ::

      0x0 : Synchronous exception from Current EL with SP_EL0
      0x1 : IRQ exception from Current EL with SP_EL0
      0x2 : FIQ exception from Current EL with SP_EL0
      0x3 : System Error exception from Current EL with SP_EL0
      0x4 : Synchronous exception from Current EL with SP_ELx
      0x5 : IRQ exception from Current EL with SP_ELx
      0x6 : FIQ exception from Current EL with SP_ELx
      0x7 : System Error exception from Current EL with SP_ELx
      0x8 : Synchronous exception from Lower EL using aarch64
      0x9 : IRQ exception from Lower EL using aarch64
      0xa : FIQ exception from Lower EL using aarch64
      0xb : System Error exception from Lower EL using aarch64
      0xc : Synchronous exception from Lower EL using aarch32
      0xd : IRQ exception from Lower EL using aarch32
      0xe : FIQ exception from Lower EL using aarch32
      0xf : System Error exception from Lower EL using aarch32

  For AArch32:

  ::

      0x10 : User mode
      0x11 : FIQ mode
      0x12 : IRQ mode
      0x13 : SVC mode
      0x16 : Monitor mode
      0x17 : Abort mode
      0x1a : Hypervisor mode
      0x1b : Undefined mode
      0x1f : System mode

  The ``plat_report_exception()`` implementation on the ARM FVP port programs
  the Versatile Express System LED register in the following format to
  indicate the occurrence of an unexpected exception:

  ::

      SYS_LED[0]   - Security state (Secure=0/Non-Secure=1)
      SYS_LED[2:1] - Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0)
                     For AArch32 it is always 0x0
      SYS_LED[7:3] - Exception Class (Sync/Async & origin). This is the value
                     of the status code

  A write to the LED register is reflected in the System LEDs (S6LED0..7) in
  the CLCD window of the FVP.

  BL1 does not expect to receive any exceptions other than the SMC exception.
  For the latter, BL1 installs a simple stub. The stub expects to receive a
  limited set of SMC types (determined by their function IDs in the general
  purpose register ``X0/R0``):

  - ``BL1_SMC_RUN_IMAGE``: This SMC is raised by BL2 to make BL1 pass control
    to EL3 Runtime Software.
  - All SMCs listed in section "BL1 SMC Interface" in the `Firmware Update`_
    Design Guide are supported for AArch64 only. These SMCs are currently
    not supported when BL1 is built for AArch32.

  Any other SMC leads to an assertion failure.

- CPU initialization

  BL1 calls the ``reset_handler()`` function which in turn calls the CPU
  specific reset handler function (see the section: "CPU specific operations
  framework").

- Control register setup (for AArch64)

  - ``SCTLR_EL3``. Instruction cache is enabled by setting the ``SCTLR_EL3.I``
    bit. Alignment and stack alignment checking is enabled by setting the
    ``SCTLR_EL3.A`` and ``SCTLR_EL3.SA`` bits. Exception endianness is set to
    little-endian by clearing the ``SCTLR_EL3.EE`` bit.

  - ``SCR_EL3``. The register width of the next lower exception level is set
    to AArch64 by setting the ``SCR.RW`` bit. The ``SCR.EA`` bit is set to trap
    both External Aborts and SError Interrupts in EL3. The ``SCR.SIF`` bit is
    also set to disable instruction fetches from Non-secure memory when in
    secure state.

  - ``CPTR_EL3``. Accesses to the ``CPACR_EL1`` register from EL1 or EL2, or the
    ``CPTR_EL2`` register from EL2 are configured to not trap to EL3 by
    clearing the ``CPTR_EL3.TCPAC`` bit. Access to the trace functionality is
    configured not to trap to EL3 by clearing the ``CPTR_EL3.TTA`` bit.
    Instructions that access the registers associated with Floating Point
    and Advanced SIMD execution are configured to not trap to EL3 by
    clearing the ``CPTR_EL3.TFP`` bit.

  - ``DAIF``. The SError interrupt is enabled by clearing the SError interrupt
    mask bit.

  - ``MDCR_EL3``. The trap controls, ``MDCR_EL3.TDOSA``, ``MDCR_EL3.TDA`` and
    ``MDCR_EL3.TPM``, are set so that accesses to the registers they control
    do not trap to EL3. AArch64 Secure self-hosted debug is disabled by
    setting the ``MDCR_EL3.SDD`` bit. Also ``MDCR_EL3.SPD32`` is set to
    disable AArch32 Secure self-hosted privileged debug from S-EL1.

- Control register setup (for AArch32)

  - ``SCTLR``. Instruction cache is enabled by setting the ``SCTLR.I`` bit.
    Alignment checking is enabled by setting the ``SCTLR.A`` bit.
    Exception endianness is set to little-endian by clearing the
    ``SCTLR.EE`` bit.

  - ``SCR``. The ``SCR.SIF`` bit is set to disable instruction fetches from
    Non-secure memory when in secure state.

  - ``CPACR``. Allow execution of Advanced SIMD instructions at PL0 and PL1,
    by clearing the ``CPACR.ASEDIS`` bit. Access to the trace functionality
    is configured not to trap to undefined mode by clearing the
    ``CPACR.TRCDIS`` bit.

  - ``NSACR``. Enable non-secure access to Advanced SIMD functionality and
    system register access to implemented trace registers.

  - ``FPEXC``. Enable access to the Advanced SIMD and floating-point
    functionality from all Exception levels.

  - ``CPSR.A``. The Asynchronous data abort interrupt is enabled by clearing
    the Asynchronous data abort interrupt mask bit.

  - ``SDCR``. The ``SDCR.SPD`` field is set to disable AArch32 Secure
    self-hosted privileged debug.

Platform initialization
^^^^^^^^^^^^^^^^^^^^^^^

On ARM platforms, BL1 performs the following platform initializations:

- Enable the Trusted Watchdog.
- Initialize the console.
- Configure the Interconnect to enable hardware coherency.
- Enable the MMU and map the memory it needs to access.
- Configure any required platform storage to load the next bootloader image
  (BL2).

Firmware Update detection and execution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After performing platform setup, BL1 common code calls
``bl1_plat_get_next_image_id()`` to determine if `Firmware Update`_ is required or
to proceed with the normal boot process. If the platform code returns
``BL2_IMAGE_ID`` then the normal boot sequence is executed as described in the
next section, else BL1 assumes that `Firmware Update`_ is required and execution
passes to the first image in the `Firmware Update`_ process. In either case, BL1
retrieves a descriptor of the next image by calling ``bl1_plat_get_image_desc()``.
The image descriptor contains an ``entry_point_info_t`` structure, which BL1
uses to initialize the execution state of the next image.

BL2 image load and execution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the normal boot flow, BL1 execution continues as follows:

#. BL1 prints the following string from the primary CPU to indicate successful
   execution of the BL1 stage:

   ::

       "Booting Trusted Firmware"

#. BL1 determines the amount of free trusted SRAM memory available by
   calculating the extent of its own data section, which also resides in
   trusted SRAM. BL1 loads a BL2 raw binary image from platform storage, at a
   platform-specific base address. If the BL2 image file is not present or if
   there is not enough free trusted SRAM the following error message is
   printed:

   ::

       "Failed to load BL2 firmware."

   BL1 calculates the amount of Trusted SRAM that can be used by the BL2
   image. The exact load location of the image is provided as a base address
   in the platform header.
   Further description of the memory layout can be
   found later in this document.

#. BL1 passes control to the BL2 image at Secure EL1 (for AArch64) or at
   Secure SVC mode (for AArch32), starting from its load address.

#. BL1 also passes information about the amount of trusted SRAM used and
   available for use. This information is populated at a platform-specific
   memory address.

BL2
~~~

BL1 loads and passes control to BL2 at Secure-EL1 (for AArch64) or at Secure
SVC mode (for AArch32). BL2 is linked against and loaded at a platform-specific
base address (more information can be found later in this document).
The functionality implemented by BL2 is as follows.

Architectural initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For AArch64, BL2 performs the minimal architectural initialization required
for subsequent stages of the ARM Trusted Firmware and normal world software.
EL1 and EL0 are given access to Floating Point and Advanced SIMD registers
by setting the ``CPACR.FPEN`` bits.

For AArch32, the minimal architectural initialization required for subsequent
stages of the ARM Trusted Firmware and normal world software is taken care of
in BL1 as both BL1 and BL2 execute at PL1.

Platform initialization
^^^^^^^^^^^^^^^^^^^^^^^

On ARM platforms, BL2 performs the following platform initializations:

- Initialize the console.
- Configure any required platform storage to allow loading further bootloader
  images.
- Enable the MMU and map the memory it needs to access.
- Perform platform security setup to allow access to controlled components.
- Reserve some memory for passing information to the next bootloader image
  EL3 Runtime Software and populate it.
- Define the extents of memory available for loading each subsequent
  bootloader image.

Image loading in BL2
^^^^^^^^^^^^^^^^^^^^

The image loading scheme in BL2 depends on the ``LOAD_IMAGE_V2`` build option.
If the flag is disabled, the BLxx images are loaded by calling the respective
load\_blxx() function from BL2 generic code. If the flag is enabled, the BL2
generic code loads the images based on the list of loadable images provided
by the platform. BL2 passes the list of executable images provided by the
platform to the next handover BL image. By default, this flag is disabled for
AArch64 and the AArch32 build is supported only if this flag is enabled.

SCP\_BL2 (System Control Processor Firmware) image load
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some systems have a separate System Control Processor (SCP) for power, clock,
reset and system control. BL2 loads the optional SCP\_BL2 image from platform
storage into a platform-specific region of secure memory. The subsequent
handling of SCP\_BL2 is platform specific. For example, on the Juno ARM
development platform port the image is transferred into SCP's internal memory
using the Boot Over MHU (BOM) protocol after being loaded in the trusted SRAM
memory. The SCP executes SCP\_BL2 and signals to the Application Processor (AP)
for BL2 execution to continue.

EL3 Runtime Software image load
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL2 loads the EL3 Runtime Software image from platform storage into a
platform-specific address in trusted SRAM. If there is not enough memory to
load the image or the image is missing, it leads to an assertion failure. If
``LOAD_IMAGE_V2`` is disabled and the image loads successfully, BL2 updates the
amount of trusted SRAM used and available for use by EL3 Runtime Software. This
information is populated at a platform-specific memory address.

AArch64 BL32 (Secure-EL1 Payload) image load
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL2 loads the optional BL32 image from platform storage into a
platform-specific region of secure memory. The image executes in the secure
world. BL2 relies on BL31 to pass control to the BL32 image, if present. Hence,
BL2 populates a platform-specific area of memory with the entrypoint/load-address
of the BL32 image. The value of the Saved Processor Status Register (``SPSR``)
for entry into BL32 is not determined by BL2; it is initialized by the
Secure-EL1 Payload Dispatcher (see later) within BL31, which is responsible for
managing interaction with BL32. This information is passed to BL31.

BL33 (Non-trusted Firmware) image load
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL2 loads the BL33 image (e.g. UEFI or other test or boot software) from
platform storage into non-secure memory as defined by the platform.

BL2 relies on EL3 Runtime Software to pass control to BL33 once secure state
initialization is complete. Hence, BL2 populates a platform-specific area of
memory with the entrypoint and Saved Program Status Register (``SPSR``) of the
normal world software image. The entrypoint is the load address of the BL33
image. The ``SPSR`` is determined as specified in Section 5.13 of the
`PSCI PDD`_. This information is passed to the EL3 Runtime Software.

AArch64 BL31 (EL3 Runtime Software) execution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL2 execution continues as follows:

#. BL2 passes control back to BL1 by raising an SMC, providing BL1 with the
   BL31 entrypoint. The exception is handled by the SMC exception handler
   installed by BL1.

#. BL1 turns off the MMU and flushes the caches. It clears the
   ``SCTLR_EL3.M/I/C`` bits, flushes the data cache to the point of coherency
   and invalidates the TLBs.

#. BL1 passes control to BL31 at the specified entrypoint at EL3.

AArch64 BL31
~~~~~~~~~~~~

The image for this stage is loaded by BL2 and BL1 passes control to BL31 at
EL3. BL31 executes solely in trusted SRAM. BL31 is linked against and
loaded at a platform-specific base address (more information can be found later
in this document). The functionality implemented by BL31 is as follows.

Architectural initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Currently, BL31 performs a similar architectural initialization to BL1 as
far as system register settings are concerned. Since BL1 code resides in ROM,
architectural initialization in BL31 allows override of any previous
initialization done by BL1.

BL31 initializes the per-CPU data framework, which provides a cache of
frequently accessed per-CPU data optimised for fast, concurrent manipulation
on different CPUs. This buffer includes pointers to per-CPU contexts, crash
buffer, CPU reset and power down operations, PSCI data, platform data and so on.

It then replaces the exception vectors populated by BL1 with its own. BL31
exception vectors implement more elaborate support for handling SMCs since this
is the only mechanism to access the runtime services implemented by BL31 (PSCI
for example). BL31 checks each SMC for validity as specified by the
`SMC calling convention PDD`_ before passing control to the required SMC
handler routine.

BL31 programs the ``CNTFRQ_EL0`` register with the clock frequency of the system
counter, which is provided by the platform.

Platform initialization
^^^^^^^^^^^^^^^^^^^^^^^

BL31 performs detailed platform initialization, which enables normal world
software to function correctly.

On ARM platforms, this consists of the following:

- Initialize the console.
- Configure the Interconnect to enable hardware coherency.
- Enable the MMU and map the memory it needs to access.
- Initialize the generic interrupt controller.
- Initialize the power controller device.
- Detect the system topology.

Runtime services initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL31 is responsible for initializing the runtime services. One of them is PSCI.

As part of the PSCI initializations, BL31 detects the system topology. It also
initializes the data structures that implement the state machine used to track
the state of power domain nodes. The state can be one of ``OFF``, ``RUN`` or
``RETENTION``. All secondary CPUs are initially in the ``OFF`` state. The cluster
that the primary CPU belongs to is ``ON``; any other cluster is ``OFF``. It also
initializes the locks that protect them. BL31 accesses the state of a CPU or
cluster immediately after reset and before the data cache is enabled in the
warm boot path. It is not currently possible to use 'exclusive' based spinlocks,
therefore BL31 uses locks based on Lamport's Bakery algorithm instead.

The runtime service framework and its initialization are described in more
detail in the "EL3 runtime services framework" section below.

Details about the status of the PSCI implementation are provided in the
"Power State Coordination Interface" section below.

AArch64 BL32 (Secure-EL1 Payload) image initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If a BL32 image is present then there must be a matching Secure-EL1 Payload
Dispatcher (SPD) service (see later for details). During initialization
that service must register a function to carry out initialization of BL32
once the runtime services are fully initialized. BL31 invokes such a
registered function to initialize BL32 before running BL33. This initialization
is not necessary for AArch32 SPs.
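
The registration step described above can be pictured with a minimal sketch.
It is not the Trusted Firmware implementation: the names ``bl32_init_fn_t``,
``register_bl32_init()``, ``run_bl32_init()`` and ``example_spd_bl32_init()``
are hypothetical and chosen only for illustration. The sketch assumes that a
single dispatcher registers one function, and that the common code calls it
once after the runtime services have been initialized.

.. code:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical hook type: returns non-zero when BL32 was initialized. */
    typedef int32_t (*bl32_init_fn_t)(void);

    /* NULL while no dispatcher has registered an initialization function. */
    static bl32_init_fn_t bl32_init_hook;

    /* Flag used only by the example hook below. */
    static int bl32_initialized;

    /* Called by a dispatcher while the runtime services are being set up. */
    static void register_bl32_init(bl32_init_fn_t fn)
    {
        bl32_init_hook = fn;
    }

    /*
     * Called once all runtime services are initialized and before BL33 is
     * entered. Returns 0 when nothing was registered, otherwise the value
     * returned by the registered function.
     */
    static int32_t run_bl32_init(void)
    {
        if (bl32_init_hook == NULL)
            return 0;
        return bl32_init_hook();
    }

    /* Example hook standing in for a dispatcher's BL32 initialization. */
    static int32_t example_spd_bl32_init(void)
    {
        bl32_initialized = 1;
        return 1;
    }

Keeping the hook optional mirrors the text: a system without a BL32 image
registers nothing, and the boot flow proceeds directly to BL33.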

Details on BL32 initialization and the SPD's role are described in the
"Secure-EL1 Payloads and Dispatchers" section below.

BL33 (Non-trusted Firmware) execution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

EL3 Runtime Software initializes the EL2 or EL1 processor context for
normal-world cold boot, ensuring that no secure state information finds its way
into the non-secure execution state. EL3 Runtime Software uses the entrypoint
information provided by BL2 to jump to the Non-trusted firmware image (BL33)
at the highest available Exception Level (EL2 if available, otherwise EL1).

Using alternative Trusted Boot Firmware in place of BL1 & BL2 (AArch64 only)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some platforms have existing implementations of Trusted Boot Firmware that
would like to use ARM Trusted Firmware BL31 for the EL3 Runtime Software. To
enable this firmware architecture it is important to provide a fully documented
and stable interface between the Trusted Boot Firmware and BL31.

Future changes to the BL31 interface will be done in a backwards compatible
way, and this enables these firmware components to be independently
enhanced/updated to develop and exploit new functionality.

Required CPU state when calling ``bl31_entrypoint()`` during cold boot
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This function must only be called by the primary CPU.

On entry to this function the calling primary CPU must be executing in AArch64
EL3, little-endian data access, and all interrupt sources masked:

::

    PSTATE.EL = 3
    PSTATE.RW = 1
    PSTATE.DAIF = 0xf
    SCTLR_EL3.EE = 0

X0 and X1 can be used to pass information from the Trusted Boot Firmware to the
platform code in BL31:

::

    X0 : Reserved for common Trusted Firmware information
    X1 : Platform specific information

BL31 zero-init sections (e.g. ``.bss``) should not contain valid data on entry,
as these will be zero filled prior to invoking platform setup code.

Use of the X0 and X1 parameters
'''''''''''''''''''''''''''''''

The parameters are platform specific and passed from ``bl31_entrypoint()`` to
``bl31_early_platform_setup()``. The value of these parameters is never directly
used by the common BL31 code.

The convention is that ``X0`` conveys information regarding the BL31, BL32 and
BL33 images from the Trusted Boot firmware and ``X1`` can be used for other
platform specific purpose. This convention allows platforms which use ARM
Trusted Firmware's BL1 and BL2 images to transfer additional platform specific
information from Secure Boot without conflicting with future evolution of the
Trusted Firmware using ``X0`` to pass a ``bl31_params`` structure.

BL31 common and SPD initialization code depends on image and entrypoint
information about BL33 and BL32, which is provided via BL31 platform APIs.
This information is required until the start of execution of BL33. This
information can be provided in a platform defined manner, e.g. compiled into
the platform code in BL31, or provided in a platform defined memory location
by the Trusted Boot firmware, or passed from the Trusted Boot Firmware via the
Cold boot Initialization parameters.
This data may need to be cleaned out of
the CPU caches if it is provided by an earlier boot stage and then accessed by
BL31 platform code before the caches are enabled.

ARM Trusted Firmware's BL2 implementation passes a ``bl31_params`` structure in
``X0`` and the ARM development platforms interpret this in the BL31 platform
code.

MMU, Data caches & Coherency
''''''''''''''''''''''''''''

BL31 does not depend on the enabled state of the MMU, data caches or
interconnect coherency on entry to ``bl31_entrypoint()``. If these are disabled
on entry, these should be enabled during ``bl31_plat_arch_setup()``.

Data structures used in the BL31 cold boot interface
''''''''''''''''''''''''''''''''''''''''''''''''''''

These structures are designed to support compatibility and independent
evolution of the structures and the firmware images. For example, a version of
BL31 that can interpret the BL3x image information from different versions of
BL2, a platform that uses an extended entry\_point\_info structure to convey
additional register information to BL31, or an ELF image loader that can convey
more details about the firmware images.

To support these scenarios the structures are versioned and sized, which enables
BL31 to detect which information is present and respond appropriately. The
``param_header`` is defined to capture this information:

.. code:: c

    typedef struct param_header {
        uint8_t type;       /* type of the structure */
        uint8_t version;    /* version of this structure */
        uint16_t size;      /* size of this structure in bytes */
        uint32_t attr;      /* attributes: unused bits SBZ */
    } param_header_t;

The structures using this format are ``entry_point_info``, ``image_info`` and
``bl31_params``.
The code that allocates and populates these structures must set
the header fields appropriately, and the ``SET_PARAM_HEAD()`` macro is defined
to simplify this action.

Required CPU state for BL31 Warm boot initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When requesting a CPU power-on, or suspending a running CPU, ARM Trusted
Firmware provides the platform power management code with a Warm boot
initialization entry-point, to be invoked by the CPU immediately after the
reset handler. On entry to the Warm boot initialization function the calling
CPU must be in AArch64 EL3, little-endian data access and all interrupt sources
masked:

::

    PSTATE.EL = 3
    PSTATE.RW = 1
    PSTATE.DAIF = 0xf
    SCTLR_EL3.EE = 0

The PSCI implementation will initialize the processor state and ensure that the
platform power management code is then invoked as required to initialize all
necessary system, cluster and CPU resources.

AArch32 EL3 Runtime Software entrypoint interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To enable this firmware architecture it is important to provide a fully
documented and stable interface between the Trusted Boot Firmware and the
AArch32 EL3 Runtime Software.

Future changes to the entrypoint interface will be done in a backwards
compatible way, and this enables these firmware components to be independently
enhanced/updated to develop and exploit new functionality.

Required CPU state when entering during cold boot
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This function must only be called by the primary CPU.

On entry to this function the calling primary CPU must be executing in AArch32
EL3, little-endian data access, and all interrupt sources masked:

::

    PSTATE.AIF = 0x7
    SCTLR.EE = 0

R0 and R1 are used to pass information from the Trusted Boot Firmware to the
platform code in AArch32 EL3 Runtime Software:

::

    R0 : Reserved for common Trusted Firmware information
    R1 : Platform specific information

Use of the R0 and R1 parameters
'''''''''''''''''''''''''''''''

The parameters are platform specific and the convention is that ``R0`` conveys
information regarding the BL3x images from the Trusted Boot firmware and ``R1``
can be used for other platform specific purpose. This convention allows
platforms which use ARM Trusted Firmware's BL1 and BL2 images to transfer
additional platform specific information from Secure Boot without conflicting
with future evolution of the Trusted Firmware using ``R0`` to pass a ``bl_params``
structure.

The AArch32 EL3 Runtime Software is responsible for entry into BL33. This
information can be obtained in a platform defined manner, e.g. compiled into
the AArch32 EL3 Runtime Software, or provided in a platform defined memory
location by the Trusted Boot firmware, or passed from the Trusted Boot Firmware
via the Cold boot Initialization parameters. This data may need to be cleaned
out of the CPU caches if it is provided by an earlier boot stage and then
accessed by AArch32 EL3 Runtime Software before the caches are enabled.

When using AArch32 EL3 Runtime Software, the ARM development platforms pass a
``bl_params`` structure in ``R0`` from BL2 to be interpreted by AArch32 EL3 Runtime
Software platform code.

MMU, Data caches & Coherency
''''''''''''''''''''''''''''

AArch32 EL3 Runtime Software must not depend on the enabled state of the MMU,
data caches or interconnect coherency in its entrypoint.
They must be explicitly
enabled if required.

Data structures used in cold boot interface
'''''''''''''''''''''''''''''''''''''''''''

The AArch32 EL3 Runtime Software cold boot interface uses ``bl_params`` instead
of ``bl31_params``. The ``bl_params`` structure is based on the convention
described in AArch64 BL31 cold boot interface section.

Required CPU state for warm boot initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When requesting a CPU power-on, or suspending a running CPU, AArch32 EL3
Runtime Software must ensure execution of a warm boot initialization entrypoint.
If ARM Trusted Firmware BL1 is used and the PROGRAMMABLE\_RESET\_ADDRESS build
flag is false, then AArch32 EL3 Runtime Software must ensure that BL1 branches
to the warm boot entrypoint by arranging for the BL1 platform function,
plat\_get\_my\_entrypoint(), to return a non-zero value.

In this case, the warm boot entrypoint must be in AArch32 EL3, little-endian
data access and all interrupt sources masked:

::

    PSTATE.AIF = 0x7
    SCTLR.EE = 0

The warm boot entrypoint may be implemented by using the ARM Trusted Firmware
``psci_warmboot_entrypoint()`` function. In that case, the platform must fulfil
the pre-requisites mentioned in the `PSCI Library integration guide`_.

EL3 runtime services framework
------------------------------

Software executing in the non-secure state and in the secure state at exception
levels lower than EL3 will request runtime services using the Secure Monitor
Call (SMC) instruction. These requests will follow the convention described in
the SMC Calling Convention PDD (`SMCCC`_). The `SMCCC`_ assigns function
identifiers to each SMC request and describes how arguments are passed and
returned.
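
As an illustration of how a function identifier is structured, the sketch
below splits a 32-bit SMC function ID into the fields defined by the SMC
Calling Convention: call type in bit 31, calling convention in bit 30, Owning
Entity Number in bits [29:24] and function number in bits [15:0]. The macro
names, the ``smc_fid_fields_t`` type and ``decode_smc_fid()`` are assumptions
made for this example and are not taken from the firmware sources.

.. code:: c

    #include <stdint.h>

    /* Illustrative field positions, following the SMCCC function ID layout. */
    #define FID_TYPE_SHIFT  31u     /* 1 = Fast call, 0 = Yielding call */
    #define FID_CC_SHIFT    30u     /* 1 = SMC64, 0 = SMC32 */
    #define FID_OEN_SHIFT   24u     /* Owning Entity Number, bits [29:24] */
    #define FID_OEN_MASK    0x3fu
    #define FID_NUM_MASK    0xffffu /* Function number, bits [15:0] */

    typedef struct smc_fid_fields {
        uint32_t fast;      /* non-zero for a Fast call */
        uint32_t smc64;     /* non-zero for the SMC64 calling convention */
        uint32_t oen;       /* Owning Entity Number */
        uint32_t func_num;  /* function number within the owning entity */
    } smc_fid_fields_t;

    /* Split a 32-bit SMC function identifier into its fields. */
    static smc_fid_fields_t decode_smc_fid(uint32_t fid)
    {
        smc_fid_fields_t f;

        f.fast = (fid >> FID_TYPE_SHIFT) & 0x1u;
        f.smc64 = (fid >> FID_CC_SHIFT) & 0x1u;
        f.oen = (fid >> FID_OEN_SHIFT) & FID_OEN_MASK;
        f.func_num = fid & FID_NUM_MASK;
        return f;
    }

For example, the PSCI ``CPU_ON`` SMC64 function ID ``0xC4000003`` decodes to a
Fast call using SMC64 with OEN 4 (Standard Service Calls) and function
number 3.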

The EL3 runtime services framework enables the development of services by
different providers that can be easily integrated into final product firmware.
The following sections describe the framework which facilitates the
registration, initialization and use of runtime services in EL3 Runtime
Software (BL31).

The design of the runtime services depends heavily on the concepts and
definitions described in the `SMCCC`_, in particular SMC Function IDs, Owning
Entity Numbers (OEN), Fast and Yielding calls, and the SMC32 and SMC64 calling
conventions. Please refer to that document for more detailed explanation of
these terms.

The following runtime services are expected to be implemented first. They have
not all been instantiated in the current implementation.

#. Standard service calls

   This service is for management of the entire system. The Power State
   Coordination Interface (`PSCI`_) is the first set of standard service calls
   defined by ARM (see PSCI section later).

#. Secure-EL1 Payload Dispatcher service

   If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then
   it also requires a *Secure Monitor* at EL3 to switch the EL1 processor
   context between the normal world (EL1/EL2) and trusted world (Secure-EL1).
   The Secure Monitor will make these world switches in response to SMCs. The
   `SMCCC`_ provides for such SMCs with the Trusted OS Call and Trusted
   Application Call OEN ranges.

   The interface between the EL3 Runtime Software and the Secure-EL1 Payload is
   not defined by the `SMCCC`_ or any other standard. As a result, each
   Secure-EL1 Payload requires a specific Secure Monitor that runs as a runtime
   service - within ARM Trusted Firmware this service is referred to as the
   Secure-EL1 Payload Dispatcher (SPD).

   ARM Trusted Firmware provides a Test Secure-EL1 Payload (TSP) and its
   associated Dispatcher (TSPD).
   Details of SPD design and TSP/TSPD operation
   are described in the "Secure-EL1 Payloads and Dispatchers" section below.

#. CPU implementation service

   This service will provide an interface to CPU implementation specific
   services for a given platform e.g. access to processor errata workarounds.
   This service is currently unimplemented.

Additional services for ARM Architecture, SiP and OEM calls can be implemented.
Each implemented service handles a range of SMC function identifiers as
described in the `SMCCC`_.

Registration
~~~~~~~~~~~~

A runtime service is registered using the ``DECLARE_RT_SVC()`` macro, specifying
the name of the service, the range of OENs covered, the type of service and
initialization and call handler functions. This macro instantiates a
``const struct rt_svc_desc`` for the service with these details (see
``runtime_svc.h``). This structure is allocated in a special ELF section
``rt_svc_descs``, enabling the framework to find all service descriptors
included into BL31.

The specific service for a SMC Function is selected based on the OEN and call
type of the Function ID, and the framework uses that information in the service
descriptor to identify the handler for the SMC Call.

The service descriptors do not include information to identify the precise set
of SMC function identifiers supported by this service implementation, the
security state from which such calls are valid nor the capability to support
64-bit and/or 32-bit callers (using SMC32 or SMC64). Responding appropriately
to these aspects of a SMC call is the responsibility of the service
implementation. The framework is focused on integration of services from
different providers and minimizing the time taken by the framework before the
service handler is invoked.
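
The registration pattern described above can be sketched as follows. This is a
simplified, self-contained stand-in and not the actual ``DECLARE_RT_SVC()``
macro: the ``demo_`` types, the reduced handler signature, the call type
values and the section name are assumptions made for illustration, and the
section attribute relies on a GCC-compatible compiler targeting ELF. Refer to
``runtime_svc.h`` for the real definitions.

.. code:: c

    #include <stdint.h>

    #define DEMO_SMC_TYPE_YIELD 0U  /* assumed call type encoding */
    #define DEMO_SMC_TYPE_FAST  1U
    #define DEMO_SMC_UNK        ((uintptr_t)0xFFFFFFFFU)

    typedef int32_t (*demo_svc_init_t)(void);
    typedef uintptr_t (*demo_svc_handle_t)(uint32_t smc_fid, void *handle,
                                           uint64_t flags);

    /* Simplified descriptor: OEN range, call type, name and two callbacks. */
    typedef struct {
        uint8_t           start_oen;
        uint8_t           end_oen;
        uint8_t           call_type;
        const char       *name;
        demo_svc_init_t   init;
        demo_svc_handle_t handle;
    } demo_rt_svc_desc_t;

    /* Place each descriptor in one ELF section so that it can be collected. */
    #define DEMO_DECLARE_RT_SVC(_name, _start, _end, _type, _init, _handle) \
        static const demo_rt_svc_desc_t demo_svc_desc_##_name               \
            __attribute__((used, section("demo_rt_svc_descs"))) = {         \
                .start_oen = (_start),                                      \
                .end_oen = (_end),                                          \
                .call_type = (_type),                                       \
                .name = #_name,                                             \
                .init = (_init),                                            \
                .handle = (_handle)                                         \
            }

    static int32_t demo_svc_init(void)
    {
        return 0;  /* 0 reports successful initialization */
    }

    /* The service itself decides which function identifiers it supports. */
    static uintptr_t demo_svc_handler(uint32_t smc_fid, void *handle,
                                      uint64_t flags)
    {
        (void)handle;
        (void)flags;
        if ((smc_fid & 0xFFFFU) == 0U)
            return 0U;        /* hypothetical function 0: success */
        return DEMO_SMC_UNK;  /* anything else: unknown function */
    }

    /* Hypothetical OEM service covering OEN 3, fast calls only. */
    DEMO_DECLARE_RT_SVC(demo_oem, 3, 3, DEMO_SMC_TYPE_FAST,
                        demo_svc_init, demo_svc_handler);

Keeping the supported function identifiers inside the handler, rather than in
the descriptor, mirrors the division of responsibility described above.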

Details of the parameters, requirements and behavior of the initialization and
call handling functions are provided in the following sections.

Initialization
~~~~~~~~~~~~~~

``runtime_svc_init()`` in ``runtime_svc.c`` initializes the runtime services
framework running on the primary CPU during cold boot as part of the BL31
initialization. This happens prior to initializing a Trusted OS and running
Normal world boot firmware that might in turn use these services.
Initialization involves validating each of the declared runtime service
descriptors, calling the service initialization function and populating the
index used for runtime lookup of the service.

The BL31 linker script collects all of the declared service descriptors into a
single array and defines symbols that allow the framework to locate and traverse
the array, and determine its size.

The framework does basic validation of each descriptor to halt firmware
initialization if service declaration errors are detected. The framework does
not check descriptors for the following error conditions, and may behave in an
unpredictable manner under such scenarios:

#. Overlapping OEN ranges
#. Multiple descriptors for the same range of OENs and ``call_type``
#. Incorrect range of owning entity numbers for a given ``call_type``

Once validated, the service ``init()`` callback is invoked. This function carries
out any essential EL3 initialization before servicing requests. The ``init()``
function is only invoked on the primary CPU during cold boot. If the service
uses per-CPU data this must either be initialized for all CPUs during this call,
or be done lazily when a CPU first issues an SMC call to that service. If
``init()`` returns anything other than ``0``, this is treated as an initialization
error and the service is ignored: this does not cause the firmware to halt.
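
The initialization flow in this section (validate, call ``init()``, then
populate the lookup index only on success) can be sketched as below. This is
not the code in ``runtime_svc.c``: the ``demo_`` names, the marker value for
unhandled entries, the ``-1`` return standing in for halting the firmware, and
the way call type and OEN are combined into an index are all assumptions made
for illustration.

.. code:: c

    #include <stddef.h>
    #include <stdint.h>

    #define DEMO_MAX_INDICES 128U  /* 2 call types x 64 OENs */
    #define DEMO_UNHANDLED   0xFFU /* hypothetical "no service" marker */

    typedef struct {
        uint8_t start_oen;
        uint8_t end_oen;
        uint8_t call_type;         /* assumed: 0 = yielding, 1 = fast */
        int32_t (*init)(void);
    } demo_svc_t;

    static uint8_t demo_indices[DEMO_MAX_INDICES];

    /* Assumed combination: call type selects the upper or lower 64 entries. */
    static unsigned int demo_index(unsigned int oen, unsigned int call_type)
    {
        return ((call_type & 0x1U) << 6) | (oen & 0x3FU);
    }

    /* Returns the number of usable services, or -1 for a bad descriptor. */
    static int demo_runtime_svc_init(const demo_svc_t *svcs, size_t count)
    {
        int registered = 0;

        for (size_t i = 0; i < DEMO_MAX_INDICES; i++)
            demo_indices[i] = DEMO_UNHANDLED;

        for (size_t i = 0; i < count && i < DEMO_UNHANDLED; i++) {
            const demo_svc_t *s = &svcs[i];

            /* Basic validation; the real framework halts at this point. */
            if (s->start_oen > s->end_oen || s->end_oen > 63U ||
                s->call_type > 1U || s->init == NULL)
                return -1;

            /* A failing init() only causes this service to be ignored. */
            if (s->init() != 0)
                continue;

            for (unsigned int oen = s->start_oen; oen <= s->end_oen; oen++)
                demo_indices[demo_index(oen, s->call_type)] = (uint8_t)i;
            registered++;
        }
        return registered;
    }

    static int32_t demo_init_ok(void)   { return 0; }
    static int32_t demo_init_fail(void) { return 1; }

    /* Example table: a fast service on OEN 4 and a failing one on 50-63. */
    static const demo_svc_t demo_svcs[] = {
        { 4U, 4U, 1U, demo_init_ok },
        { 50U, 63U, 0U, demo_init_fail },
    };

With this table, only the first service ends up reachable through the index;
the entries for OENs 50 to 63 stay marked as unhandled because ``init()``
failed.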

The OEN and call type fields present in the SMC Function ID cover a total of
128 distinct services, but in practice a single descriptor can cover a range of
OENs, e.g. SMCs to call a Trusted OS function. To optimize the lookup of a
service handler, the framework uses an array of 128 indices that map every
distinct OEN/call-type combination either to one of the declared services or to
indicate the service is not handled. This ``rt_svc_descs_indices[]`` array is
populated for all of the OENs covered by a service after the service ``init()``
function has reported success. So a service that fails to initialize will never
have its ``handle()`` function invoked.

The following figure shows how the ``rt_svc_descs_indices[]`` index maps the SMC
Function ID call type and OEN onto a specific service handler in the
``rt_svc_descs[]`` array.

|Image 1|

Handling an SMC
~~~~~~~~~~~~~~~

When the EL3 runtime services framework receives a Secure Monitor Call, the SMC
Function ID is passed in W0 from the lower exception level (as per the
`SMCCC`_). If the calling register width is AArch32, it is invalid to invoke an
SMC Function which indicates the SMC64 calling convention: such calls are
ignored and return the Unknown SMC Function Identifier result code ``0xFFFFFFFF``
in R0/X0.

Bit[31] (fast/yielding call) and bits[29:24] (owning entity number) of the SMC
Function ID are combined to index into the ``rt_svc_descs_indices[]`` array. The
resulting value might indicate a service that has no handler, in this case the
framework will also report an Unknown SMC Function ID. Otherwise, the value is
used as a further index into the ``rt_svc_descs[]`` array to locate the required
service and handler.

The service's ``handle()`` callback is provided with five of the SMC parameters
directly, the others are saved into memory for retrieval (if needed) by the
handler.
The handler is also provided with an opaque ``handle`` for use with the
supporting library for parameter retrieval, setting return values and context
manipulation; and with ``flags`` indicating the security state of the caller. The
framework finally sets up the execution stack for the handler, and invokes the
service's ``handle()`` function.

On return from the handler the result registers are populated in X0-X3 before
restoring the stack and CPU state and returning from the original SMC.

Power State Coordination Interface
----------------------------------

TODO: Provide design walkthrough of PSCI implementation.

The PSCI v1.1 specification categorizes APIs as optional and mandatory. All the
mandatory APIs in PSCI v1.1, PSCI v1.0 and in PSCI v0.2 draft specification
`Power State Coordination Interface PDD`_ are implemented. The table below lists
the PSCI v1.1 APIs and their support in generic code.

An API implementation might have a dependency on platform code e.g. CPU\_SUSPEND
requires the platform to export a part of the implementation. Hence the level
of support of the mandatory APIs depends upon the support exported by the
platform port as well. The Juno and FVP (all variants) platforms export all the
required support.

+-----------------------------+-------------+-------------------------------+
| PSCI v1.1 API               | Supported   | Comments                      |
+=============================+=============+===============================+
| ``PSCI_VERSION``            | Yes         | The version returned is 1.1   |
+-----------------------------+-------------+-------------------------------+
| ``CPU_SUSPEND``             | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_OFF``                 | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_ON``                  | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``AFFINITY_INFO``           | Yes         |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE``                 | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE_INFO_TYPE``       | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE_INFO_CPU``        | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_OFF``              | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_RESET``            | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_FEATURES``           | Yes         |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_FREEZE``              | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_DEFAULT_SUSPEND``     | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``NODE_HW_STATE``           | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_SUSPEND``          | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_SET_SUSPEND_MODE``   | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_STAT_RESIDENCY``     | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_STAT_COUNT``         | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_RESET2``           | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``MEM_PROTECT``             | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``MEM_PROTECT_CHECK_RANGE`` | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+

\*Note : These PSCI APIs require platform power management hooks to be
registered with the generic PSCI code to be supported.

\*\*Note : These PSCI APIs require appropriate Secure Payload Dispatcher
hooks to be registered with the generic PSCI code to be supported.

The PSCI implementation in ARM Trusted Firmware is a library which can be
integrated with AArch64 or AArch32 EL3 Runtime Software for ARMv8-A systems.
A guide to integrating PSCI library with AArch32 EL3 Runtime Software
can be found `here`_.

Secure-EL1 Payloads and Dispatchers
-----------------------------------

On a production system that includes a Trusted OS running in Secure-EL1/EL0,
the Trusted OS is coupled with a companion runtime service in the BL31
firmware. This service is responsible for the initialisation of the Trusted
OS and all communications with it. The Trusted OS is the BL32 stage of the
boot flow in ARM Trusted Firmware. The firmware will attempt to locate, load
and execute a BL32 image.

ARM Trusted Firmware uses a more general term for the BL32 software that runs
at Secure-EL1 - the *Secure-EL1 Payload* - as it is not always a Trusted OS.

The ARM Trusted Firmware provides a Test Secure-EL1 Payload (TSP) and a Test
Secure-EL1 Payload Dispatcher (TSPD) service as an example of how a Trusted OS
is supported on a production system using the Runtime Services Framework. On
such a system, the Test BL32 image and service are replaced by the Trusted OS
and its dispatcher service. The ARM Trusted Firmware build system expects that
the dispatcher will define the build flag ``NEED_BL32`` to enable it to include
the BL32 in the build either as a binary or to compile from source depending
on whether the ``BL32`` build option is specified or not.

The TSP runs in Secure-EL1. It is designed to demonstrate synchronous
communication with the normal-world software running in EL1/EL2. Communication
is initiated by the normal-world software:

- either directly through a Fast SMC (as defined in the `SMCCC`_)

- or indirectly through a `PSCI`_ SMC. The `PSCI`_ implementation in turn
  informs the TSPD about the requested power management operation. This allows
  the TSP to prepare for or respond to the power state change

The TSPD service is responsible for:

- Initializing the TSP

- Routing requests and responses between the secure and the non-secure
  states during the two types of communications just described

Initializing a BL32 Image
~~~~~~~~~~~~~~~~~~~~~~~~~

The Secure-EL1 Payload Dispatcher (SPD) service is responsible for initializing
the BL32 image. It needs access to the information passed by BL2 to BL31 to do
so. This is provided by:

.. code:: c

    entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t);

which returns a reference to the ``entry_point_info`` structure corresponding to
the image which will be run in the specified security state. The SPD uses this
API to get entry point information for the SECURE image, BL32.
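
A minimal sketch of how an SPD setup function might use this API is shown
below. To keep the example self-contained, ``entry_point_info_t`` is reduced to
two fields and ``bl31_plat_get_next_image_ep_info()`` is replaced by a local
stub; the value of ``SECURE``, the ``demo_`` names and the return codes are
assumptions for illustration only.

.. code:: c

    #include <stddef.h>
    #include <stdint.h>

    #define SECURE 0U  /* assumed encoding of the secure state */

    /* Reduced stand-in for the real entry point structure. */
    typedef struct {
        uintptr_t pc;   /* entry address of the image, 0 if absent */
        uint32_t  spsr; /* processor state to enter the image with */
    } entry_point_info_t;

    static entry_point_info_t demo_bl32_ep;

    /* Stub standing in for the platform implementation of the API. */
    entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t type)
    {
        return (type == SECURE) ? &demo_bl32_ep : NULL;
    }

    /*
     * Hypothetical SPD setup function: returns 0 when a BL32 image is
     * available and non-zero when the service should be skipped.
     */
    static int32_t demo_spd_setup(void)
    {
        entry_point_info_t *ep = bl31_plat_get_next_image_ep_info(SECURE);

        /* No descriptor or no entry address: there is no BL32 to run. */
        if (ep == NULL || ep->pc == 0U)
            return 1;

        /* A real SPD would now initialize the secure context from *ep. */
        return 0;
    }

Returning a non-zero value when no BL32 image is present fits the framework
behaviour described in the Initialization section, where a failing service
``init()`` is ignored without halting the firmware.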

In the absence of a BL32 image, BL31 passes control to the normal world
bootloader image (BL33). When the BL32 image is present, it is typical
that the SPD wants control to be passed to BL32 first and then later to BL33.

To do this the SPD has to register a BL32 initialization function during
initialization of the SPD service. The BL32 initialization function has this
prototype:

.. code:: c

    int32_t init(void);

and is registered using the ``bl31_register_bl32_init()`` function.

Trusted Firmware supports two approaches for the SPD to pass control to BL32
before returning through EL3 and running the non-trusted firmware (BL33):

#. In the BL32 setup function, use ``bl31_set_next_image_type()`` to
   request that the exit from ``bl31_main()`` is to the BL32 entrypoint in
   Secure-EL1. BL31 will exit to BL32 using the asynchronous method by
   calling ``bl31_prepare_next_image_entry()`` and ``el3_exit()``.

   When the BL32 has completed initialization at Secure-EL1, it returns to
   BL31 by issuing an SMC, using a Function ID allocated to the SPD. On
   receipt of this SMC, the SPD service handler should switch the CPU context
   from trusted to normal world and use the ``bl31_set_next_image_type()`` and
   ``bl31_prepare_next_image_entry()`` functions to set up the initial return to
   the normal world firmware BL33. On return from the handler the framework
   will exit to EL2 and run BL33.

#. The BL32 setup function registers an initialization function using
   ``bl31_register_bl32_init()`` which provides a SPD-defined mechanism to
   invoke a 'world-switch synchronous call' to Secure-EL1 to run the BL32
   entrypoint.
   NOTE: The Test SPD service included with the Trusted Firmware provides one
   implementation of such a mechanism.

   On completion BL32 returns control to BL31 via a SMC, and on receipt the
   SPD service handler invokes the synchronous call return mechanism to return
   to the BL32 initialization function. On return from this function,
   ``bl31_main()`` will set up the return to the normal world firmware BL33 and
   continue the boot process in the normal world.

Crash Reporting in BL31
-----------------------

BL31 implements a scheme for reporting the processor state when an unhandled
exception is encountered. The reporting mechanism attempts to preserve all the
register contents and report them via a dedicated UART (PL011 console). BL31
reports the general purpose, EL3, Secure EL1 and some EL2 state registers.

A dedicated per-CPU crash stack is maintained by BL31 and this is retrieved via
the per-CPU pointer cache. The implementation attempts to minimise the memory
required for this feature. The file ``crash_reporting.S`` contains the
implementation for crash reporting.

The sample crash output is shown below.

::

    x0 :0x000000004F00007C
    x1 :0x0000000007FFFFFF
    x2 :0x0000000004014D50
    x3 :0x0000000000000000
    x4 :0x0000000088007998
    x5 :0x00000000001343AC
    x6 :0x0000000000000016
    x7 :0x00000000000B8A38
    x8 :0x00000000001343AC
    x9 :0x00000000000101A8
    x10 :0x0000000000000002
    x11 :0x000000000000011C
    x12 :0x00000000FEFDC644
    x13 :0x00000000FED93FFC
    x14 :0x0000000000247950
    x15 :0x00000000000007A2
    x16 :0x00000000000007A4
    x17 :0x0000000000247950
    x18 :0x0000000000000000
    x19 :0x00000000FFFFFFFF
    x20 :0x0000000004014D50
    x21 :0x000000000400A38C
    x22 :0x0000000000247950
    x23 :0x0000000000000010
    x24 :0x0000000000000024
    x25 :0x00000000FEFDC868
    x26 :0x00000000FEFDC86A
    x27 :0x00000000019EDEDC
    x28 :0x000000000A7CFDAA
    x29 :0x0000000004010780
    x30 :0x000000000400F004
    scr_el3 :0x0000000000000D3D
    sctlr_el3 :0x0000000000C8181F
    cptr_el3 :0x0000000000000000
    tcr_el3 :0x0000000080803520
    daif :0x00000000000003C0
    mair_el3 :0x00000000000004FF
    spsr_el3 :0x00000000800003CC
    elr_el3 :0x000000000400C0CC
    ttbr0_el3 :0x00000000040172A0
    esr_el3 :0x0000000096000210
    sp_el3 :0x0000000004014D50
    far_el3 :0x000000004F00007C
    spsr_el1 :0x0000000000000000
    elr_el1 :0x0000000000000000
    spsr_abt :0x0000000000000000
    spsr_und :0x0000000000000000
    spsr_irq :0x0000000000000000
    spsr_fiq :0x0000000000000000
    sctlr_el1 :0x0000000030C81807
    actlr_el1 :0x0000000000000000
    cpacr_el1 :0x0000000000300000
    csselr_el1 :0x0000000000000002
    sp_el1 :0x0000000004028800
    esr_el1 :0x0000000000000000
    ttbr0_el1 :0x000000000402C200
    ttbr1_el1 :0x0000000000000000
    mair_el1 :0x00000000000004FF
    amair_el1 :0x0000000000000000
    tcr_el1 :0x0000000000003520
    tpidr_el1 :0x0000000000000000
    tpidr_el0 :0x0000000000000000
    tpidrro_el0 :0x0000000000000000
    dacr32_el2 :0x0000000000000000
    ifsr32_el2 :0x0000000000000000
    par_el1 :0x0000000000000000
    far_el1 :0x0000000000000000
    afsr0_el1 :0x0000000000000000
    afsr1_el1 :0x0000000000000000
    contextidr_el1 :0x0000000000000000
    vbar_el1 :0x0000000004027000
    cntp_ctl_el0 :0x0000000000000000
    cntp_cval_el0 :0x0000000000000000
    cntv_ctl_el0 :0x0000000000000000
    cntv_cval_el0 :0x0000000000000000
    cntkctl_el1 :0x0000000000000000
    fpexc32_el2 :0x0000000004000700
    sp_el0 :0x0000000004010780

Guidelines for Reset Handlers
-----------------------------

Trusted Firmware implements a framework that allows CPU and platform ports to
perform actions very early after a CPU is released from reset in both the cold
and warm boot paths. This is done by calling the ``reset_handler()`` function in
both the BL1 and BL31 images. It in turn calls the platform and CPU specific
reset handling functions.

Details for implementing a CPU specific reset handler can be found in
Section 8. Details for implementing a platform specific reset handler can be
found in the `Porting Guide`_ (see the ``plat_reset_handler()`` function).

When adding functionality to a reset handler, keep in mind that if a different
reset handling behavior is required between the first and the subsequent
invocations of the reset handling code, this should be detected at runtime.
In other words, the reset handler should be able to detect whether an action has
already been performed and act as appropriate. Possible courses of action are,
for example, to skip the action the second time, or to undo/redo it.

Configuring secure interrupts
-----------------------------

The GIC driver is responsible for performing initial configuration of secure
interrupts on the platform.
To this end, the platform is expected to provide the
GIC driver (either GICv2 or GICv3, as selected by the platform) with the
interrupt configuration during the driver initialisation.

There are two ways to specify secure interrupt configuration:

#. Array of secure interrupt properties: In this scheme, in both GICv2 and GICv3
   driver data structures, the ``interrupt_props`` member points to an array of
   interrupt properties. Each element of the array specifies the interrupt
   number and its configuration, viz. priority, group, configuration. Each
   element of the array shall be populated by the macro ``INTR_PROP_DESC()``.
   The macro takes the following arguments:

   - 10-bit interrupt number,

   - 8-bit interrupt priority,

   - Interrupt type (one of ``INTR_TYPE_EL3``, ``INTR_TYPE_S_EL1``,
     ``INTR_TYPE_NS``),

   - Interrupt configuration (either ``GIC_INTR_CFG_LEVEL`` or
     ``GIC_INTR_CFG_EDGE``).

#. Array of secure interrupts: In this scheme, the GIC driver is provided an
   array of secure interrupt numbers. The GIC driver, at the time of
   initialisation, iterates through the array and assigns each interrupt
   the appropriate group.

   - For the GICv2 driver, in ``gicv2_driver_data`` structure, the
     ``g0_interrupt_array`` member should point to the array of
     interrupts to be assigned to *Group 0*, and the ``g0_interrupt_num``
     member should be set to the number of interrupts in the array.

   - For the GICv3 driver, in ``gicv3_driver_data`` structure:

     - The ``g0_interrupt_array`` member should point to the array of
       interrupts to be assigned to *Group 0*, and the ``g0_interrupt_num``
       member should be set to the number of interrupts in the array.

     - The ``g1s_interrupt_array`` member should point to the array of
       interrupts to be assigned to *Group 1 Secure*, and the
       ``g1s_interrupt_num`` member should be set to the number of
       interrupts in the array.

   **Note that this scheme is deprecated.**

CPU specific operations framework
---------------------------------

Certain aspects of the ARMv8 architecture are implementation defined,
that is, certain behaviours are not architecturally defined, but must be defined
and documented by individual processor implementations. The ARM Trusted
Firmware implements a framework which categorises the common implementation
defined behaviours and allows a processor to export its implementation of that
behaviour. The categories are:

#. Processor specific reset sequence.

#. Processor specific power down sequences.

#. Processor specific register dumping as a part of crash reporting.

#. Errata status reporting.

Each of the above categories fulfils a different requirement.

#. allows any processor specific initialization before the caches and MMU
   are turned on, like implementation of errata workarounds, entry into
   the intra-cluster coherency domain etc.

#. allows each processor to implement the power down sequence mandated in
   its Technical Reference Manual (TRM).

#. allows a processor to provide additional information to the developer
   in the event of a crash, for example Cortex-A53 has registers which
   can expose the data cache contents.

#. allows a processor to define a function that inspects and reports the status
   of all errata workarounds on that processor.

Please note that only 2. is mandated by the TRM.

The CPU specific operations framework scales to accommodate a large number of
different CPUs during power down and reset handling.
The platform can specify
any CPU optimization it wants to enable for each CPU. It can also specify
the CPU errata workarounds to be applied for each CPU type during reset
handling by defining CPU errata compile time macros. Details on these macros
can be found in the `cpu-specific-build-macros.rst`_ file.

The CPU specific operations framework depends on the ``cpu_ops`` structure which
needs to be exported for each type of CPU in the platform. It is defined in
``include/lib/cpus/aarch64/cpu_macros.S`` and has the following fields: ``midr``,
``reset_func()``, ``cpu_pwr_down_ops`` (array of power down functions) and
``cpu_reg_dump()``.

The CPU specific files in ``lib/cpus`` export a ``cpu_ops`` data structure with
suitable handlers for that CPU. For example, ``lib/cpus/aarch64/cortex_a53.S``
exports the ``cpu_ops`` for Cortex-A53 CPU. According to the platform
configuration, these CPU specific files must be included in the build by
the platform makefile. The generic CPU specific operations framework code exists
in ``lib/cpus/aarch64/cpu_helpers.S``.

CPU specific Reset Handling
~~~~~~~~~~~~~~~~~~~~~~~~~~~

After a reset, the state of the CPU when it calls generic reset handler is:
MMU turned off, both instruction and data caches turned off and not part
of any coherency domain.

The BL entrypoint code first invokes the ``plat_reset_handler()`` to allow
the platform to perform any system initialization required and any system
errata workarounds that need to be applied. The ``get_cpu_ops_ptr()`` reads
the current CPU midr, finds the matching ``cpu_ops`` entry in the ``cpu_ops``
array and returns it. Note that only the part number and implementer fields
in midr are used to find the matching ``cpu_ops`` entry.
The ``reset_func()`` in
the returned ``cpu_ops`` is then invoked which executes the required reset
handling for that CPU and also any errata workarounds enabled by the platform.
This function must preserve the values of general purpose registers x20 to x29.

Refer to Section "Guidelines for Reset Handlers" for general guidelines
regarding placement of code in a reset handler.

CPU specific power down sequence
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

During the BL31 initialization sequence, the pointer to the matching ``cpu_ops``
entry is stored in per-CPU data by ``init_cpu_ops()`` so that it can be quickly
retrieved during power down sequences.

Various CPU drivers register handlers to perform power down at certain power
levels for that specific CPU. The PSCI service, upon receiving a power down
request, determines the highest power level at which to execute power down
sequence for a particular CPU. It uses the ``prepare_cpu_pwr_dwn()`` function to
pick the right power down handler for the requested level. The function
retrieves ``cpu_ops`` pointer member of per-CPU data, and from that, further
retrieves ``cpu_pwr_down_ops`` array, and indexes into the required level. If the
requested power level is higher than what a CPU driver supports, the handler
registered for highest level is invoked.

At runtime the platform hooks for power down are invoked by the PSCI service to
perform platform specific operations during a power down sequence, for example
turning off CCI coherency during a cluster power down.

CPU specific register reporting during crash
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the crash reporting is enabled in BL31, when a crash occurs, the crash
reporting framework calls ``do_cpu_reg_dump`` which retrieves the matching
``cpu_ops`` using ``get_cpu_ops_ptr()`` function.
The ``cpu_reg_dump()`` in
``cpu_ops`` is invoked, which then returns the CPU specific register values to
be reported and a pointer to the ASCII list of register names in a format
expected by the crash reporting framework.

CPU errata status reporting
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Errata workarounds for CPUs supported in ARM Trusted Firmware are applied during
both cold and warm boots, shortly after reset. Individual Errata workarounds are
enabled as build options. Some errata workarounds have potential run-time
implications; therefore some are enabled by default, others not. Platform ports
shall override build options to enable or disable errata as appropriate. The CPU
drivers take care of applying errata workarounds that are enabled and applicable
to a given CPU. Refer to the section titled *CPU Errata Workarounds* in `CPUBM`_
for more information.

Functions in CPU drivers that apply errata workaround must follow the
conventions listed below.

The errata workaround must be authored as two separate functions:

- One that checks for errata. This function must determine whether that errata
  applies to the current CPU. Typically this involves matching the current
  CPU's revision and variant against a value that's known to be affected by the
  errata. If the function determines that the errata applies to this CPU, it
  must return ``ERRATA_APPLIES``; otherwise, it must return
  ``ERRATA_NOT_APPLIES``. The utility functions ``cpu_get_rev_var`` and
  ``cpu_rev_var_ls`` may come in handy for this purpose.

  For an errata identified as ``E``, the check function must be named
  ``check_errata_E``.

  This function will be invoked at different times, both from assembly and from
  C run time. Therefore it must follow AAPCS, and must not use stack.

- Another one that applies the errata workaround.
  This function would call the
  check function described above, and applies errata workaround if required.

CPU drivers that apply errata workaround can optionally implement an assembly
function that reports the status of errata workarounds pertaining to that CPU.
For a driver that registers the CPU, for example, ``cpux`` via the
``declare_cpu_ops`` macro, the errata reporting function, if it exists, must be
named ``cpux_errata_report``. This function will always be called with MMU
enabled; it must follow AAPCS and may use stack.

In a debug build of ARM Trusted Firmware, on a CPU that comes out of reset, both
BL1 and the run time firmware (BL31 in AArch64, and BL32 in AArch32) will invoke
errata status reporting function, if one exists, for that type of CPU.

To report the status of each errata workaround, the function shall use the
assembler macro ``report_errata``, passing it:

- The build option that enables the errata;

- The name of the CPU: this must be the same identifier that CPU driver
  registered itself with, using ``declare_cpu_ops``;

- And the errata identifier: the identifier must match what's used in the
  errata's check function described above.

The errata status reporting function will be called once per CPU type/errata
combination during the software's active life time.

It's expected that whenever an errata workaround is submitted to ARM Trusted
Firmware, the errata reporting function is appropriately extended to report its
status as well.

Reporting the status of errata workaround is for informational purpose only; it
has no functional significance.

Memory layout of BL images
--------------------------

Each bootloader image can be divided in 2 parts:

- the static contents of the image. These are data actually stored in the
  binary on the disk.
  In the ELF terminology, they are called ``PROGBITS``
  sections;

- the run-time contents of the image. These are data that don't occupy any
  space in the binary on the disk. The ELF binary just contains some
  metadata indicating where these data will be stored at run-time and the
  corresponding sections need to be allocated and initialized at run-time.
  In the ELF terminology, they are called ``NOBITS`` sections.

All PROGBITS sections are grouped together at the beginning of the image,
followed by all NOBITS sections. This is true for all Trusted Firmware images
and it is governed by the linker scripts. This ensures that the raw binary
images are as small as possible. If a NOBITS section were inserted in between
PROGBITS sections then the resulting binary file would contain zero bytes in
place of this NOBITS section, making the image unnecessarily bigger. Smaller
images allow faster loading from the FIP to the main memory.

Linker scripts and symbols
~~~~~~~~~~~~~~~~~~~~~~~~~~

Each bootloader stage image layout is described by its own linker script. The
linker scripts export some symbols into the program symbol table. Their values
correspond to particular addresses. The trusted firmware code can refer to these
symbols to figure out the image memory layout.

Linker symbols use the following naming convention in the trusted firmware.

- ``__<SECTION>_START__``

  Start address of a given section named ``<SECTION>``.

- ``__<SECTION>_END__``

  End address of a given section named ``<SECTION>``. If there is an alignment
  constraint on the section's end address then ``__<SECTION>_END__`` corresponds
  to the end address of the section's actual contents, rounded up to the right
  boundary. Refer to the value of ``__<SECTION>_UNALIGNED_END__`` to know the
  actual end address of the section's contents.

- ``__<SECTION>_UNALIGNED_END__``

  End address of a given section named ``<SECTION>`` without any padding or
  rounding up due to some alignment constraint.

- ``__<SECTION>_SIZE__``

  Size (in bytes) of a given section named ``<SECTION>``. If there is an
  alignment constraint on the section's end address then ``__<SECTION>_SIZE__``
  corresponds to the size of the section's actual contents, rounded up to the
  right boundary. In other words,
  ``__<SECTION>_SIZE__ = __<SECTION>_END__ - __<SECTION>_START__``. Refer to the
  value of ``__<SECTION>_UNALIGNED_SIZE__`` to know the actual size of the
  section's contents.

- ``__<SECTION>_UNALIGNED_SIZE__``

  Size (in bytes) of a given section named ``<SECTION>`` without any padding or
  rounding up due to some alignment constraint. In other words,
  ``__<SECTION>_UNALIGNED_SIZE__ = __<SECTION>_UNALIGNED_END__ - __<SECTION>_START__``.

Some of the linker symbols are mandatory as the trusted firmware code relies on
them to be defined. They are listed in the following subsections. Some of them
must be provided for each bootloader stage and some are specific to a given
bootloader stage.

The linker scripts define some extra, optional symbols. They are not actually
used by any code but they help in understanding the bootloader images' memory
layout as they are easy to spot in the link map files.

Common linker symbols
^^^^^^^^^^^^^^^^^^^^^

All BL images share the following requirements:

- The BSS section must be zero-initialised before executing any C code.
- The coherent memory section (if enabled) must be zero-initialised as well.
- The MMU setup code needs to know the extents of the coherent and read-only
  memory regions to set the right memory attributes.
  When
  ``SEPARATE_CODE_AND_RODATA=1``, it needs to know more specifically how the
  read-only memory region is divided between code and data.

The following linker symbols are defined for this purpose:

- ``__BSS_START__``
- ``__BSS_SIZE__``
- ``__COHERENT_RAM_START__`` Must be aligned on a page-size boundary.
- ``__COHERENT_RAM_END__`` Must be aligned on a page-size boundary.
- ``__COHERENT_RAM_UNALIGNED_SIZE__``
- ``__RO_START__``
- ``__RO_END__``
- ``__TEXT_START__``
- ``__TEXT_END__``
- ``__RODATA_START__``
- ``__RODATA_END__``

BL1's linker symbols
^^^^^^^^^^^^^^^^^^^^

BL1 being the ROM image, it has additional requirements. BL1 resides in ROM and
it is entirely executed in place but it needs some read-write memory for its
mutable data. Its ``.data`` section (i.e. its allocated read-write data) must be
relocated from ROM to RAM before executing any C code.

The following additional linker symbols are defined for BL1:

- ``__BL1_ROM_END__`` End address of BL1's ROM contents, covering its code
  and ``.data`` section in ROM.
- ``__DATA_ROM_START__`` Start address of the ``.data`` section in ROM. Must be
  aligned on a 16-byte boundary.
- ``__DATA_RAM_START__`` Address in RAM where the ``.data`` section should be
  copied over. Must be aligned on a 16-byte boundary.
- ``__DATA_SIZE__`` Size of the ``.data`` section (in ROM or RAM).
- ``__BL1_RAM_START__`` Start address of BL1 read-write data.
- ``__BL1_RAM_END__`` End address of BL1 read-write data.

How to choose the right base addresses for each bootloader stage image
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There is currently no support for dynamic image loading in the Trusted Firmware.
This means that all bootloader images need to be linked against their ultimate
runtime locations and the base addresses of each image must be chosen carefully
such that images don't overlap each other in an undesired way. As the code
grows, the base addresses might need adjustments to cope with the new memory
layout.

The memory layout is completely specific to the platform and so there is no
general recipe for choosing the right base addresses for each bootloader image.
However, there are tools to aid in understanding the memory layout. These are
the link map files: ``build/<platform>/<build-type>/bl<x>/bl<x>.map``, with ``<x>``
being the stage bootloader. They provide a detailed view of the memory usage of
each image. Among other useful information, they provide the end address of
each image.

- ``bl1.map`` link map file provides ``__BL1_RAM_END__`` address.
- ``bl2.map`` link map file provides ``__BL2_END__`` address.
- ``bl31.map`` link map file provides ``__BL31_END__`` address.
- ``bl32.map`` link map file provides ``__BL32_END__`` address.

For each bootloader image, the platform code must provide its start address
as well as a limit address that it must not overstep. The latter is used in the
linker scripts to check that the image doesn't grow past that address. If that
happens, the linker will issue a message similar to the following:

::

    aarch64-none-elf-ld: BLx has exceeded its limit.

Additionally, if the platform memory layout implies some image overlaying like
on FVP, BL31 and TSP need to know the limit address that their PROGBITS
sections must not overstep. The platform code must provide those.

When LOAD\_IMAGE\_V2 is disabled, Trusted Firmware provides a mechanism to
verify at boot time that the memory to load a new image is free to prevent
overwriting a previously loaded image.
For this mechanism to work, the platform
must specify the memory available in the system as regions, where each region
consists of base address, total size and the free area within it (as defined
in the ``meminfo_t`` structure). Trusted Firmware retrieves these memory regions
by calling the corresponding platform API:

- ``meminfo_t *bl1_plat_sec_mem_layout(void)``
- ``meminfo_t *bl2_plat_sec_mem_layout(void)``
- ``void bl2_plat_get_scp_bl2_meminfo(meminfo_t *scp_bl2_meminfo)``
- ``void bl2_plat_get_bl32_meminfo(meminfo_t *bl32_meminfo)``
- ``void bl2_plat_get_bl33_meminfo(meminfo_t *bl33_meminfo)``

For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will
return the region defined by the platform where BL1 intends to load BL2. The
``load_image()`` function will check that the memory where BL2 will be loaded is
within the specified region and marked as free.

The actual number of regions and their base addresses and sizes is platform
specific. The platform may return the same region or define a different one for
each API. However, the overlap verification mechanism applies only to a single
region. Hence, it is the platform's responsibility to guarantee that different
regions do not overlap, or that if they do, the overlapping images are not
accessed at the same time. This could be used, for example, to load temporary
images (e.g. certificates) or firmware images prior to being transferred to
their corresponding processor (e.g. the SCP BL2 image).

To reduce fragmentation and simplify the tracking of free memory, all the free
memory within a region is always located in one single buffer defined by its
base address and size. Trusted Firmware implements a top/bottom load approach:
after a new image is loaded, it checks how much memory remains free above and
below the image.
The smaller area is marked as unavailable, while the larger
area becomes the new free memory buffer. Platforms should take this behaviour
into account when defining the base address for each of the images. For example,
if an image is loaded near the middle of the region, small changes in image size
could cause a flip between a top load and a bottom load, which may result in an
unexpected memory layout.

The following diagram is an example of an image loaded in the bottom part of
the memory region. The region is initially free (nothing has been loaded yet):

::

    Memory region
    +----------+
    |          |
    |          |  <<<<<<<<<<<<<  Free
    |          |
    |----------|                 +------------+
    |  image   |  <<<<<<<<<<<<<  |   image    |
    |----------|                 +------------+
    | xxxxxxxx |  <<<<<<<<<<<<<  Marked as unavailable
    +----------+

And the following diagram is an example of an image loaded in the top part:

::

    Memory region
    +----------+
    | xxxxxxxx |  <<<<<<<<<<<<<  Marked as unavailable
    |----------|                 +------------+
    |  image   |  <<<<<<<<<<<<<  |   image    |
    |----------|                 +------------+
    |          |
    |          |  <<<<<<<<<<<<<  Free
    |          |
    +----------+

When LOAD\_IMAGE\_V2 is enabled, Trusted Firmware does not provide any mechanism
to verify at boot time that the memory to load a new image is free to prevent
overwriting a previously loaded image. The platform must specify the memory
available in the system for all the relevant BL images to be loaded.

For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will
return the region defined by the platform where BL1 intends to load BL2. The
``load_image()`` function performs a bounds check for the image size based on
the base and maximum image size provided by the platforms. Platforms must take
this behaviour into account when defining the base/size for each of the images.
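
The top/bottom load approach used when LOAD\_IMAGE\_V2 is disabled can be
illustrated with a minimal sketch. It is not the firmware's implementation:
``free_region_t`` and ``reserve_image()`` are hypothetical names introduced
here, and keeping the area above the image on a tie is an assumption.

.. code:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical descriptor of the single free buffer in a region. */
    typedef struct free_region {
        uintptr_t free_base;    /* Base address of the free buffer */
        size_t free_size;       /* Size in bytes of the free buffer */
    } free_region_t;

    /*
     * Reserve [image_base, image_base + image_size) inside the free buffer.
     * Returns 0 on success, -1 if the image does not fit in the free buffer.
     */
    int reserve_image(free_region_t *r, uintptr_t image_base, size_t image_size)
    {
        uintptr_t free_end = r->free_base + r->free_size;

        if (image_size == 0U || image_base < r->free_base ||
            image_base > free_end || image_size > free_end - image_base)
            return -1;

        size_t below = image_base - r->free_base;
        size_t above = free_end - (image_base + image_size);

        if (above >= below) {
            /* Image sits in the bottom part: the area below is lost. */
            r->free_base = image_base + image_size;
            r->free_size = above;
        } else {
            /* Image sits in the top part: the area above is lost. */
            r->free_size = below;
        }
        return 0;
    }

With a free buffer of ``0x8000`` bytes at ``0x1000``, reserving ``0x1000`` bytes
at ``0x2000`` leaves ``0x6000`` bytes free from ``0x3000``. Moving the same image
closer to the middle of the region shrinks the gap between the two candidate
areas, which is why a small size change can flip the outcome.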

Memory layout on ARM development platforms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following list describes the memory layout on the ARM development platforms:

- A 4KB page of shared memory is used for communication between Trusted
  Firmware and the platform's power controller. This is located at the base of
  Trusted SRAM. The amount of Trusted SRAM available to load the bootloader
  images is reduced by the size of the shared memory.

  The shared memory is used to store the CPUs' entrypoint mailbox. On Juno,
  this is also used for the MHU payload when passing messages to and from the
  SCP.

- On FVP, BL1 is originally sitting in the Trusted ROM at address ``0x0``. On
  Juno, BL1 resides in flash memory at address ``0x0BEC0000``. BL1 read-write
  data are relocated to the top of Trusted SRAM at runtime.

- EL3 Runtime Software, BL31 for AArch64 and BL32 for AArch32 (e.g. SP\_MIN),
  is loaded at the top of the Trusted SRAM, such that its NOBITS sections will
  overwrite BL1 R/W data. This implies that BL1 global variables remain valid
  only until execution reaches the EL3 Runtime Software entry point during a
  cold boot.

- BL2 is loaded below EL3 Runtime Software.

- On Juno, SCP\_BL2 is loaded temporarily into the EL3 Runtime Software memory
  region and transferred to the SCP before being overwritten by EL3 Runtime
  Software.

- BL32 (for AArch64) can be loaded in one of the following locations:

  - Trusted SRAM
  - Trusted DRAM (FVP only)
  - Secure region of DRAM (top 16MB of DRAM configured by the TrustZone
    controller)

  When BL32 (for AArch64) is loaded into Trusted SRAM, its NOBITS sections
  are allowed to overlay BL2. This memory layout is designed to give the
  BL32 image as much memory as possible when it is loaded into Trusted SRAM.

When LOAD\_IMAGE\_V2 is disabled the memory regions for the overlap detection
mechanism at boot time are defined as follows (shown per API):

- ``meminfo_t *bl1_plat_sec_mem_layout(void)``

  This region corresponds to the whole Trusted SRAM except for the shared
  memory at the base. This region is initially free. At boot time, BL1 will
  mark the BL1(rw) section within this region as occupied. The BL1(rw) section
  is placed at the top of Trusted SRAM.

- ``meminfo_t *bl2_plat_sec_mem_layout(void)``

  This region corresponds to the whole Trusted SRAM as defined by
  ``bl1_plat_sec_mem_layout()``, but with the BL1(rw) section marked as
  occupied. This memory region is used to check that BL2 and BL31 do not
  overlap with each other. BL2\_BASE and BL1\_RW\_BASE are carefully chosen so
  that the memory for BL31 is top loaded above BL2.

- ``void bl2_plat_get_scp_bl2_meminfo(meminfo_t *scp_bl2_meminfo)``

  This region is an exact copy of the region defined by
  ``bl2_plat_sec_mem_layout()``. Being a disconnected copy means that all the
  changes made to this region by the Trusted Firmware will not be propagated.
  This approach is valid because the SCP BL2 image is loaded temporarily
  while it is being transferred to the SCP, so this memory is reused
  afterwards.

- ``void bl2_plat_get_bl32_meminfo(meminfo_t *bl32_meminfo)``

  This region depends on the location of the BL32 image. Currently, ARM
  platforms support three different locations (detailed below): Trusted SRAM,
  Trusted DRAM and the TZC-Secured DRAM.

- ``void bl2_plat_get_bl33_meminfo(meminfo_t *bl33_meminfo)``

  This region corresponds to the Non-Secure DDR-DRAM, excluding the
  TZC-Secured area.

The location of the BL32 image will result in different memory maps.
This is
illustrated for both FVP and Juno in the following diagrams, using the TSP as
an example.

Note: Loading the BL32 image in TZC secured DRAM doesn't change the memory
layout of the other images in Trusted SRAM.

**FVP with TSP in Trusted SRAM (default option):**
(These diagrams only cover the AArch64 case)

::

               Trusted SRAM
    0x04040000 +----------+  loaded by BL2  ------------------
               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |----------|                 ------------------
               |   BL2    |  <<<<<<<<<<<<<  |  BL32 NOBITS   |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL32 PROGBITS  |
    0x04001000 +----------+                 ------------------
               |  Shared  |
    0x04000000 +----------+

               Trusted ROM
    0x04000000 +----------+
               | BL1 (ro) |
    0x00000000 +----------+

**FVP with TSP in Trusted DRAM:**

::

               Trusted DRAM
    0x08000000 +----------+
               |   BL32   |
    0x06000000 +----------+

               Trusted SRAM
    0x04040000 +----------+  loaded by BL2  ------------------
               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |----------|                 ------------------
               |   BL2    |
               |----------|
               |          |
    0x04001000 +----------+
               |  Shared  |
    0x04000000 +----------+

               Trusted ROM
    0x04000000 +----------+
               | BL1 (ro) |
    0x00000000 +----------+

**FVP with TSP in TZC-Secured DRAM:**

::

                   DRAM
    0xffffffff +----------+
               |   BL32   |  (secure)
    0xff000000 +----------+
               |          |
               :          :  (non-secure)
               |          |
    0x80000000 +----------+

               Trusted SRAM
    0x04040000 +----------+  loaded by BL2  ------------------
               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |----------|                 ------------------
               |   BL2    |
               |----------|
               |          |
    0x04001000 +----------+
               |  Shared  |
    0x04000000 +----------+

               Trusted ROM
    0x04000000 +----------+
               | BL1 (ro) |
    0x00000000 +----------+

**Juno with BL32 in Trusted SRAM (default option):**

::

                  Flash0
    0x0C000000 +----------+
               :          :
    0x0BED0000 |----------|
               | BL1 (ro) |
    0x0BEC0000 |----------|
               :          :
    0x08000000 +----------+                  BL31 is loaded
                                             after SCP_BL2 has
               Trusted SRAM                  been sent to SCP
    0x04040000 +----------+  loaded by BL2  ------------------
               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |----------|  <<<<<<<<<<<<<  |----------------|
               | SCP_BL2  |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |----------|                 ------------------
               |   BL2    |  <<<<<<<<<<<<<  |  BL32 NOBITS   |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL32 PROGBITS  |
    0x04001000 +----------+                 ------------------
               |   MHU    |
    0x04000000 +----------+

**Juno with BL32 in TZC-secured DRAM:**

::

                   DRAM
    0xFFE00000 +----------+
               |   BL32   |  (secure)
    0xFF000000 |----------|
               |          |
               :          :  (non-secure)
               |          |
    0x80000000 +----------+

                  Flash0
    0x0C000000 +----------+
               :          :
    0x0BED0000 |----------|
               | BL1 (ro) |
    0x0BEC0000 |----------|
               :          :
    0x08000000 +----------+                  BL31 is loaded
                                             after SCP_BL2 has
               Trusted SRAM                  been sent to SCP
    0x04040000 +----------+  loaded by BL2  ------------------
               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |----------|  <<<<<<<<<<<<<  |----------------|
               | SCP_BL2  |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |----------|                 ------------------
               |   BL2    |
               |----------|
               |          |
    0x04001000 +----------+
               |   MHU    |
    0x04000000 +----------+

Firmware Image Package (FIP)
----------------------------

Using a Firmware Image
Package (FIP) allows for packing bootloader images (and
potentially other payloads) into a single archive that can be loaded by the ARM
Trusted Firmware from non-volatile platform storage. A driver to load images
from a FIP has been added to the storage layer and allows a package to be read
from supported platform storage. A tool to create Firmware Image Packages is
also provided and described below.

Firmware Image Package layout
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FIP layout consists of a table of contents (ToC) followed by payload data.
The ToC itself has a header followed by one or more table entries. The ToC is
terminated by an end marker entry. All ToC entries describe some payload data
that has been appended to the end of the binary package. With the information
provided in the ToC entry the corresponding payload data can be retrieved.

::

    ------------------
    | ToC Header     |
    |----------------|
    | ToC Entry 0    |
    |----------------|
    | ToC Entry 1    |
    |----------------|
    | ToC End Marker |
    |----------------|
    |                |
    |     Data 0     |
    |                |
    |----------------|
    |                |
    |     Data 1     |
    |                |
    ------------------

The ToC header and entry formats are described in the header file
``include/tools_share/firmware_image_package.h``. This file is used by both the
tool and the ARM Trusted Firmware.

The ToC header has the following fields:

::

    `name`: The name of the ToC. This is currently used to validate the header.
    `serial_number`: A non-zero number provided by the creation tool
    `flags`: Flags associated with this data.
        Bits 0-31: Reserved
        Bits 32-47: Platform defined
        Bits 48-63: Reserved

A ToC entry has the following fields:

::

    `uuid`: All files are referred to by a pre-defined Universally Unique
        IDentifier [UUID]. The UUIDs are defined in
        `include/tools_share/firmware_image_package.h`. The platform translates
        the requested image name into the corresponding UUID when accessing the
        package.
    `offset_address`: The offset address at which the corresponding payload data
        can be found. The offset is calculated from the ToC base address.
    `size`: The size of the corresponding payload data in bytes.
    `flags`: Flags associated with this entry. None are yet defined.

Firmware Image Package creation tool
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FIP creation tool can be used to pack specified images into a binary package
that can be loaded by the ARM Trusted Firmware from platform storage. The tool
currently only supports packing bootloader images. Additional image definitions
can be added to the tool as required.

The tool can be found in ``tools/fiptool``.

Loading from a Firmware Image Package (FIP)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Firmware Image Package (FIP) driver can load images from a binary package on
non-volatile platform storage. For the ARM development platforms, this is
currently NOR FLASH.

Bootloader images are loaded according to the platform policy as specified by
the function ``plat_get_image_source()``. For the ARM development platforms, this
means the platform will attempt to load images from a Firmware Image Package
located at the start of NOR FLASH0.

The ARM development platforms' policy is to only allow loading of a known set of
images. The platform policy can be modified to allow additional images.

Use of coherent memory in Trusted Firmware
------------------------------------------

There might be loss of coherency when physical memory with mismatched
shareability, cacheability and memory attributes is accessed by multiple CPUs
(refer to section B2.9 of `ARM ARM`_ for more details).
This possibility occurs
in Trusted Firmware during power up/down sequences when coherency, MMU and
caches are turned on/off incrementally.

Trusted Firmware defines coherent memory as a region of memory with Device
nGnRE attributes in the translation tables. The translation granule size in
Trusted Firmware is 4KB. This is the smallest possible size of the coherent
memory region.

By default, all data structures which are susceptible to accesses with
mismatched attributes from various CPUs are allocated in a coherent memory
region (refer to section 2.1 of `Porting Guide`_). The coherent memory region
accesses are Outer Shareable, non-cacheable and they can be accessed
with the Device nGnRE attributes when the MMU is turned on. Hence, at the
expense of at least an extra page of memory, Trusted Firmware is able to work
around coherency issues due to mismatched memory attributes.

The alternative to the above approach is to allocate the susceptible data
structures in Normal WriteBack WriteAllocate Inner shareable memory. This
approach requires the data structures to be designed so that it is possible to
work around the issue of mismatched memory attributes by performing software
cache maintenance on them.

Disabling the use of coherent memory in Trusted Firmware
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It might be desirable to avoid the cost of allocating coherent memory on
platforms which are memory constrained. Trusted Firmware enables inclusion of
coherent memory in firmware images through the build flag ``USE_COHERENT_MEM``.
This flag is enabled by default. It can be disabled to choose the second
approach described above.

The below sections analyze the data structures allocated in the coherent memory
region and the changes required to allocate them in normal memory.

Coherent memory usage in PSCI implementation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``psci_non_cpu_pd_nodes`` data structure stores the platform's power domain
tree information for state management of power domains. By default, this data
structure is allocated in the coherent memory region in the Trusted Firmware
because it can be accessed by multiple CPUs, either with caches enabled or
disabled.

.. code:: c

    typedef struct non_cpu_pwr_domain_node {
        /*
         * Index of the first CPU power domain node level 0 which has this node
         * as its parent.
         */
        unsigned int cpu_start_idx;

        /*
         * Number of CPU power domains which are siblings of the domain indexed
         * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx
         * -> cpu_start_idx + ncpus' have this node as their parent.
         */
        unsigned int ncpus;

        /*
         * Index of the parent power domain node.
         * TODO: Figure out whether to whether using pointer is more efficient.
         */
        unsigned int parent_node;

        plat_local_state_t local_state;

        unsigned char level;

        /* For indexing the psci_lock array*/
        unsigned char lock_index;
    } non_cpu_pd_node_t;

In order to move this data structure to normal memory, the use of each of its
fields must be analyzed. Fields like ``cpu_start_idx``, ``ncpus``, ``parent_node``,
``level`` and ``lock_index`` are only written once during cold boot. Hence removing
them from coherent memory involves only doing a clean and invalidate of the
cache lines after these fields are written.

The field ``local_state`` can be concurrently accessed by multiple CPUs in
different cache states. A Lamport's Bakery lock ``psci_locks`` is used to ensure
mutual exclusion to this field and a clean and invalidate is needed after it
is written.

Bakery lock data
~~~~~~~~~~~~~~~~

The bakery lock data structure ``bakery_lock_t`` is allocated in coherent memory
and is accessed by multiple CPUs with mismatched attributes. ``bakery_lock_t`` is
defined as follows:

.. code:: c

    typedef struct bakery_lock {
        /*
         * The lock_data is a bit-field of 2 members:
         * Bit[0]       : choosing. This field is set when the CPU is
         *                choosing its bakery number.
         * Bits[1 - 15] : number. This is the bakery number allocated.
         */
        volatile uint16_t lock_data[BAKERY_LOCK_MAX_CPUS];
    } bakery_lock_t;

It is a characteristic of Lamport's Bakery algorithm that the volatile per-CPU
fields can be read by all CPUs but only written to by the owning CPU.

Depending upon the data cache line size, the per-CPU fields of the
``bakery_lock_t`` structure for multiple CPUs may exist on a single cache line.
These per-CPU fields can be read and written during lock contention by multiple
CPUs with mismatched memory attributes. Since these fields are a part of the
lock implementation, they do not have access to any other locking primitive to
safeguard against the resulting coherency issues. As a result, simple software
cache maintenance is not enough to allocate them in normal (cacheable) memory
instead of coherent memory. Consider the following example.

CPU0 updates its per-CPU field with data cache enabled. This write updates a
local cache line which contains a copy of the fields for other CPUs as well. Now
CPU1 updates its per-CPU field of the ``bakery_lock_t`` structure with data cache
disabled. CPU1 then issues a DCIVAC operation to invalidate any stale copies of
its field in any other cache line in the system. This operation will invalidate
the update made by CPU0 as well.

To use bakery locks when ``USE_COHERENT_MEM`` is disabled, the lock data structure
has been redesigned.
The changes utilise the characteristic of Lamport's Bakery
algorithm mentioned earlier. The bakery\_lock structure only allocates the memory
for a single CPU. The macro ``DEFINE_BAKERY_LOCK`` allocates all the bakery locks
needed for a CPU into a section ``bakery_lock``. The linker allocates the memory
for other cores by using the total size allocated for the bakery\_lock section
and multiplying it with (PLATFORM\_CORE\_COUNT - 1). This enables software to
perform software cache maintenance on the lock data structure without running
into coherency issues associated with mismatched attributes.

The bakery lock data structure ``bakery_info_t`` is defined for use when
``USE_COHERENT_MEM`` is disabled as follows:

.. code:: c

    typedef struct bakery_info {
        /*
         * The lock_data is a bit-field of 2 members:
         * Bit[0]       : choosing. This field is set when the CPU is
         *                choosing its bakery number.
         * Bits[1 - 15] : number. This is the bakery number allocated.
         */
        volatile uint16_t lock_data;
    } bakery_info_t;

The ``bakery_info_t`` represents a single per-CPU field of one lock and
the combination of corresponding ``bakery_info_t`` structures for all CPUs in the
system represents the complete bakery lock. The view in memory for a system
with n bakery locks is:

::

    bakery_lock section start
    |----------------|
    | `bakery_info_t`| <-- Lock_0 per-CPU field
    |    Lock_0      |     for CPU0
    |----------------|
    | `bakery_info_t`| <-- Lock_1 per-CPU field
    |    Lock_1      |     for CPU0
    |----------------|
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_N per-CPU field
    |    Lock_N      |     for CPU0
    ------------------
    |    XXXXX       |
    | Padding to     |
    | next Cache WB  | <--- Calculate PERCPU_BAKERY_LOCK_SIZE, allocate
    |  Granule       |       continuous memory for remaining CPUs.
    ------------------
    | `bakery_info_t`| <-- Lock_0 per-CPU field
    |    Lock_0      |     for CPU1
    |----------------|
    | `bakery_info_t`| <-- Lock_1 per-CPU field
    |    Lock_1      |     for CPU1
    |----------------|
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_N per-CPU field
    |    Lock_N      |     for CPU1
    ------------------
    |    XXXXX       |
    | Padding to     |
    | next Cache WB  |
    |  Granule       |
    ------------------

Consider a system of 2 CPUs with 'N' bakery locks as shown above. For an
operation on Lock\_N, the corresponding ``bakery_info_t`` in both CPU0 and CPU1
``bakery_lock`` section need to be fetched and appropriate cache operations need
to be performed for each access.

On ARM Platforms, bakery locks are used in psci (``psci_locks``) and power controller
driver (``arm_lock``).

Non Functional Impact of removing coherent memory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Removal of the coherent memory region leads to the additional software overhead
of performing cache maintenance for the affected data structures. However, since
the memory where the data structures are allocated is cacheable, the overhead is
mostly mitigated by an increase in performance.

There is however a performance impact for bakery locks, due to:

- Additional cache maintenance operations, and
- Multiple cache line reads for each lock operation, since the bakery locks
  for each CPU are distributed across different cache lines.

The implementation has been optimized to minimize this additional overhead.
Measurements indicate that when bakery locks are allocated in Normal memory, the
minimum latency of acquiring a lock is on average 3-4 microseconds whereas
in Device memory the same is 2 microseconds. The measurements were done on the
Juno ARM development platform.
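
The reason each lock operation touches one cache line per CPU follows from the
per-CPU layout of the ``bakery_lock`` section shown earlier. The sketch below
illustrates the address arithmetic only. The helper names and the 64-byte
granule are assumptions made for this example, not the firmware's definitions.

.. code:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Assumed cache writeback granule in bytes; the real value is platform specific. */
    #define ASSUMED_CACHE_WB_GRANULE 64U

    /* Same shape as the bakery_info_t structure shown earlier. */
    typedef struct bakery_info {
        volatile uint16_t lock_data;
    } bakery_info_t;

    /* Size of one CPU's copy of all locks, rounded up to a whole granule. */
    size_t percpu_lock_size(size_t num_locks)
    {
        size_t raw = num_locks * sizeof(bakery_info_t);

        return (raw + ASSUMED_CACHE_WB_GRANULE - 1U) &
               ~((size_t)ASSUMED_CACHE_WB_GRANULE - 1U);
    }

    /*
     * Given the address of a lock in CPU0's copy of the section, return the
     * address of the same lock's field in the copy that belongs to 'cpu'.
     */
    bakery_info_t *bakery_info_for_cpu(bakery_info_t *lock, unsigned int cpu,
                                       size_t percpu_size)
    {
        return (bakery_info_t *)((uintptr_t)lock + (uintptr_t)cpu * percpu_size);
    }

Because consecutive CPUs' fields for the same lock are a whole granule apart,
no two CPUs write to the same cache line. The cost is that reading every CPU's
field for one lock requires one cache line access per CPU.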

As mentioned earlier, almost a page of memory can be saved by disabling
``USE_COHERENT_MEM``. Each platform needs to consider these trade-offs to decide
whether coherent memory should be used. If a platform disables
``USE_COHERENT_MEM`` and needs to use bakery locks in the porting layer, it can
optionally define the macro ``PLAT_PERCPU_BAKERY_LOCK_SIZE`` (see the
`Porting Guide`_). Refer to the reference platform code for examples.

Isolating code and read-only data on separate memory pages
----------------------------------------------------------

In the ARMv8 VMSA, translation table entries include fields that define the
properties of the target memory region, such as its access permissions. The
smallest unit of memory that can be addressed by a translation table entry is
a memory page. Therefore, if software needs to set different permissions on two
memory regions then it needs to map them using different memory pages.

The default memory layout for each BL image is as follows:

::

       |        ...        |
       +-------------------+
       |  Read-write data  |
       +-------------------+ Page boundary
       |     <Padding>     |
       +-------------------+
       | Exception vectors |
       +-------------------+ 2 KB boundary
       |     <Padding>     |
       +-------------------+
       |  Read-only data   |
       +-------------------+
       |       Code        |
       +-------------------+ BLx_BASE

Note: The 2KB alignment for the exception vectors is an architectural
requirement.

The read-write data start on a new memory page so that they can be mapped with
read-write permissions, whereas the code and read-only data below are configured
as read-only.

However, the read-only data are not aligned on a page boundary. They are
contiguous to the code. Therefore, the end of the code section and the beginning
of the read-only data might share a memory page. This forces both to be
mapped with the same memory attributes. As the code needs to be executable, this
means that the read-only data stored on the same memory page as the code are
executable as well. This could potentially be exploited as part of a security
attack.

TF provides the build flag ``SEPARATE_CODE_AND_RODATA`` to isolate the code and
read-only data on separate memory pages. This in turn allows independent control
of the access permissions for the code and read-only data. In this case,
platform code gets a finer-grained view of the image layout and can
appropriately map the code region as executable and the read-only data as
execute-never.

This has an impact on memory footprint, as padding bytes need to be introduced
between the code and read-only data to ensure the segregation of the two. To
limit the memory cost, this flag also changes the memory layout such that the
code and exception vectors are now contiguous, like so:

::

       |        ...        |
       +-------------------+
       |  Read-write data  |
       +-------------------+ Page boundary
       |     <Padding>     |
       +-------------------+
       |  Read-only data   |
       +-------------------+ Page boundary
       |     <Padding>     |
       +-------------------+
       | Exception vectors |
       +-------------------+ 2 KB boundary
       |     <Padding>     |
       +-------------------+
       |       Code        |
       +-------------------+ BLx_BASE

With this more condensed memory layout, the separation of read-only data will
add zero or one page to the memory footprint of each BL image. Each platform
should consider the trade-off between memory footprint and security.

This build flag is disabled by default, minimising memory footprint. On ARM
platforms, it is enabled.
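
The footprint cost of the two layouts above can be worked through with a small
sketch. It is illustrative only and does not reflect how the linker scripts are
written. It assumes a 4 KB page, a page-aligned ``BLx_BASE``, and an exception
vector table of exactly 2 KB; the ``EXAMPLE_*`` macros and ``example_*``
functions are hypothetical names. Each function returns the offset from
``BLx_BASE`` at which the read-write data would start.

.. code:: c

    #include <assert.h>
    #include <stddef.h>

    /* Assumed values for illustration. */
    #define EXAMPLE_PAGE_SIZE     0x1000U
    #define EXAMPLE_VECTOR_ALIGN  0x800U
    #define EXAMPLE_VECTOR_SIZE   0x800U

    /* Round 'value' up to the next multiple of 'align' (a power of two). */
    static inline size_t example_round_up(size_t value, size_t align)
    {
        assert(align != 0U && (align & (align - 1U)) == 0U);
        return (value + align - 1U) & ~(align - 1U);
    }

    /* Default layout: code, read-only data, vectors, read-write data. */
    static size_t example_rw_offset_default(size_t code, size_t rodata)
    {
        size_t vectors = example_round_up(code + rodata, EXAMPLE_VECTOR_ALIGN);

        return example_round_up(vectors + EXAMPLE_VECTOR_SIZE,
                                EXAMPLE_PAGE_SIZE);
    }

    /* Separated layout: code, vectors, page-aligned read-only data,
     * read-write data. */
    static size_t example_rw_offset_separated(size_t code, size_t rodata)
    {
        size_t vectors = example_round_up(code, EXAMPLE_VECTOR_ALIGN);
        size_t ro_start = example_round_up(vectors + EXAMPLE_VECTOR_SIZE,
                                           EXAMPLE_PAGE_SIZE);

        return example_round_up(ro_start + rodata, EXAMPLE_PAGE_SIZE);
    }

For example, with ``0x700`` bytes of code and ``0x200`` bytes of read-only data
both layouts place the read-write data at offset ``0x2000``, so the separation
costs nothing. With ``0x801`` bytes of code and a single byte of read-only data,
the separated layout needs ``0x3000`` against ``0x2000``, which is one extra
page.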

Publish and Subscribe Framework
-------------------------------

The Publish and Subscribe Framework allows EL3 components to define and publish
events, to which other EL3 components can subscribe.

The following macros are provided by the framework:

- ``REGISTER_PUBSUB_EVENT(event)``: Defines an event, and takes one argument,
  the event name, which must be a valid C identifier. All calls to the
  ``REGISTER_PUBSUB_EVENT`` macro must be placed in the file
  ``pubsub_events.h``.

- ``PUBLISH_EVENT_ARG(event, arg)``: Publishes a defined event, by iterating
  subscribed handlers and calling them in turn. The handlers will be passed the
  parameter ``arg``. The expected use-case is to broadcast an event.

- ``PUBLISH_EVENT(event)``: Like ``PUBLISH_EVENT_ARG``, except that the value
  ``NULL`` is passed to subscribed handlers.

- ``SUBSCRIBE_TO_EVENT(event, handler)``: Registers the ``handler`` to
  subscribe to ``event``. The handler will be executed whenever the ``event``
  is published.

- ``for_each_subscriber(event, subscriber)``: Iterates through all handlers
  subscribed for ``event``. ``subscriber`` must be a local variable of type
  ``pubsub_cb_t *``, and will point to each subscribed handler in turn during
  iteration. This macro can be used for those patterns that none of the
  ``PUBLISH_EVENT_*()`` macros cover.

Publishing an event that wasn't defined using ``REGISTER_PUBSUB_EVENT`` will
result in a build error. Subscribing to an undefined event however won't.

Subscribed handlers must be of type ``pubsub_cb_t``, with the following function
signature:

::

    typedef void* (*pubsub_cb_t)(const void *arg);

There may be an arbitrary number of handlers registered to the same event. The
order in which subscribed handlers are notified when that event is published is
not defined. Subscribed handlers may be executed in any order; handlers should
not assume any relative ordering amongst them.

Publishing an event on a PE will result in subscribed handlers executing on that
PE only; it won't cause handlers to execute on a different PE.

Note that publishing an event on a PE blocks until all the subscribed handlers
finish executing on the PE.

ARM Trusted Firmware generic code publishes and subscribes to some events
within. Platform ports are discouraged from subscribing to them. These events
may be withdrawn, renamed, or have their semantics altered in the future.
Platforms may however register, publish, and subscribe to platform-specific
events.

Publish and Subscribe Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A publisher that wants to publish event ``foo`` would:

- Define the event ``foo`` in ``pubsub_events.h``.

  ::

      REGISTER_PUBSUB_EVENT(foo);

- Depending on the nature of the event, use one of the ``PUBLISH_EVENT_*()``
  macros to publish the event at the appropriate path and time of execution.

A subscriber that wants to subscribe to event ``foo`` published above would
implement:

::

    void *foo_handler(const void *arg)
    {
         void *result = NULL;

         /* Do handling ... */

         return result;
    }

    SUBSCRIBE_TO_EVENT(foo, foo_handler);

Performance Measurement Framework
---------------------------------

The Performance Measurement Framework (PMF) facilitates collection of
timestamps by registered services and provides interfaces to retrieve
them from within the ARM Trusted Firmware. A platform can choose to
expose appropriate SMCs to retrieve these collected timestamps.

By default, the global physical counter is used for the timestamp
value and is read via ``CNTPCT_EL0``. The framework allows retrieval of
timestamps captured by other CPUs.

Timestamp identifier format
~~~~~~~~~~~~~~~~~~~~~~~~~~~

A PMF timestamp is uniquely identified across the system via the
timestamp ID or ``tid``. The ``tid`` is composed as follows:

::

    Bits 0-7: The local timestamp identifier.
    Bits 8-9: Reserved.
    Bits 10-15: The service identifier.
    Bits 16-31: Reserved.

#. The service identifier. Each PMF service is identified by a
   service name and a service identifier. Both the service name and
   identifier are unique within the system as a whole.

#. The local timestamp identifier. This identifier is unique within a given
   service.

Registering a PMF service
~~~~~~~~~~~~~~~~~~~~~~~~~

To register a PMF service, the ``PMF_REGISTER_SERVICE()`` macro from ``pmf.h``
is used. The arguments required are the service name, the service ID,
the total number of local timestamps to be captured and a set of flags.

The ``flags`` field can be specified as a bitwise-OR of the following values:

::

    PMF_STORE_ENABLE: The timestamp is stored in memory for later retrieval.
    PMF_DUMP_ENABLE: The timestamp is dumped on the serial console.

The ``PMF_REGISTER_SERVICE()`` macro reserves memory to store captured
timestamps in a PMF specific linker section at build time.
Additionally, it defines the functions necessary to capture and
retrieve a particular timestamp for the given service at runtime.

The macro ``PMF_REGISTER_SERVICE()`` only enables capturing PMF
timestamps from within ARM Trusted Firmware. In order to retrieve
timestamps from outside of ARM Trusted Firmware, the
``PMF_REGISTER_SERVICE_SMC()`` macro must be used instead. This macro
accepts the same set of arguments as the ``PMF_REGISTER_SERVICE()``
macro but additionally supports retrieving timestamps using SMCs.
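
The ``tid`` bit layout given in the timestamp identifier format above can be
sketched as a pair of pack and unpack helpers. This is a minimal illustration
under the stated bit positions only. The ``EXAMPLE_*`` macros and
``example_pmf_*`` functions are hypothetical names and are not part of
``pmf.h``.

.. code:: c

    #include <assert.h>
    #include <stdint.h>

    /* Field positions taken from the tid layout; names are hypothetical. */
    #define EXAMPLE_TID_LOCAL_MASK  0xFFU  /* Bits 0-7   */
    #define EXAMPLE_TID_SVC_SHIFT   10U    /* Bits 10-15 */
    #define EXAMPLE_TID_SVC_MASK    0x3FU

    /* Compose a tid from a service identifier and a local timestamp
     * identifier. Reserved bits are left as zero. */
    static inline uint32_t example_pmf_make_tid(unsigned int svc_id,
                                                unsigned int local_id)
    {
        assert(svc_id <= EXAMPLE_TID_SVC_MASK);
        assert(local_id <= EXAMPLE_TID_LOCAL_MASK);

        return ((uint32_t)svc_id << EXAMPLE_TID_SVC_SHIFT) | (uint32_t)local_id;
    }

    /* Extract the service identifier from a tid. */
    static inline unsigned int example_pmf_tid_service(uint32_t tid)
    {
        return (tid >> EXAMPLE_TID_SVC_SHIFT) & EXAMPLE_TID_SVC_MASK;
    }

    /* Extract the local timestamp identifier from a tid. */
    static inline unsigned int example_pmf_tid_local(uint32_t tid)
    {
        return tid & EXAMPLE_TID_LOCAL_MASK;
    }

For instance, service identifier ``1`` with local timestamp identifier ``5``
gives a ``tid`` of ``0x405``. Because the service identifier is unique
system-wide and the local identifier is unique within its service, the combined
value identifies one timestamp across the system.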

Capturing a timestamp
~~~~~~~~~~~~~~~~~~~~~

PMF timestamps are stored in a per-service timestamp region. On a
system with multiple CPUs, each timestamp is captured and stored
in a per-CPU cache line aligned memory region.

Having registered the service, the ``PMF_CAPTURE_TIMESTAMP()`` macro can be
used to capture a timestamp at the location where it is used. The macro
takes the service name, a local timestamp identifier and a flag as arguments.

The ``flags`` field argument can be zero, or ``PMF_CACHE_MAINT`` which
instructs PMF to do cache maintenance following the capture. Cache
maintenance is required if any of the service's timestamps are captured
with data cache disabled.

To capture a timestamp in assembly code, the caller should use the
``pmf_calc_timestamp_addr`` macro (defined in ``pmf_asm_macros.S``) to
calculate the address where the timestamp will be stored. The
caller should then read the ``CNTPCT_EL0`` register to obtain the timestamp
and store it at the determined address for later retrieval.

Retrieving a timestamp
~~~~~~~~~~~~~~~~~~~~~~

From within ARM Trusted Firmware, timestamps for individual CPUs can
be retrieved using either the ``PMF_GET_TIMESTAMP_BY_MPIDR()`` or
``PMF_GET_TIMESTAMP_BY_INDEX()`` macros. These macros accept the CPU's MPIDR
value, or its ordinal position, respectively.

From outside ARM Trusted Firmware, timestamps for individual CPUs can be
retrieved by calling into ``pmf_smc_handler()``.

.. code:: c

    Interface : pmf_smc_handler()
    Argument  : unsigned int smc_fid, u_register_t x1,
                u_register_t x2, u_register_t x3,
                u_register_t x4, void *cookie,
                void *handle, u_register_t flags
    Return    : uintptr_t

    smc_fid: Holds the SMC identifier which is either `PMF_SMC_GET_TIMESTAMP_32`
        when the caller of the SMC is running in AArch32 mode
        or `PMF_SMC_GET_TIMESTAMP_64` when the caller is running in AArch64 mode.
    x1: Timestamp identifier.
    x2: The `mpidr` of the CPU for which the timestamp has to be retrieved.
        This can be the `mpidr` of a different core to the one initiating
        the SMC.  In that case, service specific cache maintenance may be
        required to ensure the updated copy of the timestamp is returned.
    x3: A flags value that is either 0 or `PMF_CACHE_MAINT`.  If
        `PMF_CACHE_MAINT` is passed, then the PMF code will perform a
        cache invalidate before reading the timestamp.  This ensures
        an updated copy is returned.

The remaining arguments, ``x4``, ``cookie``, ``handle`` and ``flags`` are unused
in this implementation.

PMF code structure
~~~~~~~~~~~~~~~~~~

#. ``pmf_main.c`` consists of core functions that implement service registration,
   initialization, storing, dumping and retrieving timestamps.

#. ``pmf_smc.c`` contains the SMC handling for registered PMF services.

#. ``pmf.h`` contains the public interface to the Performance Measurement
   Framework.

#. ``pmf_asm_macros.S`` consists of macros to facilitate capturing timestamps in
   assembly code.

#. ``pmf_helpers.h`` is an internal header used by ``pmf.h``.

ARMv8 Architecture Extensions
-----------------------------

ARM Trusted Firmware makes use of ARMv8 Architecture Extensions where
applicable. This section lists the usage of Architecture Extensions, and build
flags controlling them.

In general, and unless individually mentioned, the build options
``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` select the Architecture Extension to
target when building ARM Trusted Firmware. Subsequent ARM Architecture
Extensions are backward compatible with previous versions.

The build system only requires that ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` have a
valid numeric value. These build options only control whether or not
Architecture Extension-specific code is included in the build. Otherwise, ARM
Trusted Firmware targets the base ARMv8.0 architecture; i.e. as if
``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` == 0, which are also their respective
default values.

See also the *Summary of build options* in `User Guide`_.

For details on the Architecture Extension and available features, please refer
to the respective Architecture Extension Supplement.

ARMv8.1
~~~~~~~

This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` > 8, or when
``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` >= 1.

- The Compare and Swap instruction is used to implement spinlocks. Otherwise,
  the load-/store-exclusive instruction pair is used.

ARMv8.2
~~~~~~~

This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` == 8 and
``ARM_ARCH_MINOR`` >= 2.

- The Common not Private (CnP) bit is enabled to indicate that multiple
  Processing Elements (PEs) in the same Inner Shareable domain use the same
  translation table entries for a given stage of translation for a particular
  translation regime.

Code Structure
--------------

Trusted Firmware code is logically divided between the three boot loader
stages mentioned in the previous sections.
The code is also divided into the
following categories (present as directories in the source code):

- **Platform specific.** Choice of architecture specific code depends upon
  the platform.
- **Common code.** This is platform and architecture agnostic code.
- **Library code.** This code comprises functionality commonly used by all
  other code. The PSCI implementation and other EL3 runtime frameworks reside
  as Library components.
- **Stage specific.** Code specific to a boot stage.
- **Drivers.**
- **Services.** EL3 runtime services (e.g. SPD). Specific SPD services
  reside in the ``services/spd`` directory (e.g. ``services/spd/tspd``).

Each boot loader stage uses code from one or more of the above mentioned
categories. Based upon the above, the code layout looks like this:

::

    Directory    Used by BL1?    Used by BL2?    Used by BL31?
    bl1          Yes             No              No
    bl2          No              Yes             No
    bl31         No              No              Yes
    plat         Yes             Yes             Yes
    drivers      Yes             No              Yes
    common       Yes             Yes             Yes
    lib          Yes             Yes             Yes
    services     No              No              Yes

The build system provides a non-configurable build option IMAGE\_BLx for each
boot loader stage (where x = BL stage). For example, for BL1, IMAGE\_BL1 will be
defined by the build system. This enables the Trusted Firmware to compile
certain code only for specific boot loader stages.

All assembler files have the ``.S`` extension. The linker source files for each
boot stage have the extension ``.ld.S``. These are processed by GCC to create the
linker scripts which have the extension ``.ld``.

FDTs provide a description of the hardware platform and are used by the Linux
kernel at boot time. These can be found in the ``fdts`` directory.

References
----------

.. [#] Trusted Board Boot Requirements CLIENT PDD (ARM DEN0006C-1). Available
       under NDA through your ARM account representative.
.. [#] `Power State Coordination Interface PDD`_
.. [#] `SMC Calling Convention PDD`_
.. [#] `ARM Trusted Firmware Interrupt Management Design guide`_.

--------------

*Copyright (c) 2013-2017, ARM Limited and Contributors. All rights reserved.*

.. _Reset Design: ./reset-design.rst
.. _Porting Guide: ./porting-guide.rst
.. _Firmware Update: ./firmware-update.rst
.. _PSCI PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
.. _SMC calling convention PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
.. _PSCI Library integration guide: ./psci-lib-integration-guide.rst
.. _SMCCC: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
.. _PSCI: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
.. _Power State Coordination Interface PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
.. _here: ./psci-lib-integration-guide.rst
.. _cpu-specific-build-macros.rst: ./cpu-specific-build-macros.rst
.. _CPUBM: ./cpu-specific-build-macros.rst
.. _ARM ARM: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.e/index.html
.. _User Guide: ./user-guide.rst
.. _SMC Calling Convention PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
.. _ARM Trusted Firmware Interrupt Management Design guide: ./interrupt-framework-design.rst
.. _Xlat_tables design: xlat-tables-lib-v2-design.rst

.. |Image 1| image:: diagrams/rt-svc-descs-layout.png?raw=true