Bare-metal CI
=============

The bare-metal scripts run on a system with gitlab-runner and Docker,
connected to potentially multiple bare-metal boards that run tests of
Mesa. Currently "fastboot", "ChromeOS Servo", and POE-powered devices are
supported.

In comparison with LAVA, this doesn't involve maintaining a separate
web service with its own job scheduler and replicating jobs between the
two. It also places more of the board support in Git, instead of
web service configuration. On the other hand, the serial interactions
and bootloader support are more primitive.

Requirements (fastboot)
-----------------------

This testing requires power control of the devices under test (DUTs) by the
gitlab-runner machine, since this is what we use to reset the system and get
back to a pristine state at the start of testing.

We require access to the console output from the gitlab-runner system,
since that is how we get the final results back from the tests. You
should probably have the console on a serial connection, so that you
can see bootloader progress.

The boards need to be able to have a kernel/initramfs supplied by the
gitlab-runner system, since Mesa often needs to update the kernel either for new
DRM functionality, or to fix kernel bugs.

The boards must have networking, so that we can extract the dEQP .xml results to
artifacts on GitLab, and so that we can download traces (too large for an
initramfs) for trace replay testing. Given that we need networking already, and
our deqp/piglit/etc. payload is large, we use NFS from the x86 runner system
rather than initramfs.

See `src/freedreno/ci/gitlab-ci.yml` for an example of fastboot on DB410c and
DB820c (freedreno-a306 and freedreno-a530).

Requirements (servo)
--------------------

For servo-connected boards, we can use the EC connection for power
control to reboot the board.
However, loading a kernel is not as easy
as fastboot, so we assume your bootloader can do TFTP, and that your
gitlab-runner mounts the runner's TFTP directory specific to the board
at /tftp in the container.

Since we're going the TFTP route, we also use NFS root. This avoids
packing the rootfs and sending it to the board as a ramdisk, which
means we can support larger rootfses (for piglit testing), at the cost
of needing more storage on the runner.

Telling the board where its TFTP and NFS should come from is
done using dnsmasq on the runner host. For example, here is a snippet from
the dnsmasq.conf.d in the Google farm, with the gitlab-runner host we
call "servo"::

   dhcp-host=1c:69:7a:0d:a3:d3,10.42.0.10,set:servo

   # Fixed dhcp addresses for my sanity, and setting a tag for
   # specializing other DHCP options
   dhcp-host=a0:ce:c8:c8:d9:5d,10.42.0.11,set:cheza1
   dhcp-host=a0:ce:c8:c8:d8:81,10.42.0.12,set:cheza2

   # Specify the next server, watch out for the double ',,'. The
   # filename didn't seem to get picked up by the bootloader, so we use
   # tftp-unique-root and mount directories like
   # /srv/tftp/10.42.0.11/jwerner/cheza as /tftp in the job containers.
   tftp-unique-root
   dhcp-boot=tag:cheza1,cheza1/vmlinuz,,10.42.0.10
   dhcp-boot=tag:cheza2,cheza2/vmlinuz,,10.42.0.10

   dhcp-option=tag:cheza1,option:root-path,/srv/nfs/cheza1
   dhcp-option=tag:cheza2,option:root-path,/srv/nfs/cheza2

See `src/freedreno/ci/gitlab-ci.yml` for an example of servo on cheza. Note
that other servo boards in CI are managed using LAVA.

Requirements (POE)
------------------

For boards with 30W or less power consumption, POE can be used for the power
control. The parts list ends up looking something like (for example):

- x86-64 gitlab-runner machine with a mid-range CPU, and 3+ GB of SSD storage
  per board. This can host at least 15 boards in our experience.
- Cisco 2960S gigabit ethernet switch with POE. (Cisco 3750G, 3560G, or 2960G
  were also recommended as reasonably priced hardware, but make sure the name
  ends in G, X, or S)
- POE splitters to power the boards (you can find ones that go to micro USB,
  USB-C, and 5V barrel jacks at least)
- USB serial cables (Adafruit sells pretty reliable ones)
- A large powered USB hub for all the serial cables
- A pile of ethernet cables

You'll talk to the Cisco for configuration using its USB port, which provides a
serial terminal at 9600 baud. You need to enable SNMP control, which we'll do
using a "mesaci" community name that the gitlab runner can access as its
authentication (no password) to configure. To talk to the SNMP on the switch,
you need to put an IP address on the default VLAN (VLAN 1).

Setting that up looks something like:

.. code-block:: console

   Switch>enable
   Password:
   Switch#configure terminal
   Switch(config)#interface Vlan 1
   Switch(config-if)#ip address 10.42.0.2 255.255.0.0
   Switch(config-if)#exit
   Switch(config)#snmp-server community mesaci RW
   Switch(config)#end
   Switch#copy running-config startup-config

With that set up, you should be able to power on/off a port with something like:

.. code-block:: console

   % snmpset -v2c -r 3 -t 30 -cmesaci 10.42.0.2 1.3.6.1.4.1.9.9.402.1.2.1.1.1.1 i 1
   % snmpset -v2c -r 3 -t 30 -cmesaci 10.42.0.2 1.3.6.1.4.1.9.9.402.1.2.1.1.1.1 i 4

Note that the "1.3.6..." SNMP OID changes between switches. The last digit
above is the interface id (port number). You can probably find the right OID by
searching online, which was easier than working it out from the switch's MIB
database. You can query the POE status from the switch serial console using the
`show power inline` command.

Other than that, follow the dnsmasq/TFTP/NFS setup for your boards described
under "servo" above.
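
The port power control above can be wrapped in a small helper. The following is
a minimal sketch, not the script Mesa CI actually uses: the function names are
hypothetical, the OID prefix is copied from the example above (it will differ
on other switches), and it assumes that value ``1`` powers a port on and ``4``
powers it off, following the order of the two commands above.

.. code-block:: python

   import shlex

   # OID prefix copied from the snmpset example above. It differs between
   # switch models, so treat it as an assumption for your hardware.
   POE_OID_PREFIX = "1.3.6.1.4.1.9.9.402.1.2.1.1.1"

   # Assumed from the order of the two example commands: 1 = on, 4 = off.
   POE_ON = 1
   POE_OFF = 4


   def poe_snmpset_command(switch, port, power_on, community="mesaci"):
       """Return the snmpset argv that turns POE on or off for one port."""
       if port < 1:
           raise ValueError("port numbers start at 1")
       # The port number is appended as the last component of the OID.
       oid = f"{POE_OID_PREFIX}.{port}"
       value = POE_ON if power_on else POE_OFF
       return [
           "snmpset", "-v2c", "-r", "3", "-t", "30",
           f"-c{community}", switch, oid, "i", str(value),
       ]


   def power_cycle_commands(switch, port):
       """Return the off-then-on command pair used to reset a board."""
       return [
           poe_snmpset_command(switch, port, power_on=False),
           poe_snmpset_command(switch, port, power_on=True),
       ]


   if __name__ == "__main__":
       # Print the commands instead of running them, so the sketch is safe
       # to try on a machine without snmpset or a switch attached.
       for argv in power_cycle_commands("10.42.0.2", 1):
           print(shlex.join(argv))

To actually toggle a port, each returned list could be passed to
``subprocess.run(argv, check=True)`` on the runner host.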

See `src/broadcom/ci/gitlab-ci.yml` and `src/nouveau/ci/gitlab-ci.yml` for
examples of POE for Raspberry Pi 3/4, and Jetson Nano.

Setup
-----

Each board will be registered in freedesktop.org GitLab. You'll want
something like this to register a fastboot board:

.. code-block:: console

   sudo gitlab-runner register \
        --url https://gitlab.freedesktop.org \
        --registration-token $1 \
        --name MY_BOARD_NAME \
        --tag-list MY_BOARD_TAG \
        --executor docker \
        --docker-image "alpine:latest" \
        --docker-volumes "/dev:/dev" \
        --docker-network-mode "host" \
        --docker-privileged \
        --non-interactive

For a servo board, you'll need to also volume mount the board's NFS
root dir at /nfs and TFTP kernel directory at /tftp.

The registration token has to come from a freedesktop.org GitLab admin
going to https://gitlab.freedesktop.org/admin/runners.

The name scheme for Google's lab is google-freedreno-boardname-n, and
our tag is something like google-freedreno-db410c. The tag is what
identifies a board type so that board-specific jobs can be dispatched
into that pool.

We need privileged mode and the /dev bind mount in order to get at the
serial console and fastboot USB devices (--device arguments don't
apply to devices that show up after container start, which is the case
with fastboot, and the servo serial devices are actually links to
/dev/pts). We use host network mode so that we can spin up an nginx
server to collect XML results for fastboot.

Once you've added your boards, you're going to need to add a little
more customization in ``/etc/gitlab-runner/config.toml``. First, add
``concurrent = <number of boards>`` at the top ("we should have up to
this many jobs running managed by this gitlab-runner"). Then for each
board's runner, set ``limit = 1`` ("only 1 job served by this board at a
time").
Finally, add the board-specific environment variables
required by your bare-metal script, something like::

   [[runners]]
   name = "google-freedreno-db410c-1"
   environment = ["BM_SERIAL=/dev/ttyDB410c8", "BM_POWERUP=google-power-up.sh 8", "BM_FASTBOOT_SERIAL=15e9e390", "FDO_CI_CONCURRENT=4"]

The ``FDO_CI_CONCURRENT`` variable should be set to the number of CPU threads on
the board, which is used for auto-tuning of job parallelism.

Once you've updated your runners' configs, restart with ``sudo service
gitlab-runner restart``.

Caching downloads
-----------------

To improve the runtime for downloading traces during trace job runs, you will
want a pass-through HTTP cache. On your runner box, install nginx:

.. code-block:: console

   sudo apt install nginx libnginx-mod-http-lua

Add the server setup files:

.. literalinclude:: fdo-cache
   :name: /etc/nginx/sites-available/fdo-cache
   :caption: /etc/nginx/sites-available/fdo-cache

.. literalinclude:: uri-caching.conf
   :name: /etc/nginx/snippets/uri-caching.conf
   :caption: /etc/nginx/snippets/uri-caching.conf

Edit the listener addresses in fdo-cache to suit the ethernet interface that
your devices are on.

Enable the site and restart nginx:

.. code-block:: console

   sudo ln -s /etc/nginx/sites-available/fdo-cache /etc/nginx/sites-enabled/fdo-cache
   sudo service nginx restart

   # First download will hit the internet
   wget "http://localhost/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public/itoral-gl-terrain-demo/demo.trace"
   # Second download should be cached.
   wget "http://localhost/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public/itoral-gl-terrain-demo/demo.trace"

Now, set ``download-url`` in your ``traces-*.yml`` entry to something like
``http://10.42.0.1:8888/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public``
and you should have cached downloads for traces. Add the cache URL to
``FDO_HTTP_CACHE_URI=`` in your ``config.toml`` runner environment lines and you
can use it for cached artifact downloads instead of going all the way to
freedesktop.org on each job.
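
To illustrate how the cached URLs above are composed, here is a minimal sketch.
The helper names ``cached_url`` and ``trace_url`` are hypothetical and not part
of Mesa CI. It assumes, as in the wget examples, that the upstream URL is
appended to ``/cache/?uri=`` without any escaping.

.. code-block:: python

   def cached_url(cache_base, upstream_url):
       """Prefix an upstream URL with the pass-through cache endpoint."""
       # The upstream URL is appended unescaped, matching the wget examples.
       return f"{cache_base.rstrip('/')}/cache/?uri={upstream_url}"


   def trace_url(download_url, trace_path):
       """Join a download-url value with a trace path relative to it."""
       return f"{download_url.rstrip('/')}/{trace_path.lstrip('/')}"


   if __name__ == "__main__":
       # Addresses and paths are the example values used in this section.
       base = cached_url(
           "http://10.42.0.1:8888",
           "https://s3.freedesktop.org/mesa-tracie-public",
       )
       print(base)
       print(trace_url(base, "itoral-gl-terrain-demo/demo.trace"))

The first printed value has the shape of the ``download-url`` setting described
above, and the second is the kind of URL a board would end up fetching through
the cache.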