Bare-metal CI
=============

The bare-metal scripts run on a system with gitlab-runner and Docker,
connected to potentially multiple bare-metal boards that run tests of
Mesa.  Currently "fastboot", "ChromeOS Servo", and POE-powered devices are
supported.

In comparison with LAVA, this doesn't involve maintaining a separate
web service with its own job scheduler and replicating jobs between the
two.  It also places more of the board support in Git, instead of
web service configuration.  On the other hand, the serial interactions
and bootloader support are more primitive.

Requirements (fastboot)
-----------------------

This testing requires power control of the devices under test (DUTs) by the
gitlab-runner machine, since this is what we use to reset the system and get
back to a pristine state at the start of testing.

We require access to the console output from the gitlab-runner system,
since that is how we get the final results back from the tests.  You
should probably have the console on a serial connection, so that you
can see bootloader progress.

The boards need to be able to have a kernel/initramfs supplied by the
gitlab-runner system, since Mesa often needs to update the kernel either for new
DRM functionality, or to fix kernel bugs.

The boards must have networking, so that we can extract the dEQP .xml results to
artifacts on GitLab, and so that we can download traces (too large for an
initramfs) for trace replay testing.  Given that we need networking already, and
our deqp/piglit/etc. payload is large, we use NFS from the x86 runner system
rather than initramfs.

See ``src/freedreno/ci/gitlab-ci.yml`` for an example of fastboot on DB410c and
DB820c (freedreno-a306 and freedreno-a530).

Requirements (servo)
--------------------

For servo-connected boards, we can use the EC connection for power
control to reboot the board.  However, loading a kernel is not as easy
as fastboot, so we assume your bootloader can do TFTP, and that your
gitlab-runner mounts the runner's tftp directory specific to the board
at /tftp in the container.

Since we're going the TFTP route, we also use NFS root.  This avoids
packing the rootfs and sending it to the board as a ramdisk, which
means we can support larger rootfses (for piglit testing), at the cost
of needing more storage on the runner.

Telling the board where its TFTP and NFS should come from is
done using dnsmasq on the runner host.  For example, this snippet is from
the dnsmasq.conf.d in the Google farm, where the gitlab-runner host is
called "servo"::

  dhcp-host=1c:69:7a:0d:a3:d3,10.42.0.10,set:servo

  # Fixed dhcp addresses for my sanity, and setting a tag for
  # specializing other DHCP options
  dhcp-host=a0:ce:c8:c8:d9:5d,10.42.0.11,set:cheza1
  dhcp-host=a0:ce:c8:c8:d8:81,10.42.0.12,set:cheza2

  # Specify the next server, watch out for the double ',,'.  The
  # filename didn't seem to get picked up by the bootloader, so we use
  # tftp-unique-root and mount directories like
  # /srv/tftp/10.42.0.11/jwerner/cheza as /tftp in the job containers.
  tftp-unique-root
  dhcp-boot=tag:cheza1,cheza1/vmlinuz,,10.42.0.10
  dhcp-boot=tag:cheza2,cheza2/vmlinuz,,10.42.0.10

  dhcp-option=tag:cheza1,option:root-path,/srv/nfs/cheza1
  dhcp-option=tag:cheza2,option:root-path,/srv/nfs/cheza2
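
With more than a couple of boards, the per-board lines follow a regular
pattern and could be generated instead of written by hand.  The following is
a minimal Python sketch, not part of Mesa's scripts: ``Board`` and
``dnsmasq_lines`` are hypothetical names, and the addresses are simply the
ones from the snippet above.

.. code-block:: python

  from __future__ import annotations

  from dataclasses import dataclass


  @dataclass(frozen=True)
  class Board:
      """One DUT as it appears in the dnsmasq snippet (hypothetical type)."""
      tag: str  # dnsmasq tag, reused as the TFTP and NFS directory name
      mac: str
      ip: str


  def dnsmasq_lines(board: Board, tftp_server: str,
                    nfs_base: str = "/srv/nfs") -> list[str]:
      """Return the three per-board dnsmasq lines used in the example."""
      return [
          f"dhcp-host={board.mac},{board.ip},set:{board.tag}",
          # The double ',,' leaves the server-name field empty.
          f"dhcp-boot=tag:{board.tag},{board.tag}/vmlinuz,,{tftp_server}",
          f"dhcp-option=tag:{board.tag},option:root-path,{nfs_base}/{board.tag}",
      ]


  cheza1 = Board(tag="cheza1", mac="a0:ce:c8:c8:d9:5d", ip="10.42.0.11")
  print("\n".join(dnsmasq_lines(cheza1, tftp_server="10.42.0.10")))

Global options such as ``tftp-unique-root`` and the ``dhcp-host`` entry for
the runner itself are not per-board, so they would still be written once by
hand.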

See ``src/freedreno/ci/gitlab-ci.yml`` for an example of servo on cheza.  Note
that other servo boards in CI are managed using LAVA.

Requirements (POE)
------------------

For boards with 30W or less power consumption, POE can be used for the power
control.  A parts list ends up looking something like this:

- x86-64 gitlab-runner machine with a mid-range CPU, and 3+ GB of SSD storage
  per board.  This can host at least 15 boards in our experience.
- Cisco 2960S gigabit ethernet switch with POE. (Cisco 3750G, 3560G, or 2960G
  were also recommended as reasonably priced HW, but make sure the name ends in
  G, X, or S)
- POE splitters to power the boards (you can find ones that go to micro USB,
  USB-C, and 5V barrel jacks at least)
- USB serial cables (Adafruit sells pretty reliable ones)
- A large powered USB hub for all the serial cables
- A pile of ethernet cables

You'll talk to the Cisco for configuration using its USB port, which provides a
serial terminal at 9600 baud.  You need to enable SNMP control, which we'll do
using a "mesaci" community name that the gitlab-runner uses as its
authentication (no password) to configure the switch.  To talk to SNMP on the
switch, you need to put an IP address on the default VLAN (VLAN 1).

Setting that up looks something like:

.. code-block:: console

  Switch>enable
  Password:
  Switch#configure terminal
  Switch(config)#interface Vlan 1
  Switch(config-if)#ip address 10.42.0.2 255.255.0.0
  Switch(config-if)#exit
  Switch(config)#snmp-server community mesaci RW
  Switch(config)#end
  Switch#copy running-config startup-config

With that set up, you should be able to power on/off a port with something like:

.. code-block:: console

  % snmpset -v2c -r 3 -t 30 -cmesaci 10.42.0.2 1.3.6.1.4.1.9.9.402.1.2.1.1.1.1 i 1
  % snmpset -v2c -r 3 -t 30 -cmesaci 10.42.0.2 1.3.6.1.4.1.9.9.402.1.2.1.1.1.1 i 4

Note that the "1.3.6..." SNMP OID changes between switches.  The last component
above is the interface id (port number).  You can probably find the right OID
with a web search, which was easier than working it out from the switch's MIB
database.  You can query the POE status from the switch's serial console using
the ``show power inline`` command.
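
The two ``snmpset`` invocations above differ only in the port number and the
final value, so a power-cycle helper can be built around a single command
builder.  This is a minimal Python sketch under stated assumptions, not the
script Mesa uses: ``poe_command`` and ``power_cycle`` are hypothetical names,
the OID prefix is the one from the example (it will differ on other switches),
the mapping of ``1`` to "on" and ``4`` to "off" is inferred from the order of
the commands above, and the delay between off and on is an arbitrary choice.

.. code-block:: python

  from __future__ import annotations

  import subprocess
  import time

  # OID prefix copied from the commands above; the port number is appended.
  POE_OID_PREFIX = "1.3.6.1.4.1.9.9.402.1.2.1.1.1"
  POE_ON, POE_OFF = 1, 4  # values used by the two snmpset commands above


  def poe_command(switch: str, port: int, state: int,
                  community: str = "mesaci") -> list[str]:
      """Build the snmpset argv that sets the POE state of one port."""
      return ["snmpset", "-v2c", "-r", "3", "-t", "30", f"-c{community}",
              switch, f"{POE_OID_PREFIX}.{port}", "i", str(state)]


  def power_cycle(switch: str, port: int, delay: float = 3.0) -> None:
      """Switch the port off, wait, then on again (needs snmpset on PATH)."""
      subprocess.run(poe_command(switch, port, POE_OFF), check=True)
      time.sleep(delay)  # assumed settle time, tune for your boards
      subprocess.run(poe_command(switch, port, POE_ON), check=True)


  # Only print the command here; power_cycle() would talk to a real switch.
  print(" ".join(poe_command("10.42.0.2", 1, POE_OFF)))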

Other than that, follow the dnsmasq/TFTP/NFS setup for your boards described
under "Requirements (servo)" above.

See ``src/broadcom/ci/gitlab-ci.yml`` and ``src/nouveau/ci/gitlab-ci.yml`` for
examples of POE for Raspberry Pi 3/4, and Jetson Nano.

Setup
-----

Each board will be registered in freedesktop.org GitLab.  You'll want
something like this to register a fastboot board:

.. code-block:: console

  sudo gitlab-runner register \
       --url https://gitlab.freedesktop.org \
       --registration-token $1 \
       --name MY_BOARD_NAME \
       --tag-list MY_BOARD_TAG \
       --executor docker \
       --docker-image "alpine:latest" \
       --docker-volumes "/dev:/dev" \
       --docker-network-mode "host" \
       --docker-privileged \
       --non-interactive

For a servo board, you'll need to also volume mount the board's NFS
root dir at /nfs and TFTP kernel directory at /tftp.
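
As an illustration of the extra mounts, the sketch below builds the same
``gitlab-runner register`` command line with additional ``--docker-volumes``
entries.  It is only a sketch: ``register_args`` is a hypothetical helper, the
host paths ``/srv/nfs/cheza1`` and ``/srv/tftp/cheza1`` are assumed locations
that must match your own dnsmasq/NFS/TFTP layout, and the token is a
placeholder.

.. code-block:: python

  from __future__ import annotations

  import shlex


  def register_args(name: str, tag: str, token: str,
                    volumes: list[tuple[str, str]]) -> list[str]:
      """Build a gitlab-runner register argv with one flag per bind mount."""
      args = ["gitlab-runner", "register",
              "--url", "https://gitlab.freedesktop.org",
              "--registration-token", token,
              "--name", name,
              "--tag-list", tag,
              "--executor", "docker",
              "--docker-image", "alpine:latest",
              "--docker-network-mode", "host",
              "--docker-privileged",
              "--non-interactive"]
      for host_path, container_path in volumes:
          args += ["--docker-volumes", f"{host_path}:{container_path}"]
      return args


  servo_args = register_args(
      name="MY_BOARD_NAME", tag="MY_BOARD_TAG", token="REGISTRATION_TOKEN",
      volumes=[("/dev", "/dev"),
               ("/srv/nfs/cheza1", "/nfs"),     # assumed host path
               ("/srv/tftp/cheza1", "/tftp")],  # assumed host path
  )
  print("sudo " + shlex.join(servo_args))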

The registration token has to come from a freedesktop.org GitLab admin
going to https://gitlab.freedesktop.org/admin/runners.

The name scheme for Google's lab is google-freedreno-boardname-n, and
our tag is something like google-freedreno-db410c.  The tag is what
identifies a board type so that board-specific jobs can be dispatched
into that pool.

We need privileged mode and the /dev bind mount in order to get at the
serial console and fastboot USB devices (--device arguments don't
apply to devices that show up after container start, which is the case
with fastboot, and the servo serial devices are actually links to
/dev/pts).  We use host network mode so that we can spin up an nginx
server to collect XML results for fastboot.

Once you've added your boards, you're going to need to add a little
more customization in ``/etc/gitlab-runner/config.toml``.  First, add
``concurrent = <number of boards>`` at the top ("we should have up to
this many jobs running managed by this gitlab-runner").  Then for each
board's runner, set ``limit = 1`` ("only 1 job served by this board at a
time").  Finally, add the board-specific environment variables
required by your bare-metal script, something like::

  [[runners]]
    name = "google-freedreno-db410c-1"
    environment = ["BM_SERIAL=/dev/ttyDB410c8", "BM_POWERUP=google-power-up.sh 8", "BM_FASTBOOT_SERIAL=15e9e390", "FDO_CI_CONCURRENT=4"]

The ``FDO_CI_CONCURRENT`` variable should be set to the number of CPU threads on
the board, which is used for auto-tuning of job parallelism.

Once you've updated your runners' configs, restart the service with
``sudo service gitlab-runner restart``.
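
A typo in the ``environment`` list only shows up once a job fails on the
board, so it can be worth checking the entries before restarting the runner.
The following Python sketch shows one way to do that.  ``parse_environment``
and ``check_fastboot_runner`` are hypothetical helpers, and the list of
required variables is an assumption taken from the example above rather than
from the bare-metal scripts themselves.

.. code-block:: python

  from __future__ import annotations

  # Assumed from the example runner above; adjust to what your script needs.
  REQUIRED = ("BM_SERIAL", "BM_POWERUP", "BM_FASTBOOT_SERIAL",
              "FDO_CI_CONCURRENT")


  def parse_environment(entries: list[str]) -> dict[str, str]:
      """Split "KEY=VALUE" strings as found in config.toml into a dict."""
      env = {}
      for entry in entries:
          key, sep, value = entry.partition("=")
          if not sep or not key:
              raise ValueError(f"not a KEY=VALUE entry: {entry!r}")
          env[key] = value
      return env


  def check_fastboot_runner(entries: list[str]) -> dict[str, str]:
      """Raise ValueError if a required variable is missing or invalid."""
      env = parse_environment(entries)
      missing = [key for key in REQUIRED if key not in env]
      if missing:
          raise ValueError("missing variables: " + ", ".join(missing))
      if int(env["FDO_CI_CONCURRENT"]) < 1:
          raise ValueError("FDO_CI_CONCURRENT must be at least 1")
      return env


  env = check_fastboot_runner([
      "BM_SERIAL=/dev/ttyDB410c8",
      "BM_POWERUP=google-power-up.sh 8",
      "BM_FASTBOOT_SERIAL=15e9e390",
      "FDO_CI_CONCURRENT=4",
  ])
  print(env["BM_POWERUP"])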

Caching downloads
-----------------

To speed up trace downloads during trace job runs, you will
want a pass-through HTTP cache.  On your runner box, install nginx:

.. code-block:: console

  sudo apt install nginx libnginx-mod-http-lua

Add the server setup files:

.. literalinclude:: fdo-cache
   :name: /etc/nginx/sites-available/fdo-cache
   :caption: /etc/nginx/sites-available/fdo-cache

.. literalinclude:: uri-caching.conf
   :name: /etc/nginx/snippets/uri-caching.conf
   :caption: /etc/nginx/snippets/uri-caching.conf

Edit the listener addresses in fdo-cache to suit the ethernet interface that
your devices are on.

Enable the site and restart nginx:

.. code-block:: console

  sudo ln -s /etc/nginx/sites-available/fdo-cache /etc/nginx/sites-enabled/fdo-cache
  sudo service nginx restart

  # First download will hit the internet
  wget "http://localhost/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public/itoral-gl-terrain-demo/demo.trace"
  # Second download should be cached.
  wget "http://localhost/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public/itoral-gl-terrain-demo/demo.trace"

Now, set ``download-url`` in your ``traces-*.yml`` entry to something like
``http://10.42.0.1:8888/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public``
and you should have cached downloads for traces.  Add the cache URL to
``FDO_HTTP_CACHE_URI=`` in your ``config.toml`` runner environment lines and you
can use it for cached artifact downloads instead of going all the way to
freedesktop.org on each job.
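
The cache is addressed by prepending the cache endpoint to the original URL,
as the ``wget`` and ``download-url`` examples show.  A minimal Python sketch
of that rewriting follows.  ``cached_url`` is a hypothetical helper, and the
default prefix is the example address used above, which should be replaced by
whatever your fdo-cache site actually listens on.

.. code-block:: python

  # Example prefix from the text above; depends on your nginx listener.
  CACHE_PREFIX = "http://10.42.0.1:8888/cache/?uri="


  def cached_url(url: str, cache_prefix: str = CACHE_PREFIX) -> str:
      """Route a download through the pass-through cache.

      The original URL is appended as-is, matching the examples above.
      """
      return cache_prefix + url


  trace = ("https://s3.freedesktop.org/mesa-tracie-public/"
           "itoral-gl-terrain-demo/demo.trace")
  print(cached_url(trace))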