Commit 5dbe181

doc: add usecases of labgrid
This PR adds a documentation page describing the different use cases of labgrid. It's helpful for understanding the consumers and their needs better.

Signed-off-by: Ozan Durgut <ozan.durgut@analog.com>
1 parent 06fd06c commit 5dbe181

2 files changed

Lines changed: 286 additions & 0 deletions

doc/index.rst

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ You can also look at :ref:`ideas` for enhancements which are not yet implemented
    usage
    man
    configuration
+   use_cases
    development
    design_decisions
    changes

doc/use_cases.rst

Lines changed: 285 additions & 0 deletions
@@ -0,0 +1,285 @@
Use Cases
=========

labgrid can be used in very different ways depending on who uses the boards,
where the hardware is connected, and how much configuration and operational
knowledge must be shared between users.

This page describes four common patterns:

* one developer with one board
* one tester with many boards
* multiple users sharing boards in one location
* multiple users sharing boards across locations

The last case is the most demanding one and deserves special attention. It is
also the case where the gap between "technically possible" and "pleasant to
operate" becomes most visible.

One Developer with One Board
----------------------------

This is the simplest labgrid setup. One developer works with a board connected
directly to a PC.

Typical characteristics are:

* the user usually knows the board well
* serial ports, USB devices, power switches, and images are local
* environment files live next to the project or test code
* there is little or no separation between board owner, board user, and test
  author

The same person who needs the board also controls the local tools, file paths,
and board-specific details. A local environment file is usually acceptable
because it is close to the code that needs it and does not need to be
distributed to a large audience. It can include personal details and local
conventions.

The operational overhead is low compared to the shared cases below, but it is
not zero. Users may still need to install the Python package manually, set up
systemd services, add udev rules, create a dedicated user, configure the
required permissions, and learn the relevant labgrid concepts. This setup is
technically very capable, but the amount of software and abstraction can still
feel heavy when the goal is simply to automate testing on hardware at your
desk, a workflow explored further in `this talk
<https://www.youtube.com/watch?v=_QQmoT5rQOA>`_.

An example would be an embedded Linux developer with a laptop, a
serial adapter, and a single board used for daily bring-up and debugging.

One Tester with Many Boards
---------------------------

In this model, the boards are remote, but they are still effectively owned and
operated by one person or a very small group. The hardware may be in a rack, in
another room, or at another site, but the user is still close to the setup from
an ownership point of view.

Typical characteristics are:

* boards are exported from one or more remote lab hosts
* one primary user, or a very small group, maintains the board setup and uses
  it daily
* places are often mapped 1:1 to physical boards
* board-specific knowledge stays within the owning team

Here, the current workflow is still workable. Exporters publish the resources.
Places are created and matched to those resources. Environment files are stored
with the user's own automation or project code.

This setup is already less convenient than the purely local case, but the pain
is usually manageable because the people maintaining the setup are also the
people using it. Knowledge is not widely distributed, onboarding is limited,
and local conventions can remain informal. This is also the use case that best
fits labgrid's current architecture, and it is the original and primary use
case for labgrid.

An example would be a tester with a rack of boards in a nearby lab, maintained
and used by the same person every day.

Multiple Users Sharing Boards in One Location
---------------------------------------------

This use case is different in an important way. The boards are now shared
infrastructure, and the users are many, varied, and often not labgrid experts.
That changes the cost model significantly.

Typical examples include internal hardware labs where development, validation,
bring-up, debugging, issue reproduction, and CI all rely on the same board
inventory. Users may include embedded Linux engineers, application developers,
electrical engineers, support engineers, validation teams, and automation
maintainers.

In such a setup, a board often looks physically simple. One serial console, one
debug probe, and one power switch may already be enough for most workflows:

.. code-block:: text

   lab/board-01/NetworkSerialPort
   lab/board-01/NetworkUSBDebugger
   lab/board-01/NetworkPowerPort

At first glance, this appears to map naturally to one place:

.. code-block:: shell

   labgrid-client -p board-01 add-match lab/board-01/*

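
Which resources such a pattern selects can be illustrated with a small,
standalone Python sketch. It is not labgrid's implementation. It only assumes
that a match pattern is compared segment by segment against the
``exporter/group/class`` path of each resource, using shell-style wildcards.
The ``match_resources`` helper and the resource list are hypothetical, and
``lab/board-02`` is added only to show a group that does not match.

.. code-block:: python

   from fnmatch import fnmatchcase

   # Hypothetical resource paths in the form exporter/group/class.
   RESOURCES = [
       "lab/board-01/NetworkSerialPort",
       "lab/board-01/NetworkUSBDebugger",
       "lab/board-01/NetworkPowerPort",
       "lab/board-02/NetworkSerialPort",
   ]


   def match_resources(pattern, resources):
       """Return the resources whose path matches the pattern per segment."""
       wanted = pattern.split("/")
       selected = []
       for resource in resources:
           parts = resource.split("/")
           # A pattern with more segments than the path cannot match.
           if len(wanted) > len(parts):
               continue
           if all(fnmatchcase(part, glob) for part, glob in zip(parts, wanted)):
               selected.append(resource)
       return selected


   if __name__ == "__main__":
       for name in match_resources("lab/board-01/*", RESOURCES):
           print(name)

In this sketch the pattern selects the three ``board-01`` resources and skips
``board-02``, so the place bundles exactly one physical board.
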
From there, a user can run common commands such as:

.. code-block:: shell

   labgrid-client -p board-01 acquire
   labgrid-client -p board-01 console
   labgrid-client -p board-01 power cycle

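
For scripted use, commands like these are often wrapped so that a place is
always released again. The following is a minimal Python sketch of such a
wrapper and is not part of labgrid. ``client_command`` and ``acquired`` are
hypothetical helpers, and the injectable ``run`` parameter exists only so the
logic can be exercised without a coordinator. The sketch assumes that
``labgrid-client`` is on ``PATH`` and that the place already exists.

.. code-block:: python

   import subprocess
   from contextlib import contextmanager


   def client_command(place, *args):
       """Build the argument list for one labgrid-client call on a place."""
       return ["labgrid-client", "-p", place, *args]


   @contextmanager
   def acquired(place, run=subprocess.run):
       """Acquire a place and release it again, even if the body raises."""
       run(client_command(place, "acquire"), check=True)
       try:
           yield place
       finally:
           run(client_command(place, "release"), check=True)


   if __name__ == "__main__":
       # Needs a reachable coordinator and an existing place named board-01.
       with acquired("board-01") as place:
           subprocess.run(client_command(place, "power", "cycle"), check=True)

Wrappers of this kind are a typical local convention that grows around the
basic commands.
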
For a narrow set of interactive tasks, this is enough. The problem is that
shared labs rarely stay within that narrow set for long.

Current Recommended Workflow
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The currently recommended workflow for a shared board farm is usually:

#. create places and map exporter resources to them
#. let users acquire places instead of interacting with raw resources
#. let interactive users call :command:`labgrid-client` directly for console,
   power, and similar operations
#. keep environment files with the user's own automation when strategies,
   drivers, tool paths, images, or debug configuration are needed
#. manage exporter access with normal SSH, DNS, and host-side account
   management

This workflow is as flexible as labgrid itself. It supports dynamic labs,
resource aggregation across exporters, and test suites that need to control
their own target description.

That flexibility is valuable, but in a large, mostly static board lab it can
also become a poor default. Even though labgrid is intentionally flexible, it
should still support simpler ways of working when the infrastructure is already
well understood. Otherwise, the workflow keeps exposing infrastructure details
to every user even when most users are simply trying to "use board-01".

Why This Becomes Heavy
~~~~~~~~~~~~~~~~~~~~~~

In a shared board lab, the difficulty is not that the hardware is impossible to
control. The difficulty is that the same board description and the same
operational knowledge are reconstructed again and again in slightly different
places.

Several aspects contribute to this:

* the exporter group often already represents a board, while the place is used
  to represent that same board again from the user side
* the user still needs to understand places, matches, resources, drivers,
  strategies, and environments to do more than a few basic actions
* board-specific configuration remains client-side even when it is effectively
  shared infrastructure knowledge
* infrastructure changes must be communicated outward instead of being absorbed
  centrally
* direct client-to-exporter access pushes access management, permissions,
  cleanup, and audit concerns down to every host

None of these points is fatal on its own. Together, they create a workflow that
works, but asks the organization to repeatedly pay for the same understanding.

An example would be a central validation lab used by firmware,
application, and test teams in the same office, all competing for the same
inventory of boards.

Multiple Users Sharing Boards Across Locations
----------------------------------------------

This is the same shared infrastructure problem, but with the added complexity
that users, operators, and hardware are no longer in the same place. The lab
may span multiple offices, multiple time zones, or even external partners. At
that point, the distance between "someone can make this work" and "this is
pleasant to operate" becomes much larger.

Typical characteristics are:

* users may not know who physically manages a given board
* recovery often depends on someone in another room, site, or time zone
* access policies and audit requirements are usually stricter
* local conventions stop scaling because the audience is broader and less
  connected

Observed Friction in Practice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following situations tend to come up repeatedly in large shared labs:

Board identity is split across concepts
   The physical board is already obvious to the operator and to the exporter
   layout, but users still have to reason about resources, exporter groups,
   places, and separate client-side target descriptions. For a mostly static
   1:1 board setup, this can feel like modeling the same thing multiple times.

Interactive usage and full usage diverge
   Console access or power control may work directly through the place, but more
   realistic tasks often need OpenOCD configuration, image paths, strategies,
   helper scripts, or custom drivers. The result is that the "simple" workflow
   only covers the shallow end, while normal engineering work falls back to
   extra local setup.

Shared knowledge is distributed as per-user configuration
   If many users need the same board-specific OpenOCD setup, the same flash
   layout, or the same boot strategy, that knowledge is no longer really
   personal configuration. Treating it as such makes updates and consistency
   harder than they need to be.

Infrastructure changes propagate poorly
   When tool paths, images, server names, or debug settings change, the change
   is not absorbed once at the infrastructure boundary. Instead, it tends to
   trigger a documentation, support, and synchronization exercise across users,
   repositories, or wrapper scripts.

Access management becomes part of the user workflow
   Shared Unix users reduce friction but weaken isolation and traceability.
   Per-user Unix accounts improve traceability but increase provisioning,
   revocation, SSH key management, and host-side permissions work. Neither
   choice is attractive when the goal is simply safe, shared board access.

SSH access becomes a scaling problem of its own
   When labs span multiple hosts or locations, SSH setup and lifecycle
   management can become a visible part of daily operations. Teams may benefit
   from stronger centralized approaches for authentication and short-lived
   access. One option worth evaluating is `OpenPubkey SSH (opkssh)
   <https://github.com/openpubkey/opkssh>`_.

CI and human users compete through the same abstractions
   The same board inventory must serve automation and interactive work, but the
   current model often leaves teams building local conventions on top of places,
   locks, scripts, and documentation to make that coexistence understandable.

Why The Current Model Feels Mismatched
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The core issue is not that the present model is wrong. It is that it is
optimized for flexibility and configurability first, while large shared board
farms usually need operational simplicity first.

For dynamic setups, custom test suites, and project-owned environments, the
separation between resources, places, and client-side target descriptions makes
sense.

For large shared labs, however, this same separation can have unfortunate side
effects:

* the board looks simple at the hardware level, but complicated at the user
  interface level
* the infrastructure already knows most of the board description, but the user
  must still assemble the rest
* routine lab maintenance turns into user-facing migration work
* support effort grows faster than the apparent complexity of the board itself

In other words, the model remains powerful, but the user experience can become
heavier than the actual task being performed.

Possible Improvements
~~~~~~~~~~~~~~~~~~~~~

Large shared labs may benefit from an additional workflow that keeps the
current flexible model intact, but offers a more infrastructure-provided path
for common static deployments.

Possible improvements include:

* allow a board-oriented definition to be served centrally, so users can
  consume a ready-to-use target instead of rebuilding it locally
* make 1:1 exporter-group-to-board mappings first-class, so the common static
  case needs less manual place modeling
* allow exporters or the coordinator to provide environment fragments or merged
  target descriptions for interactive users
* reduce the amount of board-specific knowledge that must be duplicated across
  user repositories and local wrapper scripts
* provide a cleaner access model where users interact with shared boards without
  needing direct exposure to host-level account management details
* define a clearer workflow for mixed CI and interactive usage in the same
  shared board inventory

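
The idea of a merged target description can be sketched in a few lines of
Python. This is purely illustrative and does not describe an existing labgrid
feature: ``merge_descriptions``, the dictionary keys, and the file paths are
all hypothetical and do not follow labgrid's environment schema. The sketch
assumes that a centrally served board description is combined with a small
per-user override, where local values win and nested mappings are merged.

.. code-block:: python

   import copy


   def merge_descriptions(central, local):
       """Return a new dict where local values override central ones."""
       merged = copy.deepcopy(central)
       for key, value in local.items():
           if isinstance(value, dict) and isinstance(merged.get(key), dict):
               # Merge nested mappings instead of replacing them wholesale.
               merged[key] = merge_descriptions(merged[key], value)
           else:
               merged[key] = copy.deepcopy(value)
       return merged


   # Hypothetical board description served by the lab infrastructure.
   CENTRAL = {
       "place": "board-01",
       "debug": {"tool": "openocd", "config": "board-01.cfg"},
       "images": {"kernel": "/srv/images/board-01/zImage"},
   }

   # Hypothetical per-user override: only the image under test differs.
   LOCAL = {"images": {"kernel": "build/zImage"}}

   if __name__ == "__main__":
       print(merge_descriptions(CENTRAL, LOCAL))

With such a split, a change to the debug configuration would be made once in
the central description, while users keep only the values that are really
personal.
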
An example would be a company with teams in different offices using
the same lab through a shared coordinator, while a smaller operations group
maintains the hardware and access policies.
