Commit 7e19a86

committed
docs: more ape writing
1 parent 8d1d214 commit 7e19a86

1 file changed

Lines changed: 69 additions & 135 deletions

File tree

oeps/best-practices/oep-0068-bp-resource-identifiers.rst

@@ -9,7 +9,7 @@ OEP-68: Learning Content Identifiers
99
* - OEP
1010
- :ref:`OEP-0068 <OEP-68 Resource Identifiers>`
1111
* - Title
12-
- Resource Identifiers
12+
- Learning Content Identifiers
1313
* - Last Modified
1414
- 2026-03-18
1515
* - Authors
@@ -52,11 +52,8 @@ Motivation
5252
**********
5353

5454
Identifiers are ubiquitous: they appear as Python variables, function parameters, Django model
55-
fields, database columns, REST API fields, and event data schemas.
56-
57-
To avoid subtle issues with performance, portability, and correctness, we must be careful to
58-
use the correct kind of identifier when building content data models or referencing content.
59-
For example:
55+
fields, database columns, REST API fields, and event data schemas. It's important to choose
56+
the right identifier for the right job. For example:
6057

6158
* Including Open edX instance-specific information in an identifier can cause content to
6259
break when it's transferred to other instances. For example, exporting course content with
@@ -68,8 +65,7 @@ For example:
6865
media files from course run X to course run Y, the transfer format must describe the
6966
component-media relationships *without* reference to the course run keys for X or Y,
7067
otherwise the copied component in Y may erroneously try to reference media files from
71-
course run X (or, we'd have to complicate the pasting code with special logic to strip
72-
the identifiers of reference to X, which is likely to be confusing and brittle).
68+
course run X.
7369

7470
* Mingling version-aware and version-agnostic identifiers can lead to unexpected lookup failures.
7571
For example, if learner-facing code queries a course run cache for the latest content without
@@ -95,14 +91,14 @@ tracking events, legacy untyped Python modules, and logs). So, this OEP aims to:
9591
Specification
9692
*************
9793

98-
There are five recognized categories of identifier in Open edX code. When naming an identifier,
99-
first determine which category it belongs to, then apply the naming convention for that category. If
100-
an identifier does not fit any category, choose a name that does not collide with any of the five
101-
conventions, so that readers are not misled.
94+
Here are five recognized categories of learning content identifier in Open edX code.
95+
When using an identifier, first determine which category it belongs to, then consider
96+
if it's appropriate for the job at hand. Finally, apply the appropriate naming convention,
97+
as long as doing so is backwards compatible and consistent with surrounding code (some judgement required here).
10298

103-
These conventions apply wherever identifiers are named: Python variables, parameters, and
104-
attributes; Django model field names and database column names; REST API request and response field
105-
names; and event data schema fields.
99+
These conventions apply wherever Open edX learning content is referenced: Python variables, JS
100+
variables, Django model field names, REST API arguments, event schema fields, admin interfaces,
101+
application logs, and so on.
106102

107103
Summary
108104
=======
@@ -121,17 +117,14 @@ Summary
121117
- * ``id``, ``*_id`` on the model.
122118
* ``pk``, ``*_pk`` everywhere else.
123119
* - Code
124-
- Locally-scoped slug-like string component of an OpaqueKey
120+
- Locally-scoped slug-like string
125121
- ``str``
126-
- ``*_code``
122+
- * ``*_code``
127123
* - OpaqueKey
128-
- Parsed key object scoped to one Open edX instance
124+
- Codes composed together into a semi-readable instance-wide identifier
129125
- subclass of ``OpaqueKey``
130-
- * ``*_key``
131-
* - OpaqueKey String
132-
- Serialized form of an OpaqueKey
133-
- ``str``
134-
- ``*_key_string`` (or ``*_key`` when unambiguous)
126+
- * ``*_key`` for parsed OpaqueKey objects
127+
* ``*_key`` or ``*_key_string`` for serialized OpaqueKey strings
135128
* - UUID
136129
- Globally unique identifier scoped across all Open edX instances
137130
- ``uuid.UUID``
@@ -140,6 +133,7 @@ Summary
140133
- Serialized form of a UUID
141134
- ``str``
142135
- ``*_uuid_string`` (or ``*_uuid`` when unambiguous)
136+
- Global unique reference
143137

144138
Integer Primary Keys
145139
====================
@@ -237,10 +231,25 @@ For example:
237231
**When to use**: Integer primary keys and OpaqueKeys both uniquely identify a resource across an
238232
Open edX instance. When choosing between the two, consider the following:
239233

240-
* Blah
241-
242-
**How to name**: Variables and fields holding a parsed ``OpaqueKey`` object should use the suffix
243-
``_key``.
234+
* Integer primary keys are by far the most efficient and reliable method to relate tables within a database.
235+
* When displayed, OpaqueKeys provide more information. This can be good in cases where quasi-readability
236+
matters, such as URLs, error logs, event data, and UIs for admins and power-users.
237+
* Because they are human-readable, site admins may want to change the code identifying a piece of content,
238+
which would break any OpaqueKey reference to that content. This generally doesn't happen, but by limiting
239+
the number of internal OpaqueKey references, we make the platform more flexible to support these sorts
240+
of operations (e.g. changing a course's run code) in the future.
241+
242+
**How to name**:
243+
244+
* Python variables and attributes holding a parsed ``OpaqueKey`` object should use the suffix ``_key``.
245+
* Fields which marshal between OpaqueKey objects and their serialized strings, such as Django Model
246+
Fields or Serializer Fields, should also use the suffix ``_key``.
247+
* REST APIs, event data fields, or other external serializations of OpaqueKeys should all also use
248+
the suffix ``_key``.
249+
* When parsed ``OpaqueKey`` objects and serialized key strings co-exist in the same context,
250+
such as a parsing function, use the suffix ``_key_string`` to disambiguate serialized strings.
251+
* Frontend variables should use the suffix ``*Key``. The suffix ``*KeyString`` is not necessary,
252+
because parsed OpaqueKeys do not exist on the frontend.
244253

245254
.. code-block:: python
246255
@@ -251,107 +260,38 @@ Open edX instance. When choosing between the two, consider the following:
251260
252261
def get_course(course_key: CourseKey) -> CourseOverview: ...
253262
254-
Please note that, for historical reasons, concrete OpaqueKey subclasses use the suffix ``Locator``
263+
Please note that it's preferable to pass around the parsed ``OpaqueKey`` object
264+
whenever it's available--compared to the serialized key string, it's more
265+
type-safe and centralizes all the parsing logic. Developers are also encouraged to use
266+
``OpaqueKey`` subclasses as type annotations wherever appropriate.
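
The naming rules above can be sketched roughly as follows. This is a minimal illustration, not the real ``opaque-keys`` API: the ``CourseKey`` class here is a simplified, hypothetical stand-in, and ``InvalidKeyError``, ``parse_course_key_or_none``, and ``describe_course`` are names assumed for the example. The serialized value is called ``course_key_string`` only inside the parsing code, where both forms co-exist; everywhere else the parsed object is passed around as ``course_key``.

```python
from dataclasses import dataclass
from typing import Optional


class InvalidKeyError(ValueError):
    """Hypothetical error raised when a serialized key string cannot be parsed."""


@dataclass(frozen=True)
class CourseKey:
    """Simplified stand-in for a parsed OpaqueKey (assumption, not the real class)."""

    org_code: str
    course_code: str
    run_code: str

    PREFIX = "course-v1:"  # plain class attribute, not a dataclass field

    @classmethod
    def from_string(cls, course_key_string: str) -> "CourseKey":
        # Both forms co-exist here, so the serialized form gets *_key_string.
        if not course_key_string.startswith(cls.PREFIX):
            raise InvalidKeyError(course_key_string)
        codes = course_key_string[len(cls.PREFIX):].split("+")
        if len(codes) != 3 or not all(codes):
            raise InvalidKeyError(course_key_string)
        return cls(*codes)

    def __str__(self) -> str:
        return f"{self.PREFIX}{self.org_code}+{self.course_code}+{self.run_code}"


def parse_course_key_or_none(course_key_string: str) -> Optional[CourseKey]:
    """Parse once at the boundary; return None for malformed input."""
    try:
        return CourseKey.from_string(course_key_string)
    except InvalidKeyError:
        return None


def describe_course(course_key: CourseKey) -> str:
    """Internal code receives the parsed object, named with the plain _key suffix."""
    return f"{course_key.course_code} ({course_key.run_code}) from {course_key.org_code}"
```

Parsing in one place means that malformed strings are rejected at the boundary, and internal functions such as ``describe_course`` can rely on the type annotation instead of re-validating a string.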
267+
268+
Hint: For historical reasons, concrete OpaqueKey subclasses use the suffix ``Locator``
255269
instead of ``Key``. For all intents and purposes, this distinction can be ignored by consumers. They
256270
are all ``_keys``. In the future, we will unify all OpaqueKey classes to be named ``*Key``.
257271

258272
.. _openedx/opaque-keys: https://github.com/openedx/opaque-keys
259273

260-
OpaqueKey Strings
261-
-----------------
262-
263-
The serialized (string) form of an OpaqueKey is distinct from the parsed object. When the context
264-
makes it unambiguous that the value is a string—such as inside a Django form, serializer, or REST
265-
API field—it is acceptable to use ``*_key`` for the serialized form as well.
266-
267-
When there is any ambiguity about whether a value is a parsed ``OpaqueKey`` object or its string
268-
serialization, use the suffix ``*_key_string`` to make the distinction explicit.
269-
270-
.. code-block:: python
271-
272-
# Unambiguous context (serializer field): *_key is fine
273-
class BlockSerializer(serializers.Serializer):
274-
usage_key = serializers.CharField()
275-
276-
# Ambiguous context: use *_key_string to distinguish from the parsed object
277-
def _get_context_key_if_valid(serializer) -> LearningContextKey | None:
278-
usage_key_string = serializer.cleaned_data.get('usage_key')
279-
if not usage_key_string:
280-
return None
281-
try:
282-
return UsageKey.from_string(usage_key_string).context_key
283-
except InvalidKeyError:
284-
return None
285-
286-
Please note that OpaqueKey Strings should only be used at the boundaries of the platform (REST APIs,
287-
external events, logging, etc.). Within the system, parsed OpaqueKey objects are always preferred,
288-
as they protect against serialization-deserialization errors and provide type safety.
289-
290-
.. note::
291-
292-
On the frontend, parsed ``OpaqueKey`` objects are not available; OpaqueKeys are always plain
293-
strings. Therefore, a ``*KeyString`` suffix is not needed in frontend code—``*Key`` is always
294-
acceptable.
295-
296274
UUIDs
297275
=====
298276

299277
A UUID (Universally Unique Identifier) is a 128-bit identifier that uniquely identifies a resource
300278
across *all* Open edX instances. Unlike primary keys and OpaqueKeys, UUIDs are not scoped to a
301279
single database or instance.
302280

303-
**When to use**: Use UUIDs when you need a stable identifier that remains meaningful across Open edX
304-
instances—for example, in cross-instance event data, external integrations, and shared databases. If
305-
the identifier only needs to be unique within a single instance, an OpaqueKey or primary key is more
306-
appropriate.
281+
**When to use**: Use UUIDs when you want to give an object an identity that is unique across all
282+
Open edX instances.
307283

308-
**How to name**: The preferred Python type for a UUID is ``uuid.UUID``. Variables and fields holding
309-
a UUID should use the suffix ``_uuid``.
310-
311-
.. code-block:: python
284+
**How to name**:
312285

313-
import uuid
314-
315-
discussion_uuid: uuid.UUID = thread.uuid
316-
enrollment_uuid: uuid.UUID = enrollment.uuid
317-
318-
def get_discussion_thread(discussion_uuid: uuid.UUID) -> DiscussionThread: ...
319-
320-
class DiscussionThread(models.Model):
321-
uuid = models.UUIDField(default=uuid.uuid4, editable=False, unique=True)
322-
323-
UUID Strings
324-
------------
325-
326-
The serialized (string) form of a UUID is distinct from the ``uuid.UUID`` object. When the context
327-
makes it unambiguous that the value is a string—such as inside a Django serializer or REST API
328-
field—it is acceptable to use ``*_uuid`` for the serialized form as well.
329-
330-
When there is any ambiguity about whether a value is a ``uuid.UUID`` object or its string
331-
serialization, use the suffix ``*_uuid_string`` to make the distinction explicit.
332-
333-
.. code-block:: python
334-
335-
# Unambiguous context (serializer field): *_uuid is fine
336-
class ThreadSerializer(serializers.Serializer):
337-
discussion_uuid = serializers.UUIDField()
338-
339-
# Ambiguous context: use *_uuid_string to distinguish from the UUID object
340-
def _get_thread_by_discussion(discussion_uuid_string: str) -> DiscussionThread:
341-
try:
342-
discussion_uuid = uuid.UUID(discussion_uuid_string)
343-
except ValueError:
344-
raise BadDiscussionUUID(discussion_uuid_string)
345-
return DiscussionThread.objects.get(uuid=discussion_uuid)
346-
347-
As with OpaqueKey Strings, UUID Strings should only be used at the boundaries of the platform.
348-
Within the system, ``uuid.UUID`` objects are always preferred.
349-
350-
.. note::
351-
352-
On the frontend, ``uuid.UUID`` objects are not available; UUIDs are always plain strings.
353-
Therefore, a ``*UuidString`` suffix is not needed in frontend code—``*Uuid`` is always
354-
acceptable.
286+
* Python variables and attributes holding a parsed ``uuid.UUID`` object should use the suffix ``_uuid``.
287+
* Fields which marshal between UUID objects and their serialized strings, such as Django Model
288+
Fields or Serializer Fields, should also use the suffix ``_uuid``.
289+
* REST APIs, event data fields, or other external serializations of UUIDs should all also use
290+
the suffix ``_uuid``.
291+
* When parsed UUID objects and serialized UUID strings co-exist in the same context,
292+
such as a parsing function, use the suffix ``_uuid_string`` to disambiguate serialized strings.
293+
* Frontend variables should use the suffix ``*Uuid``. The suffix ``*UuidString`` is not necessary,
294+
because parsed UUID objects are not used in Open edX frontend code.
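
As a rough sketch of these rules, using only the standard-library ``uuid`` module: the function names ``parse_enrollment_uuid`` and ``build_event_payload`` and the ``enrollment_uuid`` field are hypothetical, chosen only for illustration. The ``_uuid_string`` suffix appears only in the parsing function, where the string and the ``uuid.UUID`` object co-exist; the externally serialized field keeps the plain ``_uuid`` suffix.

```python
import uuid
from typing import Optional


def parse_enrollment_uuid(enrollment_uuid_string: str) -> Optional[uuid.UUID]:
    """Parse a serialized UUID; both forms co-exist here, so use *_uuid_string."""
    try:
        return uuid.UUID(enrollment_uuid_string)
    except ValueError:
        # uuid.UUID raises ValueError for badly formed strings.
        return None


def build_event_payload(enrollment_uuid: uuid.UUID) -> dict:
    """Serialize for an external consumer; the field name keeps the _uuid suffix."""
    return {"enrollment_uuid": str(enrollment_uuid)}
```

A caller would parse the incoming string once, pass the resulting ``uuid.UUID`` object through internal code, and convert back to a string only when emitting the payload.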
355295

356296
Other Identifiers
357297
=================
@@ -361,20 +301,17 @@ choose a name that does not use any of the suffixes ``_pk``, ``_key``, ``_key_st
361301
``_uuid``, or ``_uuid_string``, so that readers are not misled into assuming a type or scope that
362302
does not apply.
363303

364-
TODO mention version numbers
365-
366-
**When to use**: Use this guidance when you have confirmed that an identifier does not fit any of
367-
the four named categories above. Before settling on a novel identifier type, consider whether a
368-
primary key, OpaqueKey, code, or UUID would serve the purpose.
369-
370-
**How to name**: Choose a name that is evocative of the identifier's specific meaning and that does
371-
not collide with any of the conventions above. For inspiration, consider the ``refname`` field on
372-
``PublishableEntity`` objects in ``openedx-learning``. A ``refname`` correlates a database entity
373-
with its representation in off-platform content archives. It is not a primary key (which would be
374-
database-specific), not a code (because it may contain non-slug characters), and not an OpaqueKey
375-
(because it cannot be parsed into a globally-scoped identifier). By choosing the name
376-
``refname``—which collides with none of the conventions above—the code signals clearly that this
377-
identifier is its own distinct thing.
304+
Examples:
305+
* The ``refname`` field on ``PublishableEntity`` objects in ``openedx-core``. A ``refname``
306+
correlates a database entity
307+
with its representation in off-platform content archives. It is not a primary key (which would be
308+
database-specific), not a code (because it may contain non-slug characters), and not an OpaqueKey
309+
(because it cannot be parsed into a globally-scoped identifier). By choosing the name
310+
``refname``—which collides with none of the conventions above—the code signals clearly that this
311+
identifier is its own distinct thing.
312+
* The integer ``version_num`` is used as part of the identity of several version-aware content models.
313+
It is like a Code, because it identifies a thing (a version) within a local context (a versioned entity).
314+
However, it's not a string, so we don't use the suffix ``_code``.
378315

379316
Rationale
380317
*********
@@ -391,17 +328,12 @@ is a plain string rather than a Python object.
391328
Backward Compatibility
392329
**********************
393330

394-
Much existing Open edX code uses ``id`` and ``_id`` suffixes for primary keys, and some code uses
395-
``_key`` for both parsed OpaqueKey objects and their string serializations without distinction. This
396-
OEP does not require renaming existing identifiers; that would be a large and risky churn. New code
397-
should follow these conventions, and existing code may be updated opportunistically during
398-
refactors.
331+
TBC
399332

400333
Reference Implementation
401334
************************
402335

403-
The conventions in this OEP reflect and formalize naming patterns already in use in several Open edX
404-
repositories, including ``openedx-learning`` and ``openedx-content-libraries``.
336+
TBC
405337

406338
Rejected Alternatives
407339
*********************
@@ -414,6 +346,8 @@ name.
414346
identifies something" would eliminate the useful distinction between globally-scoped OpaqueKeys,
415347
locally-scoped codes, and serialized vs. parsed representations.
416348

349+
TODO Finish
350+
417351
Change History
418352
**************
419353
