Non-obvious facts about how Monash's handbook data is shaped — what it looks like inside a record, what fields mean what, and which fields silently lie about their content. Skim before building a UI view or writing a query.
The upstream CMS is CourseLoop; two distinct reference shapes appear inside records. They look similar and confusing them silently produces wrong data (not a type error).
Full CLReference — {value, cl_id, key}. Used for pointers at
other CourseLoop rows (school, teaching period, location). Extract
display text as .value:
"school": { "value": "Faculty of Information Technology",
"cl_id": "c5684a53...", "key": "name" }Lite-reference — {label, value}. Used for classification
dropdowns (level, type, status, AQF level, attendance mode,
undergrad/postgrad). .label is human-readable ("Level 1", "Bachelor
Degree", "Accredited", "Undergraduate"); .value is the internal
code ("2", "7_bach_deg", "Active"). Prefer .label for display,
fall back to .value.
The one place you must use .value, not .label, is
academic_item_type inside tree leaves: .label is "Unit"/"Course"
(human), .value is "subject"/"course" (internal code). Filters in
the ingest pipeline check against "subject" — using the label
returns zero refs silently.
Scalars-as-strings. Numbers and booleans arrive as strings:
credit_points: "6", offered: "true". The DB columns have already
parsed these; if you reach into raw JSONB for a field we haven't
extracted, re-parse.
Both live on units and both restrict enrolment, but they are not the same thing:
requisites(prerequisites, corequisites, prohibitions) carry a structured AND/OR tree of unit-code references inruleJSONB. Thedescriptionfield is empty 99.9% of the time — do not render it. The rule tree is the authoritative source.enrolment_rulesare program-level constraints ("must be enrolled in Bachelor of IT", "must have 48cp in Art, Design and Architecture"). They ship as HTML prose only — no structured tree — and they always have a populateddescription. You can't evaluate these programmatically without NLP; just render the HTML.
For graph-shaped queries on requisites ("what requires X?", "what
unlocks after X?"), use requisite_refs — it's the flat edge view of
the trees. Use requisites.rule only when you need AND/OR semantics
for validation ("does this student's set of completed units satisfy
this block?").
- Unit requisites only reference units (
academic_item_type.value === "subject"). Never courses. Verified across 7,354 leaf refs in the 2026 corpus — zero course refs. - AoS curriculum → units, with a grouping label ("Core units", "Elective units", "Malaysia", etc.). 6,773 edges in 2026.
- Course curriculum → AoS, with the nearest ancestor container title naming the relationship ("Part B. Major studies", "Science extended majors", "Discipline elective studies"). 719 edges.
- Course curriculum → units also exists (C2000 references 17
units directly). We don't surface this as a flat table yet; reach
into
courses.curriculum_structureJSONB for it.
When a requisite leaf points at FIT1008, its academic_item_url
usually looks like /2021/units/FIT1008 — referencing the
2021-handbook version of that unit, not the current year. ~88% of
requisite refs point at historical years. This is not a bug; Monash
freezes prereq pointers at whatever handbook version they were
approved against.
Planner logic must match on code alone. requisite_refs already
drops the referenced year for this reason — a student who took
FIT1008 in 2024 satisfies a 2026 unit's prereq even when the leaf's
URL says 2021.
When a UI reaches into a raw curriculum tree (courses, AoS), the shape varies:
- Unit requisites nest as
container[].containers[].relationships[](pluralrelationships). - AoS curriculumStructure nests as
container[].container[].relationship[](singularrelationship). - Course curriculumStructure mixes both, and also has AoS codes appearing as bare string values outside any array.
The only stable invariant is that leaves carry an
academic_item_code field. The ingest walker (collectCodeRefs in
packages/ingest/src/parse.ts) is deliberately shape-agnostic: it
recurses every property and treats any object bearing
academic_item_code as a leaf. UIs that render raw trees should do
the same rather than hard-code container keys.
Each leaf in a curriculum tree carries academic_item_credit_points,
academic_item_name, academic_item_url, abbr_name, order, and
parent_connector ({label: "AND"|"OR", value: ...}) — so you have
everything needed to render "6cp | FIT1045 Introduction to
programming" rows grouped by AND/OR without joining back.
courses.description— populated on 6/501 records. Useoverviewinstead (94%).units.exclusions— always empty string. The "can't take both" relationship lives inrequisiteswithrequisite_type = prohibition.areas_of_study.type— always null. The study-level bucket you actually want isareas_of_study.study_level(extracted fromundergrad_postgrad): "Undergraduate", "Postgraduate", "Honours", "Research".courses.structure— always empty{}. The real field iscurriculum_structure(different spelling, different casing).courses.majors_minors,courses.specialisations— always empty arrays in 2026. The real mapping lives inside curriculum_structure; usecourse_areas_of_studywhich already extracts it.courses.double_degrees— technically populated for 60 records but the content is malformed HTML like"<". Unusable.- Research-program courses (Doctorate/Masters by research, 67 in
2026) have null
curriculum_structure. Expected, not missing. requisites.description— empty 99.9% of the time (see above).
unit_offerings.attendance_mode is verbose prose. Every value has a
parenthetical canonical code at the end, extracted into
attendance_mode_code. 24 distinct codes observed in 2026; the six
most common:
| code | example source string |
|---|---|
ON-CAMPUS |
"Teaching activities are on-campus (ON-CAMPUS)" |
EXT-CAND |
"External Candidature (EXT-CAND)" |
IMMERSIVE |
"Teaching mostly conducted outside of a classroom/campus environment (IMMERSIVE)" |
ON-BLK |
"Teaching activities are on-campus and in a block period (ON-BLK)" |
ONLINE |
"Teaching is all online (ONLINE)" |
FLEXIBLE |
"Some activities have a choice of on-campus or online teaching activities (FLEXIBLE)" |
Use attendance_mode_code for filtering. Use attendance_mode for
display if you want the full description, otherwise the code is fine
for both.
course_areas_of_study.kind is derived from relationship_label
(the nearest ancestor container title) via case-insensitive keyword
matching, in this order:
| keyword | kind |
|---|---|
extended major |
extended_major |
specialisation / specialization / specialist |
specialisation |
minor |
minor |
elective |
elective |
major |
major |
| (no match) | other |
Order matters — extended major is checked before plain major.
The 113 other rows in 2026 are genuinely structural containers
that aren't a degree component — honours research streams, location
splits ("Malaysia offerings"), generic "Course requirements" buckets.
Not misclassifications.
The original label is kept in relationship_label for display
fidelity.
Several text fields contain HTML:
units.handbook_synopsiscourses.overviewareas_of_study.handbook_descriptionenrolment_rules.description
Monash ships inline tags (<p>, <br>, <a>) and occasional
non-breaking spaces. Render with a trusted-HTML path (React:
dangerouslySetInnerHTML, or sanitise via DOMPurify first if the
content is displayed in a security-sensitive context — handbook
content is first-party so direct render is defensible). Do not try
to strip tags; some fields rely on them for line breaks.
| table | rows |
|---|---|
units |
5,217 |
courses |
501 |
areas_of_study |
410 |
unit_offerings |
10,189 |
requisites |
3,310 |
requisite_refs |
7,341 |
enrolment_rules |
4,455 |
course_areas_of_study |
719 |
area_of_study_units |
6,773 |
Requisite type split: 1,612 prohibition · 1,317 prerequisite · 381 corequisite. AoS kind split: 195 major · 162 specialisation · 113 other · 107 minor · 80 elective · 62 extended_major.