feat(t2): Phase 3 legal_representatives extraction for LV #134
Open
petterlindstrom79 wants to merge 4 commits into
Extract directors/officers from the data.gov.lv `amatpersonas` open dataset (CKAN resource e665114a-73c2-4375-9470-55874b4cfa6b) and surface them as the canonical legal_representatives[] array on latvian-company-data output. Flips tier_2_available true when the upstream returns at least one active officer.

Free, real-time CKAN datastore_search call alongside the existing entities-master lookup: no infra, no auth, no scraping. Officer FK column at_legal_entity_registration_number is numeric, so the JSON filter value is sent unquoted.

Roles normalized to a stable English enum (board_chair, board_member, council_member, council_chair, procurist, liquidator); unknown LV codes pass through verbatim. Each entry carries rights_of_representation, representation_with_at_least, start_date, and entity_type.

Coverage limit honestly disclosed in tier_2_available_reason: the amatpersonas dataset is a current-active-officers snapshot, so no resignations or historical entries are exposed via this resource.

Smoke-verified against airBaltic (40003245752, 2 officers), Latvenergo (40003032949, 5 officers), and a 40003020121 fallback (2 officers).

BE deferred from this phase: cbeapi.be doesn't expose officers, KBO Open Data CSV omits the function table, and KBO Public Search Web Service SOAP is paid (€50 per 2k requests with bank-transfer onboarding). No free path exists.

Refs DEC-20260518-A, DEC-20260518-D.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
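The request and role mapping described in this commit can be sketched roughly as follows. This is a minimal illustration, not the handler's actual code: `buildOfficersUrl`, `normalizeRole`, and `ROLE_BY_CODE` are hypothetical names, the base URL is passed in by the caller, and the upstream codes used as map keys are placeholders rather than values confirmed from the dataset. Only the resource id, the filter column, and the target enum come from the commit message.

```typescript
// Resource id quoted from the commit message.
const OFFICERS_RESOURCE_ID = "e665114a-73c2-4375-9470-55874b4cfa6b";

// Builds a CKAN datastore_search URL filtered by registration number.
// baseUrl is a parameter here because the real endpoint constant is not shown in the source.
function buildOfficersUrl(baseUrl: string, regcode: string): string {
  if (!/^\d+$/.test(regcode)) {
    throw new Error("registration number must be digits only: " + regcode);
  }
  // Number(...) makes JSON.stringify emit the value without quotes,
  // matching the commit's note that the FK column is numeric.
  const filters = JSON.stringify({
    at_legal_entity_registration_number: Number(regcode),
  });
  return (
    baseUrl +
    "?resource_id=" + encodeURIComponent(OFFICERS_RESOURCE_ID) +
    "&filters=" + encodeURIComponent(filters)
  );
}

type Role =
  | "board_chair"
  | "board_member"
  | "council_member"
  | "council_chair"
  | "procurist"
  | "liquidator";

// Placeholder upstream codes (assumption): swap in the codes the dataset actually uses.
const ROLE_BY_CODE = new Map<string, Role>([
  ["PLACEHOLDER_BOARD_CHAIR", "board_chair"],
  ["PLACEHOLDER_BOARD_MEMBER", "board_member"],
  ["PLACEHOLDER_PROCURIST", "procurist"],
]);

// Known codes map to the English enum; unknown codes pass through verbatim.
function normalizeRole(code: string): string {
  const mapped = ROLE_BY_CODE.get(code);
  return mapped !== undefined ? mapped : code;
}
```

A `Map` is used instead of a plain object so that a code such as `toString` cannot accidentally match an inherited property.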
Pass B reviewer flagged that the populated-officers reason string spelled
the registry name without diacritics ("Uznemumu registrs") while the
provenance attribution + file comment use the correct "Uzņēmumu reģistrs"
form. Aligns the two so an AI agent caller does not surface inconsistent
transliterations in the same response.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two inline fixes from the /go six-lens review on PR #133:
- Pass A correctness: removed the 404 swallow in fetchOfficers. Previously a 404 from the officers endpoint silently produced legal_representatives=[] with tier_2_available=true, which made the flag misleading in the rare case where /company/{n} succeeds but /company/{n}/officers returns 404. Now the 404 propagates as a structured error, consistent with fetchCompany.
- Quality consistency: guarded the legal_representatives assignment with the alias-block invariant pattern (if undefined). Matches the surrounding canonical-alias resolution and the FR sibling.

Deferred (PR-body MEDIUMs): cross-country shape mismatch, manifest output_field_reliability gap, items_per_page=100 truncation guard, SE YAML roadmap-state phrasing, reason-string voice, auth-header duplication. See PR description for details.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
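The two fixes in this commit can be sketched as follows. The names `upstreamError`, `officersOrThrow`, and `assignIfUnset`, and the shape of the error, are hypothetical; the commit only states that the 404 now propagates as a structured error and that the assignment is guarded by an "if undefined" check.

```typescript
// Hypothetical structured error: a plain Error carrying the upstream HTTP status.
interface StructuredError extends Error {
  status: number;
}

function upstreamError(status: number, message: string): StructuredError {
  const err = new Error(message) as StructuredError;
  err.status = status;
  return err;
}

// Instead of mapping a 404 to an empty list (which would look like
// "this company has no officers"), any non-200 status is surfaced to the caller.
function officersOrThrow(status: number, items: unknown[]): unknown[] {
  if (status !== 200) {
    throw upstreamError(status, "officers lookup failed with HTTP " + status);
  }
  return items;
}

// Alias-block guard: only fill the canonical key when nothing has set it yet.
function assignIfUnset(
  output: { legal_representatives?: unknown[] },
  officers: unknown[],
): void {
  if (output.legal_representatives === undefined) {
    output.legal_representatives = officers;
  }
}
```

With this split, an empty array can only come from a successful response, so a caller can tell "no officers on record" from "lookup failed".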
This reverts commit fb82e4c.
Summary
Phase 3 of the legal_representatives extraction sweep — Latvia.
Extracts directors/officers from the data.gov.lv `amatpersonas` open dataset (CKAN resource `e665114a-73c2-4375-9470-55874b4cfa6b`) and surfaces them as the canonical `legal_representatives[]` array on `latvian-company-data` output. Flips `tier_2_available: true` when the upstream returns at least one active officer.

Free, no auth, no infra. A second CKAN `datastore_search` call alongside the existing entities-master lookup. The officer FK column `at_legal_entity_registration_number` is numeric, so the JSON filter value is sent unquoted (see comment in `fetchOfficers`).

Role normalization: stable English enum (`board_chair`, `board_member`, `council_member`, `council_chair`, `procurist`, `liquidator`). Unknown LV codes pass through verbatim. Each entry carries `name`, `role`, `start_date`, `rights_of_representation`, `representation_with_at_least`, `entity_type`.

Honest coverage caveat: the `amatpersonas` dataset is a current-active-officers snapshot only; resignations and historical entries are not exposed via this resource. Disclosed in `tier_2_available_reason`.

Scope and dependencies
This PR is stacked on #133 (Phase 1 Class A relabel for FR/SK/UK/SE). PR base is `feat/phase-1-class-a-relabel-fr-sk-uk`. When #133 merges, this PR auto-rebases to `main`. The diff shown here is the LV delta only (2 files).

BE deferred from this phase
Source research confirmed no free path exists for BE officer data:
- cbeapi.be: no officer endpoint

Recommend deferring BE to a "Paid Tier-2 vendor onboarding" phase that handles DEC-20260428-A vetting + budget approval.
Verification
- `tsc --noEmit` clean
- `validate-capability --slug latvian-company-data`: 19/20 (single Gate 5 failure is pre-existing on main: `task`/`company_name` entry points lack fixture coverage; not introduced by this change)
- `smoke-test --slug latvian-company-data`: 11/11 steps green, live execution returns 24 fields in 1272 ms

Reviewer findings
Applied this session (1):
- f623d73: registry name now spelled `Uzņēmumu reģistrs` consistently with the provenance attribution and file docstring.

Flagged for follow-up (3, none ship-blocking):
- Pass A.1: Raw `fetch` vs `safeFetch` policy gap (MEDIUM). `callDatastore` uses raw `fetch` against a hardcoded `LV_DATASTORE_API` constant. No exploitable SSRF path today since user input only flows into a query-string param. The smell is that if `callDatastore` is later extended to accept a caller-supplied base URL, the missing `safeFetch` wrapper becomes a live SSRF risk. Flagging so the next person sees the conscious choice.
- Pass A.2: Sequential awaits on name-lookup path (MEDIUM). For name queries, the entity record is fetched first (15s timeout), then officers (15s timeout), so the worst case is 30s before the handler returns. The 10s DEC-22 sync threshold means name-lookup queries flip to async more often than regcode queries. The dependency is genuine (officers are fetched by the regcode derived from the entity record), so the serial order is correct, but the behavior is worth documenting so the async flip is expected, not treated as a degradation signal.
- Pass B.1: `tier_2_available` semantic ambiguity (MEDIUM). The field is set `true` when officers count > 0 and `false` (with explanatory reason) when officers count = 0. An empty officers list is a valid registry state (newly registered entity, all officers resigned). The current boolean conflates data availability with entity state. Note: this is consistent with the canonical contract already shipped on UK/SK/FR in Phases 1+2, so changing the shape here would create cross-handler inconsistency. Worth a platform-wide follow-up (`legal_representatives_available` or `officers_known_present: boolean | null`); not appropriate to fork the shape in a single-country PR.

LOW (pre-existing, not introduced here):
- c2e4974; lives in this file but originated upstream. Out of scope.

Cross-repo
No frontend changes required. `legal_representatives` shape matches what UK/FR/SK already emit; the additional LV-specific keys (`rights_of_representation`, `representation_with_at_least`, `entity_type`) are additive.

Refs DEC-20260518-A, DEC-20260518-D.
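The "additive keys" claim in the Cross-repo note can be illustrated with a small type sketch. Everything here is an assumption for illustration: the interface names, the field types, and the reason strings are hypothetical, and only the key names and the officer-count rule come from this PR description.

```typescript
// Keys shared across countries, per the PR description. Field types are assumed.
interface LegalRepresentative {
  name: string;
  role: string;
  start_date: string | null;
}

// LV adds optional keys on top of the shared shape.
interface LvLegalRepresentative extends LegalRepresentative {
  rights_of_representation?: string;
  representation_with_at_least?: number;
  entity_type?: string;
}

// A consumer written against the shared shape; it never reads the LV extras,
// so LV entries can be passed to it unchanged.
function describeRepresentative(rep: LegalRepresentative): string {
  return rep.name + " (" + rep.role + ")";
}

// The flag follows officer count, as the PR describes.
// Reason strings are placeholders, not the handler's wording.
function tier2Flag(officers: LvLegalRepresentative[]): {
  tier_2_available: boolean;
  tier_2_available_reason: string;
} {
  if (officers.length > 0) {
    return {
      tier_2_available: true,
      tier_2_available_reason: "active officers returned by upstream snapshot",
    };
  }
  return {
    tier_2_available: false,
    tier_2_available_reason: "no active officers returned by upstream snapshot",
  };
}
```

Because the LV interface only extends the shared one with optional fields, code typed against `LegalRepresentative` keeps compiling and simply ignores the extra keys.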
Test plan
- `tier_2_available` correctly toggles based on officer count

🤖 Generated with Claude Code