| layout | minimal |
|---|---|
| title | Tasks |
| description | Prioritized technical tasks for pipeline reliability, data integrity, and SEO hardening. |
| breadcrumb | Tasks |
| breadcrumb_parent_name | Docs |
| breadcrumb_parent_url | /backlog/docs/ |
| id | doc-015 |
This task list is prioritized to protect pipeline correctness first, then data integrity, then non-visual SEO improvements.
- Standardized the pipeline around `bin/pipeline` as the canonical entrypoint.
- Removed deprecated wrapper scripts and aligned backlog/docs/CI to canonical commands.
- Kept runtime parity checks, resume artifact checks, and deterministic CI validation as hard gates.
- Added strong data validation for uniqueness and required fields.
- Normalized generated interview/video metadata to remove duplication and low-quality copy.
- Kept data checks in the CI path to fail fast on structural regressions.
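
The uniqueness and required-field checks above could be sketched roughly as follows. This is a minimal illustration, not the repository's validator: the function name `validate_records`, the `id` key, and the `REQUIRED_FIELDS` tuple are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical required fields; the real schema is not shown in this document.
REQUIRED_FIELDS = ("id", "title")


def validate_records(records, key="id", required=REQUIRED_FIELDS):
    """Return a list of readable errors; an empty list means the data passed."""
    errors = []

    # Required-field check: absent, None, or blank values all count as missing.
    for index, record in enumerate(records):
        for field in required:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                errors.append(f"record {index}: missing required field '{field}'")

    # Uniqueness check: every non-missing key value may appear only once.
    counts = Counter(r.get(key) for r in records if r.get(key) is not None)
    for value, count in sorted(counts.items(), key=lambda item: str(item[0])):
        if count > 1:
            errors.append(f"duplicate {key} '{value}' appears {count} times")

    return errors
```

A CI step would typically exit non-zero whenever the returned list is non-empty, which is what makes the check fail fast.
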
- Enforced explicit canonical semantics (`/` as root resume route, `/home/` as homepage route).
- Added output validators for canonical/indexability and semantic/schema coverage.
- Added CI artifacts for observability (`seo-metadata-report`, `schema-coverage-report`).
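
A canonical/indexability validator of the kind described above might look something like this sketch. It uses only the Python standard library; the names `HeadSignals` and `check_page` and the `example.com` URLs in the usage are hypothetical and do not come from the repository.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect canonical hrefs and robots directives from rendered HTML."""

    def __init__(self):
        super().__init__()
        self.canonicals = []
        self.robots = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonicals.append(attr.get("href") or "")
        elif tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.robots.append((attr.get("content") or "").lower())


def check_page(html, expected_canonical):
    """Return a list of problems for one rendered page (empty list means pass)."""
    parser = HeadSignals()
    parser.feed(html)
    parser.close()

    errors = []
    if len(parser.canonicals) != 1:
        errors.append(
            f"expected exactly one canonical link, found {len(parser.canonicals)}"
        )
    elif parser.canonicals[0] != expected_canonical:
        errors.append(
            f"canonical is {parser.canonicals[0]!r}, expected {expected_canonical!r}"
        )
    if any("noindex" in content for content in parser.robots):
        errors.append("page is marked noindex")
    return errors
```

For example, `check_page(html, "https://example.com/")` would flag a root page whose canonical points at `/home/` instead of `/`.
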
- Implemented build-time last-modified generation and graduated it from experimental to default behavior.
- Added verification checks to ensure rendered `dateModified` parity across article pages.
- Established template/data contract to keep metadata generation and rendered output in sync.
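
One way to picture the `dateModified` parity check: extract the JSON-LD from a rendered page and compare it with the value the build generated. The sketch below assumes a single top-level JSON-LD object per script tag (no `@graph` handling) and uses hypothetical names (`JsonLdExtractor`, `check_parity`); the actual verification in the pipeline may differ.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Parse every <script type="application/ld+json"> block in a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def rendered_date_modified(html):
    """Return all dateModified values found in top-level JSON-LD objects."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        doc["dateModified"]
        for doc in parser.documents
        if isinstance(doc, dict) and "dateModified" in doc
    ]


def check_parity(html, expected):
    """True only if the page renders exactly one dateModified equal to expected."""
    return rendered_date_modified(html) == [expected]
```

Here `expected` stands in for whatever last-modified value the build step produced for that page.
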
- Use the standard retrospective format in `docs/retrospectives/index.md` for each meaningful implementation cycle.
- Every retrospective must explicitly capture:
  - what worked
  - what did not work
  - what went well
  - what could be better
  - process improvements with owners, checkpoints, and validation methods
- The next retrospective must evaluate prior improvements with status (`upheld`, `improved`, `maintained`, `regressed`, `dropped`) and evidence.
- Root route semantics: `/` is the default root route and resume page; `/home/` is homepage content.
- Resume artifacts (`/`, `/resume.txt`, `/resume.md`) must render correctly on every build.
- Any task affecting resume or homepage routing/canonical behavior must include automated pass/fail criteria.
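
An automated pass/fail criterion for the resume artifacts could be as simple as the following sketch. The mapping of `/` to `index.html` and the idea of passing a build output directory (for instance a Jekyll `_site` folder) are assumptions for illustration; the function name is hypothetical.

```python
from pathlib import Path

# Route -> file assumed to back it in the build output directory.
# The file layout is an assumption; adjust to the real build output.
RESUME_ARTIFACTS = {
    "/": "index.html",
    "/resume.txt": "resume.txt",
    "/resume.md": "resume.md",
}


def missing_resume_artifacts(site_dir):
    """Return a problem string for every resume artifact that is absent or empty."""
    site = Path(site_dir)
    problems = []
    for route, relative in RESUME_ARTIFACTS.items():
        path = site / relative
        if not path.is_file():
            problems.append(f"{route}: {relative} not found")
        elif path.stat().st_size == 0:
            problems.append(f"{route}: {relative} is empty")
    return problems
```

This only proves the files exist and are non-empty; checking that they render correctly would need content-level assertions on top.
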
- Status: complete and stable.
- Ongoing expectation: keep runtime parity and resume guardrails as blocking CI checks.
- Status: complete and stable.
- Ongoing expectation: keep uniqueness/integrity checks in the default CI path.
- Status: implemented, actively tuned with metric-driven follow-up.
- Ongoing expectation: preserve canonical/schema contracts and artifact-based observability.
- Keep plan text synchronized with implementation in the same commit series.
- Preserve sitemap-driven smoke scope and avoid reintroducing hardcoded route lists.
- Keep `SITEMAP_MAX_URLS=5000` unless explicit publication-scope expansion is approved.
- If publication model changes, define canonical/robots policy first, then implement validation.
- Keep JSON-LD required-field rules in lockstep with template/schema changes.
- Keep semantic graph artifacts and documentation snapshots synchronized when schema contracts change.
- Keep agent/skill usage explicit: use registered skills for CI/smoke/security tasks; add a dedicated SEO semantic skill only if repeated workflows become too custom.
- Keep Playwright smoke checks bounded and observable: enforce explicit step timeouts, cap sampled sitemap routes, and always upload smoke logs as CI artifacts.
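
To illustrate the sitemap-driven, capped smoke scope described above, here is a minimal sketch that derives routes from the sitemap instead of a hardcoded list. Treating `SITEMAP_MAX_URLS` as an environment variable that acts as an upper bound is an assumption, as are the function names, the `sample_size` default, and the seeded sampling.

```python
import os
import random
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def smoke_routes(xml_text, sample_size=25, seed=0):
    """Pick a bounded, reproducible sample of sitemap URLs for smoke checks."""
    # Assumption: SITEMAP_MAX_URLS is read from the environment as a hard limit.
    max_urls = int(os.environ.get("SITEMAP_MAX_URLS", "5000"))
    urls = sitemap_urls(xml_text)
    if len(urls) > max_urls:
        raise ValueError(f"sitemap lists {len(urls)} URLs, limit is {max_urls}")
    if len(urls) <= sample_size:
        return urls
    # A fixed seed keeps the sampled routes stable between CI runs.
    return sorted(random.Random(seed).sample(urls, sample_size))
```

Seeding the sampler is a design choice that trades coverage variety for reproducible failures; rotating the seed per run would do the opposite.
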
- Sub-agents: not required for the current repository workflow. The existing pipeline + script boundaries are sufficient for deterministic SEO/schema hardening.
- Registered skills: sufficient for current needs (`gh-fix-ci`, `gh-address-comments`, `playwright`, `screenshot`, security skills).
- Gap to consider later: a dedicated SEO/semantic skill could reduce repetitive analysis steps, but this is optimization work, not a blocker.
- Transcript Coverage Expansion
  - Status: deferred (work in progress).
  - Notes: transcript onboarding should continue as new source transcript files are produced; avoid blocking other metadata/SEO work on transcript completeness.
- Metadata Completion at Scale (Active Candidate)
  - Status: active candidate.
  - Fill missing `video_assets` fields (`description`, `topic`) in prioritized batches.
  - Fill missing `interviews` `topic` values where conference/community context is known.
  - Keep topic/description conventions consistent with canonical slugs and transcript-derived phrasing.
- SEO Metadata Quality Cleanup
  - Status: implemented (2026-02-12 pass), monitor and tune.
  - Added global head-level title/description normalization for minimum and maximum lengths.
  - Improved generated interview/video metadata copy to avoid thin descriptions and low-information titles.
  - Ongoing: monitor `tmp/seo-metadata-report.json` for regressions as new content lands.
- Data Model Documentation Alignment
  - Status: implemented, keep synchronized.
  - Update docs to reflect per-file transcripts in `_data/transcripts/*.yml` as canonical transcript storage.
  - Clarify `_data/transcripts.yml` is legacy/placeholder and not the active content source.
  - Keep docs synchronized with generators/templates in the same commit series.
- Structured Data Object Model Expansion
  - Status: implemented (first complete pass), monitor and tune.
  - Interview schema now encodes richer entity relationships (`Interview`, `Person`, `Event`, `Organization`, linked `VideoObject`).
  - Resume schema now enforces `Person` + `Occupation` + career `ItemList` consistency for ATS/search use.
  - Added semantic graph snapshot docs workflow (`./bin/pipeline semantic-snapshot`) for reviewability.
- Ongoing Maintenance
  - Status: continuous.
  - Continue periodic validator hardening only where reports indicate drift.
  - Keep command/documentation grammar aligned to `bin/pipeline` subcommands.
  - Track retrospective follow-through as first-class process work, not post-hoc cleanup.
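
The head-level title/description length normalization noted under SEO Metadata Quality Cleanup could work along these lines. This is a sketch under stated assumptions: the limits (`TITLE_MAX`, `DESCRIPTION_MIN`, `DESCRIPTION_MAX`), the fallback-padding strategy, and the function names are hypothetical, not values taken from the repository.

```python
import re

# Hypothetical limits; the real thresholds are not given in this document.
TITLE_MAX = 60
DESCRIPTION_MIN = 50
DESCRIPTION_MAX = 160


def clamp(text, max_length):
    """Collapse whitespace, then truncate on a word boundary with an ellipsis."""
    text = re.sub(r"\s+", " ", text).strip()
    if len(text) <= max_length:
        return text
    # Reserve one character for the ellipsis, then drop the partial last word.
    cut = text[: max_length - 1].rsplit(" ", 1)[0].rstrip(" ,;:.-")
    return cut + "…"


def normalize_title(title):
    """Keep titles within the maximum length."""
    return clamp(title or "", TITLE_MAX)


def normalize_description(description, fallback):
    """Keep descriptions within bounds, padding thin ones with fallback copy."""
    text = clamp(description or "", DESCRIPTION_MAX)
    if len(text) < DESCRIPTION_MIN:
        text = clamp(f"{text} {fallback}".strip(), DESCRIPTION_MAX)
    return text
```

Padding with a fallback is only one way to handle the minimum; a pipeline could equally choose to report short descriptions in `tmp/seo-metadata-report.json` and leave the copy untouched.
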