
Commit c4e3a73

Tidy up template and gap docs
1 parent 6e1f76e commit c4e3a73

11 files changed

Lines changed: 391 additions & 212 deletions


paperworks/CLAUDE.md

Lines changed: 37 additions & 18 deletions
@@ -88,10 +88,12 @@ The paper inventory correlates three sources:
 - Events: job_queued, job_started, job_completed, job_failed, queue_drained
 - NEVER bypass the queue. NEVER call the session directly.
 
-## Two Pages
+## Tabs
 
-- `/` - Dashboard: top bar (auth + summary pills), paper table, log, bottom status bar
-- `/settings` - Settings: output dir, watch dirs, credentials, style
+- **Folio** - paper inventory table with status badges, detail rows, upload
+- **Render** - card-based drop targets for one-off preview rendering
+- **Settings** - output dirs (folio + render), watch dirs, style, fonts, credentials
+- **Log** - activity log with dimming
 
 Both pages use SSE (`/api/events`) for live updates. The SSE
 connection is per-page - it reconnects on navigation. State is
@@ -103,33 +105,50 @@ markdown changed -> watchdog -> render worker -> PDF created ->
 PDF watchdog -> dirty flag -> user clicks Upload -> IsoCppSession
 queue -> isocpp.org
 
-## Button UX Rules (TODO - not yet fully implemented)
+## Button UX Rules
 
-Buttons that submit work to the IsoCppSession queue (docketeer)
-or to the render worker follow this pattern:
+Buttons that submit work follow this pattern:
 
 1. **Press** -> button enters working state immediately:
    - Disabled (no re-click)
    - Animated glowing border (badge-working style)
    - 70% opacity
 2. **Log** -> IsoCppSession's on_event fires job_queued, which
    the server logs automatically. Render worker logs "Starting..."
-3. **Event** -> button stays working until an SSE event confirms
-   completion (job_completed, job_failed, rendered, render_done)
-4. **Done** -> log entry for completion, button re-enables
-5. **Log styling** -> completion entries at 50% opacity to
+3. **Event** -> button stays working until completion is
+   confirmed (SSE event or HTTP response)
+4. **Done** -> log entry for completion, button re-enables,
+   `_workingSet` cleared for the doc number
+5. **Failure** -> toast shown, button re-enables, working
+   state cleared. Both HTTP errors and SSE failures clear state.
+6. **Log styling** -> completion entries at 50% opacity to
    distinguish from active/error entries
 
-Buttons that go through the queue:
-- Upload (IsoCppSession.submit upload - syncs title, author, abstract + PDF)
+Queue-based buttons (SSE confirms completion):
+- Upload (IsoCppSession.submit upload - syncs title, author,
+  abstract + PDF)
 - Draft/Review transition (IsoCppSession.submit transition)
-- Render per-paper (render worker)
-- Render All (render worker)
-- Log In (blocks on login request, not queued but same UX)
+- Render per-paper (render worker via batch queue)
+- Render All (render worker via batch queue)
 
-Buttons that do NOT go through the queue (instant, no working
-state needed): Log Out, Clear Log, Open Folder, Shut Down,
-Save (settings), tab switches.
+Synchronous buttons (HTTP response confirms completion):
+- Render tab preview cards (render-preview endpoint with
+  `_preview_lock` serialization)
+- Log In (blocks on login request)
+
+Instant buttons (no working state needed): Log Out, Clear Log,
+Shut Down, Save (settings), tab switches.
+
+## Render Serialization
+
+All `build_pdf` calls are serialized through `_preview_lock`.
+The batch render worker acquires it per file. The preview
+endpoint acquires it for the full render. This prevents
+concurrent ReportLab font registration corruption.
+
+`build_pdf` is self-contained: it loads the font manifest,
+downloads missing fonts, registers them, and builds the PDF.
+Callers only need to provide a style dict.
 
 ## No Public Scraping
 
paperworks/lib/inventory.py

Lines changed: 3 additions & 1 deletion
@@ -3,7 +3,9 @@
 Source priority (highest to lowest):
 1. Markdown front matter + body - authoritative for all metadata
 2. Rendered PDF - fallback when markdown is absent
-3. isocpp.org - remote status only, never a metadata source
+3. isocpp.org - remote status and form URLs only. Title and author
+   from isocpp.org are used as last-resort fallback when neither
+   markdown nor PDF exists for a paper.
 
 D/P prefixes are interchangeable for matching: D4007R0 and P4007R0
 refer to the same paper.
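A small sketch of how the updated docstring could translate into code, assuming each source is a plain dict or `None` when absent. The function names `normalize_doc_number` and `resolve_metadata` are hypothetical and do not appear in the diff.

```python
def normalize_doc_number(doc_number):
    """Map the D prefix onto P so that D4007R0 and P4007R0 compare equal."""
    number = doc_number.upper()
    return "P" + number[1:] if number.startswith("D") else number


def resolve_metadata(markdown=None, pdf=None, remote=None):
    """Return title/author from the highest-priority source that exists.

    Priority follows the docstring: markdown, then rendered PDF, then
    isocpp.org as a last resort when neither local source exists.
    """
    candidates = (("markdown", markdown), ("pdf", pdf), ("isocpp", remote))
    for name, source in candidates:
        if source is not None:
            return {
                "source": name,
                "title": source.get("title"),
                "author": source.get("author"),
            }
    return {"source": None, "title": None, "author": None}
```

For example, `resolve_metadata(remote={"title": "T"})` falls through to the isocpp.org entry only because both local sources are missing.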

paperworks/lib/isocpp.py

Lines changed: 3 additions & 4 deletions
@@ -125,9 +125,7 @@ def login(self, username, password):
         if "invalid" in r.text.lower() or "incorrect" in r.text.lower():
             return False, "Invalid username or password"
 
-        self._authenticated = True
-        self._username = username
-        return True, "Logged in"
+        return False, "Unexpected response - login may have failed"
 
     def logout(self):
         with self._lock:
@@ -240,6 +238,7 @@ def _run_worker(self):
 
             with self._pending_lock:
                 self._pending.pop(job_id, None)
+                drained = len(self._pending) == 0
 
             self._active_job = None
 

@@ -256,7 +255,7 @@ def _run_worker(self):
                 payload["error"] = message
             self._emit(payload)
 
-            if self._queue.empty():
+            if drained:
                 self._emit({"event": "queue_drained"})
 
     def _emit(self, event):
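The two `_run_worker` hunks replace a late `self._queue.empty()` check with a `drained` flag computed while `_pending_lock` is held. A minimal sketch of that pattern, assuming a dict of pending jobs guarded by a lock, is shown here. `PendingTracker` and its methods are hypothetical names; only `_pending` and `_pending_lock` mirror the diff.

```python
import threading


class PendingTracker:
    """Tracks in-flight jobs and reports when the last one finishes."""

    def __init__(self):
        self._pending = {}
        self._pending_lock = threading.Lock()

    def add(self, job_id, job):
        with self._pending_lock:
            self._pending[job_id] = job

    def finish(self, job_id):
        """Remove a job; return True if nothing is pending afterwards.

        The emptiness check happens under the same lock as the pop, so
        the answer reflects the state at the moment this job was removed.
        """
        with self._pending_lock:
            self._pending.pop(job_id, None)
            return len(self._pending) == 0
```

Deciding "drained" at removal time ties the `queue_drained` event to the pending set rather than to whatever the queue happens to contain when the check finally runs.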

paperworks/lib/pdf_reader.py

Lines changed: 21 additions & 17 deletions
@@ -5,14 +5,14 @@
 
 
 _DOC_NUM_RE = re.compile(
-    r"\b([DPN]\d{4,5}R\d+)\b"
-    r"|\b([DPN]\d{4,5})\b"
-    r"|\b(N\d{4,5})\b",
+    r"\b([DPN]\d{3,5}R\d+)\b"
+    r"|\b([DPN]\d{3,5})\b"
+    r"|\b(N\d{3,5})\b",
     re.IGNORECASE,
 )
 
 _DOC_FIELD_RE = re.compile(
-    r"Document\s+Number[:\s]+([DPN]\d{4,5}(?:R\d+)?|N\d{4,5})",
+    r"Document\s+Number[:\s]+([DPN]\d{3,5}(?:R\d+)?|N\d{3,5})",
     re.IGNORECASE,
 )
 
@@ -44,13 +44,13 @@ def _extract_doc_number(text):
 
 def _doc_number_from_filename(path):
     stem = Path(path).stem.lower()
-    m = re.match(r"([dpn]\d{4,5}(?:r\d+)?)", stem)
+    m = re.match(r"([dpn]\d{3,5}(?:r\d+)?)", stem)
     if m:
         return m.group(1).upper()
     return None
 
 
-def _extract_title(text, lines):
+def _extract_title(lines):
     for line in lines[:20]:
         stripped = line.strip()
         if len(stripped) > 10 and not _DOC_NUM_RE.match(stripped):
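The practical effect of widening `\d{4,5}` to `\d{3,5}` in the filename pattern can be checked directly. The two patterns below are copied from the hunk above; the helper `doc_number` and the sample stems are illustrative assumptions.

```python
import re

# Filename patterns before and after the change, as they appear in the diff.
OLD_RE = re.compile(r"([dpn]\d{4,5}(?:r\d+)?)")
NEW_RE = re.compile(r"([dpn]\d{3,5}(?:r\d+)?)")


def doc_number(stem, pattern):
    """Return the upper-cased document number at the start of stem, or None."""
    m = pattern.match(stem.lower())
    return m.group(1).upper() if m else None


# A three-digit paper number is only recognized by the new pattern.
three_digit_old = doc_number("p123r1-draft", OLD_RE)
three_digit_new = doc_number("p123r1-draft", NEW_RE)

# Four-digit numbers behave the same under both patterns.
four_digit_old = doc_number("d4007r0", OLD_RE)
four_digit_new = doc_number("d4007r0", NEW_RE)
```

Here `three_digit_old` is `None` while `three_digit_new` is `"P123R1"`, and both four-digit results are `"D4007R0"`.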
@@ -73,11 +73,15 @@ def _extract_authors(text):
 
 
 def _extract_abstract_from_doc(doc):
-    """Scan pages 1-4 for the abstract body (page 0 has the TOC entry).
+    """Scan pages for the abstract body.
+
+    For multi-page documents, starts at page 1 (page 0 typically has
+    the TOC entry). For single-page documents, scans page 0.
 
     Returns (brutal_summary, full_abstract) or (None, None).
     """
-    for pg_num in range(1, min(5, doc.page_count)):
+    start = 0 if doc.page_count == 1 else 1
+    for pg_num in range(start, min(5, doc.page_count)):
         text = doc[pg_num].get_text()
         lines = text.split("\n")
         for i, line in enumerate(lines):
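The page range chosen by the new `start` logic can be isolated in a tiny helper. `pages_to_scan` is a hypothetical name, and the `limit=5` default simply mirrors the `min(5, doc.page_count)` bound in the hunk.

```python
def pages_to_scan(page_count, limit=5):
    """Return the page indices the abstract scan would visit."""
    # Single-page documents have no separate TOC page, so page 0 is scanned.
    start = 0 if page_count == 1 else 1
    return list(range(start, min(limit, page_count)))
```

With the previous hard-coded `range(1, ...)`, a one-page PDF produced an empty range and its abstract was never examined. With the new logic the same document yields `[0]`, while a ten-page document still yields `[1, 2, 3, 4]`.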
@@ -127,15 +131,15 @@ def read_pdf(path):
     try:
         import fitz
         doc = fitz.open(str(path))
-        if doc.page_count == 0:
+        try:
+            if doc.page_count == 0:
+                result["doc_number"] = _doc_number_from_filename(path)
+                return result
+
+            first_page = doc[0].get_text()
+            brutal, abstract = _extract_abstract_from_doc(doc)
+        finally:
             doc.close()
-            result["doc_number"] = _doc_number_from_filename(path)
-            return result
-
-        first_page = doc[0].get_text()
-
-        brutal, abstract = _extract_abstract_from_doc(doc)
-        doc.close()
     except Exception:
         result["doc_number"] = _doc_number_from_filename(path)
         return result
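The `try`/`finally` restructuring in this hunk guarantees `doc.close()` on every exit path, including the early return and any exception raised while reading pages. A self-contained sketch of that guarantee follows, using a hypothetical `FakeDoc` in place of a real `fitz` document so it runs without PyMuPDF.

```python
class FakeDoc:
    """Hypothetical stand-in for a fitz document; pages are plain strings."""

    def __init__(self, pages):
        self.pages = pages
        self.closed = False

    @property
    def page_count(self):
        return len(self.pages)

    def __getitem__(self, index):
        return self.pages[index]

    def close(self):
        self.closed = True


def first_page_text(doc):
    """Return the stripped first page, always closing the document."""
    try:
        if doc.page_count == 0:
            return None  # early return still passes through finally
        return doc[0].strip()
    finally:
        doc.close()
```

In the pre-change code the second `doc.close()` was skipped whenever `get_text()` or the abstract extraction raised, because control jumped straight to the outer `except`.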
@@ -144,7 +148,7 @@ def read_pdf(path):
     text = first_page[:3000]
 
     result["doc_number"] = _extract_doc_number(text) or _doc_number_from_filename(path)
-    result["title"] = _extract_title(text, lines)
+    result["title"] = _extract_title(lines)
     result["authors"] = _extract_authors(text)
     result["abstract"] = abstract
     result["brutal_summary"] = brutal
