feat(jobs): add apply_to_job and save_job tools #313
Gabrcodes wants to merge 773 commits into stickerdaniel:main
Lock file already has 3.1.0 since #166; align pyproject.toml floor to prevent accidental downgrades to v2. Resolves: #190
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR tightens the `fastmcp` minimum version constraint from `>=2.14.0` to `>=3.0.0` in `pyproject.toml` (and the corresponding `uv.lock` metadata), preventing any future resolver from backtracking to the incompatible v2 series. The lock file has already been pinning `fastmcp==3.1.0` since PR #166, so there is no runtime impact — this is purely a spec/metadata alignment.
- `pyproject.toml`: `fastmcp` floor raised to `>=3.0.0`
- `uv.lock`: `package.metadata.requires-dist` updated to match; the resolved package entry (`3.1.0`) is unchanged
- No upper-bound cap (`<4.0.0`) is set, which is consistent with the project's existing open-ended constraints for all other dependencies
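For reference, the resulting dependency declaration has this shape — a sketch of the relevant `pyproject.toml` fragment only (the `name` field and everything else in the project's real file are omitted or illustrative):

```toml
[project]
name = "linkedin-scraper-mcp"  # illustrative
dependencies = [
    "fastmcp>=3.0.0",  # floor raised from >=2.14.0; uv.lock stays pinned at 3.1.0
]
```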
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge — it is a pure metadata alignment with no functional or runtime impact.
- The locked version was already `3.1.0` before this PR; the only change is raising the declared floor to match. Both modified lines are trivially correct, consistent with each other, and have no side-effects on the installed environment.
- No files require special attention.
<h3>Important Files Changed</h3>
| Filename | Overview |
|----------|----------|
| pyproject.toml | Single-line change updating the `fastmcp` floor constraint from `>=2.14.0` to `>=3.0.0`, aligning with the already-resolved version in the lock file. |
| uv.lock | Auto-generated lock file metadata updated to reflect the new `>=3.0.0` specifier; the resolved `fastmcp` version (3.1.0) was already correct and unchanged. |
<h3>Flowchart</h3>
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["pyproject.toml\nfastmcp >=3.0.0"] -->|uv resolves| B["uv.lock\nfastmcp 3.1.0 (pinned)"]
B --> C["Installed environment\nfastmcp 3.1.0"]
D["Old constraint\nfastmcp >=2.14.0"] -. "could resolve to" .-> E["fastmcp 2.x\n(incompatible)"]
style D fill:#f9d0d0,stroke:#c00
style E fill:#f9d0d0,stroke:#c00
style A fill:#d0f0d0,stroke:#060
style B fill:#d0f0d0,stroke:#060
style C fill:#d0f0d0,stroke:#060
```
<sub>Last reviewed commit: 7d2363e</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Replace dict-returning handle_tool_error() with raise_tool_error() that raises FastMCP ToolError for known exceptions. Unknown exceptions re-raise as-is for mask_error_details=True to handle. Resolves: #185
Add logger.error with exc_info for unknown exceptions before re-raising, and add test coverage for AuthenticationError and ElementNotFoundError.
Re-add optional context parameter to raise_tool_error() for log correlation, and add test for base LinkedInScraperException branch.
Add catch-all comment on base exception branch and NoReturn inline comments on all raise_tool_error() call sites.
…mcp_constraint_to_3.0.0 refactor(error-handler): replace handle_tool_error with ToolError
Replace repeated ensure_authenticated/get_or_create_browser/ LinkedInExtractor boilerplate in all 6 tool functions with FastMCP Depends()-based dependency injection via a single get_extractor() factory in dependencies.py. Resolves: #186
Updated the get_extractor function to route errors through raise_tool_error, ensuring that MCP clients receive structured ToolError responses for authentication failures. Added a test to verify that authentication errors are correctly handled and produce the expected ToolError response.
…epends_to_inject_extractor refactor(tools): Use Depends() to inject extractor
Replace ToolAnnotations(...) with plain dicts, move title to top-level @mcp.tool() param, and add category tags to all tools. Resolves: #189
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR is a clean, well-scoped refactoring that modernises tool metadata across all four changed files to align with the FastMCP 3.x API. It introduces no functional or behavioural changes.
Key changes:
- Removes the `ToolAnnotations(...)` Pydantic wrapper in `company.py`, `job.py`, and `person.py`, replacing it with plain `dict` syntax for the `annotations` parameter — the simpler form supported by FastMCP 3.x.
- Moves `title` from inside `ToolAnnotations` to a top-level keyword argument on `@mcp.tool()`, matching the updated FastMCP 3.x decorator signature.
- Drops the now-redundant `destructiveHint=False` from all read-only tools. Per the MCP spec, `destructiveHint` is only meaningful when `readOnlyHint` is `false`, so omitting it from tools that already declare `readOnlyHint=True` is semantically equivalent.
- Adds `tags` (as Python `set` literals) to every tool for categorisation (`"company"`, `"job"`, `"person"`, `"scraping"`, `"search"`, `"session"`).
- Enriches the previously unannotated `close_session` tool in `server.py` with a title, `destructiveHint=True`, and the `"session"` tag — accurately describing its destructive nature.
The existing test suite in `tests/test_tools.py` covers all tool functions but does not assert on annotation metadata, so no test changes are required. The refactoring is consistent across all tool files and fits naturally within the project's layered registration pattern.
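The after-shape of the decorator can be sketched as below. The `tool` function is a minimal stand-in that only records metadata, so the example runs without FastMCP installed; the parameter names (`title`, `annotations`, `tags`) follow the PR description.

```python
# Minimal stand-in for @mcp.tool(); records metadata only.
def tool(title=None, annotations=None, tags=None):
    def wrap(fn):
        fn.meta = {"title": title, "annotations": annotations, "tags": tags}
        return fn
    return wrap


# After the refactor: top-level title, plain-dict annotations, set tags.
# destructiveHint is omitted because readOnlyHint=True makes it moot.
@tool(
    title="Get Person Profile",
    annotations={"readOnlyHint": True, "openWorldHint": True},
    tags={"person", "scraping"},
)
def get_person_profile(url: str) -> dict:
    """Placeholder tool body."""
    return {"url": url}
```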
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge — it is a pure metadata/annotation refactoring with no changes to tool logic, inputs, outputs, or error handling.
- All changes are limited to decorator parameters (`title`, `annotations`, `tags`). The `annotations` dict values are semantically equivalent to the removed `ToolAnnotations` objects, `destructiveHint=False` is correctly dropped only for `readOnlyHint=True` tools, and the new `close_session` annotations accurately reflect its destructive nature. No business logic, scraping behaviour, or error paths were altered.
- No files require special attention.
<h3>Flowchart</h3>
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["@mcp.tool() decorator"] --> B{Annotation style}
B -->|Before| C["ToolAnnotations(title=..., readOnlyHint=..., destructiveHint=False, openWorldHint=...)"]
B -->|After| D["title='...' (top-level param)\nannotations={'readOnlyHint': True, 'openWorldHint': True}\ntags={'category', 'type'}"]
D --> E["person tools\n(get_person_profile, search_people)"]
D --> F["company tools\n(get_company_profile, get_company_posts)"]
D --> G["job tools\n(get_job_details, search_jobs)"]
D --> H["session tool\n(close_session)\nannotations={'destructiveHint': True}"]
```
<sub>Last reviewed commit: c5bf554</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Use lowercase dict instead of Dict, add auth validation log line
…t_lifespan_into_composable_browser_auth_lifespans refactor(server): Split lifespan into composable browser + auth lifespans
# Conflicts:
#	linkedin_mcp_server/server.py
#	linkedin_mcp_server/tools/company.py
#	linkedin_mcp_server/tools/job.py
#	linkedin_mcp_server/tools/person.py
# Conflicts:
#	linkedin_mcp_server/server.py
…_timeouts feat(tools): add global 90s tool timeouts
…_jobs Extract job IDs from href attributes (the one thing innerText can't capture), scroll the job sidebar instead of the main page, and paginate through multiple result pages with dynamic offsets. Resolves: #195
- Use fixed 25-per-page offset instead of dynamic ID count
- Read "Page X of Y" from pagination state to cap pagination
- Add soft rate-limit retry via _extract_search_page helper
- Use keyword arguments in tool wrapper for clarity
- Stop on page 0 when no job IDs found (avoid useless page 1)
- Fix test_stops_at_total_pages to use distinct IDs per page so only the total_pages guard stops pagination
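The fixed-offset pagination described above reduces to simple arithmetic: page N is requested at offset N × 25, capped by both the "Page X of Y" total and the caller's `max_pages`. A sketch (the name `page_offsets` is illustrative; the page size follows the commit's module-level `_PAGE_SIZE`):

```python
PAGE_SIZE = 25  # fixed per-page offset from the commit message


def page_offsets(total_pages: int, max_pages: int):
    """Yield the result-list offsets to request, capped by both limits."""
    for page in range(min(total_pages, max_pages)):
        yield page * PAGE_SIZE
```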
Add date_posted, job_type, experience_level, work_type, easy_apply, and sort_by filters to search_jobs with human-readable normalization.

Fix Greptile review: always log no-results break, move _PAGE_SIZE to module level, add Field(ge=1, le=10) on max_pages, skip ID extraction on empty text.

Resolves: #174
Use _normalize_csv for job_type to preserve raw commas in multi-value filters and add human-readable names (full_time, contract, etc.).
Break early when _extract_search_page returns _RATE_LIMITED_MSG to avoid extracting IDs from unreliable DOM state. Remove redundant truthiness check now guarded by the early break.
feat: linkedin messaging, get sidebar profiles
…IDs (#300)

* fix(scraping): Respect --timeout for messaging, recognize thread URLs

Remove all hardcoded timeout=5000 from the send_message flow and messaging helpers so they fall through to the page-level default set from BrowserConfig.default_timeout (configurable via --timeout). Also add /messaging/thread/ URL recognition to classify_link so conversation thread references are captured when they appear in search results or conversation detail views. Raise inbox reference cap to 30 and add proper section context labels.

Resolves: #296
See also: #297

* fix(scraping): Extract conversation thread IDs from inbox via click-and-capture

LinkedIn's conversation sidebar uses JS click handlers instead of <a> tags, so anchor extraction cannot capture thread IDs. Click each conversation item and read the resulting SPA URL change to build conversation references with thread_id and participant name.

Before: get_inbox returned 2 references (active conversation only)
After: get_inbox returns all conversation thread IDs (10+ refs)

Resolves: #297

* fix(scraping): Respect --timeout across all remaining scraping methods

Remove the remaining 10 hardcoded timeout=5000 from profile scraping, connection flow, modal detection, sidebar profiles, conversation resolution, and job search. All Playwright calls now use the page-level default from BrowserConfig.default_timeout.

Resolves: #299

* fix: Address PR review feedback

- Use saved inbox URL instead of self._page.url (P1: wrong URL after clicks)
- Fix docstring to clarify 2s recipient-picker probe is intentional
- Replace class-name selectors with aria-label discovery + minimal class fallback
- Dedupe references after merging conversation and anchor refs
First-time uvx runs download ~77 Python packages including the 39MB patchright wheel. On slow connections, uv's default 30s HTTP timeout can cause silent failures before the server process starts. Co-authored-by: Daniel Sticker <sticker@ngenn.net>
Move UV_HTTP_TIMEOUT=300 into the main uvx config example so it's the default, not an optional troubleshooting step. Fix grammar in the troubleshooting note. Co-authored-by: Daniel Sticker <sticker@ngenn.net>
* docs: use @latest tag in uvx config for auto-updates

Without @latest, uvx caches the first downloaded version forever. Adding @latest ensures uvx checks PyPI on each client launch and pulls new versions automatically.

* docs: apply @latest consistently to all uvx invocations

Update --login examples in README.md and docs/docker-hub.md to use linkedin-scraper-mcp@latest for consistency with the MCP config.

Co-authored-by: Daniel Sticker <sticker@ngenn.net>
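Combining the two documentation fixes, a client config entry would look roughly like this. The `mcpServers`/server-name layout is the common MCP client convention and is illustrative; the package name follows the `linkedin-scraper-mcp@latest` form used in the commits:

```json
{
  "mcpServers": {
    "linkedin": {
      "command": "uvx",
      "args": ["linkedin-scraper-mcp@latest"],
      "env": {
        "UV_HTTP_TIMEOUT": "300"
      }
    }
  }
}
```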
Two new action tools for job hunting:

apply_to_job:
- Automates LinkedIn's Easy Apply multi-step modal
- Confirm gate: set confirm_apply=true to actually submit
- Handles Next/Review/Submit button flow up to 15 steps
- Reports which fields need manual input when required fields are empty
- Distinguishes confirmed vs unconfirmed submission
- Scrolls Easy Apply button into view before clicking (sticky navbar fix)
- Statuses: applied, applied_unconfirmed, already_applied, not_easy_apply, confirmation_required, requires_input, apply_failed

save_job:
- Bookmarks a job posting for later review
- Verifies button state changed after clicking
- Statuses: saved, already_saved, save_unavailable

Both tools are annotated with destructiveHint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
<h3>Greptile Summary</h3>
This PR adds
<h3>Confidence Score: 4/5</h3>
- Safe to merge after considering the worst-case timeout risk; the logic and status reporting are solid.
- All prior P0/P1 findings are resolved. One P2 remains: the loop's worst-case runtime (up to ~75 s) cuts close to the 90 s tool timeout, which could leave the dialog open mid-application on slow connections. Reducing max_steps or the per-step dialog timeout would eliminate this risk before merging.
- linkedin_mcp_server/scraping/extractor.py — loop timing budget in apply_to_job
<h3>Important Files Changed</h3>
<h3>Flowchart</h3>
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[apply_to_job called] --> B[Navigate to job URL]
B --> C{Easy Apply button found?}
C -- No --> D{Page text: 'You applied' / 'Application submitted'?}
D -- Yes --> E[already_applied]
D -- No --> F[not_easy_apply]
C -- Yes --> G{confirm_apply?}
G -- False --> H[confirmation_required]
G -- True --> I[Click Easy Apply]
I --> J{Dialog opens within 10s?}
J -- No --> K[apply_failed]
J -- Yes --> L[Loop: max 15 steps]
L --> M{Dialog still open? timeout=3s}
M -- No --> N{Page text matches submission phrases?}
N -- Yes --> O[applied_unconfirmed]
N -- No --> P[applied_unconfirmed dialog_closed_early]
M -- Yes --> Q{Required fields empty?}
Q -- Yes --> R[Dismiss + requires_input]
Q -- No --> S{Submit button?}
S -- Yes --> T[Click Submit, wait 2s]
T --> U{dialog_text or page_text matches?}
U -- Yes --> V[applied]
U -- No --> W[applied_unconfirmed]
S -- No --> X{Review / Next / Continue button?}
X -- Yes --> Y[Click + continue loop]
X -- No --> Z[apply_failed stuck]
L -- exhausted --> AA[apply_failed max steps]
```
This is a comment left during a code review.
Path: linkedin_mcp_server/scraping/extractor.py
Line: 1843-1845
Comment:
**Worst-case loop duration approaches the tool timeout**
With `max_steps=15`, each iteration sleeps 1 s then calls `_dialog_is_open(timeout=3000)` (up to 3 s), giving up to 60 s of loop time alone. Adding the 2 s submit sleep, ~10 s modal wait, and page navigation puts the worst case at ~75–80 s against a `TOOL_TIMEOUT_SECONDS` of 90 s. On a slow connection or a long form this will silently time out before the loop finishes, leaving the dialog open mid-application with no cleanup.
Consider either lowering `max_steps` (10 is plenty for any observed LinkedIn Easy Apply form), reducing `_dialog_is_open(timeout=…)` to 1000 ms inside the loop, or adding an explicit elapsed-time guard.
How can I resolve this? If you propose a fix, please make it concise.
<sub>Last reviewed commit: "fix: dismiss dialog on unconfirmed submi..."</sub>
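The elapsed-time guard suggested in the review comment can be sketched as below. The function and status strings are illustrative, not the project's real code; the numbers mirror the budget in the comment (90 s tool timeout, with margin reserved for dialog cleanup).

```python
import time

TOOL_TIMEOUT_SECONDS = 90   # from the review comment
SAFETY_MARGIN_SECONDS = 15  # assumed headroom for dismissing the dialog


def run_apply_loop(step_fn, max_steps: int = 15) -> str:
    """Run the stepper, bailing out before the tool timeout can fire.

    step_fn() returns True once the application has been submitted.
    """
    deadline = time.monotonic() + TOOL_TIMEOUT_SECONDS - SAFETY_MARGIN_SECONDS
    for _ in range(max_steps):
        if time.monotonic() >= deadline:
            # Stop while there is still time to dismiss the dialog cleanly.
            return "apply_failed: elapsed-time guard hit"
        if step_fn():
            return "applied"
    return "apply_failed: max steps"
```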
…t confirmation

- fix(extractor): replace LinkedIn CSS class-name selectors in already_applied check with page text check — avoids brittle selectors that violate CLAUDE.md and silently break when LinkedIn updates CSS
- fix(extractor): distinguish dialog-closed-early from max-steps in apply_failed error message using dialog_closed_early flag
- fix(extractor): check both dialog and main page text for submission confirmation after clicking Submit — LinkedIn shows the confirmation screen inside the still-open modal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix: change \bApplied\b to 'You applied|Application submitted' to avoid false positives on job titles like 'Applied ML Engineer'; add re.IGNORECASE for consistency with all other re.search calls
- fix: skip individual radio inputs in required-field check (input.value is always set for radios regardless of selection); add group-level check using input[type=radio][required]:checked per named group

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix(extractor): move required-fields check BEFORE Review/Next/Continue buttons in apply_to_job — LinkedIn always renders Next on steps with unfilled required fields, making requires_input unreachable when checked last; now evaluated first so the status fires correctly
- fix(extractor): scroll Save button into view before clicking in save_job to avoid sticky navbar obstruction (same fix as #304)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- fix(extractor): move required-fields check before Review/Next/Continue buttons — LinkedIn renders Next even on unfilled-required steps, so checking last makes requires_input dead code; now evaluated first
- fix(extractor): scroll Save button into view before clicking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e radio name

- fix: move required_empty check before Submit button — EEO/disclosure questions on the final step now trigger requires_input instead of applied_unconfirmed
- fix: tighten dialog-closed-early regex to multi-word phrases to avoid false positives on job titles like 'Applied Machine Learning Engineer'
- fix: escape CSS-special chars in radio name attribute before interpolating into querySelector template literal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix: post-Submit confirmation uses dialog_text as primary signal (LinkedIn shows explicit 'Application sent' in the dialog before closing). Falls back to page_text only when dialog is already gone. Multi-word phrases only to avoid matching 'Applied ML Engineer' in the job listing body.
- fix: correct Python backslash escaping for CSS selector injection guard — /\/g in Python source becomes /\/g in JS (match backslash), not /\/g (match forward-slash) as before.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dialog_closed_early represents an ambiguous outcome — the application may or may not have been submitted. apply_failed implies definitive failure, which would mislead callers into retrying, risking duplicate submissions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix: add aria-required="true" to required-field selectors in apply_to_job — LinkedIn's SPA marks many fields as required via ARIA rather than the HTML5 required attribute; both text fields and radio groups now check both patterns
- fix: remove dead saved_btn locator in save_job verification — unsave_btn already matches 'Saved' via its regex pattern, making the separate saved_btn check unreachable dead code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix: call _dismiss_dialog() before returning applied_unconfirmed when submit confirmation cannot be verified — leaves browser state clean for subsequent tool calls (mirrors requires_input and stuck paths)
- fix: skip input[type='file'] in required-field check — browsers always report file.value as '' regardless of selection, so file upload fields would always trigger requires_input even when a resume is on file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Two new action tools for the job hunting workflow.
`apply_to_job` — automates LinkedIn's Easy Apply modal:
- `confirm_apply=false` (default) dry-runs and reports whether Easy Apply is available
- `confirm_apply=true` steps through the modal: Next → Review → Submit
- Distinguishes confirmed submission (`applied`) from unconfirmed (`applied_unconfirmed`)

`save_job` — bookmarks a job posting:
- Returns `save_unavailable` if the click didn't take effect

Both are annotated with `destructiveHint`.

Changes
- `scraping/extractor.py` — `apply_to_job`, `save_job` methods
- `tools/job.py` — `apply_to_job`, `save_job` tool registrations

Test plan
- `ruff check` and `ruff format` pass