
feat(jobs): add apply_to_job and save_job tools #313

Closed

Gabrcodes wants to merge 773 commits into stickerdaniel:main from Gabrcodes:feat/job-actions

Conversation

@Gabrcodes

Summary

Two new action tools for the job hunting workflow.

apply_to_job — automates LinkedIn's Easy Apply modal:

  • confirm_apply=false (default) dry-runs and reports whether Easy Apply is available
  • confirm_apply=true steps through the modal: Next → Review → Submit
  • Reports which fields need manual input when required fields are empty
  • Scrolls the button into view before clicking (fixes the sticky navbar viewport issue from #304)
  • Distinguishes confirmed submission (applied) from unconfirmed (applied_unconfirmed)
  • Checks page text when dialog closes unexpectedly to detect silent submissions

save_job — bookmarks a job posting:

  • Verifies the button transitioned to "Saved"/"Unsave" after clicking
  • Returns save_unavailable if the click didn't take effect

Both are annotated with destructiveHint.
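As a rough illustration of the registration pattern (a sketch only: the tool name and confirm_apply come from this PR, while the URL parameter, server name, return shape, and tag values are assumptions):

```python
from fastmcp import FastMCP

mcp = FastMCP("linkedin-scraper-mcp")  # server name assumed for this sketch

@mcp.tool(
    title="Apply to job",
    annotations={"destructiveHint": True, "openWorldHint": True},  # destructive action
    tags={"job"},
)
async def apply_to_job(job_url: str, confirm_apply: bool = False) -> dict:
    """Dry-run by default; confirm_apply=True steps through the Easy Apply modal."""
    ...  # navigation and modal handling live in scraping/extractor.py
```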

Changes

  • scraping/extractor.py: apply_to_job, save_job methods
  • tools/job.py: apply_to_job, save_job tool registrations

Test plan

  • 357 passed, 5 skipped, 0 failures
  • ruff check and ruff format pass
  • Live test: Easy Apply on a real job posting
  • Live test: Save a job and verify in saved jobs list

stickerdaniel and others added 30 commits March 4, 2026 20:30
Lock file already has 3.1.0 since #166; align pyproject.toml
floor to prevent accidental downgrades to v2.

Resolves: #190

Greptile Summary

This PR tightens the `fastmcp` minimum version constraint from `>=2.14.0` to `>=3.0.0` in `pyproject.toml` (and the corresponding `uv.lock` metadata), preventing any future resolver from backtracking to the incompatible v2 series. The lock file has already been pinning `fastmcp==3.1.0` since PR #166, so there is no runtime impact — this is purely a spec/metadata alignment.

- `pyproject.toml`: `fastmcp` floor raised to `>=3.0.0`
- `uv.lock`: `package.metadata.requires-dist` updated to match; the resolved package entry (`3.1.0`) is unchanged
- No upper-bound cap (`<4.0.0`) is set, which is consistent with the project's existing open-ended constraints for all other dependencies

Confidence Score: 5/5

- This PR is safe to merge — it is a pure metadata alignment with no functional or runtime impact.
- The locked version was already `3.1.0` before this PR; the only change is raising the declared floor to match. Both modified lines are trivially correct, consistent with each other, and have no side-effects on the installed environment.
- No files require special attention.

Important Files Changed




| Filename | Overview |
|----------|----------|
| pyproject.toml | Single-line change updating the `fastmcp` floor constraint from `>=2.14.0` to `>=3.0.0`, aligning with the already-resolved version in the lock file. |
| uv.lock | Auto-generated lock file metadata updated to reflect the new `>=3.0.0` specifier; the resolved `fastmcp` version (3.1.0) was already correct and unchanged. |




Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["pyproject.toml\nfastmcp >=3.0.0"] -->|uv resolves| B["uv.lock\nfastmcp 3.1.0 (pinned)"]
    B --> C["Installed environment\nfastmcp 3.1.0"]
    D["Old constraint\nfastmcp >=2.14.0"] -. "could resolve to" .-> E["fastmcp 2.x\n(incompatible)"]
    style D fill:#f9d0d0,stroke:#c00
    style E fill:#f9d0d0,stroke:#c00
    style A fill:#d0f0d0,stroke:#060
    style B fill:#d0f0d0,stroke:#060
    style C fill:#d0f0d0,stroke:#060
```

Last reviewed commit: 7d2363e
Replace dict-returning handle_tool_error() with raise_tool_error()
that raises FastMCP ToolError for known exceptions. Unknown exceptions are
re-raised as-is for mask_error_details=True to handle.

Resolves: #185
Add logger.error with exc_info for unknown exceptions before re-raising,
and add test coverage for AuthenticationError and ElementNotFoundError.
Re-add optional context parameter to raise_tool_error() for log
correlation, and add test for base LinkedInScraperException branch.
Add catch-all comment on base exception branch and NoReturn
inline comments on all raise_tool_error() call sites.
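A minimal sketch of the pattern these commits describe (raise_tool_error, the optional context parameter, NoReturn, and LinkedInScraperException are all named above; the ToolError import path and message formatting are assumptions):

```python
import logging
from typing import NoReturn

from fastmcp.exceptions import ToolError  # import path assumed

logger = logging.getLogger(__name__)

class LinkedInScraperException(Exception):
    """Stand-in for the project's base exception, named in the commits above."""

def raise_tool_error(exc: Exception, context: str | None = None) -> NoReturn:
    if isinstance(exc, LinkedInScraperException):  # catch-all for known errors
        raise ToolError(str(exc)) from exc
    # Unknown exceptions: log with traceback, then re-raise as-is for
    # mask_error_details=True to handle.
    logger.error("Unhandled error%s", f" in {context}" if context else "", exc_info=exc)
    raise exc
```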
…mcp_constraint_to_3.0.0

refactor(error-handler): replace handle_tool_error with ToolError
Replace repeated ensure_authenticated/get_or_create_browser/
LinkedInExtractor boilerplate in all 6 tool functions with
FastMCP Depends()-based dependency injection via a single
get_extractor() factory in dependencies.py.

Resolves: #186
Updated the get_extractor function to route errors through raise_tool_error, ensuring that MCP clients receive structured ToolError responses for authentication failures. Added a test to verify that authentication errors are correctly handled and produce the expected ToolError response.
…epends_to_inject_extractor

refactor(tools): Use Depends() to inject extractor
Replace ToolAnnotations(...) with plain dicts, move title to
top-level @mcp.tool() param, and add category tags to all tools.

Resolves: #189

Greptile Summary

This PR is a clean, well-scoped refactoring that modernises tool metadata across all four changed files to align with the FastMCP 3.x API. It introduces no functional or behavioural changes.

Key changes:
- Removes the `ToolAnnotations(...)` Pydantic wrapper in `company.py`, `job.py`, and `person.py`, replacing it with plain `dict` syntax for the `annotations` parameter — the simpler form supported by FastMCP 3.x.
- Moves `title` from inside `ToolAnnotations` to a top-level keyword argument on `@mcp.tool()`, matching the updated FastMCP 3.x decorator signature.
- Drops the now-redundant `destructiveHint=False` from all read-only tools. Per the MCP spec, `destructiveHint` is only meaningful when `readOnlyHint` is `false`, so omitting it from tools that already declare `readOnlyHint=True` is semantically equivalent.
- Adds `tags` (as Python `set` literals) to every tool for categorisation (`"company"`, `"job"`, `"person"`, `"scraping"`, `"search"`, `"session"`).
- Enriches the previously unannotated `close_session` tool in `server.py` with a title, `destructiveHint=True`, and the `"session"` tag — accurately describing its destructive nature.

The existing test suite in `tests/test_tools.py` covers all tool functions but does not assert on annotation metadata, so no test changes are required. The refactoring is consistent across all tool files and fits naturally within the project's layered registration pattern.
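As a hedged before/after illustration (the tool name and annotation fields are drawn from the summary above; the exact signatures in the diff may differ):

```python
from fastmcp import FastMCP

mcp = FastMCP("linkedin-scraper-mcp")  # server name assumed

# Before (removed in this PR): title nested inside the Pydantic wrapper, e.g.
#   @mcp.tool(annotations=ToolAnnotations(title="Get person profile",
#                                         readOnlyHint=True, destructiveHint=False,
#                                         openWorldHint=True))

# After: top-level title, plain dict, category tags; destructiveHint is dropped
# because it is only meaningful when readOnlyHint is false.
@mcp.tool(
    title="Get person profile",
    annotations={"readOnlyHint": True, "openWorldHint": True},
    tags={"person", "scraping"},
)
async def get_person_profile(linkedin_url: str) -> dict:
    ...
```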

Confidence Score: 5/5

- This PR is safe to merge — it is a pure metadata/annotation refactoring with no changes to tool logic, inputs, outputs, or error handling.
- All changes are limited to decorator parameters (`title`, `annotations`, `tags`). The `annotations` dict values are semantically equivalent to the removed `ToolAnnotations` objects, `destructiveHint=False` is correctly dropped only for `readOnlyHint=True` tools, and the new `close_session` annotations accurately reflect its destructive nature. No business logic, scraping behaviour, or error paths were altered.
- No files require special attention.

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["@mcp.tool() decorator"] --> B{Annotation style}
    B -->|Before| C["ToolAnnotations(title=..., readOnlyHint=..., destructiveHint=False, openWorldHint=...)"]
    B -->|After| D["title='...' (top-level param)\nannotations={'readOnlyHint': True, 'openWorldHint': True}\ntags={'category', 'type'}"]
    D --> E["person tools\n(get_person_profile, search_people)"]
    D --> F["company tools\n(get_company_profile, get_company_posts)"]
    D --> G["job tools\n(get_job_details, search_jobs)"]
    D --> H["session tool\n(close_session)\nannotations={'destructiveHint': True}"]
```

Last reviewed commit: c5bf554
Use lowercase dict instead of Dict, add auth validation log line
…t_lifespan_into_composable_browser_auth_lifespans

refactor(server): Split lifespan into composable browser + auth lifespans
# Conflicts:
#	linkedin_mcp_server/server.py
#	linkedin_mcp_server/tools/company.py
#	linkedin_mcp_server/tools/job.py
#	linkedin_mcp_server/tools/person.py
…_timeouts

feat(tools): add global 90s tool timeouts
…_jobs

Extract job IDs from href attributes (the one thing innerText can't
capture), scroll the job sidebar instead of the main page, and paginate
through multiple result pages with dynamic offsets.

Resolves: #195
- Use fixed 25-per-page offset instead of dynamic ID count
- Read "Page X of Y" from pagination state to cap pagination
- Add soft rate-limit retry via _extract_search_page helper
- Use keyword arguments in tool wrapper for clarity
- Stop on page 0 when no job IDs found (avoid useless page 1)
- Fix test_stops_at_total_pages to use distinct IDs per page so
  only the total_pages guard stops pagination
Add date_posted, job_type, experience_level, work_type, easy_apply,
and sort_by filters to search_jobs with human-readable normalization.
Fix Greptile review: always log no-results break, move _PAGE_SIZE to
module level, add Field(ge=1, le=10) on max_pages, skip ID extraction
on empty text.

Resolves: #174
Use _normalize_csv for job_type to preserve raw commas in multi-value
filters and add human-readable names (full_time, contract, etc.).
Break early when _extract_search_page returns _RATE_LIMITED_MSG to
avoid extracting IDs from unreliable DOM state. Remove redundant
truthiness check now guarded by the early break.
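A small sketch of the pagination cap these commits describe (_PAGE_SIZE and max_pages appear above; the helper names and the source of the pagination text are assumptions):

```python
import re

_PAGE_SIZE = 25  # fixed per-page offset, as described in the commits above

def offset_for(page_index: int) -> int:
    # "start" offset for page N of the search results
    return page_index * _PAGE_SIZE

def cap_total_pages(pagination_text: str, max_pages: int) -> int:
    # Read "Page X of Y" from the pagination state and stop at Y, capped at max_pages.
    match = re.search(r"Page\s+\d+\s+of\s+(\d+)", pagination_text, re.IGNORECASE)
    return min(int(match.group(1)), max_pages) if match else max_pages

# cap_total_pages("Page 1 of 3", max_pages=10) -> 3
```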
stickerdaniel and others added 12 commits March 30, 2026 18:34
feat: linkedin messaging, get sidebar profiles
…IDs (#300)

* fix(scraping): Respect --timeout for messaging, recognize thread URLs

Remove all hardcoded timeout=5000 from the send_message flow and
messaging helpers so they fall through to the page-level default
set from BrowserConfig.default_timeout (configurable via --timeout).

Also add /messaging/thread/ URL recognition to classify_link so
conversation thread references are captured when they appear in
search results or conversation detail views. Raise inbox reference
cap to 30 and add proper section context labels.

Resolves: #296
See also: #297
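Roughly, the pattern being described (BrowserConfig.default_timeout is named above; the helper and the unit of the configured value are assumptions):

```python
def apply_default_timeout(page, default_timeout_ms: float) -> None:
    # One page-level default (from BrowserConfig.default_timeout) instead of a
    # hardcoded timeout=5000 on every Playwright call.
    page.set_default_timeout(default_timeout_ms)

# Before: await page.wait_for_selector(selector, timeout=5000)
# After:  await page.wait_for_selector(selector)  # falls back to the page default
```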

* fix(scraping): Extract conversation thread IDs from inbox via click-and-capture

LinkedIn's conversation sidebar uses JS click handlers instead of <a>
tags, so anchor extraction cannot capture thread IDs. Click each
conversation item and read the resulting SPA URL change to build
conversation references with thread_id and participant name.

Before: get_inbox returned 2 references (active conversation only)
After: get_inbox returns all conversation thread IDs (10+ refs)

Resolves: #297
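A hedged sketch of the click-and-capture approach (the /messaging/thread/ URL shape is mentioned in this commit series; the helper, locator handling, and name extraction are illustrative assumptions):

```python
import re

async def capture_thread_ids(page, conversation_items) -> list[dict]:
    """Click each sidebar conversation and read the SPA URL for its thread ID."""
    refs = []
    for item in conversation_items:
        text = (await item.inner_text()).strip()
        name = text.splitlines()[0] if text else ""   # participant name (assumed layout)
        await item.click()                             # JS handler updates the SPA URL
        await page.wait_for_url(re.compile(r"/messaging/thread/"))
        match = re.search(r"/messaging/thread/([^/?#]+)", page.url)
        if match:
            refs.append({"thread_id": match.group(1), "participant": name})
    return refs
```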

* fix(scraping): Respect --timeout across all remaining scraping methods

Remove the remaining 10 hardcoded timeout=5000 from profile scraping,
connection flow, modal detection, sidebar profiles, conversation
resolution, and job search. All Playwright calls now use the page-level
default from BrowserConfig.default_timeout.

Resolves: #299

* fix: Address PR review feedback

- Use saved inbox URL instead of self._page.url (P1: wrong URL after clicks)
- Fix docstring to clarify 2s recipient-picker probe is intentional
- Replace class-name selectors with aria-label discovery + minimal class fallback
- Dedupe references after merging conversation and anchor refs
First-time uvx runs download ~77 Python packages including the 39MB
patchright wheel. On slow connections, uv's default 30s HTTP timeout
can cause silent failures before the server process starts.

Co-authored-by: Daniel Sticker <sticker@ngenn.net>
Move UV_HTTP_TIMEOUT=300 into the main uvx config example so it's the
default, not an optional troubleshooting step. Fix grammar in the
troubleshooting note.

Co-authored-by: Daniel Sticker <sticker@ngenn.net>
* docs: use @latest tag in uvx config for auto-updates

Without @latest, uvx caches the first downloaded version forever.
Adding @latest ensures uvx checks PyPI on each client launch and
pulls new versions automatically.

* docs: apply @latest consistently to all uvx invocations

Update --login examples in README.md and docs/docker-hub.md to use
linkedin-scraper-mcp@latest for consistency with the MCP config.

---------

Co-authored-by: Daniel Sticker <sticker@ngenn.net>
Two new action tools for job hunting:

apply_to_job:
- Automates LinkedIn's Easy Apply multi-step modal
- Confirm gate: set confirm_apply=true to actually submit
- Handles Next/Review/Submit button flow up to 15 steps
- Reports which fields need manual input when required fields are empty
- Distinguishes confirmed vs unconfirmed submission
- Scrolls Easy Apply button into view before clicking (sticky navbar fix)
- Statuses: applied, applied_unconfirmed, already_applied,
  not_easy_apply, confirmation_required, requires_input, apply_failed

save_job:
- Bookmarks a job posting for later review
- Verifies button state changed after clicking
- Statuses: saved, already_saved, save_unavailable

Both tools are annotated with destructiveHint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@greptile-apps
Contributor

greptile-apps Bot commented Apr 2, 2026

Greptile Summary

This PR adds apply_to_job and save_job tools that automate LinkedIn's Easy Apply modal and job bookmarking, respectively. All eight issues raised in the prior review round have been addressed: class-name selectors replaced with text-based detection, required-field checks moved before any button click (including Submit), radio-group detection added at the group level, multi-word confirmation phrases used to avoid job-title false positives, and dialog_text prioritised over page_text for the submit confirmation signal.

Confidence Score: 4/5

Safe to merge after considering the worst-case timeout risk; the logic and status reporting are solid.

All prior P0/P1 findings are resolved. One P2 remains: the loop's worst-case runtime (up to ~75 s) cuts close to the 90 s tool timeout, which could leave the dialog open mid-application on slow connections. Reducing max_steps or the per-step dialog timeout would eliminate this risk before merging.

linkedin_mcp_server/scraping/extractor.py — loop timing budget in apply_to_job

Important Files Changed

| Filename | Overview |
|----------|----------|
| linkedin_mcp_server/scraping/extractor.py | Adds apply_to_job and save_job methods; prior review concerns (class selectors, false-positive text matches, radio group detection, required-field ordering, dialog confirmation scope) are all addressed; one potential timeout risk remains under worst-case step counts. |
| linkedin_mcp_server/tools/job.py | Registers apply_to_job and save_job tools with destructiveHint/openWorldHint annotations; follows the same pattern as existing tools, and error handling and progress reporting are consistent. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[apply_to_job called] --> B[Navigate to job URL]
    B --> C{Easy Apply button found?}
    C -- No --> D{Page text: 'You applied' / 'Application submitted'?}
    D -- Yes --> E[already_applied]
    D -- No --> F[not_easy_apply]
    C -- Yes --> G{confirm_apply?}
    G -- False --> H[confirmation_required]
    G -- True --> I[Click Easy Apply]
    I --> J{Dialog opens within 10s?}
    J -- No --> K[apply_failed]
    J -- Yes --> L[Loop: max 15 steps]
    L --> M{Dialog still open? timeout=3s}
    M -- No --> N{Page text matches submission phrases?}
    N -- Yes --> O[applied_unconfirmed]
    N -- No --> P[applied_unconfirmed dialog_closed_early]
    M -- Yes --> Q{Required fields empty?}
    Q -- Yes --> R[Dismiss + requires_input]
    Q -- No --> S{Submit button?}
    S -- Yes --> T[Click Submit, wait 2s]
    T --> U{dialog_text or page_text matches?}
    U -- Yes --> V[applied]
    U -- No --> W[applied_unconfirmed]
    S -- No --> X{Review / Next / Continue button?}
    X -- Yes --> Y[Click + continue loop]
    X -- No --> Z[apply_failed stuck]
    L -- exhausted --> AA[apply_failed max steps]
```
Review comment on linkedin_mcp_server/scraping/extractor.py, lines 1843-1845:

Comment:
**Worst-case loop duration approaches the tool timeout**

With `max_steps=15`, each iteration sleeps 1 s then calls `_dialog_is_open(timeout=3000)` (up to 3 s), giving up to 60 s of loop time alone. Adding the 2 s submit sleep, ~10 s modal wait, and page navigation puts the worst case at ~75–80 s against a `TOOL_TIMEOUT_SECONDS` of 90 s. On a slow connection or a long form this will silently time out before the loop finishes, leaving the dialog open mid-application with no cleanup.

Consider either lowering `max_steps` (10 is plenty for any observed LinkedIn Easy Apply form), reducing `_dialog_is_open(timeout=…)` to 1000 ms inside the loop, or adding an explicit elapsed-time guard.

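One way to read that suggestion, as a rough sketch (max_steps, _dialog_is_open, _dismiss_dialog, and the 90 s tool timeout come from the comment above; the loop body, budget value, and helper shape are assumptions):

```python
import time

STEP_BUDGET_S = 60   # stay well under TOOL_TIMEOUT_SECONDS = 90
MAX_STEPS = 10       # lower than the current 15, per the suggestion

async def step_through_modal(extractor) -> dict:
    """Sketch of the loop with an explicit elapsed-time guard."""
    deadline = time.monotonic() + STEP_BUDGET_S
    for _ in range(MAX_STEPS):
        if time.monotonic() > deadline:
            await extractor._dismiss_dialog()   # leave the dialog closed on bail-out
            return {"status": "apply_failed", "reason": "step budget exhausted"}
        if not await extractor._dialog_is_open(timeout=1000):  # shorter in-loop probe
            break
        ...  # required-field check, Submit / Review / Next handling
    return {"status": "applied_unconfirmed"}
```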


…t confirmation

- fix(extractor): replace LinkedIn CSS class-name selectors in
  already_applied check with page text check — avoids brittle selectors
  that violate CLAUDE.md and silently break when LinkedIn updates CSS
- fix(extractor): distinguish dialog-closed-early from max-steps in
  apply_failed error message using dialog_closed_early flag
- fix(extractor): check both dialog and main page text for submission
  confirmation after clicking Submit — LinkedIn shows the confirmation
  screen inside the still-open modal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
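Roughly, the text-based check described here (the confirmation phrases come from the follow-up commit below; the helper name and use of the body text are assumptions):

```python
import re

_ALREADY_APPLIED = re.compile(r"You applied|Application submitted", re.IGNORECASE)

async def looks_already_applied(page) -> bool:
    # Text-based detection instead of brittle LinkedIn CSS class selectors.
    return bool(_ALREADY_APPLIED.search(await page.inner_text("body")))
```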
- fix: change \bApplied\b to 'You applied|Application submitted' to
  avoid false positives on job titles like 'Applied ML Engineer';
  add re.IGNORECASE for consistency with all other re.search calls
- fix: skip individual radio inputs in required-field check (input.value
  is always set for radios regardless of selection); add group-level
  check using input[type=radio][required]:checked per named group

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
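A sketch of the group-level radio check named above, run in the page context (the selectors follow the commit text; the helper name and surrounding DOM assumptions are illustrative):

```python
async def unchecked_required_radio_groups(page) -> list[str]:
    # Collect radio group names that are required but have no checked option.
    return await page.evaluate(
        """() => {
          const names = new Set(
            [...document.querySelectorAll("input[type=radio][required]")]
              .map(r => r.name));
          return [...names].filter(
            name => !document.querySelector(
              `input[type=radio][name="${name}"]:checked`));
        }"""
    )
```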
Gabrcodes and others added 2 commits April 3, 2026 03:00
- fix(extractor): move required-fields check BEFORE Review/Next/Continue
  buttons in apply_to_job — LinkedIn always renders Next on steps with
  unfilled required fields, making requires_input unreachable when
  checked last; now evaluated first so the status fires correctly
- fix(extractor): scroll Save button into view before clicking in
  save_job to avoid sticky navbar obstruction (same fix as #304)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix(extractor): move required-fields check before Review/Next/Continue
  buttons — LinkedIn renders Next even on unfilled-required steps, so
  checking last makes requires_input dead code; now evaluated first
- fix(extractor): scroll Save button into view before clicking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e radio name

- fix: move required_empty check before Submit button — EEO/disclosure
  questions on the final step now trigger requires_input instead of
  applied_unconfirmed
- fix: tighten dialog-closed-early regex to multi-word phrases to avoid
  false positives on job titles like 'Applied Machine Learning Engineer'
- fix: escape CSS-special chars in radio name attribute before
  interpolating into querySelector template literal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
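Roughly, the escaping step named above (the helper name is hypothetical; inside a double-quoted attribute selector, only backslashes and quotes need escaping):

```python
def escape_attr_value(name: str) -> str:
    # Make a radio group's name safe inside input[type=radio][name="..."]
    # before it is interpolated into a querySelector template literal.
    return name.replace("\\", "\\\\").replace('"', '\\"')

# escape_attr_value('question["veteran"]') == 'question[\\"veteran\\"]'
```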
Gabrcodes and others added 3 commits April 3, 2026 04:04
- fix: post-Submit confirmation uses dialog_text as primary signal
  (LinkedIn shows explicit 'Application sent' in the dialog before
  closing). Falls back to page_text only when dialog is already gone.
  Multi-word phrases only to avoid matching 'Applied ML Engineer'
  in the job listing body.
- fix: correct Python backslash escaping for the CSS selector injection
  guard — `/\\\\/g` in the Python source becomes `/\\/g` in JS (match a
  backslash), not `/\//g` (match a forward slash) as before.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dialog_closed_early represents an ambiguous outcome — the application may
or may not have been submitted. apply_failed implies definitive failure
which would mislead callers into retrying, risking duplicate submissions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix: add aria-required="true" to required-field selectors in
  apply_to_job — LinkedIn SPA marks many fields as required via ARIA
  rather than HTML5 required attribute; both text fields and radio
  groups now check both patterns
- fix: remove dead saved_btn locator in save_job verification —
  unsave_btn already matches 'Saved' via its regex pattern, making
  the separate saved_btn check unreachable dead code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fix: call _dismiss_dialog() before returning applied_unconfirmed
  when submit confirmation cannot be verified — leaves browser state
  clean for subsequent tool calls (mirrors requires_input and stuck paths)
- fix: skip input[type='file'] in required-field check — browsers always
  report file.value as '' regardless of selection, so file upload fields
  would always trigger requires_input even when resume is on file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>