feat(scraping): add 8 new profile sections for comprehensive reading by Gabrcodes · Pull Request #311 · stickerdaniel/linkedin-mcp-server

Gabrcodes · 2026-04-02T22:42:33Z

Summary

Adds 8 new entries to PERSON_SECTIONS so that get_person_profile can scrape every profile section LinkedIn offers. No new extractor methods needed — they follow the existing /details/{section}/ URL pattern.

New Section	URL Suffix
skills	`/details/skills/`
certifications	`/details/certifications/`
volunteer	`/details/volunteering-experiences/`
projects	`/details/projects/`
publications	`/details/publications/`
courses	`/details/courses/`
recommendations	`/details/recommendations/`
organizations	`/details/organizations/`

Also exports ALL_PERSON_SECTION_NAMES from scraping/__init__.py for convenience.
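
A minimal sketch of what the new entries might look like, assuming each `PERSON_SECTIONS` value is a `(url_suffix, is_overlay)` tuple as the review further down describes. Only the eight new entries are shown; the `main_profile` suffix is a placeholder, and the real `scraping/fields.py` contains more entries and may differ in detail.

```python
# Sketch only: the real PERSON_SECTIONS in scraping/fields.py has more entries.
# Each value is assumed to be a (url_suffix, is_overlay) tuple.
PERSON_SECTIONS: dict[str, tuple[str, bool]] = {
    "main_profile": ("/", False),  # placeholder suffix, an assumption
    "skills": ("/details/skills/", False),
    "certifications": ("/details/certifications/", False),
    "volunteer": ("/details/volunteering-experiences/", False),
    "projects": ("/details/projects/", False),
    "publications": ("/details/publications/", False),
    "courses": ("/details/courses/", False),
    "recommendations": ("/details/recommendations/", False),
    "organizations": ("/details/organizations/", False),
}

# Every requestable section name: all keys except main_profile.
ALL_PERSON_SECTION_NAMES: list[str] = [
    name for name in PERSON_SECTIONS if name != "main_profile"
]

# A caller that wants everything can join the names into one sections string.
print(",".join(ALL_PERSON_SECTION_NAMES))
```

Deriving the list from the dict keeps the two from drifting apart when a section is added later.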

Changes

scraping/fields.py — 8 new section entries + ALL_PERSON_SECTION_NAMES list
scraping/__init__.py — export the new constant
tools/person.py — updated docstring listing available sections
tests/test_fields.py — updated expected keys and all-sections test

Test plan

24/24 field tests pass
ruff check and ruff format pass


Lock file already has 3.1.0 since #166; align pyproject.toml floor to prevent accidental downgrades to v2. Resolves: #190

Greptile Summary

This PR tightens the `fastmcp` minimum version constraint from `>=2.14.0` to `>=3.0.0` in `pyproject.toml` (and the corresponding `uv.lock` metadata), preventing any future resolver from backtracking to the incompatible v2 series. The lock file has already been pinning `fastmcp==3.1.0` since PR #166, so there is no runtime impact; this is purely a spec/metadata alignment.

- `pyproject.toml`: `fastmcp` floor raised to `>=3.0.0`
- `uv.lock`: `package.metadata.requires-dist` updated to match; the resolved package entry (`3.1.0`) is unchanged
- No upper-bound cap (`<4.0.0`) is set, which is consistent with the project's existing open-ended constraints for all other dependencies

Confidence Score: 5/5

- This PR is safe to merge: it is a pure metadata alignment with no functional or runtime impact.
- The locked version was already `3.1.0` before this PR; the only change is raising the declared floor to match. Both modified lines are trivially correct, consistent with each other, and have no side-effects on the installed environment.
- No files require special attention.

Important Files Changed

Filename	Overview
pyproject.toml	Single-line change updating the `fastmcp` floor constraint from `>=2.14.0` to `>=3.0.0`, aligning with the already-resolved version in the lock file.
uv.lock	Auto-generated lock file metadata updated to reflect the new `>=3.0.0` specifier; the resolved `fastmcp` version (3.1.0) was already correct and unchanged.

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["pyproject.toml\nfastmcp >=3.0.0"] -->|uv resolves| B["uv.lock\nfastmcp 3.1.0 (pinned)"]
    B --> C["Installed environment\nfastmcp 3.1.0"]
    D["Old constraint\nfastmcp >=2.14.0"] -. "could resolve to" .-> E["fastmcp 2.x\n(incompatible)"]
    style D fill:#f9d0d0,stroke:#c00
    style E fill:#f9d0d0,stroke:#c00
    style A fill:#d0f0d0,stroke:#060
    style B fill:#d0f0d0,stroke:#060
    style C fill:#d0f0d0,stroke:#060
```

Last reviewed commit: 7d2363e


Replace repeated ensure_authenticated/get_or_create_browser/ LinkedInExtractor boilerplate in all 6 tool functions with FastMCP Depends()-based dependency injection via a single get_extractor() factory in dependencies.py. Resolves: #186

Updated the get_extractor function to route errors through raise_tool_error, ensuring that MCP clients receive structured ToolError responses for authentication failures. Added a test to verify that authentication errors are correctly handled and produce the expected ToolError response.
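
The factory pattern described above can be sketched in plain Python. This is not the project's code and does not use FastMCP's `Depends()`; the wiring is done by hand so the example runs on its own. `FakeExtractor`, the `authenticated` flag, and the local `ToolError` and `AuthenticationError` classes are hypothetical stand-ins.

```python
import asyncio


class AuthenticationError(Exception):
    """Stand-in for the project's authentication exception."""


class ToolError(Exception):
    """Stand-in for FastMCP's ToolError."""


class FakeExtractor:
    """Hypothetical stand-in for LinkedInExtractor."""

    async def scrape_person(self, username: str) -> dict:
        return {"username": username}


async def get_extractor(authenticated: bool = True) -> FakeExtractor:
    """Single factory replacing the per-tool setup boilerplate."""
    try:
        if not authenticated:  # stands in for ensure_authenticated()
            raise AuthenticationError("session expired")
        return FakeExtractor()
    except AuthenticationError as exc:
        # Route the failure through one error path so clients see a ToolError.
        raise ToolError(f"Authentication failed: {exc}") from exc


async def get_person_profile(username: str, extractor: FakeExtractor) -> dict:
    # The tool body only uses the injected extractor; no setup code here.
    return await extractor.scrape_person(username)


async def main() -> dict:
    extractor = await get_extractor()
    return await get_person_profile("example-user", extractor)


result = asyncio.run(main())
print(result)
```

With a dependency-injection framework, the call to `get_extractor()` in `main()` would be made by the framework rather than by the caller.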


Replace ToolAnnotations(...) with plain dicts, move title to top-level @mcp.tool() param, and add category tags to all tools. Resolves: #189

Greptile Summary

This PR is a clean, well-scoped refactoring that modernises tool metadata across all four changed files to align with the FastMCP 3.x API. It introduces no functional or behavioural changes.

Key changes:

- Removes the `ToolAnnotations(...)` Pydantic wrapper in `company.py`, `job.py`, and `person.py`, replacing it with plain `dict` syntax for the `annotations` parameter, the simpler form supported by FastMCP 3.x.
- Moves `title` from inside `ToolAnnotations` to a top-level keyword argument on `@mcp.tool()`, matching the updated FastMCP 3.x decorator signature.
- Drops the now-redundant `destructiveHint=False` from all read-only tools. Per the MCP spec, `destructiveHint` is only meaningful when `readOnlyHint` is `false`, so omitting it from tools that already declare `readOnlyHint=True` is semantically equivalent.
- Adds `tags` (as Python `set` literals) to every tool for categorisation (`"company"`, `"job"`, `"person"`, `"scraping"`, `"search"`, `"session"`).
- Enriches the previously unannotated `close_session` tool in `server.py` with a title, `destructiveHint=True`, and the `"session"` tag, accurately describing its destructive nature.

The existing test suite in `tests/test_tools.py` covers all tool functions but does not assert on annotation metadata, so no test changes are required. The refactoring is consistent across all tool files and fits naturally within the project's layered registration pattern.

Confidence Score: 5/5

- This PR is safe to merge: it is a pure metadata/annotation refactoring with no changes to tool logic, inputs, outputs, or error handling.
- All changes are limited to decorator parameters (`title`, `annotations`, `tags`). The `annotations` dict values are semantically equivalent to the removed `ToolAnnotations` objects, `destructiveHint=False` is correctly dropped only for `readOnlyHint=True` tools, and the new `close_session` annotations accurately reflect its destructive nature. No business logic, scraping behaviour, or error paths were altered.
- No files require special attention.

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["@mcp.tool() decorator"] --> B{Annotation style}
    B -->|Before| C["ToolAnnotations(title=..., readOnlyHint=..., destructiveHint=False, openWorldHint=...)"]
    B -->|After| D["title='...' (top-level param)\nannotations={'readOnlyHint': True, 'openWorldHint': True}\ntags={'category', 'type'}"]
    D --> E["person tools\n(get_person_profile, search_people)"]
    D --> F["company tools\n(get_company_profile, get_company_posts)"]
    D --> G["job tools\n(get_job_details, search_jobs)"]
    D --> H["session tool\n(close_session)\nannotations={'destructiveHint': True}"]
```

Last reviewed commit: c5bf554
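
To illustrate why dropping `destructiveHint=False` from read-only tools changes nothing, here is a small sketch that resolves the hints using the MCP spec defaults (`readOnlyHint` defaults to false, `destructiveHint` defaults to true and only applies to tools that are not read-only). `effective_hints` is a hypothetical helper written for this example; it is not part of FastMCP or the project.

```python
def effective_hints(annotations: dict) -> dict:
    """Resolve read-only/destructive hints from an annotations dict."""
    read_only = annotations.get("readOnlyHint", False)
    # destructiveHint defaults to True, but it is ignored for read-only tools.
    destructive = (not read_only) and annotations.get("destructiveHint", True)
    return {"readOnly": read_only, "destructive": destructive}


# Annotation values before and after the refactoring for a read-only tool.
before = {"readOnlyHint": True, "destructiveHint": False, "openWorldHint": True}
after = {"readOnlyHint": True, "openWorldHint": True}

# The close_session tool is annotated as destructive.
close_session = {"destructiveHint": True}

print(effective_hints(before) == effective_hints(after))
print(effective_hints(close_session))
```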


…_jobs Extract job IDs from href attributes (the one thing innerText can't capture), scroll the job sidebar instead of the main page, and paginate through multiple result pages with dynamic offsets. Resolves: #195
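
A rough sketch of the href-based ID extraction, assuming job links contain a `/jobs/view/<numeric id>` path segment. The regex, the helper name `extract_job_ids`, and the sample hrefs are illustrative assumptions, not the project's implementation.

```python
import re

# Assumed URL shape: job links contain /jobs/view/<numeric id>.
_JOB_ID_RE = re.compile(r"/jobs/view/(\d+)")


def extract_job_ids(hrefs: list[str]) -> list[str]:
    """Return unique job IDs in first-seen order."""
    seen: dict[str, None] = {}  # dict preserves insertion order
    for href in hrefs:
        match = _JOB_ID_RE.search(href)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


hrefs = [
    "https://www.linkedin.com/jobs/view/1234567890/?refId=abc",
    "/jobs/view/1234567890/",  # duplicate of the first link
    "/jobs/view/42/",
    "/company/example/",  # not a job link, ignored
]
job_ids = extract_job_ids(hrefs)
print(job_ids)
```

Reading the `href` attribute matters here because the ID never appears in the visible text of a job card.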


- Replace custom _secure_profile_dirs/_set_private_mode with thin _harden_linkedin_tree that uses secure_mkdir from common_utils
- Fix export_storage_state: chmod 0o600 after Playwright writes
- Add test for export_storage_state permission hardening
- Add test for no-op outside .linkedin-mcp tree
- Revert unrelated loaders.py change
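
The permission hardening listed above could look roughly like this on a POSIX system. `secure_mkdir_sketch` and `harden_file` are hypothetical helpers written for this example (the signature of the real `secure_mkdir` is not shown in this PR), and the `0o700` directory mode is an assumption; only the `0o600` file mode comes from the commit message.

```python
import os
import stat
import tempfile
from pathlib import Path


def secure_mkdir_sketch(path: Path) -> None:
    """Create a directory and restrict the leaf to its owner (0o700)."""
    path.mkdir(parents=True, exist_ok=True)
    # Only the leaf directory is tightened here; parents keep their mode.
    os.chmod(path, 0o700)


def harden_file(path: Path) -> None:
    """Restrict a file to owner read/write (0o600)."""
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    profile_dir = Path(tmp) / ".linkedin-mcp" / "profile"
    secure_mkdir_sketch(profile_dir)

    state_file = profile_dir / "storage_state.json"
    state_file.write_text("{}")  # stands in for Playwright's export
    harden_file(state_file)  # chmod after the write, as in the fix above

    file_mode = stat.S_IMODE(state_file.stat().st_mode)
    dir_mode = stat.S_IMODE(profile_dir.stat().st_mode)

print(oct(file_mode), oct(dir_mode))
```

Calling `chmod` after the write is the important part: a file created by another library inherits the process umask, which is usually more permissive than `0o600`.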


- Remove unused selector constants (_MESSAGING_THREAD_LINK_SELECTOR, _MESSAGING_RESULT_ITEM_SELECTOR, _MESSAGING_SEND_SELECTOR)
- Remove dead _conversation_thread_cache (new extractor per tool call)
- Add AuthenticationError handling to get_sidebar_profiles and all messaging tools
- Pass CSS selector as evaluate() arg instead of f-string interpolation
- Replace deprecated execCommand with press_sequentially
- Guard sidebar container walk against depth-limit exhaustion
- Update scrape_person docstring to document profile_urn return key
- Add messaging tools to README tool-status table


…IDs (#300)

* fix(scraping): Respect --timeout for messaging, recognize thread URLs

  Remove all hardcoded timeout=5000 from the send_message flow and messaging helpers so they fall through to the page-level default set from BrowserConfig.default_timeout (configurable via --timeout). Also add /messaging/thread/ URL recognition to classify_link so conversation thread references are captured when they appear in search results or conversation detail views. Raise inbox reference cap to 30 and add proper section context labels.

  Resolves: #296
  See also: #297

* fix(scraping): Extract conversation thread IDs from inbox via click-and-capture

  LinkedIn's conversation sidebar uses JS click handlers instead of <a> tags, so anchor extraction cannot capture thread IDs. Click each conversation item and read the resulting SPA URL change to build conversation references with thread_id and participant name.

  Before: get_inbox returned 2 references (active conversation only)
  After: get_inbox returns all conversation thread IDs (10+ refs)

  Resolves: #297

* fix(scraping): Respect --timeout across all remaining scraping methods

  Remove the remaining 10 hardcoded timeout=5000 from profile scraping, connection flow, modal detection, sidebar profiles, conversation resolution, and job search. All Playwright calls now use the page-level default from BrowserConfig.default_timeout.

  Resolves: #299

* fix: Address PR review feedback

  - Use saved inbox URL instead of self._page.url (P1: wrong URL after clicks)
  - Fix docstring to clarify 2s recipient-picker probe is intentional
  - Replace class-name selectors with aria-label discovery + minimal class fallback
  - Dedupe references after merging conversation and anchor refs
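
Two of the ideas above, recognizing `/messaging/thread/` URLs and deduplicating merged references, can be sketched as follows. The regex, the helper names, and the reference dict shape (`url`, `text`) are assumptions made for illustration; the project's `classify_link` and reference format may differ.

```python
import re
from typing import Optional

# Assumed URL shape for conversation threads.
_THREAD_RE = re.compile(r"/messaging/thread/([^/?#]+)")


def thread_id_from_url(url: str) -> Optional[str]:
    """Return the thread ID if the URL points at a conversation thread."""
    match = _THREAD_RE.search(url)
    return match.group(1) if match else None


def dedupe_references(refs: list[dict]) -> list[dict]:
    """Drop repeated references, keeping the first occurrence of each URL."""
    seen: set[str] = set()
    unique: list[dict] = []
    for ref in refs:
        if ref["url"] not in seen:
            seen.add(ref["url"])
            unique.append(ref)
    return unique


# References captured by clicking conversations, then by scanning anchors.
conversation_refs = [{"url": "/messaging/thread/abc123/", "text": "Alice"}]
anchor_refs = [
    {"url": "/messaging/thread/abc123/", "text": "Alice"},
    {"url": "/in/bob/", "text": "Bob"},
]
merged = dedupe_references(conversation_refs + anchor_refs)
print(merged)
```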

First-time uvx runs download ~77 Python packages including the 39MB patchright wheel. On slow connections, uv's default 30s HTTP timeout can cause silent failures before the server process starts. Co-authored-by: Daniel Sticker <sticker@ngenn.net>


* docs: use @latest tag in uvx config for auto-updates

  Without @latest, uvx caches the first downloaded version forever. Adding @latest ensures uvx checks PyPI on each client launch and pulls new versions automatically.

* docs: apply @latest consistently to all uvx invocations

  Update --login examples in README.md and docs/docker-hub.md to use linkedin-scraper-mcp@latest for consistency with the MCP config.

Co-authored-by: Daniel Sticker <sticker@ngenn.net>
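
For reference, a client configuration combining the `@latest` tag with the longer HTTP timeout might be generated like this. The `mcpServers` layout follows the shape many MCP clients use, and the server key `linkedin` is a placeholder; check the README for the exact config the project documents.

```python
import json

# Sketch of an MCP client config entry; key names outside "command",
# "args", and "env" are assumptions about the client's config format.
mcp_config = {
    "mcpServers": {
        "linkedin": {  # placeholder server name
            "command": "uvx",
            # @latest makes uvx check PyPI for a newer release on launch.
            "args": ["linkedin-scraper-mcp@latest"],
            # Give slow connections time to download the large wheels.
            "env": {"UV_HTTP_TIMEOUT": "300"},
        }
    }
}

config_json = json.dumps(mcp_config, indent=2)
print(config_json)
```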

Add skills, certifications, volunteer, projects, publications, courses, recommendations, and organizations to PERSON_SECTIONS. These map to LinkedIn's /details/{section}/ URLs and follow the existing extraction pattern — no new extractor methods needed. Also exports ALL_PERSON_SECTION_NAMES for convenience when scraping every section at once. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-04-02T22:45:01Z

Greptile Summary

This PR adds 8 new entries to PERSON_SECTIONS in scraping/fields.py (skills, certifications, volunteer, projects, publications, courses, recommendations, organizations), each following the established /details/{section}/ URL navigation pattern. It also introduces an ALL_PERSON_SECTION_NAMES convenience constant that contains every requestable section name (all keys except main_profile), which is exported from scraping/__init__.py. The docstring in tools/person.py is updated to advertise the new sections, and both test files are updated to cover the expanded dict and the new constant.

Key points:

All 8 new sections follow the exact same (url_suffix, is_overlay=False) shape as the existing non-overlay sections — no new scraping logic is needed.
ALL_PERSON_SECTION_NAMES correctly excludes main_profile (which is always implicitly included by parse_person_sections), and its semantics are verified by the new test_all_person_section_names_excludes_main_profile test.
test_all_sections in test_fields.py now derives the join string from the constant itself, preventing future drift.
test_all_sections_visit_all_urls in test_scraping.py now drives the section set directly from PERSON_SECTIONS, so the count assertion (15 page + 1 overlay) stays in sync automatically.
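
The data-driven tests described in these points might look roughly like the sketch below. It uses a three-entry stand-in for `PERSON_SECTIONS` (the suffixes for `main_profile` and `contact_info` are assumptions), and `count_navigations` is a hypothetical helper, so the counts here are 2 pages and 1 overlay rather than the 15 and 1 of the real dict.

```python
# Reduced stand-in for scraping/fields.py; values are (url_suffix, is_overlay).
PERSON_SECTIONS = {
    "main_profile": ("/", False),  # placeholder suffix
    "skills": ("/details/skills/", False),
    "contact_info": ("/overlay/contact-info/", True),  # assumed suffix
}
ALL_PERSON_SECTION_NAMES = [n for n in PERSON_SECTIONS if n != "main_profile"]


def test_all_person_section_names_excludes_main_profile() -> None:
    # Structural invariant: the constant is every key except main_profile.
    assert "main_profile" not in ALL_PERSON_SECTION_NAMES
    assert set(ALL_PERSON_SECTION_NAMES) == set(PERSON_SECTIONS) - {"main_profile"}


def count_navigations(sections: dict) -> tuple[int, int]:
    """Count page navigations and overlay navigations for a section dict."""
    pages = sum(1 for _, is_overlay in sections.values() if not is_overlay)
    overlays = sum(1 for _, is_overlay in sections.values() if is_overlay)
    return pages, overlays


test_all_person_section_names_excludes_main_profile()
page_count, overlay_count = count_navigations(PERSON_SECTIONS)
print(page_count, overlay_count)
```

Because both the section set and the expected counts come from the dict itself, adding a section only requires touching `fields.py`.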

Confidence Score: 5/5

This PR is safe to merge — it is a purely additive, low-risk change with no logic modifications and comprehensive test coverage.

All changes follow the existing PERSON_SECTIONS pattern exactly. No new scraper logic is introduced, and the only runtime effect is additional page navigations when callers explicitly request the new section names. Tests are thorough: the hard-coded section lists are replaced with data-driven derivations, a new structural invariant test is added, and URL assertions cover every new suffix.

No files require special attention.

Important Files Changed

Filename	Overview
linkedin_mcp_server/scraping/fields.py	Added 8 new section entries following the existing pattern and introduced `ALL_PERSON_SECTION_NAMES`; logic is correct and consistent.
linkedin_mcp_server/scraping/__init__.py	Exports the new `ALL_PERSON_SECTION_NAMES` constant; straightforward and correct.
linkedin_mcp_server/tools/person.py	Docstring updated to list all new sections; example string updated to show `skills`; no logic changes.
tests/test_fields.py	Expected-keys test updated, `test_all_sections` now uses the constant, and a new structural invariant test for `ALL_PERSON_SECTION_NAMES` is added.
tests/test_scraping.py	Hard-coded section set replaced with `set(PERSON_SECTIONS)`, URL assertions added for each new section, and the page-count assertion updated to 15; all changes are consistent with the implementation.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[get_person_profile\ncalled with sections string] --> B[parse_person_sections]
    B --> C{section name\nin PERSON_SECTIONS?}
    C -- yes --> D[add to requested set]
    C -- no --> E[add to unknown list\nlog warning]
    D --> F{is_overlay?}
    F -- False --> G[extract_page\n/in/username/suffix/]
    F -- True --> H[_extract_overlay\n/in/username/overlay/contact-info/]
    G --> I[sections result dict]
    H --> I

    subgraph PERSON_SECTIONS [PERSON_SECTIONS - 16 entries]
        P0[main_profile]
        P1[experience]
        P2[education]
        P3[skills NEW]
        P4[certifications NEW]
        P5[volunteer NEW]
        P6[projects NEW]
        P7[publications NEW]
        P8[courses NEW]
        P9[recommendations NEW]
        P10[organizations NEW]
        P11[interests]
        P12[honors]
        P13[languages]
        P14[contact_info overlay=True]
        P15[posts]
    end
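
The flow in the chart can be approximated in a few lines of Python. This is a sketch under assumptions: `parse_sections_sketch` and `build_url` are hypothetical names, the reduced `PERSON_SECTIONS` uses placeholder suffixes for `main_profile` and `contact_info`, and the real `parse_person_sections` may normalise input differently.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Reduced stand-in; values are (url_suffix, is_overlay).
PERSON_SECTIONS = {
    "main_profile": ("/", False),  # placeholder suffix
    "skills": ("/details/skills/", False),
    "contact_info": ("/overlay/contact-info/", True),  # assumed suffix
}


def parse_sections_sketch(sections: Optional[str]) -> tuple[set[str], list[str]]:
    """Split a comma-separated string into known and unknown section names."""
    requested = {"main_profile"}  # always included implicitly
    unknown: list[str] = []
    for raw in (sections or "").split(","):
        name = raw.strip()
        if not name:
            continue
        if name in PERSON_SECTIONS:
            requested.add(name)
        else:
            unknown.append(name)
            logger.warning("Unknown section name: %s", name)
    return requested, unknown


def build_url(username: str, section: str) -> str:
    """Append the section's suffix to the profile URL."""
    suffix, _is_overlay = PERSON_SECTIONS[section]
    return f"https://www.linkedin.com/in/{username}{suffix}"


requested, unknown = parse_sections_sketch("skills, contact_info, bogus")
urls = sorted(build_url("example-user", name) for name in requested)
print(urls, unknown)
```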

Reviews (5): Last reviewed commit: "style: move PERSON_SECTIONS import to mo..."

… test

- test_all_sections now derives section names from ALL_PERSON_SECTION_NAMES so future additions only need a single update in fields.py
- Add test_all_person_section_names_excludes_main_profile to verify the constant excludes main_profile and matches PERSON_SECTIONS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Uses PERSON_SECTIONS/ALL_PERSON_SECTION_NAMES to derive the full set dynamically. Updates page count from 7 to 15 and adds URL assertions for all 8 new sections. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

stickerdaniel and others added 30 commits March 4, 2026 19:43

docs(manifest): Add people search to top-level description

fe3d487

docs(manifest): Add people, search, posts keywords

414b3f0

Merge pull request #183 from ConnorMoss02/docs/manifest-sync-tools-177

df042e7

docs: sync manifest.json tools and features with current capabilities

chore(deps): lock file maintenance

6b2a32a

Merge pull request #166 from stickerdaniel/renovate/lock-file-mainten…

9c5e841

…ance chore(deps): lock file maintenance

chore(deps): bump fastmcp constraint to >=3.0.0

84d8a6b

Lock file already has 3.1.0 since #166; align pyproject.toml floor to prevent accidental downgrades to v2. Resolves: #190
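
To make the effect of the raised floor concrete, here is a simplified check that compares dotted numeric versions as tuples. It is a sketch, not how a real resolver works (resolvers follow PEP 440 and handle pre-releases and other cases), and the candidate list is illustrative rather than a list of actual `fastmcp` releases.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '3.1.0' into (3, 1, 0); handles plain dotted numbers only."""
    return tuple(int(part) for part in version.split("."))


def satisfies_floor(version: str, floor: str) -> bool:
    """True if version >= floor under simple numeric comparison."""
    return parse(version) >= parse(floor)


# Illustrative candidates, not a real release list.
candidates = ["2.14.0", "2.14.5", "3.0.0", "3.1.0"]

old_ok = [v for v in candidates if satisfies_floor(v, "2.14.0")]
new_ok = [v for v in candidates if satisfies_floor(v, "3.0.0")]

print(old_ok)  # the old floor still admits 2.x versions
print(new_ok)  # the new floor admits only 3.x versions
```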

refactor(error-handler): replace handle_tool_error with ToolError

438766a

Replace dict-returning handle_tool_error() with raise_tool_error() that raises FastMCP ToolError for known exceptions. Unknown exceptions re-raise as-is for mask_error_details=True to handle. Resolves: #185

fix(error-handler): add logging for unknown exceptions and missing tests

89e3019

Add logger.error with exc_info for unknown exceptions before re-raising, and add test coverage for AuthenticationError and ElementNotFoundError.

fix(error-handler): restore tool context in logs and add missing test

01f8d70

Re-add optional context parameter to raise_tool_error() for log correlation, and add test for base LinkedInScraperException branch.

style(error-handler): Add clarifying comments

6ea679b

Add catch-all comment on base exception branch and NoReturn inline comments on all raise_tool_error() call sites.
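
Taken together, the error-handler commits above describe a function shaped roughly like the sketch below. The exception classes, their hierarchy, and the message strings are stand-ins invented for this example; only the overall behaviour (known exceptions become `ToolError`, unknown ones are logged and re-raised) comes from the commit messages. The fallback `ToolError` class lets the sketch run without FastMCP installed.

```python
import logging
from typing import NoReturn, Optional

try:
    from fastmcp.exceptions import ToolError
except ImportError:  # fallback so the sketch runs without FastMCP

    class ToolError(Exception):
        """Stand-in for fastmcp.exceptions.ToolError."""


logger = logging.getLogger(__name__)


class LinkedInScraperException(Exception):
    """Stand-in for the project's base scraper exception."""


class AuthenticationError(LinkedInScraperException):
    """Stand-in; assumed to derive from the base exception."""


class ElementNotFoundError(LinkedInScraperException):
    """Stand-in; assumed to derive from the base exception."""


def raise_tool_error(exc: Exception, context: Optional[str] = None) -> NoReturn:
    """Raise ToolError for known exceptions; log and re-raise the rest."""
    if isinstance(exc, AuthenticationError):
        raise ToolError("Authentication failed. Please log in again.") from exc
    if isinstance(exc, ElementNotFoundError):
        raise ToolError("Expected page element was not found.") from exc
    if isinstance(exc, LinkedInScraperException):
        # Catch-all for any other known scraper error.
        raise ToolError(f"Scraper error: {exc}") from exc
    # Unknown exceptions: log with traceback, then re-raise unchanged.
    logger.error("Unexpected error in %s", context or "tool", exc_info=exc)
    raise exc
```

A tool would call it from an `except Exception as e:` block, for example `raise_tool_error(e, "get_person_profile")`, where the context string is only used for log correlation.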

Merge pull request #192 from stickerdaniel/03-04-chore_deps_bump_fast…

1276925

…mcp_constraint_to_3.0.0 refactor(error-handler): replace handle_tool_error with ToolError

style(deps): Soften get_extractor docstring

c36d754

docs(deps): Fix operation order in docstring

de52fcf

Merge pull request #194 from stickerdaniel/03-04-refactor_tools_use_d…

e95e867

…epends_to_inject_extractor refactor(tools): Use Depends() to inject extractor

chore(config): Update model and provider settings in btca.config.jsonc

f13c900

chore(config): Update model and provider settings in btca.config.json…

b566d90

…c (#196)

refactor(tools): Simplify annotations to dict syntax and add tags

846e5e2

Replace ToolAnnotations(...) with plain dicts, move title to top-level @mcp.tool() param, and add category tags to all tools. Resolves: #189

refactor(server): Split lifespan into composable browser + auth lifes…

ba4f312

…pans

style(server): Address Greptile review feedback

fd8373c

Use lowercase dict instead of Dict, add auth validation log line

Merge pull request #199 from stickerdaniel/03-05-refactor_server_spli…

a6ac72e

…t_lifespan_into_composable_browser_auth_lifespans refactor(server): Split lifespan into composable browser + auth lifespans

feat(tools): add global 60s tool timeouts

1871483

# Conflicts: # linkedin_mcp_server/server.py # linkedin_mcp_server/tools/company.py # linkedin_mcp_server/tools/job.py # linkedin_mcp_server/tools/person.py

fix(tools): raise global timeout to 90s

bb7d820

# Conflicts: # linkedin_mcp_server/server.py # linkedin_mcp_server/tools/company.py # linkedin_mcp_server/tools/job.py # linkedin_mcp_server/tools/person.py

refactor(tools): centralize tool timeout constant

4cf4eae

# Conflicts: # linkedin_mcp_server/server.py

docs: reduce timeout feature emphasis

95fe5f0

Merge pull request #197 from stickerdaniel/03-05-feat_global_60s_tool…

071be1d

…_timeouts feat(tools): add global 90s tool timeouts
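
The commits above do not show how the timeout is enforced, so the following is only one possible approach: a shared constant applied through `asyncio.wait_for`. `TOOL_TIMEOUT_SECONDS`, `run_with_timeout`, and `slow_tool` are hypothetical names, and the project may instead rely on a framework-level setting.

```python
import asyncio

# Single shared constant, mirroring the "centralize tool timeout" commit.
TOOL_TIMEOUT_SECONDS = 90


async def run_with_timeout(coro, timeout: float = TOOL_TIMEOUT_SECONDS):
    """Await a coroutine, cancelling it if it exceeds the timeout."""
    return await asyncio.wait_for(coro, timeout=timeout)


async def slow_tool() -> str:
    await asyncio.sleep(0.2)
    return "done"


async def demo() -> tuple[str, bool]:
    # A fast call finishes well inside the default limit.
    fast = await run_with_timeout(asyncio.sleep(0, result="ok"))
    # A slow call is cancelled; a tiny limit keeps the demo quick.
    try:
        await run_with_timeout(slow_tool(), timeout=0.01)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return fast, timed_out


fast_result, timed_out = asyncio.run(demo())
print(fast_result, timed_out)
```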

stickerdaniel and others added 21 commits March 30, 2026 15:28

Merge branch 'main' into bug-279-secure-profile-perms

912df60

Merge pull request #292 from stickerdaniel/renovate/ci-dependencies

9b17f8f

chore(deps): update ci dependencies

Merge branch 'main' into aspectrr/linkedin-connect

3b927c5

Merge branch 'main' into bug-279-secure-profile-perms

cdef4b2

Merge pull request #283 from shuofengzhang/bug-279-secure-profile-perms

7c4614f

Harden .linkedin-mcp profile/cookie permissions

Merge branch 'main' into aspectrr/linkedin-connect

a9549d9

fix: Add baseline comment for messaging auto-redirect

9bb2961

LinkedIn redirects /messaging/ to the most recent thread; capture baseline_thread_id after the SPA settles so search-selected threads can be distinguished from the auto-opened one.

Merge pull request #291 from aspectrr/aspectrr/linkedin-connect

d840ecc

feat: linkedin messaging, get sidebar profiles

chore: Bump version to 4.8.0

8daca4a

chore: Bump version to 4.8.0 (#294)

501c136

chore: update manifest.json and docker-compose.yml to v4.8.0 [skip ci]

c257bcb

chore: Bump version to 4.8.1 (#301)

e1a106d

chore: update manifest.json and docker-compose.yml to v4.8.1 [skip ci]

89a5e99

docs: add UV_HTTP_TIMEOUT to default uvx config (#305)

b352721

Move UV_HTTP_TIMEOUT=300 into the main uvx config example so it's the default, not an optional troubleshooting step. Fix grammar in the troubleshooting note. Co-authored-by: Daniel Sticker <sticker@ngenn.net>

ci: remove unused workflow

fd80f60

greptile-apps Bot reviewed Apr 2, 2026


Comment thread tests/test_fields.py

Gabrcodes and others added 4 commits April 3, 2026 01:19

fix: remove unused ALL_PERSON_SECTION_NAMES import in test_scraping

fba3f84

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

style: move PERSON_SECTIONS import to module level in test_scraping

a57a00b

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Gabrcodes mentioned this pull request Apr 3, 2026

feat(account): add get_my_profile, get_my_profile_full, get_saved_jobs, get_my_applications #312

Closed

2 tasks

stickerdaniel force-pushed the main branch from fd80f60 to 7661f43 Compare April 3, 2026 07:59

Gabrcodes closed this Apr 17, 2026

Gabrcodes wants to merge 768 commits into stickerdaniel:main from Gabrcodes:feat/profile-sections


7 participants
