feat(search_people): add network and current_company filters (#248) #384
Extends the existing search_people tool with two optional params that LinkedIn's people-search URL already accepts:
- network: list of "F" / "S" / "O" tokens for the 1st / 2nd / 3rd+ degree connection filter. Invalid tokens raise ValueError.
- current_company: company name (or URN id) passed to LinkedIn's currentCompany facet.

Encoded via a small list-facet helper that produces URLs of the form network=%5B%22F%22%5D and currentCompany=%5B%22Weber+Inc%22%5D.

Addresses the "who in my 1st-degree network works at <company>" half of stickerdaniel#248. Industry filter tracked as a follow-up.
Thanks! Let me know when it's ready for review

Ready for review @stickerdaniel - all CI green (lint-and-check, test, Socket Security). Thanks!
Greptile Summary
This PR adds two optional filter parameters —
Confidence Score: 5/5
Safe to merge: the change is additive, all new params are optional with
All new parameters default to
No files require special attention.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Client as MCP Client
participant Tool as person.py (search_people)
participant Extractor as extractor.py (search_people)
participant LinkedIn as LinkedIn Search URL
Client->>Tool: search_people(keywords, network, current_company)
Tool->>Extractor: "extractor.search_people(keywords, location, network=, current_company=)"
alt network tokens invalid
Extractor-->>Tool: raise FilterValidationError
Tool-->>Client: raise ToolError (message preserved)
else current_company not numeric URN
Extractor-->>Tool: raise FilterValidationError
Tool-->>Client: raise ToolError (message preserved)
else params valid
Extractor->>Extractor: build params + network + currentCompany
Extractor->>LinkedIn: extract_page(url)
LinkedIn-->>Extractor: page content
Extractor-->>Tool: "{url, sections}"
Tool-->>Client: "{url, sections}"
end
Reviews (4): Last reviewed commit: "fix(search_people): reject non-numeric c..."
Live-tested all three network tokens against my account. F/S/O filter
Two things before merge:
Validation depends on
- Soften get_company_profile docstring: "may include" company_urn (anchor isn't on every company page, e.g. very small / brand-new ones)
- Reword company_urn description in tools/company.py docstring, manifest.json, README, and docs/docker-hub.md to point at LinkedIn's currentCompany URL facet rather than the search_people.current_company parameter (which doesn't exist on main today; arrives via #384)
- Defensive _FIRST_URN_RE: accept both quoted-string and bare-integer JSON list elements; new test covers the unquoted form
- Comment why normalize_reference re-extracts the urn from the canonical url instead of threading it through classify_link
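As a rough illustration of the "quoted-string or bare-integer" point above, here is a minimal sketch of such a pattern. The real _FIRST_URN_RE is not shown in this thread, so the regex, the helper name first_company_urn, and the assumption that it runs on a decoded currentCompany query value are all hypothetical.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical pattern: first element of a JSON list, either "1234" or 1234.
# The trailing [,\]] makes sure the whole element is numeric.
FIRST_URN_RE = re.compile(r'\[\s*"?(\d+)"?\s*[,\]]')


def first_company_urn(url: str) -> Optional[str]:
    """Return the first numeric id in the currentCompany facet, if any."""
    # parse_qs percent-decodes the value, e.g. %5B%221234%22%5D -> ["1234"]
    values = parse_qs(urlparse(url).query).get("currentCompany", [])
    if not values:
        return None
    match = FIRST_URN_RE.match(values[0])
    return match.group(1) if match else None
```

Under these assumptions, both currentCompany=%5B%221234%22%5D and currentCompany=%5B1234%5D yield "1234", while a plain company name yields None.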
Two follow-ups from PR stickerdaniel#384 review:
- _encode_list_facet now uses json.dumps with compact separators instead of manual string interpolation. Same URL output for well-behaved tokens, but values containing quotes/backslashes (e.g. Weber "Big" Inc) now produce valid JSON instead of malformed output. Greptile P2 from the inline review.
- current_company docstrings (extractor + tool) now state that the currentCompany facet only filters on numeric URN ids; plain company names are accepted by the URL but ignored by LinkedIn. Points users to get_company_profile.references["about"] for URN lookup, per @stickerdaniel's findings and merged stickerdaniel#430.
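For readers following along, a minimal sketch of what a json.dumps-based list-facet encoder could look like. This is not the PR's actual _encode_list_facet; the function names encode_list_facet and build_query and the keywords parameter layout are assumptions made for illustration.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode


def encode_list_facet(values: List[str]) -> str:
    """Serialize facet values as a compact JSON array, e.g. ["F","S"]."""
    # Compact separators avoid the space json.dumps inserts after commas.
    return json.dumps(values, separators=(",", ":"))


def build_query(
    keywords: str,
    network: Optional[List[str]] = None,
    current_company: Optional[str] = None,
) -> str:
    """Build a query string; hypothetical stand-in for the extractor's URL code."""
    params = {"keywords": keywords}
    if network:
        params["network"] = encode_list_facet(network)
    if current_company:
        params["currentCompany"] = encode_list_facet([current_company])
    # urlencode percent-encodes brackets and quotes and turns spaces into "+".
    return urlencode(params)
```

Because json.dumps escapes quotes and backslashes, a value such as Weber "Big" Inc still round-trips through json.loads, which manual string interpolation would not guarantee.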
Pushed a327845:
A plain-text company name is silently ignored by LinkedIn's currentCompany facet, returning unfiltered results without warning. Mirror the existing network-token validation: raise up front, with a hint pointing at get_company_profile -> references["about"] for URN lookup (per stickerdaniel#430). Both validations now raise FilterValidationError (a ValueError subclass — backward compatible). The tool wrapper maps this to ToolError so the helpful message reaches the MCP client instead of being collapsed to "Error calling tool" by mask_error_details. Same fix retroactively surfaces the existing network=["X"] error for the same reason.
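A minimal sketch of the validate-then-map pattern this commit describes. FilterValidationError being a ValueError subclass comes from the commit message; everything else is assumed for illustration, including the local ToolError stand-in (the real one would come from the MCP framework), the helper names, and the exact error wording.

```python
import re
from typing import List, Optional

VALID_NETWORK_TOKENS = {"F", "S", "O"}


class FilterValidationError(ValueError):
    """Raised for bad filter input; still caught by `except ValueError`."""


class ToolError(Exception):
    """Local stand-in for the framework's ToolError (assumption)."""


def validate_filters(
    network: Optional[List[str]] = None,
    current_company: Optional[str] = None,
) -> None:
    """Reject unknown network tokens and non-numeric company ids up front."""
    if network:
        invalid = [t for t in network if t not in VALID_NETWORK_TOKENS]
        if invalid:
            raise FilterValidationError(f"Invalid network tokens: {invalid}")
    if current_company is not None and not re.fullmatch(r"[0-9]+", current_company):
        raise FilterValidationError(
            "current_company must be a numeric URN id; "
            'see get_company_profile -> references["about"]'
        )


def search_people_tool(keywords, network=None, current_company=None) -> dict:
    """Hypothetical tool wrapper: re-raise validation errors as ToolError."""
    try:
        validate_filters(network, current_company)
    except FilterValidationError as exc:
        # Preserve the message so the client sees the hint, not a masked error.
        raise ToolError(str(exc)) from exc
    return {"keywords": keywords, "network": network, "current_company": current_company}
```

The design point is that only the dedicated exception type is translated, so unrelated ValueErrors raised deeper in the stack are still masked as before.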
Summary
- Adds two optional filters to search_people: network (1st / 2nd / 3rd degree filter, tokens "F" / "S" / "O") and current_company (current-employer filter, accepts a company name or URN id).
- Passes them through tools/person.py and scraping/extractor.py onto the existing LinkedIn search URL. Encodes the two facets as network=%5B%22F%22%5D and currentCompany=%5B%22Weber+Inc%22%5D via a small list-facet helper.
- Invalid network tokens raise ValueError at the extractor boundary.

Closes #248 (company-filter half; the industry filter from the second half is a natural follow-up PR once this shape is accepted).
Motivation
Concrete user story that motivated this: a user has Jennifer Bonuso (President Americas at Weber Inc) as a 1st-degree connection, but search_people("Weber Inc") returns only 2nd-degree results and omits her. After this PR, search_people(keywords="Weber Inc", network=["F"]) constrains the results page to 1st-degree connections.
Before:
After (with the new params):
Prior Art
Inspired by:
This PR delivers the core of #152 and part of #320 as a minimal, current-main-aligned diff that follows the search_jobs filter pattern already in extractor.py.
Testing
- uv run pytest - 395 passed locally, including 6 new extractor tests and 1 new tool test covering single-degree, multi-degree, current_company, invalid token, and combined-filter URL construction.
- uv run ruff check . - clean
- uv run ruff format . - clean
- uv run ty check - clean
- uv run pre-commit run --all-files - clean

Tests exercise URL construction only. Live verification against LinkedIn's rendered page is not included in this PR since it requires an authenticated browser session; happy to add notes from a manual live run if you want that before merge.
Out of scope
Synthetic prompt
Generated with Claude Opus 4.7 (1M context)