feat(analytics): server-side Plausible tracking for og:image requests #3174
MarkusNeusinger merged 2 commits into main
…ests

Track og:image requests from social media bots (Twitter, WhatsApp, Teams, etc.) with platform detection. Bots don't execute JavaScript, so server-side tracking is required for complete analytics.

- Add api/analytics.py with fire-and-forget Plausible tracking
- Add /og/home.png and /og/catalog.png endpoints with tracking
- Route all og:images through API for complete tracking coverage
- Detect 25 platforms from User-Agent (social, messaging, search, link preview)
- Track filter params for shared filtered URLs (filter_lib, filter_dom, etc.)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
This PR adds server-side Plausible analytics tracking for og:image requests from social media bots, which cannot execute JavaScript. The implementation uses a fire-and-forget pattern to avoid delaying image responses.
Key Changes:
- Server-side tracking infrastructure with platform detection for 25 bot types
- New API endpoints for home and catalog page og:images with tracking
- Query parameter passthrough for tracking shared filtered URLs
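
The filter passthrough in the last bullet can be illustrated with a small pure function. This is a minimal sketch rather than the PR's code: `build_props` is a hypothetical helper that mirrors how the description says each query parameter becomes a `filter_*` event property, and the sample values are made up for illustration.

```python
def build_props(page, platform, spec=None, library=None, filters=None):
    """Hypothetical helper: assemble Plausible event props for an og:image hit."""
    props = {"page": page, "platform": platform}
    if spec:
        props["spec"] = spec
    if library:
        props["library"] = library
    if filters:
        # Each query parameter becomes its own prop, e.g. lib -> filter_lib.
        # Comma-separated values are passed through unchanged.
        for key, value in filters.items():
            props[f"filter_{key}"] = value
    return props


if __name__ == "__main__":
    print(build_props("home", "twitter", filters={"lib": "plotly", "dom": "statistics"}))
```

Keeping this step free of request objects makes it easy to unit test without a running FastAPI app.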
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| api/analytics.py | New module implementing server-side Plausible tracking with platform detection and fire-and-forget event sending |
| api/routers/og_images.py | Added /og/home.png and /og/catalog.png endpoints with tracking; integrated tracking into existing spec/impl endpoints |
| api/routers/seo.py | Updated default image URLs to route through API for tracking; added query parameter passthrough for filter tracking |
| tests/unit/api/test_routers.py | Updated test assertions to expect new API-routed og:image URLs |
| docs/architecture/plausible.md | Documented new og_image_view event, platform detection, and tracking architecture |
"""Server-side Plausible Analytics for og:image tracking.

Tracks og:image requests from social media bots (Twitter, WhatsApp, etc.)
since bots don't execute JavaScript and can't be tracked client-side.

Uses fire-and-forget pattern to avoid delaying responses.
"""

import asyncio
import logging

import httpx
from fastapi import Request


logger = logging.getLogger(__name__)

PLAUSIBLE_ENDPOINT = "https://plausible.io/api/event"
DOMAIN = "pyplots.ai"

# All platforms from nginx.conf bot detection (25 total)
PLATFORM_PATTERNS = {
    # Social Media
    "twitter": "twitterbot",
    "facebook": "facebookexternalhit",
    "linkedin": "linkedinbot",
    "pinterest": "pinterestbot",
    "reddit": "redditbot",
    "tumblr": "tumblr",
    "mastodon": "mastodon",
    # Messaging Apps
    "slack": "slackbot",
    "discord": "discordbot",
    "telegram": "telegrambot",
    "whatsapp": "whatsapp",
    "signal": "signal",
    "viber": "viber",
    "skype": "skypeuripreview",
    "teams": "microsoft teams",
    "snapchat": "snapchat",
    # Search Engines
    "google": "googlebot",
    "bing": "bingbot",
    "yandex": "yandexbot",
    "duckduckgo": "duckduckbot",
    "baidu": "baiduspider",
    "apple": "applebot",
    # Link Preview Services
    "embedly": "embedly",
    "quora": "quora link preview",
    "outbrain": "outbrain",
    "rogerbot": "rogerbot",
    "showyoubot": "showyoubot",
}


def detect_platform(user_agent: str) -> str:
    """Detect platform from User-Agent string.

    Args:
        user_agent: The User-Agent header value

    Returns:
        Platform name (e.g., 'twitter', 'whatsapp') or 'unknown'
    """
    ua_lower = user_agent.lower()
    for platform, pattern in PLATFORM_PATTERNS.items():
        if pattern in ua_lower:
            return platform
    return "unknown"


async def _send_plausible_event(user_agent: str, client_ip: str, name: str, url: str, props: dict) -> None:
    """Internal: Send event to Plausible (called as background task).

    Args:
        user_agent: Original User-Agent header
        client_ip: Client IP for geolocation
        name: Event name
        url: Page URL
        props: Event properties
    """
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            await client.post(
                PLAUSIBLE_ENDPOINT,
                headers={"User-Agent": user_agent, "X-Forwarded-For": client_ip, "Content-Type": "application/json"},
                json={"name": name, "url": url, "domain": DOMAIN, "props": props},
            )
    except Exception as e:
        logger.debug(f"Plausible tracking failed (non-critical): {e}")


def track_og_image(
    request: Request,
    page: str,
    spec: str | None = None,
    library: str | None = None,
    filters: dict[str, str] | None = None,
) -> None:
    """Track og:image request (fire-and-forget).

    Sends event to Plausible in background without blocking response.

    Args:
        request: FastAPI request for headers
        page: Page type ('home', 'catalog', 'spec_overview', 'spec_detail')
        spec: Spec ID (optional)
        library: Library ID (optional)
        filters: Query params for filtered home page (e.g., {'lib': 'plotly', 'dom': 'statistics'})
    """
    user_agent = request.headers.get("user-agent", "")
    client_ip = request.headers.get("x-forwarded-for", request.client.host if request.client else "")
    platform = detect_platform(user_agent)

    # Build URL based on page type
    if page == "home":
        url = "https://pyplots.ai/"
    elif page == "catalog":
        url = "https://pyplots.ai/catalog"
    elif library:
        url = f"https://pyplots.ai/{spec}/{library}"
    else:
        url = f"https://pyplots.ai/{spec}"

    props: dict[str, str] = {"page": page, "platform": platform}
    if spec:
        props["spec"] = spec
    if library:
        props["library"] = library
    if filters:
        # Add each filter as separate prop (e.g., filter_lib, filter_dom)
        # This handles comma-separated values like lib=plotly,matplotlib
        for key, value in filters.items():
            props[f"filter_{key}"] = value

    # Fire-and-forget: create task without awaiting
    asyncio.create_task(_send_plausible_event(user_agent, client_ip, "og_image_view", url, props))

Missing test coverage for the new analytics module. The detect_platform function, track_og_image function, and _send_plausible_event function lack unit tests. Since this repository uses comprehensive automated testing, tests should be added to verify platform detection logic, proper event property construction, and error handling in the Plausible API call.
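
As a rough illustration of the platform-detection tests this comment asks for, the sketch below exercises `detect_platform` against a reduced copy of the pattern table. The test function names and the sample User-Agent strings are assumptions chosen for illustration; they are not taken from the repository's test suite.

```python
# Reduced copy of the PR's pattern table, for illustration only.
PLATFORM_PATTERNS = {
    "twitter": "twitterbot",
    "facebook": "facebookexternalhit",
    "whatsapp": "whatsapp",
    "teams": "microsoft teams",
    "google": "googlebot",
}


def detect_platform(user_agent: str) -> str:
    """Return the first platform whose pattern occurs in the User-Agent."""
    ua_lower = user_agent.lower()
    for platform, pattern in PLATFORM_PATTERNS.items():
        if pattern in ua_lower:
            return platform
    return "unknown"


def test_known_bots_are_detected():
    # Sample User-Agent strings are illustrative, not captured traffic.
    assert detect_platform("Twitterbot/1.0") == "twitter"
    assert detect_platform("WhatsApp/2.23.20 A") == "whatsapp"
    assert detect_platform("facebookexternalhit/1.1") == "facebook"


def test_matching_is_case_insensitive():
    assert detect_platform("GOOGLEBOT/2.1") == "google"
    assert detect_platform("Microsoft Teams link preview") == "teams"


def test_unmatched_and_empty_agents_fall_back_to_unknown():
    assert detect_platform("Mozilla/5.0 (X11; Linux x86_64)") == "unknown"
    assert detect_platform("") == "unknown"
```

The functions use plain `assert` statements, so they should run under pytest or be called directly.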
@router.get("/home.png")
async def get_home_og_image(request: Request) -> Response:
    """OG image for home page with tracking.

    Supports filter params (e.g., ?lib=plotly&dom=statistics) for tracking shared filtered URLs.
    """
    # Capture filter params for tracking (e.g., ?lib=plotly&dom=statistics)
    filters = dict(request.query_params) if request.query_params else None
    track_og_image(request, page="home", filters=filters)

    return Response(
        content=_get_static_og_image(), media_type="image/png", headers={"Cache-Control": "public, max-age=86400"}
    )


@router.get("/catalog.png")
async def get_catalog_og_image(request: Request) -> Response:
    """OG image for catalog page with tracking."""
    track_og_image(request, page="catalog")

    return Response(
        content=_get_static_og_image(), media_type="image/png", headers={"Cache-Control": "public, max-age=86400"}
    )

Missing test coverage for new endpoints. The get_home_og_image and get_catalog_og_image endpoints lack unit tests to verify tracking integration, filter parameter handling, and proper response headers. Given the comprehensive test coverage in this repository, tests should be added to the TestOgImagesRouter class.
    elif library:
        url = f"https://pyplots.ai/{spec}/{library}"
    else:
        url = f"https://pyplots.ai/{spec}"

Potential bug in URL construction for spec_overview pages. When page is 'spec_overview' but library is None, the code falls through to line 124 which constructs the URL as f"https://pyplots.ai/{spec}". However, if spec is also None (which is possible based on the function signature), this will create an invalid URL "https://pyplots.ai/None". Add validation to ensure spec is not None when constructing spec-based URLs.
-    elif library:
-        url = f"https://pyplots.ai/{spec}/{library}"
-    else:
-        url = f"https://pyplots.ai/{spec}"
+    elif spec is not None and library:
+        url = f"https://pyplots.ai/{spec}/{library}"
+    elif spec is not None:
+        url = f"https://pyplots.ai/{spec}"
+    else:
+        # Fallback: missing spec for a spec-based page, use home URL to avoid invalid "None" path
+        url = "https://pyplots.ai/"
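
One way to make the suggested branches easy to verify is to pull the URL selection into a pure function. The following is a minimal sketch under that assumption: `build_page_url` and `BASE_URL` are hypothetical names, and `scatter-basic` is a made-up spec ID used only in the usage lines.

```python
from typing import Optional

BASE_URL = "https://pyplots.ai"


def build_page_url(page: str, spec: Optional[str] = None, library: Optional[str] = None) -> str:
    """Hypothetical helper: map a page type to the URL reported to Plausible."""
    if page == "home":
        return f"{BASE_URL}/"
    if page == "catalog":
        return f"{BASE_URL}/catalog"
    if spec is not None and library:
        return f"{BASE_URL}/{spec}/{library}"
    if spec is not None:
        return f"{BASE_URL}/{spec}"
    # Missing spec on a spec-based page: fall back to the home URL
    # instead of formatting None into the path.
    return f"{BASE_URL}/"


if __name__ == "__main__":
    print(build_page_url("spec_detail", "scatter-basic", "plotly"))
    print(build_page_url("spec_overview"))
```

With the guard in place, a call that omits `spec` can no longer produce a path ending in `None`.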
    query_string = str(request.query_params) if request.query_params else ""
    image_url = f"{DEFAULT_HOME_IMAGE}?{query_string}" if query_string else DEFAULT_HOME_IMAGE
    page_url = f"https://pyplots.ai/?{query_string}" if query_string else "https://pyplots.ai/"

Potential XSS vulnerability through unvalidated query parameters in og:image URL. The query parameters are passed directly into the HTML meta tags without sanitization. While request.query_params is from FastAPI and should be safe, the string conversion and concatenation could introduce edge cases. Consider using urllib.parse.urlencode to properly encode query parameters, or at minimum validate/sanitize the query string before including it in HTML attributes.
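
A minimal sketch of the encoding this comment recommends, using only the standard library: `urllib.parse.urlencode` percent-encodes the parameters and `html.escape` makes the resulting URL safe inside an HTML attribute. `build_og_image_attr` is a hypothetical helper, and how the real `seo.py` assembles its meta tags is not shown in this conversation, so treat the wiring as an assumption.

```python
import html
from urllib.parse import urlencode

DEFAULT_HOME_IMAGE = "https://api.pyplots.ai/og/home.png"


def build_og_image_attr(params) -> str:
    """Hypothetical helper: og:image URL that is safe to place in an HTML attribute."""
    # urlencode percent-encodes quotes, angle brackets and spaces in keys and values.
    query = urlencode(params)
    url = f"{DEFAULT_HOME_IMAGE}?{query}" if query else DEFAULT_HOME_IMAGE
    # html.escape turns the '&' separators into '&amp;' for use in an attribute value.
    return html.escape(url, quote=True)


if __name__ == "__main__":
    print(build_og_image_attr({"lib": "plotly", "dom": "statistics"}))
    print(build_og_image_attr({"lib": '"><script>alert(1)</script>'}))
```

Encoding first and escaping second keeps the two concerns separate: the URL stays valid, and the attribute value cannot break out of its quotes.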
    global _STATIC_OG_IMAGE
    if _STATIC_OG_IMAGE is None:
        path = Path(__file__).parent.parent.parent / "app" / "public" / "og-image.png"
        _STATIC_OG_IMAGE = path.read_bytes()

Missing error handling for file not found. If the static og-image.png file doesn't exist at the expected path, this function will raise an unhandled exception. Add a try-except block to handle FileNotFoundError and provide a meaningful error message or fallback behavior.
-        _STATIC_OG_IMAGE = path.read_bytes()
+        try:
+            _STATIC_OG_IMAGE = path.read_bytes()
+        except FileNotFoundError as exc:
+            raise HTTPException(status_code=500, detail="Static OG image not found") from exc
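
The same idea can be sketched without FastAPI so it runs standalone. In this illustrative version, `load_static_image` and `StaticImageMissingError` are hypothetical names; the suggestion above raises `HTTPException` instead, which is what an endpoint would normally want.

```python
from pathlib import Path


class StaticImageMissingError(RuntimeError):
    """Hypothetical error type; the PR suggestion raises HTTPException instead."""


# Cache of file contents keyed by path, so the file is read at most once.
_cache: dict = {}


def load_static_image(path: Path) -> bytes:
    """Read the image once, cache it, and fail with a clear message if it is missing."""
    key = str(path)
    if key not in _cache:
        try:
            _cache[key] = path.read_bytes()
        except FileNotFoundError as exc:
            raise StaticImageMissingError(f"Static OG image not found: {path}") from exc
    return _cache[key]
```

A failed read is not cached, so a later call succeeds once the file is in place.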
            props[f"filter_{key}"] = value

    # Fire-and-forget: create task without awaiting
    asyncio.create_task(_send_plausible_event(user_agent, client_ip, "og_image_view", url, props))

Fire-and-forget task created without retaining a reference, which can lead to warnings and potential task cleanup issues. The created task can be garbage collected while still running, and exceptions in the task won't be logged. Consider using asyncio.create_task with a strong reference stored in a set, or use FastAPI's BackgroundTasks for better task lifecycle management.
    return "unknown"


async def _send_plausible_event(user_agent: str, client_ip: str, name: str, url: str, props: dict) -> None:

Missing type hints for the props parameter. The parameter is typed as 'dict' but should be 'dict[str, str]' to match how it's used in the calling function and for better type safety.
-async def _send_plausible_event(user_agent: str, client_ip: str, name: str, url: str, props: dict) -> None:
+async def _send_plausible_event(
+    user_agent: str,
+    client_ip: str,
+    name: str,
+    url: str,
+    props: dict[str, str],
+) -> None:

- Add URL validation to prevent "https://pyplots.ai/None" when spec is None
- Add html.escape for query params to prevent XSS in og:image URLs
- Add FileNotFoundError handling for missing static og-image.png
- Add comprehensive tests for analytics module (detect_platform, track_og_image)
- Add tests for new og:image endpoints (home.png, catalog.png)
- Fix platform count: 27 platforms (not 25)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Changes
- api/analytics.py (NEW): Server-side Plausible tracking module
  - track_og_image() with fire-and-forget pattern
  - detect_platform() for 25 platforms (social, messaging, search, link preview)
- api/routers/og_images.py: Add tracking + new endpoints
  - /og/home.png with filter param tracking
  - /og/catalog.png for catalog page
- api/routers/seo.py: Route og:images through API
  - DEFAULT_HOME_IMAGE → https://api.pyplots.ai/og/home.png
  - DEFAULT_CATALOG_IMAGE → https://api.pyplots.ai/og/catalog.png
- docs/architecture/plausible.md: Document new event and properties

New Plausible Event
og_image_view: page, platform, spec?, library?, filter_*?

Test plan
platform, filter_lib, filter_dom, etc.
og_image_view

🤖 Generated with Claude Code