adjust primary skill.md framing with triggers and examples#456
adjust primary skill.md framing with triggers and examples#456paulirish wants to merge 21 commits into
Conversation
…eaving rename out for separate review)
rviscomi
left a comment
There was a problem hiding this comment.
If it works, it works. But do you think it might be a little too finely tuned to today's set of use cases (scrolling, forms, etc) and may not trigger for whatever else we come up with in the future? Or is that a known limitation and the plan is to periodically update it with more keywords?
LeaVerou
left a comment
There was a problem hiding this comment.
Added some thoughts/ideas/suggestions, in case it's helpful!
All with the caveat that most of my experience is around Claude and skills written primarily for it, so I'm not sure how much of it transfers to Gemini. E.g. perhaps some of these were worded specifically to address certain issues with Gemini that I'm unaware of.
Note: Claude at least truncates descriptions to 1024 characters, so we should be sure whatever we end up with fits within that.
| name: modern-web-use-cases | ||
| description: | | ||
| IMPORTANT: This is a search tool that will help you find the most modern and recommended way to implement any web development use case. | ||
| Search and retrieve modern web platform best practices and Chrome-recommended implementation guides. |
There was a problem hiding this comment.
"Search and retrieve" describes the means, not the goal, so it seems a little superfluous — the means will become clear once the agent has picked the skill, the description is a bid to signal relevance.
I wonder if "web development best practices" may help make it a tiny bit more relevant than "web platform". 🤔
Is the goal of "and Chrome-recommended implementation guides." to signal authority? If so, I wonder if name-dropping Google could help more, e.g. "by the Google Chrome team" will do? (edit: after some research, it looks like signaling authority doesn't help here, so perhaps drop it altogether?)
Putting it all together, what about something like:
| Search and retrieve modern web platform best practices and Chrome-recommended implementation guides. | |
| Create performant, accessible, lightweight frontend interfaces using the most current web development best practices. |
Still too general though, so will get a low score. :/
There was a problem hiding this comment.
Completely agree. I've reframed the opening objective exactly as you recommended:
Create performant, accessible, and lightweight user interfaces using current web development best practices.
This positions relevance based on design objectives rather than technical search mechanics. I also dropped the specific 'Chrome-recommended' authority indicators as my downstream capability routing benchmarks verified that authoritative claims do not drive higher matching scores.
| description: | | ||
| IMPORTANT: This is a search tool that will help you find the most modern and recommended way to implement any web development use case. | ||
| Search and retrieve modern web platform best practices and Chrome-recommended implementation guides. | ||
| MANDATORY: Use this at the start of EVERY web development task, including CSS styling (layout, animations, transforms), UI components (modals, carousels, forms), performance optimization (LCP, INP, resource loading), accessibility, and modern JavaScript API usage. |
There was a problem hiding this comment.
Having trigger phrases/words is good but I wonder if some of this may be too specific? At least we should add "e.g." to make it clear they're examples.
Looking at how Claude's frontend-design skill does their targeting (which seems to work quite well, at least in Claude), what about something a little broader like:
| MANDATORY: Use this at the start of EVERY web development task, including CSS styling (layout, animations, transforms), UI components (modals, carousels, forms), performance optimization (LCP, INP, resource loading), accessibility, and modern JavaScript API usage. | |
| Use this skill at the start of every web development task, including when the user asks to build web components, pages, artifacts, or web applications (examples include websites, landing pages, dashboards, HTML/CSS layouts). The skill should be consulted before implementing, modifying, or optimizing any web UI, or any HTML, CSS, or JS code. |
There was a problem hiding this comment.
Yup. i like some of those specifics. iterated a lot but. yeah. incorporated.
| IMPORTANT: This is a search tool that will help you find the most modern and recommended way to implement any web development use case. | ||
| Search and retrieve modern web platform best practices and Chrome-recommended implementation guides. | ||
| MANDATORY: Use this at the start of EVERY web development task, including CSS styling (layout, animations, transforms), UI components (modals, carousels, forms), performance optimization (LCP, INP, resource loading), accessibility, and modern JavaScript API usage. | ||
| Trigger this skill whenever the user mentions: "frontend", "web page", "CSS", "React", "component", "animation", "scrolling", "performance", or "accessibility". |
There was a problem hiding this comment.
Same as above, these seem too specific. They would work, but I'm worried trying to list enough keywords to trigger the skill in every case it should be triggered is a losing battle…
| Search and retrieve modern web platform best practices and Chrome-recommended implementation guides. | ||
| MANDATORY: Use this at the start of EVERY web development task, including CSS styling (layout, animations, transforms), UI components (modals, carousels, forms), performance optimization (LCP, INP, resource loading), accessibility, and modern JavaScript API usage. | ||
| Trigger this skill whenever the user mentions: "frontend", "web page", "CSS", "React", "component", "animation", "scrolling", "performance", or "accessibility". | ||
| Even for common tasks like centering a div or creating a dialog, you MUST check this first to avoid legacy patterns or heavy libraries that have been replaced by native browser features (e.g., :has(), <dialog>, popover). This ensures implementations are evergreen and avoid technical debt. |
There was a problem hiding this comment.
One potential concern is that looking up the newer alternative to a "legacy pattern" presupposes the agent is already aware the pattern is legacy, which is exactly the problem we're trying to solve.
Also, another issue is that agents are aware of many of these features (definitely the three examples mentioned here!), but are often being extremely conservative around applying them due to outdated baseline info, so I wonder if we could touch on that too.
Just to take a stab:
| Even for common tasks like centering a div or creating a dialog, you MUST check this first to avoid legacy patterns or heavy libraries that have been replaced by native browser features (e.g., :has(), <dialog>, popover). This ensures implementations are evergreen and avoid technical debt. | |
| Consult this even before simple or common tasks, as many widespread patterns are now legacy and have been replaced by native browser features, many of which have only recently become widely available. Generates performant, accessible, reliable, cross-browser code which ensures implementations are evergreen and avoid technical debt. |
There was a problem hiding this comment.
This seems very vague. What is "implementing a web feature"?
But also, a lot of this seems to be repeating routing guidance from the description. Do we need a separate When to use section? If my understanding is correct, by the time the agent is here, it's already decided that this skill is relevant, so the content of the skill is mainly about what to do, the when to do it is in the description.
|
Unfortunately, in our testing, neither the edits in this PR, nor any of my suggestions helped. 😢 See #574 |
|
Appreciate all these comments! Also note to myself: i think the name of the skill should be just "modern web". especially for when devs need to type "no.. use the modern web skill first" 3 weeks later: i renamed it to modern-web but now... we'll retitle it to modern-web-guidance because then everyyyyyyyything has that same name. |
|
@paulirish can you give us a status update on this change? It'd be great to land it ASAP before we do much more QA testing, as I'd expect it to strongly affect guide discoverability. |
…ions-Bypass format TAG=agy CONV=87b88efb-ad93-46d8-b5f4-4a9df35a0ebd
…ing and align async tests
|
I have some edits I think improve things but micah is doing a more comprehensive approach. this particular file is the most bikesheddable thing ever. so... i think we're gonna need evals for all of it. rough. |
|
Compiled the results from the analysis here, and also leaving a comment in the PR for the suggested change to the description (based on the results). |
| Do NOT invoke this skill for trivial visual tweaks, single-property updates, or standard baseline styles (e.g., basic colors, padding, borders, typography) that are fully ready from general memory. | ||
|
|
||
| Do NOT skip this step even if you already know how to implement the use case — web platform APIs evolve rapidly and your training data may recommend outdated approaches. This tool ensures you use the current best practice. | ||
| WARNING: Stale training memory relies on obsolete scripts and bloated design patterns. You are strictly forbidden from authoring frontend structures from memory without searching this skill first, except for the basic changes listed above. |
There was a problem hiding this comment.
Result of description from AGY analysis:
description: |
IMPORTANT: This is a specialized search and retrieval tool for modern web platform best practices and Chrome-recommended implementation patterns.
MANDATORY: You MUST execute this FIRST at the start of EVERY web development task touching HTML, CSS, or client-side JavaScript (including UI components, layout, styling, animations, performance tuning, or accessibility).
Do NOT skip this step even if you already know how to implement the use case — web APIs evolve rapidly and your training data contains obsolete patterns. Using this tool is the only way to ensure compliance with Chrome's modern standards.
Trigger this skill immediately when the user prompt involves:
- UI Components (modals, dialogs, popovers, dropdowns, tabs, carousels)
- Layout & Styling (frosted glass/Glassmorphism, backdrop-filter, centering, grid/flexbox, anchor positioning, responsive designs, `:has()`, `:user-valid`)
- Animations & Scrolling (View Transitions, Scroll-driven animations, scroll-triggered behaviors)
- Performance & Resource Loading (LCP, INP, Fetch Priority, image optimization, lazy loading)
- Device & OS Integration (direct local file system access, local directory read/write, WebUSB)
- High-Performance & Real-Time (WebAssembly data engines, WASM widgets, WebSockets bidirectional syncing)
- Modern JS APIs (AbortController, field-sizing, native browser APIs)
- Framework Adaptation (implementing CSS/styling/UI patterns in React, Angular, Vue, etc.)
DO NOT trigger this skill for:
- Pure backend tasks (server-side API routing, database SQL queries, ORM setup)
- CI/CD, deployment, containerization, or build-pipeline scripts (e.g., Docker, GitHub Actions)
- Non-web programming tasks (command-line tools, system utilities, general Python/Go scripts)
- Generic project configuration (e.g., setting up git, eslint, package.json scripts)
There was a problem hiding this comment.
This looks great, but isn't it too long? (1875 chars).
At least Claude's cap is no longer 250 characters (!) but it's still 1500.
Though given that the last list is when not to trigger, I guess the truncation doesn't lose us much.
There was a problem hiding this comment.
Ah, great point. Most of the other frontier models have a hard limit at 1024 characters, I believe. Here is a much more concise version (991 characters):
IMPORTANT: Specialized search/retrieval tool for Chrome guidelines. MANDATORY: Execute FIRST for all HTML/CSS/JS tasks. Do NOT skip — web APIs evolve rapidly and training weights contain obsolete patterns. Using this tool is required to comply with Chrome standards.
Trigger immediately when prompt involves:
- UI & Layout: Modals, dialogs, popovers, Glassmorphism/backdrop-filters, anchor positioning, container queries, `:has()`, `:user-valid`.
- Scroll & Motion: View Transitions, Scroll-driven animations, scroll parallax/reveals.
- Performance: Core Vitals (LCP, INP), content-visibility, Fetch Priority, image optimization, AbortController.
- System & APIs: Local filesystem access, WebUSB, WebSockets sync, WebAssembly widgets.
- Frameworks: Adapting layout/styles in React, Vue, Angular.
DO NOT trigger for:
- Backend: Database SQL, ORM schemas, Express API routes.
- Pipelines: CI/CD deployment pipelines, Docker, Actions.
- Generic: Local scripts (Python/Go tools), ESLint, Git.
This was tested against the same dataset had the same results (0% False Positive rate, 86.7% trigger accuracy).
This is for 1) better activation rates and 2) better CLI usage and 3) better guidance adhereence
Upgrades the
modern-web-use-cases/SKILL.mdto improve agent activation frequency and tool-usage reliability by leaning into modern web platform standards.frontend,scrolling,animation,CSS,React) to intercept legacy usage early.eval-readyquery examples (full-session-analytics,optimize-image-priority) straight from our workspace.prompts