Skip to content

Commit 8d96cd8

Browse files
committed
Add a sample SKILL.md
1 parent 84d22d3 commit 8d96cd8

2 files changed

Lines changed: 80 additions & 0 deletions

File tree

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@
2222
- [Available Flags](#available-flags)
2323
- [Instance Management](#instance-management)
2424
- [Response Data](#response-data)
25+
- [AI Agent Skills](SKILL.md)
2526
- [License](#license)
2627

2728
## Key Features

SKILL.md

Lines changed: 79 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,79 @@
1+
# Metafetch Skills for AI Agents
2+
3+
Metafetch is a powerful, high-performance library and CLI tool for extracting metadata, Open Graph data, JSON-LD, and more from any web page. It supports client-side rendering (SPAs) and performance optimizations like head-only fetching.
4+
5+
## Capabilities
6+
- **Metadata Extraction:** Title, description, site name, images, favicons, feeds, videos, audio.
7+
- **Structured Data:** Parsed JSON-LD (filtered or full), Microdata, RDFa, and Web App Manifests.
8+
- **SPA Support:** Renders JavaScript-heavy sites using Puppeteer.
9+
- **Performance:** `headOnly` mode to stop downloading after the `</head>` tag (massive bandwidth/time savings).
10+
- **Robustness:** Automatic retries with exponential backoff.
11+
12+
## Installation
13+
14+
```bash
15+
npm install metafetch
16+
# Optional: Install puppeteer for SPA rendering support
17+
npm install puppeteer
18+
```
19+
20+
## CLI Usage (Recommended for Agents)
21+
22+
Agents should prefer the CLI for quick information gathering.
23+
24+
### Basic Fetch
25+
```bash
26+
npx metafetch https://example.com
27+
```
28+
29+
### Performance Optimized (Fastest)
30+
If you only need basic meta tags and want to save time/bandwidth:
31+
```bash
32+
npx metafetch https://example.com --head-only --pretty
33+
```
34+
35+
### SPA / Client-Side Rendering
36+
If the page is empty or missing metadata (React/Vue/etc.):
37+
```bash
38+
npx metafetch https://example.com --render
39+
```
40+
41+
### Filtering Structured Data
42+
To extract only specific JSON-LD types (e.g., Products or Recipes):
43+
```bash
44+
npx metafetch https://example.com --jsonld-types Product,Recipe
45+
```
46+
47+
### Key-Value Output (Easy to parse)
48+
```bash
49+
npx metafetch https://example.com --format kv
50+
```
51+
52+
## API Usage (For Code Generation)
53+
54+
When writing scripts that use `metafetch`, use this pattern:
55+
56+
```javascript
57+
import metafetch from 'metafetch';
58+
59+
async function getMetadata(url) {
60+
const meta = await metafetch.fetch(url, {
61+
// Optimization: Only fetch what you need
62+
flags: {
63+
links: false,
64+
images: false,
65+
microdata: false
66+
},
67+
headOnly: true, // Optional: Stop after </head>
68+
retries: 2 // Optional: Handle transient failures
69+
});
70+
return meta;
71+
}
72+
```
73+
74+
## Best Practices for AI Agents
75+
76+
1. **Default to `headOnly`:** Unless you need images from the body or full link lists, use `--head-only`. It is significantly faster and uses less context.
77+
2. **Check for SPAs:** If the first fetch returns an empty description or generic title, retry with `--render`.
78+
3. **Use Flags:** In code, disable unnecessary flags (e.g., `links: false`) to reduce the size of the returned object and keep the agent's context window lean.
79+
4. **Verify Puppeteer:** If `--render` fails, ensure `puppeteer` is installed in the environment.

0 commit comments

Comments
 (0)