Commit fe5dba7

Merge branch 'main' into copilot/convert-prompt-to-md-file
2 parents a3736b6 + 2d5c79a commit fe5dba7

19 files changed: +969 -180 lines changed
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
---
applyTo: ".github/workflows/**"
---

# Workflow Security: Validating Untrusted User Input

## Overview

Workflows in this repository that are triggered by untrusted user input (issue
bodies, PR descriptions, comments, branch names, etc.) **must** validate that
input for hidden characters and potential prompt injection before processing it.

This is especially important for workflows that pass user content to AI/LLM
systems (e.g. GitHub Copilot agents), but also applies to any automated
processing where a malicious actor could influence the workflow's behavior.

## The Central Validation Script

**`.github/workflows/validate-input.sh`** is the single, authoritative script
for this check. It detects:

| Threat | Description |
|--------|-------------|
| Bidirectional Unicode control characters | Trojan Source attack (CVE-2021-42574): makes text look different to human reviewers than to the AI or tooling that processes it |
| Zero-width / invisible characters | Hidden text injected between visible characters, invisible to human reviewers |
| Unicode tag characters (U+E0000–U+E007F) | Completely invisible; can encode arbitrary ASCII instructions |
| Unicode variation selectors | Can steganographically encode hidden data |
| HTML comment blocks (`<!-- ... -->`) | Stripped by GitHub's renderer but fully visible to LLMs processing raw Markdown |
| Non-printable control characters | Unexpected control bytes that may confuse parsers |

If any of the above are found, the script:
1. **Posts a warning comment** to the issue or PR, listing every finding and
   linking back to the workflow run that caught it.
2. **Exits with a non-zero code**, failing the workflow job immediately so that
   no further processing occurs on the untrusted content.
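
The detection pass described in the table could be sketched roughly as follows. This is a minimal illustration, not the contents of `validate-input.sh`: the function name `find_hidden_content`, the exact code point sets, and the finding strings are all assumptions made for the example.

```python
import re
import unicodedata

# Assumed code point sets; the real script's lists may differ.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")
ZERO_WIDTH = set("\u200b\u200c\u200d\u2060\ufeff")
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def find_hidden_content(text: str) -> list[str]:
    """Return one human-readable finding per threat class present in text."""
    findings = []
    if any(ch in BIDI_CONTROLS for ch in text):
        findings.append("Bidirectional Unicode control characters")
    if any(ch in ZERO_WIDTH for ch in text):
        findings.append("Zero-width / invisible characters")
    if any(0xE0000 <= ord(ch) <= 0xE007F for ch in text):
        findings.append("Unicode tag characters")
    # Variation selectors: U+FE00..U+FE0F and the supplement U+E0100..U+E01EF.
    if any(0xFE00 <= ord(ch) <= 0xFE0F or 0xE0100 <= ord(ch) <= 0xE01EF for ch in text):
        findings.append("Unicode variation selectors")
    if HTML_COMMENT.search(text):
        findings.append("HTML comment block")
    # Category "Cc" covers C0/C1 controls; tab, LF and CR are normal in Markdown.
    if any(unicodedata.category(ch) == "Cc" and ch not in "\t\n\r" for ch in text):
        findings.append("Non-printable control characters")
    return findings
```

A caller would fail the job when the returned list is non-empty. Note that U+FE0F also appears in ordinary emoji sequences, so a production check may need an allow-list to avoid false positives.
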

## How to Use the Script in a Workflow

Add a validation step **before** any step that reads or processes the untrusted
input. The step must run after the repository is checked out (so the script file
is available), and it needs a `GH_TOKEN` with write access to post comments.

```yaml
- name: Validate <input source> for hidden content
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    INPUT_TEXT: ${{ github.event.issue.body }}     # ← the untrusted text
    ITEM_NUMBER: ${{ github.event.issue.number }}  # ← issue or PR number
    REPO: ${{ github.repository }}
    RUN_ID: ${{ github.run_id }}
    SERVER_URL: ${{ github.server_url }}
    CONTEXT_TYPE: issue                            # "issue" or "pr"
    FINDINGS_FILE: /tmp/validation-findings.txt
  run: bash .github/workflows/validate-input.sh
```

For a pull request body, swap the event expressions and set `CONTEXT_TYPE: pr`:

```yaml
- name: Validate PR body for hidden content
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    INPUT_TEXT: ${{ github.event.pull_request.body }}
    ITEM_NUMBER: ${{ github.event.pull_request.number }}
    REPO: ${{ github.repository }}
    RUN_ID: ${{ github.run_id }}
    SERVER_URL: ${{ github.server_url }}
    CONTEXT_TYPE: pr
    FINDINGS_FILE: /tmp/validation-findings.txt
  run: bash .github/workflows/validate-input.sh
```
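
To show how the environment variables above fit together, here is a hedged sketch of how a warning comment with a link back to the workflow run might be assembled. The function name `build_warning_comment` and the comment wording are hypothetical; only the variable names (`SERVER_URL`, `REPO`, `RUN_ID`, `CONTEXT_TYPE`) come from the step definitions, and the run URL follows GitHub's `<server>/<owner>/<repo>/actions/runs/<id>` layout.

```python
def build_warning_comment(findings: list[str], env: dict[str, str]) -> str:
    """Format a Markdown warning that links back to the workflow run."""
    # Run URL assembled from the variables the workflow step exports.
    run_url = f"{env['SERVER_URL']}/{env['REPO']}/actions/runs/{env['RUN_ID']}"
    kind = "pull request" if env.get("CONTEXT_TYPE") == "pr" else "issue"
    lines = [f"Warning: hidden content was detected in this {kind}.", ""]
    lines += [f"- {finding}" for finding in findings]
    lines += ["", f"Detected by [this workflow run]({run_url})."]
    return "\n".join(lines)
```

Inside the workflow, the `env` argument would typically be `dict(os.environ)`, and the resulting text would be posted to the item identified by `ITEM_NUMBER`.
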

## Deciding Whether a Workflow Needs Validation

Apply the validation step when **all** of the following are true:

1. The workflow is triggered by a user-controllable event:
   `issues`, `issue_comment`, `pull_request`, `pull_request_review`,
   `pull_request_review_comment`, `discussion`, `discussion_comment`, etc.
2. The workflow reads a **text field** from the event payload that a user wrote:
   `.body`, `.title`, `.comment.body`, `.review.body`, branch names, etc.
3. That text is subsequently processed by an automated system (especially an AI).

You do **not** need the script for:
- Purely numeric fields like `issue.number` or `pull_request.number`.
- Internal, trusted triggers (`workflow_dispatch` with controlled inputs,
  `push` to protected branches, `schedule`, etc.).
- Metadata-only fields like `pull_request.draft` or `label.name`.
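
The three conditions combine with a logical AND, which the following small predicate makes explicit. It is only an illustration: `needs_validation` is a hypothetical helper, and the event set copies the names listed above, which the text itself marks as non-exhaustive ("etc.").

```python
# Event names copied from the list above; the source list is open-ended.
USER_CONTROLLED_EVENTS = {
    "issues", "issue_comment", "pull_request", "pull_request_review",
    "pull_request_review_comment", "discussion", "discussion_comment",
}


def needs_validation(event: str, reads_user_text: bool, automated_processing: bool) -> bool:
    """True only when all three conditions from the checklist hold."""
    return event in USER_CONTROLLED_EVENTS and reads_user_text and automated_processing
```

For instance, a `schedule`-triggered job never qualifies, and neither does an `issue_comment` job that only reads `issue.number`.
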

## Permissions

The validation step requires the `issues: write` (or `pull-requests: write`)
permission on the job so the `gh` CLI can post the warning comment:

```yaml
jobs:
  my-job:
    permissions:
      issues: write   # needed to post the warning comment
      contents: read
```

## Keeping the Script Up to Date

If you discover a new class of hidden-character or injection attack not already
covered, add a new detection block to `.github/workflows/validate-input.sh`
under its clearly labelled sections. Keep detection logic inside the Python
heredoc so Unicode handling is reliable across all runners.

Document any new threat type with:
- A short comment explaining the attack and why it is dangerous.
- An example of the Unicode code points or patterns being detected.
- A human-readable finding message added to the `findings` list.
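
As a sketch of what such a documented detection block might look like, the example below checks for interlinear annotation characters (U+FFF9 to U+FFFB). This threat class is chosen purely for illustration and is not something the source says the script handles; the function name and message format are likewise assumptions, and `findings` is assumed to be a plain list of strings.

```python
def check_interlinear_annotations(text: str, findings: list[str]) -> None:
    # Attack (illustrative): interlinear annotation controls are format
    # characters that renderers typically do not display, so they can be
    # used to smuggle structure a human reviewer is unlikely to notice.
    # Example code points: U+FFF9 (anchor), U+FFFA (separator), U+FFFB (terminator).
    hits = sorted({f"U+{ord(ch):04X}" for ch in text if 0xFFF9 <= ord(ch) <= 0xFFFB})
    if hits:
        # Human-readable finding message appended to the shared list.
        findings.append(f"Interlinear annotation characters found: {', '.join(hits)}")
```

The block carries all three documentation items: the explanatory comment, the example code points, and the message added to `findings`.
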

.github/skills/azure-storage-loader/package-lock.json

Lines changed: 27 additions & 8 deletions
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
---
title: Check URLs Skill
description: Scan TypeScript source files for hardcoded URLs and verify they resolve
lastUpdated: 2026-03-18
---

# Check URLs Skill

A GitHub Copilot Agent Skill that finds all hardcoded `http(s)://` URLs in the TypeScript source files and verifies that each one still resolves.

## Files in This Directory

- **SKILL.md**: Main skill file with YAML frontmatter and detailed instructions for the agent
- **check-urls.js**: Node.js script that performs the scan and HTTP resolution checks
- **README.md**: This file

## Quick Usage

```bash
node .github/skills/check-urls/check-urls.js
```

## How to Invoke via Copilot

Ask Copilot something like:
- "Check that all hardcoded URLs in the source code still resolve"
- "Are any of the links in the fluency hints broken?"
- "Validate all URL links in the TypeScript files"

.github/skills/check-urls/SKILL.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
---
name: check-urls
description: Find all hardcoded URLs in TypeScript source files and verify they resolve (return HTTP 2xx/3xx). Use when you want to validate that links in tips, hints, and documentation strings are still live.
---

# Check URLs Skill

This skill scans all TypeScript source files for hardcoded `http://` and `https://` URLs and performs HTTP HEAD requests to verify each one resolves without a 4xx/5xx error.

## When to Use This Skill

Use this skill when you need to:
- Validate links added to fluency hints or tips in `maturityScoring.ts`
- Check that VS Code docs URLs, tech.hub.ms video links, or any other hardcoded URLs are still live
- Audit the codebase after bulk URL changes to catch 404s before a release
- Routinely health-check external references as part of a maintenance pass

## Running the Check

```bash
node .github/skills/check-urls/check-urls.js
```

The script will:
1. Recursively scan every `*.ts` file under `src/`
2. Extract all unique `https?://...` URLs (stripping trailing punctuation and skipping template literals)
3. Send an HTTP HEAD request to each URL (with a 10-second timeout)
4. If HEAD returns any 4xx status, automatically retry with GET, because some servers (e.g. intent URLs, social sharing endpoints) return 404/405 for HEAD but respond correctly to GET
5. Print a summary showing ✅ OK, ⚠️ REDIRECT, or ❌ BROKEN for every URL
6. Exit with code `1` if any URL ends up ❌ BROKEN: a 4xx status on both HEAD and GET, a 5xx status, or a timeout
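
Step 2 can be illustrated with a short sketch. It is written in Python only to show the idea; the actual `check-urls.js` is a Node.js script whose regular expression and punctuation rules are not shown here, so the pattern, the stripped characters, and the name `extract_urls` are all assumptions.

```python
import re

# Assumed pattern: stop at whitespace, quotes, backticks, angle brackets and ")".
URL_PATTERN = re.compile(r"https?://[^\s\"'`<>)]+")


def extract_urls(source: str) -> list[str]:
    """Return the unique URLs in source, sorted, skipping template-literal URLs."""
    urls = set()
    for match in URL_PATTERN.finditer(source):
        url = match.group(0).rstrip(".,;:!?")  # strip trailing punctuation
        if "${" in url:
            continue  # interpolated template literal; not a fixed URL
        urls.add(url)
    return sorted(urls)
```

Collecting into a set before sorting gives the "unique" behaviour, so a URL that appears in several files would be requested only once.
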

## Interpreting Output

| Symbol | Meaning |
|--------|---------|
| ✅ OK | 2xx response: URL is live |
| ⚠️ REDIRECT | 3xx response: URL redirects; consider updating to the final destination |
| ❌ BROKEN | 4xx/5xx or connection failure: URL must be fixed |
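
The mapping from HTTP status to the three labels, including the HEAD-then-GET retry, could look roughly like this. It is a Python sketch of the logic, not the Node.js implementation: `check_url` and the `Requester` callback are hypothetical, and the callback is injected so the logic can be exercised without network access.

```python
from typing import Callable, Optional

# A requester takes (method, url) and returns an HTTP status code,
# or None on timeout or connection failure.
Requester = Callable[[str, str], Optional[int]]


def check_url(url: str, request: Requester) -> str:
    """Classify a URL as "OK", "REDIRECT" or "BROKEN"."""
    status = request("HEAD", url)
    if status is not None and 400 <= status < 500:
        # Some servers reject HEAD but answer GET, so retry once.
        status = request("GET", url)
    if status is None:
        return "BROKEN"
    if 200 <= status < 300:
        return "OK"
    if 300 <= status < 400:
        return "REDIRECT"
    return "BROKEN"
```

A driver would call this for every extracted URL and exit with status `1` if any result is `"BROKEN"`.
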

## After Finding Broken URLs

1. **404 on tech.hub.ms**: The slug may have changed or the page was removed. Check `https://tech.hub.ms` to find the replacement and update `src/maturityScoring.ts`.
2. **404 on code.visualstudio.com**: The VS Code docs may have been reorganised. Search [VS Code docs](https://code.visualstudio.com/docs) for the relevant topic and update the link.
3. **Timeout**: May be a transient network issue. Re-run the script to confirm before changing anything.
4. After fixing, re-run `node .github/skills/check-urls/check-urls.js` to confirm all URLs resolve.
5. Run `npm run compile` to confirm the TypeScript build still passes.

## Files in This Directory

- **SKILL.md**: This file; instructions for the skill
- **check-urls.js**: Node.js script that performs the URL scan and resolution check
- **README.md**: Short overview of the skill
