| 1 | +--- |
| 2 | +name: cu-sdk-py-sample-run |
| 3 | +description: Run a specific sample for the Azure AI Content Understanding SDK. Use when users want to run a particular sample like sample_analyze_url.py or sample_analyze_invoice.py. |
| 4 | +--- |
| 5 | + |
| 6 | +# Run a Specific Sample |
| 7 | + |
| 8 | +Run a specific sample from the Azure AI Content Understanding SDK. |
| 9 | + |
| 10 | +> **[COPILOT INTERACTION MODEL]:** This skill is designed to be interactive. At each step marked with **[ASK USER]**, pause execution and prompt the user for input or confirmation before proceeding. Do NOT silently skip these prompts. Use the `ask_questions` tool when available. |
| 11 | + |
| 12 | +## Prerequisites |
| 13 | + |
| 14 | +- Python >= 3.9 |
| 15 | +- Virtual environment set up with SDK installed (see `cu-sdk-setup` skill) |
| 16 | +- Environment variables configured in `.env` |
| 17 | +- For prebuilt analyzers: model deployments configured (run `sample_update_defaults.py` first) |
| 18 | + |
| 19 | +> **[ASK USER] Prerequisites check:** |
| 20 | +> Before proceeding, verify the user's environment: |
| 21 | +> 1. "Have you already set up your Python environment and installed the SDK?" -- If no, direct them to the `cu-sdk-setup` skill first. |
| 22 | +> 2. "Have you configured your `.env` file with your endpoint and credentials?" -- If no, direct them to Step 4 of the `cu-sdk-setup` skill. |
| 23 | +> 3. "Have you run `sample_update_defaults.py` to configure model defaults?" -- If no and they want to use prebuilt analyzers, guide them to run it first. |
| 24 | + |
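The prerequisite questions above can also be partly checked mechanically. The following is a minimal sketch, not part of the SDK: `check_prerequisites` is a hypothetical helper, and it only looks at the Python version and at `CONTENTUNDERSTANDING_ENDPOINT` (the variable named in the Troubleshooting table). It assumes the variable is already exported into the process environment; if the samples load `.env` themselves, a missing variable here does not necessarily mean the sample will fail.

```python
import os
import sys

# Minimum version taken from the Prerequisites list above.
MIN_PYTHON = (3, 9)

# Only the variable named in the Troubleshooting table is checked here;
# any other variables your .env needs are outside this sketch.
REQUIRED_ENV_VARS = ("CONTENTUNDERSTANDING_ENDPOINT",)


def check_prerequisites(version_info=None, environ=None):
    """Return a list of human-readable problems; an empty list means OK."""
    version_info = sys.version_info if version_info is None else version_info
    environ = os.environ if environ is None else environ
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python {found} found, but >= 3.9 is required")
    for name in REQUIRED_ENV_VARS:
        if not environ.get(name):
            problems.append(f"{name} is not set; configure it in .env")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("PROBLEM:", problem)
```

An empty result only covers these two checks; model defaults still have to be configured with `sample_update_defaults.py`.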
| 25 | +## Package Directory |
| 26 | + |
| 27 | +``` |
| 28 | +sdk/contentunderstanding/azure-ai-contentunderstanding |
| 29 | +``` |
| 30 | + |
| 31 | +## Available Samples |
| 32 | + |
| 33 | +All sync samples have async versions, named with an `_async` suffix, in `samples/async_samples/`. |
| 34 | + |
| 35 | +### Getting Started (Run These First) |
| 36 | + |
| 37 | +#### `sample_update_defaults` -- Required First! |
| 38 | +**One-time setup** - Configures model deployment mappings (GPT-4.1, GPT-4.1-mini, text-embedding-3-large) for your Microsoft Foundry resource. Must run before using prebuilt analyzers. |
| 39 | + |
| 40 | +#### `sample_analyze_url` -- Start Here! |
| 41 | +Analyzes content from a URL using `prebuilt-documentSearch`. Works with documents, images, audio, and video. |
| 42 | +- Key concepts: URL input, markdown extraction, multi-modal content |
| 43 | + |
| 44 | +#### `sample_analyze_binary` |
| 45 | +Analyzes local PDF/image files using `prebuilt-documentSearch`. |
| 46 | +- Key concepts: Binary input, local file reading, page properties |
| 47 | + |
| 48 | +### Document Analysis |
| 49 | + |
| 50 | +#### `sample_analyze_invoice` |
| 51 | +Extracts structured fields from invoices using `prebuilt-invoice`. |
| 52 | +- Key concepts: Field extraction (customer name, totals, dates, line items), confidence scores, array fields |
| 53 | + |
| 54 | +#### `sample_analyze_configs` |
| 55 | +Extracts advanced features: charts, hyperlinks, formulas, annotations. |
| 56 | +- Key concepts: Chart.js output, LaTeX formulas, PDF annotations, enhanced analysis options |
| 57 | + |
| 58 | +#### `sample_analyze_return_raw_json` |
| 59 | +Gets the raw JSON response for custom processing. |
| 60 | +- Key concepts: Raw response access, saving to file, debugging |
| 61 | + |
| 62 | +### Custom Analyzers |
| 63 | + |
| 64 | +#### `sample_create_analyzer` |
| 65 | +Creates a custom analyzer with a field schema for domain-specific extraction. |
| 66 | +- Key concepts: Field types (string, number, date, object, array), extraction methods (extract, generate, classify) |
| 67 | + |
| 68 | +#### `sample_create_classifier` |
| 69 | +Creates a classifier to categorize documents (Loan_Application, Invoice, Bank_Statement). |
| 70 | +- Key concepts: Content categories, segmentation, document routing |
| 71 | + |
| 72 | +### Analyzer Management |
| 73 | + |
| 74 | +#### `sample_get_analyzer` |
| 75 | +Retrieves analyzer details and configuration. |
| 76 | + |
| 77 | +#### `sample_list_analyzers` |
| 78 | +Lists all available analyzers (prebuilt and custom). |
| 79 | + |
| 80 | +#### `sample_update_analyzer` |
| 81 | +Updates an analyzer's description and tags. |
| 82 | + |
| 83 | +#### `sample_delete_analyzer` |
| 84 | +Deletes a custom analyzer. |
| 85 | + |
| 86 | +#### `sample_copy_analyzer` |
| 87 | +Copies an analyzer within the same resource. |
| 88 | + |
| 89 | +#### `sample_grant_copy_auth` |
| 90 | +Copies an analyzer across different Azure resources or regions (cross-resource copy). |
| 91 | +- Requires additional env vars: `CONTENTUNDERSTANDING_TARGET_ENDPOINT`, `CONTENTUNDERSTANDING_TARGET_RESOURCE_ID` |
| 92 | + |
| 93 | +### Result Management |
| 94 | + |
| 95 | +#### `sample_get_result_file` |
| 96 | +Retrieves keyframe images from video analysis. |
| 97 | +- Key concepts: Operation IDs, extracting generated files |
| 98 | + |
| 99 | +#### `sample_delete_result` |
| 100 | +Deletes analysis results for data cleanup. |
| 101 | +- Key concepts: Result retention (24-hour auto-deletion), compliance |
| 102 | + |
| 103 | +## Workflow |
| 104 | + |
| 105 | +### Step 1: Navigate to Package Directory |
| 106 | + |
| 107 | +```bash |
| 108 | +cd sdk/contentunderstanding/azure-ai-contentunderstanding |
| 109 | +``` |
| 110 | + |
| 111 | +### Step 2: Activate Virtual Environment |
| 112 | + |
| 113 | +```bash |
| 114 | +source .venv/bin/activate # Linux/macOS |
| 115 | +# .venv\Scripts\activate # Windows |
| 116 | +``` |
| 117 | + |
| 118 | +> **[ASK USER] Confirm venv active:** |
| 119 | +> Ask: "Is your virtual environment active? Run `which python` (or `where python` on Windows) and confirm it points to a path inside `.venv`." |
| 120 | +> If the user reports it is not active or does not exist, direct them to the `cu-sdk-setup` skill. |
| 121 | + |
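The `which python` check above has a rough equivalent inside Python itself. This is a sketch under stated assumptions: `in_virtual_env` and `interpreter_is_under` are hypothetical helper names, and the venv directory is assumed to be `.venv` as in Step 2. Comparing `sys.prefix` with `sys.base_prefix` is the standard way to detect a venv; the path check deliberately does not resolve symlinks, because a venv interpreter is often a symlink to the base installation.

```python
import sys
from pathlib import Path


def in_virtual_env(prefix=None, base_prefix=None):
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


def interpreter_is_under(venv_dir, executable=None):
    """True when the interpreter path lies somewhere inside venv_dir."""
    executable = Path(sys.executable if executable is None else executable)
    # absolute() keeps symlinks intact, unlike resolve().
    return Path(venv_dir).absolute() in executable.absolute().parents


if __name__ == "__main__":
    print("venv active:", in_virtual_env())
    print("interpreter under .venv:", interpreter_is_under(".venv"))
```

If either line prints `False`, activate the environment again or go back to the `cu-sdk-setup` skill.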
| 122 | +### Step 3: Choose and Run the Sample |
| 123 | + |
| 124 | +> **[ASK USER] Which sample?:** |
| 125 | +> Ask the user: "Which sample would you like to run?" with options: |
| 126 | +> - `sample_analyze_url` -- Analyze content from a URL (recommended for first-time users) |
| 127 | +> - `sample_analyze_binary` -- Analyze a local PDF/image file |
| 128 | +> - `sample_analyze_invoice` -- Extract structured fields from an invoice |
| 129 | +> - `sample_create_analyzer` -- Create a custom analyzer |
| 130 | +> - `sample_update_defaults` -- Configure model defaults (one-time setup) |
| 131 | +> - Other -- Let me see the full list |
| 132 | +> |
| 133 | +> If the user picks "Other", show the full Available Samples list above or run `.github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh --list`. |
| 134 | + |
| 135 | +> **[ASK USER] Sync or async?:** |
| 136 | +> Ask: "Would you like to run the **sync** or **async** version of this sample?" |
| 137 | +> - Sync (default) -- Runs in `samples/` |
| 138 | +> - Async -- Runs in `samples/async_samples/` with `_async` suffix |
| 139 | + |
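The sync/async naming convention above maps a sample name to a file path in a predictable way. The sketch below illustrates that mapping only; `resolve_sample_path` is a hypothetical helper and is not claimed to match what `run_sample.sh` does internally. Paths are relative to the package directory.

```python
from pathlib import PurePosixPath


def resolve_sample_path(sample_name):
    """Map a sample name (with or without .py) to its path under samples/."""
    # Accept both "sample_analyze_url" and "sample_analyze_url.py".
    stem = sample_name[:-3] if sample_name.endswith(".py") else sample_name
    folder = PurePosixPath("samples")
    # Async variants carry the _async suffix and live in async_samples/.
    if stem.endswith("_async"):
        folder = folder / "async_samples"
    return folder / f"{stem}.py"


if __name__ == "__main__":
    for name in ("sample_analyze_url", "sample_analyze_url_async"):
        print(name, "->", resolve_sample_path(name))
```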
| 140 | +**Run manually (recommended):** |
| 141 | + |
| 142 | +```bash |
| 143 | +# Sync samples (starting from the package directory) |
| 144 | +cd samples |
| 145 | +python sample_analyze_url.py |
| 146 | + |
| 147 | +# Async samples (also starting from the package directory, not from samples/) |
| 148 | +cd samples/async_samples |
| 149 | +python sample_analyze_url_async.py |
| 150 | +``` |
| 151 | + |
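The manual commands above change into the sample's own folder before running it, which is what keeps relative paths such as `sample_files/...` resolvable. A minimal Python sketch of the same idea follows; `run_sample` is a hypothetical helper, and the assumption that every sample expects its own folder as the working directory comes from the manual commands, not from the SDK.

```python
import subprocess
import sys
from pathlib import Path


def run_sample(script_path):
    """Run a sample with its own directory as the working directory."""
    script = Path(script_path).resolve()
    if not script.is_file():
        raise FileNotFoundError(f"No such sample: {script}")
    # cwd is the script's folder, mirroring `cd samples` (or
    # `cd samples/async_samples`) followed by `python <script>`.
    completed = subprocess.run(
        [sys.executable, script.name],
        cwd=script.parent,
        check=False,
    )
    return completed.returncode


# Example (from the package directory, with the venv active):
#   run_sample("samples/sample_analyze_url.py")
```

Using `sys.executable` means the sample runs with the same interpreter, and therefore the same virtual environment, as the caller.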
| 152 | +**Or use the script:** |
| 153 | + |
| 154 | +```bash |
| 155 | +.github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh <sample_name> |
| 156 | +``` |
| 157 | + |
| 158 | +**Examples:** |
| 159 | + |
| 160 | +```bash |
| 161 | +# Run sync sample |
| 162 | +.github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh sample_analyze_url |
| 163 | + |
| 164 | +# Run async sample |
| 165 | +.github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh sample_analyze_url_async |
| 166 | + |
| 167 | +# With .py extension (also works) |
| 168 | +.github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh sample_analyze_invoice.py |
| 169 | +``` |
| 170 | + |
| 171 | +> **[ASK USER] Sample result:** |
| 172 | +> After running the sample, ask: "Did the sample run successfully?" |
| 173 | +> - If yes: |
| 174 | +> - Show the terminal command to re-run this sample directly (e.g., `cd samples && python sample_analyze_url.py` or `cd samples/async_samples && python sample_analyze_url_async.py`) |
| 175 | +> - Briefly explain the key code concepts demonstrated in this sample (e.g., client creation, analyzer selection, result processing, content type casting) |
| 176 | +> - Then ask: "Would you like to run another sample, or are you all set?" |
| 177 | +> - If no: Help troubleshoot using the Troubleshooting section below. Common issues include missing `.env` configuration, inactive venv, or model defaults not configured. |
| 178 | + |
| 179 | +> **[ASK USER] Run another?:** |
| 180 | +> If the user wants to run another sample, loop back to the "Which sample?" prompt above. |
| 181 | + |
| 182 | +## Quick Reference |
| 183 | + |
| 184 | +### Most Common Samples for New Users |
| 185 | + |
| 186 | +1. **First-time setup** (run once per Foundry resource): |
| 187 | + ```bash |
| 188 | + .github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh sample_update_defaults |
| 189 | + ``` |
| 190 | + |
| 191 | +2. **Analyze a document from URL:** |
| 192 | + ```bash |
| 193 | + .github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh sample_analyze_url |
| 194 | + ``` |
| 195 | + |
| 196 | +3. **Analyze a local PDF file:** |
| 197 | + ```bash |
| 198 | + .github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh sample_analyze_binary |
| 199 | + ``` |
| 200 | + |
| 201 | +4. **Extract invoice fields:** |
| 202 | + ```bash |
| 203 | + .github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh sample_analyze_invoice |
| 204 | + ``` |
| 205 | + |
| 206 | +### List Available Samples |
| 207 | + |
| 208 | +```bash |
| 209 | +.github/skills/cu-sdk-py-sample-run/scripts/run_sample.sh --list |
| 210 | +``` |
| 211 | + |
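If the script is not available, the sample list can be approximated by scanning the two sample folders. This is a sketch only: `list_samples` is a hypothetical helper, and it is not claimed to reproduce the output of `run_sample.sh --list`. It assumes the `samples/` and `samples/async_samples/` layout described above.

```python
from pathlib import Path


def list_samples(package_dir="."):
    """Return sync and async sample names found under package_dir/samples."""
    samples_dir = Path(package_dir) / "samples"
    sync = sorted(p.stem for p in samples_dir.glob("sample_*.py"))
    async_dir = samples_dir / "async_samples"
    async_names = sorted(p.stem for p in async_dir.glob("sample_*_async.py"))
    return {"sync": sync, "async": async_names}


if __name__ == "__main__":
    found = list_samples()
    for kind, names in found.items():
        print(f"{kind}: {len(names)} sample(s)")
        for name in names:
            print("  ", name)
```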
| 212 | +## Troubleshooting |
| 213 | + |
| 214 | +| Error | Solution | |
| 215 | +|-------|----------| |
| 216 | +| `ModuleNotFoundError: azure.ai.contentunderstanding` | Activate venv: `source .venv/bin/activate` then `pip install azure-ai-contentunderstanding` | |
| 217 | +| `ImportError: aiohttp package is not installed` | Install dev dependencies: `pip install -r dev_requirements.txt` | |
| 218 | +| `KeyError: 'CONTENTUNDERSTANDING_ENDPOINT'` | Create `.env` file with credentials (see `cu-sdk-setup` skill) | |
| 219 | +| `FileNotFoundError: sample_files/...` | Run samples from the `samples/` directory | |
| 220 | +| `Access denied` or authorization errors | Ensure **Cognitive Services User** role is assigned; check API key or run `az login` | |
| 221 | +| `Model deployment not found` | Run `sample_update_defaults.py` first to configure model mappings | |
| 222 | + |
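The table above can be turned into a simple lookup for triaging a failed run. The following is a rough sketch: `suggest_fix` and `KNOWN_ERRORS` are hypothetical names, the match is a plain substring test on the error text, and the fixes are copied from the table rather than derived from the SDK.

```python
# (substring to look for, suggested fix), in the order of the table above.
KNOWN_ERRORS = (
    ("ModuleNotFoundError",
     "Activate the venv, then `pip install azure-ai-contentunderstanding`."),
    ("aiohttp package is not installed",
     "Install dev dependencies: `pip install -r dev_requirements.txt`."),
    ("CONTENTUNDERSTANDING_ENDPOINT",
     "Create a `.env` file with credentials (see the `cu-sdk-setup` skill)."),
    ("FileNotFoundError",
     "Run samples from the `samples/` directory."),
    ("Access denied",
     "Check the Cognitive Services User role, the API key, or run `az login`."),
    ("Model deployment not found",
     "Run `sample_update_defaults.py` first to configure model mappings."),
)


def suggest_fix(error_text):
    """Return the first matching fix from the table, or None if nothing matches."""
    for needle, fix in KNOWN_ERRORS:
        if needle in error_text:
            return fix
    return None


if __name__ == "__main__":
    print(suggest_fix("KeyError: 'CONTENTUNDERSTANDING_ENDPOINT'"))
```

A `None` result means the error is not covered by the table and needs manual investigation.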
| 223 | +## Related Skills |
| 224 | + |
| 225 | +- `cu-sdk-setup` - Set up environment for running samples |
| 226 | + |
| 227 | +## Additional Resources |
| 228 | + |
| 229 | +- [Samples README](../../../samples/README.md) - Detailed sample descriptions with key concepts |
| 230 | +- [SDK README](../../../README.md) - Full SDK documentation |
| 231 | +- [Product Documentation](https://learn.microsoft.com/azure/ai-services/content-understanding/) |