Azure-Samples
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
Lines changed: 2 additions & 1 deletion
diff --git a/.github/python.instructions.md b/.github/python.instructions.md
Lines changed: 2 additions & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 1 addition & 1 deletion
diff --git a/assets/APIM-Samples-Slide-Deck.html b/assets/APIM-Samples-Slide-Deck.html
Lines changed: 1 addition & 1 deletion
diff --git a/docs/index.html b/docs/index.html
Lines changed: 1 addition & 1 deletion
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 9 additions & 8 deletions
diff --git a/samples/costing/README.md b/samples/costing/README.md
Lines changed: 70 additions & 37 deletions
@@ -172,7 +172,7 @@ Structure:
    - Standard library imports (time, json, tempfile, requests, pathlib, datetime)
    - `utils`, `apimtypes`, `console`, `azure_resources` (including `az`, `get_infra_rg_name`, `get_account_info`)
 2. USER CONFIGURATION section:
-   - `rg_location`: Azure region (default: 'eastus2')
+   - `rg_location`: Azure region (default: `Region.EAST_US_2`)
    - `index`: Deployment index for resource naming (default: 1)
    - `deployment`: Selected infrastructure type (reference INFRASTRUCTURE enum options)
    - `api_prefix`: Prefix for APIs to avoid naming collisions
@@ -407,6 +407,7 @@ Check `docs/README.md` for local preview instructions and styling notes. The pag
 - Existing cells must keep a unique `metadata.id` value.
 - New cells do not need a `metadata.id` value unless an editor or tool assigns one.
 - Keep notebook JSON logically structured and valid. Do not emit partial notebook fragments when a full notebook document is required.
+- Place **all** `import` statements at the top of every code cell, before any other code. Never nest imports inside `if` / `else` / `try` blocks within a cell. Ruff's `PLC0415` does not flag imports inside module-level conditionals, so this must be enforced manually.
 - When describing notebook changes to users, refer to cells by visible cell number (Cell 1, Cell 2, etc.), not by internal cell IDs.
 
 ### Presentation Instructions
 
@@ -20,6 +20,7 @@ This ensures all code changes comply with the project's linting standards from t
 - Use explicit imports (avoid `from module import *`), especially in notebooks, to prevent `F403/F405`.
 - Keep lines within the configured length limit (see `pyproject.toml`), and wrap long strings or calls.
 - Avoid f-strings without placeholders (e.g., `F541`).
+- **Ruff gap:** `PLC0415` (`import-outside-toplevel`) only flags imports inside functions and classes. It does **not** flag imports inside module-level `if` / `else` / `try` blocks. Ruff will not catch those, so the top-of-file import rule below must be enforced manually.
 
 ## Goals
 
@@ -35,7 +36,7 @@ This ensures all code changes comply with the project's linting standards from t
 ## Style and Conventions
 
 - Prefer Python 3.12+ features unless otherwise required.
-- Keep all imports at the top of the file.
+- Keep **all** imports at the top of the file. Do not place `import` statements inside `if` / `else` / `try` blocks or inside functions. Hoist them even when only one branch uses the module. Ruff `PLC0415` will catch function-scope imports but will **not** catch imports inside module-level conditional blocks, so apply this rule manually.
 - Use type hints and concise docstrings (PEP 257).
 - Use 4-space indentation and PEP 8 conventions.
 - Surround an equal sign by a space on each side.
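
A minimal sketch of the import rule above, not code from the repository. `pick_workdir` is a hypothetical helper; the comments at the end restate the two patterns the instructions describe, with the Ruff behavior taken from the note in this file rather than verified here.

```python
# Hoisted: every import sits at the top, even though only one branch uses `tempfile`.
import os
import tempfile


def pick_workdir(use_temp: bool) -> str:
    """Return a scratch directory path; both branches rely on the top-level imports."""
    if use_temp:
        return tempfile.gettempdir()
    return os.getcwd()


# Patterns the rule forbids (kept as comments so this block stays lint-clean):
#
#   if use_temp:
#       import tempfile   # module-level conditional: per the note above, PLC0415 stays silent
#
#   def helper():
#       import json       # function scope: PLC0415 reports this one
```

Because only the function-scope case is reported automatically, the module-level conditional is the one to look for during review.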
 
@@ -67,7 +67,7 @@ It's quick and easy to get started!
 | [AuthX][sample-authx]                                       | Authentication and role-based authorization in a mock HR API.                                                       | All infrastructures           |
 | [AuthX Pro][sample-authx-pro]                               | Authentication and role-based authorization in a mock product with multiple APIs and policy fragments.              | All infrastructures           |
 | [Azure Maps][sample-azure-maps]                             | Proxying calls to Azure Maps with APIM policies.                                                                    | All infrastructures           |
-| [Costing][sample-costing]                                   | Track and allocate API costs per business unit using APIM subscriptions, Log Analytics, and Cost Management.         | All infrastructures           |
+| [Costing][sample-costing]                                   | Track and allocate API costs per business unit using APIM subscriptions, Entra ID application tracking, and AI Gateway token/PTU tracking via Log Analytics and Cost Management. | All infrastructures           |
 | [Egress Control][sample-egress-control]                     | Control APIM outbound internet traffic by routing it through a Network Virtual Appliance (NVA) in a hub/spoke topology. | appgw-apim, appgw-apim-pe     |
 | [General][sample-general]                                   | Basic demo of APIM sample setup and policy usage.                                                                   | All infrastructures           |
 | [Load Balancing][sample-load-balancing]                     | Priority and weighted load balancing across backends.                                                               | apim-aca, afd-apim-pe         |
 
@@ -1118,7 +1118,7 @@ <h4>Azure Maps</h4>
       </div>
       <div class="arch-card">
         <h4>Costing &amp; Showback</h4>
-        <p>Track API costs per business unit via subscriptions &amp; Log Analytics.</p>
+        <p>Track API costs per business unit via subscriptions, Entra ID apps &amp; AI Gateway tokens.</p>
       </div>
       <div class="arch-card">
         <h4>Credential Manager</h4>
 
@@ -446,7 +446,7 @@ <h3>Azure Maps</h3>
 
           <a class="sample-card" href="https://github.com/Azure-Samples/Apim-Samples/tree/main/samples/costing" target="_blank" rel="noopener">
             <h3>Costing</h3>
-            <p>Track and allocate API costs per business unit using subscriptions, Log Analytics, and Cost Management.</p>
+            <p>Track and allocate API costs per business unit using subscriptions, Entra ID application tracking, and AI Gateway token/PTU tracking via Log Analytics and Cost Management.</p>
             <span class="infra-tag">All infrastructures</span>
           </a>
 
 
@@ -37,14 +37,15 @@ exclude = ["*.ipynb"]
 
 [tool.ruff.lint]
 select = [
-  "E",    # pycodestyle errors
-  "W",    # pycodestyle warnings
-  "F",    # Pyflakes
-  "PLC",  # Pylint convention
-  "PLE",  # Pylint error
-  "PLR",  # Pylint refactoring
-  "PLW",  # Pylint warning
-  "Q",    # flake8-quotes
+  "E",        # pycodestyle errors
+  "W",        # pycodestyle warnings
+  "F",        # Pyflakes
+  "PLC",      # Pylint convention
+  "PLC0415",  # import-outside-toplevel (explicit; enforce imports at top of file/cell)
+  "PLE",      # Pylint error
+  "PLR",      # Pylint refactoring
+  "PLW",      # Pylint warning
+  "Q",        # flake8-quotes
 ]
 ignore = [
   "PLR0911",  # Too many return statements
 
@@ -1,6 +1,6 @@
 # Samples: APIM Costing & Showback
 
-This sample demonstrates how to track and allocate API costs using Azure API Management with Azure Monitor, Application Insights, Log Analytics, and Cost Management. This setup enables organizations to determine the cost of API consumption per business unit, department, or application.
+This sample demonstrates how to track and allocate API costs using Azure API Management with Azure Monitor, Application Insights, Log Analytics, and Cost Management. It supports three complementary approaches: **subscription-based** tracking (using APIM subscription keys), **Entra ID application** tracking (using the `emit-metric` policy with JWT `appid` claims), and **AI Gateway token/PTU** tracking (using the `emit-metric` policy to capture per-client token consumption when APIM acts as an AI Gateway). All approaches share a single Azure Monitor Workbook with tabbed views.
 
 ⚙️ **Supported infrastructures**: All infrastructures (or bring your own existing APIM deployment)
 
@@ -9,10 +9,13 @@ This sample demonstrates how to track and allocate API costs using Azure API Man
 ## 🎯 Objectives
 
 1. **Track API usage by caller** - Use APIM subscription keys to identify business units, departments, or applications
-2. **Capture request metrics** - Log subscriptionId, apiName, operationName, and status codes
-3. **Aggregate cost data** - Combine API usage metrics with Azure Cost Management data
-4. **Visualize showback data** - Create Azure Monitor Workbooks to display cost allocation by caller
-5. **Enable cost governance** - Establish patterns for consistent tagging and naming conventions
+2. **Track API usage by Entra ID application** - Use the `emit-metric` policy to extract `appid`/`azp` JWT claims and emit per-caller custom metrics
+3. **Capture request metrics** - Log subscriptionId, apiName, operationName, and status codes
+4. **Aggregate cost data** - Combine API usage metrics with Azure Cost Management data
+5. **Visualize showback data** - Create Azure Monitor Workbooks with a tabbed view for each tracking approach
+6. **Enable cost governance** - Establish patterns for consistent tagging and naming conventions
+7. **Enable budget alerts** - Create scheduled query alerts when callers exceed configurable thresholds
+8. **Track AI token consumption per client** - When APIM is used as an AI Gateway, capture prompt, completion, and total token usage per calling application, enabling per-client cost attribution for PTU or pay-as-you-go OpenAI deployments
 
 ## ✅ Prerequisites
 
@@ -50,46 +53,35 @@ Users who only need to **view** the deployed Azure Monitor Workbook (not deploy
 
 ## ⚙️ Configuration
 
-### Important: Sample Index
+### Deployment Index
 
-The `create.ipynb` notebook passes a **`sampleIndex` parameter** to the Bicep template. This parameter ensures unique resource naming when deploying multiple instances of this sample. The notebook automatically provides this value; you only need to verify it matches your deployment scenario:
+The `create.ipynb` notebook passes an **`index` parameter** to the Bicep template. This parameter ensures unique resource naming when deploying multiple instances of this sample. The notebook automatically provides this value; you only need to verify it matches your deployment scenario:
 
 ```python
-sample_index = 2  # Increment this for multiple sample deployments
+index = 1  # Match your infrastructure deployment index
 ```
 
-This index is used in resource names (e.g., `appi-cost-2-xxxx`, `log-cost-2-xxxx`) to avoid naming conflicts when running multiple instances of the sample.
+This index is used in resource names (e.g., `appi-cost-1-xxxx`, `log-cost-1-xxxx`) to avoid naming conflicts.
 
-### Option A: Use a repository infrastructure (recommended)
+### Running the Sample
 
 1. Navigate to the desired [infrastructure](../../infrastructure/) folder (e.g., [simple-apim](../../infrastructure/simple-apim/)) and follow its README.md to deploy.
 2. Open `create.ipynb` and set:
    ```python
-   infrastructure = INFRASTRUCTURE.SIMPLE_APIM  # Match your deployed infra
-   index = 1                                     # Match your infra index
-   sample_index = 1                              # Increment for multiple sample deployments
+   deployment = INFRASTRUCTURE.SIMPLE_APIM  # Match your deployed infra
+   index = 1                                # Match your infra index
    ```
 3. Run All Cells.
 
-### Option B: Bring your own existing APIM
-
-You can use any existing Azure API Management instance. The sample only adds diagnostic settings and sample resources to your APIM - it does **not** modify your existing APIs or policies.
-
-1. Open `create.ipynb` and **uncomment** the two lines in the User Configuration section:
-   ```python
-   existing_rg_name = 'your-resource-group-name'
-   existing_apim_name = 'your-apim-service-name'
-   ```
-2. Set the correct Azure subscription: `az account set -s <subscription-id>`
-3. Run All Cells.
-
 **What the sample deploys into your resource group:**
 - Application Insights instance
 - Log Analytics Workspace
 - Storage Account (for cost exports)
 - Diagnostic Settings on your APIM (routes gateway logs to Log Analytics)
 - Azure Monitor Workbook
-- A sample API (`cost-tracking-api`) with 5 business unit subscriptions
+- Sample APIs with 4 business unit subscriptions
+- Entra ID tracking API with `emit-metric` policy (optional)
+- AI Gateway token tracking API with `emit-metric` policy (optional)
 
 **What it does NOT touch:**
 - Your existing APIs, policies, or subscriptions
@@ -109,6 +101,19 @@ Organizations often need to allocate the cost of shared API Management infrastru
 
 This sample focuses on **producing cost data**, not implementing billing processes. You determine costs; how you use that information (showback reports, chargeback, budgeting) is a separate business decision.
 
+### Three Tracking Approaches
+
+| Aspect | Subscription-Based | Entra ID Application | AI Gateway Token/PTU |
+|---|---|---|---|
+| **Caller identification** | APIM subscription key (`ApimSubscriptionId`) | JWT `appid`/`azp` claim | JWT `appid`/`azp` claim |
+| **Data source** | `ApiManagementGatewayLogs` in Log Analytics | `customMetrics` in Application Insights | `customMetrics` in Application Insights |
+| **Tracking mechanism** | Built-in APIM logging | `emit-metric` policy | `emit-metric` policy (outbound response parsing) |
+| **Metric name** | N/A (built-in logs) | `caller-requests` | `caller-tokens` |
+| **Cost Management export** | Yes (storage account) | No (metrics-based) | No (metrics-based) |
+| **Best for** | Dedicated subscriptions per BU | OAuth client-credentials flows, shared subscriptions | AI Gateway scenarios (Azure OpenAI, PTU capacity planning) |
+
+All three approaches can be deployed together. Toggle `enable_entraid_tracking` and `enable_token_tracking` in the notebook to include or exclude the Entra ID and AI Gateway token flows, respectively.
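
To make the "Caller identification" row concrete, here is a rough Python sketch of reading the `appid` claim from a JWT payload and falling back to `azp`. It is an illustration only: the sample does this inside an APIM policy, not in Python. The function names and the `unknown` fallback are assumptions, and the sketch does not verify the token signature.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def caller_id_from_jwt(token: str) -> str:
    """Return the `appid` claim, else `azp`, else 'unknown' (assumed fallback).

    The payload is decoded only; the signature is NOT validated.
    """
    parts = token.split(".")
    if len(parts) != 3:
        return "unknown"
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore stripped padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return "unknown"
    if not isinstance(claims, dict):
        return "unknown"
    return claims.get("appid") or claims.get("azp") or "unknown"


# Build two unsigned demo tokens (hypothetical client IDs) to exercise both claims.
header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
token_appid = f"{header}.{b64url(json.dumps({'appid': 'client-a'}).encode())}.sig"
token_azp = f"{header}.{b64url(json.dumps({'azp': 'client-b'}).encode())}.sig"

print(caller_id_from_jwt(token_appid))
print(caller_id_from_jwt(token_azp))
```

Checking both claims matters because Entra ID access tokens carry the client ID in `appid` (v1.0 tokens) or `azp` (v2.0 tokens).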
+
 ## 🛩️ Lab Components
 
 This lab deploys and configures:
@@ -118,15 +123,14 @@ This lab deploys and configures:
 - **Storage Account** - Receives Azure Cost Management exports
 - **Cost Management Export** - Automated export of cost data (configurable frequency)
 - **Diagnostic Settings** - Links APIM to Log Analytics with `logAnalyticsDestinationType: Dedicated` for resource-specific tables
-- **Sample API & Subscriptions** - 5 subscriptions representing different business units
-- **Azure Monitor Workbook** - Pre-built dashboard with:
-  - Cost allocation table (base + variable cost per BU)
-  - Base vs variable cost stacked bar chart
-  - Cost breakdown by API
-  - Request count and distribution charts
-  - Success/error rate analysis
-  - Response code distribution
-- **Live Pricing Integration** - Auto-detects your APIM SKU and fetches current pricing from the [Azure Retail Prices API](https://learn.microsoft.com/rest/api/cost-management/retail-prices/azure-retail-prices)
+- **Sample API & Subscriptions** - 4 subscriptions representing different business units
+- **Entra ID Tracking API** (optional) - A second API with the `emit-metric` policy that extracts `appid` from JWT tokens and emits `caller-requests` custom metrics
+- **AI Gateway Token Tracking API** (optional) - A third API with the `emit-metric` policy that parses Azure OpenAI response bodies to extract `prompt_tokens`, `completion_tokens`, and `total_tokens`, emitting `caller-tokens` custom metrics with `CallerId`, `TokenType`, and `Model` dimensions
+- **Azure Monitor Workbook** - Pre-built tabbed dashboard with:
+  - **Subscription-Based Costing tab**: Cost allocation table (base + variable cost per BU), base vs variable cost stacked bar chart, cost breakdown by API, request count and distribution charts, success/error rate analysis, response code distribution, business unit drill-down
+  - **Entra ID Application Costing tab**: Usage by caller ID (bar chart + table), cost allocation by caller (table + pie chart), hourly request trend by caller
+  - **AI Gateway Token/PTU tab**: Token consumption by client (prompt vs completion bar chart), token cost allocation table with configurable per-1K-token rates, token/cost distribution pie charts, hourly token trend with PTU capacity threshold line, prompt vs completion area chart, model breakdown table
+- **SKU-Based Pricing** - Automatically derives base monthly cost, overage rate, and included request allowance from the deployed APIM SKU using built-in pricing data (sourced from the [Azure API Management pricing page](https://azure.microsoft.com/pricing/details/api-management/), March 2026)
 - **Budget Alerts** (optional) - Per-BU scheduled query alerts when request thresholds are exceeded
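
As an illustration of the AI Gateway token tracking item in the list above, the following Python sketch turns an Azure OpenAI style response body into one `caller-tokens` record per token type. The record layout, the `token_metrics` name, and the sample payload are assumptions for this sketch; the sample itself emits these values through the `emit-metric` policy.

```python
import json


def token_metrics(response_body: str, caller_id: str) -> list[dict]:
    """Build one metric record per token type from a chat-completions style body."""
    body = json.loads(response_body)
    usage = body.get("usage", {})
    model = body.get("model", "unknown")  # assumed fallback when the model is absent
    fields = {
        "prompt": "prompt_tokens",
        "completion": "completion_tokens",
        "total": "total_tokens",
    }
    return [
        {
            "name": "caller-tokens",
            "value": int(usage.get(field, 0)),
            "dimensions": {"CallerId": caller_id, "TokenType": token_type, "Model": model},
        }
        for token_type, field in fields.items()
    ]


# Hypothetical response body, trimmed to the fields this sketch reads.
sample_body = json.dumps({
    "model": "gpt-4o",
    "usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20},
})

for record in token_metrics(sample_body, caller_id="client-a"):
    print(record["dimensions"]["TokenType"], record["value"])
```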
 
 ### Cost Allocation Model
@@ -153,10 +157,10 @@ This lab deploys and configures:
 
 After running the notebook, you will have:
 
-1. **Application Insights** showing real-time API requests
+1. **Application Insights** showing real-time API requests, `caller-requests` custom metrics (Entra ID), and `caller-tokens` custom metrics (AI Gateway)
 2. **Log Analytics** with queryable `ApiManagementGatewayLogs` (resource-specific table)
 3. **Storage Account** receiving cost export data
-4. **Azure Monitor Workbook** displaying cost allocation and usage analytics
+4. **Azure Monitor Workbook** with tabbed views for subscription-based, Entra ID application, and AI Gateway token cost allocation
 5. **Portal links** printed in the notebook's final cell for quick access
 
 ### Cost Management Export
@@ -181,6 +185,30 @@ The deployed workbook provides a comprehensive view of API cost allocation and u
 
 ![Dashboard - Response Code Analysis](screenshots/Dashboard-05.png)
 
+![Dashboard - Drill-Down Details](screenshots/Dashboard-06.png)
+
+### Entra ID Application Costing Tab
+
+The Entra ID tab shows cost attribution by calling application, using the `emit-metric` policy's `caller-requests` custom metric.
+
+![Entra ID - Usage by Caller ID](screenshots/EntraID-01.png)
+
+![Entra ID - Cost Allocation](screenshots/EntraID-02.png)
+
+![Entra ID - Request Trend](screenshots/EntraID-03.png)
+
+### AI Gateway Token/PTU Tab
+
+The AI Gateway tab shows per-client token consumption and estimated costs when APIM is used as an AI Gateway in front of Azure OpenAI or other LLM backends. It uses the `emit-metric` policy's `caller-tokens` custom metric with `CallerId`, `TokenType` (prompt/completion/total), and `Model` dimensions.
+
+![AI Gateway - Token Consumption by Client](screenshots/AIGateway-01.png)
+
+![AI Gateway - Token Cost Allocation](screenshots/AIGateway-02.png)
+
+![AI Gateway - Token Trends & PTU Utilization](screenshots/AIGateway-03.png)
+
+![AI Gateway - Model & Caller Breakdown](screenshots/AIGateway-04.png)
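
The token cost allocation shown in this tab can be approximated offline. The sketch below assumes separate per-1K-token rates for prompt and completion tokens. The rates, client names, and token counts are hypothetical, and the workbook's actual formula may differ.

```python
def token_cost(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_rate_per_1k: float,
    completion_rate_per_1k: float,
) -> float:
    """Estimate cost as tokens / 1000 * rate, summed over prompt and completion."""
    cost = (prompt_tokens / 1000) * prompt_rate_per_1k
    cost += (completion_tokens / 1000) * completion_rate_per_1k
    return round(cost, 4)


# Hypothetical usage per client: (prompt tokens, completion tokens).
usage_by_client = {
    "client-a": (120_000, 30_000),
    "client-b": (40_000, 10_000),
}

# Hypothetical rates in currency units per 1,000 tokens.
PROMPT_RATE = 0.01
COMPLETION_RATE = 0.03

costs = {
    client: token_cost(prompt, completion, PROMPT_RATE, COMPLETION_RATE)
    for client, (prompt, completion) in usage_by_client.items()
}
total = sum(costs.values())

for client, cost in costs.items():
    share = cost / total if total else 0.0
    print(f"{client}: {cost:.2f} ({share:.0%} of token spend)")
```

Keeping the rates as parameters mirrors the configurable rates in the workbook, so the same function works for pay-as-you-go pricing or an internally agreed PTU showback rate.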
+
 ## 🧹 Clean Up
 
 To remove all resources created by this sample, open and run `clean-up.ipynb`. This deletes:
@@ -199,6 +227,11 @@ To remove all resources created by this sample, open and run `clean-up.ipynb`. T
 - [Log Analytics Kusto Query Language](https://learn.microsoft.com/azure/data-explorer/kusto/query/)
 - [Azure Monitor Workbooks](https://learn.microsoft.com/azure/azure-monitor/visualize/workbooks-overview)
 - [APIM Diagnostic Settings](https://learn.microsoft.com/azure/api-management/api-management-howto-use-azure-monitor)
+- [APIM emit-metric policy](https://learn.microsoft.com/azure/api-management/emit-metric-policy)
+- [Application Insights custom metrics](https://learn.microsoft.com/azure/azure-monitor/essentials/metrics-custom-overview)
+- [Microsoft Entra ID application model](https://learn.microsoft.com/entra/identity-platform/application-model)
+- [Azure OpenAI usage and token metrics](https://learn.microsoft.com/azure/ai-services/openai/how-to/monitoring)
+- [Provisioned throughput (PTU) concepts](https://learn.microsoft.com/azure/ai-services/openai/concepts/provisioned-throughput)
 
 [infrastructure-architectures]: ../../README.md#infrastructure-architectures
 [infrastructure-folder]: ../../infrastructure/