From d422186d400ad5299ea8c6e82a808366f4626dcf Mon Sep 17 00:00:00 2001 From: steaphenai Date: Sun, 15 Mar 2026 14:31:43 +0530 Subject: [PATCH 1/4] NDCG tutorial notebook --- .../intermediate/ndcg-metric-tutorial.ipynb | 404 ++++++++++++++++++ 1 file changed, 404 insertions(+) create mode 100644 tutorials/intermediate/ndcg-metric-tutorial.ipynb diff --git a/tutorials/intermediate/ndcg-metric-tutorial.ipynb b/tutorials/intermediate/ndcg-metric-tutorial.ipynb new file mode 100644 index 0000000..4dbe238 --- /dev/null +++ b/tutorials/intermediate/ndcg-metric-tutorial.ipynb @@ -0,0 +1,404 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Understanding NDCG (Normalized Discounted Cumulative Gain)\n", + "\n", + "This tutorial walks through how NDCG is computed from scratch, then verifies the result using PyTorch Ignite's `Ndcg` metric.\n", + "\n", + "NDCG is a ranking metric commonly used in information retrieval and recommender systems. Unlike metrics that only check if the right item was retrieved, NDCG rewards models that rank more relevant items **higher** in the list.\n", + "\n", + "By the end of this notebook you will:\n", + "- Understand what ground truth and predictions look like for a ranking problem\n", + "- Compute DCG and IDCG step by step by hand\n", + "- Calculate NDCG manually\n", + "- Verify every number matches the Ignite `Ndcg` implementation" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "# Install dependencies if needed\n", + "# !pip install pytorch-ignite torch\n", + "import torch\n", + "import math" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. The Problem Setup\n", + "\n", + "Imagine a search engine returning 5 documents for a query. Each document has a **relevance score** (ground truth) assigned by a human — higher means more relevant:\n", + "\n", + "| Document | Relevance (ground truth) |\n", + "|----------|-------------------------|\n", + "| Doc A | 3 (highly relevant) |\n", + "| Doc B | 2 (relevant) |\n", + "| Doc C | 3 (highly relevant) |\n", + "| Doc D | 0 (not relevant) |\n", + "| Doc E | 1 (slightly relevant) |\n", + "\n", + "The model predicts a **score** for each document. The model then ranks documents by these scores (highest score = rank 1):" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Ground truth relevance: tensor([[3., 2., 3., 0., 1.]])\n", + "Model prediction scores: tensor([[0.1000, 0.4000, 0.3500, 0.8000, 0.1000]])\n" + ] + } + ], + "source": [ + "# Ground truth relevance scores (one query, 5 documents)\n", + "# Shape: (1, 5) — batch of 1 query\n", + "y_true = torch.tensor([[3.0, 2.0, 3.0, 0.0, 1.0]])\n", + "\n", + "# Model prediction scores for each document\n", + "# Higher score = model thinks this doc is more relevant\n", + "y_pred = torch.tensor([[0.1, 0.4, 0.35, 0.8, 0.1]])\n", + "\n", + "print(\"Ground truth relevance:\", y_true)\n", + "print(\"Model prediction scores:\", y_pred)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Step 1 — Rank the Documents by Model Score\n", + "\n", + "The model ranks documents by sorting its predicted scores in descending order. The document with the highest predicted score gets rank 1." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Ranked document indices (by model): tensor([[3, 1, 2, 0, 4]])\n", + "Relevance scores in model's ranked order: tensor([0., 2., 3., 3., 1.])\n", + "\n", + "So the model ranked:\n", + " Rank 1: Doc D (relevance=0, pred score=0.80)\n", + " Rank 2: Doc B (relevance=2, pred score=0.40)\n", + " Rank 3: Doc C (relevance=3, pred score=0.35)\n", + " Rank 4: Doc A (relevance=3, pred score=0.10)\n", + " Rank 5: Doc E (relevance=1, pred score=0.10)\n" + ] + } + ], + "source": [ + "# Sort document indices by predicted score (descending)\n", + "ranked_indices = torch.argsort(y_pred, descending=True)\n", + "print(\"Ranked document indices (by model):\", ranked_indices)\n", + "\n", + "# Reorder ground truth relevance scores according to model ranking\n", + "ranked_relevance = y_true[0][ranked_indices[0]]\n", + "print(\"Relevance scores in model's ranked order:\", ranked_relevance)\n", + "print()\n", + "print(\"So the model ranked:\")\n", + "doc_names = ['Doc A', 'Doc B', 'Doc C', 'Doc D', 'Doc E']\n", + "for rank, idx in enumerate(ranked_indices[0]):\n", + " print(f\" Rank {rank+1}: {doc_names[idx]} (relevance={y_true[0][idx].item():.0f}, pred score={y_pred[0][idx].item():.2f})\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Step 2 — Compute DCG (Discounted Cumulative Gain)\n", + "\n", + "DCG measures the quality of the ranking. It rewards relevant documents but **discounts** them based on their position — finding a relevant document at rank 1 is worth more than finding it at rank 5.\n", + "\n", + "The formula is:\n", + "\n", + "$$DCG@K = \\sum_{i=1}^{K} \\frac{2^{rel_i} - 1}{\\log_2(i + 1)}$$\n", + "\n", + "Where:\n", + "- $rel_i$ is the relevance of the document at rank $i$\n", + "- The numerator $2^{rel_i} - 1$ is the **gain** (higher relevance = exponentially higher gain)\n", + "- The denominator $\\log_2(i+1)$ is the **discount** (lower rank position = smaller discount)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Computing DCG@5 step by step:\n", + "\n", + "Rank Doc Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", + "--------------------------------------------------------------------------------\n", + "1 Doc D 0 0.0000 1.0000 0.0000\n", + "2 Doc B 2 3.0000 1.5850 1.8928\n", + "3 Doc C 3 7.0000 2.0000 3.5000\n", + "4 Doc A 3 7.0000 2.3219 3.0147\n", + "5 Doc E 1 1.0000 2.5850 0.3869\n", + "--------------------------------------------------------------------------------\n", + "DCG@5 = 8.7944\n" + ] + } + ], + "source": [ + "K = 5 # We evaluate the top 5 results\n", + "\n", + "dcg = 0.0\n", + "print(f\"Computing DCG@{K} step by step:\\n\")\n", + "print(f\"{'Rank':<6} {'Doc':<8} {'Relevance':<12} {'Gain (2^rel-1)':<18} {'Discount log2(i+1)':<22} {'Contribution'}\")\n", + "print(\"-\" * 80)\n", + "\n", + "for i, idx in enumerate(ranked_indices[0][:K]):\n", + " rank = i + 1\n", + " rel = y_true[0][idx].item()\n", + " gain = (2 ** rel) - 1\n", + " discount = math.log2(rank + 1)\n", + " contribution = gain / discount\n", + " dcg += contribution\n", + " print(f\"{rank:<6} {doc_names[idx]:<8} {rel:<12.0f} {gain:<18.4f} {discount:<22.4f} {contribution:.4f}\")\n", + "\n", + "print(\"-\" * 80)\n", + "print(f\"DCG@{K} = {dcg:.4f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + 
"source": [ + "## 4. Step 3 — Compute IDCG (Ideal DCG)\n", + "\n", + "IDCG is the DCG of the **perfect ranking** — what score would we get if the model ranked documents in the exact order of their true relevance?\n", + "\n", + "We compute this by sorting the ground truth relevance scores in descending order and computing DCG on that ideal ordering." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Ideal relevance order: tensor([3., 3., 2., 1., 0.])\n", + "\n", + "Computing IDCG@5 step by step:\n", + "\n", + "Rank Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", + "------------------------------------------------------------------------\n", + "1 3 7.0000 1.0000 7.0000\n", + "2 3 7.0000 1.5850 4.4165\n", + "3 2 3.0000 2.0000 1.5000\n", + "4 1 1.0000 2.3219 0.4307\n", + "5 0 0.0000 2.5850 0.0000\n", + "------------------------------------------------------------------------\n", + "IDCG@5 = 13.3472\n" + ] + } + ], + "source": [ + "# The ideal ranking: sort ground truth relevance descending\n", + "ideal_relevance, _ = torch.sort(y_true[0], descending=True)\n", + "print(\"Ideal relevance order:\", ideal_relevance)\n", + "\n", + "idcg = 0.0\n", + "print(f\"\\nComputing IDCG@{K} step by step:\\n\")\n", + "print(f\"{'Rank':<6} {'Relevance':<12} {'Gain (2^rel-1)':<18} {'Discount log2(i+1)':<22} {'Contribution'}\")\n", + "print(\"-\" * 72)\n", + "\n", + "for i in range(K):\n", + " rank = i + 1\n", + " rel = ideal_relevance[i].item()\n", + " gain = (2 ** rel) - 1\n", + " discount = math.log2(rank + 1)\n", + " contribution = gain / discount\n", + " idcg += contribution\n", + " print(f\"{rank:<6} {rel:<12.0f} {gain:<18.4f} {discount:<22.4f} {contribution:.4f}\")\n", + "\n", + "print(\"-\" * 72)\n", + "print(f\"IDCG@{K} = {idcg:.4f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. Step 4 — Compute NDCG\n", + "\n", + "NDCG normalizes DCG by IDCG, giving a score between 0 and 1:\n", + "\n", + "$$NDCG@K = \\frac{DCG@K}{IDCG@K}$$\n", + "\n", + "A score of 1.0 means the model ranked everything perfectly. A score close to 0 means the ranking was very poor." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "DCG@5 = 8.7944\n", + "IDCG@5 = 13.3472\n", + "NDCG@5 = DCG / IDCG = 8.7944 / 13.3472 = 0.6589\n", + "\n", + "The model achieved 65.9% of the ideal ranking quality.\n" + ] + } + ], + "source": [ + "ndcg_manual = dcg / idcg\n", + "\n", + "print(f\"DCG@{K} = {dcg:.4f}\")\n", + "print(f\"IDCG@{K} = {idcg:.4f}\")\n", + "print(f\"NDCG@{K} = DCG / IDCG = {dcg:.4f} / {idcg:.4f} = {ndcg_manual:.4f}\")\n", + "print(f\"\\nThe model achieved {ndcg_manual*100:.1f}% of the ideal ranking quality.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 6. Verify with PyTorch Ignite\n", + "\n", + "Now let's confirm our manual calculation matches the Ignite `Ndcg` metric exactly." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Manual NDCG@5: 0.6589\n", + "Ignite NDCG@5: 0.6589\n", + "\n", + "✓ Manual calculation matches Ignite implementation perfectly!\n" + ] + } + ], + "source": [ + "from ignite.metrics.rec_sys.ndcg import NDCG\n", + "\n", + "# Initialize the Ndcg metric with k=5\n", + "ndcg_metric = NDCG(output_transform=lambda x: x, top_k=[K])\n", + "\n", + "# Reset and update with our data\n", + "ndcg_metric.reset()\n", + "ndcg_metric.update((y_pred, y_true))\n", + "\n", + "# Compute the result\n", + "ignite_result = ndcg_metric.compute()\n", + "\n", + "print(f\"Manual NDCG@{K}: {ndcg_manual:.4f}\")\n", + "print(f\"Ignite NDCG@{K}: {ignite_result[0]:.4f}\")\n", + "print()\n", + "\n", + "# Verify they match\n", + "assert abs(ndcg_manual - ignite_result[0]) < 1e-4, \"Mismatch!\"\n", + "print(\"✓ Manual calculation matches Ignite implementation perfectly!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 7. Understanding the Score\n", + "\n", + "Let's build some intuition by looking at two extreme cases." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Perfect ranking NDCG@5: 1.0000 (should be 1.0)\n", + "Worst ranking NDCG@5: 0.5884 (should be close to 0)\n", + "\n", + "Our model's ranking NDCG@5: 0.6589 (somewhere in between)\n" + ] + } + ], + "source": [ + "# Case 1: Perfect ranking (model scores match relevance exactly)\n", + "y_pred_perfect = torch.tensor([[0.9, 0.6, 0.8, 0.1, 0.3]])\n", + "\n", + "ndcg_metric.reset()\n", + "ndcg_metric.update((y_pred_perfect, y_true))\n", + "perfect_score = ndcg_metric.compute()\n", + "print(f\"Perfect ranking NDCG@{K}: {perfect_score[0]:.4f} (should be 1.0)\")\n", + "\n", + "# Case 2: Worst ranking (model ranks least relevant items highest)\n", + "y_pred_worst = torch.tensor([[0.1, 0.3, 0.2, 0.9, 0.6]])\n", + "\n", + "ndcg_metric.reset()\n", + "ndcg_metric.update((y_pred_worst, y_true))\n", + "worst_score = ndcg_metric.compute()\n", + "print(f\"Worst ranking NDCG@{K}: {worst_score[0]:.4f} (should be close to 0)\")\n", + "\n", + "print(f\"\\nOur model's ranking NDCG@{K}: {ndcg_manual:.4f} (somewhere in between)\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.8" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} From 538233ec071e8a587cf525abbd4acb442f989f57 Mon Sep 17 00:00:00 2001 From: Steaphen Date: Fri, 20 Mar 2026 11:17:56 +0530 Subject: [PATCH 2/4] Add combined call --- .../intermediate/ndcg-metric-tutorial.ipynb | 129 +++++++++--------- 1 file changed, 61 insertions(+), 68 deletions(-) diff --git a/tutorials/intermediate/ndcg-metric-tutorial.ipynb b/tutorials/intermediate/ndcg-metric-tutorial.ipynb index 4dbe238..838e065 100644 --- a/tutorials/intermediate/ndcg-metric-tutorial.ipynb +++ b/tutorials/intermediate/ndcg-metric-tutorial.ipynb @@ -6,7 +6,7 @@ "source": [ "# Understanding NDCG (Normalized Discounted Cumulative Gain)\n", "\n", - "This tutorial walks through how NDCG is computed from scratch, then verifies the result using PyTorch 
Ignite's `Ndcg` metric.\n", + "This tutorial walks through how NDCG is computed from scratch, then verifies the result using PyTorch Ignite's `NDCG` metric.\n", "\n", "NDCG is a ranking metric commonly used in information retrieval and recommender systems. Unlike metrics that only check if the right item was retrieved, NDCG rewards models that rank more relevant items **higher** in the list.\n", "\n", @@ -14,7 +14,7 @@ "- Understand what ground truth and predictions look like for a ranking problem\n", "- Compute DCG and IDCG step by step by hand\n", "- Calculate NDCG manually\n", - "- Verify every number matches the Ignite `Ndcg` implementation" + "- Verify every number matches the Ignite `NDCG` implementation" ] }, { @@ -131,9 +131,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## 3. Step 2 — Compute DCG (Discounted Cumulative Gain)\n", + "## 3. Step 2 — The DCG Helper Function\n", "\n", - "DCG measures the quality of the ranking. It rewards relevant documents but **discounts** them based on their position — finding a relevant document at rank 1 is worth more than finding it at rank 5.\n", + "DCG measures the quality of a ranking. It rewards relevant documents but **discounts** them based on their position — finding a relevant document at rank 1 is worth more than finding it at rank 5.\n", "\n", "The formula is:\n", "\n", @@ -142,62 +142,51 @@ "Where:\n", "- $rel_i$ is the relevance of the document at rank $i$\n", "- The numerator $2^{rel_i} - 1$ is the **gain** (higher relevance = exponentially higher gain)\n", - "- The denominator $\\log_2(i+1)$ is the **discount** (lower rank position = smaller discount)" + "- The denominator $\\log_2(i+1)$ is the **discount** (lower rank position = larger discount)\n", + "\n", + "We define a single `compute_dcg` function and reuse it for both DCG and IDCG — because IDCG is simply DCG computed on the **ideal** (perfectly sorted) relevance scores." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Computing DCG@5 step by step:\n", - "\n", - "Rank Doc Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", - "--------------------------------------------------------------------------------\n", - "1 Doc D 0 0.0000 1.0000 0.0000\n", - "2 Doc B 2 3.0000 1.5850 1.8928\n", - "3 Doc C 3 7.0000 2.0000 3.5000\n", - "4 Doc A 3 7.0000 2.3219 3.0147\n", - "5 Doc E 1 1.0000 2.5850 0.3869\n", - "--------------------------------------------------------------------------------\n", - "DCG@5 = 8.7944\n" - ] - } - ], + "outputs": [], "source": [ - "K = 5 # We evaluate the top 5 results\n", - "\n", - "dcg = 0.0\n", - "print(f\"Computing DCG@{K} step by step:\\n\")\n", - "print(f\"{'Rank':<6} {'Doc':<8} {'Relevance':<12} {'Gain (2^rel-1)':<18} {'Discount log2(i+1)':<22} {'Contribution'}\")\n", - "print(\"-\" * 80)\n", - "\n", - "for i, idx in enumerate(ranked_indices[0][:K]):\n", - " rank = i + 1\n", - " rel = y_true[0][idx].item()\n", - " gain = (2 ** rel) - 1\n", - " discount = math.log2(rank + 1)\n", - " contribution = gain / discount\n", - " dcg += contribution\n", - " print(f\"{rank:<6} {doc_names[idx]:<8} {rel:<12.0f} {gain:<18.4f} {discount:<22.4f} {contribution:.4f}\")\n", - "\n", - "print(\"-\" * 80)\n", - "print(f\"DCG@{K} = {dcg:.4f}\")" + "def compute_dcg(relevance_scores, k):\n", + " \"\"\"Compute DCG@K for a list of relevance scores already in ranked order.\n", + "\n", + " Args:\n", + " relevance_scores: 1D tensor of relevance values in ranked order\n", + " k: number of top positions to consider\n", + "\n", + " Returns:\n", + " DCG@K score (float)\n", + " \"\"\"\n", + " dcg = 0.0\n", + " print(f\"{'Rank':<6} {'Relevance':<12} {'Gain (2^rel-1)':<18} {'Discount log2(i+1)':<22} {'Contribution'}\")\n", + " print(\"-\" * 72)\n", + " for i in range(k):\n", + " rank = i + 1\n", + " rel = relevance_scores[i].item()\n", + " gain = (2 ** rel) - 1\n", + " discount = math.log2(rank + 1)\n", + " contribution = gain / discount\n", + " dcg += contribution\n", + " print(f\"{rank:<6} {rel:<12.0f} {gain:<18.4f} {discount:<22.4f} {contribution:.4f}\")\n", + " print(\"-\" * 72)\n", + " return dcg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## 4. Step 3 — Compute IDCG (Ideal DCG)\n", - "\n", - "IDCG is the DCG of the **perfect ranking** — what score would we get if the model ranked documents in the exact order of their true relevance?\n", + "## 4. Step 3 — Compute DCG and IDCG\n", "\n", - "We compute this by sorting the ground truth relevance scores in descending order and computing DCG on that ideal ordering." 
+ "We call `compute_dcg` twice:\n", + "- Once on the **model's ranking** → DCG\n", + "- Once on the **ideal ranking** (ground truth sorted descending) → IDCG" ] }, { @@ -209,9 +198,19 @@ "name": "stdout", "output_type": "stream", "text": [ - "Ideal relevance order: tensor([3., 3., 2., 1., 0.])\n", + "DCG@5 — model's ranking:\n", + "\n", + "Rank Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", + "------------------------------------------------------------------------\n", + "1 0 0.0000 1.0000 0.0000\n", + "2 2 3.0000 1.5850 1.8928\n", + "3 3 7.0000 2.0000 3.5000\n", + "4 3 7.0000 2.3219 3.0147\n", + "5 1 1.0000 2.5850 0.3869\n", + "------------------------------------------------------------------------\n", + "DCG@5 = 8.7944\n", "\n", - "Computing IDCG@5 step by step:\n", + "IDCG@5 — ideal ranking (ground truth sorted descending):\n", "\n", "Rank Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", "------------------------------------------------------------------------\n", @@ -226,25 +225,19 @@ } ], "source": [ - "# The ideal ranking: sort ground truth relevance descending\n", + "K = 5 # Evaluate top 5 results\n", + "\n", + "# --- DCG: model's actual ranking ---\n", + "print(f\"DCG@{K} — model's ranking:\\n\")\n", + "dcg = compute_dcg(ranked_relevance, K)\n", + "print(f\"DCG@{K} = {dcg:.4f}\")\n", + "\n", + "print()\n", + "\n", + "# --- IDCG: ideal ranking (ground truth sorted descending) ---\n", "ideal_relevance, _ = torch.sort(y_true[0], descending=True)\n", - "print(\"Ideal relevance order:\", ideal_relevance)\n", - "\n", - "idcg = 0.0\n", - "print(f\"\\nComputing IDCG@{K} step by step:\\n\")\n", - "print(f\"{'Rank':<6} {'Relevance':<12} {'Gain (2^rel-1)':<18} {'Discount log2(i+1)':<22} {'Contribution'}\")\n", - "print(\"-\" * 72)\n", - "\n", - "for i in range(K):\n", - " rank = i + 1\n", - " rel = ideal_relevance[i].item()\n", - " gain = (2 ** rel) - 1\n", - " discount = math.log2(rank + 1)\n", - " contribution = gain / discount\n", - " idcg += contribution\n", - " print(f\"{rank:<6} {rel:<12.0f} {gain:<18.4f} {discount:<22.4f} {contribution:.4f}\")\n", - "\n", - "print(\"-\" * 72)\n", + "print(f\"IDCG@{K} — ideal ranking (ground truth sorted descending):\\n\")\n", + "idcg = compute_dcg(ideal_relevance, K)\n", "print(f\"IDCG@{K} = {idcg:.4f}\")" ] }, @@ -293,7 +286,7 @@ "source": [ "## 6. Verify with PyTorch Ignite\n", "\n", - "Now let's confirm our manual calculation matches the Ignite `Ndcg` metric exactly." + "Now let's confirm our manual calculation matches the Ignite `NDCG` metric exactly." 
] }, { @@ -315,7 +308,7 @@ "source": [ "from ignite.metrics.rec_sys.ndcg import NDCG\n", "\n", - "# Initialize the Ndcg metric with k=5\n", + "# Initialize the NDCG metric with k=5\n", "ndcg_metric = NDCG(output_transform=lambda x: x, top_k=[K])\n", "\n", "# Reset and update with our data\n", From f330a3ba296febeaac1b8996ed3d4c5279adf7ef Mon Sep 17 00:00:00 2001 From: steaphenai Date: Mon, 30 Mar 2026 22:45:19 +0530 Subject: [PATCH 3/4] Use updated NDCG metric tutorial notebook --- .../ndcg-metric-tutorial-updated.ipynb | 359 ++++++++++++++++++ 1 file changed, 359 insertions(+) create mode 100644 tutorials/intermediate/ndcg-metric-tutorial-updated.ipynb diff --git a/tutorials/intermediate/ndcg-metric-tutorial-updated.ipynb b/tutorials/intermediate/ndcg-metric-tutorial-updated.ipynb new file mode 100644 index 0000000..db6b7f6 --- /dev/null +++ b/tutorials/intermediate/ndcg-metric-tutorial-updated.ipynb @@ -0,0 +1,359 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Understanding NDCG (Normalized Discounted Cumulative Gain)\n", + "\n", + "This tutorial walks through how NDCG is computed from scratch, then verifies the result using PyTorch Ignite's `NDCG` metric.\n", + "\n", + "NDCG is a ranking metric commonly used in information retrieval and recommender systems. Unlike metrics that only check if the right item was retrieved, NDCG rewards models that rank more relevant items **higher** in the list.\n", + "\n", + "By the end of this notebook you will:\n", + "- Understand what ground truth and predictions look like for a ranking problem\n", + "- Compute DCG and IDCG step by step by hand\n", + "- Calculate NDCG manually\n", + "- Verify every number matches the Ignite `NDCG` implementation" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "# Install dependencies if needed\n", + "# !pip install pytorch-ignite torch\n", + "import torch\n", + "import math" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. The Problem Setup\n", + "\n", + "Imagine a search engine returning 5 documents for a query. Each document has a **relevance score** (ground truth) assigned by a human — higher means more relevant:\n", + "\n", + "| Document | Relevance (ground truth) |\n", + "|----------|-------------------------|\n", + "| Doc A | 3 (highly relevant) |\n", + "| Doc B | 2 (relevant) |\n", + "| Doc C | 3 (highly relevant) |\n", + "| Doc D | 0 (not relevant) |\n", + "| Doc E | 1 (slightly relevant) |\n", + "\n", + "The model predicts a **score** for each document. 
The model then ranks documents by these scores (highest score = rank 1):" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Ground truth relevance: tensor([[3., 2., 3., 0., 1.]])\n", + "Model prediction scores: tensor([[0.1000, 0.4000, 0.3500, 0.8000, 0.1000]])\n" + ] + } + ], + "source": [ + "# Ground truth relevance scores (one query, 5 documents)\n", + "# Shape: (1, 5) — batch of 1 query\n", + "y_true = torch.tensor([[3.0, 2.0, 3.0, 0.0, 1.0]])\n", + "\n", + "# Model prediction scores for each document\n", + "# Higher score = model thinks this doc is more relevant\n", + "y_pred = torch.tensor([[0.1, 0.4, 0.35, 0.8, 0.1]])\n", + "\n", + "print(\"Ground truth relevance:\", y_true)\n", + "print(\"Model prediction scores:\", y_pred)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Step 1 — The DCG Helper Function\n", + "\n", + "DCG measures the quality of a ranking. It rewards relevant documents but **discounts** them based on their position — finding a relevant document at rank 1 is worth more than finding it at rank 5.\n", + "\n", + "The formula is:\n", + "\n", + "$$DCG@K = \\sum_{i=1}^{K} \\frac{2^{rel_i} - 1}{\\log_2(i + 1)}$$\n", + "\n", + "Where:\n", + "- $rel_i$ is the relevance of the document at rank $i$\n", + "- The numerator $2^{rel_i} - 1$ is the **gain** (higher relevance = exponentially higher gain)\n", + "- The denominator $\\log_2(i+1)$ is the **discount** (lower rank position = larger discount)\n", + "\n", + "We define a single `compute_dcg` function that accepts both `y_true` (relevance scores) and `scores` (the signal used to rank documents). This lets us reuse the same function for both DCG and IDCG:\n", + "- **DCG**: pass `scores=y_pred` → ranks by model predictions\n", + "- **IDCG**: pass `scores=y_true` → ranks by ground truth (ideal order)" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "def compute_dcg(y_true, scores, k):\n", + " \"\"\"Compute DCG@K by ranking y_true according to scores.\n", + "\n", + " Args:\n", + " y_true: 1D tensor of ground-truth relevance values\n", + " scores: 1D tensor used to rank the documents (descending)\n", + " Pass y_pred to get DCG; pass y_true to get IDCG.\n", + " k: number of top positions to consider\n", + "\n", + " Returns:\n", + " DCG@K score (float)\n", + " \"\"\"\n", + " # Rank documents by scores (descending) and reorder relevance accordingly\n", + " ranked_indices = torch.argsort(scores, descending=True)\n", + " ranked_relevance = y_true[ranked_indices]\n", + "\n", + " dcg = 0.0\n", + " print(f\"{'Rank':<6} {'Relevance':<12} {'Gain (2^rel-1)':<18} {'Discount log2(i+1)':<22} {'Contribution'}\")\n", + " print(\"-\" * 72)\n", + " for i in range(k):\n", + " rank = i + 1\n", + " rel = ranked_relevance[i].item()\n", + " gain = (2 ** rel) - 1\n", + " discount = math.log2(rank + 1)\n", + " contribution = gain / discount\n", + " dcg += contribution\n", + " print(f\"{rank:<6} {rel:<12.0f} {gain:<18.4f} {discount:<22.4f} {contribution:.4f}\")\n", + " print(\"-\" * 72)\n", + " return dcg" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. 
Step 2 — Compute DCG and IDCG\n", + "\n", + "We call `compute_dcg` twice:\n", + "- `compute_dcg(y_true, y_pred, k)` → ranks by model predictions → **DCG**\n", + "- `compute_dcg(y_true, y_true, k)` → ranks by ground truth → **IDCG**" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "DCG@5 — model's ranking (scores = y_pred):\n", + "\n", + "Rank Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", + "------------------------------------------------------------------------\n", + "1 0 0.0000 1.0000 0.0000\n", + "2 2 3.0000 1.5850 1.8928\n", + "3 3 7.0000 2.0000 3.5000\n", + "4 3 7.0000 2.3219 3.0147\n", + "5 1 1.0000 2.5850 0.3869\n", + "------------------------------------------------------------------------\n", + "DCG@5 = 8.7944\n", + "\n", + "IDCG@5 — ideal ranking (scores = y_true):\n", + "\n", + "Rank Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", + "------------------------------------------------------------------------\n", + "1 3 7.0000 1.0000 7.0000\n", + "2 3 7.0000 1.5850 4.4165\n", + "3 2 3.0000 2.0000 1.5000\n", + "4 1 1.0000 2.3219 0.4307\n", + "5 0 0.0000 2.5850 0.0000\n", + "------------------------------------------------------------------------\n", + "IDCG@5 = 13.3472\n" + ] + } + ], + "source": [ + "K = 5 # Evaluate top 5 results\n", + "\n", + "# --- DCG: rank by model predictions ---\n", + "print(f\"DCG@{K} — model's ranking (scores = y_pred):\\n\")\n", + "dcg = compute_dcg(y_true[0], y_pred[0], K)\n", + "print(f\"DCG@{K} = {dcg:.4f}\")\n", + "\n", + "print()\n", + "\n", + "# --- IDCG: rank by ground truth (ideal order) ---\n", + "print(f\"IDCG@{K} — ideal ranking (scores = y_true):\\n\")\n", + "idcg = compute_dcg(y_true[0], y_true[0], K)\n", + "print(f\"IDCG@{K} = {idcg:.4f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Step 3 — Compute NDCG\n", + "\n", + "NDCG normalizes DCG by IDCG, giving a score between 0 and 1:\n", + "\n", + "$$NDCG@K = \\frac{DCG@K}{IDCG@K}$$\n", + "\n", + "A score of 1.0 means the model ranked everything perfectly. A score close to 0 means the ranking was very poor." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "DCG@5 = 8.7944\n", + "IDCG@5 = 13.3472\n", + "NDCG@5 = DCG / IDCG = 8.7944 / 13.3472 = 0.6589\n", + "\n", + "The model achieved 65.9% of the ideal ranking quality.\n" + ] + } + ], + "source": [ + "ndcg_manual = dcg / idcg\n", + "\n", + "print(f\"DCG@{K} = {dcg:.4f}\")\n", + "print(f\"IDCG@{K} = {idcg:.4f}\")\n", + "print(f\"NDCG@{K} = DCG / IDCG = {dcg:.4f} / {idcg:.4f} = {ndcg_manual:.4f}\")\n", + "print(f\"\\nThe model achieved {ndcg_manual*100:.1f}% of the ideal ranking quality.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. Verify with PyTorch Ignite\n", + "\n", + "Now let's confirm our manual calculation matches the Ignite `NDCG` metric exactly." 
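+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As an extra cross-check before calling the library, the next cell recomputes NDCG@K in a compact, vectorized form with plain `torch` ops. This is a minimal sketch under the same assumptions as the manual walkthrough (exponential gain $2^{rel}-1$, $\\log_2(i+1)$ discount, a single query); the helper name `ndcg_vectorized` is ours for illustration and is not part of Ignite."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def ndcg_vectorized(y_true_row, y_pred_row, k):\n",
+ "    \"\"\"Vectorized NDCG@K for one query (illustrative sketch, not Ignite's API).\"\"\"\n",
+ "    # Relevance values reordered by the model's ranking, truncated to top-k\n",
+ "    ranked = y_true_row[torch.argsort(y_pred_row, descending=True)][:k]\n",
+ "    # Relevance values in the ideal (descending) order, truncated to top-k\n",
+ "    ideal = torch.sort(y_true_row, descending=True).values[:k]\n",
+ "    # Discounts log2(i+1) for ranks i = 1..k\n",
+ "    discounts = torch.log2(torch.arange(2, k + 2, dtype=torch.float32))\n",
+ "    dcg = torch.sum((2 ** ranked - 1) / discounts)\n",
+ "    idcg = torch.sum((2 ** ideal - 1) / discounts)\n",
+ "    return (dcg / idcg).item()\n",
+ "\n",
+ "# Should reproduce the step-by-step result from above (0.6589)\n",
+ "print(f\"Vectorized NDCG@{K}: {ndcg_vectorized(y_true[0], y_pred[0], K):.4f}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "With that cross-check in place, let's run the Ignite metric itself."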
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Manual NDCG@5: 0.6589\n",
+ "Ignite NDCG@5: 0.6589\n",
+ "\n",
+ "✓ Manual calculation matches Ignite implementation perfectly!\n"
+ ]
+ }
+ ],
+ "source": [
+ "from ignite.metrics.rec_sys.ndcg import NDCG\n",
+ "\n",
+ "# Initialize the NDCG metric with k=5\n",
+ "ndcg_metric = NDCG(output_transform=lambda x: x, top_k=[K])\n",
+ "\n",
+ "# Reset and update with our data\n",
+ "ndcg_metric.reset()\n",
+ "ndcg_metric.update((y_pred, y_true))\n",
+ "\n",
+ "# Compute the result\n",
+ "ignite_result = ndcg_metric.compute()\n",
+ "\n",
+ "print(f\"Manual NDCG@{K}: {ndcg_manual:.4f}\")\n",
+ "print(f\"Ignite NDCG@{K}: {ignite_result[0]:.4f}\")\n",
+ "print()\n",
+ "\n",
+ "# Verify they match\n",
+ "assert abs(ndcg_manual - ignite_result[0]) < 1e-4, \"Mismatch!\"\n",
+ "print(\"✓ Manual calculation matches Ignite implementation perfectly!\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 6. Understanding the Score\n",
+ "\n",
+ "Let's build some intuition by looking at two extreme cases. Note that even the worst possible ranking will not score 0 here: every document still appears somewhere in the top-5 list, so relevant documents contribute some discounted gain no matter how far down they are placed."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Perfect ranking NDCG@5: 1.0000 (should be 1.0)\n",
+ "Worst ranking NDCG@5: 0.5884 (the lowest achievable for this relevance set)\n",
+ "\n",
+ "Our model's ranking NDCG@5: 0.6589 (somewhere in between)\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Case 1: Perfect ranking (the model's score order matches the relevance order)\n",
+ "y_pred_perfect = torch.tensor([[0.9, 0.6, 0.8, 0.1, 0.3]])\n",
+ "\n",
+ "ndcg_metric.reset()\n",
+ "ndcg_metric.update((y_pred_perfect, y_true))\n",
+ "perfect_score = ndcg_metric.compute()\n",
+ "print(f\"Perfect ranking NDCG@{K}: {perfect_score[0]:.4f} (should be 1.0)\")\n",
+ "\n",
+ "# Case 2: Worst ranking (model ranks least relevant items highest)\n",
+ "y_pred_worst = torch.tensor([[0.1, 0.3, 0.2, 0.9, 0.6]])\n",
+ "\n",
+ "ndcg_metric.reset()\n",
+ "ndcg_metric.update((y_pred_worst, y_true))\n",
+ "worst_score = ndcg_metric.compute()\n",
+ "print(f\"Worst ranking NDCG@{K}: {worst_score[0]:.4f} (the lowest achievable for this relevance set)\")\n",
+ "\n",
+ "print(f\"\\nOur model's ranking NDCG@{K}: {ndcg_manual:.4f} (somewhere in between)\")"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.8"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}

From e3e517c097471fdb5a5a5788de738863291c2b3f Mon Sep 17 00:00:00 2001
From: steaphenai
Date: Mon, 30 Mar 2026 22:47:57 +0530
Subject: [PATCH 4/4] Remove old NDCG tutorial notebook from PR

---
 .../intermediate/ndcg-metric-tutorial.ipynb | 397 ------------------
 1 file changed, 397 deletions(-)
 delete mode 100644 tutorials/intermediate/ndcg-metric-tutorial.ipynb

diff --git a/tutorials/intermediate/ndcg-metric-tutorial.ipynb b/tutorials/intermediate/ndcg-metric-tutorial.ipynb
deleted file mode 100644
index 838e065..0000000
--- a/tutorials/intermediate/ndcg-metric-tutorial.ipynb
+++ /dev/null
@@ -1,397 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Understanding NDCG (Normalized Discounted Cumulative
Gain)\n", - "\n", - "This tutorial walks through how NDCG is computed from scratch, then verifies the result using PyTorch Ignite's `NDCG` metric.\n", - "\n", - "NDCG is a ranking metric commonly used in information retrieval and recommender systems. Unlike metrics that only check if the right item was retrieved, NDCG rewards models that rank more relevant items **higher** in the list.\n", - "\n", - "By the end of this notebook you will:\n", - "- Understand what ground truth and predictions look like for a ranking problem\n", - "- Compute DCG and IDCG step by step by hand\n", - "- Calculate NDCG manually\n", - "- Verify every number matches the Ignite `NDCG` implementation" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Setup" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# Install dependencies if needed\n", - "# !pip install pytorch-ignite torch\n", - "import torch\n", - "import math" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 1. The Problem Setup\n", - "\n", - "Imagine a search engine returning 5 documents for a query. Each document has a **relevance score** (ground truth) assigned by a human — higher means more relevant:\n", - "\n", - "| Document | Relevance (ground truth) |\n", - "|----------|-------------------------|\n", - "| Doc A | 3 (highly relevant) |\n", - "| Doc B | 2 (relevant) |\n", - "| Doc C | 3 (highly relevant) |\n", - "| Doc D | 0 (not relevant) |\n", - "| Doc E | 1 (slightly relevant) |\n", - "\n", - "The model predicts a **score** for each document. The model then ranks documents by these scores (highest score = rank 1):" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Ground truth relevance: tensor([[3., 2., 3., 0., 1.]])\n", - "Model prediction scores: tensor([[0.1000, 0.4000, 0.3500, 0.8000, 0.1000]])\n" - ] - } - ], - "source": [ - "# Ground truth relevance scores (one query, 5 documents)\n", - "# Shape: (1, 5) — batch of 1 query\n", - "y_true = torch.tensor([[3.0, 2.0, 3.0, 0.0, 1.0]])\n", - "\n", - "# Model prediction scores for each document\n", - "# Higher score = model thinks this doc is more relevant\n", - "y_pred = torch.tensor([[0.1, 0.4, 0.35, 0.8, 0.1]])\n", - "\n", - "print(\"Ground truth relevance:\", y_true)\n", - "print(\"Model prediction scores:\", y_pred)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2. Step 1 — Rank the Documents by Model Score\n", - "\n", - "The model ranks documents by sorting its predicted scores in descending order. The document with the highest predicted score gets rank 1." 
- ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Ranked document indices (by model): tensor([[3, 1, 2, 0, 4]])\n", - "Relevance scores in model's ranked order: tensor([0., 2., 3., 3., 1.])\n", - "\n", - "So the model ranked:\n", - " Rank 1: Doc D (relevance=0, pred score=0.80)\n", - " Rank 2: Doc B (relevance=2, pred score=0.40)\n", - " Rank 3: Doc C (relevance=3, pred score=0.35)\n", - " Rank 4: Doc A (relevance=3, pred score=0.10)\n", - " Rank 5: Doc E (relevance=1, pred score=0.10)\n" - ] - } - ], - "source": [ - "# Sort document indices by predicted score (descending)\n", - "ranked_indices = torch.argsort(y_pred, descending=True)\n", - "print(\"Ranked document indices (by model):\", ranked_indices)\n", - "\n", - "# Reorder ground truth relevance scores according to model ranking\n", - "ranked_relevance = y_true[0][ranked_indices[0]]\n", - "print(\"Relevance scores in model's ranked order:\", ranked_relevance)\n", - "print()\n", - "print(\"So the model ranked:\")\n", - "doc_names = ['Doc A', 'Doc B', 'Doc C', 'Doc D', 'Doc E']\n", - "for rank, idx in enumerate(ranked_indices[0]):\n", - " print(f\" Rank {rank+1}: {doc_names[idx]} (relevance={y_true[0][idx].item():.0f}, pred score={y_pred[0][idx].item():.2f})\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3. Step 2 — The DCG Helper Function\n", - "\n", - "DCG measures the quality of a ranking. It rewards relevant documents but **discounts** them based on their position — finding a relevant document at rank 1 is worth more than finding it at rank 5.\n", - "\n", - "The formula is:\n", - "\n", - "$$DCG@K = \\sum_{i=1}^{K} \\frac{2^{rel_i} - 1}{\\log_2(i + 1)}$$\n", - "\n", - "Where:\n", - "- $rel_i$ is the relevance of the document at rank $i$\n", - "- The numerator $2^{rel_i} - 1$ is the **gain** (higher relevance = exponentially higher gain)\n", - "- The denominator $\\log_2(i+1)$ is the **discount** (lower rank position = larger discount)\n", - "\n", - "We define a single `compute_dcg` function and reuse it for both DCG and IDCG — because IDCG is simply DCG computed on the **ideal** (perfectly sorted) relevance scores." - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "def compute_dcg(relevance_scores, k):\n", - " \"\"\"Compute DCG@K for a list of relevance scores already in ranked order.\n", - "\n", - " Args:\n", - " relevance_scores: 1D tensor of relevance values in ranked order\n", - " k: number of top positions to consider\n", - "\n", - " Returns:\n", - " DCG@K score (float)\n", - " \"\"\"\n", - " dcg = 0.0\n", - " print(f\"{'Rank':<6} {'Relevance':<12} {'Gain (2^rel-1)':<18} {'Discount log2(i+1)':<22} {'Contribution'}\")\n", - " print(\"-\" * 72)\n", - " for i in range(k):\n", - " rank = i + 1\n", - " rel = relevance_scores[i].item()\n", - " gain = (2 ** rel) - 1\n", - " discount = math.log2(rank + 1)\n", - " contribution = gain / discount\n", - " dcg += contribution\n", - " print(f\"{rank:<6} {rel:<12.0f} {gain:<18.4f} {discount:<22.4f} {contribution:.4f}\")\n", - " print(\"-\" * 72)\n", - " return dcg" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 4. 
Step 3 — Compute DCG and IDCG\n", - "\n", - "We call `compute_dcg` twice:\n", - "- Once on the **model's ranking** → DCG\n", - "- Once on the **ideal ranking** (ground truth sorted descending) → IDCG" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "DCG@5 — model's ranking:\n", - "\n", - "Rank Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", - "------------------------------------------------------------------------\n", - "1 0 0.0000 1.0000 0.0000\n", - "2 2 3.0000 1.5850 1.8928\n", - "3 3 7.0000 2.0000 3.5000\n", - "4 3 7.0000 2.3219 3.0147\n", - "5 1 1.0000 2.5850 0.3869\n", - "------------------------------------------------------------------------\n", - "DCG@5 = 8.7944\n", - "\n", - "IDCG@5 — ideal ranking (ground truth sorted descending):\n", - "\n", - "Rank Relevance Gain (2^rel-1) Discount log2(i+1) Contribution\n", - "------------------------------------------------------------------------\n", - "1 3 7.0000 1.0000 7.0000\n", - "2 3 7.0000 1.5850 4.4165\n", - "3 2 3.0000 2.0000 1.5000\n", - "4 1 1.0000 2.3219 0.4307\n", - "5 0 0.0000 2.5850 0.0000\n", - "------------------------------------------------------------------------\n", - "IDCG@5 = 13.3472\n" - ] - } - ], - "source": [ - "K = 5 # Evaluate top 5 results\n", - "\n", - "# --- DCG: model's actual ranking ---\n", - "print(f\"DCG@{K} — model's ranking:\\n\")\n", - "dcg = compute_dcg(ranked_relevance, K)\n", - "print(f\"DCG@{K} = {dcg:.4f}\")\n", - "\n", - "print()\n", - "\n", - "# --- IDCG: ideal ranking (ground truth sorted descending) ---\n", - "ideal_relevance, _ = torch.sort(y_true[0], descending=True)\n", - "print(f\"IDCG@{K} — ideal ranking (ground truth sorted descending):\\n\")\n", - "idcg = compute_dcg(ideal_relevance, K)\n", - "print(f\"IDCG@{K} = {idcg:.4f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 5. Step 4 — Compute NDCG\n", - "\n", - "NDCG normalizes DCG by IDCG, giving a score between 0 and 1:\n", - "\n", - "$$NDCG@K = \\frac{DCG@K}{IDCG@K}$$\n", - "\n", - "A score of 1.0 means the model ranked everything perfectly. A score close to 0 means the ranking was very poor." - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "DCG@5 = 8.7944\n", - "IDCG@5 = 13.3472\n", - "NDCG@5 = DCG / IDCG = 8.7944 / 13.3472 = 0.6589\n", - "\n", - "The model achieved 65.9% of the ideal ranking quality.\n" - ] - } - ], - "source": [ - "ndcg_manual = dcg / idcg\n", - "\n", - "print(f\"DCG@{K} = {dcg:.4f}\")\n", - "print(f\"IDCG@{K} = {idcg:.4f}\")\n", - "print(f\"NDCG@{K} = DCG / IDCG = {dcg:.4f} / {idcg:.4f} = {ndcg_manual:.4f}\")\n", - "print(f\"\\nThe model achieved {ndcg_manual*100:.1f}% of the ideal ranking quality.\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 6. Verify with PyTorch Ignite\n", - "\n", - "Now let's confirm our manual calculation matches the Ignite `NDCG` metric exactly." 
- ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Manual NDCG@5: 0.6589\n", - "Ignite NDCG@5: 0.6589\n", - "\n", - "✓ Manual calculation matches Ignite implementation perfectly!\n" - ] - } - ], - "source": [ - "from ignite.metrics.rec_sys.ndcg import NDCG\n", - "\n", - "# Initialize the NDCG metric with k=5\n", - "ndcg_metric = NDCG(output_transform=lambda x: x, top_k=[K])\n", - "\n", - "# Reset and update with our data\n", - "ndcg_metric.reset()\n", - "ndcg_metric.update((y_pred, y_true))\n", - "\n", - "# Compute the result\n", - "ignite_result = ndcg_metric.compute()\n", - "\n", - "print(f\"Manual NDCG@{K}: {ndcg_manual:.4f}\")\n", - "print(f\"Ignite NDCG@{K}: {ignite_result[0]:.4f}\")\n", - "print()\n", - "\n", - "# Verify they match\n", - "assert abs(ndcg_manual - ignite_result[0]) < 1e-4, \"Mismatch!\"\n", - "print(\"✓ Manual calculation matches Ignite implementation perfectly!\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 7. Understanding the Score\n", - "\n", - "Let's build some intuition by looking at two extreme cases." - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Perfect ranking NDCG@5: 1.0000 (should be 1.0)\n", - "Worst ranking NDCG@5: 0.5884 (should be close to 0)\n", - "\n", - "Our model's ranking NDCG@5: 0.6589 (somewhere in between)\n" - ] - } - ], - "source": [ - "# Case 1: Perfect ranking (model scores match relevance exactly)\n", - "y_pred_perfect = torch.tensor([[0.9, 0.6, 0.8, 0.1, 0.3]])\n", - "\n", - "ndcg_metric.reset()\n", - "ndcg_metric.update((y_pred_perfect, y_true))\n", - "perfect_score = ndcg_metric.compute()\n", - "print(f\"Perfect ranking NDCG@{K}: {perfect_score[0]:.4f} (should be 1.0)\")\n", - "\n", - "# Case 2: Worst ranking (model ranks least relevant items highest)\n", - "y_pred_worst = torch.tensor([[0.1, 0.3, 0.2, 0.9, 0.6]])\n", - "\n", - "ndcg_metric.reset()\n", - "ndcg_metric.update((y_pred_worst, y_true))\n", - "worst_score = ndcg_metric.compute()\n", - "print(f\"Worst ranking NDCG@{K}: {worst_score[0]:.4f} (should be close to 0)\")\n", - "\n", - "print(f\"\\nOur model's ranking NDCG@{K}: {ndcg_manual:.4f} (somewhere in between)\")" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.8" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -}