CHANGELOG.md
+9 lines changed: 9 additions & 0 deletions
@@ -5,6 +5,15 @@ All notable changes to the IDC Claude Skill are documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/),
 and this project adheres to [Semantic Versioning](https://semver.org/).
 
+## [1.6.2] - 2026-05-08
+
+### Changed
+
+- Moved `version_metadata_index` to second position in Available Tables (right after `index`) to surface it alongside the primary index
+- Moved `prior_versions_index` to last position in Available Tables; updated description to clarify it contains only removed/superseded series and should not be queried for current data
+- Added explicit Best Practices rule prohibiting web search for IDC data content questions; idc-index DuckDB queries are always authoritative — web sources are stale
+- Removed "Loaded" column from Available Tables and replaced with an unconditional rule: always call `client.fetch_index("table_name")` before querying any table; `fetch_index()` is idempotent for all tables including auto-loaded ones, so no exceptions are needed
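The unconditional fetch-before-query rule from the last changelog entry can be sketched as a small wrapper. This is an illustration only: `query_table`, `IndexClient`, and `RecordingClient` are hypothetical names that are not part of idc-index, and the stand-in client exists solely so the sketch runs without network access. Only the `fetch_index()` and `sql_query()` method names come from the skill text.

```python
from typing import Any, Protocol


class IndexClient(Protocol):
    """Minimal client shape assumed here: the two methods named in the skill text."""

    def fetch_index(self, index_name: str) -> Any: ...

    def sql_query(self, sql_query: str) -> Any: ...


def query_table(client: IndexClient, table: str, sql: str) -> Any:
    """Hypothetical helper: load `table`, then run `sql` against it."""
    # Called unconditionally: the changelog states fetch_index() is idempotent,
    # so there is no need to special-case tables that are loaded automatically.
    client.fetch_index(table)
    return client.sql_query(sql)


class RecordingClient:
    """Stand-in client (not part of idc-index) that records the call order."""

    def __init__(self) -> None:
        self.calls = []

    def fetch_index(self, index_name: str) -> None:
        self.calls.append(("fetch_index", index_name))

    def sql_query(self, sql_query: str) -> list:
        self.calls.append(("sql_query", sql_query))
        return []


demo = RecordingClient()
query_table(demo, "seg_index", "SELECT * FROM seg_index LIMIT 5")
print(demo.calls)
```

With the real package, the object passed in would presumably be an `IDCClient()` instance in place of the stand-in; the wrapper itself does not change.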
SKILL.md
+20 -23 lines changed: 20 additions & 23 deletions
@@ -3,7 +3,7 @@ name: imaging-data-commons
 description: Query and download public cancer imaging data from NCI Imaging Data Commons using idc-index. Invoke for any question about IDC collections, cancer imaging datasets, DICOM data access, radiology (CT, MR, PET) or pathology AI training sets, metadata queries, visualization, or license checks — even when the user doesn't explicitly mention "IDC". No authentication required.
 license: This skill is provided under the MIT License. IDC data itself has individual licensing (mostly CC-BY, some CC-NC) that must be respected when using the data.
 metadata:
-version: 1.6.1
+version: 1.6.2
 skill-author: Andrey Fedorov, @fedorov
 idc-index: "0.12.2"
 idc-data-version: "v24"
@@ -128,25 +128,24 @@ The `idc-index` package provides multiple metadata index tables, accessible via
|`contrast_index`| 1 row = 1 series with contrast info | fetch_index() | Contrast agent metadata: agent name, ingredient, administration route (CT, MR, PT, XA, RF) |
144
-
|`volume_geometry_index`| 1 row = 1 CT/MR/PT series | fetch_index() | 3D volume geometry validation for single-frame CT, MR, and PT series; boolean checks for orientation, spacing, dimensions, and slice positions; composite `regularly_spaced_3d_volume` flag |
145
-
|`rtstruct_index`| 1 row = 1 RTSTRUCT series | fetch_index() | RT Structure Set metadata: total ROI count, ROI names, generation algorithms, interpreted types, and the referenced image series UID |
146
-
|`version_metadata_index`| 1 row = 1 IDC release version | fetch_index() | IDC version release timestamps; join on `idc_version` to correlate series with their release date |
147
-
148
-
**Auto** = loaded automatically when `IDCClient()` is instantiated
149
-
**fetch_index()** = requires `client.fetch_index("table_name")` to load
131
+
Always call `client.fetch_index("table_name")` before querying any index table — it is safe and idempotent for all tables, including those loaded automatically at startup.
132
+
133
+
| Table | Row Granularity | Description |
134
+
|-------|-----------------|-------------|
135
+
|`index`| 1 row = 1 DICOM series | Primary metadata for all current IDC data |
136
+
|`version_metadata_index`| 1 row = 1 IDC release version | IDC version release timestamps; join on `idc_version` to correlate series with their release date |
|`seg_index`| 1 row = 1 DICOM Segmentation series | Segmentation metadata: algorithm, segment count, reference to source image series |
143
+
|`ann_index`| 1 row = 1 DICOM ANN series | Microscopy Bulk Simple Annotations series metadata; references annotated image series |
144
+
|`ann_group_index`| 1 row = 1 annotation group | Detailed annotation group metadata: graphic type, annotation count, property codes, algorithm |
145
+
|`contrast_index`| 1 row = 1 series with contrast info | Contrast agent metadata: agent name, ingredient, administration route (CT, MR, PT, XA, RF) |
146
+
|`volume_geometry_index`| 1 row = 1 CT/MR/PT series | 3D volume geometry validation for single-frame CT, MR, and PT series; boolean checks for orientation, spacing, dimensions, and slice positions; composite `regularly_spaced_3d_volume` flag |
147
+
|`rtstruct_index`| 1 row = 1 RTSTRUCT series | RT Structure Set metadata: total ROI count, ROI names, generation algorithms, interpreted types, and the referenced image series UID |
148
+
|`prior_versions_index`| 1 row = 1 DICOM series | Series that have been removed or superseded in previous IDC releases; use only to download deprecated/historical data — do not query for current data |
150
149
151
150
### Joining Tables
152
151
@@ -666,17 +665,15 @@ See `references/use_cases.md` for complete end-to-end workflow examples includin
 
 ## Best Practices
 
+- **Never use web search for IDC data content questions** - Always query the idc-index directly using `client.sql_query()`. Web sources (release notes, blog posts, documentation pages) are frequently out of date and will produce incorrect answers. The local DuckDB index is the authoritative source; use it even when web search is available.
 - **Verify IDC version before generating responses** - Always call `client.get_idc_version()` at the start of a session to confirm you're using the expected data version (currently v24). If using an older version, recommend `pip install --upgrade idc-index`
 - **Check licenses before use** - Always query the `license_short_name` field and respect licensing terms (CC BY vs CC BY-NC)
 - **Generate citations for attribution** - Use `citations_from_selection()` to get properly formatted citations from `source_DOI` values; include these in publications
 - **Start with small queries** - Use `LIMIT` clause when exploring to avoid long downloads and understand data structure
 - **Use mini-index for simple queries** - Only use BigQuery when you need comprehensive metadata or complex JOINs
 - **Organize downloads with dirTemplate** - Use meaningful directory structures like `%collection_id/%PatientID/%Modality`
-- **Cache query results** - Save DataFrames to CSV files to avoid re-querying and ensure reproducibility
 - **Estimate size first** - Check collection size before downloading - some collection sizes are in terabytes!
 - **Save manifests** - Always save query results with Series UIDs for reproducibility and data provenance
-- **Read documentation** - IDC data structure and metadata fields are documented at https://learn.canceridc.dev/
-- **Use IDC forum** - Search for questions/answers and ask your questions to the IDC maintainers and users at https://discourse.canceridc.dev/
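Two of the practices in the list above, checking licenses and estimating size before downloading, can be combined in one pass over query results. The sketch below is a minimal illustration under stated assumptions: the rows are made up, the `series_size_MB` column name and the exact license strings are assumptions not confirmed by the text above, and `summarize_selection` and `is_non_commercial` are hypothetical helpers, not idc-index functions.

```python
from collections import defaultdict


def summarize_selection(rows):
    """Total the size in MB per license for a list of series rows.

    Each row is a dict with `license_short_name` and `series_size_MB` keys;
    `series_size_MB` is an assumed column name.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["license_short_name"]] += row["series_size_MB"]
    return dict(totals)


def is_non_commercial(license_name):
    """Rough check: treat any license name containing 'NC' as non-commercial."""
    return "NC" in license_name.upper()


# Made-up rows for illustration; real rows would come from client.sql_query().
sample_rows = [
    {"SeriesInstanceUID": "1.2.3", "license_short_name": "CC BY 4.0", "series_size_MB": 120.5},
    {"SeriesInstanceUID": "1.2.4", "license_short_name": "CC BY-NC 4.0", "series_size_MB": 80.0},
    {"SeriesInstanceUID": "1.2.5", "license_short_name": "CC BY 4.0", "series_size_MB": 39.5},
]

totals = summarize_selection(sample_rows)
restricted = [name for name in totals if is_non_commercial(name)]
print(totals)      # size in MB per license
print(restricted)  # licenses that need a non-commercial review
```

Running the summary before any download makes both the total volume and any non-commercial restrictions visible up front, which is the intent of the two bullets.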