Name	Name	Last commit message	Last commit date
parent directory ..
Benchmark	Benchmark
script	script
.gitignore	.gitignore
PCBench-LLM-Distilled_valid_samples_and_target_jsons.json	PCBench-LLM-Distilled_valid_samples_and_target_jsons.json
README.md	README.md
README_zh.md	README_zh.md

PCBench-LLM-Distilled

Language / 语言: 中文 | English

A distilled benchmark dataset for evaluating LLM-based Python API compatibility detection.

Data Source

Extracted from PCART-tools/PCBench@db6b54b.

The original data was first processed by extracting the first sample for each API according to the #1[YN][YN] rule. After multiple rounds of cleaning and filtering, the processed dataset in the current directory was obtained.

Dataset Processing Workflow

The filtering process for this dataset is as follows:

Extracted 844 first samples from the original PCBench.
Removed 6 samples that could not be used as valid test samples because of runtime environment or external dependency issues, leaving 838 samples.
Removed another 2 invalid samples whose API information could not be obtained dynamically or statically, leaving 836 samples.
Among these 836 usable samples, only 284 still needed to enter findDiffer after the initial compatibility check and produced target JSON files. The remaining 552 samples were judged compatible during the initial compatibility check and did not enter findDiffer, so they are not included in the current processed dataset.

The final process can be summarized as:

PCBench (844 extracted first samples) -> 838 -> 836 -> 284

The current Benchmark/ directory stores the final 284 retained samples. The processing list is stored in PCBench-LLM-Distilled_valid_samples_and_target_jsons.json.

Current Dataset

The Benchmark/ directory currently contains 284 samples from 29 libraries. These samples all satisfy the following conditions:

They passed the first two rounds of invalid sample removal.
They actually entered findDiffer after the initial compatibility check.
They can be matched to retained target_jsons in the processing logs.

Directory Structure

PCBench-LLM-Distilled/
├── README.md
├── README_en.md
├── Benchmark/
│   ├── <library>/                    # e.g., numpy, pandas, scipy
│   │   ├── <api>@<version_range>/    # e.g., numpy.array@1.20.0-1.21.0
│   │   │   └── <api>#1<label>/       # e.g., numpy.array#1YY
│   │   │       ├── <api><label>.py   # test script
│   │   │       └── <target_json>.json
│   │   └── ...
│   └── ...
├── PCBench-LLM-Distilled_valid_samples_and_target_jsons.json
└── script/
    ├── download_envs.py
    └── generate_json.py

Statistics

Library	Samples	Library	Samples
pandas	46	plotly	34
sklearn	26	polars	25
networkx	23	scipy	22
matplotlib	14	gensim	12
tornado	12	numpy	11
rich	10	sympy	9
httpx	6	tensorflow	6
dask	5	keras	4
aiohttp	3	flask	2
click	2	PIL	2
django	2	xgboost	1
torch	1	transformers	1
fastapi	1	loguru	1
spacy	1	jax	1
pydantic	1

Full statistics: 29 libraries, 284 samples

Excluded Samples

Invalid Samples Caused by Runtime Environment or External Dependencies (6)

Library	Directory	Reason
gensim	`#gensim.models.wrappers.fasttext.FastText.train@2.2.0-2.3.0`	CPU instruction incompatibility (SIGILL)
redis	`#redis.commands.core.SortedSetCommands.zrank@4.5.5-4.6.0`	Redis service required
redis	`#zrevrank@4.5.4-4.5.5`	Redis service required
sklearn	`#sklearn.datasets.fetch_20newsgroups@0.21.3-0.22`	SSL certificate issue
sklearn	`#sklearn.datasets.fetch_20newsgroups_vectorized@0.21.3-0.22`	SSL certificate issue
torch	`#torch.cuda.synchronize@1.0.1-1.1.0`	NVIDIA GPU required

Invalid Samples Whose API Information Failed to Be Retrieved Both Dynamically and Statically (2)

Library	Directory	Reason
tensorflow	`#tf.data.experimental.choose_from_datasets@2.5.3-2.6.0`	API information failed to be retrieved both dynamically and statically; the API content passed into `findDiffer` was empty, making it an invalid sample, so it was removed
tensorflow	`#tf.data.experimental.sample_from_datasets@2.5.3-2.6.0`	API information failed to be retrieved both dynamically and statically; the API content passed into `findDiffer` was empty, making it an invalid sample, so it was removed

Samples That Did Not Enter `findDiffer` (552)

The remaining unretained samples all passed the initial compatibility check and did not enter findDiffer for processing. Therefore, they are treated as outside the current processed dataset, excluded uniformly, and not listed one by one.

Scripts

`script/generate_json.py`

Generates JSON configuration files for each sample in the Benchmark/ directory.

python script/generate_json.py

During execution, it prompts for the conda environment directory path, which is used to configure the currentEnv and targetEnv fields.

`script/download_envs.py`

Downloads preconfigured environment archives from the NUAA cloud drive and automatically extracts them to the specified installation path.

Two modes are supported:

Batch download: select one or more folders and download all archives inside them.
Selective download: enter a folder first, then select specific archives to download.

python script/download_envs.py

During execution, it prompts for the download mode, target folder or file, and installation path. After downloading, it automatically extracts the archives and outputs statistics for successful, failed, skipped items, and total elapsed time.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

PCBench-LLM-Distilled

Data Source

Dataset Processing Workflow

Current Dataset

Directory Structure

Statistics

Excluded Samples

Invalid Samples Caused by Runtime Environment or External Dependencies (6)

Invalid Samples Whose API Information Failed to Be Retrieved Both Dynamically and Statically (2)

Samples That Did Not Enter `findDiffer` (552)

Scripts

`script/generate_json.py`

`script/download_envs.py`

FilesExpand file tree

PCBench-LLM-Distilled

Directory actions

More options

Directory actions

More options

Latest commit

History

PCBench-LLM-Distilled

Folders and files

parent directory

README.md

PCBench-LLM-Distilled

Data Source

Dataset Processing Workflow

Current Dataset

Directory Structure

Statistics

Excluded Samples

Invalid Samples Caused by Runtime Environment or External Dependencies (6)

Invalid Samples Whose API Information Failed to Be Retrieved Both Dynamically and Statically (2)

Samples That Did Not Enter findDiffer (552)

Scripts

script/generate_json.py

script/download_envs.py

Samples That Did Not Enter `findDiffer` (552)

`script/generate_json.py`

`script/download_envs.py`