Commit cf29f1b

committed
went through the test list
1 parent 3043949 commit cf29f1b

3 files changed: 15 additions, 21 deletions

README.md

Lines changed: 12 additions & 4 deletions
@@ -81,10 +81,7 @@ Migrated from the original Python150k preprocessing pipeline:
 # Install dependencies
 pip install -e ".[dev]"
 
-# Download the seed dataset
-cd data/raw/python-method && bash get_data.sh && cd -
-
-# Convert to HuggingFace format
+# Convert to HuggingFace format (requires dataset access, see below)
 python -m src.data.convert_seed \
 --input-dir data/raw/python-method \
 --output-dir data/processed/python-method
@@ -95,6 +92,17 @@ python -m src.data.convert_seed \
 The seed dataset comes from the [NeuralCodeSum](https://github.com/wasiahmad/NeuralCodeSum)
 project (ACL 2020): 92,545 Python function-docstring pairs split into train/dev/test.
 
+### Dataset Access
+
+The python-method dataset was previously available via a Google Drive download script
+(`data/raw/python-method/get_data.sh`). This script has been removed as the Google Drive
+link (file ID: `1XPE1txk9VI0aOT_TdqbAeI58Q8puKVl2`) is no longer accessible.
+
+To obtain the dataset, you can:
+1. Contact the [NeuralCodeSum](https://github.com/wasiahmad/NeuralCodeSum) authors
+2. Download from the original source if available at the project repository
+3. Use the alternative python150k dataset from [ETH Zurich SRI Lab](https://www.sri.inf.ethz.ch/py150)
+
 ## Acknowledgments
 
 - Original C2NL dataset: [A Transformer-based Approach for Source Code Summarization](https://arxiv.org/abs/2005.00653)
- Original C2NL dataset: [A Transformer-based Approach for Source Code Summarization](https://arxiv.org/abs/2005.00653)

data/raw/python-method/get_data.sh

Lines changed: 0 additions & 17 deletions
This file was deleted.
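
With the download script removed, `src.data.convert_seed` depends on the dataset having been placed in `data/raw/python-method` by hand. A pre-flight check along the following lines could catch a missing dataset before conversion starts. This is only a sketch: the helper name `missing_dataset_reason` is hypothetical and not part of this commit, and since the diff does not show which files the converter expects, the check only tests that the directory exists and contains at least one file.

```python
from pathlib import Path
from typing import Optional


def missing_dataset_reason(input_dir: str) -> Optional[str]:
    """Return why the dataset looks absent, or None if the directory is populated.

    Hypothetical helper: it does not validate file names or contents,
    because the expected layout is not shown in this commit.
    """
    path = Path(input_dir)
    if not path.is_dir():
        return f"{path} does not exist or is not a directory"
    # Any regular file anywhere below the directory counts as "data present".
    if not any(p.is_file() for p in path.rglob("*")):
        return f"{path} contains no files"
    return None
```

A caller could run `missing_dataset_reason("data/raw/python-method")` before invoking the converter and, if a reason is returned, point the user at the Dataset Access section of the README.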

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -32,6 +32,9 @@ dev = [
 "ruff>=0.1.0",
 ]
 
+[tool.hatch.build.targets.wheel]
+packages = ["src"]
+
 [tool.ruff]
 line-length = 100
 target-version = "py310"
