165 changes: 165 additions & 0 deletions .gitignore
@@ -0,0 +1,165 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*.pyc
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# IDEs
.vscode/
.idea/
*.swp
*.swo
*~

# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# Project specific
outputs.json
*.bin
*.gz
GoogleNews-vectors-negative300.bin
GoogleNews-vectors-negative300.bin.gz

# Temporary files
*.tmp
*.temp

# Model files and data
models/
data/
*.model
*.pkl
*.pickle

# Results and outputs
results/
outputs/
logs/
*.log

# API keys
.env
.env.local
.env.*.local
api_keys.txt
config.json

# AI settings
.claude/
51 changes: 44 additions & 7 deletions README.md
@@ -75,18 +75,31 @@ The <b>Sci</b>entific <b>Know</b>ledge <b>Eval</b>uation (<b>SciKnowEval</b>) be
<h2 id="3">🏹 QuickStart</h2>
<h3 id="3.1">⬇️ Step 1: Installation</h3>

-To evaluate LLMs on SciKnowEval, first clone the repository:
**Option 1: pip install from GitHub (Recommended)**
```bash
pip install "sciknoweval @ git+https://github.com/HICAI-ZJU/SciKnowEval.git"
```

**Option 2: Install from Source**
```bash
git clone https://github.com/HICAI-ZJU/SciKnowEval.git
cd SciKnowEval
pip install .
```
-Next, set up a conda environment to manage the dependencies:

**Option 3: Development Installation**
```bash
conda create -n sciknoweval python=3.10.9
conda activate sciknoweval
git clone https://github.com/HICAI-ZJU/SciKnowEval.git
cd SciKnowEval
pip install -e .
```
-Then, install the required dependencies:

**Option 4: Manual Setup (Legacy)**
```bash
git clone https://github.com/HICAI-ZJU/SciKnowEval.git
cd SciKnowEval
conda create -n sciknoweval python=3.10.9
conda activate sciknoweval
pip install -r requirements.txt
```
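
As a quick sanity check after any of the options above, a short script can confirm that the package and its command-line entry point are visible in the active environment. This is a minimal sketch: it assumes the distribution is named `sciknoweval` and installs a `sciknoweval` console command, as the commands in this README suggest. `check_install` is a hypothetical helper, not part of the project.

```python
import shutil
from importlib import metadata


def check_install(dist_name="sciknoweval", command="sciknoweval"):
    """Report the installed version and the resolved path of the CLI command.

    Both default names are assumptions taken from the install commands above.
    Missing pieces are reported as None instead of raising.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment
    return {"version": version, "command_path": shutil.which(command)}


if __name__ == "__main__":
    print(check_install())
```

With Option 4, which installs only the requirements, both values would normally be `None`, since the package itself is not installed.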

@@ -152,7 +165,31 @@ By following these guidelines, you can effectively use the SciKnowEval benchmark

<h3 id="3.4">🚀 Step 4: Evaluate</h3>

-You can run `eval.py` to evaluate your model:
**Option 1: Using the Command Line Interface (Recommended)**

After installing SciKnowEval, you can use the `sciknoweval` command:

```bash
export OPENAI_API_KEY="YOUR_API_KEY"
sciknoweval \
--data_path "your/model/predictions.json" \
--word2vec_model_path "path/to/GoogleNews-vectors-negative300.bin" \
--gen_evaluator "gpt-4o" \
--output_path "path/to/your/output.json"
```

**Option 2: Using Python Module**

```bash
export OPENAI_API_KEY="YOUR_API_KEY"
python -m sciknoweval.eval \
--data_path "your/model/predictions.json" \
--word2vec_model_path "path/to/GoogleNews-vectors-negative300.bin" \
--gen_evaluator "gpt-4o" \
--output_path "path/to/your/output.json"
```
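
The same four flags can also be assembled from Python, which makes it easy to check the input paths before any API calls are made. This is a minimal sketch that assumes only the flags and the `sciknoweval.eval` module name shown above. `build_eval_command` and `run_eval` are hypothetical helpers, not part of SciKnowEval.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_eval_command(data_path, word2vec_model_path, output_path,
                       gen_evaluator="gpt-4o"):
    """Build the argument list for `python -m sciknoweval.eval`.

    Raises FileNotFoundError if an input file is missing, so the problem
    surfaces before the evaluation starts.
    """
    for path in (data_path, word2vec_model_path):
        if not Path(path).is_file():
            raise FileNotFoundError(f"input file not found: {path}")
    return [
        sys.executable, "-m", "sciknoweval.eval",
        "--data_path", str(data_path),
        "--word2vec_model_path", str(word2vec_model_path),
        "--gen_evaluator", gen_evaluator,
        "--output_path", str(output_path),
    ]


def run_eval(data_path, word2vec_model_path, output_path,
             gen_evaluator="gpt-4o"):
    """Run the evaluation as a subprocess.

    Checks OPENAI_API_KEY first, because the README exports it before each run.
    """
    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    cmd = build_eval_command(data_path, word2vec_model_path, output_path,
                             gen_evaluator)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems when a path contains spaces.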

**Option 3: Direct Script Execution (Legacy)**

```bash
data_path="your/model/predictions.json"
@@ -161,7 +198,7 @@ gen_evaluator="gpt-4o" # the correct model name in OpenAI
output_path="path/to/your/output.json"

export OPENAI_API_KEY="YOUR_API_KEY"
-python eval.py \
+python sciknoweval/eval.py \
--data_path $data_path \
--word2vec_model_path $word2vec_model_path \
--gen_evaluator $gen_evaluator \