Commit 288d671

docs: update README
1 parent 2a75c3e commit 288d671

1 file changed

Lines changed: 23 additions & 10 deletions

README.md

@@ -136,18 +136,31 @@ For any questions, please check [FAQ](https://github.com/open-sciencelab/GraphGe
 TRAINEE_BASE_URL=your_base_url_for_trainee_model
 TRAINEE_API_KEY=your_api_key_for_trainee_model
 ```
-2. (Optional) If you want to modify the default generated configuration, you can edit the content of the configs/graphgen_config.yaml file.
+2. (Optional) Customize generation parameters in `graphgen/configs/` folder.
+
+Edit the corresponding YAML file, e.g.:
+
 ```yaml
-# configs/aggregated_config.yaml
-# Example configuration
-input_data_type: "raw"
-input_file: "resources/input_examples/raw_demo.jsonl"
-# more configurations...
+# configs/cot_config.yaml
+input_data_type: raw
+input_file: resources/input_examples/raw_demo.jsonl
+output_data_type: cot
+tokenizer: cl100k_base
+# additional settings...
 ```
-3. Run the generation script
-```bash
-bash scripts/generate/generate_aggregated.sh
-```
+
+3. Generate data
+
+Pick the desired format and run the matching script:
+
+| Format       | Script to run                                  | Notes                                                             |
+| ------------ | ---------------------------------------------- |-------------------------------------------------------------------|
+| `cot`        | `bash scripts/generate/generate_cot.sh`        | Chain-of-Thought Q\&A pairs                                       |
+| `atomic`     | `bash scripts/generate/generate_atomic.sh`     | Atomic Q\&A pairs covering basic knowledge                        |
+| `aggregated` | `bash scripts/generate/generate_aggregated.sh` | Aggregated Q\&A pairs incorporating complex, integrated knowledge |
+| `multi-hop`  | `bash scripts/generate/generate_multihop.sh`   | Multi-hop reasoning Q\&A pairs                                    |
+
+
 4. Get the generated data
 ```bash
 ls cache/data/graphgen
