Commit c7198d8

docs: update README
1 parent e3a11c9 commit c7198d8

2 files changed

Lines changed: 4 additions & 2 deletions

README.md

Lines changed: 2 additions & 1 deletion
@@ -61,13 +61,14 @@ Furthermore, GraphGen incorporates multi-hop neighborhood sampling to capture co
 After data generation, you can use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) and [xtuner](https://github.com/InternLM/xtuner) to finetune your LLMs.
 
 ## 📌 Latest Updates
+- **2026.02.04**: We support HuggingFace Datasets as input data source for data generation now.
 - **2026.01.15**: **LLM benchmark synthesis** now supports single/multiple-choice & fill-in-the-blank & true-or-false—ideal for education 🌟🌟
 - **2025.12.26**: Knowledge graph evaluation metrics about accuracy (entity/relation), consistency (conflict detection), structural robustness (noise, connectivity, degree distribution)
-- **2025.12.16**: Added [rocksdb](https://github.com/facebook/rocksdb) for key-value storage backend and [kuzudb](https://github.com/kuzudb/kuzu) for graph database backend support.
 
 <details>
 <summary>History</summary>
 
+- **2025.12.16**: Added [rocksdb](https://github.com/facebook/rocksdb) for key-value storage backend and [kuzudb](https://github.com/kuzudb/kuzu) for graph database backend support.
 - **2025.12.16**: Added [vllm](https://github.com/vllm-project/vllm) for local inference backend support.
 - **2025.12.16**: Refactored the data generation pipeline using [ray](https://github.com/ray-project/ray) to improve the efficiency of distributed execution and resource management.
 - **2025.12.1**: Added search support for [NCBI](https://www.ncbi.nlm.nih.gov/) and [RNAcentral](https://rnacentral.org/) databases, enabling extraction of DNA and RNA data from these bioinformatics databases.

README_zh.md

Lines changed: 2 additions & 1 deletion
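@@ -62,14 +62,15 @@ GraphGen first builds a fine-grained knowledge graph from the source text, then uses
 After data generation, you can use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) and [xtuner](https://github.com/InternLM/xtuner) to fine-tune large language models.
 
 ## 📌 Latest Features
+- **2026.02.04**: Support reading HuggingFace datasets directly as input for data generation
 - **2026.01.15**: Synthesize domain-specific evaluation data (single-choice, multiple-choice, fill-in-the-blank, and true/false questions) 🌟🌟
 - **2025.12.26**: Introduced knowledge graph evaluation metrics, including accuracy evaluation (entity/relation extraction quality), consistency evaluation (conflict detection), and structural robustness evaluation (noise ratio, connectivity, degree distribution)
-- **2025.12.16**: Support [rocksdb](https://github.com/facebook/rocksdb) as the key-value storage backend and [kuzudb](https://github.com/kuzudb/kuzu) as the graph database backend
 
 
 <details>
 <summary>Update history</summary>
 
+- **2025.12.16**: Support [rocksdb](https://github.com/facebook/rocksdb) as the key-value storage backend and [kuzudb](https://github.com/kuzudb/kuzu) as the graph database backend.
 - **2025.12.16**: Support [vllm](https://github.com/vllm-project/vllm) as the local inference backend.
 - **2025.12.16**: Refactored the data generation pipeline using [ray](https://github.com/ray-project/ray), improving the efficiency of distributed execution and resource management.
 - **2025.12.1**: Added search support for the [NCBI](https://www.ncbi.nlm.nih.gov/) and [RNAcentral](https://rnacentral.org/) databases; DNA and RNA data can now be extracted from these bioinformatics databases.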
