#847 chore(deps): Add OceanBase integration example and documentation #1080
- Added pymysql to the dependency list in pyproject.toml
- Supports subsequent database-related operations
- Keeps dependencies consistent and complete
- Add OceanBaseMetricsLogger class for metrics persistence
  - Database connection with environment variable support
  - Table creation with proper indexes
  - Metric insertion with error handling
  - Query examples for verification
- Add comprehensive quickstart guide
  - OceanBase introduction and Docker deployment
  - Connection configuration and troubleshooting
  - Two integration approaches (direct + custom)
  - Common SQL queries and performance optimization
- Add pymysql dependency to pyproject.toml
- Update README with tutorial link

Closes: OceanBase integration feature request
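The commit message mentions a database connection configured through environment variables. A minimal sketch of how such settings might be read is shown here. The variable names (`OCEANBASE_HOST` and so on), the default values, and the helper `load_oceanbase_config` are hypothetical and are not taken from the PR; the dictionary keys match keyword arguments accepted by `pymysql.connect`.

```python
import os


def load_oceanbase_config(env=None):
    """Build connection keyword arguments from environment variables.

    Every variable name and default below is an illustrative assumption,
    not the configuration used by the PR's OceanBaseMetricsLogger.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("OCEANBASE_HOST", "127.0.0.1"),
        # 2881 is assumed here as the MySQL-protocol port of a local deployment.
        "port": int(env.get("OCEANBASE_PORT", "2881")),
        "user": env.get("OCEANBASE_USER", "root"),
        "password": env.get("OCEANBASE_PASSWORD", ""),
        "database": env.get("OCEANBASE_DATABASE", "test"),
    }
```

Under these assumptions, a connection could be opened with `pymysql.connect(**load_oceanbase_config())`, which keeps credentials out of the source file.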
Replace `Optional[X]` with `X | None` syntax (Python 3.10+) in oceanbase_example.py to comply with ruff UP045 rule.

Changes:
- Remove unused `typing.Optional` import
- Update connection type annotation
- Update insert_metric parameter annotations
- Introduce typing.Optional to replace the union type annotations
- Change pymysql.Connection | None to Optional[pymysql.Connection]
- Change float | None parameter types to Optional[float]
- Improve the type consistency and readability of the code
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances AReaL by providing an integration example and documentation for OceanBase, a distributed database suitable for storing large-scale training metrics. It also introduces comprehensive documentation for several core modules, offering a clearer understanding of the codebase structure and functionality.
    loss FLOAT,
    reward FLOAT,
    timestamp DATETIME NOT NULL,
    PRIMARY KEY (id, timestamp)

    logger.info("插入示例训练指标...")  # "Inserting sample training metrics..."
    for step in range(1, 6):
        metrics_logger.insert_metric(
            experiment_name="gsm8k_grpo_demo",
            step=step * 100,
            loss=1.5 - step * 0.2,
            reward=0.5 + step * 0.1,
        )
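Inserting metrics one row at a time in a loop is inefficient and causes a large number of database network round trips. Consider using a batch insert (executemany) to improve performance, which is especially important for logging workloads. Your document oceanbase_quickstart.md also recommends batch inserts as a performance optimization.
The suggestion below implements the batch insert directly in the main function. For better encapsulation, consider wrapping this logic in a new method of the OceanBaseMetricsLogger class (for example, insert_metrics_batch).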
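    # Log message: "Generating and batch-inserting sample training metrics..."
    logger.info("生成并批量插入示例训练指标...")
    metrics_to_insert = [
        (
            "gsm8k_grpo_demo",
            step * 100,
            1.5 - step * 0.2,
            0.5 + step * 0.1,
            datetime.now(),
        )
        for step in range(1, 6)
    ]
    if metrics_to_insert and metrics_logger.connection:
        with metrics_logger.connection.cursor() as cursor:
            insert_sql = """
                INSERT INTO training_metrics
                (experiment_name, step, loss, reward, timestamp)
                VALUES (%s, %s, %s, %s, %s)
            """
            # One executemany call sends all rows instead of one call per row.
            cursor.executemany(insert_sql, metrics_to_insert)
        # Note: pymysql does not autocommit by default, so a call to
        # metrics_logger.connection.commit() is needed here unless the
        # connection was opened with autocommit=True.
        # Log message: "Batch-inserted N metrics successfully"
        logger.info(f"批量插入 {len(metrics_to_insert)} 条指标成功")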
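The review comment suggests wrapping the batch insert in a method such as `insert_metrics_batch`. A minimal sketch of that idea follows, written as a standalone function so it can be read on its own. It assumes a pymysql-style DB-API connection whose cursor works as a context manager, and the `training_metrics` columns named in the suggestion; the helper `build_demo_rows`, the rollback handling, and the return value are illustrative assumptions, not part of the PR.

```python
from datetime import datetime

INSERT_SQL = """
    INSERT INTO training_metrics
        (experiment_name, step, loss, reward, timestamp)
    VALUES (%s, %s, %s, %s, %s)
"""


def insert_metrics_batch(connection, rows):
    """Insert many metric rows with a single executemany call.

    `connection` is assumed to be a pymysql-style connection. Returns the
    number of rows submitted, or 0 when there is nothing to do.
    """
    rows = list(rows)
    if connection is None or not rows:
        return 0
    try:
        with connection.cursor() as cursor:
            cursor.executemany(INSERT_SQL, rows)
        # Commit explicitly, since autocommit may be disabled.
        connection.commit()
    except Exception:
        # Undo the partial batch and let the caller decide how to report it.
        connection.rollback()
        raise
    return len(rows)


def build_demo_rows(now=None):
    """Build the five demo rows used in the review suggestion (hypothetical helper)."""
    now = now or datetime.now()
    return [
        ("gsm8k_grpo_demo", step * 100, 1.5 - step * 0.2, 0.5 + step * 0.1, now)
        for step in range(1, 6)
    ]
```

Inside the logger class, the same body could become a method that passes `self.connection`, so that `main` only calls `metrics_logger.insert_metrics_batch(build_demo_rows())`.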
This pull request has been automatically marked as stale because it has not had recent activity within the last 14 days. Please add a comment or push new commits to keep it active. Thank you for your contribution!
Hi @flying-dragon-ai, the OceanBase logger would be a great feature! Could you please: