Commit 224560f
DeepFinance Enhancements (#6)
* feat(finworld): Added AgentScope learning protocol and OpenJudge evaluation functionality to the FinWorld task.
- Added the ExampleAgentScopeLearnProtocol class to implement the AgentScope execution flow for multi-turn interactions.
- Integrated semaphore control to manage the parallelism of environment calls, improving environment stepping performance.
- Implemented a mechanism for detecting context overflows and quickly terminating during environment interactions to prevent blocking.
- Added a finworld.yaml configuration file to define project training and rollout parameters.
- Added the FinWorldJudgeByOpenJudge class, integrating multiple evaluators including RM Gallery and OpenJudge (@Haoran).
- Implemented task output conversion, asynchronous calls, and retries to keep evaluation stable.
- Added weight normalization to manage each evaluator's contribution, merging the scores to compute the final reward and success determination.
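
The weight normalization step in the last bullet can be illustrated with a minimal sketch. The function name, the evaluator keys, and the `success_threshold` cutoff are hypothetical and are not taken from `FinWorldJudgeByOpenJudge`; the sketch only shows one common way to normalize weights before merging scores.

```python
from typing import Dict, Tuple


def merge_evaluator_scores(
    scores: Dict[str, float],
    weights: Dict[str, float],
    success_threshold: float = 0.5,  # hypothetical cutoff, not from the commit
) -> Tuple[float, bool]:
    """Combine per-evaluator scores into one reward using normalized weights."""
    # Only evaluators that returned a score and have a positive weight take part.
    active = {name: w for name, w in weights.items() if name in scores and w > 0}
    total = sum(active.values())
    if total == 0:
        return 0.0, False
    # Normalize so the active weights sum to 1, then take the weighted sum.
    reward = sum(scores[name] * w / total for name, w in active.items())
    return reward, reward >= success_threshold
```

Normalizing over the active evaluators only means that a missing evaluator result does not silently drag the reward toward zero.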
* Precommit fix (#4)
* fix end of files
* autoflake import fix
* add mypy check
* fix test bench import
* refactor(finworld): Replace agent protocol and unify configuration updates
- Renamed ExampleAgentScopeLearnProtocol to ExampleDeepResearchProtocol and modified the execute method signature.
- Unified the parameter name of the model tuner to `tuner` and its related attribute references.
- Optimized the multi-turn interaction step configuration, changing it to use `tuner.config.ajet.rollout.multi_turn.max_steps`.
- Modified the context overflow judgment logic to prevent tool call blocking.
- Updated the finworld.yaml configuration, replacing astune with ajet-related configurations, and adjusted the workflow protocol and environment parameters.
- Modified the default environment variable values and log saving paths in finworld_judge.py.
- Added and improved multi-machine and single-machine startup scripts, supporting dynamic generation of MCP configuration and environment variable loading.
- Added the finworld_single.yaml template to adapt to single-machine training configurations.
- Adjusted the key reference for multi-turn step configuration in ma_deepresearch.py, using the ajet configuration path.
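
As a rough illustration of how a step limit, a context overflow check, and semaphore-bounded environment calls can fit together in a multi-turn loop, here is a hedged sketch. `run_episode`, `env_step`, `policy`, and the character-based context budget are all assumptions made for this example; the actual `ExampleDeepResearchProtocol` reads its limit from `tuner.config.ajet.rollout.multi_turn.max_steps` and may measure context differently.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_episode(
    env_step: Callable[[str], Awaitable[str]],  # hypothetical async env call
    policy: Callable[[List[str]], str],         # hypothetical action chooser
    env_semaphore: asyncio.Semaphore,
    max_steps: int,
    max_context_chars: int,                     # assumed budget, in characters
) -> dict:
    """Run one multi-turn episode with a step cap and an overflow check."""
    history: List[str] = []
    for step in range(max_steps):
        # Stop early instead of issuing a call that can no longer fit.
        if sum(len(message) for message in history) > max_context_chars:
            return {"steps": step, "stop_reason": "context_overflow"}
        action = policy(history)
        # The semaphore caps how many environment calls run concurrently.
        async with env_semaphore:
            observation = await env_step(action)
        history.extend([action, observation])
    return {"steps": max_steps, "stop_reason": "max_steps"}
```

Checking the budget before the environment call, rather than after, is what keeps an overflowing episode from blocking on a tool call whose result would be discarded anyway.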
* feat(finworld): Added FinWorld training environment configuration scripts and templates
- Added bash startup scripts for multi-machine, multi-GPU training, supporting dynamic configuration generation and environment variable import.
- Implemented training configuration file templates, supporting automatic injection of various weight parameters and model paths.
- Adjusted the default request timeout of EnvClient from 30 seconds to 300 seconds to accommodate long training requests.
- Added a new finworld example directory and related documentation, improving the example project structure.
* refactor(utils): Remove unused extract and compute functions `extract_tool_stats_from_cmts`
* refactor(finworld): Replace the old model with OpenJudge, update evaluation configuration and scripts
- Replaced model initialization in FinWorldJudgeByOpenJudge with the `_init_openjudge_model` method
- Read Judge model parameters from the configuration file first, using environment variables as a fallback
- Optimized RM Gallery initialization, using configuration-first logic, and improved exception stack trace printing
- Cleaned up and removed the old `_init_model` singleton method and related code
- Updated the example startup script `ajet_finworld.sh`, adding OPENJUDGE_LLM and RM_LLM configurations
- Modified YAML templates and configuration files to unify the structure and field naming of Judge configuration items
- Deleted the outdated `cc_rm4_res2cit2fai2_30b.sh` script
- Adjusted the `env_service` startup path to improve environment activation compatibility
- Adjusted script log output format and content to enhance the clarity of configuration parameter printing
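
The configuration-first lookup with an environment-variable fallback can be sketched as follows. The helper name `resolve_judge_setting` and the flat config mapping are assumptions for illustration; only the `OPENJUDGE_LLM` variable name and the `openjudge_llm` key appear in this commit message.

```python
import os
from typing import Any, Mapping, Optional


def resolve_judge_setting(
    config: Mapping[str, Any],
    key: str,
    env_var: str,
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the config value if set, else the environment variable, else default."""
    value = config.get(key)
    # Treat a missing key, None, and an empty string as "not configured".
    if value not in (None, ""):
        return str(value)
    return os.environ.get(env_var, default)
```

For example, `resolve_judge_setting(cfg, "openjudge_llm", "OPENJUDGE_LLM")` would return the YAML value when present and only consult the environment otherwise.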
* feat(task_reader): Support data reading of type jsonl_with_env_service
- Added the jsonl_with_env_service type, which allows loading data from jsonl files while calling tools via env_service.
- Extended ResourceKeeper to handle the creation and release logic of environment instances for jsonl_with_env_service.
- Maintained the env_service type logic, calling create_instance to register instances and initializing them using init_messages from the jsonl file.
- Added an example protocol, ExampleDeepResearchProtocol, to implement multi-turn interaction and environment call coordination.
- Provided training scripts and YAML configuration templates for finworld, supporting the jsonl_with_env_service mode training environment.
- Optimized scripts to support multi-node multi-GPU training, including environment variables and Ray cluster configuration.
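
To make the `jsonl_with_env_service` idea more concrete, the sketch below parses jsonl lines into task records that carry `init_messages`. The function name and the `task_id` field are hypothetical; the real reader and `ResourceKeeper` also create and release environment instances, which is not shown here.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def parse_jsonl_tasks(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one task per non-empty jsonl line, keeping its init_messages."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        yield {
            # "task_id" is an assumed field; fall back to the line number.
            "task_id": record.get("task_id", f"line-{line_no}"),
            "init_messages": record.get("init_messages", []),
        }
```

Passing an open file object works as well, since iterating a text file yields its lines.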
* feat(core): add finworld task reader support to framework
* feat(finworld): implement specialized data reader and openjudge-based grading logic
* refactor(finworld): optimize configuration templates and prompt engineering
* chore(finworld): update launch scripts and add variant experiment scripts
* feat(finworld): Added support for multi-machine, multi-GPU training scripts and configuration templates
* chore(git): ignore finworld/yaml/*
* fix(metrics): Fix and enhance the compatibility and debugging output of the metrics update logic
- Modified the `update_metrics` function, adding a `prefix` parameter to distinguish between training and validation metrics.
- Adjusted the data source for extracting `reward_stats` and `tool_stats`, migrating from `workflow_metadata` to `log_metrics`.
- Added debug printing to output the `log_metrics` content and metric key names at key steps for easier troubleshooting.
- Used the appropriate prefix when calling `update_metrics` in `trainer_verl.py`, and added multiple debug prints.
- Modified `WorkflowOutput` to place `tool_stats` and `reward_stats` into the `log_metrics` field.
- Removed redundant and deprecated code for extracting `reward_stats` and calculation functions.
- Added debug information output to the `finworld` and `finworld_judge` modules to track log metrics and scoring data.
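
A minimal sketch of prefix-aware metric collection is shown below. It is not the signature of the real `update_metrics`; the function name, the `group/name` key layout, and the `"val/"` prefix are assumptions chosen to show how a prefix can separate training metrics from validation metrics read out of `log_metrics`.

```python
from typing import Dict, Mapping


def update_metrics_sketch(
    metrics: Dict[str, float],
    log_metrics: Mapping[str, Mapping[str, float]],
    prefix: str = "",
) -> Dict[str, float]:
    """Copy reward_stats and tool_stats from log_metrics under a key prefix."""
    for group in ("reward_stats", "tool_stats"):
        # A missing group is simply skipped rather than treated as an error.
        for name, value in log_metrics.get(group, {}).items():
            metrics[f"{prefix}{group}/{name}"] = float(value)
    return metrics
```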
* fix(metrics): Remove debug prints and synchronize reward statistics
- Removed debug print statements before and after the `update_metrics` call in `trainer_verl.py`
- Removed debug print statements related to the `log_metrics` key in `finworld.py`
- Removed debug print statements before updating `metadata_stats` in `finworld_judge.py`
- Added logic in `general_runner.py` to synchronize `reward_stats` from `metadata` to `log_metrics` after the judge calculation
- Cleaned up debug print statements within `update_metrics` in `metric_helper`, improving code readability.
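
The synchronization of `reward_stats` from `metadata` into `log_metrics` might look roughly like this. The function name and the plain-dict data shapes are assumptions; the actual logic lives in `general_runner.py` and operates on the framework's own output objects.

```python
from typing import Any, Dict


def sync_reward_stats(
    metadata: Dict[str, Any], log_metrics: Dict[str, Any]
) -> Dict[str, Any]:
    """Copy reward_stats from judge metadata into log_metrics, if present."""
    reward_stats = metadata.get("reward_stats")
    if reward_stats:
        # Copy so later edits to metadata do not leak into the logged metrics.
        log_metrics["reward_stats"] = dict(reward_stats)
    return log_metrics
```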
* chore: "Stop tracking existing yaml files in tutorial directory"
* fix(task_runner): Synchronize reward_stats to log_metrics
feat(tutorial): Added FinWorld multi-machine multi-GPU training startup script
* refactor(script): Refactored the finworld training script, integrating configuration and startup processes.
* refactor(deep_finance): Replace and remove finworld-related implementations
- Switched the example directory from example_finworld to example_deep_finance
- Modified startup parameters and logic to support deep_finance, replacing the finworld option
- Replaced finworld_reader with deep_finance_reader in the task reader
- Adjusted environment client configuration in resource management, using deep_finance instead of finworld-related checks
- Updated reward metric tool documentation to support deep_finance
- Deleted finworld-related configuration files, scripts, code, and evaluation modules, cleaning up leftover files and scripts
- Replaced the keyword "finworld" with "deep_finance" in comments and logs
* refactor(deepfinance): Rename and unify DeepFinance module and config references
- Replace all "finworld" and "deep_finance" names with the unified "deepfinance" format.
- Modify command-line arguments to `--with-deepfinance` for consistency.
- Adjust the class name in `task_reader` from `deep_financeReader` to `DeepFinanceReader`.
- Update the documentation description and file name of the `metric_helper` module to DeepFinance.
- Modify environment variables and configuration paths in the example script `deep_finance.sh` to use the `DEEPFINANCE` prefix.
- Update `judge_protocol` to `DeepFinanceJudgeByOpenJudge` in the `deep_finance.yaml` configuration.
- Refactor the `FinWorldJudgeByOpenJudge` class in `deep_finance_judge.py` to `DeepFinanceJudgeByOpenJudge`.
- Rename the `FinworldReader` class in `deep_finance_reader.py` to `DeepFinanceReader`.
- Modify the debug log identifier and corresponding environment variable name to `DEEPFINANCE_DEBUG`.
- Update the evaluation protocol in the `deep_finance_template.yaml` template to `DeepFinanceJudgeByOpenJudge`.
- Ensure that internal references and comments in all modules are updated to use DeepFinance and deepfinance-related names.
* refactor(tutorial): Optimize dynamic generation logic for configuration file paths
* fix(deep_finance): argparse: with-deepfinance
* fix(tutorial): Fixed issues with multi-machine training environment variable settings
* fix(env): Corrected the assignment logic for reward and info when returning environment state
- Corrected the `env_output` return value structure in `BaseGymEnv` to ensure correct assignment of `reward` and `info` fields.
- Removed `RefJudge` and `StructureJudge` related metric calculations and statistics from `reward_metric_helper`.
- Cleaned up redundant code in `reward_metric_helper`, removing invalid comments and statistical items.
- Modified `save_trajectory_as_json` to always print trajectory saving confirmation information.
- Corrected log comments in `example_deep_finance` to avoid meaningless log output.
- Added the `save_trajectory_as_json_file` configuration item to `deep_finance_template.yaml` to support trajectory saving functionality.
* chore(config): Update example_deep_finance configuration and clean up files
- Added a new ignore rule for config file paths in .gitignore
- Deleted the automatically generated mcp_finance_tool_generated.json file in example_deep_finance
- Refactored the deep_finance.yaml configuration file, adjusting project and experiment names
- Reorganized Judge configuration, clarifying openjudge_llm and rm_llm models
- Optimized model paths and training parameter configurations, adding parallel and batch processing settings
- Adjusted data reading methods and training/validation set path placeholders
- Reduced GPU memory usage ratio for rollout to 0.8
- Updated the default save directory path for the trainer to a placeholder variable
- Cleaned up unused and commented-out code to improve configuration file conciseness
* refactor(metric): Optimize tool metric calculation and data saving logic
- Corrected the data source field for timeline data used during trajectory saving.
- Removed redundant fields in tool execution time, cache hit rate, and error rate statistics.
- Updated .gitignore to add ignore rules for the example script directory.
- Removed unnecessary debugging information from logs to reduce log noise.
- Adjusted log printing in the multi-turn interaction execution process to simplify output content.
- Streamlined log code for environment observation and termination checks to improve code readability.
* fix(metric_helper): fix tool cache metric
* fix little bug
* fix(utils): Suppress httpx AsyncClient.aclose() exception warnings
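
One generic way to keep a failing `aclose()` from surfacing as a warning is to wrap the call and log at debug level. This is only a sketch of the general pattern: the helper name is hypothetical, it accepts any object with an async `aclose()` method, and it is not necessarily how this commit suppresses the httpx warning.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def close_quietly(client) -> bool:
    """Await client.aclose() and swallow ordinary exceptions.

    Returns True if the close succeeded, False if an error was suppressed.
    """
    try:
        await client.aclose()
        return True
    except Exception as exc:  # cancellation is not an Exception, so it still propagates
        logger.debug("Ignoring error while closing client: %s", exc)
        return False
```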
* Translate comments to English
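* feat: Support service name prefixes
- Add `--prefix` argument support to the launcher
- Implement the prefix logic in the pty_launch function
- Update the deep_finance.sh script to use the prefix feature
- Allow multiple service instances to run in the same environment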
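
A service name prefix of this kind could be wired up along the following lines. Only the `--prefix` flag comes from the commit message; the parser layout, the `service_name` helper, and the `-` separator are assumptions for illustration and do not reflect the real `pty_launch` implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny launcher-style parser with an optional --prefix flag."""
    parser = argparse.ArgumentParser(description="launcher sketch")
    parser.add_argument(
        "--prefix",
        default="",
        help="optional prefix used to tell service instances apart",
    )
    return parser


def service_name(base: str, prefix: str = "") -> str:
    """Return the base name, prefixed when a prefix is given."""
    # The "-" separator is an assumption; the real launcher may differ.
    return f"{prefix}-{base}" if prefix else base
```

With distinct prefixes, two instances of the same service get distinct names and can coexist in one environment.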
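* fix: Improve MultiAgent message content parsing logic
- Support message content blocks in tool_result format
- Improve handling of non-text content: continue processing the remaining items instead of skipping the whole message
- Add handling for the tool_use type (skipped, since it is already handled via the tool_calls field)
- Refine code structure and comments to improve readability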
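
The block-by-block parsing described in this list can be sketched as below. The function name is hypothetical, and reading the tool result from an `output` field is an assumption about the block layout; the point is that unknown blocks are skipped individually instead of discarding the whole message.

```python
from typing import Any, Dict, List, Union

Content = Union[str, List[Dict[str, Any]], None]


def extract_text(content: Content) -> str:
    """Flatten message content (a string or a list of typed blocks) into text."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    parts: List[str] = []
    for block in content:
        block_type = block.get("type")
        if block_type == "text":
            parts.append(str(block.get("text", "")))
        elif block_type == "tool_result":
            # Assumed field name: "output" may be a string or nested blocks.
            parts.append(extract_text(block.get("output", "")))
        elif block_type == "tool_use":
            continue  # already represented via the tool_calls field
        else:
            continue  # skip only this block, keep processing the rest
    return "\n".join(part for part in parts if part)
```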
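* fix: Optimize DeepFinance judge logic and configuration
- Fix tool_stats extraction logic to correctly read data from log_metrics
- Add debug output for penalty terms
- Enable tool calls (force_disable_toolcalls: False)
- Ensure reward calculation accuracy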
* chore(deps): bump agentscope from 1.0.7 to 1.0.8
* fix(metric_helper): correct trajectory save path and add tool call metric
- Change trajectory save directory from "ctx_trackers" to "trajectory" to organize files better
- Add recording of tool call counts alongside error rates in tool metrics
- Update experiment suffix in deep finance example script for clearer naming convention
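
Recording call counts next to error rates can be illustrated with a small aggregation sketch. The record layout (`tool` and `error` keys) and the output key names are assumptions for this example and are not the fields used by `metric_helper`.

```python
from typing import Dict, Iterable, Mapping


def summarize_tool_calls(
    calls: Iterable[Mapping[str, object]],
) -> Dict[str, Dict[str, float]]:
    """Aggregate per-tool call counts and error rates from call records."""
    counts: Dict[str, Dict[str, int]] = {}
    for call in calls:
        entry = counts.setdefault(str(call["tool"]), {"calls": 0, "errors": 0})
        entry["calls"] += 1
        if call.get("error"):
            entry["errors"] += 1
    # Every tool listed here has at least one call, so the division is safe.
    return {
        name: {
            "call_count": float(entry["calls"]),
            "error_rate": entry["errors"] / entry["calls"],
        }
        for name, entry in counts.items()
    }
```

Reporting the count alongside the rate helps distinguish a tool that failed once in two calls from one that failed fifty times in a hundred.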
* revise message parsing
---------
Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>
Co-authored-by: Qingxu Fu <qingxu.fu@outlook.com>
Co-authored-by: qingxu.fu <fuqingxu.fqx@alibaba-inc.com>
1 parent 20e4296 commit 224560f
File tree
9 files changed, +31 -7 lines changed
- ajet
- context_tracker
- utils
- metric_helper
- tutorial/example_deep_finance
- yaml_template