feat: add A-shares support via AKShare data vendor#797
feat: add A-shares support via AKShare data vendor#797yangbenchn-cmyk wants to merge 21 commits into
Conversation
… with questionary.text
There was a problem hiding this comment.
Code Review
This pull request integrates AKShare as a new data vendor, providing support for A-shares (Shanghai/Shenzhen) and Hong Kong stocks. Key additions include a new akshare_data module for fetching OHLCV data, fundamentals, financial statements, and news, along with logic in the CLI and trading graph to resolve Chinese stock names to their respective tickers. Feedback from the review highlights several areas for improvement: adhering to PEP 8 by moving inline imports to the top of modules, expanding ticker resolution logic to support the Beijing Stock Exchange, ensuring date validation is handled within existing error-handling blocks, and optimizing technical indicator calculations by replacing inefficient string-based filtering with DataFrame indexing.
| raw = ticker.strip() or "SPY" | ||
|
|
||
| # Resolve Chinese stock names | ||
| from tradingagents.dataflows.akshare_data import _is_chinese_name, resolve_ticker_name |
There was a problem hiding this comment.
Avoid using inline imports within functions unless it is strictly necessary to resolve a circular dependency. Moving this import to the top of the file is recommended to adhere to PEP 8 and improve code clarity.
References
- Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants. (link)
| if code.startswith('6'): | ||
| return f"{code}.SS" | ||
| else: | ||
| return f"{code}.SZ" |
There was a problem hiding this comment.
The logic for determining the exchange suffix is too simplistic. It assumes any ticker not starting with '6' is a Shenzhen stock ('.SZ'), which incorrectly handles Beijing Stock Exchange tickers (starting with '4' or '8', typically suffixed with '.BJ'). It is better to explicitly check for Shenzhen prefixes and handle Beijing stocks separately.
| if code.startswith('6'): | |
| return f"{code}.SS" | |
| else: | |
| return f"{code}.SZ" | |
| if code.startswith('6'): | |
| return f"{code}.SS" | |
| elif code.startswith(('0', '3')): | |
| return f"{code}.SZ" | |
| elif code.startswith(('4', '8')): | |
| return f"{code}.BJ" | |
| return f"{code}.SZ" |
| import pandas as pd | ||
| from datetime import datetime | ||
|
|
||
| import akshare as ak |
There was a problem hiding this comment.
These imports should be moved to the top of the file to comply with PEP 8. Placing imports in the middle of a module makes it harder to track dependencies and can lead to confusion regarding the module's global namespace.
References
- Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants. (link)
| datetime.strptime(start_date, "%Y-%m-%d") | ||
| datetime.strptime(end_date, "%Y-%m-%d") |
There was a problem hiding this comment.
| while result_dt >= curr_dt - relativedelta(days=look_back_days): | ||
| date_str = result_dt.strftime("%Y-%m-%d") | ||
| matching = wrapped[wrapped["Date"].str.startswith(date_str)] | ||
| if not matching.empty: | ||
| val = matching[indicator].values[0] | ||
| lines.append(f"{date_str}: {val if not pd.isna(val) else 'N/A: Not a trading day (weekend or holiday)'}") | ||
| else: | ||
| lines.append(f"{date_str}: N/A: Not a trading day (weekend or holiday)") | ||
| result_dt -= relativedelta(days=1) |
There was a problem hiding this comment.
Iterating through the lookback period and performing a string-based filter on the DataFrame (wrapped["Date"].str.startswith(date_str)) in each iteration is inefficient. For better performance, consider setting the date as the index and using direct label-based lookup.
| while result_dt >= curr_dt - relativedelta(days=look_back_days): | |
| date_str = result_dt.strftime("%Y-%m-%d") | |
| matching = wrapped[wrapped["Date"].str.startswith(date_str)] | |
| if not matching.empty: | |
| val = matching[indicator].values[0] | |
| lines.append(f"{date_str}: {val if not pd.isna(val) else 'N/A: Not a trading day (weekend or holiday)'}") | |
| else: | |
| lines.append(f"{date_str}: N/A: Not a trading day (weekend or holiday)") | |
| result_dt -= relativedelta(days=1) | |
| wrapped = wrapped.set_index("Date") | |
| result_dt = curr_dt | |
| lines = [] | |
| while result_dt >= curr_dt - relativedelta(days=look_back_days): | |
| date_str = result_dt.strftime("%Y-%m-%d") | |
| if date_str in wrapped.index: | |
| val = wrapped.loc[date_str, indicator] | |
| if isinstance(val, pd.Series): | |
| val = val.iloc[0] | |
| lines.append(f"{date_str}: {val if not pd.isna(val) else 'N/A: Not a trading day'}") | |
| else: | |
| lines.append(f"{date_str}: N/A: Not a trading day") | |
| result_dt -= relativedelta(days=1) |
| successful node on a subsequent invocation with the same ticker+date. | ||
| """ | ||
| # Resolve Chinese stock names before running the pipeline | ||
| from tradingagents.dataflows.akshare_data import _is_chinese_name, resolve_ticker_name |
There was a problem hiding this comment.
Inline imports should be avoided where possible. Since akshare_data does not depend on trading_graph, there is no circular dependency risk. Moving this import to the top of the file is recommended for PEP 8 compliance.
References
- Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants. (link)
LLM agents were choosing inconsistent start_date ranges for get_stock_data calls, leading to 141 trading days for one ticker and 40 for another. Enforce a 365-day floor relative to end_date so indicators like 200 SMA are always calculable.
Summary
.SS, Shenzhen.SZ) and Hong Kong (.HK) stocks贵州茅台auto-resolves to600519.SS)config["data_vendors"]["core_stock_apis"] = "akshare"How it works
tradingagents/dataflows/akshare_data.pywith 9 tool functions matching existing vendor signaturesinterface.pyalongside yfinance and Alpha Vantageget_ticker()) and programmatic entry point (propagate())Test Plan
propagate('贵州茅台', '2024-05-31')with AKShare configured resolves name and fetches data