This tool is designed for the auto parts sales industry. It automatically logs in to multiple websites, queries orders, stores them in a local database, and exports the data to ERP systems.
- Multi-website Support: Query orders from multiple websites
- Multi-account Management: Log in with multiple accounts on the same website
- Captcha Handling: Automatically recognize and handle image captchas
- Duplicate Order Check: Prevent the same order from being stored twice
- Scheduled Execution: Run automatically on a configurable schedule
- Data Export: Export order data to CSV and Excel formats
- Database Backup: Back up the database automatically
- Detailed Logging: Record system status and error information
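
One common way to implement a duplicate order check on SQLite is a UNIQUE constraint combined with INSERT OR IGNORE. The sketch below illustrates that approach only; the table name, the column names, and the `init_db` / `save_orders` helpers are hypothetical and are not taken from this project's storage module.

```python
import sqlite3


def init_db(conn):
    # Hypothetical schema: the real table layout is not documented in this README.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS orders (
            website  TEXT NOT NULL,
            order_no TEXT NOT NULL,
            payload  TEXT,
            UNIQUE (website, order_no)
        )
        """
    )


def save_orders(conn, website, orders):
    """Insert orders, silently skipping ones already stored.

    Returns the number of rows that were actually inserted.
    """
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO orders (website, order_no, payload) VALUES (?, ?, ?)",
        [(website, o["order_no"], o.get("payload", "")) for o in orders],
    )
    conn.commit()
    # Rows rejected by the UNIQUE constraint do not count as changes.
    return conn.total_changes - before
```

Letting the database enforce uniqueness keeps the check correct even if two accounts on the same website return overlapping orders.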
- Backend: Python 3.8+
- Web Automation: Selenium + ChromeDriver
- Captcha Recognition: Tesseract OCR + OpenCV
- Database: SQLite
- Scheduler: APScheduler
- Data Processing: Pandas
- Configuration: YAML
# Enter project directory
cd auto_parts_order_tool
# Install dependencies
pip install -r requirements.txt

- Windows: Download and install Tesseract OCR
- macOS: brew install tesseract
- Linux: sudo apt-get install tesseract-ocr

- Download and install the Chrome browser
- Download the ChromeDriver version that matches your installed Chrome version
- Add ChromeDriver to the system PATH
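
After installing the external tools, it can help to confirm that they are reachable from the PATH before the first run. The following is a minimal sketch using only the standard library; the `find_missing_tools` helper is hypothetical and is not part of the tool, and it assumes the executables are named `tesseract` and `chromedriver`.

```python
import shutil

# Assumed executable names for the external dependencies listed above.
REQUIRED_TOOLS = ("tesseract", "chromedriver")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the tools that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]
```

Calling `find_missing_tools()` returns an empty list when both executables are found; any name it returns still needs to be installed or added to the PATH.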
Edit the config/config.yaml file to add the website configuration:
websites:
  - name: "Website A"
    url: "https://example.com/login"
    login_xpath:
      username: "//input[@id='username']"
      password: "//input[@id='password']"
      captcha: "//input[@id='captcha']"
      submit: "//button[@id='login-btn']"
    captcha_xpath: "//img[@id='captcha-image']"
    orders_xpath: "//table[@class='order-table']"
    accounts:
      - username: "user1"
        password: "pass1"
      - username: "user2"
        password: "pass2"

scheduler:
  enabled: true
  interval: 24  # hours
  time: "08:00"  # daily execution time

database:
  type: "sqlite"
  path: "./data/orders.db"

logging:
  level: "INFO"
  path: "./logs/auto_parts_order_tool.log"

python main.py --run

python main.py --scheduler

# Export to CSV
python main.py --export csv
# Export to Excel
python main.py --export excel

python main.py --backup

auto_parts_order_tool/
├── config/               # Configuration files
│   └── config.yaml       # Main configuration file
├── modules/              # Modules
│   ├── config/           # Configuration management
│   ├── web/              # Web automation
│   ├── captcha/          # Captcha handling
│   ├── storage/          # Data storage
│   ├── scheduler/        # Scheduler
│   ├── erp/              # ERP integration
│   └── log/              # Logging
├── data/                 # Database
├── logs/                 # Logs
├── exports/              # Exports
├── backups/              # Backups
├── temp/                 # Temporary files
├── main.py               # Main script
└── requirements.txt      # Dependencies
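
The layout above keeps the live database under data/ and backups under backups/. As an illustration of how a backup step could work, the sketch below copies a SQLite file with the standard library's online backup API. It is not the tool's actual backup code; the `backup_database` helper and the timestamped file name pattern are assumptions.

```python
import sqlite3
from datetime import datetime
from pathlib import Path


def backup_database(db_path, backup_dir):
    """Copy the SQLite database at db_path into backup_dir and return the new path."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: orders_YYYYMMDD_HHMMSS.db
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    target = backup_dir / f"orders_{stamp}.db"

    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(target)
    try:
        # Connection.backup copies the database consistently, even while it is in use.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return target
```

Using `Connection.backup` rather than a plain file copy avoids capturing a half-written database if a query run is in progress.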
- Captcha Recognition: The recognition success rate depends on captcha complexity. Complex captchas may require manual intervention.
- Website Structure Changes: If a website's structure changes, update the XPath selectors in the configuration file.
- Login Status: Websites may end sessions periodically. The tool logs in again automatically.
- Data Security: The configuration file contains account passwords. Keep it secure.
- Performance: Running multiple accounts at the same time may consume significant system resources.
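
Regarding the data security note, one possible way to keep passwords out of the configuration file is to let an environment variable take precedence over the value in config.yaml. The sketch below is only an illustration of that idea and is not a feature of the tool; the `ORDER_TOOL_PASSWORD_` prefix and both helper functions are hypothetical.

```python
import os
import re


def env_key(website, username):
    """Build a hypothetical environment variable name for one account."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", f"{website}_{username}").strip("_").upper()
    return f"ORDER_TOOL_PASSWORD_{slug}"


def resolve_password(website, account, environ=os.environ):
    """Prefer the environment variable; fall back to the password in the config entry."""
    return environ.get(env_key(website, account["username"]), account.get("password"))
```

With this scheme, the account `user1` on `Website A` would read its password from `ORDER_TOOL_PASSWORD_WEBSITE_A_USER1` when that variable is set.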
- Check that the username and password are correct
- Check that the XPath selectors are correct
- Check that the captcha can be recognized properly

- Check that orders_xpath is correct
- Check whether the website requires additional navigation steps
- Check whether the website has anti-crawling mechanisms

- Check that the scheduler is enabled in the configuration
- Check that the system time is correct
- Check the log files for error messages
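
Several of the checks above concern the configuration file, so part of them can be automated. The sketch below validates an already-parsed configuration dictionary (for example, the result of PyYAML's `yaml.safe_load`). The `validate_config` function is hypothetical, and treating every key shown in the example configuration as required is an assumption.

```python
# Keys taken from the example configuration; treating all of them as required is an assumption.
REQUIRED_LOGIN_KEYS = ("username", "password", "captcha", "submit")
REQUIRED_SITE_KEYS = ("captcha_xpath", "orders_xpath")


def validate_config(config):
    """Return a list of human-readable problems found in the parsed configuration."""
    problems = []
    websites = config.get("websites") or []
    if not websites:
        problems.append("no websites configured")
    for site in websites:
        name = site.get("name", "<unnamed>")
        login = site.get("login_xpath") or {}
        for key in REQUIRED_LOGIN_KEYS:
            if not login.get(key):
                problems.append(f"{name}: login_xpath.{key} is missing")
        for key in REQUIRED_SITE_KEYS:
            if not site.get(key):
                problems.append(f"{name}: {key} is missing")
        if not site.get("accounts"):
            problems.append(f"{name}: no accounts configured")
    if not (config.get("scheduler") or {}).get("enabled"):
        problems.append("scheduler.enabled is not true")
    return problems
```

An empty list means none of these structural problems were found; it does not confirm that the XPath selectors still match the live website.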
- v1.0.0: Initial version with basic features
MIT License