Releases: OpenCSGs/csghub-server
v2.1.0-ce
✨ New Features
- AI Gateway & OpenAI-Compatible APIs
- Audio Transcription: Added OpenAI-compatible `/v1/audio/transcriptions` support with multipart request rewriting and audio token usage counting.
- Text/Image-to-Video: Added `/v1/videos`, `/v1/videos/{id}`, and `/v1/videos/{id}/content` APIs with provider adapters for OpenAI-compatible endpoints, LightX2V, MiniMax, and Seedance.
- Model Routing: Added provider-aware model IDs, composite model ID parsing, upstream catalog support, session routing, fallback retry for chat completions, and per-upstream availability reporting.
- Usage Limits: Added Redis-backed per-window usage limit checks for configured upstream policies.
- API Key Auth: AI Gateway inference endpoints now require user/org API keys instead of normal login sessions.
- API Key Management
- Added namespace-scoped API key management for users and organizations, including create, list, update, delete, built-in key retrieval, and built-in key refresh APIs.
- Added user/org API key authentication context propagation for downstream services.
- Inference & Evaluation
- Added configurable model architecture checks for inference, including admin APIs to view and update inference architecture rules.
- Added SGLang-based Qwen3-Guard stream inference configuration and Docker assets.
- Added AMD EvalScope evaluation configuration and Docker image support.
- Updated vLLM and AMD vLLM inference images/configuration to v0.19.0.
- Repository, Tags & Skills
- Added automatic industry tag scanning for model and dataset repositories using configured LLM prompts.
- Added `source` tracking for repository tags and safer tag replacement/removal behavior.
- Added skill `mirror_from_saas` routes and skill clone URL fields.
- Added a dedicated skill tag category seed.
- Improved `SKILL.md` validation and added broader validator tests.
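The Usage Limits item above mentions Redis-backed per-window checks. As a rough illustration only, the sketch below shows one common way such a check can work: a fixed-window counter. It is not taken from the csghub-server code. A plain dictionary stands in for Redis (where the same pattern is usually an `INCR` plus `EXPIRE` on a per-window key), and `FixedWindowLimiter`, `limit`, `window_seconds`, and the key format are hypothetical names.

```python
import time


class FixedWindowLimiter:
    """Fixed-window usage counter; a dict stands in for Redis (assumption)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # (key, window index) -> units used. Redis would expire old keys;
        # this sketch simply leaves them in the dict.
        self.counters = {}

    def allow(self, key, cost=1):
        # All requests that fall in the same window share one counter.
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        used = self.counters.get(bucket, 0)
        if used + cost > self.limit:
            return False
        self.counters[bucket] = used + cost
        return True


# Example with a fake clock so the window boundary is easy to see.
now = [0.0]
limiter = FixedWindowLimiter(limit=2, window_seconds=60, clock=lambda: now[0])
results = [limiter.allow("org-a/upstream-1") for _ in range(3)]
now[0] = 61.0  # next window: counting starts again
results.append(limiter.allow("org-a/upstream-1"))
print(results)  # [True, True, False, True]
```

A fixed window is the simplest variant; whether the gateway uses fixed or sliding windows is not stated in these notes.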
🚀 Enhancements & Bug Fixes
- AI Gateway Reliability
- Fixed async model cache writes mutating live model lists.
- Fixed nil-user panic risk when listing CSGHub models.
- Improved sensitive-check whitelist lookup behavior.
- Made sensitive-check behavior configurable per LLM config where available.
- Improved SGLang Guard stream trace/session header handling.
- Resource Scheduling
- Added unavailable reasons to resource list responses.
- Added cluster offline/unavailable status handling.
- Prevented CPU-only workloads from being scheduled onto XPU nodes.
- Added replica-aware resource checks for Spaces.
- Repository & LFS
- Added repository size calculation trigger command.
- Added LFS pointer download nil-URL protection.
- Added LFS size checks before syncing files.
- Data Viewer
- Added file-size checks and optimizations before converting preview files.
- Finetune & Runner
- Fixed finetune jobs missing model and dataset revision data.
- Fixed potential runner panic paths in service/workflow handling.
- Proxy & Networking
- Set proxied `Host` headers without port where required.
- Sanitized logged authorization headers in internal proxy logs.
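The two Proxy & Networking items above can be pictured with a small sketch. It assumes a proxy that normalizes the `Host` value before forwarding and masks credentials before logging. The helper names `host_without_port` and `sanitize_headers`, and the set of headers treated as sensitive, are hypothetical and not taken from the project.

```python
# Headers whose values should never reach a log line (assumed set).
SENSITIVE_HEADERS = {"authorization", "proxy-authorization"}


def host_without_port(host):
    """Drop a trailing :port from a Host value, keeping IPv6 brackets."""
    if host.startswith("["):  # bracketed IPv6 literal such as [::1]:8080
        end = host.find("]")
        return host[: end + 1] if end != -1 else host
    name, sep, port = host.rpartition(":")
    return name if sep and port.isdigit() else host


def sanitize_headers(headers):
    """Return a copy that is safe to log: credential values are masked."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


print(host_without_port("models.example.com:8080"))  # models.example.com
print(sanitize_headers({"Authorization": "Bearer abc", "Accept": "*/*"}))
```

Returning a copy rather than mutating the input keeps the real request headers intact for the upstream call.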
🛠 Maintenance
- Upgraded vulnerable dependencies reported by Dependabot.
- Improved accounting metering retry limit configurability.
- Added and refreshed unit tests across AI Gateway, API keys, resource checks, tags, skills, LFS, and database stores.
- Improved CI/test stability and separated CI build cache behavior.
Full Changelog: v2.0.0-ce...v2.1.0-ce
v2.0.0-ce
✨ New Features
- Repository & Git Operations
- Branch Management: Added APIs for creating and deleting branches.
- Resumable LFS: Enabled file-level resumable uploads for large files and implemented MinIO LFS cleanup on model deletion.
- Performance: Optimized repository query speeds and pre-upload logic for large file counts.
- Scheduling & Resource Management
- Memory Unit Support: Enhanced resource checks to support "Mi" and "M" units for memory configuration.
- Knative Integration: Updated service namespaces to `knative-serving` and improved pod name RFC1123 compliance.
- AI Gateway & LLM Enhancements
- Multimodal Support: Added Text-to-Image capabilities and improved image generation proxying.
- Engine Upgrades: Upgraded SGLang to support Qwen 2.5, GLM 4.5, and Step-3; added `data-parallel-size` config for vLLM.
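For the Memory Unit Support item above, the distinction between "M" and "Mi" matters because, in the Kubernetes quantity convention, "M" is decimal and "Mi" is binary. The sketch below is a minimal illustration of that conversion, not the project's resource checker; `parse_memory` is a hypothetical helper and only the two units named in these notes are handled.

```python
import re

# Kubernetes quantity convention:
# "M" is decimal (10**6 bytes), "Mi" is binary (2**20 bytes).
UNIT_BYTES = {"M": 10**6, "Mi": 2**20}

_QUANTITY = re.compile(r"^(\d+)(Mi|M)$")


def parse_memory(quantity):
    """Convert a string such as "512Mi" into a number of bytes."""
    match = _QUANTITY.match(quantity.strip())
    if match is None:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    amount, unit = match.groups()
    return int(amount) * UNIT_BYTES[unit]


print(parse_memory("512Mi"))  # 536870912
print(parse_memory("512M"))   # 512000000
```

One "Mi" is about 4.9% larger than one "M", so a check that treats the two as equal can misjudge whether a workload fits.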
🚀 Enhancements & Bug Fixes
- Security & Authentication
- Implemented a JWT session-building middleware.
- Enabled API Key authentication for Runner APIs.
- Restricted `GET /v1/models` to allow unauthenticated access.
- Skills
- Added comprehensive APIs for Agent skills and enhanced tag support.
- Corrected skill tag scope bugs and validated `SKILL.md` file integrity.
- Observability & Reliability
- Cluster Metrics: Added cluster heartbeat metrics and automated alerting rules.
- Enhanced Logging: Integrated Loki for better finetuning logs and improved SMTP attachment handling using in-memory bytes.
- Stability: Fixed nil deployer panics in MCP proxies and improved wakeup timeout handling for remote clusters.
- Bug Fixes
- Resolved token usage recording issues for interrupted streaming chat requests.
- Fixed status mismatch bugs where deleted Knative services still appeared as "Running."
- Fixed repository total size calculation bugs and search prioritization for organization namespaces.
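The fix for token usage on interrupted streaming chat requests, listed above, addresses a general problem: if usage is only recorded after the last chunk, a client disconnect loses the count. The sketch below shows one way to guard against that with `try`/`finally` in a generator. It is an illustration under assumed data shapes; `relay_stream`, `record_usage`, and the chunk fields are hypothetical.

```python
def relay_stream(chunks, record_usage):
    """Yield chunk text to the client and record usage even on early exit."""
    tokens = 0
    try:
        for chunk in chunks:
            tokens += chunk["tokens"]
            yield chunk["text"]
    finally:
        # Runs on normal completion, on an upstream error, and when the
        # consumer closes the generator (for example, a client disconnect).
        record_usage(tokens)


recorded = []
upstream = [
    {"text": "Hel", "tokens": 1},
    {"text": "lo", "tokens": 1},
    {"text": "!", "tokens": 1},
]
stream = relay_stream(upstream, recorded.append)
first = next(stream)
second = next(stream)
stream.close()  # simulate the client going away mid-stream
print(recorded)  # [2]
```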
🛠 Maintenance
- Upgraded Docker Golang version to 1.26.0.
- Refactored metering consumer logic into a unified structure.
- Added extensive unit testing for accounting workflows.
New Contributors
- @sfeng1996 made their first contribution in #878
Full Changelog: v1.17.0-ce...v2.0.0-ce
v1.17.0-ce
✨ New Features
- Advanced Scheduling with Volcano
- Integrated Volcano Scheduler to enhance high-performance workload management.
- Added support for Volcano vGPU resource definitions and MIG (Multi-Instance GPU) resources with full utility and test coverage.
- Infrastructure & Compute Visibility
- Introduced Public Cluster Info for a dedicated computation power page.
- Added Cluster Resource Health Check to monitor deployment readiness.
- Enhanced cluster management APIs to support portal integration and real-time node/queue data collection.
- Deployment & Runtime Enhancements
- Added Pending Status for deployments to provide better feedback during resource allocation.
- Introduced `RuntimeFrameworkID` and `DriverVersion` fields to deployment and space displays for better environment tracking.
- Supported updating resource and cluster IDs for existing inference and finetune tasks.
- Agent & Skills Hub
- Introduced Skills Support with multi-sync capabilities and a "User Likes" API.
- Added Agent Access Middleware that enables token usage tracking.
- Security & Authentication
- Added Basic Auth support to the Authenticator.
- Introduced a configurable bypass for sensitive content detection in the AI Gateway.
- Enabled access token support for non-MCP spaces via reverse proxy.
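As background for the Basic Auth item above, the sketch below shows how an `Authorization: Basic ...` header is decoded in general (the scheme defined in RFC 7617). It is not the project's Authenticator; `parse_basic_auth` is a hypothetical helper, and a real service would go on to verify the credentials.

```python
import base64
import binascii


def parse_basic_auth(header_value):
    """Return (username, password) from an Authorization header, or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    # Only the first colon separates the fields; a password may contain colons.
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None


header = "Basic " + base64.b64encode(b"alice:s3cret").decode("ascii")
print(parse_basic_auth(header))  # ('alice', 's3cret')
```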
🚀 Enhancements & Bug Fixes
- Storage & Git Performance
- Optimized LFS Sync pointer retrieval and fixed pre-upload timeout issues for repositories with 300k+ files.
- Fixed MinIO multipart upload errors and addressed metadata EOF issues via storage gateway pre-signed URLs.
- Added `ScanFileNumLimit` configuration to prevent overhead during file scanning.
- Reporting & Data Export
- Added Time Range Queries and CSV Export functionality for system reports.
- Updated organizations table with a UUID column and implemented conflict checks.
- System Stability
- Added configurable timeouts for Temporal `GetSystemInfo` calls.
- Added a Scaffold Command to streamline code generation for developers.
- Bug Fixes
- Fixed a data duplication bug in the Models API.
- Resolved AMD GPU inference issues within Kubernetes (K8s) environments.
- Fixed xnet migration status filters for models and datasets index APIs.
- Addressed various error-handling bugs, including non-existent service deletion and MinIO upload failures.
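The Reporting & Data Export item above combines a time-range query with CSV output. A minimal sketch of that combination follows, assuming report rows are dictionaries with a `timestamp`, `metric`, and `value`. The row shape, the column names, and the half-open range `[start, end)` are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime


def export_csv(rows, start, end):
    """Write rows whose timestamp falls in [start, end) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["timestamp", "metric", "value"])
    for row in rows:
        if start <= row["timestamp"] < end:
            writer.writerow(
                [row["timestamp"].isoformat(), row["metric"], row["value"]]
            )
    return buffer.getvalue()


rows = [
    {"timestamp": datetime(2025, 1, 1), "metric": "downloads", "value": 10},
    {"timestamp": datetime(2025, 2, 1), "metric": "downloads", "value": 25},
]
print(export_csv(rows, datetime(2025, 1, 15), datetime(2025, 3, 1)))
```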
Full Changelog: v1.16.1-ce...v1.17.0-ce
v1.16.1-ce
What's Changed
- fix(space): replace hardcoded engine version with constant and use FindSpaceLatestVersion by @phantom-rabbit in #773
- Fix repository pagination bug by @pulltheflower in #826
- Add temporal GetSystemInfo timeout config by @pulltheflower in #826
- enable VLLM eager by default by @ganisback in #826
Full Changelog: v1.16.0-ce...v1.16.1-ce
v1.16.0-ce
✨ New Features
- Advanced Agent Services
- Introduced Agent Memory Service to provide persistent context for agent interactions.
- Hardware & Compute Enhancements
- vGPU Support: Initial support for vGPU and physical GPU resource reporting and usage calculation.
- AMD GPU Support: Full compatibility for both Inference and Finetuning tasks on AMD hardware.
- CUDA Version Management: Added a new CUDA-version API, allowing resource switching based on specific CUDA version requirements.
- Enabled multi-host inference support for NVIDIA vLLM and SGLang.
- Storage & Infrastructure
- Added PVC support for Spaces, allowing persistent storage for application environments.
- Support for network interface configuration in "in-cluster" mode.
- Multimodal
- Support for Text-or-Image-to-Video (TI2V), allowing text-to-video and image-to-video generation.
- Developer Experience
- Upgraded Gradio SDK to 6.2.0 (maintaining backward compatibility with 5.1.0).
- Space optimization: Added ability to skip build steps for Gradio, Streamlit, Nginx, and MCP Server environments.
- Added GET/HEAD routes for Code and MCP resources within the CSGHub SDK.
🚀 Enhancements & Bug Fixes
- Space & Deployment Improvements
- Added status filters and availability filters for the Space Index API.
- Support for public Docker registry configuration and updated runtime images.
- Deployment responses now include the service name for easier identification.
- Added the ability to force-delete deployments.
- Repository & Mirroring
- Enhanced repo syncer with retry mechanisms for mirror tasks.
- Fixed mismatched repository total counts.
- Reset running mirror tasks automatically after a mirror LFS service restart.
- Mirror tokens are now strictly limited to read-only actions for improved security.
- Safety & Content
- Refactored sensitive scenario handling and updated vocabulary for better moderation accuracy.
- Updated moderation service risk levels.
- General Fixes
- Fixed token usage recording for streaming chat completion requests.
- Resolved GGUF listing issues and collection message bugs.
- Fixed specific bugs in Notebook environments and resource checking logic.
- Improved repository sync status consistency between CE and EE versions.
- Refactoring
- Significant architectural cleanup including splitting repository implementations from interfaces and using cluster pool interfaces.
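The Repository & Mirroring items above mention retry mechanisms for mirror tasks without describing the policy. The sketch below shows one generic option, bounded retries with exponential backoff. The backoff schedule, `run_with_retry`, and `flaky_mirror_task` are hypothetical and are not the repo syncer's actual behavior.

```python
import time


def run_with_retry(task, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))


calls = []
delays = []


def flaky_mirror_task():
    """Fails twice, then succeeds, to exercise the retry path."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("remote unavailable")
    return "synced"


print(run_with_retry(flaky_mirror_task, sleep=delays.append))  # synced
print(delays)  # [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the example fast and makes the delay schedule easy to inspect.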
New Contributors
- @denny-zhao made their first contribution in [#718]
Full Changelog: v1.15.1-ce...v1.16.0-ce
v1.15.1-ce
What's Changed
- Cherry pick 1.15 by @ganisback in #745
Full Changelog: v1.15.0-ce...v1.15.1-ce
- Fix ce ee sync status bug ([#740](https://github.com/OpenCSGs/csghub-server/pull/740), [e217c63](https://github.com/OpenCSGs/csghub-server/commit/e217c63ea3a899437d68b78c62822d1f1f64422d))
- Cherry pick 1.15 ([#745](https://github.com/OpenCSGs/csghub-server/pull/745), [63af6dd](https://github.com/OpenCSGs/csghub-server/commit/63af6dd320695dfb10222a580af49f39040c4445))
v1.14.2-ce
v1.15.0-ce
What's Changed
- build(deps): bump github.com/containerd/containerd from 1.7.27 to 1.7.29 by @dependabot[bot] in #511
- Sync allFiles changes by @pulltheflower in #563
- fix add time filtering for space log by @phantom-rabbit in #556
- Enhance repo creation by @pulltheflower in #570
- Optimize aigateway: refactor moderation component by @QinYuuuu in #543
- add recharge deposited notification by @QinYuuuu in #545
- Optimize cluster usage API performance through concurrent cluster queries by @QinYuuuu in #550
- print debug log by @QinYuuuu in #558
- add 'docker' tag to test and build ci by @QinYuuuu in #559
- add error i18n by @QinYuuuu in #569
- update i18n and ut by @QinYuuuu in #565
- Error scan cmd by @QinYuuuu in #566
- Hotfix for modelscope repo sync by @QinYuuuu in #560
- support flex gpu label type by @QinYuuuu in #555
- Reject git push in xnet enabled repo by @QinYuuuu in #564
- Add missing field for file by @pulltheflower in #577
- add pr_dependency_check by @QinYuuuu in #587
- Remove useless file and merge mcp resource by @HaiHui886 in #596
- optimize token counter for ut and fix token counter support different prompt struct by @QinYuuuu in #590
- Revert "add 'docker' tag to test and build ci" by @QinYuuuu in #578
- fix lint issue by @luojun96 in #599
- fix add instal log completa by @phantom-rabbit in #579
- Add admin repos API by @pulltheflower in #601
- Enhance logs by @QinYuuuu in #544
- Fix cluster info by @QinYuuuu in #561
- Fix xnet dataset upload bug by @QinYuuuu in #562
- remove embedding moderation by @QinYuuuu in #567
- Xnet csghub sdk by @QinYuuuu in #573
- Fix xnet refresh route bug by @QinYuuuu in #568
- Support tts by @QinYuuuu in #602
- Support function call by @QinYuuuu in #613
- Add NotebookType support and update scene validation by @QinYuuuu in #606
- fix upload dataset error by @QinYuuuu in #607
- Return empty response if repo has no files by @QinYuuuu in #612
- feat: build image name refactoring by @QinYuuuu in #616
- optimize ChatCompletionRequest JSON marshal and unmarshal,support unknown data by @QinYuuuu in #611
- fix swag fail by @QinYuuuu in #614
- add ut for repo full check workflow by @QinYuuuu in #618
- [dataviewer] Enhance UUID column type handling for messy value in dataviewer by @QinYuuuu in #620
- fix framework tag recognition by @QinYuuuu in #605
- feat image sensitive check support url by @QinYuuuu in #588
- Add httpInsecureSkipVerify options for repo sync by @QinYuuuu in #575
- Add CreateBranch method by @QinYuuuu in #623
- fix inference name and model tag by @QinYuuuu in #604
- feat(inference): Support create, query, traffic control and delete for inference service versions by @QinYuuuu in #608
- Fix request id dup in rsp header by @QinYuuuu in #595
- Non admin user can quit organization by @QinYuuuu in #624
- Add SetDefaultBranch method by @QinYuuuu in #626
- style: embedding token counter use interface instead of struct by @QinYuuuu in #627
- add error handle for space resource create by @QinYuuuu in #628
- fix request for embedding by @QinYuuuu in #629
- add space resource list hardware types by @QinYuuuu in #630
- update user service by @luojun96 in #642
- fix accounting for replica cases by @QinYuuuu in #631
- Fix deploy mcp server with default branch master fail by @QinYuuuu in #632
- sync user invitation interface. by @luojun96 in #643
- feat: can custom docker namespace for imagebuilder by @QinYuuuu in #633
- Refactor(log): Add ctx context to all log prints in runner and user by @QinYuuuu in #625
- add filter to index space resource by @QinYuuuu in #634
- fix: PVC name in lowercase by @QinYuuuu in #635
- Merge account check function by @HaiHui886 in #650
- Merge use_limit column changes by @HaiHui886 in #654
- feat: accounting price list support query hardware type by @QinYuuuu in #636
- Add cache for model and dataset info API by @pulltheflower in #649
- Feat api limit for ip geo by @QinYuuuu in #637
- Fix search by industry tag no data bug by @QinYuuuu in #640
- Separate ce EE saas routes by @QinYuuuu in #638
- refactor user component to split phone related functions to single component. by @QinYuuuu in #641
- feat: optimize inference deployment instance logs by commit_id by @QinYuuuu in #644
- add metrics for runner webhook by @QinYuuuu in #647
- upgrade golang to 1.25.5 and use synctest package for concurrent unit tests by @QinYuuuu in #651
- Support query metering data without instance_name and update filtering logic by @QinYuuuu in #652
- Feat aigateway record usage support filter by @QinYuuuu in #655
- add finetune logs by @QinYuuuu in #656
- Update dependencies in Dockerfiles (coagent-python, huggingface_hub, gradio) by @QinYuuuu in #657
- docs: update space builder base README by @QinYuuuu in #658
- clean deploy code by @QinYuuuu in #659
- add query to list model by @QinYuuuu in #660
- update makefile by @QinYuuuu in #661
- Add git tree operation timeout config by @QinYuuuu in #662
- fix Revision delete using logical deletion by @QinYuuuu in #663
- add create table knative_service_revision in runner by @QinYuuuu in #664
- fix: update user login time on login by @QinYuuuu in #666
- update index for knative_revision by @QinYuuuu in #667
Full Changelog: v1.14.0-ce...v1.15.0-ce
v1.14.1-ce
What's Changed
- Fix wrong base image path for v1.14.0-ce by @HaiHui886 in #683
- Fix missing of mirror current task by @pulltheflower in #694
Full Changelog: v1.14.0-ce...v1.14.1-ce
v1.14.0-ce
✨ New Features
- Deployment Workflow Migration: Refactored the core Deploy Scheduler and successfully migrated its logic to use Temporal Workflow for improved reliability and scalability. ([#495])
- Finetuning Job Support: Added support to submit fine-tune tasks directly as standard jobs. ([#522])
- AI Gateway Integration: The AI Gateway now supports MCP (Model Context Protocol) for enhanced model serving capabilities. ([#529])
- Jupyter Notebook Support: Application space deployment now supports Jupyter Notebook environments.
- Internal Notification Tool: Introduced a new tool to generate basic internal notification messages. ([#551])
🚀 Enhancements & Bug Fixes
- Dependency Updates: Bumped `argo-workflows` from 3.5.13 to 3.6.12. Upgraded `casdoor` SDK to v1.22.0 and updated user info retrieval to use UUID. ([#481], [#534])
- API & Core: Enhanced middleware robustness. Ensured header keys are handled as case-sensitive. Enhanced concurrency in `CheckRepoFiles` and improved test coverage. Enabled soft delete functionality for licenses. ([#547], [#482], [#530], [#553])
- Deployment & Runner: Improved runner network discovery logic to read `clusterIP` if no ingress IP is found. Added `StorageClass` support in cluster report events. ([#484], [#525])
- Finetuning: Added `model/dataset` fields to finetune job operations for richer metadata. ([#527])
- Integration: Added `User-Email` to request headers for DataFlow and Label Studio integrations. ([#492])
- Git/LFS Fixes: Fixed a client bug preventing the deletion of remote branches. Fixed a cache bug in LFS sync upload ID. ([#521], [#489])
- Temporal Workflow Fixes: Fixed command errors and a worker panic when the sensitive checker was not enabled. Added sensitive checker initialization in the Temporal worker. ([#501], [#506], [#519])
- Multi-Sync & Rproxy Fixes: Corrected the logic for setting the service host name in the `rproxy` header. Fixed a bug where multi-sync MCP servers failed to display file lists and READMEs. ([#485], [#533])
- General Bug Fixes: Fixed bugs related to getting evaluations with access control, AI Gateway external LLM support, and repository creation. ([#529], [#532], [#536])
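The Deployment & Runner item above describes falling back to `clusterIP` when no ingress IP is found. The sketch below illustrates that fallback on a dictionary shaped like a Kubernetes Service object (`spec.clusterIP`, `status.loadBalancer.ingress[].ip`). The function name `service_address` is hypothetical, and the runner's real discovery code may look quite different.

```python
def service_address(service):
    """Prefer a load-balancer ingress IP, else fall back to clusterIP."""
    status = service.get("status", {})
    ingress = status.get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        if entry.get("ip"):
            return entry["ip"]
    # No ingress IP was reported: use the in-cluster address instead.
    return service.get("spec", {}).get("clusterIP")


svc = {"spec": {"clusterIP": "10.0.0.12"}, "status": {"loadBalancer": {}}}
print(service_address(svc))  # 10.0.0.12
```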
Full Changelog: v1.12.0-ce...v1.14.0-ce