Releases: OpenCSGs/csghub-server

v2.1.0-ce

06 May 10:18
24558cc

✨ New Features

  • AI Gateway & OpenAI-Compatible APIs

    • Audio Transcription: Added OpenAI-compatible /v1/audio/transcriptions support with multipart request rewriting and audio token usage counting.
    • Text/Image-to-Video: Added /v1/videos, /v1/videos/{id}, and /v1/videos/{id}/content APIs with provider adapters for OpenAI-compatible endpoints, LightX2V, MiniMax, and Seedance.
    • Model Routing: Added provider-aware model IDs, composite model ID parsing, upstream catalog support, session routing, fallback retry for chat completions, and per-upstream availability reporting.
    • Usage Limits: Added Redis-backed per-window usage limit checks for configured upstream policies.
    • API Key Auth: AI Gateway inference endpoints now require user/org API keys instead of normal login sessions.
  • API Key Management

    • Added namespace-scoped API key management for users and organizations, including create, list, update, delete, built-in key retrieval, and built-in key refresh APIs.
    • Added user/org API key authentication context propagation for downstream services.
  • Inference & Evaluation

    • Added configurable model architecture checks for inference, including admin APIs to view and update inference architecture rules.
    • Added SGLang-based Qwen3-Guard stream inference configuration and Docker assets.
    • Added AMD EvalScope evaluation configuration and Docker image support.
    • Updated vLLM and AMD vLLM inference images/configuration to v0.19.0.
  • Repository, Tags & Skills

    • Added automatic industry tag scanning for model and dataset repositories using configured LLM prompts.
    • Added source tracking for repository tags and safer tag replacement/removal behavior.
    • Added skill mirror_from_saas routes and skill clone URL fields.
    • Added a dedicated skill tag category seed.
    • Improved SKILL.md validation and added broader validator tests.
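
The per-window usage limit check can be pictured as a fixed-window counter. The following is a minimal sketch, not the gateway's actual implementation: the class name, the key layout, and the in-memory dict standing in for Redis are all assumptions made for illustration.

```python
import time


class WindowUsageLimiter:
    """Fixed-window usage counter; a dict stands in for Redis here."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counters = {}  # hypothetical stand-in for Redis keys

    def _key(self, upstream):
        # One counter per upstream per window, e.g. "usage:upstream-a:28".
        window_index = int(self.clock() // self.window_seconds)
        return f"usage:{upstream}:{window_index}"

    def allow(self, upstream, tokens):
        """Return True and record usage if the window still has room."""
        key = self._key(upstream)
        used = self.counters.get(key, 0)
        if used + tokens > self.limit:
            return False
        self.counters[key] = used + tokens
        return True
```

With a real Redis backend, the read-and-update step would likely need to be atomic (for example an increment followed by an expiry on the key), which this single-process sketch does not attempt to show.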

🚀 Enhancements & Bug Fixes

  • AI Gateway Reliability

    • Fixed async model cache writes mutating live model lists.
    • Fixed nil-user panic risk when listing CSGHub models.
    • Improved sensitive-check whitelist lookup behavior.
    • Made sensitive-check behavior configurable per LLM config where available.
    • Improved SGLang Guard stream trace/session header handling.
  • Resource Scheduling

    • Added unavailable reasons to resource list responses.
    • Added cluster offline/unavailable status handling.
    • Prevented CPU-only workloads from being scheduled onto XPU nodes.
    • Added replica-aware resource checks for Spaces.
  • Repository & LFS

    • Added repository size calculation trigger command.
    • Added LFS pointer download nil-URL protection.
    • Added LFS size checks before syncing files.
  • Data Viewer

    • Added file-size checks and optimizations before converting preview files.
  • Finetune & Runner

    • Fixed finetune jobs missing model and dataset revision data.
    • Fixed potential runner panic paths in service/workflow handling.
  • Proxy & Networking

    • Set proxied Host headers without port where required.
    • Sanitized logged authorization headers in internal proxy logs.
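
As an illustration of the log sanitization item above, here is a minimal sketch of redacting credentials before headers are written to a log. The function name, the set of sensitive header names, and the `[REDACTED]` placeholder are assumptions, not taken from the csghub-server code.

```python
# Assumed set of header names to mask; compared case-insensitively.
SENSITIVE_HEADERS = {"authorization", "proxy-authorization"}


def sanitize_headers(headers):
    """Return a copy of headers that is safe to log."""
    safe = {}
    for name, value in headers.items():
        if name.lower() in SENSITIVE_HEADERS:
            # Keep the scheme (e.g. "Bearer") but drop the credential.
            scheme = value.split(" ", 1)[0] if " " in value else ""
            safe[name] = f"{scheme} [REDACTED]".strip()
        else:
            safe[name] = value
    return safe
```

The original mapping is left untouched, so the proxied request still carries the real credential while only the logged copy is masked.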

🛠 Maintenance

  • Upgraded vulnerable dependencies reported by Dependabot.
  • Improved accounting metering retry limit configurability.
  • Added and refreshed unit tests across AI Gateway, API keys, resource checks, tags, skills, LFS, and database stores.
  • Improved CI/test stability and separated CI build cache behavior.

Full Changelog: v2.0.0-ce...v2.1.0-ce

v2.0.0-ce

10 Apr 05:39

✨ New Features

  • Repository & Git Operations
    • Branch Management: Added APIs for creating and deleting branches.
    • Resumable LFS: Enabled file-level resumable uploads for large files and implemented MinIO LFS cleanup on model deletion.
    • Performance: Optimized repository query speeds and pre-upload logic for large file counts.
  • Scheduling & Resource Management
    • Memory Unit Support: Enhanced resource checks to support "Mi" and "M" units for memory configuration.
    • Knative Integration: Updated service namespaces to knative-serving and improved pod name RFC1123 compliance.
  • AI Gateway & LLM Enhancements
    • Multimodal Support: Added Text-to-Image capabilities and improved image generation proxying.
    • Engine Upgrades: Upgraded SGLang to support Qwen 2.5, GLM 4.5, and Step-3; added data-parallel-size config for vLLM.

🚀 Enhancements & Bug Fixes

  • Security & Authentication
    • Implemented a JWT session-building middleware.
    • Enabled API Key authentication for Runner APIs.
    • Restricted GET /v1/models to allow unauthenticated access.
  • Skills
    • Added comprehensive APIs for Agent skills and enhanced tag support.
    • Corrected skill tag scope bugs and validated SKILL.md file integrity.
  • Observability & Reliability
    • Cluster Metrics: Added cluster heartbeat metrics and automated alerting rules.
    • Enhanced Logging: Integrated Loki for better finetuning logs and improved SMTP attachment handling using in-memory bytes.
    • Stability: Fixed nil deployer panics in MCP proxies and improved wakeup timeout handling for remote clusters.
  • Bug Fixes
    • Resolved token usage recording issues for interrupted streaming chat requests.
    • Fixed status mismatch bugs where deleted Knative services still appeared as "Running."
    • Fixed repository total size calculation bugs and search prioritization for organization namespaces.
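
The token usage fix for interrupted streaming requests can be illustrated with a generator that reports usage in a `finally` block, so the count is recorded whether the stream finishes or the client disconnects midway. This is a sketch of the general technique only. The chunk shape (`text` and `tokens` keys) and the `record_usage` callback are hypothetical.

```python
def stream_with_usage(chunks, record_usage):
    """Yield chunk text and report the token count even if the stream is cut short."""
    tokens = 0
    try:
        for chunk in chunks:
            tokens += chunk["tokens"]  # hypothetical per-chunk token count
            yield chunk["text"]
    finally:
        # Runs on normal completion and when the consumer closes the stream early.
        record_usage(tokens)
```

Closing the generator early raises `GeneratorExit` at the paused `yield`, which is what triggers the `finally` block with the tokens counted so far.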

🛠 Maintenance

  • Upgraded Docker Golang version to 1.26.0.
  • Refactored metering consumer logic into a unified structure.
  • Added extensive unit testing for accounting workflows.

Full Changelog: v1.17.0-ce...v2.0.0-ce

v1.17.0-ce

10 Mar 02:56
2e051a3

✨ New Features

  • Advanced Scheduling with Volcano
    • Integrated Volcano Scheduler to enhance high-performance workload management.
    • Added support for Volcano vGPU resource definitions and MIG (Multi-Instance GPU) resources with full utility and test coverage.
  • Infrastructure & Compute Visibility
    • Introduced Public Cluster Info for a dedicated computation power page.
    • Added Cluster Resource Health Check to monitor deployment readiness.
    • Enhanced cluster management APIs to support portal integration and real-time node/queue data collection.
  • Deployment & Runtime Enhancements
    • Added Pending Status for deployments to provide better feedback during resource allocation.
    • Introduced RuntimeFrameworkID and DriverVersion fields to deployment and space displays for better environment tracking.
    • Supported updating resource and cluster IDs for existing inference and finetune tasks.
  • Agent & Skills Hub
    • Introduced Skills Support with multi-sync capabilities and a "User Likes" API.
    • Added Agent Access Middleware that enables token usage tracking.
  • Security & Authentication
    • Added Basic Auth support to the Authenticator.
    • Introduced a configurable bypass for sensitive content detection in the AI Gateway.
    • Enabled access token support for non-MCP spaces via reverse proxy.
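
For readers unfamiliar with Basic Auth, the scheme sends `username:password` base64-encoded in the `Authorization` header. The sketch below shows how such a header can be decoded. It illustrates the standard scheme rather than the Authenticator's actual code, and the function name is an assumption.

```python
import base64
import binascii


def parse_basic_auth(header_value):
    """Return (username, password) from an Authorization header, or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    # Split on the first colon only; the password itself may contain colons.
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password
```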

🚀 Enhancements & Bug Fixes

  • Storage & Git Performance
    • Optimized LFS Sync pointer retrieval and fixed pre-upload timeout issues for repositories with 300k+ files.
    • Fixed MinIO multipart upload errors and addressed metadata EOF issues via storage gateway pre-signed URLs.
    • Added ScanFileNumLimit configuration to prevent overhead during file scanning.
  • Reporting & Data Export
    • Added Time Range Queries and CSV Export functionality for system reports.
    • Updated organizations table with a UUID column and implemented conflict checks.
  • System Stability
    • Added configurable timeouts for Temporal GetSystemInfo calls.
    • Added a Scaffold Command to streamline code generation for developers.
  • Bug Fixes
    • Fixed a data duplication bug in the Models API.
    • Resolved AMD GPU inference issues within Kubernetes (K8s) environments.
    • Fixed xnet migration status filters for models and datasets index APIs.
    • Addressed various error-handling bugs, including non-existent service deletion and MinIO upload failures.
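
A time range query combined with CSV export might look roughly like the sketch below. The row fields (`created_at`, `name`, `value`), the half-open interval, and the function name are assumptions chosen for illustration. The actual report schema is not described in these notes.

```python
import csv
import io
from datetime import datetime


def export_report_csv(rows, start, end):
    """Return CSV text for rows whose "created_at" falls in [start, end)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["created_at", "name", "value"], lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        if start <= row["created_at"] < end:
            # Serialize the timestamp in ISO 8601 form for the export.
            writer.writerow({**row, "created_at": row["created_at"].isoformat()})
    return buffer.getvalue()
```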

Full Changelog: v1.16.1-ce...v1.17.0-ce

v1.16.1-ce

05 Mar 11:53

Pre-release

What's Changed

  • fix(space): replace hardcoded engine version with constant and use FindSpaceLatestVersion by @phantom-rabbit in #773

Full Changelog: v1.16.0-ce...v1.16.1-ce

v1.16.0-ce

25 Feb 07:44

✨ New Features

  • Advanced Agent Services
    • Introduced Agent Memory Service to provide persistent context for agent interactions.
  • Hardware & Compute Enhancements
    • vGPU Support: Initial support for vGPU and physical GPU resource reporting and usage calculation.
    • AMD GPU Support: Full compatibility for both Inference and Finetuning tasks on AMD hardware.
    • CUDA Version Management: Added a new CUDA-version API, allowing resource switching based on specific CUDA version requirements.
    • Enabled multi-host inference support for NVIDIA vLLM and SGLang.
  • Storage & Infrastructure
    • Added PVC support for Spaces, allowing persistent storage for application environments.
    • Added support for network interface configuration in "in-cluster" mode.
  • Multimodal
    • Added support for Text-or-Image-to-Video (TI2V), covering both text-to-video and image-to-video generation.
  • Developer Experience
    • Upgraded Gradio SDK to 6.2.0 (maintaining backward compatibility with 5.1.0).
    • Space optimization: Added ability to skip build steps for Gradio, Streamlit, Nginx, and MCP Server environments.
    • Added GET/HEAD routes for Code and MCP resources within the CSGHub SDK.
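
One plausible reading of "resource switching based on specific CUDA version requirements" is filtering resources by a minimum CUDA version. The sketch below assumes exactly that. The resource shape, the "at least" semantics, and the function names are hypothetical and may not match the real API.

```python
def parse_version(text):
    """Turn "12.4" into (12, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def resources_for_cuda(resources, required):
    """Return names of resources whose CUDA version is at least `required`."""
    minimum = parse_version(required)
    return [r["name"] for r in resources if parse_version(r["cuda"]) >= minimum]
```

Comparing parsed tuples rather than raw strings keeps "12.10" ordered after "12.4", which a plain string comparison would get wrong.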

🚀 Enhancements & Bug Fixes

  • Space & Deployment Improvements
    • Added status filters and availability filters for the Space Index API.
    • Added support for public Docker registry configuration and updated runtime images.
    • Deployment responses now include the service name for easier identification.
    • Added the ability to force-delete deployments.
  • Repository & Mirroring
    • Enhanced repo syncer with retry mechanisms for mirror tasks.
    • Fixed a mismatch in the total repository count.
    • Reset running mirror tasks automatically after a mirror LFS service restart.
    • Mirror tokens are now strictly limited to read-only actions for improved security.
  • Safety & Content
    • Refactored sensitive scenario handling and updated vocabulary for better moderation accuracy.
    • Updated moderation service risk levels.
  • General Fixes
    • Fixed token usage recording for streaming chat completion requests.
    • Resolved GGUF listing issues and collection message bugs.
    • Fixed specific bugs in Notebook environments and resource checking logic.
    • Improved repository sync status consistency between CE and EE versions.
  • Refactoring
    • Significant architectural cleanup including splitting repository implementations from interfaces and using cluster pool interfaces.
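
The retry mechanism for mirror tasks is not detailed in these notes. As a generic illustration, the sketch below retries a task with exponential backoff. The backoff policy, the parameter names, and the injectable `sleep` function are assumptions, not the repo syncer's actual behavior.

```python
import time


def run_with_retry(task, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```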

Full Changelog: v1.15.1-ce...v1.16.0-ce

v1.15.1-ce

30 Jan 05:50

v1.14.2-ce

20 Jan 10:13

v1.15.0-ce

16 Jan 06:52

Full Changelog: v1.14.0-ce...v1.15.0-ce

v1.14.1-ce

12 Jan 12:16

Full Changelog: v1.14.0-ce...v1.14.1-ce

v1.14.0-ce

08 Dec 01:23
7064e9a

✨ New Features

  • Deployment Workflow Migration: Refactored the core Deploy Scheduler and successfully migrated its logic to use Temporal Workflow for improved reliability and scalability. ([#495])
  • Finetuning Job Support: Added support to submit fine-tune tasks directly as standard jobs. ([#522])
  • AI Gateway Integration: The AI Gateway now supports MCP (Model Context Protocol) for enhanced model serving capabilities. ([#529])
  • Jupyter Notebook Support: Application space deployment now supports Jupyter Notebook environments.
  • Internal Notification Tool: Introduced a new tool to generate basic internal notification messages. ([#551])

🚀 Enhancements & Bug Fixes

  • Dependency Updates: Bumped argo-workflows from 3.5.13 to 3.6.12. Upgraded casdoor SDK to v1.22.0 and updated user info retrieval to use UUID. ([#481], [#534])
  • API & Core: Enhanced middleware robustness. Ensured header keys are handled as case-sensitive. Enhanced concurrency in CheckRepoFiles and improved test coverage. Enabled soft delete functionality for licenses. ([#547], [#482], [#530], [#553])
  • Deployment & Runner: Improved runner network discovery logic to read clusterIP if no ingress IP is found. Added StorageClass support in cluster report events. ([#484], [#525])
  • Finetuning: Added model/dataset fields to finetune job operations for richer metadata. ([#527])
  • Integration: Added User-Email to request headers for DataFlow and Label Studio integrations. ([#492])
  • Git/LFS Fixes: Fixed a client bug preventing the deletion of remote branches. Fixed a cache bug in LFS sync upload ID. ([#521], [#489])
  • Temporal Workflow Fixes: Fixed command errors and a worker panic that occurred when the sensitive checker was not enabled. Added sensitive checker initialization to the Temporal worker. ([#501], [#506], [#519])
  • Multi-Sync & Rproxy Fixes: Corrected the logic for setting the service host name in the rproxy header. Fixed a bug where multi-sync MCP servers failed to display file lists and READMEs. ([#485], [#533])
  • General Bug Fixes: Fixed bugs related to getting evaluations with access control, AI Gateway external LLM support, and repository creation. ([#529], [#532], [#536])

Full Changelog: v1.12.0-ce...v1.14.0-ce