[Track Issue] Optimize third-party Docker startup time and usability #62101

@suxiaogang223

Background

The current third-party Docker startup flow in Doris has accumulated several usability and performance issues over time, especially around heavyweight services such as Hive, Iceberg-related components, and other stateful external dependencies.

Common pain points include:

  • Long startup latency caused by repeated initialization, redundant downloads, and expensive bootstrap steps
  • Service startup being tightly coupled with data initialization, making simple restart and daily development workflows slow
  • Lack of incremental refresh mechanisms, so small data/script changes often require broad re-initialization
  • Coarse startup control, with little distinction between modes such as fast start, refresh, rebuild, and targeted reset
  • Repeated environment preparation work that could be cached or reused safely

These issues affect both local development efficiency and CI stability/cost.
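One way to picture the missing incremental refresh mechanism is a content fingerprint over a component's initialization inputs, so that re-initialization only runs when a data file or script has actually changed. The sketch below is an illustration under assumptions, not the existing Doris scripts: the function names, the stamp-file layout, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
from pathlib import Path


def fingerprint(root: Path) -> str:
    """Hash the relative path and contents of every file under root, in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_refresh(init_dir: Path, stamp_file: Path) -> bool:
    """True when the init inputs differ from the fingerprint recorded last time."""
    if not stamp_file.exists():
        return True
    return stamp_file.read_text().strip() != fingerprint(init_dir)


def record_refresh(init_dir: Path, stamp_file: Path) -> None:
    """Store the current fingerprint after a successful initialization.

    Assumption: stamp_file lives outside init_dir, otherwise writing it
    would change the fingerprint it is meant to record.
    """
    stamp_file.parent.mkdir(parents=True, exist_ok=True)
    stamp_file.write_text(fingerprint(init_dir))
```

A startup script could call `needs_refresh` per component and skip data loading when it returns `False`, then call `record_refresh` only after loading succeeds so that a failed run is retried next time.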

Goal

This tracking issue focuses on improving the third-party Docker startup scripts, with two primary goals:

  1. Reduce startup time for common developer and CI workflows
  2. Improve usability, observability, and control of startup/reset behavior

Scope

The optimization work may include, but is not limited to:

  • Reducing redundant initialization work during container startup
  • Caching or reusing downloaded/bootstrap artifacts when safe
  • Merging or simplifying expensive bootstrap steps
  • Removing unnecessary metadata repair or data scan operations
  • Decoupling service readiness from heavyweight data loading
  • Introducing clearer startup modes for different scenarios
  • Improving partial refresh / targeted rebuild support
  • Improving logs, diagnostics, and failure visibility
  • Standardizing script behavior across different third-party components
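As a rough illustration of clearer startup modes and of decoupling readiness from data loading, the modes could be modeled as an explicit enum that maps to an ordered list of steps. Everything in this sketch is hypothetical: the mode names follow the ones mentioned in this issue, but the step names and the `plan` helper are assumptions for discussion, not an existing interface.

```python
from enum import Enum


class StartupMode(Enum):
    FAST = "fast"        # start containers and reuse existing data
    REFRESH = "refresh"  # re-apply only changed data/scripts
    REBUILD = "rebuild"  # recreate containers and reload all data
    RESET = "reset"      # also wipe persisted state before rebuilding


# Hypothetical step names. "wait_ready" always precedes any data loading,
# so service readiness does not depend on heavyweight loads finishing.
STEPS = {
    StartupMode.FAST: ["start_containers", "wait_ready"],
    StartupMode.REFRESH: ["start_containers", "wait_ready", "load_changed_data"],
    StartupMode.REBUILD: [
        "remove_containers", "start_containers", "wait_ready", "load_all_data",
    ],
    StartupMode.RESET: [
        "remove_containers", "remove_volumes",
        "start_containers", "wait_ready", "load_all_data",
    ],
}


def plan(mode: StartupMode, components: list[str]) -> list[tuple[str, str]]:
    """Return (component, step) pairs in execution order for the chosen mode."""
    return [(component, step) for component in components for step in STEPS[mode]]
```

Passing only the affected components to `plan` gives a targeted reset or rebuild, while `StartupMode.FAST` covers the plain restart case with no data work at all.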

Non-goals

This tracking issue does not require every startup script to be fully redesigned in one step. Incremental improvements are acceptable as long as they clearly improve startup performance or usability without introducing instability.

Proposed Work Items

  • Audit current third-party startup bottlenecks by component
  • Optimize Hive startup hot path
  • Reduce repeated downloads and improve local cache reuse
  • Clean up redundant metadata repair and bootstrap work
  • Introduce clearer startup mode semantics where needed
  • Improve restart experience after machine reboot or container restart
  • Improve script usability and error reporting
  • Add regression coverage for key startup flows
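For the repeated-download item, a minimal sketch of checksum-verified cache reuse is shown below. It assumes each artifact has a known SHA-256 and that a local cache directory survives across runs; the `fetch_cached` helper, its parameters, and the `.part` temporary suffix are illustrative names rather than anything in the current scripts.

```python
import hashlib
import os
import urllib.request
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 of a file in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _download(url: str, dest: Path) -> None:
    """Default downloader; callers may inject another one (e.g. for tests)."""
    urllib.request.urlretrieve(url, str(dest))


def fetch_cached(
    url: str,
    cache_dir: Path,
    expected_sha256: str,
    download: Callable[[str, Path], None] = _download,
) -> Path:
    """Return a verified local copy of url, downloading only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / url.rsplit("/", 1)[-1]
    if target.exists() and sha256_of(target) == expected_sha256:
        return target  # cache hit: no network access needed
    tmp = target.with_name(target.name + ".part")
    download(url, tmp)
    if sha256_of(tmp) != expected_sha256:
        tmp.unlink()
        raise ValueError(f"checksum mismatch for {url}")
    os.replace(tmp, target)  # atomic rename, so readers never see a partial file
    return target
```

Verifying the checksum on every hit is what makes reuse safe: a truncated or stale file in the cache is treated as a miss and replaced rather than silently reused.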

Expected Benefits

  • Faster local setup and restart for contributors
  • Lower CI initialization cost and shorter feedback loops
  • Easier debugging and maintenance of third-party environments
  • More predictable and controllable startup behavior

Notes

This issue is intended to track a series of incremental PRs instead of one large refactor.
