[Discussion] How to manage Conda environments #18

@hcho3

Currently, there are three places where Conda environments are specified:

  • Environment files in https://github.com/dmlc/xgboost-devops/tree/main/containers/conda_env. Environments are built as part of the container build and get cached as part of the CI container.
  • Dockerfiles, e.g.
    mamba create -y -n gpu_test -c ${RAPIDSAI_CONDA_CHANNEL} -c conda-forge -c nvidia \
    python=$PYTHON_VERSION "cudf=$RAPIDS_VERSION.*" "rmm=$RAPIDS_VERSION.*" cuda-version=$CUDA_SHORT_VER \
    "nccl>=${NCCL_SHORT_VER}" \
    dask \
    distributed \
    "dask-cuda=$RAPIDS_VERSION.*" "dask-cudf=$RAPIDS_VERSION.*" cupy \
    numpy pytest pytest-timeout scipy scikit-learn pandas matplotlib wheel \
    python-kubernetes urllib3 graphviz hypothesis "loky>=3.5.1" \
    "pyspark>=3.4.0" cloudpickle cuda-python && \
    As with the environment files above, these environments are built as part of the container build and get cached as part of the CI container.
  • Environment files in https://github.com/dmlc/xgboost/tree/master/ops/conda_env. Environments are re-built in every CI run.

So changing a test dependency involves changing code in three places. It would be great if we could gather the Conda env specs in a single place.
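
To make the "single place" idea concrete, here is a minimal sketch of one possible direction, not a proposal for the actual layout. It assumes a hypothetical spec dictionary (`GPU_TEST_SPEC`) whose entries use the same variable names as the Dockerfile excerpt above, and two hypothetical helpers, `to_mamba_args` and `to_env_yaml`, that render the same spec either as a `mamba create` argument list (for the container build) or as environment-file text (for the per-run CI environments). The package list is a shortened excerpt, and the version values in the usage example are placeholders.

```python
from string import Template

# Hypothetical single source of truth for one environment. The dependency list
# is a shortened excerpt of the Dockerfile command quoted in this issue.
GPU_TEST_SPEC = {
    "name": "gpu_test",
    "channels": ["${RAPIDSAI_CONDA_CHANNEL}", "conda-forge", "nvidia"],
    "dependencies": [
        "python=${PYTHON_VERSION}",
        "cudf=${RAPIDS_VERSION}.*",
        "rmm=${RAPIDS_VERSION}.*",
        "cuda-version=${CUDA_SHORT_VER}",
        "nccl>=${NCCL_SHORT_VER}",
        "pytest",
        "pyspark>=3.4.0",
    ],
}


def _fill(text, variables):
    # Template.substitute raises KeyError when a variable is missing,
    # so an unset build argument fails loudly instead of being dropped.
    return Template(text).substitute(variables)


def to_mamba_args(spec, variables):
    """Return the argument list for `mamba create` (container build path)."""
    args = ["mamba", "create", "-y", "-n", spec["name"]]
    for channel in spec["channels"]:
        args += ["-c", _fill(channel, variables)]
    args += [_fill(dep, variables) for dep in spec["dependencies"]]
    return args


def to_env_yaml(spec, variables):
    """Return the text of an equivalent Conda environment file."""
    lines = [f"name: {spec['name']}", "channels:"]
    lines += [f"  - {_fill(c, variables)}" for c in spec["channels"]]
    lines.append("dependencies:")
    lines += [f"  - {_fill(d, variables)}" for d in spec["dependencies"]]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values; the real ones would come from the build arguments.
    build_args = {
        "RAPIDSAI_CONDA_CHANNEL": "rapidsai",
        "PYTHON_VERSION": "3.10",
        "RAPIDS_VERSION": "24.10",
        "CUDA_SHORT_VER": "12.4",
        "NCCL_SHORT_VER": "2.23",
    }
    print(" ".join(to_mamba_args(GPU_TEST_SPEC, build_args)))
    print(to_env_yaml(GPU_TEST_SPEC, build_args))
```

With something along these lines, a dependency change would touch only the spec, and both the container build and the per-run CI environment would be derived from it. Whether the spec should live in xgboost or xgboost-devops is left open here.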

cc @trivialfis
