---
stage: Verify
group: Runner Core
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
title: Speed up job execution
---
{{< details >}}
- Tier: Free, Premium, Ultimate
- Offering: GitLab.com, GitLab Self-Managed, GitLab Dedicated
{{< /details >}}
You can improve performance of your jobs by caching your images and dependencies.
You can speed up the time it takes to download Docker images by using:

- The GitLab Dependency Proxy
- A mirror of the Docker Hub registry
- Other open source solutions
To access container images more quickly, you can use the GitLab Dependency Proxy to proxy and cache them.
You can also speed up the time it takes for your jobs to access container images by mirroring Docker Hub. This configures the registry as a pull-through cache. In addition to speeding up job execution, a mirror can make your infrastructure more resilient to Docker Hub outages and Docker Hub rate limits.
When the Docker daemon is configured to use the mirror, it automatically checks for the image on your running instance of the mirror. If the image is not available there, the mirror pulls it from the public Docker registry and stores it locally before handing it back to you.
The next request for the same image pulls from your local registry.
For more information about how it works, see the Docker daemon configuration documentation.
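As a minimal sketch of the daemon-side setting, `/etc/docker/daemon.json` on a machine that should pull through the mirror might look like the following. The mirror address is a placeholder; substitute the host and port where your mirror runs:

```json
{
  "registry-mirrors": ["http://registry-mirror.example.com:6000"]
}
```

After editing the file, restart the Docker daemon so the setting takes effect.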
To create a Docker Hub Registry mirror:
1. Log in to a dedicated machine where the proxy container registry will run.
1. Make sure that Docker Engine is installed on that machine.
1. Create a new container registry:

   ```shell
   docker run -d -p 6000:5000 \
       -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
       --restart always \
       --name registry registry:2
   ```

   You can modify the port number (`6000`) to expose the registry on a different port. This starts the server over `http`. If you want to turn on TLS (`https`), follow the official documentation.

1. Check the IP address of the server:

   ```shell
   hostname --ip-address
   ```

   You should choose the private network IP address. The private network is usually the fastest solution for internal communication between machines on a single provider, like DigitalOcean, AWS, or Azure. Usually, data transferred on a private network is not applied against your monthly bandwidth limit.

The Docker Hub registry is accessible under `MY_REGISTRY_IP:6000`.

You can now configure `config.toml` to use the new registry server.
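As a sketch for the Docker executor with Docker-in-Docker builds, you can pass the mirror to the `docker:dind` service in `config.toml`. The IP address, port, and image tags below are placeholders:

```toml
[[runners]]
  executor = "docker"
  [runners.docker]
    image = "docker:latest"
    privileged = true
    [[runners.docker.services]]
      name = "docker:dind"
      # Point the Docker-in-Docker daemon at the pull-through cache
      command = ["--registry-mirror", "http://MY_REGISTRY_IP:6000"]
```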
`rpardini/docker-registry-proxy` can proxy most container registries locally, including the GitLab Container Registry.
You can speed up the time it takes to download language dependencies by using a distributed cache.
To specify a distributed cache, you set up the cache server and then configure the runner to use that cache server.
If you are using autoscaling, learn more about the distributed runners cache feature.
The following cache servers are supported:
- AWS S3
- MinIO or other S3-compatible cache server
- Google Cloud Storage
- Azure Blob storage
Learn more about GitLab CI/CD cache dependencies and best practices.
To use AWS S3 as a distributed cache, edit the runner's `config.toml` file to point to the S3 location and provide credentials for connecting.
Make sure the runner has a network path to the S3 endpoint.
If you use a private subnet with a NAT gateway, to save cost on data transfers you can enable an S3 VPC endpoint.
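A sketch of the relevant `config.toml` section, with placeholder bucket name, region, and credentials:

```toml
[runners.cache]
  Type = "s3"
  Shared = true
  [runners.cache.s3]
    ServerAddress = "s3.amazonaws.com"
    BucketName = "my-runner-cache"
    # Use the same region as your runners to reduce latency
    BucketLocation = "us-east-1"
    AccessKey = "<aws_access_key_id>"
    SecretKey = "<aws_secret_access_key>"
```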
Instead of using AWS S3, you can create your own cache storage.
1. Log in to a dedicated machine where the cache server will run.
1. Make sure that Docker Engine is installed on that machine.
1. Start MinIO, a simple S3-compatible server written in Go:

   ```shell
   docker run -d --restart always -p 9005:9000 \
       -v /.minio:/root/.minio -v /export:/export \
       -e "MINIO_ROOT_USER=<minio_root_username>" \
       -e "MINIO_ROOT_PASSWORD=<minio_root_password>" \
       --name minio \
       minio/minio:latest server /export
   ```

   You can modify the port `9005` to expose the cache server on a different port.

1. Check the IP address of the server:

   ```shell
   hostname --ip-address
   ```

1. Your cache server is available at `MY_CACHE_IP:9005`.
1. Create a bucket that will be used by the runner:

   ```shell
   sudo mkdir /export/runner
   ```

   `runner` is the name of the bucket in this case. If you choose a different bucket, the name will be different. All caches are stored in the `/export` directory.

1. Use the `MINIO_ROOT_USER` and `MINIO_ROOT_PASSWORD` values (from above) as your Access and Secret Keys when configuring your runner.

You can now configure `config.toml` to use the new cache server.
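A sketch of the matching `config.toml` section for the MinIO server above. The credentials and IP address are the placeholders from the earlier steps:

```toml
[runners.cache]
  Type = "s3"
  Shared = true
  [runners.cache.s3]
    ServerAddress = "MY_CACHE_IP:9005"
    AccessKey = "<minio_root_username>"
    SecretKey = "<minio_root_password>"
    BucketName = "runner"
    # The MinIO command above serves plain HTTP
    Insecure = true
```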
To use Google Cloud Storage as a distributed cache, edit the runner's `config.toml` file to point to the GCS location and provide credentials for connecting.
Make sure the runner has a network path to the GCS endpoint.
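A sketch, assuming a service account JSON key file is available on the runner host. The bucket name and file path are placeholders:

```toml
[runners.cache]
  Type = "gcs"
  [runners.cache.gcs]
    CredentialsFile = "/etc/gitlab-runner/service-account.json"
    BucketName = "runner-cache"
```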
To use Azure Blob storage as a distributed cache, edit the runner's `config.toml` file to point to the Azure location and provide credentials for connecting.
Make sure the runner has a network path to the Azure endpoint.
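A sketch with placeholder account name, account key, and container name:

```toml
[runners.cache]
  Type = "azure"
  [runners.cache.azure]
    AccountName = "<AZURE_ACCOUNT_NAME>"
    AccountKey = "<AZURE_ACCOUNT_KEY>"
    ContainerName = "runner-cache"
    StorageDomain = "blob.core.windows.net"
```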
You can improve cache and artifact upload and download performance with the following options.
Each cache backend has its own `config.toml` section. Optimize for your backend:

- S3 configuration: Set `BucketLocation` to the same region as your runners. Use `RoleARN` for archives larger than 5 GB to enable multipart uploads. Use the default S3 v2 adapter (do not set `FF_USE_LEGACY_S3_CACHE_ADAPTER=true`). Optionally enable `Accelerate = true` for AWS S3 Transfer Acceleration when runners are far from the bucket region. An S3 VPC endpoint in the same region can reduce latency and cost.
- Google Cloud Storage configuration: Use a bucket in the same or nearest region to your runners.
- Azure Blob configuration: Use a storage account in the same or nearest region to your runners.
Use faster compression to speed up cache archiving and download. This creates larger archives. Set compression options in your job or in CI/CD variables:
| Variable | Recommended for speed | Description |
|---|---|---|
| `CACHE_COMPRESSION_LEVEL` | `fastest` or `fast` | Less CPU and faster upload or download. Archives are larger. Default is `default`. |
| `CACHE_COMPRESSION_FORMAT` | `zip` | `zip` is often faster to create. `tarzstd` gives a better compression ratio but can be slower. |
Example configuration in `.gitlab-ci.yml`:

```yaml
variables:
  CACHE_COMPRESSION_LEVEL: fastest
  CACHE_COMPRESSION_FORMAT: zip
```

If large caches hit timeouts, increase the limit (in minutes) with the `CACHE_REQUEST_TIMEOUT` CI/CD variable. Default is `10`. This setting does not speed up transfers but prevents failures on slow or large uploads and downloads.
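For example, to allow up to 30 minutes per cache request (the value `30` is only illustrative):

```yaml
variables:
  CACHE_REQUEST_TIMEOUT: "30"
```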
Cache download and upload use a single streaming buffer. A larger buffer reduces system calls and often increases throughput, especially if you see transfers cap around 20 to 30 MB/s.
Set `CACHE_TRANSFER_BUFFER_SIZE` (in bytes) in the job environment or in CI/CD variables.
Default is 4 MiB (`4194304`).

Example configuration for 8 MiB:

```yaml
variables:
  CACHE_TRANSFER_BUFFER_SIZE: "8388608"
```

Chunk size is the size in bytes of each part or chunk for parallel upload (GoCloud) or parallel download (presigned or GoCloud). Concurrency is how many chunks run in parallel. Memory use is approximately chunk size × concurrency.
| Variable | Description | Default |
|---|---|---|
| `CACHE_CHUNK_SIZE` | Chunk size in bytes. For upload (GoCloud backends): limits are backend-dependent (for example, 5 MiB to 5 GiB per part, maximum 10,000 parts for S3; Azure and GCS have their own limits). For download: `0` = legacy sequential; when concurrency > 1, 16 MiB is used if unset. | Upload: 16 MiB (`16777216`). Download: `0` (legacy) |
| `CACHE_CONCURRENCY` | Number of concurrent chunks. Upload: GoCloud backends only (S3 with `RoleARN`, Azure, GCS). Download: `0` or `1` = legacy sequential mode; values greater than `1` = parallel mode (presigned or GoCloud). | Upload: `16`. Download: `0` (legacy) |
Example configuration for custom tuning (for example, 32 MiB chunks, 32 concurrent):

```yaml
variables:
  CACHE_CHUNK_SIZE: "33554432"
  CACHE_CONCURRENCY: "32"
```

GitLab sends artifacts to the GitLab coordinator, which might store them in object storage. To speed up the upload from the runner:
| Variable | Recommended for speed | Description |
|---|---|---|
| `ARTIFACT_COMPRESSION_LEVEL` | `fastest` or `fast` | Reduces CPU and time spent compressing before upload. |
Set compression options in your job or in CI/CD variables, for example:
```yaml
variables:
  ARTIFACT_COMPRESSION_LEVEL: fastest
```

When the coordinator redirects artifact downloads to object storage (`direct_download`), you can enable parallel range downloads with the `FF_USE_PARALLEL_ARTIFACT_TRANSFER` feature flag. This is separate from parallel cache transfers (`FF_USE_PARALLEL_CACHE_TRANSFER`). See Parallel artifact downloads (direct download).
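As a sketch, one way to turn on the flag for a single job is through a CI/CD variable in `.gitlab-ci.yml` (runner feature flags can also be set in `config.toml` under `[runners.feature_flags]`):

```yaml
variables:
  FF_USE_PARALLEL_ARTIFACT_TRANSFER: "true"
```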