Commit 07515ff

feat: Implement Kubernetes job client and service

This commit introduces the Kubernetes job client and service, providing a mechanism to schedule tasks on Kubernetes clusters (including GKE and Kind), supporting both standard and Kata Containers.

Key Features & Changes:
- **Kubernetes Service**: Implemented `KubernetesService` in `clusterfuzz._internal.k8s.service` to manage job creation.
- **Kata Support**: Added specialized job creation for Kata Containers (`create_kata_container_job`) with required security context (`privileged`, `capabilities: ALL`), networking (`hostNetwork: True`), and environment variables (`HOST_UID`).
- **Dependency Management**: Added `kubernetes` and necessary Google Cloud dependencies (`google-api-python-client`, `google-cloud-storage`, `google-cloud-ndb`, etc.) to `Pipfile`.
- **E2E Testing**:
  - Created `tests.core.k8s.k8s_service_e2e_test` to verify job lifecycle on a local Kind cluster.
  - Updated `local/tests/kubernetes_e2e_test.bash` to provision the test environment.
  - Updated CI workflow (`.github/workflows/kubernetes-e2e-tests.yaml`) to install JDK 21 (required for Datastore emulator).
  - Tests now verify job "Running" status to avoid timeouts with long-running commands.
  - `KubernetesService` skips default credential loading when `K8S_E2E` is set to utilize the test-provided kubeconfig.
- **Unit Tests**: Added comprehensive unit tests in `tests.core.k8s.k8s_service_test` and `tests.core.kubernetes.kubernetes_test`, including mocking of `load_kube_config` and `_load_gke_credentials` to ensure robust testing without external dependencies.

pipenv lock
Signed-off-by: Javan Lacerda <javanlacerda@google.com>

install kubernetes
Signed-off-by: Javan Lacerda <javanlacerda@google.com>

Update dependencies and fix linting move use_batch
Signed-off-by: Javan Lacerda <javanlacerda@google.com>

add todo
Signed-off-by: Javan Lacerda <javanlacerda@google.com>

Pr/metrics logging (#5115)
Signed-off-by: Javan Lacerda <javanlacerda@google.com>

fixes
Signed-off-by: Javan Lacerda <javanlacerda@google.com>

fix lint
Signed-off-by: Javan Lacerda <javanlacerda@google.com>

mock gcloud auth default
Signed-off-by: Javan Lacerda <javanlacerda@google.com>

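
The Kata settings named in the commit message (`privileged`, `capabilities: ALL`, `hostNetwork: True`, `HOST_UID`) can be pictured as fields of a `batch/v1` Job manifest. The sketch below is only an illustration of where those fields sit; `build_kata_job_manifest`, its parameters, and the `restartPolicy` value are assumptions, and the commit's actual `create_kata_container_job` is not shown on this page and may differ.

```python
def build_kata_job_manifest(job_name, image, host_uid):
  """Return a batch/v1 Job manifest as a dict (hypothetical helper)."""
  container = {
      'name': job_name,
      'image': image,
      # Kubernetes env values must be strings.
      'env': [{'name': 'HOST_UID', 'value': str(host_uid)}],
      'securityContext': {
          'privileged': True,
          'capabilities': {'add': ['ALL']},
      },
  }
  return {
      'apiVersion': 'batch/v1',
      'kind': 'Job',
      'metadata': {'name': job_name},
      'spec': {
          'template': {
              'spec': {
                  'hostNetwork': True,
                  # Assumption: Job pods need Never or OnFailure.
                  'restartPolicy': 'Never',
                  'containers': [container],
              }
          }
      },
  }
```

A dict of this shape can be passed as the `body` argument of `kubernetes.client.BatchV1Api().create_namespaced_job`; any Kata-specific runtime selection is left out here because the commit message does not describe it.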
1 parent 4aeec35 commit 07515ff

42 files changed

Lines changed: 4627 additions & 1640 deletions

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+# Copyright 2025 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+name: Run Kubernetes e2e tests
+on: [pull_request]
+
+permissions: read-all
+
+jobs:
+  build:
+    runs-on: ubuntu-24.04
+
+    steps:
+      - uses: actions/checkout@v3
+      - run: | # Needed for git diff to work.
+          git fetch origin master --depth 1
+          git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/master
+
+      - name: Setup python environment
+        uses: actions/setup-python@b55428b1882923874294fa556849718a1d7f2ca5
+        with:
+          python-version: 3.11
+
+      - name: Set up JDK 21
+        uses: actions/setup-java@v3
+        with:
+          java-version: '21'
+          distribution: 'temurin'
+
+      - name: Run Kubernetes e2e tests
+        run: ./local/tests/kubernetes_e2e_test.bash

Pipfile

Lines changed: 13 additions & 0 deletions
@@ -10,6 +10,19 @@ future = "==0.17.1"
 protobuf = "==4.23.4"
 psutil = "==5.9.4"
 google-cloud-ndb = "==2.3.4"
+kubernetes = "==34.1.0"
+google-api-python-client = "==2.93.0"
+aiohttp = "==3.10.5"
+google-cloud-storage = "==2.10.0"
+google-cloud-secret-manager = "==2.17.0"
+google-cloud-logging = "==3.6.0"
+google-cloud-monitoring = "==2.15.1"
+google-cloud-datastore = "==2.16.1"
+oauth2client = "==4.1.3"
+requests = "==2.21.0"
+PyYAML = "==6.0"
+httplib2 = "==0.19.0"
+google-auth-oauthlib = "==0.4.1"
 
 [dev-packages]
 Fabric = "==1.14.1"

Pipfile.lock

Lines changed: 1449 additions & 536 deletions
Generated file; diff not rendered by default.

butler.py

Lines changed: 0 additions & 1 deletion
@@ -435,7 +435,6 @@ def main():
       'clean_indexes', help='Clean up undefined indexes (in index.yaml).')
   parser_clean_indexes.add_argument(
       '-c', '--config-dir', required=True, help='Path to application config.')
-
   parser_create_config = subparsers.add_parser(
       'create_config', help='Create a new deployment config.')
   parser_create_config.add_argument(
Lines changed: 13 additions & 11 deletions
@@ -1,3 +1,5 @@
+#!/bin/bash -ex
+#
 # Copyright 2025 Google LLC
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
@@ -11,17 +13,17 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""Kubernetes batch client."""
-from clusterfuzz._internal.remote_task import RemoteTaskInterface
 
+# This script is for running the Kubernetes end-to-end test in CI.
+
+pip install pipenv
+
+# Install dependencies.
+pipenv --python 3.11
+pipenv install
 
-class KubernetesJobClient(RemoteTaskInterface):
-  """A remote task execution client for Kubernetes.
-
-  This class is a placeholder for a future implementation of a remote task
-  execution client that uses Kubernetes. It is not yet implemented.
-  """
+./local/install_deps.bash
 
-  def create_job(self, spec, input_urls):
-    """Creates a Kubernetes job."""
-    raise NotImplementedError('Kubernetes batch client is not implemented yet.')
+# Run the test.
+export K8S_E2E=1
+pipenv run python butler.py py_unittest -t core -p k8s_service_e2e_test.py
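
The script exports `K8S_E2E=1`, and the commit message says `KubernetesService` skips default credential loading when that variable is set. A minimal sketch of such a check follows; `should_load_default_credentials` is a hypothetical name used for illustration, and the real service code is not shown on this page.

```python
import os


def should_load_default_credentials(environ=None):
  """Return False when K8S_E2E is set, so the test kubeconfig is used."""
  # Hypothetical helper: accepts a mapping so it can be tested without
  # touching the real process environment.
  environ = os.environ if environ is None else environ
  return not environ.get('K8S_E2E')
```

Taking the environment as a parameter keeps the decision testable in isolation, in the same spirit as the unit tests that mock `load_kube_config`.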

out.log

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+Running in dry-run mode, no datastore writes are committed. For permanent modifications, re-run with --non-dry-run.
+Attempting to combine batch tasks.
+Combining 2901 batch tasks.
+K8s result: ['libfuzzer-chrome-asan-debug-7ca9c46f']
+
+Please remember to run the migration individually on all projects.
+

src/Pipfile

Lines changed: 2 additions & 1 deletion
@@ -27,14 +27,15 @@ google-crc32c = "==1.5.0"
 grpcio = "==1.62.2"
 httplib2 = "==0.19.0"
 jira = "==2.0.0"
+kubernetes = "==34.1.0"
 mozprocess = "==1.3.1"
 oauth2client = "==4.1.3"
 psutil = "==5.9.4"
 protobuf = "==4.23.4"
 pygithub = "==1.55"
 pyOpenSSL = "==22.0.0"
 python-dateutil = "==2.8.1"
-PyYAML = "==6.0"
+PyYAML = "==6.0.1"
 pytz = "==2023.3"
 redis = "==4.6.0"
 requests = "==2.21.0"

src/Pipfile.lock

Lines changed: 951 additions & 673 deletions
Generated file; diff not rendered by default.

src/clusterfuzz/_internal/base/tasks/__init__.py

Lines changed: 11 additions & 2 deletions
@@ -64,6 +64,12 @@
     'regression': 24 * 60 * 60,
 }
 
+
+def get_task_duration(command):
+  """Gets the duration of a task."""
+  return TASK_LEASE_SECONDS_BY_COMMAND.get(command, TASK_LEASE_SECONDS)
+
+
 TASK_QUEUE_DISPLAY_NAMES = {
     'LINUX': 'Linux',
     'LINUX_WITH_GPU': 'Linux (with GPU)',
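
The new `get_task_duration` helper is a dictionary lookup with a default: a command with its own lease entry gets that value, and any other command falls back to `TASK_LEASE_SECONDS`. The standalone sketch below reproduces the function from the hunk above; the value of `TASK_LEASE_SECONDS` is a placeholder chosen for illustration, since the diff does not show it, and only the `regression` entry is taken from the diff context.

```python
# Placeholder default for illustration; the real value is not in the diff.
TASK_LEASE_SECONDS = 6 * 60 * 60

TASK_LEASE_SECONDS_BY_COMMAND = {
    'regression': 24 * 60 * 60,  # Entry visible in the diff context.
}


def get_task_duration(command):
  """Gets the duration of a task."""
  return TASK_LEASE_SECONDS_BY_COMMAND.get(command, TASK_LEASE_SECONDS)
```

Calling `get_task_duration('regression')` returns the per-command lease, while a command absent from the mapping returns the default.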
@@ -503,6 +509,7 @@ def __init__(self, pubsub_message):
     }
 
     self.eta = datetime.datetime.utcfromtimestamp(float(self.attribute('eta')))
+    self.do_not_ack = False
 
   def attribute(self, key):
     """Return attribute value."""
@@ -550,7 +557,8 @@ def lease(self, _event=None): # pylint: disable=arguments-differ
       leaser_thread.join()
 
     # If we get here the task succeeded in running. Acknowledge the message.
-    self._pubsub_message.ack()
+    if not self.do_not_ack:
+      self._pubsub_message.ack()
     track_task_end()
 
   def dont_retry(self):
@@ -587,7 +595,8 @@ def lease(self, _event=None): # pylint: disable=arguments-differ
       leaser_thread.join()
 
     # If we get here the task succeeded in running. Acknowledge the message.
-    self._pubsub_message.ack()
+    if not self.do_not_ack:
+      self._pubsub_message.ack()
     track_task_end()
 
 
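
The `do_not_ack` flag defaults to `False`, so existing behaviour is unchanged unless something sets it before `lease()` finishes. The diff does not show who sets the flag; the sketch below only models the guarded acknowledgement, using the stand-in classes `FakeMessage` and `SketchTask`, which are hypothetical and not part of the commit.

```python
class FakeMessage:
  """Stand-in for a Pub/Sub message that records acknowledgement."""

  def __init__(self):
    self.acked = False

  def ack(self):
    self.acked = True


class SketchTask:
  """Simplified task: only the ack decision from the diff is modelled."""

  def __init__(self, pubsub_message):
    self._pubsub_message = pubsub_message
    self.do_not_ack = False

  def finish(self):
    # Mirrors the guarded ack: skip it when the flag was set.
    if not self.do_not_ack:
      self._pubsub_message.ack()
```

A message that is never acknowledged becomes eligible for redelivery by Pub/Sub once its ack deadline expires, so setting the flag leaves the task on the queue.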

src/clusterfuzz/_internal/batch/data_structures.py

Lines changed: 0 additions & 50 deletions
This file was deleted.
