Commit 16f50b9

gaogaotiantian authored and HyukjinKwon committed
[SPARK-56688][INFRA][PYTHON] Reorganize pyspark tests to empty a CI slot
### What changes were proposed in this pull request?

We emptied a CI slot for future usage. After some optimization work, the CI slot `pyspark-core, pyspark-errors, pyspark-streaming, pyspark-logger` normally takes only about 30 minutes. That wastes a slot, because we can only have 20 concurrent jobs. This PR splits its workload into other slots:

* `pyspark-core, pyspark-errors, pyspark-logger` -> `pyspark-sql, pyspark-resource, pyspark-testing`
* `pyspark-streaming` -> `pyspark-structured-streaming, pyspark-structured-streaming-connect`
* the pip test, which used to follow `pyspark-logger`, now follows `pyspark-pipelines`

We should still be able to keep all the CI slots below 90 minutes most of the time (120 is the limit), and we gain a new slot.

### Why are the changes needed?

The reason we want a new slot is to have a CI test against an old client (for example, the 4.0 client) to check for backward-compatibility issues. We have broken the client multiple times this year and have zero tests to gate it.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

It's a CI change only.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #55638 from gaogaotiantian/rearrange-pyspark-tests.

Authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit cff6c0d)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent 6778699 · commit 16f50b9

1 file changed: `.github/workflows/build_and_test.yml` (5 additions, 8 deletions)
```diff
@@ -555,13 +555,11 @@ jobs:
           - ${{ inputs.java }}
         modules:
           - >-
-            pyspark-sql, pyspark-resource, pyspark-testing
-          - >-
-            pyspark-core, pyspark-errors, pyspark-streaming, pyspark-logger
+            pyspark-sql, pyspark-resource, pyspark-testing, pyspark-core, pyspark-errors, pyspark-logger
           - >-
             pyspark-mllib, pyspark-ml, pyspark-ml-connect, pyspark-pipelines
           - >-
-            pyspark-structured-streaming, pyspark-structured-streaming-connect
+            pyspark-streaming, pyspark-structured-streaming, pyspark-structured-streaming-connect
           - >-
             pyspark-connect
           - >-
@@ -578,10 +576,9 @@ jobs:
           # Always run if pyspark == 'true', even infra-image is skip (such as non-master job)
           # In practice, the build will run in individual PR, but not against the individual commit
           # in Apache Spark repository.
-          - modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-sql, pyspark-resource, pyspark-testing' }}
-          - modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-core, pyspark-errors, pyspark-streaming, pyspark-logger' }}
+          - modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-sql, pyspark-resource, pyspark-testing, pyspark-core, pyspark-errors, pyspark-logger' }}
           - modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-mllib, pyspark-ml, pyspark-ml-connect' }}
-          - modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-structured-streaming, pyspark-structured-streaming-connect' }}
+          - modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-streaming, pyspark-structured-streaming, pyspark-structured-streaming-connect' }}
           - modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-connect != 'true' && 'pyspark-connect' }}
           # pyspark-install is very slow so we only run it when it's changed or explicity requested
           - modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-install != 'true' && 'pyspark-install' }}
@@ -665,7 +662,7 @@ jobs:
         env: ${{ fromJSON(inputs.envs) }}
         shell: 'script -q -e -c "bash {0}"'
         run: |
-          if [[ "$MODULES_TO_TEST" == *"pyspark-errors"* ]]; then
+          if [[ "$MODULES_TO_TEST" == *"pyspark-pipelines"* ]]; then
            export SKIP_PACKAGING=false
            echo "Python Packaging Tests Enabled!"
           fi
```
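The packaging gate changed in the last hunk is a plain Bash glob substring match: packaging tests run only when the module list contains `pyspark-pipelines`. A minimal standalone sketch of that logic (the `MODULES_TO_TEST` value below is a made-up example, not one of the actual CI slot lists):

```shell
#!/usr/bin/env bash
# Sketch of the gating logic: packaging tests are skipped unless the
# module list contains "pyspark-pipelines". The list here is illustrative.
MODULES_TO_TEST="pyspark-sql, pyspark-resource, pyspark-pipelines"
SKIP_PACKAGING=true
# [[ ... == *pattern* ]] does a glob match, so this is a substring test.
if [[ "$MODULES_TO_TEST" == *"pyspark-pipelines"* ]]; then
  SKIP_PACKAGING=false
  echo "Python Packaging Tests Enabled!"
fi
echo "SKIP_PACKAGING=$SKIP_PACKAGING"
```

The same pattern previously keyed on `pyspark-errors`; since that module moved into a slot that no longer anchors the pip test, the gate now keys on `pyspark-pipelines` instead.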
