Commit 1b2ccc7

LuciferYang authored and MaxGekk committed
[SPARK-57695][CORE] Make TestUtils.recursiveList null-safe by reusing Utils.recursiveList
### What changes were proposed in this pull request?

`TestUtils.recursiveList` duplicated the recursive directory walk already provided by `Utils.recursiveList` (i.e. `SparkFileUtils.recursiveList`). This PR removes the duplicated copy and delegates to `Utils.recursiveList`:

```scala
// before
def recursiveList(f: File): Array[File] = {
  require(f.isDirectory)
  val current = f.listFiles
  current ++ current.filter(_.isDirectory).flatMap(recursiveList)
}

// after
def recursiveList(f: File): Array[File] = Utils.recursiveList(f)
```

### Why are the changes needed?

The duplicated implementation carried the same two issues that were just fixed in `SparkFileUtils.recursiveList` under [SPARK-57530](https://issues.apache.org/jira/browse/SPARK-57530):

1. It called `File.listFiles` without a null check, so an IO error (or the directory being removed mid-walk) would throw an NPE.
2. The `current ++ ... .flatMap(...)` form had no linear-time guarantee.

By delegating to `Utils.recursiveList`, `TestUtils.recursiveList` automatically picks up the null-safety (a directory that cannot be listed is skipped with a warning instead of throwing) and the O(n) traversal from SPARK-57530, and the duplicated logic is removed.

This is a follow-up to SPARK-57530, which is already merged to master.

### Does this PR introduce _any_ user-facing change?

No. `TestUtils` is a test-only `private[spark]` helper; this is an internal refactor with no behavior change for any successful directory walk.

### How was this patch tested?

Existing tests that use `TestUtils.recursiveList` continue to exercise it. The behavioral contract for a readable directory tree is unchanged (same set of files returned); the only difference is that an unreadable directory is now skipped with a warning rather than throwing, matching `Utils.recursiveList`.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code

Closes #56901 from LuciferYang/SPARK-57695-testutils-recursivelist.

Authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit a4ce62b)
Signed-off-by: Max Gekk <max.gekk@gmail.com>

1 parent 477d916 commit 1b2ccc7

1 file changed

Lines changed: 1 addition & 5 deletions

core/src/main/scala/org/apache/spark/TestUtils.scala

@@ -387,11 +387,7 @@ private[spark] object TestUtils extends SparkTestUtils {
   /**
    * Lists files recursively.
    */
-  def recursiveList(f: File): Array[File] = {
-    require(f.isDirectory)
-    val current = f.listFiles
-    current ++ current.filter(_.isDirectory).flatMap(recursiveList)
-  }
+  def recursiveList(f: File): Array[File] = Utils.recursiveList(f)
 
   /**
    * Returns the list of files at 'path' recursively. This skips files that are ignored normally
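
For readers without the SPARK-57530 change at hand, the two properties this commit relies on (a null check on `File.listFiles` and a single-pass accumulation) can be illustrated with a minimal sketch. This is not the `SparkFileUtils.recursiveList` implementation: the object and method names below are hypothetical, the warning goes to stderr rather than Spark's logger, and the result order may differ from the removed code.

```scala
import java.io.File
import scala.collection.mutable.ArrayBuffer

// Hypothetical illustration only; names are not part of Spark.
object RecursiveListSketch {
  def recursiveListSafe(root: File): Array[File] = {
    val result = ArrayBuffer.empty[File]  // one shared buffer, appended to once per entry
    var pending: List[File] = List(root)  // directories still to be listed
    while (pending.nonEmpty) {
      val dir = pending.head
      pending = pending.tail
      // File.listFiles returns null if the path is not a directory
      // or if an I/O error occurs while listing it.
      val children = dir.listFiles()
      if (children == null) {
        System.err.println(s"Skipping path that cannot be listed: $dir")
      } else {
        result ++= children
        pending = children.filter(_.isDirectory).toList ::: pending
      }
    }
    result.toArray
  }
}
```

Unlike the removed `TestUtils` code, this sketch does not call `require(f.isDirectory)`: a root that is missing or not a directory simply yields an empty array, consistent with the "skip instead of throw" behavior the commit message describes.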
