Make URITools.s3Region configurable via system properties/env vars for Spark/EMR #91

@carshadi

Description

URITools uses a static field:

public static String s3Region = null;

This works in a single JVM, but on Spark/EMR the driver and every executor run in separate JVMs, each with its own copy of the static field. Setting URITools.s3Region on the driver therefore does not propagate to the executors, so S3 access there may use the wrong or default region.

Proposal

Initialize s3Region from a system property, falling back to an environment variable, so it can be configured via EMR/Spark settings:

public class URITools {

    public static int cloudThreads = 256;

    public static String s3Region =
            System.getProperty(
                "s3Region",
                System.getenv("AWS_REGION")
            );

    public static boolean useS3CredentialsWrite = true;
    public static boolean useS3CredentialsRead = true;
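}

The region could then be set for both driver and executors through Spark configuration, for example:

spark.driver.extraJavaOptions=-Ds3Region=us-west-2
spark.executor.extraJavaOptions=-Ds3Region=us-west-2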
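
To make the intended precedence explicit (system property first, then environment variable, then null), here is a minimal sketch of the lookup as a standalone helper. The class `S3RegionResolver` and its methods are hypothetical and not part of URITools. Treating blank values as unset is an extra assumption that goes beyond the one-liner proposed above.

```java
import java.util.Map;
import java.util.Properties;

/**
 * Hypothetical helper (not part of URITools) illustrating the proposed
 * lookup order: -Ds3Region=..., then AWS_REGION, then null.
 */
final class S3RegionResolver {

    // Key names taken from the proposal in this issue.
    static final String PROPERTY_KEY = "s3Region";
    static final String ENV_KEY = "AWS_REGION";

    private S3RegionResolver() {}

    /** Resolves against the real JVM system properties and process environment. */
    static String resolve() {
        return resolve(System.getProperties(), System.getenv());
    }

    /**
     * Resolves against explicit sources so the precedence can be unit-tested.
     * Assumption: blank values are treated as unset.
     */
    static String resolve(Properties props, Map<String, String> env) {
        String fromProperty = props.getProperty(PROPERTY_KEY);
        if (fromProperty != null && !fromProperty.trim().isEmpty()) {
            return fromProperty.trim();
        }
        String fromEnv = env.get(ENV_KEY);
        if (fromEnv != null && !fromEnv.trim().isEmpty()) {
            return fromEnv.trim();
        }
        // Matches the current default: no region configured.
        return null;
    }
}
```

With such a helper the field initializer would read `public static String s3Region = S3RegionResolver.resolve();`. Because the value is computed when the class is loaded in each JVM, every executor picks it up from its own `-Ds3Region` option or environment. For the environment-variable path, Spark's `spark.executorEnv.AWS_REGION` setting is one way to set the variable on executors.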
