title Amazon Athena
description Give Cube IAM access to Athena, an S3 query-results path, region and catalog settings, plus optional assumed-role credentials.

Prerequisites

Setup

Manual

Add the following to a .env file in your Cube project:

Static Credentials

CUBEJS_DB_TYPE=athena
CUBEJS_AWS_KEY=AKIA************
CUBEJS_AWS_SECRET=****************************************
CUBEJS_AWS_REGION=us-east-1
CUBEJS_AWS_S3_OUTPUT_LOCATION=s3://my-athena-output-bucket
CUBEJS_AWS_ATHENA_WORKGROUP=primary
CUBEJS_DB_NAME=my_non_default_athena_database
CUBEJS_AWS_ATHENA_CATALOG=AwsDataCatalog

IAM Role Assumption

For enhanced security, you can configure Cube to assume an IAM role to access Athena:

CUBEJS_DB_TYPE=athena
CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN=arn:aws:iam::123456789012:role/AthenaAccessRole
CUBEJS_AWS_REGION=us-east-1
CUBEJS_AWS_S3_OUTPUT_LOCATION=s3://my-athena-output-bucket
CUBEJS_AWS_ATHENA_WORKGROUP=primary
# Optional: if the role requires an external ID
CUBEJS_AWS_ATHENA_ASSUME_ROLE_EXTERNAL_ID=unique-external-id

When using role assumption:

  • If running in AWS (EC2, ECS, EKS with IRSA), the driver will use the instance's IAM role or service account to assume the target role
  • You can also provide CUBEJS_AWS_KEY and CUBEJS_AWS_SECRET as master credentials for the role assumption
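As a rough illustration of what role assumption involves, the sketch below builds the request parameters for an AWS STS AssumeRole call from the variables above. This is not Cube's driver code: the function name `build_assume_role_params` and the session name `cube-athena-session` are hypothetical, chosen only for this example. `RoleArn`, `RoleSessionName` and `ExternalId` are the parameter names of the STS AssumeRole request.

```python
import os
from typing import Mapping, Optional


def build_assume_role_params(
    env: Mapping[str, str],
    session_name: str = "cube-athena-session",  # hypothetical session name
) -> Optional[dict]:
    """Return STS AssumeRole parameters, or None if no role is configured."""
    role_arn = env.get("CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN")
    if not role_arn:
        return None

    params = {"RoleArn": role_arn, "RoleSessionName": session_name}

    # The external ID is only sent when the role's trust policy requires one.
    external_id = env.get("CUBEJS_AWS_ATHENA_ASSUME_ROLE_EXTERNAL_ID")
    if external_id:
        params["ExternalId"] = external_id
    return params


if __name__ == "__main__":
    print(build_assume_role_params(os.environ))
```

With boto3 installed, the resulting dictionary could be passed on as `boto3.client("sts").assume_role(**params)`; the caller's own credentials (static keys or the instance's role) are what authorize that call.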

Cube Cloud

In some cases you'll need to allow connections from your Cube Cloud deployment IP address to your database. You can copy the IP address from either the Database Setup step in deployment creation, or from Settings → Configuration in your deployment.

In Cube Cloud, select AWS Athena when creating a new deployment and fill in the required fields:

[Image: Cube Cloud AWS Athena configuration screen]

Cube Cloud also supports connecting to data sources within private VPCs if dedicated infrastructure is used. Check out the VPC connectivity guide for details.

Environment Variables

| Environment Variable | Description | Possible Values | Required |
| --- | --- | --- | --- |
| CUBEJS_AWS_KEY | The AWS Access Key ID to use for database connections | A valid AWS Access Key ID | See note 1 |
| CUBEJS_AWS_SECRET | The AWS Secret Access Key to use for database connections | A valid AWS Secret Access Key | See note 1 |
| CUBEJS_AWS_REGION | The AWS region of the Cube deployment | A valid AWS region | |
| CUBEJS_AWS_S3_OUTPUT_LOCATION | The S3 path where results of queries made by the Cube deployment are stored | A valid S3 path | |
| CUBEJS_AWS_ATHENA_WORKGROUP | The name of the workgroup in which queries are started | A valid Athena Workgroup | |
| CUBEJS_AWS_ATHENA_CATALOG | The name of the catalog to use by default | A valid Athena Catalog name | |
| CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN | The ARN of the IAM role to assume for Athena access | A valid IAM role ARN | |
| CUBEJS_AWS_ATHENA_ASSUME_ROLE_EXTERNAL_ID | The external ID to use when assuming the IAM role (if required by the role's trust policy) | A string | |
| CUBEJS_DB_NAME | The name of the database to use by default | A valid Athena Database name | |
| CUBEJS_DB_SCHEMA | The name of the schema to use as an information_schema filter. Reduces the number of tables loaded during schema generation. | A valid schema name | |
| CUBEJS_CONCURRENCY | The number of concurrent queries to the data source | A valid number | |

1 Either provide CUBEJS_AWS_KEY and CUBEJS_AWS_SECRET for static credentials, or use CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN for role-based authentication. When using role assumption without static credentials, the driver will use the AWS SDK's default credential chain (IAM instance profile, EKS IRSA, etc.).
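The rule in this note can be expressed as a small pre-flight check. The following is a minimal sketch, not part of Cube: the function name `resolve_auth_mode` and the returned labels are hypothetical, and treating a half-set key pair or a completely empty configuration as an error is an assumption of this example rather than documented driver behavior.

```python
from typing import Mapping


def resolve_auth_mode(env: Mapping[str, str]) -> str:
    """Classify which authentication path the given variables describe."""
    has_key = bool(env.get("CUBEJS_AWS_KEY"))
    has_secret = bool(env.get("CUBEJS_AWS_SECRET"))
    has_role = bool(env.get("CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN"))

    # Assumption: a key without a secret (or the reverse) is a mistake.
    if has_key != has_secret:
        raise ValueError(
            "CUBEJS_AWS_KEY and CUBEJS_AWS_SECRET must be set together"
        )

    if has_role and has_key:
        # Static keys act as the base credentials for the role assumption.
        return "assume-role-with-static-credentials"
    if has_role:
        # The AWS SDK default credential chain supplies the base credentials.
        return "assume-role-with-default-chain"
    if has_key:
        return "static-credentials"

    # Assumption: this sketch rejects a configuration with neither option.
    raise ValueError(
        "Set CUBEJS_AWS_KEY/CUBEJS_AWS_SECRET or "
        "CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN"
    )
```

Calling `resolve_auth_mode(os.environ)` before starting Cube would surface a mismatched configuration early instead of at the first query.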

Pre-Aggregation Feature Support

count_distinct_approx

Measures of type count_distinct_approx can be used in pre-aggregations when using AWS Athena as a source database. To learn more about AWS Athena's support for approximate aggregate functions, see the AWS Athena documentation.

Pre-Aggregation Build Strategies

To learn more about pre-aggregation build strategies, see the pre-aggregations documentation.

| Feature | Works with read-only mode? | Is default? |
| --- | --- | --- |
| Batching | | Yes |
| Export Bucket | | No |

By default, AWS Athena uses a batching strategy to build pre-aggregations.

Batching

No extra configuration is required to use batching with AWS Athena.

Export Bucket

AWS Athena only supports using AWS S3 for export buckets.

AWS S3

For improved pre-aggregation performance with large datasets, enable export bucket functionality by configuring Cube with the environment variables below. Ensure the AWS credentials are configured in IAM to allow reads and writes to the export bucket in S3.

CUBEJS_DB_EXPORT_BUCKET_TYPE=s3
CUBEJS_DB_EXPORT_BUCKET=my.bucket.on.s3
CUBEJS_DB_EXPORT_BUCKET_AWS_KEY=<AWS_KEY>
CUBEJS_DB_EXPORT_BUCKET_AWS_SECRET=<AWS_SECRET>
CUBEJS_DB_EXPORT_BUCKET_AWS_REGION=<AWS_REGION>
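To make the IAM requirement concrete, the sketch below generates a policy document granting read and write access to the export bucket. The helper name `export_bucket_policy` is hypothetical, and the exact set of S3 actions Cube needs is an assumption of this example (list, get and put); adjust it to what your deployment actually requires.

```python
import json


def export_bucket_policy(bucket: str) -> dict:
    """Build an IAM policy document for reading and writing one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object reads and writes apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(export_bucket_policy("my.bucket.on.s3"), indent=2))
```

The printed JSON can be attached to the IAM user or role whose credentials are set in CUBEJS_DB_EXPORT_BUCKET_AWS_KEY and CUBEJS_DB_EXPORT_BUCKET_AWS_SECRET.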

SSL

Cube does not require any additional configuration to enable SSL as AWS Athena connections are made over HTTPS.