from analysis_orchestrator import AnalysisOrchestrator
# Initialize
orchestrator = AnalysisOrchestrator(token="your_token", project="YourProject")
# Run analysis
result = orchestrator.run_mean_analysis(concept_id=21490742, tres=["Nottingham"])
print(f"Result: {result['result']}")

The system uses environment variables for configuration. See env.example for all available options:
# Required environment variables:
5STES_TOKEN=your_jwt_token_here
5STES_PROJECT=your_project_name
TES_BASE_URL=http://your-tes-endpoint:5034/v1/tasks
TES_DOCKER_IMAGE=harbor.your-registry.com/your-image:tag
# Database Configuration
DB_HOST=your-database-host
DB_PORT=5432
DB_USERNAME=your-database-username
DB_PASSWORD=your-database-password
DB_NAME=your-database-name
# MinIO Configuration
MINIO_STS_ENDPOINT=http://your-minio-endpoint:9000/sts
MINIO_ENDPOINT=your-minio-endpoint:9000
MINIO_OUTPUT_BUCKET=your-output-bucket-name

Main orchestrator class for federated analysis workflow.
AnalysisOrchestrator(token: str, project: str = None)
Parameters:
token: Authentication token for TRE-FX services (required)
project: Project name for TES tasks (defaults to 5STES_PROJECT environment variable)
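The fallback from an explicit argument to an environment variable can be sketched as below. This is an illustration only, not the orchestrator's actual code: `resolve_project` is a hypothetical helper, and raising `ValueError` when neither source is set is an assumption.

```python
import os


def resolve_project(project=None, env_var="5STES_PROJECT"):
    """Return the explicit project name, else the environment variable's value.

    Hypothetical helper: the real constructor may resolve this differently.
    """
    if project:
        return project
    value = os.environ.get(env_var)
    if not value:
        # Assumption: failing early is preferable to submitting a task
        # without a project name.
        raise ValueError(f"No project given and {env_var} is not set")
    return value
```

With `5STES_PROJECT` set, `resolve_project()` returns its value, while `resolve_project("Other")` returns the explicit argument.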
Run a complete federated analysis workflow.
Parameters:
analysis_type (str): Type of analysis ("mean", "variance", "PMCC", "chi_squared_scipy")
query (str): SQL query to execute
tres (List[str]): List of TREs to run analysis on
task_name (str, optional): Name for the TES task (defaults to "analysis {analysis_type}")
bucket (str, optional): MinIO bucket for outputs (defaults to MINIO_OUTPUT_BUCKET environment variable)
Returns: Dict with analysis results
Run mean analysis for a specific concept.
Run variance analysis for a specific concept.
Run Pearson's correlation analysis between two concepts.
Run chi-squared analysis for gender vs race contingency table.
Build SQL queries for different analysis types.
Build query for mean calculation.
Build query for variance calculation.
Build query for Pearson's correlation coefficient.
Build query for contingency table generation.
Build a custom SQL query.
Validate SQL query for safety.
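What "validate for safety" might involve can be sketched as a conservative read-only check. The function name `is_read_only_query` and the keyword list are assumptions for illustration; the QueryBuilder's own validation rules are not documented here and may differ.

```python
import re

# Assumed deny-list of statements that modify data or schema.
FORBIDDEN_KEYWORDS = (
    "insert", "update", "delete", "drop",
    "alter", "truncate", "create", "grant",
)


def is_read_only_query(query: str) -> bool:
    """Return True only for a single SELECT/WITH statement with no deny-listed keyword."""
    stripped = query.strip().rstrip(";").strip()
    # Reject empty input and anything that still contains a statement separator.
    if not stripped or ";" in stripped:
        return False
    lowered = stripped.lower()
    if not lowered.startswith(("select", "with")):
        return False
    # Whole-word match, so a column such as "updated_at" is not rejected.
    return not any(re.search(rf"\b{kw}\b", lowered) for kw in FORBIDDEN_KEYWORDS)
```

The check is deliberately strict: it also rejects harmless queries that contain a semicolon or a deny-listed word inside a string literal. A keyword filter of this kind does not replace a read-only database role.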
Perform statistical calculations and analysis.
Analyze data using specified analysis type.
Get list of supported analysis types.
Calculate descriptive statistics.
Perform hypothesis testing between two datasets.
Calculate confidence interval for the mean.
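As a rough illustration of the confidence-interval calculation, the sketch below uses a normal approximation built from the standard library. `mean_confidence_interval` is a hypothetical name, and the StatisticalAnalyzer may instead use a t-distribution or another method.

```python
import math
import statistics


def mean_confidence_interval(values, confidence=0.95):
    """Return (low, high) for the mean using a normal approximation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem
```

For small samples the normal approximation gives intervals that are too narrow, so a t-based interval is usually the safer choice there.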
Handle data processing, aggregation, and file operations.
Import data from CSV string or list of strings.
Combine multiple contingency tables.
Aggregate data based on analysis type.
Validate data for a given analysis type.
Convert contingency table dictionary to numpy array.
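Combining per-TRE contingency tables and turning the result into a matrix could look like the sketch below. The dictionary layout, keyed by `(row_label, column_label)` with a count as the value, is an assumption, as are the function names; the DataProcessor's actual format is not shown in this document.

```python
from collections import Counter


def combine_tables(tables):
    """Sum cell counts across several contingency tables."""
    total = Counter()
    for table in tables:
        total.update(table)  # adds counts cell by cell
    return dict(total)


def to_matrix(table):
    """Return (row_labels, column_labels, matrix) with missing cells as 0."""
    rows = sorted({row for row, _ in table})
    cols = sorted({col for _, col in table})
    matrix = [[table.get((row, col), 0) for col in cols] for row in rows]
    return rows, cols, matrix


# Two TREs reporting partially overlapping cells (made-up counts).
tre_a = {("F", "A"): 3, ("M", "A"): 2}
tre_b = {("F", "A"): 1, ("F", "B"): 4}
rows, cols, matrix = to_matrix(combine_tables([tre_a, tre_b]))
print(rows, cols, matrix)  # ['F', 'M'] ['A', 'B'] [[4, 4], [2, 0]]
```

The nested list can be passed to `numpy.array(matrix)` when a NumPy array is needed for a chi-squared test.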
Handle TES (Task Execution Service) operations.
BaseTESClient(base_url=None,
              default_image=None,
              default_db_config=None,
              default_db_port=None)
Parameters:
base_url: TES API endpoint (defaults to TES_BASE_URL environment variable)
default_image: Docker image (defaults to TES_DOCKER_IMAGE environment variable)
default_db_config: Database configuration dict (defaults to environment variables)
default_db_port: Database port (defaults to DB_PORT environment variable)
Generate a TES task JSON configuration.
Submit a TES task using the requests library.
Get the status of a submitted task.
List recent tasks.
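For orientation, a minimal TES task document might be assembled as below. The field names (`name`, `executors`, `image`, `command`, `env`, `outputs`, `url`, `path`, `type`) follow the GA4GH TES task schema as generally documented; `build_tes_task` and all example values are hypothetical, and the JSON this client actually generates may contain more fields.

```python
import json


def build_tes_task(name, image, command, output_url, output_path="/outputs", env=None):
    """Build a minimal TES task as a plain dict (illustrative only)."""
    return {
        "name": name,
        "executors": [
            {
                "image": image,
                "command": list(command),
                "env": dict(env or {}),
            }
        ],
        "outputs": [
            {"url": output_url, "path": output_path, "type": "DIRECTORY"}
        ],
    }


task = build_tes_task(
    name="analysis mean",
    image="harbor.your-registry.com/your-image:tag",
    command=["python", "run.py"],  # placeholder command, not the real entry point
    output_url="s3://your-output-bucket-name",
    env={"DB_PORT": "5432"},
)
print(json.dumps(task, indent=2))
```

A dict of this shape can then be sent as the JSON body of a POST to the TES endpoint, for example with `requests.post(url, json=task)`.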
Handle MinIO operations and token management.
MinIOClient(token: str,
            sts_endpoint=None,
            minio_endpoint=None)
Parameters:
token: OIDC token for authentication
sts_endpoint: STS endpoint URL (defaults to MINIO_STS_ENDPOINT environment variable)
minio_endpoint: MinIO endpoint URL (defaults to MINIO_ENDPOINT environment variable)
Get object content from MinIO.
List objects in a bucket.
Wait for an object to appear and return its content.
Force refresh of credentials.
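Waiting for an output object is typically a polling loop with a timeout. The sketch below shows that pattern in isolation: `wait_for_object` and its `fetch` callable are hypothetical stand-ins, not the MinIOClient API, and `fetch` is assumed to return `None` while the object does not yet exist.

```python
import time


def wait_for_object(fetch, timeout=60.0, interval=2.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until it returns content or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        content = fetch()
        if content is not None:
            return content
        if clock() >= deadline:
            raise TimeoutError(f"object did not appear within {timeout} seconds")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays. A production version would also need to decide how to treat expired credentials raised during a fetch.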
from analysis_orchestrator import AnalysisOrchestrator
orchestrator = AnalysisOrchestrator("your_token")
result = orchestrator.run_mean_analysis(21490742, ["Nottingham"])
print(f"Mean: {result['result']}")

from query_builder import QueryBuilder
from analysis_orchestrator import AnalysisOrchestrator
qb = QueryBuilder()
custom_query = qb.build_summary_stats_query(concept_id=21490742)
orchestrator = AnalysisOrchestrator("your_token")
result = orchestrator.run_analysis(
    analysis_type="mean",
    query=custom_query,
    tres=["Nottingham"],
    task_name="Custom Analysis"
)

from statistical_analyzer import StatisticalAnalyzer
from data_processor import DataProcessor
# Statistical analysis
analyzer = StatisticalAnalyzer()
supported_types = analyzer.get_supported_analysis_types()
print(f"Supported types: {supported_types}")
# Data processing
processor = DataProcessor()
sample_data = ["2,117", "3,150"]
processed_data = [processor.import_data(data) for data in sample_data]

import os
from tes_client import BaseTESClient
# Configure via environment variables
os.environ['TES_BASE_URL'] = 'http://your-tes-endpoint:5034/v1/tasks'
os.environ['TES_DOCKER_IMAGE'] = 'harbor.your-registry.com/your-image:tag'
os.environ['DB_HOST'] = 'your-database-host'
os.environ['DB_PORT'] = '5432'
os.environ['DB_USERNAME'] = 'your-database-username'
os.environ['DB_PASSWORD'] = 'your-database-password'
os.environ['DB_NAME'] = 'your-database-name'
os.environ['MINIO_STS_ENDPOINT'] = 'http://your-minio-endpoint:9000/sts'
os.environ['MINIO_ENDPOINT'] = 'your-minio-endpoint:9000'
os.environ['MINIO_OUTPUT_BUCKET'] = 'your-output-bucket-name'
# Client will use environment variables automatically
client = BaseTESClient()

TokenExpiredError: Token has expired
requests.exceptions.RequestException: Network or API errors
ValueError: Invalid parameters or data
KeyError: Missing required data
import requests
from minio_client import TokenExpiredError

# orchestrator, query and tres are defined as in the examples above
try:
    result = orchestrator.run_analysis("mean", query, tres)
except TokenExpiredError:
    print("Token expired, please refresh")
except requests.exceptions.RequestException as e:
    print(f"Network error: {e}")
except ValueError as e:
    print(f"Invalid data: {e}")

"mean"