| 1 | +# Infrastructure Refactoring Design Document |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This design document outlines the architecture for a unified AWS workshop infrastructure system. A single CDK codebase uses a convention-based approach to generate different CloudFormation templates depending on the workshop type. The design emphasizes modularity, automation, and parallel deployment to keep the infrastructure efficient to run and easy to maintain. |
| 6 | + |
| 7 | +## Architecture |
| 8 | + |
| 9 | +### Core Concept |
| 10 | + |
| 11 | +The architecture follows a **CDK → CloudFormation → Workshop Studio** workflow with convention-based conditional deployment. A single CDK application determines which resources to create based on the workshop type, which is read from the `WORKSHOP_TYPE` environment variable and defaults to `ide`. |
| 12 | + |
| 13 | +### Directory Structure |
| 14 | + |
| 15 | +``` |
| 16 | +infra/ |
| 17 | +├── cdk/ |
| 18 | +│ ├── src/main/java/sample/com/ |
| 19 | +│ │ ├── constructs/ # Reusable constructs |
| 20 | +│ │ │ ├── Vpc.java |
| 21 | +│ │ │ ├── Ide.java |
| 22 | +│ │ │ ├── Eks.java |
| 23 | +│ │ │ ├── Database.java |
| 24 | +│ │ │ ├── CodeBuild.java |
| 25 | +│ │ │ └── Roles.java |
| 26 | +│ │ ├── stacks/ |
| 27 | +│ │ │ └── WorkshopStack.java |
| 28 | +│ │ └── WorkshopApp.java # Main CDK application |
| 29 | +│ ├── pom.xml |
| 30 | +│ └── cdk.json |
| 31 | +├── cfn/ # Generated CloudFormation templates |
| 32 | +│ ├── ide.yaml |
| 33 | +│ ├── java-on-aws.yaml |
| 34 | +│ ├── java-on-eks.yaml |
| 35 | +│ ├── java-ai-agents.yaml |
| 36 | +│ └── java-spring-ai-agents.yaml |
| 37 | +├── scripts/ |
| 38 | +│ ├── workshops/ # Workshop-specific orchestration scripts |
| 39 | +│ │ ├── ide.sh |
| 40 | +│ │ ├── java-on-aws.sh |
| 41 | +│ │ ├── java-on-eks.sh |
| 42 | +│ │ ├── java-ai-agents.sh |
| 43 | +│ │ └── java-spring-ai-agents.sh |
| 44 | +│ ├── setup/ # Modular setup scripts |
| 45 | +│ │ ├── base.sh |
| 46 | +│ │ ├── eks.sh |
| 47 | +│ │ ├── app.sh |
| 48 | +│ │ ├── monitoring.sh |
| 49 | +│ │ └── ai-agents.sh |
| 50 | +│ ├── lib/ # Common utilities |
| 51 | +│ │ ├── common.sh |
| 52 | +│ │ └── wait-for-resources.sh |
| 53 | +│ ├── deploy/ # Deployment utilities |
| 54 | +│ ├── test/ # Testing scripts |
| 55 | +│ └── cleanup/ # Cleanup scripts |
| 56 | +└── package.json # Build automation |
| 57 | +``` |
| 58 | + |
| 59 | +## Components and Interfaces |
| 60 | + |
| 61 | +### CDK Components |
| 62 | + |
| 63 | +#### WorkshopStack |
| 64 | +The main CDK stack that conditionally creates resources based on workshop type: |
| 65 | + |
| 66 | +```java |
| 67 | +public class WorkshopStack extends Stack { |
| 68 | + public WorkshopStack(final Construct scope, final String id, final StackProps props) { |
| 69 | + super(scope, id, props); |
| 70 | + |
| 71 | + String workshopType = System.getenv("WORKSHOP_TYPE"); |
| 72 | + if (workshopType == null) { |
| 73 | + workshopType = "ide"; // default |
| 74 | + } |
| 75 | + |
| 76 | + // Core infrastructure (always created) |
| 77 | + var roles = new Roles(this, "Roles"); |
| 78 | + var vpc = new Vpc(this, "Vpc"); |
| 79 | + var ide = new Ide(this, "Ide", vpc.getVpc(), roles); |
| 80 | + |
| 81 | + // Conditional resources based on workshop type |
| 82 | + if (!"ide".equals(workshopType) && !"java-ai-agents".equals(workshopType)) { |
| 83 | + new Eks(this, "Eks", vpc.getVpc(), roles); |
| 84 | + } |
| 85 | + |
| 86 | + if (!"ide".equals(workshopType)) { |
| 87 | + new Database(this, "Database", vpc.getVpc()); |
| 88 | + } |
| 89 | + |
| 90 | + // CodeBuild for workshop setup |
| 91 | + new CodeBuild(this, "CodeBuild", |
| 92 | + Map.of("STACK_NAME", Aws.STACK_NAME, "WORKSHOP_TYPE", workshopType)); |
| 93 | + } |
| 94 | +} |
| 95 | +``` |
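The conditionals in the stack can also be restated as a pure function, which makes the type-to-resource mapping easy to check without synthesizing a template. The following is a minimal sketch under stated assumptions: the class `WorkshopResources` and its method `forType` are hypothetical names that do not exist in the codebase, and resources are represented only by their construct IDs.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper (not part of the codebase): mirrors the WorkshopStack conditionals. */
final class WorkshopResources {
    private WorkshopResources() {}

    /** Returns the construct IDs the stack would create for the given workshop type. */
    static Set<String> forType(String workshopType) {
        // Same default as the stack: a missing WORKSHOP_TYPE means "ide".
        String type = (workshopType == null) ? "ide" : workshopType;

        Set<String> resources = new LinkedHashSet<>();
        // Core infrastructure (always created)
        resources.add("Roles");
        resources.add("Vpc");
        resources.add("Ide");

        // EKS is skipped for the "ide" and "java-ai-agents" workshops
        if (!"ide".equals(type) && !"java-ai-agents".equals(type)) {
            resources.add("Eks");
        }
        // Every workshop except "ide" gets a database
        if (!"ide".equals(type)) {
            resources.add("Database");
        }
        // CodeBuild for workshop setup (always created)
        resources.add("CodeBuild");
        return resources;
    }
}
```

A table like this could serve as the expected value when testing Property 1, provided it is kept in step with the stack itself.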
| 96 | + |
| 97 | +#### Reusable Constructs |
| 98 | + |
| 99 | +- **Vpc**: Creates the VPC with appropriate subnets and networking configuration |
| 100 | +- **Ide**: Creates the VS Code IDE environment with necessary permissions |
| 101 | +- **Eks**: Creates an EKS cluster with EKS Auto Mode enabled |
| 102 | +- **Database**: Configures RDS instances and database schemas |
| 103 | +- **CodeBuild**: Creates the CodeBuild project for workshop setup automation |
| 104 | +- **Roles**: Creates IAM roles and policies for workshop resources |
| 105 | + |
| 106 | +### Script Organization |
| 107 | + |
| 108 | +#### Convention-Based Script Discovery |
| 109 | +Scripts are organized using a naming convention where the script name matches the stack name: |
| 110 | +- `ide.sh` → executed for ide workshop type |
| 111 | +- `java-on-aws.sh` → executed for java-on-aws workshop type |
| 112 | +- `java-on-eks.sh` → executed for java-on-eks workshop type |
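The naming convention above amounts to a simple path lookup. The sketch below illustrates it in Java and is an assumption, not the project's actual discovery code: the class `WorkshopScripts`, the method `resolve`, and the name pattern (lowercase letters, digits, and hyphens) are all hypothetical. Validating the name before building the path keeps values such as `../other` from escaping the `workshops/` directory.

```java
import java.nio.file.Path;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the codebase): maps a workshop type to its script. */
final class WorkshopScripts {
    // Assumed naming rule: lowercase letters, digits, and single hyphens between words.
    private static final Pattern NAME = Pattern.compile("[a-z0-9]+(-[a-z0-9]+)*");

    private WorkshopScripts() {}

    /** Resolves <scriptsRoot>/workshops/<workshopType>.sh, rejecting malformed names. */
    static Path resolve(Path scriptsRoot, String workshopType) {
        if (workshopType == null || !NAME.matcher(workshopType).matches()) {
            throw new IllegalArgumentException("Invalid workshop type: " + workshopType);
        }
        return scriptsRoot.resolve("workshops").resolve(workshopType + ".sh");
    }
}
```

A caller would still need to confirm that the resolved file exists and is executable before running it.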
| 113 | + |
| 114 | +#### Modular Setup Scripts |
| 115 | +Common functionality is extracted into reusable modules: |
| 116 | +- `base.sh`: Common tools and AWS CLI configuration |
| 117 | +- `eks.sh`: EKS cluster configuration and kubectl setup |
| 118 | +- `app.sh`: Application deployment and configuration |
| 119 | +- `monitoring.sh`: Observability stack setup |
| 120 | +- `ai-agents.sh`: AI-specific setup for agent workshops |
| 121 | + |
| 122 | +### Build Automation |
| 123 | + |
| 124 | +#### Template Generation |
| 125 | +The build process generates one unified CloudFormation template and syncs it to workshop directories: |
| 126 | + |
| 127 | +```json |
| 128 | +{ |
| 129 | + "scripts": { |
| 130 | + "generate": "./scripts/cfn/generate.sh", |
| 131 | + "sync": "./scripts/cfn/sync.sh" |
| 132 | + } |
| 133 | +} |
| 134 | +``` |
| 135 | + |
| 136 | +**scripts/cfn/generate.sh**: |
| 137 | +```bash |
| 138 | +#!/bin/bash |
| 139 | +set -e |
| 140 | + |
| 141 | +echo "🔧 Generating unified template..." |
| 142 | + |
| 143 | +cd cdk |
| 144 | +mvn clean package |
| 145 | +cdk synth stack --yaml --path-metadata false --version-reporting false > ../cfn/stack.yaml |
| 146 | +cd .. |
| 147 | + |
| 148 | +echo "✅ Generated cfn/stack.yaml" |
| 149 | +``` |
| 150 | + |
| 151 | +**scripts/cfn/sync.sh**: |
| 152 | +```bash |
| 153 | +#!/bin/bash |
| 154 | +set -e |
| 155 | + |
| 156 | +WORKSHOPS=("ide" "java-on-aws" "java-on-eks" "java-ai-agents" "java-spring-ai-agents") |
| 157 | + |
| 158 | +for workshop in "${WORKSHOPS[@]}"; do |
| 159 | + target_dir="../$workshop/static" |
| 160 | + |
| 161 | + if [[ -d "$target_dir" ]]; then |
| 162 | + # Copy CloudFormation template |
| 163 | + cp "cfn/stack.yaml" "$target_dir/$workshop-stack.yaml" |
| 164 | + echo "✅ Synced stack.yaml to $workshop/static/$workshop-stack.yaml" |
| 165 | + |
| 166 | + # Copy IAM policy if it exists |
| 167 | + if [[ -f "policies/policy.json" ]]; then |
| 168 | + cp "policies/policy.json" "$target_dir/" |
| 169 | + echo "✅ Synced policy to $workshop/static/" |
| 170 | + fi |
| 171 | + else |
| 172 | + echo "⚠️ Directory $target_dir not found, skipping sync for $workshop" |
| 173 | + fi |
| 174 | +done |
| 175 | + |
| 176 | +echo "🎉 Sync complete; check any ⚠️ warnings above for skipped workshops" |
| 177 | +``` |
| 178 | + |
| 179 | +## Data Models |
| 180 | + |
| 181 | +### Workshop Configuration |
| 182 | +```java |
| 183 | +public class WorkshopConfig { |
| 184 | + private String workshopType; |
| 185 | + private boolean includeEks; |
| 186 | + private boolean includeDatabase; |
| 187 | + private boolean includeBedrock; |
| 188 | + private Map<String, String> environmentVariables; |
| 189 | + |
| 190 | + // Constructor, getters, setters |
| 191 | +} |
| 192 | +``` |
| 193 | + |
| 194 | +### Script Execution Context |
| 195 | +```java |
| 196 | +public class ScriptContext { |
| 197 | + private String stackName; |
| 198 | + private String workshopType; |
| 199 | + private String region; |
| 200 | + private Map<String, String> resourceIds; |
| 201 | + private List<String> setupSteps; |
| 202 | + |
| 203 | + // Constructor, getters, setters |
| 204 | +} |
| 205 | +``` |
| 206 | + |
| 207 | +### Build Configuration |
| 208 | +```java |
| 209 | +public class BuildConfig { |
| 210 | + private List<String> workshopTypes; |
| 211 | + private String outputDirectory; |
| 212 | + private Map<String, String> templateMappings; |
| 213 | + |
| 214 | + // Constructor, getters, setters |
| 215 | +} |
| 216 | +``` |
| 217 | + |
| 218 | +## Correctness Properties |
| 219 | + |
| 220 | +*A property is a characteristic or behavior that should hold true across all valid executions of a system; in essence, it is a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.* |
| 221 | + |
| 222 | +### Property 1: Workshop Type Resource Mapping |
| 223 | +*For any* workshop type, the CDK stack should create exactly the resources specified for that workshop type and no others |
| 224 | +**Validates: Requirements 1.2** |
| 225 | + |
| 226 | +### Property 2: Template Generation Consistency |
| 227 | +*For any* workshop type, generating a CloudFormation template should produce a template containing only the resources appropriate for that workshop type |
| 228 | +**Validates: Requirements 1.3** |
| 229 | + |
| 230 | +### Property 3: Script Output Format Consistency |
| 231 | +*For any* setup script execution, all output messages should follow the emoji-based logging format |
| 232 | +**Validates: Requirements 2.3, 3.3** |
| 233 | + |
| 234 | +### Property 4: Error Handling Consistency |
| 235 | +*For any* error condition during setup, the system should halt execution and display detailed error information with troubleshooting guidance |
| 236 | +**Validates: Requirements 2.4, 3.2** |
| 237 | + |
| 238 | +### Property 5: Convention-Based Script Discovery |
| 239 | +*For any* stack name, the system should find and execute the corresponding workshop script with matching name |
| 240 | +**Validates: Requirements 3.1** |
| 241 | + |
| 242 | +### Property 6: Timeout Handling |
| 243 | +*For any* setup operation that exceeds timeout limits, the system should abort with clear timeout messages and suggested actions |
| 244 | +**Validates: Requirements 3.4** |
| 245 | + |
| 246 | +### Property 7: Service Verification |
| 247 | +*For any* completed setup script, the system should verify that all critical services are operational |
| 248 | +**Validates: Requirements 3.5** |
| 249 | + |
| 250 | +### Property 8: Build Log Capture |
| 251 | +*For any* CodeBuild failure, the system should capture full logs and provide build ID for support reference |
| 252 | +**Validates: Requirements 3.6** |
| 253 | + |
| 254 | +### Property 9: Resource Waiting Behavior |
| 255 | +*For any* critical resource that is not ready, the system should wait with progress indicators up to defined timeout limits |
| 256 | +**Validates: Requirements 3.7** |
| 257 | + |
| 258 | +### Property 10: Template Generation Completeness |
| 259 | +*For any* build process execution, all expected workshop-specific CloudFormation templates should be generated |
| 260 | +**Validates: Requirements 4.1** |
| 261 | + |
| 262 | +### Property 11: Template Distribution Accuracy |
| 263 | +*For any* generated template, it should be copied to the correct workshop directory with matching name |
| 264 | +**Validates: Requirements 4.2** |
| 265 | + |
| 266 | +### Property 12: Build Error Reporting |
| 267 | +*For any* template generation failure, the build process should halt and report specific errors |
| 268 | +**Validates: Requirements 4.4** |
| 269 | + |
| 270 | +### Property 13: Distribution Verification |
| 271 | +*For any* template distribution, the system should verify successful copying to all target locations |
| 272 | +**Validates: Requirements 4.5** |
| 273 | + |
| 274 | +### Property 14: Migration Directory Safety |
| 275 | +*For any* migration process, the new infra/ directory structure should be created without modifying any files in infrastructure/ |
| 276 | +**Validates: Requirements 5.1** |
| 277 | + |
| 278 | +### Property 15: Base Stack Composition |
| 279 | +*For any* base CDK stack implementation, it should contain exactly VPC, IDE, and CodeBuild resources |
| 280 | +**Validates: Requirements 5.2** |
| 281 | + |
| 282 | +### Property 16: Template Equivalence |
| 283 | +*For any* migrated workshop type, the new template should produce equivalent infrastructure to the existing template |
| 284 | +**Validates: Requirements 5.5** |
| 285 | + |
| 286 | +### Property 17: Lambda Function Consolidation |
| 287 | +*For any* consolidated Lambda handler, it should provide equivalent functionality to all original Python/JavaScript functions |
| 288 | +**Validates: Requirements 5.8** |
| 289 | + |
| 290 | +## Error Handling |
| 291 | + |
| 292 | +### Script Error Handling Strategy |
| 293 | +All setup scripts implement consistent error handling using bash error traps: |
| 294 | + |
| 295 | +```bash |
| 296 | +#!/bin/bash |
| 297 | +set -eE # Exit on any error; -E (errtrace) lets functions and subshells inherit the ERR trap |
| 298 | + |
| 299 | +handle_error() { |
| 300 | + local exit_code=$1 |
| 301 | + local line_number=$2 |
| 302 | + echo "❌ ERROR: Command failed with exit code $exit_code at line $line_number" |
| 303 | + echo "🔍 Check the logs above for details" |
| 304 | + echo "📞 Contact workshop support if this persists" |
| 305 | + exit $exit_code |
| 306 | +} |
| 307 | + |
| 308 | +trap 'handle_error $? $LINENO' ERR |
| 309 | +``` |
| 310 | + |
| 311 | +### CDK Error Handling |
| 312 | +CDK constructs implement validation and error reporting: |
| 313 | + |
| 314 | +```java |
| 315 | +public class WorkshopVpc extends Construct { |
| 316 | + private final Vpc vpc; // underlying CDK VPC, assigned in the constructor |
| 317 | + public WorkshopVpc(final Construct scope, final String id) { |
| 318 | + super(scope, id); |
| 319 | + try { |
| 320 | + this.vpc = Vpc.Builder.create(this, "Vpc") |
| 321 | + .maxAzs(2) |
| 322 | + .natGateways(1) |
| 323 | + .build(); |
| 324 | + } catch (Exception e) { |
| 325 | + System.err.println("Failed to create VPC: " + e.getMessage()); |
| 326 | + throw new RuntimeException("VPC creation failed", e); |
| 327 | + } |
| 328 | + } |
| 329 | +} |
| 330 | +``` |
| 331 | + |
| 332 | +### Timeout Management |
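| 333 | +Operations implement explicit timeout handling. Because `timeout` runs the polling loop in a child `bash` process, the `check_cluster` function (assumed here to be defined by the calling script) is passed into that process with `declare -f`: |
| 334 | + |
| 335 | +```bash |
| 336 | +echo "⏳ Waiting for EKS cluster (timeout: 20 minutes)..." |
| 337 | +timeout 1200 bash -c "$(declare -f check_cluster); while ! check_cluster; do sleep 10; done" || { |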
| 338 | + echo "⏰ TIMEOUT: EKS cluster did not become ready within 20 minutes" |
| 339 | + echo "🔍 Check CloudFormation events for cluster creation issues" |
| 340 | + exit 1 |
| 341 | +} |
| 342 | +``` |
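The same wait-with-timeout pattern can be expressed in Java where the readiness check runs in JVM code, for example in a test harness. This is a minimal sketch, not the project's implementation: `ResourceWaiter` and `waitUntil` are hypothetical names, and the readiness check is supplied by the caller.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not part of the codebase): polls a check until it passes or time runs out. */
final class ResourceWaiter {
    private ResourceWaiter() {}

    /**
     * Calls {@code ready} repeatedly, sleeping {@code interval} between attempts.
     * Returns true as soon as the check passes, or false once {@code timeout} has elapsed
     * or the waiting thread is interrupted.
     */
    static boolean waitUntil(BooleanSupplier ready, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (ready.getAsBoolean()) {
                return true;
            }
            // Subtraction keeps the comparison correct even if nanoTime wraps around.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

Returning a boolean rather than throwing leaves it to the caller to print the timeout message and suggested actions described in Property 6.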
| 343 | + |
| 344 | +## Testing Strategy |
| 345 | + |
| 346 | +### Unit Testing Approach |
| 347 | +Unit tests focus on specific components and their interactions: |
| 348 | +- CDK construct validation |
| 349 | +- Script parsing and execution logic |
| 350 | +- Configuration validation |
| 351 | +- Error handling scenarios |
| 352 | + |
| 353 | +### Property-Based Testing Approach |
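| 354 | +Property-based tests verify universal properties across generated inputs using a property-based testing library for Java, such as **junit-quickcheck** or **jqwik** (the `@Property` and `@ForAll` annotations used later in this document follow jqwik's API): |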
| 355 | +- Template generation consistency across workshop types |
| 356 | +- Script discovery and execution patterns |
| 357 | +- Error handling behavior across different failure modes |
| 358 | +- Resource creation patterns for different workshop configurations |
| 359 | + |
| 360 | +Each property-based test runs a minimum of 100 iterations to sample the input space broadly. |
| 361 | + |
| 362 | +### Test Organization |
| 363 | +- Unit tests: Verify specific examples and integration points |
| 364 | +- Property tests: Verify universal correctness properties |
| 365 | +- Integration tests: Validate end-to-end workshop deployment scenarios |
| 366 | + |
| 367 | +### Property Test Implementation |
| 368 | +Each property-based test is tagged with a comment referencing the design document property: |
| 369 | + |
| 370 | +```java |
| 371 | +/** |
| 372 | + * Feature: infra, Property 1: Workshop Type Resource Mapping |
| 373 | + */ |
| 374 | +@Property |
| 375 | +void workshopTypeResourceMapping(@ForAll String workshopType) { |
| 376 | + // Test implementation |
| 377 | +} |
| 378 | +``` |