EC2 — Elastic Compute Cloud

What Is It?

EC2 provides virtual machines (instances) in the cloud. You choose the OS, instance type (CPU/RAM), and networking. AWS manages the physical hardware.


Instance Types — Naming Convention

m5.2xlarge
│ │  └── Size: nano, micro, small, medium, large, xlarge, 2xlarge...
│ └── Generation: 5th gen
└── Family: m = general purpose
| Family | Optimized for | Examples |
|--------|---------------|----------|
| t | Burstable, cost-efficient | t3.micro, t3.small |
| m | General purpose (balanced) | m5.large, m6i.xlarge |
| c | Compute optimized | c5.xlarge (CPU-intensive) |
| r | Memory optimized | r5.2xlarge (Redis, Spark) |
| g / p | GPU | g4dn.xlarge (ML inference) |
| i | Storage optimized (NVMe) | i3.large (high IOPS) |
| inf | Machine learning inference | inf1.xlarge |
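
To make the naming convention concrete, here is a minimal sketch that splits an instance type name into family, generation, extra attribute letters (such as the `i` in m6i or the `dn` in g4dn), and size. The function name `parse_instance_type` and the output format are illustrative assumptions, and the pattern only covers the common `<family><generation><attributes>.<size>` shape, so unusual names will be rejected.

```bash
# parse_instance_type NAME
# Prints: family=<f> generation=<n> attributes=<a> size=<s>
# Hypothetical helper for illustration; not part of any AWS tool.
parse_instance_type() {
  local parsed
  # letters, digits, optional attribute letters, a dot, then the size
  parsed=$(printf '%s\n' "$1" | sed -n \
    's/^\([a-z]\{1,\}\)\([0-9]\{1,\}\)\([a-z-]*\)\.\([a-z0-9-]\{1,\}\)$/family=\1 generation=\2 attributes=\3 size=\4/p')
  if [ -z "$parsed" ]; then
    echo "unrecognized instance type: $1" >&2
    return 1
  fi
  printf '%s\n' "$parsed"
}
```

For example, `parse_instance_type g4dn.xlarge` prints `family=g generation=4 attributes=dn size=xlarge`.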

Purchasing Options

| Option | When to use | Savings vs On-Demand |
|--------|-------------|----------------------|
| On-Demand | Short-term, unpredictable | 0% (baseline) |
| Reserved (1yr) | Steady-state workload | ~40% |
| Reserved (3yr) | Long-term steady workload | ~60% |
| Savings Plans | Flexible (commit to $/hr) | ~40-60% |
| Spot | Fault-tolerant batch jobs | ~70-90% |
| Dedicated Host | Compliance (BYOL, socket licensing) | N/A |
| Dedicated Instance | Physical isolation requirement | Higher cost |
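
To see what these percentages mean in money, the sketch below applies a discount to an hourly On-Demand rate. The `monthly_cost` helper, the $0.10/hour rate used in the example, and the 730-hour month are assumptions for illustration; the discounts are the rough figures from the table, not quoted AWS prices.

```bash
# monthly_cost HOURLY_RATE DISCOUNT_PCT [HOURS]
# Prints the monthly cost after the discount, to two decimals.
# HOURS defaults to 730 (roughly one month of 24/7 usage).
monthly_cost() {
  awk -v rate="$1" -v disc="$2" -v hours="${3:-730}" \
    'BEGIN { printf "%.2f\n", rate * hours * (1 - disc / 100) }'
}
```

With a hypothetical $0.10/hour instance running all month, `monthly_cost 0.10 0` gives 73.00 (On-Demand), `monthly_cost 0.10 40` gives 43.80 (about a 1-year Reserved discount), and `monthly_cost 0.10 90` gives 7.30 (the best case listed for Spot).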

Spot Instances — Key Facts

  • Cheapest option — spare AWS capacity
  • AWS can reclaim with 2-minute warning
  • Never use for: databases, critical stateful apps
  • Use for: batch processing, ML training, rendering, CI/CD workers
  • Spot Fleet: a mix of instance types that maintains target capacity even if one type is reclaimed
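
The 2-minute warning is also visible from inside the instance: the metadata path `spot/instance-action` returns HTTP 404 until an interruption is scheduled, then a small JSON document with an `action` and a `time`. The sketch below only shows how a worker script might read those two fields. The helper name `spot_notice_field` is an assumption, and the sed-based parsing is a dependency-free shortcut; a real script would more likely use `jq`.

```bash
# spot_notice_field FIELD JSON
# Extracts a string field from the interruption notice document.
# Hypothetical helper; prints nothing if the field is absent.
spot_notice_field() {
  printf '%s\n' "$2" |
    sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# On an instance, the document would be fetched with an IMDSv2 token:
#   curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
#     http://169.254.169.254/latest/meta-data/spot/instance-action
```

Given the notice `{"action": "terminate", "time": "2017-09-18T08:22:00Z"}`, `spot_notice_field action "$notice"` prints `terminate`, which is the cue to checkpoint and stop taking new work.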

User Data — Bootstrap Script

Runs once at first launch (as root):

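#!/bin/bash
# Assumes a yum-based AMI such as Amazon Linux.
yum update -y            # apply pending package updates
yum install -y httpd     # install the Apache web server
systemctl start httpd    # start it now
systemctl enable httpd   # and on every boot
echo "Hello from $(hostname)" > /var/www/html/index.html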

Instance Metadata Service (IMDS)

EC2 instances can query their own metadata:

# IMDSv1 (legacy - insecure)
curl http://169.254.169.254/latest/meta-data/

# IMDSv2 (secure - token-based, recommended)
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

curl -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/iam/security-credentials/

# What you can get:
# - Instance ID, type, region, AZ
# - IAM role credentials (temporary!)
# - Public/private IP
# - User data script

For exam questions about securing instance metadata, the expected answer is to require IMDSv2: the session token blocks most SSRF attacks against the metadata endpoint.
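
Requiring IMDSv2 on an existing instance is done with `aws ec2 modify-instance-metadata-options`. As a cautious sketch, the helper below only prints the command for review instead of running it. The function name `require_imdsv2_cmd` and the instance ID in the example are placeholders, not values from this document.

```bash
# require_imdsv2_cmd INSTANCE_ID
# Prints (does not run) the AWS CLI command that makes IMDSv2 tokens
# mandatory on one instance. Hypothetical helper for illustration.
require_imdsv2_cmd() {
  printf 'aws ec2 modify-instance-metadata-options --instance-id %s --http-tokens required --http-endpoint enabled\n' "$1"
}
```

Running `require_imdsv2_cmd i-0123456789abcdef0` prints the full command; once it has been applied, the IMDSv1 request shown earlier should stop working on that instance.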


Placement Groups

Control how EC2 instances are placed physically:

| Type | Physical placement | Use for |
|------|--------------------|---------|
| Cluster | Same rack (same AZ) | HPC, low-latency networking (10 Gbps+ between instances) |
| Spread | Each instance on a different rack; can span AZs | Max HA, max 7 running instances per AZ per group |
| Partition | Different partitions (rack groups) | Large distributed systems (Kafka, HDFS, Cassandra) |

AMI — Amazon Machine Image

AMI = snapshot of an instance (OS + installed software + config):

EC2 Instance (fully configured) → Create AMI → Launch new instances from AMI

Golden AMI pattern:

  1. Launch base instance
  2. Install all dependencies, configure everything
  3. Create AMI
  4. Auto Scaling Group uses this AMI → instances launch faster (no bootstrap needed)

AMI is regional — must copy to other regions for cross-region launch.


Auto Scaling Groups (ASG)

Automatically maintain the right number of EC2 instances:

ASG: min=2, desired=4, max=10
Policy: scale out when CPU > 70%, scale in when CPU < 30%

Traffic spike → CPU 85% → ASG launches 2 more instances → CPU drops
Traffic drop → CPU 20% → ASG terminates 2 instances → save cost
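
The example above can be sketched as a small decision function. This is a deliberate simplification with assumed names (`next_desired`) and a fixed step of 2 instances; real ASG policies also involve CloudWatch alarms, cooldowns, and warm-up time, none of which is modeled here.

```bash
# next_desired CURRENT CPU_PCT [MIN] [MAX] [STEP]
# Prints the new desired capacity: scale out above 70% CPU,
# scale in below 30%, and always stay within MIN..MAX.
next_desired() {
  local current=$1 cpu=$2
  local min=${3:-2} max=${4:-10} step=${5:-2}
  if [ "$cpu" -gt 70 ]; then
    current=$((current + step))
  elif [ "$cpu" -lt 30 ]; then
    current=$((current - step))
  fi
  # Clamp to the group's bounds.
  if [ "$current" -gt "$max" ]; then current=$max; fi
  if [ "$current" -lt "$min" ]; then current=$min; fi
  echo "$current"
}
```

`next_desired 4 85` prints 6 (the traffic spike), `next_desired 6 20` prints 4 (the traffic drop), and `next_desired 10 95` stays at 10 because the group is already at max.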

Scaling Policies

| Policy | How | Use for |
|--------|-----|---------|
| Target Tracking | Maintain metric at target | CPU=70%, ALB requests/instance |
| Step Scaling | Add/remove based on metric bands | Aggressive scaling at critical thresholds |
| Scheduled | Scale at specific times | Known traffic patterns (9am-5pm) |
| Predictive | ML-based, forecast and pre-scale | Cyclical patterns |
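
The "metric bands" idea behind step scaling can be illustrated with a lookup from CPU level to capacity change. The bands and adjustments below are invented for the example, and `step_adjustment` is a hypothetical name; in a real policy the steps are defined relative to a CloudWatch alarm threshold.

```bash
# step_adjustment CPU_PCT
# Prints how many instances to add for a given CPU level.
# Bands are hypothetical: 70-79 -> +1, 80-89 -> +2, 90+ -> +4.
step_adjustment() {
  local cpu=$1
  if [ "$cpu" -ge 90 ]; then
    echo 4
  elif [ "$cpu" -ge 80 ]; then
    echo 2
  elif [ "$cpu" -ge 70 ]; then
    echo 1
  else
    echo 0
  fi
}
```

The larger the breach, the larger the step: `step_adjustment 75` prints 1 while `step_adjustment 95` prints 4, which is what makes step scaling suitable for aggressive reactions at critical thresholds.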

Lifecycle Hooks

Pause instance during launch or termination for custom actions:

Launch: Pending → [Lifecycle Hook: install agent, register with Consul] → InService
Terminate: Terminating → [Lifecycle Hook: drain connections, backup data] → Terminated

Instance Refresh

Rolling update of all instances (e.g., after AMI update):

Set MinHealthyPercentage: 80%
→ Terminate 20% of instances (replace with new AMI)
→ Wait until the replacements are healthy → continue until all are replaced
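
The arithmetic behind that rollout can be sketched as follows. Both helper names are assumptions, and the rounding (round down, but replace at least one instance) is a guess made for illustration; the service's actual batching, warm-up, and checkpoint behavior is not modeled.

```bash
# refresh_batch_size TOTAL MIN_HEALTHY_PCT
# Prints how many instances may be out of service at once.
refresh_batch_size() {
  local total=$1 min_healthy=$2 batch
  batch=$(( total * (100 - min_healthy) / 100 ))
  # Assumption: always make progress with at least one instance.
  if [ "$batch" -lt 1 ]; then batch=1; fi
  echo "$batch"
}

# refresh_batches TOTAL MIN_HEALTHY_PCT
# Prints how many rounds are needed to replace every instance.
refresh_batches() {
  local total=$1 batch
  batch=$(refresh_batch_size "$1" "$2")
  echo $(( (total + batch - 1) / batch ))
}
```

For a 10-instance group at 80%, `refresh_batch_size 10 80` prints 2 and `refresh_batches 10 80` prints 5, so the refresh proceeds in five rounds of two instances.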

Good Practices

| Practice | Reason |
|----------|--------|
| Use IAM Instance Profile (not access keys) | Temporary credentials, auto-rotated |
| Enable IMDSv2 | Prevent SSRF metadata attacks |
| Use Auto Scaling Groups | Automatic capacity management |
| Use Golden AMI | Fast, consistent launches |
| Use Spot Instances for batch workloads | 70-90% cost reduction |
| Enable detailed monitoring (1-min) | Faster reaction to scaling events |
| Use lifecycle hooks for graceful shutdown | No dropped connections on scale-in |

Bad Practices

| Anti-Pattern | Impact | Fix |
|--------------|--------|-----|
| Hardcoding AWS credentials on EC2 | Security risk, manual rotation | Use Instance Profile (IAM role) |
| IMDSv1 enabled | SSRF vulnerability → credential theft | Require IMDSv2 |
| Single instance, no ASG | Single point of failure | Use ASG with min=2 across AZs |
| Using On-Demand for everything | Pays full price; misses the ~40-60% Reserved discount | Use Reserved for steady-state |

Exam Tips

  1. EC2 Instance Connect: browser-based SSH. No long-lived SSH keys to manage (a temporary key is pushed to the instance). Requires inbound SSH (port 22) from the EC2 Instance Connect service IP range for the region.
  2. IMDSv2 uses token — PUT to get token, then use token in GET. Prevents SSRF.
  3. Placement Group Cluster: same AZ, low latency. Spread: different racks, max HA.
  4. Spot interruption: 2-minute notice. An EventBridge rule can route the interruption warning to Lambda for handling.
  5. T instance bursting: CPU credits accumulate when idle, used when CPU > baseline. Unlimited mode: charges extra for sustained burst.
  6. On-Demand vs Reserved exam trick: If workload runs 24/7 for > 1 year → Reserved is cheaper.
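
The credit mechanics in tip 5 reduce to simple arithmetic: one CPU credit is one vCPU running at 100% for one minute, so an instance earns credits at its baseline rate and spends them at its actual utilization. The sketch below uses a hypothetical helper, `net_credits_per_hour`, and integer percentages. The example figures (2 vCPUs, 10% baseline) correspond to a t3.micro.

```bash
# net_credits_per_hour VCPUS BASELINE_PCT UTIL_PCT
# Prints credits gained (positive) or drained (negative) per hour.
# Earned = vcpus * baseline% * 60 min; spent = vcpus * util% * 60 min.
net_credits_per_hour() {
  local vcpus=$1 baseline=$2 util=$3
  echo $(( vcpus * (baseline - util) * 60 / 100 ))
}
```

At 5% average CPU, `net_credits_per_hour 2 10 5` prints 6, so the balance grows; at 40% it prints -36, so the balance drains. In Unlimited mode a sustained drain is what ends up as an extra charge.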

Common Exam Scenarios

Q: ML training job needs cheapest compute for batch workload? → Spot Instances, which can handle interruptions by checkpointing.

Q: Auto Scaling not scaling fast enough during sudden traffic spike? → Use Predictive Scaling or Scheduled Scaling if pattern is known. Or set a more aggressive step scaling policy.

Q: EC2 needs to call DynamoDB securely? → Attach an IAM Instance Profile with DynamoDB permissions.

Q: Need to run 7 instances always on separate physical hardware? → Spread Placement Group (max 7 per AZ per group).