RDS is AWS's managed relational database service — AWS handles OS patching, backups, failover, and replication. You focus on your schema and queries.
Real-World: Instead of running MySQL on EC2, where you handle OS updates, disk provisioning, replication setup, and backup scripts yourself, you use RDS and AWS does all of that.

| Engine | Best for |
|---|---|
| MySQL | Open-source apps, WordPress |
| PostgreSQL | Complex queries, JSON, extensions |
| MariaDB | MySQL-compatible, open-source |
| Oracle | Enterprise apps, legacy systems |
| SQL Server | .NET apps, Windows integration |
| Aurora MySQL | High-performance, MySQL-compatible |
| Aurora PostgreSQL | High-performance, PostgreSQL-compatible |

Multi-AZ vs. Read Replicas:

| Feature | Multi-AZ | Read Replica |
|---|---|---|
| Purpose | High Availability | Performance/Scaling |
| Replication | Synchronous | Asynchronous |
| Failover | Automatic (< 2 min) | Manual promotion |
| Can read from standby? | No (standby is passive) | Yes (that's the point) |
| Cross-region? | No | Yes |
| DNS change on failover? | Yes (endpoint name stays the same; its record now resolves to the standby) | No (each replica has its own endpoint) |

Multi-AZ failover flow:

    Primary DB (us-east-1a) → Synchronous replication → Standby (us-east-1b)
            ↓
    Primary fails → RDS DNS CNAME flips to standby → App reconnects

Same DNS name, ~60-120 second failover. Not instant — design apps for reconnection.
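
Designing for reconnection usually means retrying with a fresh connection rather than reusing the broken one. The following is a minimal sketch under stated assumptions: `TransientDBError` is a hypothetical stand-in for whatever "connection lost" exception your driver raises, and `connect` is any callable that opens a new connection against the RDS endpoint name.

```python
import time


class TransientDBError(Exception):
    """Hypothetical stand-in for a driver's 'connection lost' error."""


def run_with_reconnect(connect, operation, attempts=8, base_delay=1.0, sleep=time.sleep):
    """Run operation(conn), reconnecting with exponential backoff on failure.

    connect   -- callable returning a fresh connection (the DNS name is resolved again)
    operation -- callable that takes the connection and returns a result
    """
    for attempt in range(attempts):
        try:
            conn = connect()  # new connection, so the endpoint is looked up again
            return operation(conn)
        except TransientDBError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1 s, 2 s, 4 s, ...
```

With the defaults, the waits add up to roughly two minutes, which is in the same range as the failover window quoted above. Tune `attempts` and `base_delay` to your own timeout budget.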

Read replica topology:

    Primary DB                    ← Writes only
    Read Replica 1                ← Reads (reporting, analytics)
    Read Replica 2                ← Reads (app read traffic)
    Read Replica 3 in eu-west-1   ← Cross-region reads

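Up to 15 read replicas per instance for RDS MySQL, MariaDB, and PostgreSQL. The older limit of 5 still applies to Oracle and SQL Server, and still appears in many exam materials. Aurora also allows up to 15.

Promoting a Read Replica: turns it into a standalone, writable DB instance and permanently stops replication from the source. This is the usual cross-region disaster recovery move.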
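
Because each RDS read replica has its own endpoint, the application (or its driver) decides where each query goes. A minimal sketch of that routing, assuming hypothetical hostnames and a deliberately naive rule that treats only plain `SELECT` statements as reads:

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replica endpoints."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Naive classification: only statements starting with SELECT count as reads.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._readers) if is_read else self.primary


router = ReadWriteRouter(
    primary="mydb.example.internal",  # hypothetical hostnames
    replicas=["mydb-r1.example.internal", "mydb-r2.example.internal"],
)
```

Replication is asynchronous, so a read sent to a replica may be slightly stale. Reads that must see a write that just happened are safer on the primary.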
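
Automated backups:
- Daily snapshot plus transaction logs
- Retention: 0-35 days (0 disables automated backups)
- Point-in-time recovery to any second within the retention window, up to the latest restorable time (typically within the last 5 minutes)
- Stored in S3 managed by AWS (not visible in your own buckets; backup storage up to the size of your provisioned database storage is included at no extra charge)

Manual snapshots:
- You trigger them explicitly
- Kept until you delete them (even after the DB instance is deleted)
- Restore creates a new DB instance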
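
A point-in-time restore also creates a new instance, so the request names both a source and a target. The sketch below only builds and sanity-checks the request. The instance identifiers are hypothetical, and the window check is an approximation (the real bounds come from the instance's earliest and latest restorable times).

```python
from datetime import datetime, timedelta, timezone


def pitr_request(source_id, target_id, restore_time, retention_days, now=None):
    """Build keyword arguments for a point-in-time restore after a window check."""
    now = now or datetime.now(timezone.utc)
    earliest = now - timedelta(days=retention_days)
    if not (earliest <= restore_time <= now):
        raise ValueError("restore_time is outside the backup retention window")
    return {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,  # the restore creates a new instance
        "RestoreTime": restore_time,
    }


# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("rds").restore_db_instance_to_point_in_time(**pitr_request(...))
```

Keeping the request construction separate from the API call makes the window check easy to test without touching AWS.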
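
Snapshot sharing: a manual snapshot can be shared with other AWS accounts or made public. Encrypted snapshots cannot be made public, and sharing one also requires sharing the customer managed KMS key that encrypted it.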
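
Encryption at rest:
- Must be enabled at creation time (an existing unencrypted instance can only be encrypted by restoring from an encrypted snapshot copy)
- Uses KMS (customer managed key or AWS managed key)
- Encrypted DB → encrypted snapshots → encrypted read replicas
- Encrypting an unencrypted DB: snapshot → copy the snapshot with encryption enabled → restore from the encrypted copy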
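
The snapshot, copy, restore sequence maps onto three RDS API calls. The sketch below lays them out as an ordered plan of (boto3 method name, keyword arguments) rather than executing anything. The snapshot and instance naming scheme is an assumption made for illustration.

```python
def encryption_migration_plan(db_id, kms_key_id, suffix="encrypted"):
    """Return the ordered RDS API calls (method name, kwargs) for the migration."""
    plain_snap = f"{db_id}-plain-snap"
    enc_snap = f"{db_id}-{suffix}-snap"
    return [
        # 1. Snapshot the unencrypted instance.
        ("create_db_snapshot",
         {"DBInstanceIdentifier": db_id, "DBSnapshotIdentifier": plain_snap}),
        # 2. Copy the snapshot, supplying a KMS key so the copy is encrypted.
        ("copy_db_snapshot",
         {"SourceDBSnapshotIdentifier": plain_snap,
          "TargetDBSnapshotIdentifier": enc_snap,
          "KmsKeyId": kms_key_id}),
        # 3. Restore the encrypted copy as a new instance.
        ("restore_db_instance_from_db_snapshot",
         {"DBInstanceIdentifier": f"{db_id}-{suffix}",
          "DBSnapshotIdentifier": enc_snap}),
    ]


# To execute against AWS (not done here), each step would be run in order, e.g.
#   getattr(boto3.client("rds"), name)(**kwargs)
# waiting for each snapshot to become available before starting the next step.
```

The restored instance gets a new endpoint, so the application has to be repointed (or the instances renamed) once the encrypted copy is verified.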

Problem: Each Lambda execution environment opens its own DB connection. 1,000 concurrent Lambdas = 1,000 DB connections, which can exceed the database's `max_connections` limit.

Solution: RDS Proxy pools and shares connections.

    1,000 Lambda instances → RDS Proxy (50 pooled connections) → RDS

Benefits:
- Reduces database load from constantly opening and closing connections
- Reduces failover time (keeps client connections open and routes them to the new primary during a Multi-AZ failover)
- Can enforce IAM authentication
- Integrates with Secrets Manager for database credentials

RDS Proxy is highly recommended for Lambda + RDS.
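
To make the pooling idea concrete, here is a small in-process simulation: many callers share a fixed set of connections and wait their turn when all are busy. It illustrates the general technique only, not how RDS Proxy is implemented, and the `conn-N` strings are stand-ins for real database connections.

```python
import queue
import threading


class ConnectionPool:
    """Hand out a fixed set of connections; callers block until one is free."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")  # stand-ins for real DB connections
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak_in_use = 0

    def run(self, work):
        conn = self._free.get()  # blocks while every connection is busy
        with self._lock:
            self.in_use += 1
            self.peak_in_use = max(self.peak_in_use, self.in_use)
        try:
            return work(conn)
        finally:
            with self._lock:
                self.in_use -= 1
            self._free.put(conn)  # return the connection for reuse


# 200 concurrent callers share 5 connections.
pool = ConnectionPool(size=5)
results = []
threads = [
    threading.Thread(target=lambda n=n: results.append(pool.run(lambda c: (n, c))))
    for n in range(200)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All 200 callers finish, but the database side never sees more than 5 connections in use at once, which is the effect the diagram above describes.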
Aurora is NOT just another RDS option — it's a completely re-architected cloud-native database.
Aurora Cluster:

    ├── Primary Writer (1 instance, reads + writes)
    ├── Replica 1 (reads only)
    ├── Replica 2 (reads only)
    └── Shared Storage (6 copies across 3 AZs, self-healing)

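- Storage: grows automatically in 10 GB increments up to 128 TiB
- Replicas: up to 15 Aurora Replicas per cluster (the classic RDS limit is 5; RDS MySQL, MariaDB, and PostgreSQL now also allow 15)
- Failover: typically under 30 seconds (vs. 60-120 seconds for RDS Multi-AZ)
- Performance: AWS claims up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL

Aurora endpoints:

| Endpoint | Use for |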
|---|---|
| Cluster endpoint (Writer) | All writes + reads if you want |
| Reader endpoint (load-balanced) | Read traffic — auto load balances across replicas |
| Instance endpoint | Direct to specific instance (for diagnostics) |
| Custom endpoint | You define which instances (e.g., larger instances for analytics) |

Aurora Serverless has no instances to size up front; capacity scales automatically with actual usage:

    Low traffic → scales down, and can pause at 0 capacity when idle (no compute charge while paused; storage is still billed)
    Traffic spike → scales up in seconds

Use for: Dev/test, infrequent or unpredictable workloads, new apps.

Aurora Serverless v2: scales in place in fine-grained increments, without the scaling delay that v1 had.
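
Aurora Serverless compute is billed by capacity over time, measured in ACUs (Aurora capacity units), so the saving for a spiky workload is simple arithmetic. The sketch below uses a hypothetical usage trace, and the price is a required argument because it must come from current AWS pricing for your Region.

```python
def acu_hours(usage):
    """Sum capacity-hours from (acu_level, hours_at_that_level) pairs."""
    return sum(acu * hours for acu, hours in usage)


def compute_cost(usage, price_per_acu_hour):
    # price_per_acu_hour is not hard-coded: look it up for your Region.
    return acu_hours(usage) * price_per_acu_hour


# Hypothetical day: paused for 14 h, idling at 0.5 ACU for 6 h, busy at 8 ACUs for 4 h.
spiky_day = [(0, 14), (0.5, 6), (8, 4)]
# The same peak capacity held for the whole day, for comparison.
flat_day = [(8, 24)]

print(acu_hours(spiky_day))  # 0*14 + 0.5*6 + 8*4 = 35.0
print(acu_hours(flat_day))   # 8*24 = 192
```

In this made-up trace the spiky workload consumes 35 ACU-hours against 192 for capacity held at peak all day. A steady, always-busy workload would show no such gap.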

Aurora Global Database (multi-region, low-latency reads):

    Primary Region (us-east-1): Writer + Readers
    Secondary Region (eu-west-1): Readers only
    Secondary Region (ap-southeast-1): Readers only

- Replication lag: typically < 1 second
- RPO: typically about 1 second
- RTO: typically < 1 minute (promote a secondary region to primary)

Use for: Global apps, disaster recovery, compliance (data in a specific region).

ElastiCache engines (Redis vs. Memcached):

| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Strings, Lists, Sets, Sorted Sets, Hashes | Strings only |
| Persistence | Optional (RDB/AOF) | No |
| Replication | Yes (Multi-AZ) | No |
| Clustering | Yes (cluster mode) | Yes (horizontal) |
| Lua scripting | Yes | No |
| Pub/Sub | Yes | No |
| Use for | Caching + sessions + leaderboards + pub-sub | Simple caching, high throughput |
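
Cache-Aside (Lazy Loading), the most common pattern:

    import json

    # `redis` is an already-connected Redis client object and `db` a database
    # handle; both are assumed to exist.
    def get_user(user_id):
        # Try cache first
        cached = redis.get(f"user:{user_id}")
        if cached:
            return json.loads(cached)
        # Cache miss: get from DB
        user = db.query("SELECT * FROM users WHERE id = ?", user_id)
        # Store in cache with a 1-hour TTL (skip users that do not exist)
        if user is not None:
            redis.setex(f"user:{user_id}", 3600, json.dumps(user))
        return user

Write-Through, update the cache whenever the DB is updated:
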
    def update_user(user_id, data):
        db.execute("UPDATE users SET ... WHERE id = ?", user_id)
        # Update cache too; `data` should be the full user record, not a partial update
        redis.setex(f"user:{user_id}", 3600, json.dumps(data))

Cache Invalidation, delete the key on update:

    def update_user(user_id, data):
        db.execute("UPDATE users SET ... WHERE id = ?", user_id)
        redis.delete(f"user:{user_id}")  # Next read will repopulate

Session storage:

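    # The session expires after 1 hour (3600 s)
    redis.setex(f"session:{session_id}", 3600, json.dumps({
        'userId': 'user_123',
        'email': 'john@example.com',
        'role': 'admin'
    }))
    # Retrieve session (None if it expired or never existed)
    raw = redis.get(f"session:{session_id}")
    session = json.loads(raw) if raw else None

Leaderboard with a sorted set:
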
    # Add or update player scores
    redis.zadd('leaderboard', {'player_123': 9500})
    redis.zadd('leaderboard', {'player_456': 8200})
    redis.zadd('leaderboard', {'player_789': 11000})
    # Get top 10 players, highest score first
    top10 = redis.zrevrange('leaderboard', 0, 9, withscores=True)
    # [('player_789', 11000.0), ('player_123', 9500.0), ('player_456', 8200.0)]
    # (members come back as bytes unless the client uses decode_responses=True)
    # Get player rank
    rank = redis.zrevrank('leaderboard', 'player_123')  # 1 (0-indexed)

Best practices:

| Practice | Reason |
|---|---|
| Enable Multi-AZ for production RDS | Automatic failover, no manual intervention |
| Use RDS Proxy with Lambda | Prevents connection pool exhaustion |
| Use Aurora for new high-traffic apps | Better performance, faster failover |
| Set appropriate connection pool size | Avoid overwhelming DB with connections |
| Use Read Replicas for read-heavy workloads | Scale reads independently |
| Enable encryption at creation | Can't add later without data migration |
| Use parameter groups for DB config | Central, reusable engine settings; define them in infrastructure-as-code to get version control |
| Cache frequently-read data in ElastiCache | Reduce DB load, improve latency |
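
"Appropriate" pool size is mostly arithmetic: the pools of all application instances together must fit under the database's connection limit, with some headroom left over. A minimal sketch, where the `reserved` headroom and the example numbers are assumptions rather than AWS recommendations:

```python
def pool_size_per_instance(max_connections, app_instances, reserved=10):
    """Largest per-instance pool that keeps the total under the DB limit."""
    usable = max_connections - reserved  # headroom for admin and monitoring sessions
    if app_instances <= 0:
        raise ValueError("app_instances must be positive")
    size = usable // app_instances
    if size < 1:
        # More instances than connections: pooling per instance cannot fix this.
        raise ValueError("not enough connections to go around; consider a proxy")
    return size


print(pool_size_per_instance(200, 8))  # (200 - 10) // 8 = 23
```

With a limit of 200 and 8 application instances, each pool gets 23 connections (184 in total). When the division comes out below 1, a shared pool such as RDS Proxy is the more suitable tool.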

Common anti-patterns:

| Anti-Pattern | Impact | Fix |
|---|---|---|
| Lambda connecting directly to RDS without proxy | Connection exhaustion at scale | Use RDS Proxy |
| Read traffic going to primary | Unnecessary load on writer | Use Read Replica endpoint |
| Storing sessions in RDS | High read load, latency | Use ElastiCache Redis for sessions |
| Not using Multi-AZ for production | Single point of failure | Enable Multi-AZ |
| Long-lived Lambda DB connections without proper handling | Stale connections, errors | Use connection pooling via RDS Proxy |
- Multi-AZ = HA (High Availability). Read Replicas = scalability.
- Multi-AZ standby is NOT readable — it's a hot standby, not a read replica.
- Aurora replicas serve as Multi-AZ standby AND read replicas simultaneously.
- RDS automated backup retention: 0 (disabled) to 35 days.
- ElastiCache runs inside a VPC and is not reachable from the internet by default; clients connect from the same VPC or over VPC peering, VPN, or Direct Connect.
- Redis vs Memcached for exam: If question mentions sessions, sorted sets, pub/sub, Multi-AZ → Redis. Simple key-value, multi-threaded, no persistence needed → Memcached.
- Aurora Global Database: the primary region handles writes; secondary regions serve reads. Typical RPO is about 1 second and RTO under 1 minute.

Q: Lambda causes too many RDS connections? → Use RDS Proxy.

Q: Reduce RDS load from reporting queries? → Create a Read Replica and point reporting to it.

Q: App needs sub-millisecond read latency for product catalog? → Use ElastiCache (Redis or Memcached) as a cache layer in front of RDS.

Q: Store user sessions across multiple EC2 instances? → ElastiCache Redis for centralized session storage.

Q: Database needs to survive AZ failure automatically? → Enable RDS Multi-AZ.

Q: Global app with low-latency reads in multiple regions? → Aurora Global Database with read replicas in each region.