Aurora Serverless RDS Multi-AZ: Database Architecture

AWS Database Services: RDS, Aurora, ElastiCache, and DynamoDB Deep Dive

Introduction

Choosing the right database architecture is critical for application performance, scalability, and cost efficiency. This guide covers four AWS database services, each optimized for different use cases: Amazon RDS for managed relational databases, Amazon Aurora for cloud-native relational performance, ElastiCache for sub-millisecond caching, and DynamoDB for serverless NoSQL at scale.

This guide explores the architecture, configuration, and production patterns for each service. We'll cover RDS Multi-AZ deployments with automated failover, Aurora's storage architecture and serverless scaling, ElastiCache cluster modes for Redis and Memcached, and DynamoDB's partition key design with Global Secondary Indexes.


AWS Developer Certification Series

This is Part II of our 7-part AWS developer guide:

  1. Part I: IAM EC2 & Auto Scaling
  2. Part II: RDS Aurora & DynamoDB (Current Article)
  3. Part III: SQS SNS & Kinesis
  4. Part IV: Lambda API Gateway
  5. Part V: ECS Fargate & IaC
  6. Part VI: Cognito KMS Security
  7. Part VII: CodePipeline & Monitoring


Amazon RDS (Relational Database Service)

RDS Architecture and Deployment Modes

RDS manages relational database infrastructure—backups, patching, replication, and failover—while you control schema design and queries.

[Diagram: RDS Multi-AZ deployment across three Availability Zones. Application servers send writes to the primary instance (db.r5.large, read/write) and reads to a read-only read replica. The primary replicates synchronously to a standby instance, which takes over on automatic failover, and asynchronously to the read replica. Storage is encrypted EBS.]

Deployment Options:

| Mode | Use Case | RPO/RTO | Cost |
|---|---|---|---|
| Single-AZ | Dev/test environments | Minutes / hours | Lowest |
| Multi-AZ | Production HA | Zero data loss / failover typically 60-120 s | 2x instance cost |
| Read Replicas | Read-heavy workloads | N/A (read scaling) | +1 instance per replica |
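
During a Multi-AZ failover the database endpoint is repointed at the standby, so open connections drop and the application has to reconnect. The following is a minimal sketch of that reconnect logic, not a driver-specific recipe: `connect` is a hypothetical callable that you supply, and `ConnectionError` stands in for whatever exception your database driver actually raises.

```python
import random
import time


def connect_with_retry(connect, attempts=6, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call `connect()` until it succeeds, backing off exponentially with jitter."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff capped at max_delay, with full jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

The `sleep` parameter exists only so the wait can be replaced in tests; in production the default `time.sleep` is used.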

RDS Storage and Backup

Storage Types:

| Type | IOPS | Use Case |
|---|---|---|
| gp3 (General Purpose SSD) | 3,000 baseline, up to 16,000 provisioned | Most workloads (recommended) |
| io1/io2 (Provisioned IOPS) | Up to 64,000 | High-performance OLTP |

Automated Backups:

  • Daily snapshots during backup window
  • Transaction logs every 5 minutes for point-in-time recovery (PITR)
  • Retention: 0-35 days

Security:

  • Encryption: AES-256 via KMS (must enable at creation)
  • Network isolation: Private subnets, security groups
  • IAM database authentication: Token-based (15-min validity)
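
As an illustration of IAM database authentication, the sketch below builds connection parameters whose password is a short-lived token. It assumes `rds_client` is a `boto3.client("rds")` object and that the parameters are later passed to a PostgreSQL driver that accepts libpq-style keywords; the function name and the parameter layout are illustrative choices, not part of any AWS API.

```python
def iam_connection_params(rds_client, host, port, user, dbname):
    """Build connection parameters that use an IAM auth token as the password.

    `rds_client` is expected to be `boto3.client("rds")`.
    """
    token = rds_client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user
    )
    return {
        "host": host,
        "port": port,
        "user": user,
        "dbname": dbname,
        "password": token,     # valid for 15 minutes; request a fresh one per new connection
        "sslmode": "require",  # IAM authentication requires an SSL/TLS connection
    }
```

Because the token expires after 15 minutes, generate it immediately before opening each new connection rather than caching it in configuration.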

Amazon Aurora

Aurora Architecture

Aurora separates compute from storage, so each scales independently, and failover to an existing replica typically completes in about 30 seconds.

[Diagram: Aurora cluster. The application writes through the primary endpoint to the writer instance (read/write) and reads through the load-balanced reader endpoint, which fronts two read-only reader instances. All instances share the Aurora storage layer, which holds six copies of the data (Copy 1 to Copy 6).]

Key Features:

  1. 6-way replication across 3 AZs (can lose 2 copies without write impact, 3 without read impact)
  2. Self-healing storage - automatic repair of disk failures
  3. Backtrack - rewind database without restore (Aurora MySQL only)
  4. Global Database - typically <1 second cross-region replication lag
  5. Aurora Serverless v2 - auto-scaling from 0.5 to 128 ACU
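
The fault tolerance of the 6-way replication follows from Aurora's quorum model: a write needs acknowledgement from 4 of the 6 copies and a read needs 3 of 6. A small sketch of that arithmetic (the function and constant names are illustrative):

```python
COPIES = 6        # two copies in each of three Availability Zones
WRITE_QUORUM = 4  # a write needs 4 of 6 copies
READ_QUORUM = 3   # a read needs 3 of 6 copies


def storage_status(lost_copies):
    """Report whether the volume can still serve writes and reads."""
    healthy = COPIES - lost_copies
    return {
        "writable": healthy >= WRITE_QUORUM,
        "readable": healthy >= READ_QUORUM,
    }
```

Losing a whole Availability Zone removes two copies, so `storage_status(2)` still reports the volume as writable; losing a third copy leaves it readable only.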

Aurora Serverless v2

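resource "aws_rds_cluster" "aurora_serverless" {
  cluster_identifier = "aurora-serverless-cluster"
  engine             = "aurora-postgresql"
  engine_mode        = "provisioned" # Serverless v2 uses "provisioned", not "serverless"
  engine_version     = "15.4"
  database_name      = "myapp"

  # Credentials (example username): RDS generates the password and keeps it in Secrets Manager
  master_username             = "dbadmin"
  manage_master_user_password = true

  serverlessv2_scaling_configuration {
    max_capacity = 16.0 # 16 ACUs = 32 GB RAM
    min_capacity = 0.5  # 0.5 ACU = 1 GB RAM
  }

  backup_retention_period = 7
  deletion_protection     = true
}

# The cluster needs at least one instance; Serverless v2 instances use the "db.serverless" class
resource "aws_rds_cluster_instance" "writer" {
  cluster_identifier = aws_rds_cluster.aurora_serverless.id
  instance_class     = "db.serverless"
  engine             = aws_rds_cluster.aurora_serverless.engine
  engine_version     = aws_rds_cluster.aurora_serverless.engine_version
}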

When to use:

  • Variable/unpredictable workloads
  • Dev/test environments (scale down during off-hours)
  • Multi-tenant SaaS (per-tenant scaling)

Cost savings: potentially 60-80% vs provisioning for peak load, depending on how much of the time the workload sits well below its peak
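
Whether Serverless v2 is cheaper depends on the shape of the load. The sketch below compares capacity that follows an hourly ACU profile with an instance sized for peak and left running. Both prices are hypothetical parameters, not quoted AWS rates, and the model ignores storage, I/O, and minimum-capacity effects.

```python
def savings_vs_peak(hourly_acus, serverless_price, provisioned_price):
    """Fraction saved by following load instead of provisioning for peak.

    hourly_acus: ACUs needed in each hour of the period.
    serverless_price: assumed cost per ACU-hour for Serverless v2.
    provisioned_price: assumed cost per ACU-equivalent hour for a fixed instance.
    """
    serverless_cost = sum(hourly_acus) * serverless_price
    provisioned_cost = max(hourly_acus) * len(hourly_acus) * provisioned_price
    return 1 - serverless_cost / provisioned_cost


# One day: 8 busy hours at 16 ACUs, 16 quiet hours at 2 ACUs
day = [16] * 8 + [2] * 16
fraction_saved = savings_vs_peak(day, serverless_price=0.12, provisioned_price=0.08)
```

With these made-up prices the serverless option comes out 37.5% cheaper even though each ACU-hour costs more; a flat profile would make the result negative, meaning the fixed instance wins.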


Amazon ElastiCache

ElastiCache Architecture

ElastiCache provides managed Redis and Memcached for caching, session storage, and real-time analytics.

[Diagram: ElastiCache Redis with cluster mode enabled. Application servers connect through the configuration endpoint (auto-discovery) to three shards; each shard has a primary node (cache.r6g.large) that replicates to a replica node.]

Redis vs Memcached

| Feature | Redis | Memcached |
|---|---|---|
| Data Structures | Strings, Lists, Sets, Hashes | Key-value only |
| Persistence | Snapshots + AOF | None |
| Replication | Multi-AZ with failover | Multi-node, no replication |
| Use Cases | Session store, leaderboards, pub/sub | Simple caching |

Caching Strategy

Lazy Loading (Cache-Aside):

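import json
import os

from redis.cluster import RedisCluster

# The ".cfg." host is a cluster-mode configuration endpoint, so use the
# cluster-aware client; a plain redis.Redis client would hit MOVED errors.
redis_client = RedisCluster(
    host='redis-cluster.abc123.cfg.use1.cache.amazonaws.com',
    port=6379,
    ssl=True,
    # REDIS_AUTH_TOKEN is an assumed variable name; keep the token out of source
    password=os.environ['REDIS_AUTH_TOKEN'],
    decode_responses=True
)

def get_user(user_id):
    cache_key = f"user:{user_id}"

    # Try cache first
    cached_data = redis_client.get(cache_key)
    if cached_data is not None:
        return json.loads(cached_data)

    # Cache miss - fetch from database
    # (`db` is the application's database helper, defined elsewhere)
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # Store in cache for 1 hour; skip missing users so a miss is not cached
    if user is not None:
        redis_client.setex(cache_key, 3600, json.dumps(user))
    return user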

TTL Best Practices:

  • Short TTL (60s): Frequently changing data (stock counts)
  • Medium TTL (1 hour): Semi-static data (user profiles)
  • Long TTL (24 hours): Rarely changing data (categories)
  • Add jitter (±5 min) to prevent thundering herd
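
The jitter recommendation can be sketched as a small helper that spreads expirations so keys written together do not all expire together. The function name and the `rng` parameter are illustrative.

```python
import random


def jittered_ttl(base_seconds, jitter_seconds=300, rng=random):
    """Return the base TTL shifted by a random offset in [-jitter, +jitter].

    The result is clamped to at least 1 second so a short base TTL
    never produces a zero or negative expiry.
    """
    offset = rng.randint(-jitter_seconds, jitter_seconds)
    return max(1, base_seconds + offset)
```

In the cache-aside example this would replace the fixed expiry, for instance `redis_client.setex(cache_key, jittered_ttl(3600), json.dumps(user))`. For short TTLs, scale the jitter down (for example `jittered_ttl(60, jitter_seconds=10)`), since ±5 minutes is larger than the TTL itself.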

Amazon DynamoDB

DynamoDB Architecture

DynamoDB is a fully managed, serverless NoSQL database with single-digit millisecond performance.

[Diagram: DynamoDB table split into three partitions of 10 GB each, labeled PK user#001-user#500, user#501-user#1000, and user#1001-user#1500. The application reads through DynamoDB Accelerator (microsecond latency), and a Global Secondary Index serves an alternate access pattern (query by email).]

Core Concepts:

  • Partition Key (PK): Determines data distribution
  • Sort Key (SK): Optional, enables range queries
  • GSI: Alternate partition/sort keys for different query patterns
  • LSI: Alternate sort key (must define at table creation)
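
To see why the partition key determines data distribution, the sketch below hashes keys onto a fixed number of partitions and counts where requests land. SHA-256 and the three-partition layout are stand-ins chosen for illustration; DynamoDB's real hash function and partition count are internal to the service.

```python
import hashlib
from collections import Counter


def partition_for(partition_key, partitions=3):
    """Map a key to a partition by hashing it (SHA-256 is only a stand-in)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def spread(keys, partitions=3):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key, partitions) for key in keys)


# High cardinality: 3,000 distinct user IDs spread across the partitions
by_user = spread(f"USER#{i}" for i in range(3000))
# Low cardinality: 3,000 requests that all share one status value
by_status = spread("STATUS#PENDING" for _ in range(3000))
```

Every request in `by_status` lands on a single partition, which is the hot-partition pattern, while `by_user` is spread across all three.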

Single-Table Design Example

resource "aws_dynamodb_table" "ecommerce" {
  name         = "ecommerce-single-table"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "PK"
  range_key    = "SK"

  attribute {
    name = "PK"
    type = "S"
  }

  attribute {
    name = "SK"
    type = "S"
  }

  attribute {
    name = "GSI1PK"
    type = "S"
  }

  global_secondary_index {
    name            = "GSI1"
    hash_key        = "GSI1PK"
    projection_type = "ALL"
  }

  point_in_time_recovery {
    enabled = true
  }
}

Data Model:

| PK | SK | GSI1PK | Attributes |
|---|---|---|---|
| USER#123 | #METADATA | EMAIL#alice@example.com | name, createdAt |
| USER#123 | ORDER#2025-01-15 | STATUS#PENDING | total, items[] |
| PRODUCT#ABC | #METADATA | CATEGORY#electronics | price, stock |
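
One way to keep these key formats consistent is to build items through small helper functions instead of formatting strings at each call site. The helpers below are hypothetical and simply mirror the rows in the data model above.

```python
def user_item(user_id, email, name, created_at):
    """Build the profile item for a user."""
    return {
        "PK": f"USER#{user_id}",
        "SK": "#METADATA",
        "GSI1PK": f"EMAIL#{email}",
        "name": name,
        "createdAt": created_at,
    }


def order_item(user_id, order_date, status, total, items):
    """Build an order item stored in the same partition as its user."""
    return {
        "PK": f"USER#{user_id}",
        "SK": f"ORDER#{order_date}",  # ISO dates sort chronologically as strings
        "GSI1PK": f"STATUS#{status}",
        "total": total,
        "items": items,
    }
```

When writing such items through the boto3 resource API, pass numeric values such as `total` as `decimal.Decimal`, because that API does not accept Python floats.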

Query Examples:

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource('dynamodb').Table('ecommerce-single-table')

# Get user profile
response = table.get_item(Key={'PK': 'USER#123', 'SK': '#METADATA'})

# Get all orders for user
response = table.query(
    KeyConditionExpression=Key('PK').eq('USER#123') &
                          Key('SK').begins_with('ORDER#')
)

# Find user by email (using GSI)
response = table.query(
    IndexName='GSI1',
    KeyConditionExpression=Key('GSI1PK').eq('EMAIL#alice@example.com')
)
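
A single `query` call returns at most 1 MB of data; when more results exist, the response includes `LastEvaluatedKey`, which must be passed back as `ExclusiveStartKey` to fetch the next page. A minimal pagination sketch, assuming `table` is a boto3 `Table` resource like the one above (the helper name `query_all` is illustrative):

```python
def query_all(table, **query_kwargs):
    """Yield every item matching a query, following LastEvaluatedKey across pages."""
    while True:
        response = table.query(**query_kwargs)
        yield from response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return  # no more pages
        # Resume the next request where this page stopped
        query_kwargs["ExclusiveStartKey"] = last_key
```

For example, `list(query_all(table, KeyConditionExpression=Key('PK').eq('USER#123')))` collects every item in that user's partition regardless of how many pages it spans.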

Capacity Modes

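| Mode | Pricing (indicative us-east-1 list prices; check the current pricing page) | Use Case |
|---|---|---|
| On-Demand | $1.25/M writes, $0.25/M reads | Unpredictable traffic |
| Provisioned | $0.00065/WCU-hour, $0.00013/RCU-hour | Predictable traffic |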
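
The choice between the two modes comes down to utilization: provisioned capacity is billed whether or not it is used, while on-demand is billed per request. The sketch below estimates the utilization at which a provisioned WCU starts to beat on-demand writes. Prices are passed in as parameters, and the values in the example call are illustrative inputs that should be replaced with current figures for your region.

```python
SECONDS_PER_HOUR = 3600


def write_break_even_utilization(on_demand_per_million, provisioned_per_wcu_hour):
    """Utilization of a provisioned WCU above which provisioned is cheaper."""
    # A fully used WCU absorbs one write (item up to 1 KB) per second
    writes_per_wcu_hour = SECONDS_PER_HOUR
    on_demand_cost_at_full_use = writes_per_wcu_hour / 1_000_000 * on_demand_per_million
    return provisioned_per_wcu_hour / on_demand_cost_at_full_use


break_even = write_break_even_utilization(
    on_demand_per_million=1.25,        # illustrative input
    provisioned_per_wcu_hour=0.00065,  # illustrative input
)
```

With these inputs the break-even is roughly 14% average utilization: a table that keeps its provisioned write capacity busier than that is cheaper in provisioned mode, while a spikier table is cheaper on-demand.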

Capacity Units:

  • 1 WCU = 1 write/sec of item ≤1 KB
  • 1 RCU = 1 strongly consistent read/sec ≤4 KB (or 2 eventually consistent)
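
A worked version of the capacity-unit rules, as a sketch for standard (non-transactional) reads and writes. Item sizes are rounded up to the next 1 KB for writes and the next 4 KB for reads; the function names are illustrative.

```python
import math


def write_capacity_units(item_size_kb, writes_per_second):
    """WCUs needed: 1 WCU per 1 KB (rounded up) per write per second."""
    return math.ceil(item_size_kb) * writes_per_second


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """RCUs needed: 1 RCU per 4 KB (rounded up) per strongly consistent read
    per second; eventually consistent reads cost half."""
    units = math.ceil(item_size_kb / 4) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)
```

For example, writing 2.5 KB items 10 times per second needs `write_capacity_units(2.5, 10)` = 30 WCUs, and reading 6 KB items 100 times per second needs 200 RCUs with strong consistency or 100 RCUs with eventual consistency.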

Production Best Practices

Database Selection

Decision criteria:

  • Relational data + complex joins → RDS PostgreSQL/MySQL or Aurora
  • Extreme performance + auto-scaling → Aurora Serverless v2
  • Flexible schema + simple queries → DynamoDB
  • Sub-millisecond caching → ElastiCache Redis
  • Session store with persistence → ElastiCache Redis
  • Simple cache, multi-threaded → ElastiCache Memcached

Cost Optimization

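  1. RDS: Reserved Instances for steady workloads (around 40% savings, depending on term and payment option), gp3 storage instead of gp2 so IOPS can be provisioned independently of volume size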
  2. Aurora: Serverless v2 for variable workloads (60-80% savings)
  3. DynamoDB: On-Demand for spiky traffic, Provisioned for steady loads
  4. ElastiCache: Right-size nodes based on memory usage metrics

Monitoring Essentials

Critical metrics to monitor:

RDS/Aurora:

  • CPUUtilization (>70% for 5 min)
  • FreeableMemory (<1 GB)
  • DatabaseConnections (>80% max)

ElastiCache:

  • Evictions (non-zero = need more memory)
  • CacheHitRate (<80% = ineffective)

DynamoDB:

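  • ThrottledRequests, ReadThrottleEvents/WriteThrottleEvents (throttling)
  • UserErrors (HTTP 400 client errors such as invalid parameters; throttling is not counted here)
  • ConsumedReadCapacityUnits/ConsumedWriteCapacityUnits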
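
These thresholds translate directly into CloudWatch alarms. As a sketch, the helper below builds the parameters for the first RDS threshold above (CPUUtilization over 70% for 5 minutes); the function name and alarm naming scheme are assumptions made for the example.

```python
def rds_cpu_alarm_params(db_instance_id, threshold_percent=70.0):
    """Parameters for CloudWatch put_metric_alarm: CPU above threshold for 5 minutes."""
    return {
        "AlarmName": f"{db_instance_id}-cpu-high",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,  # one 5-minute datapoint
        "EvaluationPeriods": 1,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }
```

The dictionary can be passed to boto3 as `boto3.client("cloudwatch").put_metric_alarm(**rds_cpu_alarm_params("my-db"))`; in practice you would also set `AlarmActions` to an SNS topic so someone is notified.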

Exam Tips

Key Concepts:

  1. RDS Multi-AZ = HA (synchronous), Read Replicas = scaling (asynchronous)
  2. Aurora Global Database = <1 sec replication across regions
  3. DynamoDB Partition Key must have high cardinality (avoid hot partitions)
  4. ElastiCache Cluster Mode = horizontal scaling with shards

Common Scenarios:

  • "Need sub-millisecond latency" → ElastiCache or DynamoDB with DAX
  • "Database must survive AZ failure" → RDS Multi-AZ or Aurora
  • "Global app, low latency" → Aurora Global Database or DynamoDB Global Tables
  • "Unpredictable traffic" → DynamoDB On-Demand or Aurora Serverless

Tricky Edge Cases:

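  • RDS encryption cannot be enabled in place after creation; to encrypt an existing instance, copy a snapshot with encryption enabled and restore from the copy
  • DynamoDB LSI must be created at table creation, GSI can be added later
  • Aurora Backtrack is only available for Aurora MySQL
  • ElastiCache Redis AUTH requires encryption in-transit to be enabled (an AUTH token cannot be set on a cluster without TLS)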

Frequently Asked Questions

Q: What is the difference between RDS Multi-AZ and Read Replicas?

RDS Multi-AZ provides high availability through synchronous replication to a standby instance in another Availability Zone, with automatic failover that typically completes in 60-120 seconds. Read Replicas use asynchronous replication for read scaling and can be in different regions. Use Multi-AZ for high availability, Read Replicas for read performance.

Q: When should I use Aurora instead of RDS?

Use Aurora when you need higher throughput (AWS claims up to 5x MySQL and 3x PostgreSQL), automatic storage scaling up to 128 TB, or serverless capabilities. Aurora costs roughly 20% more but delivers lower replica lag, faster failover, and up to 15 read replicas. Point-in-time recovery up to 35 days is available on both. Choose RDS for budget constraints or for engines and versions that Aurora does not support.

Q: How does DynamoDB partition key design affect performance?

DynamoDB distributes data across partitions based on the partition key hash. Poor key design creates hot partitions where traffic concentrates on few partitions, causing throttling. Use high-cardinality keys like user IDs or composite keys. Avoid low-cardinality keys like status fields or timestamps alone.

Q: What is the difference between ElastiCache Redis and Memcached?

Redis supports complex data structures (sorted sets, lists), persistence, replication, and pub/sub messaging. Memcached is simpler, multi-threaded for high throughput, but lacks persistence and replication. Choose Redis for advanced features and persistence, Memcached for pure caching with multi-core scaling.

Q: How does Aurora Serverless scaling work?

Aurora Serverless automatically adjusts database capacity based on application demand using Aurora Capacity Units (ACUs); each ACU is about 2 GB of memory with corresponding CPU and networking. Version 2 scales in place, without downtime, in increments as small as 0.5 ACU between 0.5 and 128 ACUs, and it supports most features of provisioned Aurora. Ideal for variable or unpredictable workloads.

Q: What are DynamoDB Global Secondary Indexes used for?

Global Secondary Indexes (GSI) enable queries on non-primary-key attributes by giving the index its own partition and sort keys, different from the base table's. They support only eventually consistent reads, have their own throughput capacity in provisioned mode, and can be created or deleted at any time. Use GSIs to support multiple access patterns without duplicating tables.

Q: How do I optimize ElastiCache Redis cluster performance?

Optimize Redis by enabling cluster mode for horizontal scaling across shards, using appropriate node types based on memory requirements, implementing connection pooling, and monitoring cache hit rates above 80%. Use Redis pipelining for bulk operations, set appropriate eviction policies, and enable Multi-AZ for production high availability.
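
As an illustration of pipelining for bulk operations, the sketch below queues many writes and sends them together instead of paying one network round trip per key. It assumes `client` is a redis-py client (either `redis.Redis` or `RedisCluster`); the helper name `cache_many` is hypothetical.

```python
import json


def cache_many(client, items, ttl_seconds=3600):
    """Queue a SETEX for every key/value pair and send them as one batch."""
    pipe = client.pipeline(transaction=False)  # batching only, no MULTI/EXEC
    for key, value in items.items():
        pipe.setex(key, ttl_seconds, json.dumps(value))
    return pipe.execute()  # one result per queued command
```

With cluster mode enabled, the client groups the queued commands by the node that owns each key, so the batch costs roughly one round trip per shard rather than one per key.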


Conclusion

AWS database services provide specialized solutions for different workload patterns. RDS simplifies managed relational databases with Multi-AZ HA, Aurora delivers cloud-native performance with serverless scaling, ElastiCache provides sub-millisecond caching, and DynamoDB offers serverless NoSQL at virtually unlimited scale.

Select RDS for traditional SQL workloads, Aurora for extreme performance needs, ElastiCache to accelerate read-heavy applications, and DynamoDB for internet-scale apps with flexible schemas. Understanding deployment architectures, capacity modes, and cost optimization strategies is essential for both production systems and the AWS Certified Developer Associate exam.