SQS FIFO Queues SNS Pub/Sub: Event-Driven Systems

Introduction

Decoupled architectures are essential for building scalable, resilient cloud applications. AWS provides three core messaging services for different communication patterns: Amazon SQS for asynchronous queue-based messaging, Amazon SNS for pub/sub and fan-out notifications, and Amazon Kinesis for real-time data streaming and analytics.

This guide explores the architecture, configuration, and production patterns for each service. We'll cover SQS standard vs FIFO queues with dead-letter queues, SNS topic subscriptions with message filtering, and Kinesis data streams with consumer scaling patterns.

AWS Developer Certification Series

📚 View Complete AWS Developer Certification Guide - Master all 7 parts with our comprehensive learning path.

This is Part III (Current Article) of our comprehensive 7-part AWS developer guide:

← Part II: RDS Aurora & DynamoDB | Part IV: Lambda API Gateway →

Amazon SQS (Simple Queue Service)

SQS Architecture

SQS provides fully managed message queues for decoupling application components with at-least-once delivery.

Key Features:

At-least-once delivery (Standard) or exactly-once processing (FIFO)
Message retention: 1 minute to 14 days (default 4 days)
Message size: Up to 256 KB (use S3 for larger)
Visibility timeout: Prevents duplicate processing (default 30s)
Long polling: Reduces empty responses (up to 20s wait)

SQS Standard vs FIFO Queues

Feature	Standard Queue	FIFO Queue
Throughput	Unlimited TPS	300 TPS (3,000 with batching)
Ordering	Best-effort ordering	Strict ordering
Delivery	At-least-once	Exactly-once processing
Use Case	High throughput, duplicates OK	Order-critical workflows
Naming	Any name	Must end with `.fifo`

SQS Configuration with Terraform

resource "aws_sqs_queue" "orders" {
  name                       = "orders-queue.fifo"
  fifo_queue                 = true
  content_based_deduplication = true

  # Message settings
  message_retention_seconds  = 1209600  # 14 days
  visibility_timeout_seconds = 300      # 5 minutes

  # Dead letter queue
  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.orders_dlq.arn
    maxReceiveCount     = 3
  })

  # Encryption
  sqs_managed_sse_enabled = true
}

resource "aws_sqs_queue" "orders_dlq" {
  name                      = "orders-dlq.fifo"
  fifo_queue                = true
  message_retention_seconds = 1209600
}

SQS Message Processing Patterns

Long Polling (Recommended):

import boto3
import json

sqs = boto3.client('sqs')
queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/orders-queue.fifo'

def process_messages():
    while True:
        # Long polling with 20-second wait
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # Long polling
            VisibilityTimeout=300,
            MessageAttributeNames=['All']
        )

        messages = response.get('Messages', [])

        for message in messages:
            try:
                # Process message
                body = json.loads(message['Body'])
                process_order(body)

                # Delete after successful processing
                sqs.delete_message(
                    QueueUrl=queue_url,
                    ReceiptHandle=message['ReceiptHandle']
                )
            except Exception as e:
                print(f"Error processing message: {e}")
                # Message returns to queue after visibility timeout

Batch Operations:

# Send batch (up to 10 messages)
entries = [
    {
        'Id': str(i),
        'MessageBody': json.dumps({'order_id': i}),
        'MessageGroupId': 'orders',  # FIFO requirement
        'MessageDeduplicationId': str(uuid.uuid4())
    }
    for i in range(10)
]

sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)

Amazon SNS (Simple Notification Service)

SNS Architecture

SNS provides pub/sub messaging for fan-out patterns and multi-protocol delivery.

Supported Protocols:

SQS - Queue-based processing
Lambda - Serverless event handling
HTTP/HTTPS - Webhook delivery
Email/Email-JSON - Notifications
SMS - Text messages
Mobile Push - iOS, Android, Firebase

SNS with SQS Fan-out Pattern

resource "aws_sns_topic" "orders" {
  name = "order-events"

  # Enable encryption
  kms_master_key_id = aws_kms_key.sns.id
}

# Subscription 1: Inventory service
resource "aws_sns_topic_subscription" "inventory" {
  topic_arn = aws_sns_topic.orders.arn
  protocol  = "sqs"
  endpoint  = aws_sqs_queue.inventory.arn

  # Message filtering
  filter_policy = jsonencode({
    event_type = ["order_placed", "order_cancelled"]
  })
}

# Subscription 2: Analytics service
resource "aws_sns_topic_subscription" "analytics" {
  topic_arn = aws_sns_topic.orders.arn
  protocol  = "sqs"
  endpoint  = aws_sqs_queue.analytics.arn
}

# SQS queue policy to allow SNS
resource "aws_sqs_queue_policy" "inventory" {
  queue_url = aws_sqs_queue.inventory.url

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = { Service = "sns.amazonaws.com" }
      Action = "sqs:SendMessage"
      Resource = aws_sqs_queue.inventory.arn
      Condition = {
        ArnEquals = { "aws:SourceArn" = aws_sns_topic.orders.arn }
      }
    }]
  })
}

SNS Message Filtering

import boto3
import json

sns = boto3.client('sns')

# Publish with attributes for filtering
sns.publish(
    TopicArn='arn:aws:sns:us-east-1:123456789012:order-events',
    Message=json.dumps({
        'order_id': '12345',
        'amount': 99.99,
        'status': 'placed'
    }),
    MessageAttributes={
        'event_type': {'DataType': 'String', 'StringValue': 'order_placed'},
        'priority': {'DataType': 'String', 'StringValue': 'high'},
        'region': {'DataType': 'String', 'StringValue': 'us-east-1'}
    }
)

Filter Policy Example:

{
  "event_type": ["order_placed", "order_shipped"],
  "priority": ["high"],
  "region": [{ "prefix": "us-" }]
}

Amazon Kinesis Data Streams

Kinesis Architecture

Kinesis enables real-time streaming data processing with ordered, replayable records.

Key Concepts:

Shard: Base unit of capacity (1 MB/s or 1,000 records/s write, 2 MB/s read)
Partition Key: Determines shard assignment (use high-cardinality keys)
Sequence Number: Unique per shard, preserves order
Retention: 24 hours default (up to 365 days with extended retention)
Consumers: Shared (2 MB/s per shard) or Enhanced Fan-out (2 MB/s per consumer)

Kinesis Stream Configuration

resource "aws_kinesis_stream" "events" {
  name             = "application-events"
  shard_count      = 3
  retention_period = 168  # 7 days

  shard_level_metrics = [
    "IncomingBytes",
    "IncomingRecords",
    "OutgoingBytes",
    "OutgoingRecords"
  ]

  stream_mode_details {
    stream_mode = "PROVISIONED"  # or ON_DEMAND
  }

  encryption_type = "KMS"
  kms_key_id      = aws_kms_key.kinesis.id
}

Kinesis Producer and Consumer

Kinesis Producer (Python):

import boto3
import json
from datetime import datetime

kinesis = boto3.client('kinesis')

def send_event(user_id, event_type, data):
    kinesis.put_record(
        StreamName='application-events',
        Data=json.dumps({
            'user_id': user_id,
            'event_type': event_type,
            'timestamp': datetime.utcnow().isoformat(),
            'data': data
        }),
        PartitionKey=str(user_id)  # Ensures same user goes to same shard
    )

# Batch writes (up to 500 records)
records = [
    {
        'Data': json.dumps({'user_id': i, 'action': 'click'}),
        'PartitionKey': str(i)
    }
    for i in range(100)
]

kinesis.put_records(StreamName='application-events', Records=records)

Lambda Consumer:

import json
import base64

def lambda_handler(event, context):
    for record in event['Records']:
        # Kinesis data is base64 encoded
        payload = base64.b64decode(record['kinesis']['data'])
        data = json.loads(payload)

        print(f"Processing: {data}")

        # Process event
        process_event(data)

    return {'statusCode': 200}

def process_event(data):
    # Business logic
    user_id = data['user_id']
    event_type = data['event_type']
    # ... process data

Comparison and Use Cases

When to Use Each Service

Service	Use Case	Pattern
SQS Standard	High-throughput async processing, duplicates OK	Queue-based decoupling
SQS FIFO	Order-critical workflows, exactly-once processing	Ordered queue processing
SNS	Fan-out to multiple subscribers, multi-protocol delivery	Pub/sub notifications
SNS + SQS	Reliable fan-out with queue buffering	Fan-out + queue pattern
Kinesis	Real-time analytics, ordered streaming, replay capability	Stream processing

Architecture Patterns

Pattern 1: Decoupled Microservices (SQS)

Producer sends messages to SQS
Consumer polls and processes asynchronously
DLQ handles failed messages

Pattern 2: Event Fan-out (SNS + SQS)

Publisher sends to SNS topic
Multiple SQS queues subscribe to topic
Each service processes independently

Pattern 3: Real-time Analytics (Kinesis)

Producers stream events to Kinesis
Lambda processes in real-time
Firehose archives to S3/Redshift

Production Best Practices

SQS Best Practices

Visibility Timeout: Set to 6x your processing time
Dead Letter Queue: Always configure with maxReceiveCount of 3-5
Long Polling: Enable WaitTimeSeconds=20 to reduce costs
Batching: Use batch operations for throughput (10 messages max)
Idempotency: Handle duplicate messages (use deduplication IDs for FIFO)

SNS Best Practices

Message Filtering: Use filter policies to reduce unnecessary deliveries
Retry Policies: Configure delivery retry for HTTP/HTTPS endpoints
DLQ for Lambda: Set up DLQ for failed Lambda invocations
Encryption: Enable KMS encryption for sensitive data
Raw Message Delivery: Enable for SQS subscribers to avoid SNS wrapper

Kinesis Best Practices

Partition Keys: Use high-cardinality keys (user ID, device ID) to distribute load
Shard Scaling: Monitor IncomingBytes and IncomingRecords metrics
Enhanced Fan-out: Use for multiple consumers (dedicated 2 MB/s per consumer)
Checkpointing: Use KCL (Kinesis Client Library) for automatic checkpointing
Error Handling: Configure Lambda retry and DLQ for stream processing

Monitoring and Troubleshooting

Key Metrics to Monitor

SQS:

ApproximateNumberOfMessagesVisible (queue depth)
ApproximateAgeOfOldestMessage (processing lag)
NumberOfMessagesSent / NumberOfMessagesDeleted

SNS:

NumberOfMessagesPublished
NumberOfNotificationsFailed
NumberOfNotificationsDelivered

Kinesis:

IncomingBytes / IncomingRecords (producer throughput)
GetRecords.IteratorAgeMilliseconds (consumer lag)
WriteProvisionedThroughputExceeded (throttling)

Exam Tips

Key Concepts:

SQS FIFO requires .fifo suffix and has 300 TPS limit (3,000 with batching)
SNS cannot deliver to SQS FIFO - use SNS Standard → SQS FIFO
Kinesis ordering is per shard (use same partition key for ordering)
Dead Letter Queues are for debugging failed messages, not for retry logic

Common Scenarios:

"Decouple microservices" → SQS
"Fan-out to multiple services" → SNS + SQS
"Real-time clickstream analytics" → Kinesis Data Streams
"Exactly-once processing in order" → SQS FIFO
"Replay events from 3 days ago" → Kinesis (with extended retention)

Frequently Asked Questions

Q: What is the difference between SQS Standard and FIFO queues?

SQS Standard queues offer unlimited throughput, at-least-once delivery, and best-effort ordering. FIFO queues guarantee exactly-once processing, strict message ordering, and 300 transactions per second (3,000 with batching). Use Standard for high throughput, FIFO when order matters like payment processing or inventory updates.

Q: How does SNS message filtering work?

SNS message filtering uses filter policies attached to subscriptions to control which messages are delivered to each subscriber. Policies use JSON to match message attributes like numeric ranges, string matching, or prefix patterns. This reduces unnecessary deliveries, lowers costs, and simplifies subscriber logic.

Q: What is Kinesis Data Streams retention period?

Kinesis Data Streams retains records for 24 hours by default, extendable to 365 days. This allows consumers to replay data or recover from processing failures. Extended retention enables reprocessing historical data for analytics, debugging, or compliance requirements without re-ingesting from source systems.

Q: When should I use SQS versus SNS?

Use SQS when you need point-to-point async processing with one consumer pulling messages. Use SNS for pub/sub patterns where one message needs to reach multiple subscribers simultaneously. Combine both: SNS publishes to multiple SQS queues, enabling fan-out with decoupled consumers processing independently.

Q: How does Kinesis partition key affect data distribution?

Kinesis uses partition keys to hash data across shards for parallel processing. The same partition key always routes to the same shard, ensuring ordering for related records. Use high-cardinality keys like user IDs to distribute load evenly. Poor key selection causes hot shards and throttling.

Q: What is SQS visibility timeout and how should it be set?

Visibility timeout is the period a message is hidden from other consumers after being received, preventing duplicate processing. Set it to 6 times your maximum processing time to allow for retries. If timeout expires before deletion, the message becomes visible again for reprocessing.

Q: How does Kinesis Enhanced Fan-out improve performance?

Enhanced Fan-out provides dedicated 2 MB/s throughput per consumer using HTTP/2 push instead of polling. Multiple consumers can read simultaneously without competing for shard capacity. Standard consumers share 2 MB/s per shard. Use Enhanced Fan-out for latency-sensitive applications with multiple real-time processors.

Conclusion

AWS messaging services provide specialized solutions for different communication patterns. SQS enables asynchronous decoupling with queue-based messaging, SNS provides pub/sub fan-out across multiple subscribers, and Kinesis delivers real-time streaming for analytics and event processing.

Choose SQS for reliable async processing, SNS for broadcasting events to multiple consumers, and Kinesis for ordered, replayable data streams. Understanding message ordering guarantees, throughput limits, and integration patterns is essential for both production architectures and the AWS Certified Developer Associate exam.