Async AI Content Generation with AWS Bedrock Batch API: 50% Cost Reduction

The AI Content Cost Challenge

E-commerce platforms generate thousands of product descriptions for SEO optimization. AWS Bedrock provides Claude 3.5 Sonnet access at $0.003 per 1K input tokens and $0.015 per 1K output tokens. A store with 1,000 products requiring 300-token descriptions across 3 languages incurs roughly $16 per generation cycle at real-time rates.

Bedrock Batch API reduces costs by 50%—$0.0015 input, $0.0075 output—but requires event-driven infrastructure instead of synchronous API calls. The challenge: build reliable async AI pipelines that handle S3 triggers, parse responses, and update databases without blocking application threads.

Key Results:

  • 50% cost reduction using Bedrock Batch API vs real-time
  • Event-driven pipeline: S3 → Lambda → SNS → Bedrock → Lambda → SQS
  • Regex parsing for multi-language AI responses
  • Standard SQS queues with dead-letter handling

Architecture: Event-Driven AI Pipeline

The system orchestrates async AI generation through S3 event triggers, Lambda functions, SNS messaging, Bedrock batch processing, and SQS result queues.

[Architecture diagram: in the Application Layer, a Cron Job finds pending products and an SQS Consumer updates products in the Database. The ai-output S3 bucket holds input/batch-123.jsonl and output/batch-123.jsonl.out. The batch-trigger SNS topic invokes a Lambda that creates the Bedrock job; the Batch API runs Claude 3.5 Sonnet and writes results to S3, where an S3 event triggers a second Lambda that parses the output via regex extraction and sends messages to the standard products-queue SQS queue; failures route to a Dead Letter Queue.]

Architecture Flow:

  1. Cron Job: Queries database for products with auto_generation_status=pending
  2. JSONL Upload: Generates batch prompts, uploads to S3 input prefix
  3. SNS Trigger: Publishes message with S3 URIs to SNS topic
  4. Lambda Orchestrator: Receives SNS, creates Bedrock batch job
  5. Bedrock Processing: Async execution, writes .jsonl.out to S3
  6. S3 Event: Triggers parser Lambda when output file created
  7. Lambda Parser: Regex extraction, batch send to SQS
  8. SQS Consumer: Application polls queue, updates database
  9. DLQ Handling: Failed messages route to dead-letter queue for investigation

Implementation: Batch Processing Pipeline

Prompt Generation and JSONL Format

Application generates structured prompts with embedded identifiers for regex parsing after AI processing.

Pseudo Code - Prompt Builder:

FUNCTION generatePrompt(product, language):
    RETURN """
    Generate SEO-optimized HTML product description:
      - Product ID: {product.id}
      - Name: {product.title}
      - Keywords: {product.seo_keywords}
      - Language: {language.name}
      - HTML tags: p, ul, li, span
      - Length: 250-300 tokens

    CRITICAL:
      - Start with: #{product.id}#{language.id}#
      - No intro text, only HTML
    """

FUNCTION createBatchFile(products, languages):
    lines = []

    FOR EACH product IN products:
        FOR EACH language IN languages:
            prompt = generatePrompt(product, language)

            // Bedrock batch records wrap the model payload in modelInput
            jsonLine = {
                "recordId": "{product.id}-{language.id}",
                "modelInput": {
                    "anthropic_version": "bedrock-2023-05-31",
                    "messages": [{"role": "user", "content": prompt}],
                    "max_tokens": 1000
                }
            }

            lines.APPEND(JSON.stringify(jsonLine))

    RETURN lines.JOIN("\n")  // JSONL format

FUNCTION uploadToS3(content, batchId):
    s3.upload(
        bucket: "ai-output",
        key: "input/batch-{batchId}.jsonl",
        body: content
    )

Key Pattern:

  • Identifier prefix #{productId}#{languageId}# enables response extraction
  • JSONL format: one JSON object per line (Bedrock requirement)
  • Multi-language: all translations in single batch file
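A minimal Python sketch of the builder described above, assuming `product` and `language` are plain dicts (hypothetical shapes) and using the `recordId`/`modelInput` envelope from the Bedrock batch inference record format:

```python
import json

def generate_prompt(product: dict, language: dict) -> str:
    """Build the identifier-prefixed prompt for one product/language pair."""
    return (
        f"Generate SEO-optimized HTML product description:\n"
        f"- Product ID: {product['id']}\n"
        f"- Name: {product['title']}\n"
        f"- Keywords: {product['seo_keywords']}\n"
        f"- Language: {language['name']}\n"
        f"- HTML tags: p, ul, li, span\n"
        f"- Length: 250-300 tokens\n\n"
        f"CRITICAL:\n"
        f"- Start with: #{product['id']}#{language['id']}#\n"
        f"- No intro text, only HTML\n"
    )

def create_batch_file(products: list, languages: list) -> str:
    """Emit one JSON record per line (JSONL), one record per pair."""
    lines = []
    for product in products:
        for language in languages:
            record = {
                "recordId": f"{product['id']}-{language['id']}",
                "modelInput": {
                    "anthropic_version": "bedrock-2023-05-31",
                    "messages": [{"role": "user",
                                  "content": generate_prompt(product, language)}],
                    "max_tokens": 1000,
                },
            }
            lines.append(json.dumps(record))
    return "\n".join(lines)
```

The resulting string is uploaded as-is to the S3 input prefix.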

SNS Trigger and Lambda Orchestration

SNS decouples application from Lambda, allowing multiple subscribers and retry logic.

Terraform - SNS and Lambda Setup:

# SNS Topic
resource "aws_sns_topic" "ai_batch_trigger" {
  name = "ai-batch-trigger-topic"
}

# Lambda Function
resource "aws_lambda_function" "create_bedrock_job" {
  function_name = "create-bedrock-batch-job"
  runtime       = "python3.11"
  handler       = "handler.lambda_handler"
  timeout       = 30

  environment {
    variables = {
      BEDROCK_ROLE_ARN = aws_iam_role.bedrock_batch.arn
    }
  }
}

# SNS → Lambda Subscription
resource "aws_sns_topic_subscription" "lambda_trigger" {
  topic_arn = aws_sns_topic.ai_batch_trigger.arn
  protocol  = "lambda"
  endpoint  = aws_lambda_function.create_bedrock_job.arn
}

# Allow SNS to invoke the Lambda (required alongside the subscription)
resource "aws_lambda_permission" "sns_invoke" {
  statement_id  = "AllowSNSInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.create_bedrock_job.function_name
  principal     = "sns.amazonaws.com"
  source_arn    = aws_sns_topic.ai_batch_trigger.arn
}

Pseudo Code - SNS Publisher:

FUNCTION publishBatchJob(batchId, inputUri, outputUri):
    message = {
        "jobName": "ai-products-{batchId}",
        "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
        "s3InputUri": inputUri,
        "s3OutputUri": outputUri
    }

    sns.publish(
        topic: "ai-batch-trigger-topic",
        message: JSON.stringify(message)
    )
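A possible boto3 version of the publisher, with the SNS client injected so the function can be exercised without AWS credentials (in production it would be `boto3.client("sns")`):

```python
import json

def publish_batch_job(sns_client, topic_arn: str, batch_id: str,
                      input_uri: str, output_uri: str) -> str:
    """Publish a batch-job request to SNS and return the message ID."""
    message = {
        "jobName": f"ai-products-{batch_id}",
        "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
        "s3InputUri": input_uri,
        "s3OutputUri": output_uri,
    }
    response = sns_client.publish(TopicArn=topic_arn,
                                  Message=json.dumps(message))
    return response["MessageId"]
```

Injecting the client keeps the function unit-testable with a stub; the `TopicArn`/`Message` keyword arguments match the boto3 `publish` API.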

Pseudo Code - Lambda Handler:

FUNCTION lambda_handler(event):
    snsMessage = event.Records[0].Sns.Message
    params = JSON.parse(snsMessage)

    bedrockJob = bedrock.createModelInvocationJob(
        modelId: params.modelId,
        jobName: params.jobName,
        inputConfig: {
            s3Uri: params.s3InputUri
        },
        outputConfig: {
            s3Uri: params.s3OutputUri
        },
        roleArn: ENV.BEDROCK_ROLE_ARN
    )

    RETURN {
        statusCode: 200,
        body: bedrockJob.jobArn
    }
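A sketch of the handler in Python with boto3, again with the client injected for testability (in Lambda it would default to `boto3.client("bedrock")`); the `inputDataConfig`/`outputDataConfig` shapes follow the `create_model_invocation_job` API:

```python
import json
import os

def lambda_handler(event, context=None, bedrock_client=None):
    """Create a Bedrock batch inference job from an SNS-delivered message."""
    params = json.loads(event["Records"][0]["Sns"]["Message"])
    response = bedrock_client.create_model_invocation_job(
        jobName=params["jobName"],
        modelId=params["modelId"],
        roleArn=os.environ["BEDROCK_ROLE_ARN"],
        inputDataConfig={"s3InputDataConfig": {"s3Uri": params["s3InputUri"]}},
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": params["s3OutputUri"]}},
    )
    return {"statusCode": 200, "body": response["jobArn"]}
```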

S3 Event-Triggered Response Parsing

Bedrock writes results with .jsonl.out suffix. S3 notification triggers parser Lambda automatically.

Terraform - S3 Event Configuration:

resource "aws_s3_bucket_notification" "bedrock_output" {
  bucket = aws_s3_bucket.ai_output.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.output_parser.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "output/"
    filter_suffix       = ".jsonl.out"
  }

  depends_on = [aws_lambda_permission.s3_invoke]
}

# Allow S3 to invoke the parser Lambda (the notification fails without it)
resource "aws_lambda_permission" "s3_invoke" {
  statement_id  = "AllowS3Invoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.output_parser.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.ai_output.arn
}

Pseudo Code - Parser Lambda:

REGEX_PATTERN = /#(\d+)#(\d+)#([\s\S]*?)(?=#\d+#\d+#|$)/

FUNCTION extractProducts(aiResponse):
    results = []
    text = aiResponse.content[0].text

    FOR EACH match IN REGEX_PATTERN.findAll(text):
        results.APPEND({
            "productId": match.group(1),
            "languageId": match.group(2),
            "htmlContent": match.group(3).trim(),
            "aiGenerated": true
        })

    RETURN results

FUNCTION handler(s3Event):
    bucket = s3Event.Records[0].s3.bucket.name
    key = s3Event.Records[0].s3.object.key

    // Download output file
    content = s3.getObject(bucket, key)
    lines = content.split("\n")

    batch = []

    FOR EACH line IN lines:
        IF line.isEmpty(): CONTINUE

        payload = JSON.parse(line)
        // Bedrock batch output wraps each model response in modelOutput
        products = extractProducts(payload.modelOutput)

        FOR EACH product IN products:
            batch.APPEND({
                "Id": "{product.productId}-{product.languageId}",  // unique per batch
                "MessageBody": JSON.stringify(product)
            })

            // SQS batch limit: 10 messages
            IF batch.length == 10:
                sqs.sendMessageBatch(
                    queueUrl: ENV.PRODUCTS_QUEUE_URL,
                    entries: batch
                )
                batch.clear()

    // Send remaining messages
    IF batch.notEmpty():
        sqs.sendMessageBatch(
            queueUrl: ENV.PRODUCTS_QUEUE_URL,
            entries: batch
        )

Regex Breakdown:

  • #(\d+)# → Product ID (capture group 1)
  • #(\d+)# → Language ID (capture group 2)
  • #([\s\S]*?) → HTML content (capture group 3, non-greedy)
  • (?=#\d+#\d+#|$) → Lookahead: next record or end of string
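The same extraction in Python, using the regex verbatim:

```python
import re

# Matches #<productId>#<languageId>#<content> up to the next record or end.
RECORD_RE = re.compile(r"#(\d+)#(\d+)#([\s\S]*?)(?=#\d+#\d+#|$)")

def extract_products(text: str) -> list:
    """Pull (productId, languageId, html) triples out of one model response."""
    return [
        {
            "productId": m.group(1),
            "languageId": m.group(2),
            "htmlContent": m.group(3).strip(),
            "aiGenerated": True,
        }
        for m in RECORD_RE.finditer(text)
    ]
```

The non-greedy content group plus the lookahead means each match stops exactly where the next `#id#id#` prefix begins, so several records in one response split cleanly.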

SQS Queue and Consumer Pattern

Standard SQS queue buffers AI results for application consumption with dead-letter handling.

Terraform - Queue Setup:

resource "aws_sqs_queue" "products_dlq" {
  name                      = "products-dlq"
  message_retention_seconds = 1209600  // 14 days
  sqs_managed_sse_enabled   = true
}

resource "aws_sqs_queue" "products" {
  name                        = "products-queue"
  message_retention_seconds   = 345600  // 4 days
  receive_wait_time_seconds   = 20      // Long polling
  visibility_timeout_seconds  = 120
  sqs_managed_sse_enabled     = true

  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.products_dlq.arn
    maxReceiveCount     = 2
  })
}

Pseudo Code - SQS Consumer:

FUNCTION pollQueue():
    WHILE true:
        messages = sqs.receiveMessage(
            queueUrl: "products-queue",
            maxMessages: 10,
            waitTimeSeconds: 20  // Long polling
        )

        FOR EACH message IN messages:
            TRY:
                product = JSON.parse(message.Body)

                // Update database
                db.updateProductDescription(
                    productId: product.productId,
                    languageId: product.languageId,
                    description: product.htmlContent,
                    aiGenerated: true,
                    updatedAt: NOW()
                )

                // Delete from queue on success
                sqs.deleteMessage(
                    queueUrl: "products-queue",
                    receiptHandle: message.ReceiptHandle
                )

            CATCH error:
                LOG.error("Failed to process message", error)
                // Message returns to queue after visibility timeout
                // After 2 failures → moves to DLQ

Queue Configuration:

  • Long polling (20s): Reduces empty responses, lowers costs
  • Visibility timeout (120s): Prevents duplicate processing during DB updates
  • Max receive count (2): Failed messages move to DLQ after 2 attempts
  • Message retention (4 days): Enough time for retry and investigation
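One poll iteration of the consumer, sketched in Python with injected clients so it runs without AWS (`db.update_product_description` is a hypothetical application API; in production `sqs_client` would be `boto3.client("sqs")`):

```python
import json

def process_once(sqs_client, db, queue_url: str) -> int:
    """Receive up to 10 messages via long polling, update DB, delete on success."""
    resp = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling
    )
    processed = 0
    for message in resp.get("Messages", []):
        try:
            product = json.loads(message["Body"])
            db.update_product_description(
                product_id=product["productId"],
                language_id=product["languageId"],
                description=product["htmlContent"],
            )
            # Delete only on success; failures reappear after the visibility
            # timeout and move to the DLQ after maxReceiveCount attempts.
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            processed += 1
        except Exception as err:
            print(f"Failed to process message: {err}")
    return processed
```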

Cost Analysis and Trade-offs

Bedrock Batch vs Real-Time Pricing

Scenario: 1,000 products, 3 languages, 300-token descriptions

Real-Time API:

Input:  1,000 × 3 × 250 tokens × $0.003/1K = $2.25
Output: 1,000 × 3 × 300 tokens × $0.015/1K = $13.50
Total: $15.75 per generation cycle

Batch API:

Input:  1,000 × 3 × 250 tokens × $0.0015/1K = $1.13
Output: 1,000 × 3 × 300 tokens × $0.0075/1K = $6.75
Total: $7.88 per generation cycle
Savings: $7.87 (50% reduction)
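The arithmetic above can be double-checked with a few lines of Python:

```python
# Token volumes for 1,000 products x 3 languages.
input_tokens = 1_000 * 3 * 250
output_tokens = 1_000 * 3 * 300

def cycle_cost(input_rate_per_1k: float, output_rate_per_1k: float) -> float:
    """Cost of one generation cycle at the given per-1K-token rates."""
    return (input_tokens / 1_000) * input_rate_per_1k \
         + (output_tokens / 1_000) * output_rate_per_1k

realtime = cycle_cost(0.003, 0.015)    # $15.75
batch = cycle_cost(0.0015, 0.0075)     # $7.875, exactly half
```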

Infrastructure Costs:

  • Lambda: ~$1/month (minimal invocations)
  • S3: ~$2/month (storage + requests)
  • SQS: ~$0.50/month (standard queue)
  • SNS: ~$0.10/month (notifications)
  • Total overhead: ~$3.60/month

Monthly Savings (4 cycles): $31.48 - $3.60 = $27.88 per store

Multi-tenant (100 stores): ~$2,788/month savings

Performance Characteristics

Latency:

  • Real-time API: <1 second response
  • Batch API: 5-15 minutes processing time
  • Acceptable for: overnight jobs, scheduled regeneration
  • Not suitable for: user-facing features, real-time updates

Throughput:

  • Single batch limit: 10,000 requests
  • Multiple batches: unlimited (sequential processing)
  • Parallel execution: multiple stores can run simultaneously

Error Handling Patterns

S3 Event Failures:

  • Lambda parser errors trigger automatic retry (2 attempts)
  • CloudWatch alarms on repeated failures
  • Manual intervention for corrupted output files

SQS Processing Failures:

  • Visibility timeout returns message to queue
  • After 2 receive attempts → moves to DLQ
  • DLQ messages require manual inspection and redrive

Regex Parse Failures:

  • Missing identifiers: skip message, log warning
  • Malformed HTML: accept as-is (LLM-generated)
  • Unicode issues: ensure UTF-8 encoding throughout pipeline

Results and Best Practices

Cost Optimization:

  • 50% reduction on AI processing costs
  • Fixed infrastructure: ~$4/month regardless of volume
  • Variable costs scale linearly with content generation

What Worked:

✅ S3 events eliminate polling overhead—Lambda triggers automatically

✅ Regex parsing reliable for structured AI outputs with identifiers

✅ SNS decoupling allows adding consumers without code changes

✅ Long polling (20s) reduces SQS costs by 90% vs short polling

✅ Standard queues sufficient—no ordering requirements for products

✅ DLQ pattern handles failures gracefully without data loss

Pitfalls to Avoid:

❌ Hardcoded S3 paths—use environment variables for multi-environment support

❌ Missing IAM permissions cause silent failures—test Lambda roles thoroughly

❌ Short visibility timeout (<120s) causes duplicate database updates

❌ Regex must handle edge cases—test with various AI response formats

❌ Bedrock batch role needs both bucket-level and object-level S3 permissions

Monitoring Essentials:

  • CloudWatch: Bedrock job status, Lambda errors, SQS queue depth
  • Alarms: DLQ message count >10, Lambda failure rate >5%
  • Dashboards: Cost per 1K tokens, processing time trends

Conclusion

AWS Bedrock Batch API with event-driven architecture delivers 50% cost savings for async AI content generation. The S3 → Lambda → SNS → Bedrock → SQS pipeline handles thousands of product descriptions reliably while maintaining low infrastructure overhead.

Implementation Checklist:

  1. ✓ Create S3 bucket with input/output prefixes
  2. ✓ Set up SNS topic for batch job triggers
  3. ✓ Deploy Lambda for Bedrock job creation
  4. ✓ Configure S3 event notification for output parsing
  5. ✓ Set up SQS queue with DLQ redrive policy
  6. ✓ Build application poller for queue consumption
  7. ✓ Monitor CloudWatch metrics and DLQ depth

When to Use This Pattern:

  • AI content generation at scale (>100 items)
  • Async processing acceptable (5-15 min latency)
  • Cost optimization priority (50% savings)
  • Multi-language content requirements
  • Event-driven architecture already in place

When to Avoid:

  • Real-time user interactions (<1s response needed)
  • Small batches (<10 items) where overhead exceeds savings
  • Regulatory requirements prohibit async AI processing
  • Team lacks AWS Lambda/SQS expertise

Organizations generating thousands of AI descriptions achieve substantial cost reduction without compromising reliability. Event-driven patterns handle failures through retries, DLQs, and structured parsing—making async AI practical for production e-commerce platforms.
