Async AI Content Generation with AWS Bedrock Batch API: 50% Cost Reduction

The AI Content Cost Challenge

E-commerce platforms generate thousands of product descriptions for SEO optimization. AWS Bedrock provides Claude 3.5 Sonnet access at $0.003 per 1K input tokens and $0.015 per 1K output tokens. A store with 1,000 products requiring 300-token descriptions across 3 languages incurs roughly $16 per generation cycle at real-time rates.

Bedrock Batch API reduces costs by 50%—$0.0015 input, $0.0075 output—but requires event-driven infrastructure instead of synchronous API calls. The challenge: build reliable async AI pipelines that handle S3 triggers, parse responses, and update databases without blocking application threads.

Key Results:

  • 50% cost reduction using Bedrock Batch API vs real-time
  • Event-driven pipeline: S3 → Lambda → SNS → Bedrock → Lambda → SQS
  • Regex parsing for multi-language AI responses
  • Standard SQS queues with dead-letter handling

Architecture: Event-Driven AI Pipeline

The system orchestrates async AI generation through S3 event triggers, Lambda functions, SNS messaging, Bedrock batch processing, and SQS result queues.

[Architecture diagram: in the Application Layer, a Cron Job finds pending products and an SQS Consumer updates products in the Database. The ai-output S3 bucket holds input/batch-123.jsonl and output/batch-123.jsonl.out. The batch-trigger SNS topic invokes a Lambda that creates the Bedrock job; the Batch API runs Claude 3.5 Sonnet and writes results to S3, where an S3 event triggers a second Lambda that parses the output via regex extraction and sends messages to the standard products-queue SQS queue; failures route to a Dead Letter Queue.]

Architecture Flow:

  1. Cron Job: Queries database for products with auto_generation_status=pending
  2. JSONL Upload: Generates batch prompts, uploads to S3 input prefix
  3. SNS Trigger: Publishes message with S3 URIs to SNS topic
  4. Lambda Orchestrator: Receives SNS, creates Bedrock batch job
  5. Bedrock Processing: Async execution, writes .jsonl.out to S3
  6. S3 Event: Triggers parser Lambda when output file created
  7. Lambda Parser: Regex extraction, batch send to SQS
  8. SQS Consumer: Application polls queue, updates database
  9. DLQ Handling: Failed messages route to dead-letter queue for investigation

Implementation: Batch Processing Pipeline

Prompt Generation and JSONL Format

Application generates structured prompts with embedded identifiers for regex parsing after AI processing.

Pseudo Code - Prompt Builder:

FUNCTION generatePrompt(product, language):
    RETURN """
    Generate SEO-optimized HTML product description:
      - Product ID: {product.id}
      - Name: {product.title}
      - Keywords: {product.seo_keywords}
      - Language: {language.name}
      - HTML tags: p, ul, li, span
      - Length: 250-300 tokens

    CRITICAL:
      - Start with: #{product.id}#{language.id}#
      - No intro text, only HTML
    """

FUNCTION createBatchFile(products, languages):
    lines = []

    FOR EACH product IN products:
        FOR EACH language IN languages:
            prompt = generatePrompt(product, language)

            // Bedrock batch records wrap the model payload in modelInput
            jsonLine = {
                "recordId": "{product.id}-{language.id}",
                "modelInput": {
                    "anthropic_version": "bedrock-2023-05-31",
                    "messages": [{"role": "user", "content": prompt}],
                    "max_tokens": 1000
                }
            }

            lines.APPEND(JSON.stringify(jsonLine))

    RETURN lines.JOIN("\n")  // JSONL format

FUNCTION uploadToS3(content, batchId):
    s3.upload(
        bucket: "ai-output",
        key: "input/batch-{batchId}.jsonl",
        body: content
    )

Key Pattern:

  • Identifier prefix #{productId}#{languageId}# enables response extraction
  • JSONL format: one JSON object per line (Bedrock requirement)
  • Multi-language: all translations in single batch file
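A minimal Python sketch of the builder described above, assuming `product` and `language` are plain dicts (hypothetical shapes) and using the `recordId`/`modelInput` envelope from the Bedrock batch inference record format:

```python
import json

def generate_prompt(product: dict, language: dict) -> str:
    """Build the identifier-prefixed prompt for one product/language pair."""
    return (
        f"Generate SEO-optimized HTML product description:\n"
        f"- Product ID: {product['id']}\n"
        f"- Name: {product['title']}\n"
        f"- Keywords: {product['seo_keywords']}\n"
        f"- Language: {language['name']}\n"
        f"- HTML tags: p, ul, li, span\n"
        f"- Length: 250-300 tokens\n\n"
        f"CRITICAL:\n"
        f"- Start with: #{product['id']}#{language['id']}#\n"
        f"- No intro text, only HTML\n"
    )

def create_batch_file(products: list, languages: list) -> str:
    """Emit one JSON record per line (JSONL), one record per pair."""
    lines = []
    for product in products:
        for language in languages:
            record = {
                "recordId": f"{product['id']}-{language['id']}",
                "modelInput": {
                    "anthropic_version": "bedrock-2023-05-31",
                    "messages": [{"role": "user",
                                  "content": generate_prompt(product, language)}],
                    "max_tokens": 1000,
                },
            }
            lines.append(json.dumps(record))
    return "\n".join(lines)
```

The resulting string is uploaded as-is to the S3 input prefix.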

SNS Trigger and Lambda Orchestration

SNS decouples application from Lambda, allowing multiple subscribers and retry logic.

Terraform - SNS and Lambda Setup:

# SNS Topic
resource "aws_sns_topic" "ai_batch_trigger" {
  name = "ai-batch-trigger-topic"
}

# Lambda Function
resource "aws_lambda_function" "create_bedrock_job" {
  function_name = "create-bedrock-batch-job"
  runtime       = "python3.11"
  handler       = "handler.lambda_handler"
  timeout       = 30

  environment {
    variables = {
      BEDROCK_ROLE_ARN = aws_iam_role.bedrock_batch.arn
    }
  }
}

# SNS → Lambda Subscription
resource "aws_sns_topic_subscription" "lambda_trigger" {
  topic_arn = aws_sns_topic.ai_batch_trigger.arn
  protocol  = "lambda"
  endpoint  = aws_lambda_function.create_bedrock_job.arn
}

# Allow SNS to invoke the Lambda (required alongside the subscription)
resource "aws_lambda_permission" "sns_invoke" {
  statement_id  = "AllowSNSInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.create_bedrock_job.function_name
  principal     = "sns.amazonaws.com"
  source_arn    = aws_sns_topic.ai_batch_trigger.arn
}

Pseudo Code - SNS Publisher:

FUNCTION publishBatchJob(batchId, inputUri, outputUri):
    message = {
        "jobName": "ai-products-{batchId}",
        "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
        "s3InputUri": inputUri,
        "s3OutputUri": outputUri
    }

    sns.publish(
        topic: "ai-batch-trigger-topic",
        message: JSON.stringify(message)
    )
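A possible boto3 version of the publisher, with the SNS client injected so the function can be exercised without AWS credentials (in production it would be `boto3.client("sns")`):

```python
import json

def publish_batch_job(sns_client, topic_arn: str, batch_id: str,
                      input_uri: str, output_uri: str) -> str:
    """Publish a batch-job request to SNS and return the message ID."""
    message = {
        "jobName": f"ai-products-{batch_id}",
        "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
        "s3InputUri": input_uri,
        "s3OutputUri": output_uri,
    }
    response = sns_client.publish(TopicArn=topic_arn,
                                  Message=json.dumps(message))
    return response["MessageId"]
```

Injecting the client keeps the function unit-testable with a stub; the `TopicArn`/`Message` keyword arguments match the boto3 `publish` API.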

Pseudo Code - Lambda Handler:

FUNCTION lambda_handler(event):
    snsMessage = event.Records[0].Sns.Message
    params = JSON.parse(snsMessage)

    bedrockJob = bedrock.createModelInvocationJob(
        modelId: params.modelId,
        jobName: params.jobName,
        inputConfig: {
            s3Uri: params.s3InputUri
        },
        outputConfig: {
            s3Uri: params.s3OutputUri
        },
        roleArn: ENV.BEDROCK_ROLE_ARN
    )

    RETURN {
        statusCode: 200,
        body: bedrockJob.jobArn
    }
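A sketch of the handler in Python with boto3, again with the client injected for testability (in Lambda it would default to `boto3.client("bedrock")`); the `inputDataConfig`/`outputDataConfig` shapes follow the `create_model_invocation_job` API:

```python
import json
import os

def lambda_handler(event, context=None, bedrock_client=None):
    """Create a Bedrock batch inference job from an SNS-delivered message."""
    params = json.loads(event["Records"][0]["Sns"]["Message"])
    response = bedrock_client.create_model_invocation_job(
        jobName=params["jobName"],
        modelId=params["modelId"],
        roleArn=os.environ["BEDROCK_ROLE_ARN"],
        inputDataConfig={"s3InputDataConfig": {"s3Uri": params["s3InputUri"]}},
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": params["s3OutputUri"]}},
    )
    return {"statusCode": 200, "body": response["jobArn"]}
```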

S3 Event-Triggered Response Parsing

Bedrock writes results with .jsonl.out suffix. S3 notification triggers parser Lambda automatically.

Terraform - S3 Event Configuration:

resource "aws_s3_bucket_notification" "bedrock_output" {
  bucket = aws_s3_bucket.ai_output.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.output_parser.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "output/"
    filter_suffix       = ".jsonl.out"
  }

  depends_on = [aws_lambda_permission.s3_invoke]
}

# Allow S3 to invoke the parser Lambda (the notification fails without it)
resource "aws_lambda_permission" "s3_invoke" {
  statement_id  = "AllowS3Invoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.output_parser.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.ai_output.arn
}

Pseudo Code - Parser Lambda:

REGEX_PATTERN = /#(\d+)#(\d+)#([\s\S]*?)(?=#\d+#\d+#|$)/

FUNCTION extractProducts(aiResponse):
    results = []
    text = aiResponse.content[0].text

    FOR EACH match IN REGEX_PATTERN.findAll(text):
        results.APPEND({
            "productId": match.group(1),
            "languageId": match.group(2),
            "htmlContent": match.group(3).trim(),
            "aiGenerated": true
        })

    RETURN results

FUNCTION handler(s3Event):
    bucket = s3Event.Records[0].s3.bucket.name
    key = s3Event.Records[0].s3.object.key

    // Download output file
    content = s3.getObject(bucket, key)
    lines = content.split("\n")

    batch = []

    FOR EACH line IN lines:
        IF line.isEmpty(): CONTINUE

        payload = JSON.parse(line)
        // Bedrock batch output wraps each model response in modelOutput
        products = extractProducts(payload.modelOutput)

        FOR EACH product IN products:
            batch.APPEND({
                "Id": "{product.productId}-{product.languageId}",  // unique per batch
                "MessageBody": JSON.stringify(product)
            })

            // SQS batch limit: 10 messages
            IF batch.length == 10:
                sqs.sendMessageBatch(
                    queueUrl: ENV.PRODUCTS_QUEUE_URL,
                    entries: batch
                )
                batch.clear()

    // Send remaining messages
    IF batch.notEmpty():
        sqs.sendMessageBatch(
            queueUrl: ENV.PRODUCTS_QUEUE_URL,
            entries: batch
        )

Regex Breakdown:

  • #(\d+)# → Product ID (capture group 1)
  • #(\d+)# → Language ID (capture group 2)
  • #([\s\S]*?) → HTML content (capture group 3, non-greedy)
  • (?=#\d+#\d+#|$) → Lookahead: next record or end of string
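The same extraction in Python, using the regex verbatim:

```python
import re

# Matches #<productId>#<languageId>#<content> up to the next record or end.
RECORD_RE = re.compile(r"#(\d+)#(\d+)#([\s\S]*?)(?=#\d+#\d+#|$)")

def extract_products(text: str) -> list:
    """Pull (productId, languageId, html) triples out of one model response."""
    return [
        {
            "productId": m.group(1),
            "languageId": m.group(2),
            "htmlContent": m.group(3).strip(),
            "aiGenerated": True,
        }
        for m in RECORD_RE.finditer(text)
    ]
```

The non-greedy content group plus the lookahead means each match stops exactly where the next `#id#id#` prefix begins, so several records in one response split cleanly.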

SQS Queue and Consumer Pattern

Standard SQS queue buffers AI results for application consumption with dead-letter handling.

Terraform - Queue Setup:

resource "aws_sqs_queue" "products_dlq" {
  name                      = "products-dlq"
  message_retention_seconds = 1209600  // 14 days
  sqs_managed_sse_enabled   = true
}

resource "aws_sqs_queue" "products" {
  name                        = "products-queue"
  message_retention_seconds   = 345600  // 4 days
  receive_wait_time_seconds   = 20      // Long polling
  visibility_timeout_seconds  = 120
  sqs_managed_sse_enabled     = true

  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.products_dlq.arn
    maxReceiveCount     = 2
  })
}

Pseudo Code - SQS Consumer:

FUNCTION pollQueue():
    WHILE true:
        messages = sqs.receiveMessage(
            queueUrl: "products-queue",
            maxMessages: 10,
            waitTimeSeconds: 20  // Long polling
        )

        FOR EACH message IN messages:
            TRY:
                product = JSON.parse(message.Body)

                // Update database
                db.updateProductDescription(
                    productId: product.productId,
                    languageId: product.languageId,
                    description: product.htmlContent,
                    aiGenerated: true,
                    updatedAt: NOW()
                )

                // Delete from queue on success
                sqs.deleteMessage(
                    queueUrl: "products-queue",
                    receiptHandle: message.ReceiptHandle
                )

            CATCH error:
                LOG.error("Failed to process message", error)
                // Message returns to queue after visibility timeout
                // After 2 failures → moves to DLQ

Queue Configuration:

  • Long polling (20s): Reduces empty responses, lowers costs
  • Visibility timeout (120s): Prevents duplicate processing during DB updates
  • Max receive count (2): Failed messages move to DLQ after 2 attempts
  • Message retention (4 days): Enough time for retry and investigation
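One poll iteration of the consumer, sketched in Python with injected clients so it runs without AWS (`db.update_product_description` is a hypothetical application API; in production `sqs_client` would be `boto3.client("sqs")`):

```python
import json

def process_once(sqs_client, db, queue_url: str) -> int:
    """Receive up to 10 messages via long polling, update DB, delete on success."""
    resp = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling
    )
    processed = 0
    for message in resp.get("Messages", []):
        try:
            product = json.loads(message["Body"])
            db.update_product_description(
                product_id=product["productId"],
                language_id=product["languageId"],
                description=product["htmlContent"],
            )
            # Delete only on success; failures reappear after the visibility
            # timeout and move to the DLQ after maxReceiveCount attempts.
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            processed += 1
        except Exception as err:
            print(f"Failed to process message: {err}")
    return processed
```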

Cost Analysis and Trade-offs

Bedrock Batch vs Real-Time Pricing

Scenario: 1,000 products, 3 languages, 300-token descriptions

Real-Time API:

Input:  1,000 × 3 × 250 tokens × $0.003/1K = $2.25
Output: 1,000 × 3 × 300 tokens × $0.015/1K = $13.50
Total: $15.75 per generation cycle

Batch API:

Input:  1,000 × 3 × 250 tokens × $0.0015/1K = $1.13
Output: 1,000 × 3 × 300 tokens × $0.0075/1K = $6.75
Total: $7.88 per generation cycle
Savings: $7.87 (50% reduction)
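The arithmetic above can be double-checked with a few lines of Python:

```python
# Token volumes for 1,000 products x 3 languages.
input_tokens = 1_000 * 3 * 250
output_tokens = 1_000 * 3 * 300

def cycle_cost(input_rate_per_1k: float, output_rate_per_1k: float) -> float:
    """Cost of one generation cycle at the given per-1K-token rates."""
    return (input_tokens / 1_000) * input_rate_per_1k \
         + (output_tokens / 1_000) * output_rate_per_1k

realtime = cycle_cost(0.003, 0.015)    # $15.75
batch = cycle_cost(0.0015, 0.0075)     # $7.875, exactly half
```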

Infrastructure Costs:

  • Lambda: ~$1/month (minimal invocations)
  • S3: ~$2/month (storage + requests)
  • SQS: ~$0.50/month (standard queue)
  • SNS: ~$0.10/month (notifications)
  • Total overhead: ~$3.60/month

Monthly Savings (4 cycles): $31.48 - $3.60 = $27.88 per store

Multi-tenant (100 stores): ~$2,788/month savings

Performance Characteristics

Latency:

  • Real-time API: <1 second response
  • Batch API: 5-15 minutes processing time
  • Acceptable for: overnight jobs, scheduled regeneration
  • Not suitable for: user-facing features, real-time updates

Throughput:

  • Single batch limit: 10,000 requests
  • Multiple batches: unlimited (sequential processing)
  • Parallel execution: multiple stores can run simultaneously

Error Handling Patterns

S3 Event Failures:

  • Lambda parser errors trigger automatic retry (2 attempts)
  • CloudWatch alarms on repeated failures
  • Manual intervention for corrupted output files

SQS Processing Failures:

  • Visibility timeout returns message to queue
  • After 2 receive attempts → moves to DLQ
  • DLQ messages require manual inspection and redrive

Regex Parse Failures:

  • Missing identifiers: skip message, log warning
  • Malformed HTML: accept as-is (LLM-generated)
  • Unicode issues: ensure UTF-8 encoding throughout pipeline

Results and Best Practices

Cost Optimization:

  • 50% reduction on AI processing costs
  • Fixed infrastructure: ~$4/month regardless of volume
  • Variable costs scale linearly with content generation

What Worked:

✅ S3 events eliminate polling overhead—Lambda triggers automatically

✅ Regex parsing reliable for structured AI outputs with identifiers

✅ SNS decoupling allows adding consumers without code changes

✅ Long polling (20s) reduces SQS costs by 90% vs short polling

✅ Standard queues sufficient—no ordering requirements for products

✅ DLQ pattern handles failures gracefully without data loss

Pitfalls to Avoid:

❌ Hardcoded S3 paths—use environment variables for multi-environment support

❌ Missing IAM permissions cause silent failures—test Lambda roles thoroughly

❌ Short visibility timeout (<120s) causes duplicate database updates

❌ Regex must handle edge cases—test with various AI response formats

❌ Bedrock batch role needs both bucket-level and object-level S3 permissions

Monitoring Essentials:

  • CloudWatch: Bedrock job status, Lambda errors, SQS queue depth
  • Alarms: DLQ message count >10, Lambda failure rate >5%
  • Dashboards: Cost per 1K tokens, processing time trends

Conclusion

AWS Bedrock Batch API with event-driven architecture delivers 50% cost savings for async AI content generation. The S3 → Lambda → SNS → Bedrock → SQS pipeline handles thousands of product descriptions reliably while maintaining low infrastructure overhead.

Implementation Checklist:

  1. ✓ Create S3 bucket with input/output prefixes
  2. ✓ Set up SNS topic for batch job triggers
  3. ✓ Deploy Lambda for Bedrock job creation
  4. ✓ Configure S3 event notification for output parsing
  5. ✓ Set up SQS queue with DLQ redrive policy
  6. ✓ Build application poller for queue consumption
  7. ✓ Monitor CloudWatch metrics and DLQ depth

When to Use This Pattern:

  • AI content generation at scale (>100 items)
  • Async processing acceptable (5-15 min latency)
  • Cost optimization priority (50% savings)
  • Multi-language content requirements
  • Event-driven architecture already in place

When to Avoid:

  • Real-time user interactions (<1s response needed)
  • Small batches (<10 items) where overhead exceeds savings
  • Regulatory requirements prohibit async AI processing
  • Team lacks AWS Lambda/SQS expertise

Organizations generating thousands of AI descriptions achieve substantial cost reduction without compromising reliability. Event-driven patterns handle failures through retries, DLQs, and structured parsing—making async AI practical for production e-commerce platforms.
