
AWS SQS vs SNS Explained — Queues, Topics and When to Use Each

In Plain English 🔥
Imagine a busy pizza restaurant. SNS is the manager who shouts 'Order 42 is ready!' — every station (kitchen, cashier, delivery) that cares about that announcement hears it at the same time. SQS is the ticket rail above the grill — each chef grabs one ticket, works through it at their own pace, and the ticket is gone once it's done. One broadcasts, one queues. That's the whole mental model.
⚡ Quick Answer
SQS is a durable point-to-point queue: each message goes to exactly one consumer, which processes it at its own pace. SNS is a pub/sub topic: one published message reaches every subscriber at once. When you need both broadcasting and durability, publish to SNS and fan out into SQS queues.

Modern applications rarely do one thing at a time. A user places an order and suddenly you need to charge their card, send a confirmation email, update inventory, notify the warehouse, and log an audit trail — all reliably, even if your email service crashes at 2 a.m. That's the problem AWS SQS and SNS were built to solve. They decouple the parts of your system so that a failure in one place doesn't cascade everywhere.

Without messaging services like these, you'd wire services together with direct HTTP calls. Service A calls Service B, which calls Service C. If B is slow, A waits. If C is down, the whole chain breaks. SQS introduces a buffer — a durable queue that holds messages until a consumer is ready to process them. SNS takes a different angle: it lets one event instantly fan out to dozens of subscribers without the publisher needing to know who they are.

By the end of this article you'll know the architectural difference between a queue and a publish-subscribe topic, how to wire SQS and SNS together for a real fan-out pattern, what dead-letter queues are and why you desperately need them, and exactly when to reach for each service in your next cloud project.

SQS — The Durable Message Queue That Saves Your System at 2 A.M.

SQS (Simple Queue Service) is a fully managed message queue. A producer drops a message into the queue, and one or more consumers poll the queue and process messages at their own pace. The key word is 'one' — by default each message is delivered to exactly one consumer. This is point-to-point messaging.

Why does that matter? Because it gives you back-pressure handling for free. If your order-processing service is overwhelmed, messages just pile up in the queue safely. The queue acts as a shock absorber between the part of your system that generates work and the part that does the work.

There are two flavours. Standard queues give you maximum throughput with at-least-once delivery and best-effort ordering — meaning a message might appear twice (rare, but plan for it). FIFO queues guarantee exactly-once processing and strict order, but cap you at 3,000 messages per second with batching. Choose FIFO when order actually matters — financial transactions, state machines. Choose Standard everywhere else.

Messages live in the queue for up to 14 days. The visibility timeout is the other critical setting: after a consumer picks up a message, it becomes invisible to other consumers for that window. If your Lambda or EC2 worker crashes mid-process, the message reappears and gets retried. That's your built-in retry mechanism.

sqs_producer_consumer.py · PYTHON
import boto3
import json
import time

# --- PRODUCER: Order Service drops a new order into the queue ---

sqs_client = boto3.client('sqs', region_name='us-east-1')

ORDER_QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/order-processing-queue'

def place_order(order_id: str, customer_email: str, items: list) -> dict:
    """
    Sends a new order message to SQS.
    The fulfilment service will pick this up independently.
    """
    order_payload = {
        'order_id': order_id,
        'customer_email': customer_email,
        'items': items,
        'timestamp': time.time()
    }

    # MessageGroupId is only needed for FIFO queues.
    # For Standard queues, just send the body.
    response = sqs_client.send_message(
        QueueUrl=ORDER_QUEUE_URL,
        MessageBody=json.dumps(order_payload),
        # DelaySeconds postpones visibility to consumers — useful for
        # scheduling near-future jobs (max 900 seconds = 15 minutes)
        DelaySeconds=0
    )

    print(f"[PRODUCER] Message sent. SQS Message ID: {response['MessageId']}")
    return response


# --- CONSUMER: Fulfilment Service polls and processes orders ---

def process_pending_orders():
    """
    Polls the queue for up to 10 messages at a time.
    Uses long polling (WaitTimeSeconds=20) to reduce empty responses
    and lower your AWS bill — this is the single most impactful SQS setting.
    """
    print("[CONSUMER] Polling for messages...")

    response = sqs_client.receive_message(
        QueueUrl=ORDER_QUEUE_URL,
        MaxNumberOfMessages=10,   # Batch up to 10 messages per poll
        WaitTimeSeconds=20,       # Long polling — ALWAYS use this
        VisibilityTimeout=60      # Give our worker 60s to finish before retry
    )

    messages = response.get('Messages', [])

    if not messages:
        print("[CONSUMER] Queue is empty. Nothing to process.")
        return

    for message in messages:
        receipt_handle = message['ReceiptHandle']  # Needed to delete the message
        order = json.loads(message['Body'])

        print(f"[CONSUMER] Processing order {order['order_id']} "
              f"for {order['customer_email']}")

        # Simulate doing the actual work (charge card, update DB, etc.)
        fulfil_order(order)

        # CRITICAL: Delete the message after successful processing.
        # If you skip this, SQS re-delivers it after VisibilityTimeout expires.
        sqs_client.delete_message(
            QueueUrl=ORDER_QUEUE_URL,
            ReceiptHandle=receipt_handle
        )
        print(f"[CONSUMER] Order {order['order_id']} done. Message deleted.")


def fulfil_order(order: dict):
    """Placeholder for real business logic."""
    print(f"  -> Charging card and packing {len(order['items'])} item(s)...")
    time.sleep(0.1)  # Simulate processing time


# --- Run it ---
place_order(
    order_id='ORD-9981',
    customer_email='alex@example.com',
    items=['Laptop Stand', 'USB-C Hub']
)

process_pending_orders()
▶ Output
[PRODUCER] Message sent. SQS Message ID: 4e2b1a3f-7c9d-4f1e-b2a0-8d3e5f6c7b8a
[CONSUMER] Polling for messages...
[CONSUMER] Processing order ORD-9981 for alex@example.com
-> Charging card and packing 2 item(s)...
[CONSUMER] Order ORD-9981 done. Message deleted.
⚠️ Watch Out: Short Polling Will Drain Your Wallet
If you omit WaitTimeSeconds or set it to 0, SQS uses short polling. Your consumer hammers the API with empty responses and you pay for every single API call. Always set WaitTimeSeconds=20 (the maximum). It's a one-line change that cuts polling costs by up to 95% on quiet queues.

SNS — The Pub/Sub Megaphone That Notifies Everyone at Once

SNS (Simple Notification Service) works on the publish-subscribe model. You publish one message to a Topic, and SNS fans it out simultaneously to every subscriber — SQS queues, Lambda functions, HTTP endpoints, email addresses, mobile push notifications. The publisher has zero knowledge of who's listening. Adding a new subscriber doesn't touch the publisher at all.

This is the architectural superpower. Imagine your user-signup event needs to trigger a welcome email, a CRM record creation, a Slack notification to your growth team, and an analytics event. With SNS, your Auth service publishes one 'UserRegistered' message and walks away. Four independent services consume it in parallel.

Message filtering is what takes SNS from useful to essential. Instead of every subscriber receiving every message on a topic, you attach a filter policy to a subscription. Your EU payments service can subscribe to the 'transactions' topic but only receive messages where region=EU. This keeps each service focused on what it actually cares about.

SNS does not store messages. If a subscriber is down when the message arrives, that message is gone unless the subscriber is an SQS queue (which durably stores it). That's the most important SNS limitation to internalise — and it leads directly to the most powerful pattern: SNS + SQS fan-out.

sns_fanout_publisher.py · PYTHON
import boto3
import json

sns_client = boto3.client('sns', region_name='us-east-1')

USER_EVENTS_TOPIC_ARN = 'arn:aws:sns:us-east-1:123456789012:user-lifecycle-events'

def publish_user_registered_event(user_id: str, email: str, plan: str, region: str):
    """
    Publishes a single UserRegistered event to the SNS topic.
    SNS delivers this simultaneously to ALL subscribers:
      - SQS queue -> welcome-email-service
      - SQS queue -> crm-sync-service (filter: plan=pro)
      - Lambda   -> real-time-analytics
    The auth service doesn't know or care who those subscribers are.
    """
    event_payload = {
        'event_type': 'UserRegistered',
        'user_id': user_id,
        'email': email,
        'plan': plan,
        'region': region
    }

    response = sns_client.publish(
        TopicArn=USER_EVENTS_TOPIC_ARN,
        Message=json.dumps(event_payload),
        Subject='UserRegistered',  # Used by email subscribers as the email subject

        # MessageAttributes enable server-side filtering.
        # A subscription with filter {"plan": ["pro"]} only receives pro-plan signups.
        # This avoids every subscriber having to filter in their own code.
        MessageAttributes={
            'event_type': {
                'DataType': 'String',
                'StringValue': 'UserRegistered'
            },
            'plan': {
                'DataType': 'String',
                'StringValue': plan  # 'free' or 'pro'
            },
            'region': {
                'DataType': 'String',
                'StringValue': region  # 'EU', 'US', 'APAC'
            }
        }
    )

    print(f"[SNS] Event published. Message ID: {response['MessageId']}")
    print(f"[SNS] Delivered to all active subscribers in parallel.")
    return response


# --- Subscribing an SQS queue to this SNS topic (infra setup, done once) ---

def subscribe_sqs_queue_to_topic(
    topic_arn: str,
    queue_arn: str,
    filter_policy: dict = None
) -> str:
    """
    Wires an SQS queue as an SNS subscriber.
    Optionally attach a filter so this queue only gets relevant messages.
    """
    subscribe_kwargs = {
        'TopicArn': topic_arn,
        'Protocol': 'sqs',
        'Endpoint': queue_arn
    }

    if filter_policy:
        # FilterPolicy limits which messages reach this subscription.
        # The CRM service only wants to hear about 'pro' plan signups.
        subscribe_kwargs['Attributes'] = {
            'FilterPolicy': json.dumps(filter_policy)
        }

    response = sns_client.subscribe(**subscribe_kwargs)
    print(f"[SNS] Queue subscribed. Subscription ARN: {response['SubscriptionArn']}")
    return response['SubscriptionArn']


# --- Example: set up the CRM subscription with a filter ---
subscribe_sqs_queue_to_topic(
    topic_arn=USER_EVENTS_TOPIC_ARN,
    queue_arn='arn:aws:sqs:us-east-1:123456789012:crm-sync-queue',
    filter_policy={'plan': ['pro']}  # Only pro signups reach the CRM queue
)

# --- Publish a new user signup ---
publish_user_registered_event(
    user_id='usr-4471',
    email='jordan@example.com',
    plan='pro',
    region='EU'
)
▶ Output
[SNS] Queue subscribed. Subscription ARN: arn:aws:sns:us-east-1:123456789012:user-lifecycle-events:a1b2c3d4-e5f6-7890-abcd-ef1234567890
[SNS] Event published. Message ID: 7f3a2b1c-9e8d-4c7b-a6f5-2e1d0c9b8a7f
[SNS] Delivered to all active subscribers in parallel.
⚠️ Pro Tip: Use Message Attributes for Filtering, Not the Message Body
SNS filter policies only match on MessageAttributes, not on the JSON inside the Message body. A common mistake is embedding filter criteria inside the payload and wondering why all subscribers still get everything. Put your routing metadata in MessageAttributes and keep your payload clean.

Dead-Letter Queues and the SNS+SQS Fan-Out Pattern — Production Essentials

Two patterns separate a toy cloud setup from a production-grade one: dead-letter queues (DLQs) and the SNS+SQS fan-out architecture. You need to understand both.

A DLQ is just another SQS queue. You configure it on your main queue and set maxReceiveCount — say, 3. If a message fails processing 3 times, SQS automatically moves it to the DLQ instead of retrying forever or silently dropping it. Your team gets alerted, investigates the poisoned message, and the rest of your queue keeps flowing normally. Without a DLQ, one bad message can block your queue or create an infinite retry storm.

The fan-out pattern solves SNS's biggest weakness — no durability. The rule is: never subscribe a Lambda directly to SNS in production if message loss is unacceptable. Instead, subscribe an SQS queue to the SNS topic. The queue durably catches every message. Your Lambda then polls the queue. You get SNS's broadcasting power AND SQS's durability and retry logic together. This is the architectural backbone of most event-driven AWS systems.

The code example below wires both patterns together with IaC-style Boto3 calls, showing exactly how a DLQ connects to a main queue.

dlq_and_fanout_setup.py · PYTHON
import boto3
import json

sqs_client = boto3.client('sqs', region_name='us-east-1')
sns_client = boto3.client('sns', region_name='us-east-1')


def create_queue_with_dlq(main_queue_name: str, dlq_name: str) -> tuple[str, str]:
    """
    Creates a main SQS queue and wires up a Dead-Letter Queue.
    If a message fails maxReceiveCount times, it moves to the DLQ
    instead of looping forever or vanishing silently.
    """

    # Step 1: Create the DLQ first — it's just a regular SQS queue
    dlq_response = sqs_client.create_queue(
        QueueName=dlq_name,
        Attributes={
            'MessageRetentionPeriod': '1209600'  # Keep failed messages for 14 days
        }
    )
    dlq_url = dlq_response['QueueUrl']
    print(f"[SETUP] DLQ created: {dlq_url}")

    # Step 2: Get the DLQ's ARN — needed to reference it as a redrive target
    dlq_attributes = sqs_client.get_queue_attributes(
        QueueUrl=dlq_url,
        AttributeNames=['QueueArn']
    )
    dlq_arn = dlq_attributes['Attributes']['QueueArn']

    # Step 3: Create the main queue with a RedrivePolicy pointing at the DLQ
    # maxReceiveCount=3 means: after 3 failed delivery attempts, move to DLQ
    redrive_policy = {
        'deadLetterTargetArn': dlq_arn,
        'maxReceiveCount': '3'
    }

    main_queue_response = sqs_client.create_queue(
        QueueName=main_queue_name,
        Attributes={
            'VisibilityTimeout': '60',
            'RedrivePolicy': json.dumps(redrive_policy)
        }
    )
    main_queue_url = main_queue_response['QueueUrl']
    print(f"[SETUP] Main queue created with DLQ attached: {main_queue_url}")

    return main_queue_url, dlq_url


def wire_sns_to_sqs_fanout(
    topic_arn: str,
    queue_url: str,
    queue_arn: str
):
    """
    Subscribes an SQS queue to an SNS topic — the fan-out pattern.
    Also grants SNS permission to write to the SQS queue.
    Forgetting the SQS access policy is the #1 cause of 'messages not arriving' bugs.
    """

    # Step 1: Grant SNS permission to send messages to this SQS queue
    # Without this policy, SNS silently drops messages — no error thrown
    sqs_policy = {
        'Version': '2012-10-17',
        'Statement': [{
            'Sid': 'AllowSNSToSendToSQS',
            'Effect': 'Allow',
            'Principal': {'Service': 'sns.amazonaws.com'},
            'Action': 'sqs:SendMessage',
            'Resource': queue_arn,
            'Condition': {
                'ArnEquals': {'aws:SourceArn': topic_arn}
            }
        }]
    }

    sqs_client.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={'Policy': json.dumps(sqs_policy)}
    )
    print(f"[SETUP] SQS access policy updated to allow SNS.")

    # Step 2: Subscribe the queue to the topic
    # RawMessageDelivery=true means the SQS message body IS your JSON payload
    # Without it, the body is wrapped in an SNS envelope — harder to parse
    subscription = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol='sqs',
        Endpoint=queue_arn,
        Attributes={
            'RawMessageDelivery': 'true'  # Skip the SNS wrapper — always set this
        }
    )

    print(f"[SETUP] Fan-out wired. Subscription ARN: {subscription['SubscriptionArn']}")
    return subscription['SubscriptionArn']


# --- Run the setup ---
main_url, dlq_url = create_queue_with_dlq(
    main_queue_name='order-fulfilment-queue',
    dlq_name='order-fulfilment-dlq'
)

wire_sns_to_sqs_fanout(
    topic_arn='arn:aws:sns:us-east-1:123456789012:order-events',
    queue_url=main_url,
    queue_arn='arn:aws:sqs:us-east-1:123456789012:order-fulfilment-queue'
)

print("\n[READY] Production fan-out pattern active:")
print("  SNS Topic -> SQS Queue (durable) -> Lambda/Worker")
print("  Failed messages -> DLQ after 3 attempts")
▶ Output
[SETUP] DLQ created: https://sqs.us-east-1.amazonaws.com/123456789012/order-fulfilment-dlq
[SETUP] Main queue created with DLQ attached: https://sqs.us-east-1.amazonaws.com/123456789012/order-fulfilment-queue
[SETUP] SQS access policy updated to allow SNS.
[SETUP] Fan-out wired. Subscription ARN: arn:aws:sns:us-east-1:123456789012:order-events:b9c8d7e6-f5a4-3b2c-1d0e-9f8a7b6c5d4e

[READY] Production fan-out pattern active:
SNS Topic -> SQS Queue (durable) -> Lambda/Worker
Failed messages -> DLQ after 3 attempts
🔥 Interview Gold: Why SNS+SQS Instead of SNS Directly to Lambda?
Lambda can subscribe to SNS directly, but if Lambda throttles (hits its concurrency limit), SNS retries the invocation only a limited number of times before dropping the message for good. With an SQS queue in between, messages wait safely in the queue until Lambda capacity frees up. The queue absorbs the spike. This is the canonical answer to 'how do you handle Lambda throttling in an event-driven system?'
| Feature / Aspect | SQS (Queue) | SNS (Topic) |
| --- | --- | --- |
| Delivery model | Point-to-point (one consumer per message) | Pub/sub (all subscribers receive every message) |
| Message durability | Stored up to 14 days | Not stored — fire and forget |
| Delivery guarantee | At-least-once (Standard) or exactly-once (FIFO) | At-least-once to active subscribers only |
| Consumer count | One consumer processes each message | Unlimited simultaneous subscribers |
| Ordering | Best-effort (Standard) or strict (FIFO) | No ordering guarantee |
| Throughput | Unlimited (Standard) / 3,000 msg/s with batching (FIFO) | 300 publishes/s (default, extendable) |
| Retry on failure | Automatic via visibility timeout + DLQ | Per-subscription retry policy (extensive for SQS/Lambda, limited for HTTP); subscription DLQs supported |
| Filtering | Consumer-side (your code decides) | Server-side filter policies on subscriptions |
| Primary use case | Task queues, job workers, rate limiting | Event broadcasting, fan-out, notifications |
| Combined with | SNS for fan-out input; Lambda for processing | SQS for durability; Lambda/HTTP for real-time |

🎯 Key Takeaways

  • SQS is a durable queue — one message, one consumer, processed at the consumer's pace. Use it for task workers, job queues, and anywhere you need reliable exactly-once or at-least-once processing.
  • SNS is a pub/sub broadcaster — one message, every subscriber, delivered in parallel. Use it to decouple event producers from multiple downstream consumers without the producer knowing who they are.
  • The SNS+SQS fan-out pattern is the production standard: SNS broadcasts, SQS catches and durably stores for each subscriber group. This gives you both broadcasting power and message durability — neither service alone gives you both.
  • Dead-letter queues are non-negotiable in production. Set maxReceiveCount=3 on every processing queue and alert on DLQ depth. A message in your DLQ is a bug waiting to be investigated, not a message to silently discard.

⚠ Common Mistakes to Avoid

  • Mistake 1: Forgetting to delete the SQS message after processing — Symptom: every message is processed multiple times after the VisibilityTimeout expires, causing duplicate orders, double-charges, or duplicate emails — Fix: always call sqs.delete_message() with the ReceiptHandle immediately after your business logic succeeds. Wrap your processing and deletion in a try/except so a crash mid-way lets the message reappear naturally for retry.
  • Mistake 2: Missing the SQS access policy when subscribing to SNS — Symptom: messages published to SNS never arrive in the SQS queue, no error is thrown, CloudWatch shows SNS delivery failures — Fix: attach an IAM resource policy to the SQS queue explicitly allowing sns.amazonaws.com to call sqs:SendMessage with a Condition matching the specific SNS topic ARN. Terraform and CDK do this automatically but Boto3 does not.
  • Mistake 3: Using short polling on SQS consumers in a Lambda or EC2 worker — Symptom: thousands of empty API calls per hour, higher AWS bill than expected, CPU spin-looping with nothing to do — Fix: always set WaitTimeSeconds=20 in every receive_message call. Long polling holds the connection open for up to 20 seconds waiting for messages, dramatically reducing both API call volume and cost on low-traffic queues.

Interview Questions on This Topic

  • Q: What is the difference between SQS and SNS, and when would you use both together in the same architecture?
  • Q: A Lambda function is subscribed directly to an SNS topic and you're seeing message loss during traffic spikes. What's causing it and how do you fix it?
  • Q: Your SQS consumer is processing the same message multiple times even though your code is correct. Walk me through all the possible reasons and how you'd diagnose each one.

Frequently Asked Questions

What is the difference between SQS and SNS in AWS?

SQS is a message queue where each message is consumed by exactly one worker — it's designed for task processing and decoupling producer and consumer workloads. SNS is a pub/sub service where one published message is delivered simultaneously to all subscribers. Use SQS when you need one thing to process each job; use SNS when you need many things to react to the same event.

Can SQS and SNS be used together?

Yes, and this is the most common production pattern. You subscribe one or more SQS queues to an SNS topic. SNS fans the message out to all queues simultaneously, and each queue durably holds its copy for its own set of consumers. This combines SNS's broadcasting with SQS's durability and retry logic — something neither service provides alone.

What happens to an SNS message if the subscriber is down?

If the subscriber is an HTTP endpoint or Lambda, SNS retries a limited number of times then drops the message permanently. If the subscriber is an SQS queue, the message is safely stored in the queue until a consumer comes back online. This is why subscribing SQS queues (rather than Lambda or HTTP endpoints directly) to SNS is the recommended pattern for any message you can't afford to lose.

TheCodeForge Editorial Team Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.
