Instagram serves 2B+ users uploading 100M photos/videos daily, delivering feeds in under 300ms
Core components: Load balancer, API gateway, Object Store (S3), SQL/NoSQL databases, CDN, Feed Service
Feed generation uses a hybrid push-pull model: push for normal users (fan-out on write), pull for celebrities (fan-out on read)
Caching layers: Redis for precomputed feeds, LRU for profiles/metadata, CDN for static media – reduces DB reads by 80%
Storage: 200TB new data daily; hot in S3 for 30 days, then cold archive; CDN edge caches popular media globally
Scaling: shard by user_id via consistent hashing, use message queues (Kafka) for async fan-out, watch for celebrity hot spots
Plain-English First
Imagine a giant post office where a billion people each send photo postcards every day. The post office has to instantly sort every postcard, deliver it only to the people who care about that sender, store the original photo safely forever, and let anyone pull up an old postcard in under a second — even at 3 AM during a major event. Instagram is exactly that post office, and designing it means figuring out every room, shelf, conveyor belt, and delivery truck needed to make it all work without ever losing a single photo.
Instagram serves over two billion monthly active users, processes roughly 100 million photo and video uploads every day, and is expected to return a personalised feed in under 300 milliseconds. When an interviewer asks you to design it, they're not looking for a diagram of boxes connected by arrows — they're watching whether you can reason about trade-offs at scale, make deliberate architectural decisions, and defend them under pressure. This is one of the most common system design questions in FAANG-level loops, and candidates who haven't internalised the nuances consistently get stuck on the feed generation problem or grossly underestimate storage requirements.
The core challenge Instagram solves is deceptively simple on the surface: store media, show it to followers, let people discover new content. Underneath, it's a collision of three genuinely hard distributed-systems problems — write-heavy media ingestion, read-heavy personalised feed delivery, and near-real-time social graph queries — all happening concurrently at planetary scale. Each of those problems demands different storage engines, caching strategies, and consistency models, and they have to coexist inside one coherent product.
By the end of this article you'll have a complete, defensible Instagram design you can present in a 45-minute interview. You'll know the exact numbers to anchor your estimates, the right database choices for each data type, how to generate feeds without melting your servers, where CDNs fit, how to shard your data, and — critically — which trade-offs to call out proactively so the interviewer knows you're thinking like an engineer who has shipped things to production, not just read about it.
Core Component Architecture: Handling 100M Uploads Daily
Designing Instagram isn't just about 'uploading a file.' It's about a decoupled architecture where the Write Path (Uploading) and the Read Path (Feed Generation) are optimized independently. On the write side, we use a Load Balancer to distribute traffic to an API Gateway, which handles authentication and rate limiting. The media itself (Photos/Videos) never touches our relational database; instead, it's streamed to an Object Store like AWS S3 or Google Cloud Storage.
We store the metadata (User ID, Photo URL, Timestamp, Location) in a distributed NoSQL database like Cassandra or a sharded PostgreSQL cluster. This separation ensures that even if our metadata database is busy, our media storage remains performant and durable. To make the images load instantly worldwide, we push them to Edge Locations via a Content Delivery Network (CDN).
io.thecodeforge.instagram.PhotoMetadata.java (Java)
package io.thecodeforge.instagram;

import java.util.UUID;
import java.time.Instant;

/**
 * Represents the Metadata record for a high-scale media upload.
 * In production, this would be persisted to a sharded DB cluster.
 */
public class PhotoMetadata {
    private final String photoId;
    private final String userId;
    private final String s3Url;
    private final long timestamp;

    public PhotoMetadata(String userId, String s3Url) {
        this.photoId = UUID.randomUUID().toString();
        this.userId = userId;
        this.s3Url = s3Url;
        this.timestamp = Instant.now().getEpochSecond();
    }

    public void saveToDatabase() {
        // High-level logic for sharded database insertion
        System.out.println("Persisting metadata to shard based on userId: " + userId);
        System.out.println("Photo ID: " + photoId + " | Storage Path: " + s3Url);
    }

    public static void main(String[] args) {
        PhotoMetadata upload = new PhotoMetadata("user_8821", "https://s3.thecodeforge.io/bucket/img_99.jpg");
        upload.saveToDatabase();
    }
}
Output
Persisting metadata to shard based on userId: user_8821
Photo ID: <random UUID> | Storage Path: https://s3.thecodeforge.io/bucket/img_99.jpg
For regular users, 'Push' their posts to followers' pre-computed feeds (Fan-out). For celebrities with millions of followers, 'Pull' their content only when a follower refreshes their feed. This hybrid approach prevents 'Celebrity Fan-out' from crashing your message queues.
Production Insight
Straight push to all followers works only when your fan-out ratio is low.
At 100M followers per celebrity, a single write triggers 100M queue entries – that's enough to OOM any worker pool.
Rule: always add a dynamic threshold that switches to pull for users above a certain follower count.
Debug it: monitor Kafka partition lag per user – if one user's partition lag grows while others are idle, you've hit the celebrity edge case.
Key Takeaway
Separate write and read paths are non-negotiable.
Push vs pull: efficiency comes from knowing where your curve breaks.
If you don't set a celebrity threshold, a single viral post will take down your feed.
Feed Generation Strategy Decision
If follower count < 10,000 → use push: write the post to each follower's feed cache immediately. Low write cost.
If follower count is 10,000 – 1,000,000 → use push with batch fan-out: enqueue fan-out tasks in Kafka and drain them with a worker pool (e.g., 50 worker threads behind each partition's consumer, since a Kafka partition is read by only one consumer per group).
If follower count > 1,000,000 → use pull: store the post in a hot table; on feed request, merge recent posts from the followed high-follower accounts via a read-path query.
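The three tiers above can be sketched as a small selector. This is a minimal illustration, not the production logic: the class name `FanoutStrategySelector` and its enum are hypothetical, and the two thresholds are simply the numbers from the decision table, passed in so they stay configurable.

```java
import java.util.List;

/** Picks a fan-out strategy from a follower count, mirroring the three tiers in the decision table. */
final class FanoutStrategySelector {

    enum Strategy { PUSH, PUSH_BATCHED, PULL }

    // Hypothetical parameters: the table uses 10,000 and 1,000,000.
    private final long batchThreshold;
    private final long pullThreshold;

    FanoutStrategySelector(long batchThreshold, long pullThreshold) {
        if (batchThreshold <= 0 || pullThreshold <= batchThreshold) {
            throw new IllegalArgumentException("expected 0 < batchThreshold < pullThreshold");
        }
        this.batchThreshold = batchThreshold;
        this.pullThreshold = pullThreshold;
    }

    Strategy select(long followerCount) {
        if (followerCount < batchThreshold) {
            return Strategy.PUSH;          // small audience: write straight to feed caches
        }
        if (followerCount <= pullThreshold) {
            return Strategy.PUSH_BATCHED;  // medium audience: queue batched fan-out tasks
        }
        return Strategy.PULL;              // celebrity: merge at read time
    }

    public static void main(String[] args) {
        FanoutStrategySelector selector = new FanoutStrategySelector(10_000, 1_000_000);
        for (long count : List.of(500L, 250_000L, 40_000_000L)) {
            System.out.println(count + " followers -> " + selector.select(count));
        }
    }
}
```

Keeping the thresholds as constructor arguments rather than constants makes it straightforward to drive them from a feature flag later.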
Database Sharding & Scalability Strategies
A single database instance will fail at Instagram's scale. We must shard our data. The best strategy is to shard by User_ID. This ensures that all photos from a single user live on the same shard, making the 'View Profile' query extremely fast. However, for the 'Global Feed', we might need a secondary index or a specialized search service like Elasticsearch.
We also implement a multi-layered caching strategy: Redis for the 'Latest Feed' (Pre-computed), and an LRU cache at the application level for frequently accessed user profiles. This reduces the DB read pressure by over 80%.
SchemaDesign.sql (SQL)
-- io.thecodeforge.instagram - Database Schema Concepts
-- Sharding Key: user_id
CREATE TABLE io_thecodeforge.users (
    user_id BIGINT PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE io_thecodeforge.photos (
    photo_id BIGINT PRIMARY KEY,
    user_id BIGINT REFERENCES io_thecodeforge.users(user_id),
    image_path VARCHAR(255) NOT NULL,
    caption TEXT,
    latitude DECIMAL(9,6),
    longitude DECIMAL(9,6),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Index for fast Feed Generation (Sorted by Time)
CREATE INDEX idx_user_photos_time ON io_thecodeforge.photos(user_id, created_at DESC);
Output
Schema created. Sharding logic should be handled by the application layer or middleware like Vitess.
Production Reality:
Don't use auto-incrementing IDs in a distributed system. Use a 'Snowflake' ID generator (like Twitter's) to create unique, time-sortable 64-bit IDs across multiple shards.
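A minimal sketch of a Snowflake-style generator follows. It assumes the commonly cited layout of 41 timestamp bits, 10 worker bits, and 12 sequence bits; the custom epoch (2020-01-01 UTC) and the class name `SnowflakeIdGenerator` are arbitrary choices for illustration, and how worker IDs get assigned to machines is left out entirely.

```java
/** Snowflake-style IDs: 41 bits of milliseconds, 10 bits of worker ID, 12 bits of sequence. */
final class SnowflakeIdGenerator {
    // Assumed custom epoch: 2020-01-01T00:00:00Z. Any fixed past instant works.
    private static final long CUSTOM_EPOCH_MS = 1_577_836_800_000L;
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER_ID + "]");
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Refuse to issue IDs if the clock went backwards, rather than risk duplicates.
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // 4096 IDs already issued in this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - CUSTOM_EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        SnowflakeIdGenerator generator = new SnowflakeIdGenerator(7);
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextId());
        }
    }
}
```

Because the timestamp occupies the high bits, IDs from one worker sort by creation time, which is what lets a `photo_id` double as a rough ordering key.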
Production Insight
Sharding by user_id makes profile queries fast but creates hot spots for celebrities.
One celebrity shard gets 100x the writes of others, causing latency tail to spike.
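Rule: use consistent hashing with virtual nodes to even out load across shards. A single celebrity's user_id still maps to one shard, so pair this with caching or a dedicated shard for the hottest accounts.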
Debug it: monitor per-shard write latency; if one shard is hotter, reassign virtual nodes dynamically.
Key trade-off: you give up easy range queries across users – secondary indexes are needed for global feed or search.
Key Takeaway
Shard by user_id for profile locality.
Consistent hashing handles elastic scaling without downtime.
Always plan for hot keys – they're not a rare edge case, they're guaranteed as you grow.
Shard Key Selection
If you want an equal distribution of users across shards → use user_id modulo N: simple, but changing N remaps most keys, so adding shards forces a large rebalance.
If you need to handle hot spots without manual rebalancing → use consistent hashing with 1000 virtual nodes per physical shard: virtual nodes spread key ranges evenly and only a fraction of keys move when a shard is added, though a single celebrity's user_id still lands on one shard.
If you need cross-shard queries (e.g., find photos near a location) → add a secondary index table sharded differently (e.g., by geo hash) or use Elasticsearch.
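To make the consistent-hashing option concrete, here is a small ring built on a `TreeMap`. It is a sketch under assumptions: the class name `ConsistentHashRing`, the `shard-N` names, and the choice of MD5 as the hash are all illustrative, and a real deployment would more likely rely on middleware than on hand-rolled routing.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Consistent hash ring with virtual nodes; maps a user_id to a shard name. */
final class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    void addShard(String shard) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(shard + "#" + i), shard);
        }
    }

    void removeShard(String shard) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(shard + "#" + i));
        }
    }

    /** Walks clockwise to the first virtual node at or after the key's hash, wrapping around. */
    String shardFor(long userId) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no shards registered");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(Long.toString(userId)));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // First 8 bytes of an MD5 digest as a long; MD5 is used for spread, not security.
    private static long hash(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing(100);
        for (int s = 1; s <= 4; s++) {
            ring.addShard("shard-" + s);
        }
        String[] before = new String[10_000];
        for (int u = 0; u < before.length; u++) {
            before[u] = ring.shardFor(u);
        }
        ring.addShard("shard-5");
        int moved = 0;
        for (int u = 0; u < before.length; u++) {
            if (!ring.shardFor(u).equals(before[u])) {
                moved++;
            }
        }
        System.out.println("Keys moved after adding a fifth shard: " + moved + " of " + before.length);
    }
}
```

Adding a shard only steals keys for the new shard; nothing is shuffled between the existing ones, which is the property that modulo-N routing lacks.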
Feed Generation: The Push-Pull Hybrid Model
The feed is the heart of Instagram. It must show recent posts from followed users in reverse chronological order (or ranked by engagement). Two classic approaches exist: pull (fan-out on read) and push (fan-out on write). Pull means when a user opens the app, we query all followed users' recent posts and merge. Push means when a user posts, we insert that post into every follower's precomputed feed list.
Pull is efficient for celebrities because you don't push to millions; but it's slow for users following hundreds of accounts because you need many queries. Push is fast for reading but costly for writes, especially for popular users. The solution: use push for regular users (fan-out on write) and pull for users with >1M followers (fan-out on read). This hybrid model balances the trade-offs.
package io.thecodeforge.instagram;

import java.util.*;
import java.util.concurrent.*;

// FollowerService, FeedCache, and Post are collaborator types defined elsewhere in the service.
public class FeedGenerationService {
    private static final long CELEBRITY_THRESHOLD = 1_000_000;
    private final FollowerService followerService;
    private final FeedCache feedCache;
    private final ExecutorService fanoutExecutor = Executors.newFixedThreadPool(20);

    public FeedGenerationService(FollowerService followerService, FeedCache feedCache) {
        this.followerService = followerService;
        this.feedCache = feedCache;
    }

    public void onNewPost(Post post, String userId) {
        long followerCount = followerService.getFollowerCount(userId);
        if (followerCount < CELEBRITY_THRESHOLD) {
            fanoutToAllFollowers(post, userId);
        } else {
            // celebrity: store post for pull-based retrieval
            feedCache.storeCelebrityPost(userId, post);
        }
    }

    private void fanoutToAllFollowers(Post post, String userId) {
        List<String> followers = followerService.getFollowers(userId);
        for (String followerId : followers) {
            fanoutExecutor.submit(() -> feedCache.addToFeed(followerId, post));
        }
    }

    public List<Post> getFeed(String userId, int limit) {
        // merge cached feed with recent posts from followed celebrities
        List<Post> cachedPosts = feedCache.getFeed(userId);
        List<String> followedCelebrities = followerService.getFollowedCelebrities(userId);
        List<Post> celebrityPosts = feedCache.getRecentCelebrityPosts(followedCelebrities);
        List<Post> merged = mergeAndSort(cachedPosts, celebrityPosts);
        return merged.subList(0, Math.min(limit, merged.size()));
    }

    private List<Post> mergeAndSort(List<Post> a, List<Post> b) {
        // merge two lists that are each already sorted by timestamp descending
        List<Post> result = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            result.add(a.get(i).getTimestamp() >= b.get(j).getTimestamp() ? a.get(i++) : b.get(j++));
        }
        while (i < a.size()) result.add(a.get(i++));
        while (j < b.size()) result.add(b.get(j++));
        return result;
    }
}
Output
Feed generation service uses hybrid push/pull based on follower count.
The Pub/Sub Hybrid Analogy
Push: the writer's post is copied into every recipient's feed at write time, so each read is a single cheap lookup. Good for small fan-outs.
Pull: reader reads from many sources; good for large fan-outs because writer isn't burdened.
Hybrid: set a threshold where the cost of push exceeds the benefit of instant delivery.
Threshold should be configurable and adjusted based on system load – you can even make it dynamic.
Production Insight
Push fan-out to 100M followers turns a single post into 100M feed writes, all arriving as one burst.
If your message queue isn't partitioned properly, a single partition gets backed up and all feeds slow down.
Rule: use multiple Kafka partitions and route fan-out tasks by hash of follower ID, not user ID.
Debug it: if one partition lag is high, check if that user's fan-out tasks are all in one partition due to bad key selection.
Performance impact: hybrid reduces write amplification by 99% for celebrities while keeping feed generation under 50ms for normal users.
Key Takeaway
Feed generation is the main bottleneck at scale.
Push works for small groups, pull works for massive groups – hybrid is the engineering answer.
Make the threshold configurable: you'll tune it based on observed consumer throughput and latency SLAs.
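The partition-routing rule from the Production Insight (key fan-out tasks by follower ID, not by the posting user's ID) can be illustrated without a Kafka client. The sketch below is an assumption-laden stand-in: `FanoutPartitioner` is a hypothetical helper, and `String.hashCode()` substitutes for the hash a real producer would apply to the message key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Groups fan-out tasks by a partition derived from the follower ID, not the author ID. */
final class FanoutPartitioner {

    /** Maps a follower ID to a partition index in [0, numPartitions). */
    static int partitionFor(String followerId, int numPartitions) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(followerId.hashCode(), numPartitions);
    }

    /** Splits one author's follower list into per-partition batches. */
    static Map<Integer, List<String>> groupByPartition(List<String> followerIds, int numPartitions) {
        Map<Integer, List<String>> byPartition = new HashMap<>();
        for (String followerId : followerIds) {
            byPartition
                    .computeIfAbsent(partitionFor(followerId, numPartitions), p -> new ArrayList<>())
                    .add(followerId);
        }
        return byPartition;
    }

    public static void main(String[] args) {
        List<String> followers = List.of("user_1", "user_2", "user_3", "user_4", "user_5", "user_6");
        groupByPartition(followers, 4).forEach((partition, ids) ->
                System.out.println("partition " + partition + " <- " + ids));
    }
}
```

Keying by the author would send every task for one post to the same partition; keying by follower spreads a single post's fan-out across all of them.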
Storage Strategy: Object Store, CDN, and Cold Archival
Media storage at Instagram's scale requires a tiered approach. The primary storage is an object store (AWS S3, Google Cloud Storage) because they offer near-infinite capacity, strong durability (99.999999999%), and pay-per-GB pricing. However, serving every image directly from S3 would be too slow for users far from the data center and expensive in egress costs. That's where CDNs come in: we push popular media to edge servers worldwide so users download from a nearby node.
For cold data – photos older than 30 days with zero views – we move them to a cheaper archival tier (Amazon S3 Glacier, Google Cloud Archive) and serve a placeholder if accessed. The CDN cache also holds a copy for a shorter TTL (e.g., 7 days for popular content, 1 day for normal). Videos are stored as HLS segments for adaptive bitrate streaming.
Media URL: https://s3.us-west-2.amazonaws.com/instagram-cold/photo_abc123
Data Lifecycle Management
Set S3 lifecycle rules to transition objects: after 30 days to Infrequent Access (reduced cost), after 365 days to Glacier (archive). CDN TTLs should be shorter (1-7 days) for freshness; use invalidation requests for immediate removal.
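The age-based transitions can be expressed as a tiny policy function, which is handy for unit-testing the thresholds before encoding them in a bucket's lifecycle configuration. This is only a sketch: `StorageTierPolicy` and its `Tier` enum are hypothetical names, and the 30-day and 365-day cut-offs are the ones stated above.

```java
import java.time.Duration;
import java.time.Instant;

/** Decides which storage tier an object belongs in, based only on its age. */
final class StorageTierPolicy {

    enum Tier { HOT, INFREQUENT_ACCESS, ARCHIVE }

    static Tier tierFor(Instant uploadedAt, Instant now) {
        long ageDays = Duration.between(uploadedAt, now).toDays();
        if (ageDays >= 365) {
            return Tier.ARCHIVE;            // e.g. Glacier-class storage
        }
        if (ageDays >= 30) {
            return Tier.INFREQUENT_ACCESS;  // cheaper storage, slower or costlier reads
        }
        return Tier.HOT;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-06-01T00:00:00Z");
        for (int days : new int[] {5, 45, 400}) {
            Instant uploadedAt = now.minus(Duration.ofDays(days));
            System.out.println(days + " days old -> " + tierFor(uploadedAt, now));
        }
    }
}
```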
Production Insight
Storing all media in one S3 bucket makes it easy to manage but creates a permission nightmare.
Use separate buckets per storage tier (hot, cold, archive) with IAM roles restricting write access to the upload service only.
Rule: never give public read access to S3 – always use CDN signed URLs with expiry.
Performance impact: CDN adds 10-20ms latency but reduces origin load by 95% and saves 80% on egress costs.
Key trade-off: faster global delivery costs more for cache invalidation – you can't immediately purge all edges; use versioned URLs (e.g., include upload timestamp) to avoid cache staleness.
Key Takeaway
S3 for durability, CDN for speed, Glacier for cost.
Never expose S3 direct URLs – always use CDN signed URLs.
Lifecycle policies are cheap automation – set them on day 1, or pay the cost later.
Caching Strategy: Multi-Level Cache for Sub-300ms Feeds
To achieve sub-300ms feed loads, we need a multi-layer cache. The first layer is a CDN for static media (images, video thumbs). The second layer is an in-memory cache (Redis) for precomputed feeds of active users. The third layer is an application-level LRU cache for frequently accessed metadata (user profiles, popular photos).
For feed data: we store the top N recent posts for each user in Redis (capped at 1000 per user). When a user posts, we push to followers' feed caches (for non-celebrities) or store in a 'celebrity post list' in Redis. For membership services (follower counts, liked), we use a separate Redis cluster with eventual consistency. Cache invalidation is handled via version numbers: each user has a feed version; when a new version is available (due to new post), the client refetches.
Layer 1 (CDN): Static media – miss penalty = 50-100ms (fetch from origin). Hit ratio > 95%.
Layer 2 (Redis): Feed data, popularity scores – miss penalty = 5-10ms to get from DB. Hit ratio > 80%.
Layer 3 (LRU): User profiles, photo metadata – miss penalty = 1-5ms (local). Hit ratio > 90%.
If you have to go to DB, your response time jumps from microseconds to milliseconds – that's where your SLAs break.
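For the application-level LRU layer, one compact option in Java is a `LinkedHashMap` in access order. The sketch below assumes a single-threaded caller; the class name `LruCache` is illustrative, and a production cache would need thread safety (for example by wrapping it with `Collections.synchronizedMap`) and probably TTLs as well.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Fixed-capacity LRU cache: the least recently accessed entry is evicted first. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true makes get() move an entry to the most-recent end.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true drops the eldest entry.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> profiles = new LruCache<>(2);
        profiles.put("a", "profile-a");
        profiles.put("b", "profile-b");
        profiles.get("a");               // touch "a" so "b" becomes the eldest
        profiles.put("c", "profile-c");  // evicts "b"
        System.out.println(profiles.keySet()); // prints [a, c]
    }
}
```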
Production Insight
Putting all feed data in Redis sounds great until you estimate the memory cost.
500M active users × 1KB per feed entry × 1000 entries = 500TB of RAM. That's $15M/month just for Redis.
Rule: only cache feeds for active users (logged in last 24h). For inactive, regenerate on login.
Memory optimisation: use Redis with compressed data (snappy) and cap feed entries to 200 per user.
Debug it: monitor Redis INFO stats and compute the hit ratio as keyspace_hits / (keyspace_hits + keyspace_misses); if it drops below 90%, you're caching too little or TTLs are too short.
Performance impact: 99th percentile latency drops from 500ms (no cache) to 180ms (multi-level cache).
Key Takeaway
Cache the feed, not the whole world.
Active-only caching reduces memory by 70% without losing performance.
Always have a cache miss plan: if Redis goes down, your feeds should still work (just slower) by falling back to DB reads.
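The memory arithmetic behind the Production Insight is easy to re-run with different caps. The helper below is a back-of-the-envelope sketch: `FeedCacheSizing` is a hypothetical name, 1KB is treated as 1,000 bytes, and replication and Redis per-key overhead are ignored.

```java
/** Rough RAM estimate for a per-user feed cache, in decimal terabytes. */
final class FeedCacheSizing {

    static double terabytes(long users, int entriesPerUser, int bytesPerEntry) {
        // Promote to double first so the product cannot overflow.
        return (double) users * entriesPerUser * bytesPerEntry / 1e12;
    }

    public static void main(String[] args) {
        long activeUsers = 500_000_000L;
        int bytesPerEntry = 1_000; // 1KB per feed entry, as in the estimate above

        System.out.println("1000 entries/user: " + terabytes(activeUsers, 1000, bytesPerEntry) + " TB");
        System.out.println(" 200 entries/user: " + terabytes(activeUsers, 200, bytesPerEntry) + " TB");
    }
}
```

Capping feeds at 200 entries cuts the figure by a factor of five before any compression or active-only filtering is applied.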
Production incident · Post-mortem · Severity: high
Celebrity Post Causes Global Feed Delay
Symptom
Feed updates stalled globally for 2 hours. API latency spiked from 200ms to 20s. Message queues (Kafka) backed up with millions of unprocessed fan-out tasks. OOM errors on feed workers.
Assumption
The push-based fan-out model could handle any user because the system was horizontally scalable. The threshold for 'celebrity' handling was not defined.
Root cause
The system applied the same fan-out behaviour to all users. A celebrity with 100M followers triggered 100M entries in a single Kafka partition, overwhelming the consumer group's ability to process before new messages arrived.
Fix
Introduced a hybrid feed generation model: users with follower count > 10M are treated as 'celebrities'. Their posts are stored in a hot table and pulled into follower feeds only when that follower requests their feed. The threshold is configurable via a dynamic feature flag.
Key lesson
Profile user follower distributions regularly – the tail (celebrities) is where your system breaks.
Always have a safety threshold for push vs pull – don't treat all writes equally.
Monitor per-partition lag in Kafka (or equivalent) – it's the first signal that you're overwhelming a single consumer.
Production debug guide: Symptom → Immediate Action → Root Cause Analysis (4 entries)
Symptom 01: Feed loads slowly (>500ms) or shows stale content
Fix: Check Redis cache hit ratio via redis-cli info stats. If hits < 85%, check feed precomputation worker health. Examine Kafka consumer lag for the feed partition.
Symptom 02: Photo loading takes >2 seconds or shows broken images
Fix: Verify CDN cache status via curl -I https://cdn.instagram.com/p/.... If X-Cache: MISS, check origin S3 bucket for file existence. Ensure a CDN purge didn't wipe popular content.
Symptom 03: Upload fails with 503 or timeout
Fix: Check API gateway rate limiting logs. If the rate limit was exceeded, scale the gateway or adjust per-user limits. Also verify auth token expiration and S3 bucket permissions.
Symptom 04: User sees incorrect follower count or feed order
Fix: Check database replication lag. If SQL/NoSQL replicas are behind, read the affected data at a stronger consistency level (e.g., LOCAL_QUORUM in Cassandra). Examine the async job queue for pending follower count updates.
Cheat sheet: Instant Diagnosis for Instagram-Scale Issues
When the system goes wrong, these commands get you to the root cause in under 2 minutes.
High API latency
Immediate action: Check if the bottleneck is CPU, memory, or I/O on feed workers.
High write volume (100M/day); no complex joins; eventual consistency fine.
Follower Graph
SQL: Requires junction table; slow for massive fan-out queries
NoSQL: Not ideal for graph traversal
Recommended: Cassandra (with denormalisation) or dedicated graph DB (Neo4j for recommendations)
Why: Follower graph is read-heavy; denormalise for fast fan-out; graph DB for recommendation.

Search / Explore
SQL: Full-text search limited
NoSQL: Elasticsearch/Solr is not NoSQL in the traditional sense
Recommended: Elasticsearch (search) + Cassandra (data)
Why: Use Elasticsearch for full-text search over posts/descriptions; store source of truth in Cassandra.
Key takeaways
1. Separate Read and Write paths to handle lopsided traffic patterns (1:100 write-to-read ratio).
2. Use Consistent Hashing and Sharding by User_ID to manage data growth across multiple servers.
3. Implement a Hybrid Feed Model (Push for normal users, Pull for celebrities) to avoid celebrity fan-out write amplification.
4. Leverage CDNs to minimize 'Time to First Byte' (TTFB) for global users.
5. Cache only for active users; use multi-level cache (CDN, Redis, LRU) to achieve sub-300ms feed loads.
6. Always set a celebrity follower threshold: it's not optional, it's a hard requirement for feed stability.
Common mistakes to avoid (5 patterns)
1. Storing images directly in the database (BLOB columns)
Symptom: Database size balloons to petabytes, backups take days, read/write latency skyrockets as DB becomes I/O-bound.
Fix: Always store images/videos in an object store (S3, GCS). Keep only the URL string in the database. Use presigned URLs for access control.
2. Ignoring the CDN or deploying a single-region CDN
Symptom: Users in Europe and Asia experience >5s load times for photos stored in US-West; high egress costs from origin.
Fix: Use a global CDN (CloudFront, Cloudflare). Set proper cache-control headers to max-age=86400 for popular content. Pre-warm CDN for major events.
3. Assuming strong consistency is needed for all data (e.g., like counts)
Symptom: Write latency increases due to cross-region replication; availability drops when replicas are down.
Fix: Use eventual consistency for social metrics (likes, comments, follower counts). Show approximate counts with a + indicator. Only enforce strong consistency for user settings and posts.
4. Under-calculating storage growth for videos
Symptom: After 1 year, storage costs exceed projected budget; OOM errors on ingestion pipeline due to slow compression.
Fix: Estimate storage: 100M photos/day * 2MB = 200TB daily. Compress videos with H.265, transcode to multiple resolutions, and move older videos to cold storage (Glacier). Set lifecycle policies from day one.
5. Using auto-increment IDs in distributed sharded databases
Symptom: ID collisions when inserting concurrently; coordination overhead kills write throughput.
Fix: Use a Snowflake-style ID generator (64-bit, time-sortable, datacenter + worker bits). Or use UUID v7 (sortable). Avoid auto-increment entirely.
Interview Questions on This Topic
Q01 of 05 · Senior
How would you design Instagram's feed generation system to handle celebrities with millions of followers?
ANSWER
Use a hybrid push-pull model. For users below a threshold (e.g., 1M followers), push posts to followers' feed caches (fan-out on write) using Kafka for async fan-out. For celebrities above the threshold, store their posts in a hot table in Redis; at feed request time, merge the precomputed cached feed with recent celebrity posts. The threshold should be configurable and monitored via Kafka consumer lag. This prevents write amplification from destroying the message queue and keeps read latency acceptable.
Q02 of 05 · Senior
How would you shard the database for Instagram? Which shard key would you choose and why?
ANSWER
Shard by user_id using consistent hashing with virtual nodes. This ensures all data for a user (profile, photos, followers) stays on one shard, making profile queries fast and avoiding cross-shard joins. Consistent hashing allows adding/removing shards with minimal data movement. For hot keys (celebrities), virtual nodes help spread write load. Drawback: cross-shard queries for global feeds require a secondary index (Elasticsearch).
Q03 of 05 · Senior
How do you ensure high availability and durability for photo uploads?
ANSWER
Use asynchronous upload: the client uploads to a temporary signed URL pointing to S3, with parallel upload if the file is large. Once uploaded, the metadata (user_id, S3 URL, timestamp) is written to a distributed queue (Kafka) and then persisted to a sharded Cassandra table. For durability, S3 replicates data across multiple AZs. For availability, if the metadata write fails, a retry worker processes the queue; if S3 write fails, the client is notified to re-upload. Images are served via a CDN with origin shield to reduce load on S3.
Q04 of 05 · Senior
Compare using Cassandra vs PostgreSQL for the photo metadata table. Which would you choose and why?
ANSWER
Cassandra is the better choice for photo metadata because the write throughput is massive (100M inserts/day) and reads are primary-key based (user_id, photo_id). Cassandra provides linear scalability with no single point of failure, tunable consistency, and column-level TTLs (for auto-expiring stories). PostgreSQL requires manual sharding and has higher write overhead. However, if you need complex queries (e.g., 'find all photos near a location'), PostgreSQL with PostGIS might be better; but at scale, that query would be served by Elasticsearch anyway, not the primary metadata store.
Q05 of 05 · Senior
How would you estimate the storage and bandwidth needed for Instagram?
ANSWER
Assume 100M uploads/day, average photo 2MB, average video 10MB (5% of uploads are video; a 5-second clip at roughly 16 Mbps is about 10MB). Daily storage: 100M × 0.95 × 2MB + 100M × 0.05 × 10MB = 190TB + 50TB = 240TB/day. Upload bandwidth: 240TB/day ≈ 22Gbps on average. For reads, assume a 100:1 read-to-write ratio: 22Gbps × 100 ≈ 2.2Tbps on average, with peaks higher. A CDN that absorbs ~95% of reads leaves the origin serving around 110Gbps. After compression, storage drops to ~120TB/day. Plan for 3x replication: 360TB/day of capacity. Moving data to cold storage after 30 days reduces hot storage by 90%.
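The estimate in this answer can be reproduced with a few lines of arithmetic, which makes it easy to vary the assumptions in an interview. The sketch below is illustrative only: `CapacityEstimate` is a hypothetical helper, units are decimal (1TB = 10^12 bytes), and the result is an average rate rather than a peak.

```java
/** Back-of-the-envelope storage and bandwidth estimate for media uploads. */
final class CapacityEstimate {

    /** Daily ingest in decimal terabytes, given upload volume and average sizes in MB. */
    static double dailyStorageTb(long uploadsPerDay, double videoShare, double photoMb, double videoMb) {
        double photoMbTotal = uploadsPerDay * (1 - videoShare) * photoMb;
        double videoMbTotal = uploadsPerDay * videoShare * videoMb;
        return (photoMbTotal + videoMbTotal) / 1e6; // MB -> TB
    }

    /** Average throughput in Gbps needed to move the given number of TB in one day. */
    static double averageGbps(double tbPerDay) {
        double bitsPerDay = tbPerDay * 1e12 * 8;
        return bitsPerDay / 86_400 / 1e9;
    }

    public static void main(String[] args) {
        double storageTb = dailyStorageTb(100_000_000L, 0.05, 2.0, 10.0);
        double uploadGbps = averageGbps(storageTb);
        double readGbps = uploadGbps * 100;        // 100:1 read-to-write ratio
        double originGbps = readGbps * (1 - 0.95); // CDN absorbs ~95% of reads

        System.out.printf("Daily storage: %.0f TB%n", storageTb);
        System.out.printf("Average upload bandwidth: %.1f Gbps%n", uploadGbps);
        System.out.printf("Average read bandwidth: %.0f Gbps%n", readGbps);
        System.out.printf("Origin bandwidth behind CDN: %.0f Gbps%n", originGbps);
    }
}
```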
FAQ · 6 Questions
Frequently Asked Questions
01. How do you handle the 'Celebrity' problem in Instagram's feed?
We use a hybrid approach. For regular users, we 'fan-out' (push) their posts to followers' feeds. For celebrities like Cristiano Ronaldo, we don't push to 600M+ people. Instead, we store their post in a 'Hot Table' and pull it into a follower's feed only at request time.
02. Which database is better for Instagram: SQL or NoSQL?
Both. SQL (sharded) is better for relational data like User Profiles and Followers where referential integrity matters. NoSQL (like Cassandra) is better for the massive, write-heavy stream of Likes, Comments, and Activity Feeds.
03. How do you ensure 'High Availability' for image viewing?
We use Object Storage with cross-region replication and a global CDN. If one region goes down, the CDN automatically routes requests to the next nearest healthy edge location or origin server.
04. How do you handle video uploads and streaming?
Videos are uploaded as single files, then a transcoding service (e.g., AWS Elastic Transcoder) converts them to HLS segments at multiple bitrates. The segments are stored in S3 and served via a CDN with adaptive bitrate playback. Metadata (duration, thumbnail URLs) is stored alongside photo metadata in Cassandra. A separate worker handles thumbnail generation.
05. What happens when a user deletes a photo?
The delete request goes to the API gateway, which marks the photo as deleted in Cassandra (soft delete). A background worker then removes the file from S3 and invalidates the CDN cache for that URL. The metadata is retained for a short period (30 days) for legal compliance, then hard-deleted. The feed cache is updated to remove the post from followers' feeds via a delete marker.
06. How do you handle search in Instagram (Explore page)?
We index photo captions, hashtags, and locations into Elasticsearch. When a user opens Explore, we serve a curated feed using collaborative filtering and popularity signals. Search queries hit Elasticsearch for full-text matching, while the actual photo metadata (URLs) is fetched from Cassandra in a second round trip. This decoupling allows each system to scale independently.