If you're preparing for system design interviews at big tech companies, Reddit belongs on your must-practice list. Reddit is a world-renowned social news aggregator with more than 500 million monthly active users. It is often described as the overseas counterpart to Zhihu, and it is also a favorite of Silicon Valley interviewers. Why? Because a Reddit system design interview sits very close to real-world scenarios: high concurrency, read/write separation, and personalized recommendations, all of which big tech companies (e.g. Meta, Google) ask about regularly.
That said, system design interviews are hard. Unlike algorithm questions, they have no standard answer; they test your architectural thinking: can you build a scalable system step by step, from requirements analysis to bottleneck optimization? Many candidates get stuck on "how to cache under high QPS" or "how to make the write load asynchronous" and are exposed as soon as the interviewer probes. Don't panic. Today I'll take you through two classic Reddit questions in depth: designing Reddit's feed ranking system, and designing the voting system. Both come up frequently in Reddit interviews, and working through them should help you organize your thinking and walk in with confidence.

Question 1: Design Reddit's Feed Ranking System, including Home Feed and r/popular
This is the appetizer of Reddit system design, and it tests how well you handle a read-heavy system: the feed is the first thing users see when they open the app, and if latency exceeds 200ms they leave. The core task: generate a personalized or hot list for each user from a huge pool of posts, with support for infinite scrolling.
1. Functional requirements: clarify the boundaries first
- Home Feed: Logged-in users see posts from the subreddits (communities) they subscribe to, sorted by Hot/Best/New. Strongly personalized, taking user activity into account.
- r/popular feed: Site-wide trending posts, readable by anonymous users too. No subscriptions involved; the global top list is produced purely by the ranking algorithm.
- Non-functional: more than 100,000 QPS, 99.9% availability, and support for A/B testing ranking algorithms.
2. Architecture outline: read-write separation is king
The pain point of feed systems is that read QPS is about 100 times write QPS. Reading straight from the database on every request? It would fall over. Solution: fan-out on write, i.e. precompute each user's feed and store it in a cache.
- Write path (post published):
  - User posts → write to the Posts table (Cassandra: write-friendly, partition key subreddit_id).
  - Kafka carries a "new post" event → the Feed service consumes it and looks up the subreddit's subscribers (from the User-Sub table, sharded by user_id).
  - For each active subscriber (e.g. logged in within the last 30 days), push the post_id into their Redis feed (a sorted set whose score is the ranking value). What about a big subreddit with millions of subscribers (e.g. r/AskReddit)? Use a hybrid strategy: full fan-out for small subreddits, and a real-time merge at read time for big ones.
- Read path (pulling the feed):
  - User request → fetch the top N post_ids from Redis (ZREVRANGE).
  - Batch-fetch post details (from Memcached; on a miss, fall back to Cassandra).
  - Pagination: infinite scrolling via a Redis cursor or a timestamp anchor.
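The hybrid strategy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain dicts stand in for Redis sorted sets, and `FeedStore`, `BIG_SUB_THRESHOLD`, and the `(score, post_id)` cursor are hypothetical names, not Reddit's implementation. Subscribers who join a small subreddit after a post was fanned out are ignored here for brevity.

```python
from collections import defaultdict

# Hypothetical cutoff for "big" subreddits; a real system would use a far larger value.
BIG_SUB_THRESHOLD = 3


class FeedStore:
    """In-memory stand-in for Redis sorted sets (post_id -> score)."""

    def __init__(self):
        self.subscribers = defaultdict(set)   # subreddit -> user ids
        self.user_feeds = defaultdict(dict)   # user -> {post_id: score}, filled on write
        self.sub_feeds = defaultdict(dict)    # subreddit -> {post_id: score}, pulled on read

    def subscribe(self, user, subreddit):
        self.subscribers[subreddit].add(user)

    def publish(self, subreddit, post_id, score):
        # Every post is recorded once in the subreddit's own feed.
        self.sub_feeds[subreddit][post_id] = score
        subs = self.subscribers[subreddit]
        # Fan out on write only for small subreddits.
        if len(subs) < BIG_SUB_THRESHOLD:
            for user in subs:
                self.user_feeds[user][post_id] = score

    def read(self, user, limit=10, cursor=None):
        # Start from the precomputed feed, then merge big subreddits at read time.
        candidates = dict(self.user_feeds[user])
        for subreddit, subs in self.subscribers.items():
            if user in subs and len(subs) >= BIG_SUB_THRESHOLD:
                candidates.update(self.sub_feeds[subreddit])  # dict keys deduplicate
        ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
        if cursor is not None:
            c_score, c_id = cursor
            # Keep only items strictly after the cursor in ranking order.
            ranked = [(p, s) for p, s in ranked if (-s, p) > (-c_score, c_id)]
        page = ranked[:limit]
        next_cursor = (page[-1][1], page[-1][0]) if page else None
        return [p for p, _ in page], next_cursor
```

Passing the returned cursor back into `read` yields the next page, which is the anchor-based pagination needed for infinite scrolling; an offset-based page would shift whenever new posts arrive.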
What about r/popular? Simple: maintain one global Redis sorted set, add new posts from across the site in real time, and compute scores with the Hot formula. Anonymous users read it directly: QPS is high, but there is little personalization.
3. Ranking algorithms: Reddit's "black tech"
Ranking is the soul of the feed. Reddit open-sourced its formulas, so don't just memorize them; understand the principles.
- Hot ranking: new posts score high and old posts sink relative to them. Formula: score = sign(s) × log10(max(|s|, 1)) + t / 45000, where s = up - down and t is the post's creation time in seconds, measured from a fixed epoch. The log dampens extreme vote counts, and the time term gives a post its "hot window": being 45,000 seconds (12.5 hours) older costs as much as a 10x difference in net votes. Implementation: the Feed service computes the score and stores it in Redis.
- Best ranking: resists manipulation through small samples, using the lower bound of the Wilson score interval (a confidence interval): score = (p + z²/(2n) - z√(p(1-p)/n + z²/(4n²))) / (1 + z²/n), where p is the upvote fraction, n is the total number of votes, and z = 1.96 (95% confidence). Items with few votes are penalized relative to items with many, which is fair: a single upvote should not outrank 95 upvotes out of 100.
- Extension: add ML? Serve user embeddings with TensorFlow Serving and re-rank the top 100 in real time.
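To make the two formulas concrete, here is a minimal Python sketch. The function names are illustrative, and the `REDDIT_EPOCH` offset is an assumption taken from memory of Reddit's open-sourced ranking code; any fixed offset gives the same ordering. The `sign` factor and the `max(|s|, 1)` guard keep the log defined for zero or negative net votes.

```python
import math

# Assumed epoch offset (seconds); only relative ordering depends on the time term.
REDDIT_EPOCH = 1134028003


def hot(ups: int, downs: int, created_utc: float) -> float:
    """Hot score: log-damped net votes plus a linear bonus for recency."""
    s = ups - downs
    order = math.log10(max(abs(s), 1))      # guard against log10(0)
    sign = (s > 0) - (s < 0)                # 1, 0 or -1
    return round(sign * order + (created_utc - REDDIT_EPOCH) / 45000, 7)


def wilson_lower_bound(ups: int, downs: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote fraction."""
    n = ups + downs
    if n == 0:
        return 0.0                          # no votes, no evidence
    p = ups / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)
```

With these definitions, a post with 100 net upvotes scores the same as a post with 10 net upvotes created 45,000 seconds later. `wilson_lower_bound(1, 0)` comes out at roughly 0.21 while `wilson_lower_bound(95, 5)` is roughly 0.89, so the well-sampled item ranks first even though its raw upvote fraction is lower.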
4. Optimization & scaling: don't forget the bottlenecks
- Caching: L1 Redis (feed IDs), L2 Memcached (post content). Invalidation: evict asynchronously when a post is updated.
- Hot subreddit problem: rate-limit posting in big subreddits, or use "pull on read": merge from the subreddit's own feed when a user reads (deduplicating with a Bloom filter).
- Monitoring: Prometheus tracks QPS and latency; A/B test different ranking algorithms.
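The caching bullet can be illustrated with a small cache-aside sketch. It is a simplification under stated assumptions: plain dicts stand in for Memcached and Cassandra, `PostCache` and its methods are hypothetical names, and eviction happens inline rather than asynchronously through an event as described above.

```python
class PostCache:
    """Cache-aside lookup for post bodies; dicts stand in for Memcached and Cassandra."""

    def __init__(self, database):
        self.database = database   # post_id -> post dict (source of truth)
        self.cache = {}            # stand-in for Memcached
        self.db_reads = 0          # counts fallbacks, for illustration only

    def get_many(self, post_ids):
        # Serve what we can from the cache first.
        found = {pid: self.cache[pid] for pid in post_ids if pid in self.cache}
        for pid in post_ids:
            if pid in found:
                continue
            # Cache miss: fall back to the database and repopulate the cache.
            self.db_reads += 1
            post = self.database.get(pid)
            if post is not None:
                self.cache[pid] = post
                found[pid] = post
        # Preserve the ranking order handed over by the feed layer.
        return [found[pid] for pid in post_ids if pid in found]

    def on_post_updated(self, post_id, new_post):
        # Write to the source of truth, then evict so the next read reloads it.
        self.database[post_id] = new_post
        self.cache.pop(post_id, None)
```

Evicting on update instead of overwriting the cached value keeps the write path simple and avoids caching posts that nobody reads again.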
In the actual interview, a common follow-up is "What if a subreddit passes a billion subscribers?" Answer: fan out hierarchically, and only to a sample of active subscribers. Solid!
Question 2: Design the Voting System (Upvote/Downvote) for Posts and Comments
Voting is the "social heart" of Reddit, at 10,000 writes per second! This question is about write-heavy design, with the focus on avoiding contention and going asynchronous. The bottleneck: concurrent updates to the same vote counters contend for locks on the table.
1. Requirements & bottlenecks: high-concurrency writes are the devil
- Functionality: users vote +1/-1 on a post or comment, and can withdraw or toggle a vote. Vote counts in the UI update in near real time, but eventual consistency is acceptable.
- Bottleneck: naively updating the up/down fields on the Posts table causes lock contention under high concurrency. QPS: 10k/s for writes, higher for reads.
2. Architecture: asynchronous aggregation, decoupled writing and reading
Don't write to the database synchronously! Go event-driven with batch aggregation.
- API layer: POST /vote {content_id, direction: 1/-1/0}. Check the user's existing vote first (Redis hash: user_votes:{user_id}).
- Write path:
  - On each vote → write to the Votes table (schema: user_id, content_id, direction; sharded by content_id, on Vitess or CockroachDB).
  - Emit a Kafka event → the Vote Aggregator service consumes it and aggregates in batches every 1 to 5 minutes: SUM(direction) updates up/down/score on Posts.
- Read path: vote counts are read from Posts (eventually consistent, lagging by up to one aggregation window). Need a real-time UI? Push increments over WebSocket, or broadcast updates for popular posts through Redis pub/sub.
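The write path above can be sketched as follows. This is a minimal in-memory illustration, not a production design: `vote_delta` and `VoteAggregator` are hypothetical names, dicts stand in for the Redis hash and the Posts table, and `handle_event` plays the role of the Kafka consumer callback. It tracks separate up/down deltas so that toggles and withdrawals adjust both counters correctly.

```python
from collections import defaultdict


def vote_delta(previous: int, new: int):
    """Return (up_delta, down_delta) when a vote moves from `previous` to `new`.

    Directions are 1 (upvote), -1 (downvote) or 0 (no vote / withdrawn).
    """
    up = (new == 1) - (previous == 1)
    down = (new == -1) - (previous == -1)
    return up, down


class VoteAggregator:
    def __init__(self):
        self.user_votes = defaultdict(dict)          # user_id -> {content_id: direction}
        self.pending = defaultdict(lambda: [0, 0])   # content_id -> [up_delta, down_delta]
        self.totals = defaultdict(lambda: [0, 0])    # stand-in for Posts.up / Posts.down

    def handle_event(self, user_id, content_id, direction):
        # Compare with the user's previous vote so repeats are idempotent.
        previous = self.user_votes[user_id].get(content_id, 0)
        up, down = vote_delta(previous, direction)
        self.user_votes[user_id][content_id] = direction
        self.pending[content_id][0] += up
        self.pending[content_id][1] += down

    def flush(self):
        """Apply one batched write per content item; return how many rows were written."""
        writes = 0
        for content_id, (up, down) in self.pending.items():
            if up or down:
                self.totals[content_id][0] += up
                self.totals[content_id][1] += down
                writes += 1
        self.pending.clear()
        return writes

    def score(self, content_id):
        up, down = self.totals[content_id]
        return up - down
```

However many votes arrive inside one window, `flush` issues at most one counter update per post, which is what removes the row-level contention of the naive design.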
3. Data model: redundancy & sharding
- Posts/Comments: id, up, down, score = up - down. Stored in Cassandra, partitioned.
- Votes: a very large table; expire old votes with a TTL. A unique index on (user_id, content_id) prevents duplicate votes.
- User Votes: Redis hash, for fast lookup and withdrawal.
4. Anti-abuse: community safety first
Reddit has blocked countless bots.
- Rate limit: at the API gateway, 5 requests/s per user/IP, using a token bucket.
- Behavioral detection: an asynchronous Spark job analyzes voting patterns: clusters of new accounts voting together? IP clusters? Flag suspicious accounts with ML (Isolation Forest) and exclude their votes.
- Fingerprinting: store device_id + IP and cluster them to catch sock-puppet accounts.
- CAPTCHA: require verification for high-risk votes.
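The token bucket named in the rate-limit bullet can be sketched like this. It is a single-process illustration under stated assumptions: the `TokenBucket` class and its injectable `clock` parameter are hypothetical, and a gateway would normally keep one bucket per user or IP in shared storage.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the bucket can be tested
        self.tokens = capacity        # start full
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For the 5/s limit above, a gateway could keep something like `buckets[user_id] = TokenBucket(rate=5, capacity=5)` and reject the vote whenever `allow()` returns False.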
Optimization: shrink the aggregation window for popular posts (1 min) and run a daily batch for cold posts. Interview follow-up: "How do you handle what users see while counts are only eventually consistent?" Answer: optimistic UI plus polling or WebSocket updates.
Summary of Interview Approach
Reddit's system design questions are medium to high in difficulty and of the "multi-module collaboration + trade-off discussion" type. The interviewer expects to hear the following:
- You can identify the system's read/write ratio and its performance bottlenecks.
- You can design event streaming (Kafka) and caching (Redis) so that they work together.
- You can analyze the trade-offs between fan-out on write and fan-out on read.
- You've accounted for anti-cheating and eventual consistency.
During the interview, sketch the overall architecture on the whiteboard, start with the data flow, and then go into optimization details and scaling options.