Design Reddit: System Design #1

Imagine you’re the engineer responsible for building Reddit from the ground up. Walk me through how you would design the system to support the following functionality:

Requirements:

  • Users can attach images to their posts
  • Users can upvote or downvote posts
  • Users can add comments to posts
  • Users can see a feed of posts sorted by ranking or recency

Constraints:

Hints

How does the large volume of users impact our architecture? What can we do to ensure that our system scales properly?

Solution:

This is a broad problem with many interesting aspects to explore: hosting user-generated content, voting and ranking, and designing a system at a massive scale. For our solution, we’ll follow the approach outlined in the first lesson of this module.

1. Define problem space

  • Do we need to support users on mobile apps or only web?
  • Will users upload their images to Reddit or link to a third-party image hosting service?
  • Are there any performance or latency requirements that would impact our design choices?

For now, let’s assume that we only care about web users, that we want to host user content directly on our servers, and that this content should load quickly for users anywhere in the world.

2. High-level Design

Let’s start by picking the core components of our system! We know from the project requirements that we want to allow users to view, post, upvote, and comment. Think about what components of your system you’ll need to support this: databases, servers, user interfaces, etc.

Let’s start from the database and work our way up. We know we’ll need a data store for all of our users, posts, and upvotes, and we’ll also need to store and retrieve large image files. For the first type of data, a relational database makes the most sense because there’s a clear relational structure: users have many posts, posts have many upvotes, and so on. A SQL database is well suited to modelling and querying this kind of data. Large image files are a poor fit for a SQL database, however, since storing them as binary blobs bloats the database and slows down backups and queries, so we also need an object storage system, like Amazon S3.
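To make the relational structure concrete, here is a minimal sketch using Python's built-in sqlite3 module. Every table and column name below (users, posts, votes, comments, image_key, and so on) is an assumption made for illustration, not a schema taken from Reddit, and a production system would run on a server-based SQL database rather than an in-memory one. Note that the posts table stores only the key of the image in object storage, not the image itself.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE posts (
    id         INTEGER PRIMARY KEY,
    author_id  INTEGER NOT NULL REFERENCES users(id),
    title      TEXT NOT NULL,
    image_key  TEXT,  -- key of the image in object storage, not the image bytes
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE votes (
    user_id INTEGER NOT NULL REFERENCES users(id),
    post_id INTEGER NOT NULL REFERENCES posts(id),
    value   INTEGER NOT NULL CHECK (value IN (-1, 1)),  -- downvote or upvote
    PRIMARY KEY (user_id, post_id)  -- at most one vote per user per post
);
CREATE TABLE comments (
    id        INTEGER PRIMARY KEY,
    post_id   INTEGER NOT NULL REFERENCES posts(id),
    author_id INTEGER NOT NULL REFERENCES users(id),
    body      TEXT NOT NULL
);
"""


def create_database() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def post_score(conn: sqlite3.Connection, post_id: int) -> int:
    """Return upvotes minus downvotes for a post (0 if it has no votes)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(value), 0) FROM votes WHERE post_id = ?",
        (post_id,),
    ).fetchone()
    return row[0]
```

The composite primary key on votes is one way to enforce a single vote per user per post at the database level, so the application does not have to check for duplicates itself.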

Now that we know how we’re going to store our data, we need application servers to perform “CRUD” operations on the underlying data, handle user authentication, and run the rest of our business logic. Due to the scale of our system, we’ll also need many server instances (or even multiple points of presence), along with a load balancer to distribute traffic across these servers. We’ll also need caching layers throughout the system, both for common operations like ranking and for content distribution.
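As a rough illustration of what the load balancer does, the sketch below hands out application servers in round-robin order. Round-robin is only one simple strategy, chosen here as an assumption; real load balancers also account for server health and current load. The class name and the server names are hypothetical.

```python
from itertools import cycle
from typing import Iterable


class RoundRobinBalancer:
    """Hypothetical balancer that cycles through servers in a fixed order."""

    def __init__(self, servers: Iterable[str]) -> None:
        server_list = list(servers)
        if not server_list:
            raise ValueError("at least one server is required")
        self._servers = cycle(server_list)

    def next_server(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._servers)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    for _ in range(4):
        print(balancer.next_server())
```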

3. Define each part of our system

Database Schema

Users Schema
Posts Schema
Subreddits Schema
Upvotes Schema
Comments Schema

Indexing:

Sharding:

API

Caching

  • Retrieval and ranking: To improve the performance of retrieval and ranking of posts, we can add a caching layer (like Memcached or Redis) between our application and databases. We can update our cache either periodically with a scheduled job or directly upon user-initiated actions like posting and voting. Our choice will depend largely on the estimated volume of views vs. posts and on how stale we can allow the rankings to become. To make our cache usage more efficient, we must also consider what “eviction policy” would make sense for our application; one option would be to cache the rankings by subreddit and use a least-recently-used (LRU) policy, which keeps frequently viewed subreddits in the cache and evicts the ones that have not been requested recently.
  • Content delivery network: To deliver static file content, such as user-uploaded images and frontend resources, we will need to make use of a distributed content delivery network (CDN). This service will permit us to cache resources at nodes around the world, reducing the load on our backend servers while also decreasing latency for users. Newly uploaded resources can be pushed to the CDN or pulled by the CDN from object storage as needed. Some CDNs also offer additional benefits such as automatic image compression and optimization for different types of devices.
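The idea of caching rankings by subreddit with an LRU eviction policy can be sketched in a few lines. This is an in-process illustration under stated assumptions, not how Memcached or Redis would be wired in: the class name, the load_ranking callback (standing in for a database query), and the invalidate method (standing in for an update triggered by a new post or vote) are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable, List


class SubredditRankingCache:
    """Hypothetical LRU cache mapping a subreddit name to its ranked post IDs."""

    def __init__(self, capacity: int, load_ranking: Callable[[str], List[int]]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._load_ranking = load_ranking  # assumed to query the database
        self._entries: "OrderedDict[str, List[int]]" = OrderedDict()

    def get(self, subreddit: str) -> List[int]:
        """Return the cached ranking, loading it on a miss."""
        if subreddit in self._entries:
            # Mark as most recently used.
            self._entries.move_to_end(subreddit)
            return self._entries[subreddit]
        ranking = self._load_ranking(subreddit)
        self._entries[subreddit] = ranking
        if len(self._entries) > self._capacity:
            # Evict the least recently used subreddit.
            self._entries.popitem(last=False)
        return ranking

    def invalidate(self, subreddit: str) -> None:
        """Drop a cached ranking, e.g. after a new post or vote."""
        self._entries.pop(subreddit, None)

    def __contains__(self, subreddit: str) -> bool:
        return subreddit in self._entries
```

With a capacity of two, requesting "a", "b", "a", and then "c" evicts "b", because "a" was used more recently. Whether to call invalidate on every vote or to refresh on a schedule is the same trade-off between freshness and database load described above.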

