Design - Gaming Leaderboard

FUNCTIONAL REQUIREMENTS -

Display top 10 users on the leaderboard
Show user's specific rank
Display users who are 4 places up and 4 places down below the user

NON-FUNCTIONAL REQUIREMENTS -

Real-time update on scores
Score update is reflected on the leaderboard in real-time
Scalability, Availability and Reliability

API SIGNATURE -

POST /v1/scores (update score on the leaderboard)
GET /v1/scores (fetch top 10 users with their scores)
GET /v1/scores/{userdId} (get the user rank on the leaderboard)

DATA MODEL -

We can use a RDBMS solution for this problem. Whenever a user wins a match we update the score of the user by 1, but the main problem arises when we have to fetch the ranks of user/top 10 user.
We cannot sort millions of rows in the DB for every request.
One optimisation we could do is to add an index and limit the results, but this also requires a full scan of the DB.

REDIS Solution -

For this problem, we'll use the Redis sorted sets.

What are sorted sets?

Redis Sorted Sets (`zset`) are a powerful data structure in Redis that store unique elements with an associated score, which determines their order. Unlike regular sets in Redis, where elements are unordered, sorted sets maintain a sorted order of elements based on the scores provided. This makes them highly useful for scenarios where ordered data retrieval is required, like leaderboards, rankings, and time-series data.

Key Characteristics of Redis Sorted Sets

Unique Elements:
- Each element in a sorted set is unique, meaning duplicate values cannot exist. However, scores can repeat across different elements.
Associated Scores:
- Each element has a floating-point number called a score, which represents the sorting criterion. Scores determine the ordering of elements within the sorted set.
Automatic Ordering:
- Elements are automatically stored in ascending order of scores. This allows for fast retrieval of elements in a specific order.
Efficient Operations:
- Redis Sorted Sets offer O(log(N)) complexity for most operations, where N is the number of elements in the sorted set. This makes them efficient for insertion, deletion, and range queries.

Commands and Usage

ZADD: Insert the user into the set if they don't exist else update the score of the user. This operation takes O(log(n)) time.
ZINCRBY: Update the score of the user by a specified amount, and if the user does not exist start with the score of 0 and update it to 1. O(log(n)) execution time again.
ZRANGE/ZREVRANGE: Fetch a range of users sorted by the score. We can specify the number of entries we want to fetch and the position to start from. This takes about O(log(n) + m) where m is the number of entries to fetch and n is the number of entries in the sorted set.
ZRANK/ZREVRANK: Fetch the position of any user at any given point in time. This operation also takes O(log(n)) time.

DESIGN -

Game Service - We don't want the client to directly talk to our leaderboard service. As it is not secure, and can lead to attacks. So, the game service will authorise the client and send the payload to leaderboard service.

Leaderboard Service - It updates the the score in SQL DB and also into redis sorted sets because Redis may give update failures and we want a source of truth table. We can run some cron jobs (every 1 hour) to sync the data b/w SQL table and redis. It also fetches the rank/top 10 ranks and also fetched user details from SQL DB and then returns the response to game service.

OPTIMIZATIONS -

How to scale Redis if we suddenly get 500 million DAU ?

We'll consider two ways to shard the data inside redis, fixed partition and hash partition.

Fixed Partition -

For fixed partitions approach, we first get the overall range of the scores in the leaderboard for a month. Let’s say the score ranges from 1 to 1000 and we break up the range into 10 sets, like 1–100, 101–200, 201–300… and so on. We breakup in such a way such that that the users are evenly distributed.

In this approach, we shard the data ourselves by implementing the logic in the application code.

Updating the score logic: When we are inserting or updating the score for the user we need to know in which shard the user is present. One way to have this is to look for the current score in the RDBMS instance and according to that update the score in the appropriate shard. But this might not be performant so we would need to maintain a secondary cache, which has a mapping from user_id to score information. Also while updating the score, a user might have to be moved between shards.
To fetch the top 10 leaderboard: We simply have to fetch the top 10 players from the shard which is the last shard i.e 901–1000 as the shards are based on the scores.
To fetch the rank of the user: Once we determine the shard for the user by using a secondary cache we would determine the local_rank of the user within the shard. We will also have to calculate the number of users that are present in all the shards before the present shard and then with both the information we can find the global_rank i.e. the rank of the user.

Hash Partition -

For Hash Partition we use a Redis Cluster which provides a way to automatically shard the data across multiple Redis Nodes. Internally this does not use Consistent Hashing but a different hashing scheme called as the hash slot.

There are 16384 hash slots and we can compute the hash slot by using CRC16(key) % 16384. This allows us to add and remove the keys from the cluster easily without redistributing all the keys.

Updating the score logic: This logic is straightforward, we will determine the hash slot of the user by using CRC16(key) % 16384, and based on the hash slot we can easily determine the shard.
To fetch the top 10 leaderboard: This logic is more complicated, we would need to gather the top 10 players of each shard and then finally compute the top 10 players combined of all the shards. The first step of finding the top 10 players in each shard can be done in parallel but it is a blocking approach since we have to wait for all the shards to respond and it will take time as much as the slowest shard.
To fetch the rank of the user: There is no straightforward solution that exists for this.

As observed in the above points we can see the fixed partition approach has more merits than demerits. The only demerit of the fixed partition approach is that we have to maintain a lot of logic on the application side. So for our use case it's better to go with the fixed partition approach.

How to break tie?

In the case of two users having the same score, we'll also store the timestamp of the latest update in the score and sort on basis of timestamp if the score is same. The user with older timestamp ranks higher.

Alternative Solution : NoSql

We can select a DB like Amazon DynamoDB, Cassandra or MongoDB. We can partition the data and add score as sort key for faster retrieval.
The partition key will look something like: game_name#{year-month}#p{partition-number}.

Hemant's Blog