Merge pull request #226 from harrisonDG/main
Showing 25 changed files with 243 additions and 0 deletions.
134 changes: 134 additions & 0 deletions
...em-design-interview/12-design-a-chat-system/donggu-ch12-design-a-chat-system.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
# Chapter 12. Design a chat system | ||
|
||
## Step 1. Understand the problem and establish design scope | ||
- What kind of chat app shall we design? 1 on 1 or group based? | ||
- Is this a mobile app? Or a web app? Or both? | ||
- What is the scale of this app? | ||
- What is the group member limit? | ||
- What features are important for the chat app? Can it support attachments? | ||
- Is there a message size limit? | ||
- Is end-to-end encryption required? | ||
- How long shall we store the chat history? | ||
|
||
### Requirements | ||
- Small group chat (max of 100 people) | ||
- Online presence | ||
- Multiple device support. | ||
- Supports 50 million DAU. | ||
- A one-on-one chat with low delivery latency. | ||
|
||
## Step 2. Propose high-level design and get buy-in | ||
- The relationship between clients | ||
![figure12-2.png](donggu/figure12-2.png) | ||
|
||
### Polling | ||
- Polling is costly: it consumes server resources to answer a question whose answer is "no" most of the time. | ||
![figure12-3.png](donggu/figure12-3.png) | ||
|
||
### Long polling | ||
- In long polling, a client holds the connection open until there are actually new messages available or a timeout threshold has been reached. | ||
- Drawbacks | ||
1. Sender and receiver may not connect to the same chat server. HTTP-based servers are usually stateless. If you use round robin for load balancing, the server that receives the message might not have a long-polling connection with the client who receives the message. | ||
2. A server has no good way to tell if a client is disconnected. | ||
3. It is inefficient. If a user does not chat much, long polling still makes periodic connections after timeouts. | ||
![figure12-4.png](donggu/figure12-4.png) | ||
|
||
### WebSocket | ||
- The most common solution for sending asynchronous updates from server to client. | ||
- A WebSocket connection is initiated by the client. It is ***bi-directional and persistent***. | ||
- WebSocket connections generally work even if a firewall is in place. This is because they use port 80 or 443 which are also used by HTTP/HTTPS connections. | ||
![figure12-5.png](donggu/figure12-5.png) | ||
|
||
### High-level design | ||
![figure12-7.png](donggu/figure12-7.png) | ||
|
||
#### Stateless Services | ||
- Stateless services are traditional public-facing request/response services, used to manage the login, signup, user profile, etc. | ||
- Stateless services sit behind a load balancer whose job is to route requests to the correct services based on the request paths. | ||
|
||
#### Stateful Service | ||
- The chat service is stateful because each client maintains a persistent network connection to a chat server. | ||
|
||
#### Third-party integration | ||
- Push notification | ||
|
||
#### Scalability | ||
![figure12-8.png](donggu/figure12-8.png) | ||
- Chat servers facilitate message sending/receiving. | ||
- Presence servers manage online/offline status. | ||
- API servers handle everything including user login, signup, profile changes, etc. | ||
- Notification servers send push notifications. | ||
- Key-value store is used to store chat history. | ||
|
||
#### Storage | ||
- Generic data: user profiles, settings, and friends lists. This data is stored in robust and reliable relational databases. Replication and sharding are common techniques to satisfy availability and scalability requirements. | ||
- Data unique to chat systems: chat history. | ||
- The amount of data is enormous for chat systems. | ||
- Only recent chats are accessed frequently. Users do not usually look up old chats. | ||
- Although very recent chat history is viewed in most cases, users might use features that require random access of data, such as search, view your mentions, jump to specific messages, etc. These cases should be supported by the data access layer. | ||
- The read to write ratio is about 1:1 for 1 on 1 chat apps. | ||
- Key-value stores | ||
- Key-value stores allow easy horizontal scaling. | ||
- Key-value stores provide very low latency to access data. | ||
- Relational databases do not handle the long tail of data well. When the indexes grow large, random access is expensive. | ||
- Key-value stores are adopted by other proven reliable chat applications. | ||
|
||
#### Data models | ||
##### Message table for 1 on 1 chat | ||
- The primary key is message_id, which helps to decide message sequence. We cannot rely on created_at to decide the message sequence because two messages can be created at the same time. | ||
|
||
##### Message table for group chat | ||
- The composite primary key is (channel_id, message_id). Channel and group represent the same meaning here. channel_id is the partition key because all queries in a group chat operate in a channel. | ||
|
||
##### Message ID | ||
- To ascertain the order of messages, message_id must satisfy the following two requirements: | ||
- IDs must be unique. | ||
- IDs should be sortable by time, meaning new rows have higher IDs than old ones. | ||
- Option 1: rely on auto_increment. However, NoSQL databases usually do not provide such a feature. | ||
- Option 2: use a global 64-bit sequence number generator like Snowflake. | ||
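
The two requirements above (unique, and sortable by time) can be illustrated with a minimal Snowflake-style sketch. The bit layout (41 bits of timestamp, 10 bits of machine ID, 12 bits of sequence), the epoch value, and the `MessageIdGenerator` name are assumptions made for illustration, not details taken from these notes.

```python
import threading
import time

# Assumed layout: 41 bits of milliseconds since a custom epoch,
# 10 bits of machine ID, 12 bits of per-millisecond sequence.
CUSTOM_EPOCH_MS = 1_600_000_000_000  # hypothetical epoch
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class MessageIdGenerator:
    def __init__(self, machine_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self._machine_id = machine_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            # Never move backwards, even if the wall clock does.
            now = max(self._clock(), self._last_ms)
            if now == self._last_ms:
                self._sequence += 1
                if self._sequence > MAX_SEQUENCE:
                    # Sequence exhausted: borrow the next millisecond.
                    now += 1
                    self._sequence = 0
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - CUSTOM_EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self._machine_id << SEQUENCE_BITS)
                | self._sequence
            )
```

IDs from a single generator are strictly increasing. Across machines they are only ordered to the precision of the clocks involved, which is one reason a local sequence within a channel can be sufficient.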
|
||
## Step 3. Design deep dive | ||
### Service discovery | ||
- Apache Zookeeper is a popular open-source solution for service discovery. It registers all the available chat servers and picks the best chat server for a client based on predefined criteria. | ||
- ![figure12-11.png](donggu/figure12-11.png) | ||
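
As a rough illustration of "picks the best chat server based on predefined criteria", the sketch below chooses the healthy server with the lowest load ratio. The criteria, the `pick_chat_server` name, and the server record fields are hypothetical; this does not use or model the ZooKeeper API.

```python
def pick_chat_server(servers):
    """Return the name of the least-loaded healthy server, or None.

    `servers` is a list of dicts with hypothetical fields:
    name, healthy, connections, capacity.
    """
    candidates = [
        s for s in servers
        if s["healthy"] and s["connections"] < s["capacity"]
    ]
    if not candidates:
        return None
    # Lowest load ratio wins; the name breaks ties deterministically.
    best = min(candidates, key=lambda s: (s["connections"] / s["capacity"], s["name"]))
    return best["name"]
```

In a real deployment the criteria might also include geographical location, which this sketch leaves out.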
|
||
### Message flows | ||
#### 1 on 1 chat flow | ||
- ![figure12-12.png](donggu/figure12-12.png) | ||
|
||
#### Message synchronization across multiple devices | ||
- ![figure12-13.png](donggu/figure12-13.png) | ||
- Each device maintains a variable called cur_max_message_id, which keeps track of the latest message ID on the device. Messages that satisfy the following two conditions are considered new messages: | ||
- The recipient ID is equal to the currently logged-in user ID. | ||
- Message ID in the key-value store is larger than cur_max_message_id. | ||
- With distinct cur_max_message_id on each device, message synchronization is easy as each device can get new messages from the KV store. | ||
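
A minimal sketch of this synchronization rule, assuming an in-memory stand-in for the key-value store. `MessageStore`, `Device`, and their methods are hypothetical names; only `cur_max_message_id` comes from the notes.

```python
class MessageStore:
    """In-memory stand-in for the key-value store (hypothetical interface)."""

    def __init__(self):
        self._messages = []  # (message_id, recipient_id, body)

    def put(self, message_id, recipient_id, body):
        self._messages.append((message_id, recipient_id, body))

    def fetch_new(self, user_id, cur_max_message_id):
        # Both conditions from the notes: addressed to this user, and
        # newer than anything the device has already seen.
        new = [
            m for m in self._messages
            if m[1] == user_id and m[0] > cur_max_message_id
        ]
        return sorted(new)


class Device:
    def __init__(self, user_id):
        self.user_id = user_id
        self.cur_max_message_id = 0  # assumes real message IDs are positive
        self.received = []

    def sync(self, store):
        new = store.fetch_new(self.user_id, self.cur_max_message_id)
        if new:
            self.received.extend(new)
            self.cur_max_message_id = new[-1][0]
        return new
```

Because each device tracks its own `cur_max_message_id`, a phone that is already up to date and a laptop that has been offline can sync independently against the same store.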
|
||
#### Small group chat flow | ||
- When User A sends a message in a group chat: | ||
- ![figure12-14.png](donggu/figure12-14.png) | ||
- This design choice is good for small group chat because it simplifies message sync flow as each client only needs to check its own inbox to get new messages. | ||
- When the number of group members is small, storing a copy in each recipient's inbox is not too expensive. | ||
- On the recipient side: | ||
- ![figure12-15.png](donggu/figure12-15.png) | ||
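
The inbox-per-recipient idea can be sketched as follows. `GroupChatService` and its method names are hypothetical, and the in-memory deques stand in for whatever queue the real system uses.

```python
from collections import defaultdict, deque


class GroupChatService:
    def __init__(self):
        self.members = defaultdict(set)    # channel_id -> user IDs
        self.inboxes = defaultdict(deque)  # user_id -> pending messages

    def join(self, channel_id, user_id):
        self.members[channel_id].add(user_id)

    def send(self, channel_id, sender_id, message_id, body):
        # Fan out on write: one copy per recipient, none for the sender.
        for user_id in self.members[channel_id]:
            if user_id != sender_id:
                self.inboxes[user_id].append(
                    (message_id, channel_id, sender_id, body)
                )

    def receive(self, user_id):
        # A recipient only reads its own inbox, whoever the senders were.
        inbox = self.inboxes[user_id]
        messages = list(inbox)
        inbox.clear()
        return messages
```

The cost of `send` grows with group size, which is why this choice suits small groups and becomes expensive for very large ones.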
|
||
### Online presence | ||
#### User login | ||
- After a WebSocket connection is built between the client and the real-time service, user A's online status and last_active_at timestamp are saved in the KV store. | ||
- ![figure12-16.png](donggu/figure12-16.png) | ||
|
||
#### User logout | ||
- ![figure12-17.png](donggu/figure12-17.png) | ||
|
||
#### User disconnection | ||
- When a user disconnects from the internet, the persistent connection between the client and server is lost. A naive way to handle user disconnection is to mark the user as offline and change the status to online when the connection re-establishes. However, this results in a poor user experience: connections often drop and recover within a short time, so the presence indicator would change too often. | ||
- A heartbeat mechanism solves this problem. Periodically, an online client sends a heartbeat event to presence servers. If presence servers receive a heartbeat event from the client within a certain time, say x seconds, the user is considered online. Otherwise, the user is offline. | ||
- ![figure12-18.png](donggu/figure12-18.png) | ||
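
A minimal sketch of the heartbeat check, with timestamps passed in explicitly so the behavior is easy to follow. `PresenceTracker` and the 30-second timeout in the usage note are illustrative assumptions; the notes only say "x seconds".

```python
class PresenceTracker:
    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self._last_heartbeat = {}  # user_id -> timestamp of last heartbeat

    def heartbeat(self, user_id, now):
        self._last_heartbeat[user_id] = now

    def is_online(self, user_id, now):
        last = self._last_heartbeat.get(user_id)
        # Online only if a heartbeat arrived within the timeout window.
        return last is not None and now - last <= self.timeout_seconds
```

With `timeout_seconds=30`, a user whose last heartbeat was at t=100 is still online at t=130 and offline at t=131. A brief network drop shorter than the timeout never changes the status.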
|
||
#### Online status fanout | ||
- ![figure12-19.png](donggu/figure12-19.png) | ||
- How do user A's friends know about the status changes? | ||
- Presence servers use a publish-subscribe model, in which each friend pair maintains a channel. When User A's online status changes, it publishes the event to three channels: A-B, A-C, and A-D. Those three channels are subscribed to by User B, C, and D, respectively. Thus, it is easy for friends to get online status updates. The communication between clients and servers is through real-time WebSocket. | ||
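
The per-friend-pair channel model can be sketched with a tiny in-process broker. `PresenceBroker`, `channel_name`, and `publish_status` are hypothetical names, and callbacks stand in for WebSocket pushes.

```python
from collections import defaultdict


class PresenceBroker:
    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, event):
        for callback in self._subscribers[channel]:
            callback(event)


def channel_name(user_id, friend_id):
    # One channel per (user, friend) pair, e.g. "A-B".
    return f"{user_id}-{friend_id}"


def publish_status(broker, user_id, friend_ids, status):
    # A status change is published once per friend channel.
    event = {"user_id": user_id, "status": status}
    for friend_id in friend_ids:
        broker.publish(channel_name(user_id, friend_id), event)
```

The number of published events per status change equals the number of friends, so this approach fits small friend lists and small groups better than very large ones.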
|
||
## Step 4. Wrap up | ||
- End-to-end encryption | ||
- Caching messages on the client-side is effective to reduce the data transfer between the client and server. |
Binary file added (BIN, +110 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-11.png
Binary file added (BIN, +131 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-12.png
Binary file added (BIN, +100 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-13.png
Binary file added (BIN, +118 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-14.png
Binary file added (BIN, +106 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-15.png
Binary file added (BIN, +54.9 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-16.png
Binary file added (BIN, +58.8 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-17.png
Binary file added (BIN, +75.2 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-18.png
Binary file added (BIN, +126 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-19.png
Binary file added (BIN, +46 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-2.png
Binary file added (BIN, +115 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-3.png
Binary file added (BIN, +78.8 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-4.png
Binary file added (BIN, +80 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-5.png
Binary file added (BIN, +217 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-7.png
Binary file added (BIN, +219 KB): 06-system-design-interview/12-design-a-chat-system/donggu/figure12-8.png
109 changes: 109 additions & 0 deletions
...a-search-autocomplete-system/donggu-ch13-design-a-search-autocomplete-system.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
# Chapter 13. Design a search autocomplete system | ||
|
||
## Step 1. Understand the problem and establish design scope | ||
- Is the matching only supported at the beginning of a search query or in the middle as well? | ||
- How many autocomplete suggestions should the system return? | ||
- How does the system know which 5 suggestions to return? | ||
- Does the system support spell check? | ||
- Are search queries in English? | ||
- Do we allow capitalization and special characters? | ||
- How many users use the product? | ||
|
||
### Requirements | ||
- Fast response time. | ||
- Relevant. | ||
- Sorted. | ||
- Highly available. | ||
|
||
### Back of the envelope estimation. | ||
- Assume 10 million daily active users (DAU) | ||
- An average person performs 10 searches per day. | ||
- 20 bytes of data per query string: | ||
- Assume we use ASCII character encoding, 1 character = 1 byte | ||
- Assume a query contains 4 words, and each word contains 5 characters on average | ||
- That is 4 x 5 = 20 bytes per query. | ||
- ~24,000 queries per second (QPS) = 10,000,000 users * 10 queries / day * 20 characters / 24 hours / 3600 seconds. A request is sent for every character typed, hence the factor of 20. | ||
- Peak QPS = QPS * 2 = ~48,000 | ||
- Assume 20% of the daily queries are new. 10 million * 10 queries / day * 20 bytes per query * 20% = 0.4 GB. This means 0.4 GB of new data is added to storage daily. | ||
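
The arithmetic in this estimate can be checked with a few lines. The variable names are illustrative; the inputs are the assumptions listed above.

```python
DAU = 10_000_000
SEARCHES_PER_USER_PER_DAY = 10
BYTES_PER_QUERY = 20          # 4 words x 5 ASCII characters
SECONDS_PER_DAY = 24 * 3600
NEW_QUERY_PERCENT = 20

# One request is sent per typed character, so each search produces
# BYTES_PER_QUERY requests.
requests_per_day = DAU * SEARCHES_PER_USER_PER_DAY * BYTES_PER_QUERY
average_qps = requests_per_day / SECONDS_PER_DAY
peak_qps = 2 * average_qps

# New data per day: 20% of the daily query bytes.
new_bytes_per_day = (
    DAU * SEARCHES_PER_USER_PER_DAY * BYTES_PER_QUERY * NEW_QUERY_PERCENT // 100
)
new_gb_per_day = new_bytes_per_day / 10**9
```

The exact average is about 23,148 QPS, which the estimate rounds to ~24,000. The daily growth is 400,000,000 bytes, that is 0.4 GB.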
|
||
## Step 2. Propose high-level design and get buy-in | ||
### Data gathering service | ||
- Frequency table\ | ||
![figure13-2.png](donggu/figure13-2.png) | ||
|
||
### Query service | ||
- Query and frequency | ||
|
||
## Step 3. Design deep dive | ||
### Trie data structure | ||
- Fetching the top 5 search queries from a relational database is inefficient. | ||
- The trie (prefix tree) data structure is used to overcome this problem. | ||
- The main idea of trie consists of the following: | ||
* A trie is a tree-like data structure. | ||
* The root represents an empty string. | ||
* Each node stores a character and has 26 children, one for each possible character. | ||
* Each tree node represents a single word or a prefix string | ||
- ![figure13-6.png](donggu/figure13-6.png) | ||
|
||
- **How does autocomplete work with trie?** | ||
``` | ||
p: length of a prefix | ||
n: total number of nodes in a trie | ||
c: number of children of a given node | ||
``` | ||
1. Find the prefix. Time complexity: O(p). | ||
2. Traverse the subtree from the prefix node to get all valid children. A child is valid if it can form a valid query string. Time complexity: O(c) | ||
3. Sort the children and get top k. Time complexity: O(c log c) | ||
|
||
- The time complexity of this algorithm is the sum of time spent on each step mentioned above: O(p) + O(c) + O(c log c) | ||
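
The three steps above can be sketched as a small trie. The class names and the sample frequencies used in the usage note are illustrative assumptions; ties are broken alphabetically, which the notes do not specify.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.frequency = 0  # > 0 only if the path to this node is a full query


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, query, frequency):
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.frequency = frequency

    def top_k(self, prefix, k=5):
        # Step 1: find the prefix node, O(p).
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Step 2: traverse the subtree and collect complete queries, O(c).
        matches = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.frequency > 0:
                matches.append((text, current.frequency))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Step 3: sort by frequency (descending) and keep the top k, O(c log c).
        matches.sort(key=lambda item: (-item[1], item[0]))
        return matches[:k]
```

For example, after inserting `tree` (10), `try` (29), `true` (35), `toy` (14), `wish` (25), and `win` (50), calling `top_k("tr", 2)` returns `true` and `try`, in that order.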
|
||
#### Limit the max length of a prefix | ||
- Users rarely type a long search query into the search box. Thus, it is safe to say p is a small integer number, say 50. If we limit the length of a prefix, the time complexity for "Find the prefix" can be reduced from O(p) to O(small constant), aka O(1) | ||
|
||
#### Cache top search queries at each node | ||
- To avoid traversing the whole trie, we store the top k most frequently used queries at each node. Since 5 to 10 autocomplete suggestions are enough for users, k is a relatively small number. In our specific case, only the top 5 search queries are cached. | ||
- By caching top search queries at every node, we significantly reduce the time complexity to retrieve the top 5 queries. | ||
- ![figure13-8.png](donggu/figure13-8.png) | ||
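
A sketch of the caching optimization, assuming the trie is built offline from a query-to-frequency mapping. `CachedTrieNode`, `build_cached_trie`, and `suggest` are hypothetical names. The build step here is deliberately simple and not memory-efficient.

```python
import heapq


class CachedTrieNode:
    def __init__(self):
        self.children = {}
        self.top = []  # cached (query, frequency) pairs, best first


def build_cached_trie(frequencies, k=5):
    root = CachedTrieNode()
    # Register each query at every node along its path (root included).
    for query, freq in frequencies.items():
        node = root
        node.top.append((query, freq))
        for ch in query:
            node = node.children.setdefault(ch, CachedTrieNode())
            node.top.append((query, freq))
    # Trim every node's candidate list down to its top k.
    stack = [root]
    while stack:
        node = stack.pop()
        node.top = heapq.nsmallest(
            k, node.top, key=lambda item: (-item[1], item[0])
        )
        stack.extend(node.children.values())
    return root


def suggest(root, prefix):
    # Lookup is a walk down the prefix; no subtree traversal or sorting.
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    return list(node.top)
```

The trade-off is extra space at every node in exchange for a lookup that only depends on the prefix length.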
|
||
### Data gathering service | ||
- In our previous design, whenever a user types a search query, data is updated in real-time. This approach is not practical for the following two reasons: | ||
- Users may enter billions of queries per day. Updating the trie on every query significantly slows down the query service. | ||
- Top suggestions may not change much once the trie is built. Thus, it is unnecessary to update the trie frequently. | ||
- ![figure13-9.png](donggu/figure13-9.png) | ||
- **Analytics logs**: It stores raw data about search queries. Logs are append-only and are not indexed. | ||
- **Aggregators**: The size of analytics logs is usually very large, and the data is not in the right format, so aggregators condense the logs into a form the system can process easily. | ||
- **Aggregated data** | ||
- **Worker**: Workers are a set of servers that perform asynchronous jobs at regular intervals. They build the trie data structure and store it in Trie DB. | ||
- **Trie Cache**: Trie cache is a distributed cache system that keeps trie in memory for fast read. It takes a weekly snapshot of the DB. | ||
- **Trie DB**: Trie DB is the persistent storage. | ||
- 1. Document store: Since a new trie is built weekly, we can periodically take a snapshot of it, serialize it, and store the serialized data in the database. | ||
- 2. Key-value store: A trie can be represented in a hash table form by applying the following logic: | ||
- Every prefix in the trie is mapped to a key in a hash table. | ||
- Data on each trie node is mapped to a value in a hash table. | ||
- ![figure13-10.png](donggu/figure13-10.png) | ||
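
The hash table form can be sketched by mapping every prefix to the cached top queries of its trie node. `trie_to_hash_table` is a hypothetical helper that starts from a query-to-frequency mapping rather than from an existing trie.

```python
def trie_to_hash_table(frequencies, k=5):
    """Map each prefix (key) to its top-k (query, frequency) pairs (value)."""
    table = {}
    for query, freq in frequencies.items():
        # Every non-empty prefix of the query is a key.
        for end in range(1, len(query) + 1):
            table.setdefault(query[:end], []).append((query, freq))
    for candidates in table.values():
        candidates.sort(key=lambda item: (-item[1], item[0]))
        del candidates[k:]  # keep only the top k per prefix
    return table
```

A lookup is then a single `table.get(prefix, [])`, which maps directly onto a key-value store read.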
|
||
### Query service | ||
- ![figure13-11.png](donggu/figure13-11.png) | ||
- Query service requires lightning-fast speed: | ||
- AJAX request. For web applications, browsers usually send AJAX requests to fetch autocomplete results. The main benefit of AJAX is that sending/receiving a request/response does not refresh the whole web page. | ||
- Browser caching | ||
- Data sampling | ||
|
||
### Trie operations | ||
- Create | ||
- Update | ||
- Option 1: Update the trie weekly. Once a new trie is created, it replaces the old one. | ||
- Option 2: Update individual trie nodes directly. | ||
- ![figure13-13.png](donggu/figure13-13.png) | ||
- Delete | ||
- We have to remove inappropriate autocomplete suggestions with a filter layer. | ||
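
A filter layer can be as simple as dropping blocked entries before suggestions are returned. The function name and the exact-match, case-insensitive policy are assumptions for illustration; a production filter would likely be more elaborate.

```python
def filter_suggestions(suggestions, blocked_queries):
    """Return suggestions with blocked queries removed (case-insensitive)."""
    blocked = {q.casefold() for q in blocked_queries}
    return [s for s in suggestions if s.casefold() not in blocked]
```

Filtering at read time takes effect immediately, while the blocked entries can be removed from the stored trie asynchronously in the next rebuild.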
|
||
### Scale the storage | ||
- A naive way to shard is based on the first character. | ||
- Analyze the historical data distribution pattern and apply smarter sharding logic. | ||
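
One way to picture the smarter sharding logic is a shard map built from historical query counts per first letter. The greedy balancing heuristic, the function name, and the sample counts in the usage note are assumptions for illustration, not the method described in the notes.

```python
def build_shard_map(letter_counts, shard_count):
    """Assign first letters to shards so that historical load is balanced.

    Greedy heuristic: place the heaviest letter first, always on the
    shard that currently has the least load.
    """
    loads = [0] * shard_count
    shard_map = {}
    ordered = sorted(letter_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for letter, count in ordered:
        target = loads.index(min(loads))
        shard_map[letter] = target
        loads[target] += count
    return shard_map, loads
```

With counts such as `s`: 100 and `u` to `z` adding up to 100, a two-shard map puts `s` alone on one shard and groups the rarer letters on the other, whereas first-character sharding would give each letter its own unevenly loaded shard.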
|
||
## Step 4. Wrap up | ||
- How do you extend your design to support multiple languages? | ||
- Use Unicode characters in trie nodes. | ||
- What if top search queries in one country are different from others? | ||
- Build different tries for different countries. To improve the response time, we can store tries in CDNs. |
Binary file added (BIN, +127 KB): ...-design-interview/13-design-a-search-autocomplete-system/donggu/figure13-10.png
Binary file added (BIN, +125 KB): ...-design-interview/13-design-a-search-autocomplete-system/donggu/figure13-11.png
Binary file added (BIN, +131 KB): ...-design-interview/13-design-a-search-autocomplete-system/donggu/figure13-13.png
Binary file added (BIN, +51.8 KB): ...m-design-interview/13-design-a-search-autocomplete-system/donggu/figure13-2.png
Binary file added (BIN, +63.3 KB): ...m-design-interview/13-design-a-search-autocomplete-system/donggu/figure13-6.png
Binary file added (BIN, +103 KB): ...m-design-interview/13-design-a-search-autocomplete-system/donggu/figure13-7.png
Binary file added (BIN, +130 KB): ...m-design-interview/13-design-a-search-autocomplete-system/donggu/figure13-8.png
Binary file added (BIN, +91.3 KB): ...m-design-interview/13-design-a-search-autocomplete-system/donggu/figure13-9.png