This repository contains the architecture and implementation details for a scalable, low-latency real estate chatbot system. The system is designed to handle user queries related to information retrieval, inventory management, and general inquiries with high accuracy and efficiency.
To ensure that the system can handle a large number of projects (100+) while maintaining low latency and high accuracy, the following design principles and strategies have been implemented:
- SQL DB Design:
  - Partitioning: Data for all 100+ projects is stored in a partitioned SQL database, which keeps query generation and execution efficient.
- Query Caching: Frequently requested queries are stored in a distributed cache (e.g., Redis, Memcached) to reduce database load and response times.
- Document Caching: Commonly accessed documents or metadata are cached, minimizing the need for repeated retrieval from MongoDB or other storage.
- Precomputation: Queries, whether simple or complex, can be precomputed and their results stored.
- Vector Embeddings: Document embeddings are stored in a vector database for efficient similarity searches, including project-specific metadata to narrow down search results quickly.
- Re-ranking: A re-ranking mechanism is implemented to prioritize documents based on relevance, using additional project-specific metadata.
- Personalized Caching: Results are cached at a user or project level, ensuring repeated queries benefit from lower latency.
- Load Balancer
- Microservice architecture
- Groq: Groq is used for fast AI inference, which reduces response time.
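One of the strategies above, narrowing a vector search with project-specific metadata, can be sketched as follows. This is a minimal in-memory illustration, not the repository's implementation: the `Chunk` class, the `project_id` field, the `search` function, and the three-dimensional embeddings are all hypothetical, and a real deployment would delegate filtering and scoring to the vector database.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    project_id: str          # hypothetical metadata field used for filtering
    text: str
    embedding: list[float]   # would come from an embedding model in practice


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(chunks: list[Chunk], query_embedding: list[float],
           project_id: str, top_k: int = 3) -> list[Chunk]:
    # Filter on metadata first so only one project's chunks are scored.
    candidates = [c for c in chunks if c.project_id == project_id]
    # Rank the remaining candidates by similarity to the query.
    ranked = sorted(candidates,
                    key=lambda c: cosine(c.embedding, query_embedding),
                    reverse=True)
    return ranked[:top_k]


# Made-up data: two projects with near-identical content.
chunks = [
    Chunk("project-a", "2 BHK units, tower 1", [0.9, 0.1, 0.0]),
    Chunk("project-a", "Clubhouse amenities", [0.1, 0.9, 0.0]),
    Chunk("project-b", "2 BHK units, block C", [0.9, 0.1, 0.0]),
]
results = search(chunks, [1.0, 0.0, 0.0], "project-a", top_k=1)
```

Filtering before scoring means the cost of a query grows with the size of one project rather than with the total number of projects.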
In software development, certain principles are the bedrock of code that is not only functional but also clean, maintainable, and efficient. Please follow these software design principles:
- Keep It Simple, Stupid (KISS)
- Don't Repeat Yourself (DRY)
- You Aren't Gonna Need It (YAGNI)
- Encapsulate What Varies
- Program to an Interface, Not an Implementation
- Favor Composition Over Inheritance
- Strive for Loosely Coupled Designs
- The Law of Demeter
- SOLID Principles
  - Single Responsibility Principle (SRP)
  - Open/Closed Principle (OCP)
  - Liskov Substitution Principle (LSP)
  - Interface Segregation Principle (ISP)
  - Dependency Inversion Principle (DIP)
Exact search, also known as exact match or precise search, looks for results that perfectly match the given query. It returns only items that have an exact correspondence with the search terms.
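A minimal sketch of an exact-match query cache, assuming an in-process dictionary with a time-to-live. The class name `ExactMatchCache`, the TTL value, and the case and whitespace normalisation are illustrative assumptions; a production system would more likely use a distributed cache.

```python
import time
from typing import Optional


class ExactMatchCache:
    """Hypothetical cache that returns a response only for an identical query."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl_seconds = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    @staticmethod
    def _key(query: str) -> str:
        # Assumption: ignore case and repeated whitespace when comparing queries.
        return " ".join(query.lower().split())

    def get(self, query: str) -> Optional[str]:
        key = self._key(query)
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if time.monotonic() - stored_at > self.ttl_seconds:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return response

    def set(self, query: str, response: str) -> None:
        self._store[self._key(query)] = (time.monotonic(), response)


cache = ExactMatchCache()
cache.set("List of available flats", "12 flats available")
hit = cache.get("list of  available flats")   # same query after normalisation
miss = cache.get("List of sold flats")        # different query, so no entry
```

A lookup is a single dictionary access, but any rewording of the query results in a miss.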
Fuzzy search is a technique that finds approximate matches to search terms. It allows for minor differences or errors in the query, such as misspellings or slight variations, and still returns relevant results.
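A minimal sketch of a fuzzy cache lookup, using `difflib.SequenceMatcher` from the Python standard library as the similarity measure. The class name `FuzzyCache` and the 0.9 threshold are assumptions chosen for illustration, and the source does not say which fuzzy-matching algorithm the system uses.

```python
from difflib import SequenceMatcher
from typing import Optional


class FuzzyCache:
    """Hypothetical cache that tolerates small spelling differences in queries."""

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold
        self._entries: list[tuple[str, str]] = []

    def set(self, query: str, response: str) -> None:
        self._entries.append((query.lower(), response))

    def get(self, query: str) -> Optional[str]:
        query = query.lower()
        best_score, best_response = 0.0, None
        # Linear scan: compare the query with every cached query string.
        for cached_query, response in self._entries:
            score = SequenceMatcher(None, query, cached_query).ratio()
            if score > best_score:
                best_score, best_response = score, response
        # Only accept the best candidate if it is similar enough.
        return best_response if best_score >= self.threshold else None


cache = FuzzyCache()
cache.set("list of available properties", "cached inventory response")
hit = cache.get("List of availble properties")   # misspelling still matches
miss = cache.get("price of parking")             # unrelated query
```

The linear scan keeps the sketch short; with many cached queries an index or a dedicated fuzzy-matching library would be a better fit.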
Semantic search caching is a technique where the meaning of queries is analyzed, and similar queries are cached together. This reduces latency by retrieving cached responses for semantically similar queries, rather than re-executing complex search processes.
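A minimal sketch of a semantic cache, assuming queries are compared by cosine similarity of their embeddings. The `toy_embed` function below is a hand-written stand-in for a real embedding model, and the `SemanticCache` class, the concept table, and the 0.9 threshold are all hypothetical.

```python
import math
from typing import Callable, Optional

# Toy "embedding": counts of hand-picked concepts. A real system would call
# an embedding model here instead.
CONCEPTS = ["list", "home", "available", "price"]
SYNONYMS = {
    "list": "list", "show": "list",
    "flats": "home", "apartments": "home",
    "available": "available", "unsold": "available",
    "price": "price", "cost": "price",
}


def toy_embed(text: str) -> list[float]:
    vector = [0.0] * len(CONCEPTS)
    for word in text.lower().split():
        concept = SYNONYMS.get(word)
        if concept is not None:
            vector[CONCEPTS.index(concept)] += 1.0
    return vector


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, embed: Callable[[str], list[float]], threshold: float = 0.9):
        self.embed = embed
        self.threshold = threshold
        self._entries: list[tuple[list[float], str]] = []

    def set(self, query: str, response: str) -> None:
        self._entries.append((self.embed(query), response))

    def get(self, query: str) -> Optional[str]:
        q = self.embed(query)
        # Pick the cached query whose embedding is closest to the new query.
        best = max(self._entries, key=lambda e: cosine(e[0], q), default=None)
        if best is not None and cosine(best[0], q) >= self.threshold:
            return best[1]
        return None


cache = SemanticCache(toy_embed)
cache.set("list available flats", "42 flats are available")
hit = cache.get("show unsold apartments")   # different words, same meaning
miss = cache.get("price of flats")          # different meaning
```

Because the embedding function is passed in, the same cache logic works with any model that maps text to a fixed-length vector.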
Condition | Status | Time | Size |
---|---|---|---|
Initial | 200 OK | 8.84 s | 3.51 KB |
After Exact Query Match Caching | 200 OK | 19 ms | 3.51 KB |
After Fuzzy Search Caching | 200 OK | 161 ms | 223 B |
Condition | Status | Time | Size |
---|---|---|---|
Initial | 200 OK | 3.97 s | 12.59 KB |
After Exact Query Match Caching | 200 OK | 23 ms | 12.59 KB |
After Fuzzy Search Caching | 200 OK | 49 ms | 12.59 KB |
Note: Fuzzy search matching has a flaw: it measures only how similar the words are, not what they mean. For example, "List of properties which are sold" and "List of properties which are not sold" are treated as the same query.
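The flaw can be reproduced with a character-level similarity score. This sketch uses `difflib.SequenceMatcher` from the Python standard library as an assumed stand-in for whichever fuzzy matcher the system uses.

```python
from difflib import SequenceMatcher

sold = "List of properties which are sold"
not_sold = "List of properties which are not sold"

# The two queries differ by a single short word, so a character-level score
# is very high even though the meanings are opposite.
similarity = SequenceMatcher(None, sold.lower(), not_sold.lower()).ratio()
print(f"similarity = {similarity:.2f}")
```

The score comes out above 0.9, so a cache using a typical fuzzy threshold would return the answer for one query in response to the other.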
Condition | Status | Time | Size |
---|---|---|---|
Before Semantic Caching | 200 OK | 3.83 s | 17.08 KB |
After Semantic Caching | 200 OK | 453 ms | 17.08 KB |