Healthcare Assistant RAG Model

Overview

This project is a Healthcare Assistant built using a Retrieval-Augmented Generation (RAG) model. The system answers healthcare questions by retrieving relevant information from the NHS A-Z website and supplying it as context to a Large Language Model (LLM), so that responses stay grounded in NHS content.

The project was developed as part of the MLH "Hack for Hackers" hackathon.

Features

Comprehensive Healthcare Information: The system leverages the NHS A-Z data, which covers a wide range of medical conditions, treatments, self-care advice, and medicines.
Retrieval-Augmented Generation (RAG): Combines a vector similarity search with an LLM to deliver personalized and accurate responses.
Efficient Query Handling: For each query, the system retrieves the most relevant documents from the database to enhance the LLM's output.
Scalable Backend: Powered by MongoDB Atlas for efficient storage and retrieval of embeddings and documents.

System Architecture

Components:

Frontend:
- Accepts user queries and displays responses.
- Communicates with the backend via REST API.
REST API:
- Acts as a bridge between the frontend and the RAG engine.
- Sends user queries to the RAG engine and returns the response to the frontend.
RAG Engine:
- MongoDB Atlas:
  - Stores document embeddings and a vector search index.
  - Performs similarity searches to retrieve the top 5 relevant documents for each query.
- Web Scraper:
  - Extracts data from the NHS A-Z website and stores it in the database.
- LLM:
  - Processes the user query along with the retrieved documents to generate a contextually enriched response grounded in the retrieved NHS content.
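
The retrieval step in the RAG engine can be illustrated with a small in-memory sketch. In the project, MongoDB Atlas performs the similarity search against its vector search index; the code below is only a stand-in that shows the idea of ranking documents by cosine similarity and keeping the top matches. The function names, the `embedding` and `title` fields, and the toy 3-dimensional vectors are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query_embedding, documents, k=5):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_embedding, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (hypothetical); real embeddings have far more dimensions.
docs = [
    {"title": "Asthma", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Migraine", "embedding": [0.0, 1.0, 0.0]},
    {"title": "Hay fever", "embedding": [0.7, 0.3, 0.1]},
]
results = top_k([1.0, 0.0, 0.0], docs, k=2)
print([doc["title"] for doc in results])
```

With `k=5`, as described above, the same function would return up to five documents per query.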

Data Source

The primary data source is the NHS A-Z website, which provides comprehensive healthcare information. The data is scraped and stored in MongoDB Atlas as documents together with their vector embeddings, which allows efficient similarity search.
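
As a rough sketch of how a scraped page might become a stored record, the example below extracts the visible text from HTML and pairs it with an embedding. The project lists BeautifulSoup for scraping; this sketch swaps in the standard-library `html.parser` so it runs without extra dependencies. The record fields (`source_url`, `text`, `embedding`), the sample URL and HTML, and the `fake_embed` placeholder are all assumptions, not the project's actual schema or embedding call.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def build_document(url, html, embed):
    """Build a storable record: source URL, plain text, and its embedding."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return {"source_url": url, "text": text, "embedding": embed(text)}


def fake_embed(text):
    # Placeholder only: a real embedding API call would go here.
    return [float(len(text))]


sample_html = (
    "<html><body><h1>Asthma</h1><script>var x = 1;</script>"
    "<p>Asthma is a common lung condition.</p></body></html>"
)
doc = build_document("https://www.nhs.uk/conditions/asthma/", sample_html, fake_embed)
print(doc["text"])
```

Passing the embedding function in as a parameter keeps the extraction logic testable without network access.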

Technologies Used

Frontend: React, HTML, CSS
Backend: Python, FastAPI, LangChain
Database: MongoDB Atlas
LLM: OpenAI GPT
Web Scraper: Python (BeautifulSoup, Requests)
Embedding Generation: OpenAI embeddings API

Installation and Setup

Clone the repository:
```
git clone <repository-url>
cd nhs-rag
```

Install dependencies for the backend:

```
cd rest
pip install -r requirements.txt
```

Install dependencies for the web scraper:

```
cd scraper
pip install -r requirements.txt
```

Set up environment variables: Create a .env file in the rest directory with the following details:

```
MONGODB_ATLAS_CLUSTER_URI=<your-mongodb-atlas-uri>
OPENAI_API_KEY=<your-openai-api-key>
NHS_URL=https://www.nhs.uk/conditions/
```
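
A minimal sketch of how the backend could check these variables at startup is shown below. It assumes the values are already present in the process environment (for example, loaded from the .env file by a helper such as python-dotenv; whether the project does this is an assumption). The `load_settings` function and `REQUIRED_VARS` tuple are hypothetical names, not part of the project.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = ("MONGODB_ATLAS_CLUSTER_URI", "OPENAI_API_KEY", "NHS_URL")


def load_settings(environ=os.environ):
    """Return the required settings, or raise listing any that are missing."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


# Usage with an explicit mapping, so the example runs without a real .env file.
example_env = {
    "MONGODB_ATLAS_CLUSTER_URI": "<your-mongodb-atlas-uri>",
    "OPENAI_API_KEY": "<your-openai-api-key>",
    "NHS_URL": "https://www.nhs.uk/conditions/",
}
settings = load_settings(example_env)
print(settings["NHS_URL"])
```

Failing early with a list of missing names tends to be easier to debug than a connection error later in the request path.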

Run the web scraper to populate the database:
```
python scraper/scraper.py
```
Start the backend server:
```
fastapi dev main.py
```
Run the frontend (optional):
```
cd frontend
npm install
npm start
```

How It Works

User Query: The user submits a query through the frontend.
Document Retrieval: The REST API sends the query to the RAG engine, which performs a similarity search on the MongoDB database to fetch the top 5 relevant documents.
Augmented Response: The query and retrieved documents are sent to the LLM to generate a contextually enriched response.
Response Delivery: The response is sent back to the frontend and displayed to the user.
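
The four steps above can be sketched as a single function that retrieves documents, builds an augmented prompt, and passes it to the LLM. This is an illustration under stated assumptions, not the project's actual code: `build_prompt`, `answer_query`, the prompt wording, and the two stub functions are hypothetical, and the stubs stand in for the MongoDB Atlas similarity search and the OpenAI call.

```python
def build_prompt(query, documents):
    """Combine the user query with the retrieved documents into one prompt."""
    context = "\n\n".join(
        f"[{i}] {doc['text']}" for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def answer_query(query, retrieve, generate, k=5):
    """Retrieve the top-k documents, augment the query, and call the LLM."""
    documents = retrieve(query, k)
    prompt = build_prompt(query, documents)
    return generate(prompt)


def stub_retrieve(query, k):
    # Stand-in for the vector similarity search.
    corpus = [
        {"text": "Asthma is a common lung condition."},
        {"text": "Hay fever is an allergic reaction to pollen."},
    ]
    return corpus[:k]


def stub_generate(prompt):
    # Stand-in for the LLM call; reports the prompt size instead of answering.
    return f"(LLM response to a prompt of {len(prompt)} characters)"


print(answer_query("What is asthma?", stub_retrieve, stub_generate))
```

Injecting `retrieve` and `generate` as parameters lets the flow be exercised without a database or API key.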

Future Improvements

Enhance the web scraper to update data periodically from the NHS website.
Add multi-language support for a wider audience.
Incorporate additional healthcare datasets to improve the breadth of information.
Optimize the embedding generation and similarity search processes for faster responses.

Project Members

Contributing

Contributions are welcome! Feel free to open issues or submit pull requests to improve the project.

License

This project is licensed under the MIT License.

Acknowledgments

NHS A-Z Website for providing the data.
MLH for organizing the "Hack for Hackers" hackathon.
OpenAI for the LLM and embeddings API.

We hope this Healthcare Assistant helps users make informed healthcare decisions efficiently!
