RESEARCH ASSISTANT APP

Empowering Research, One Query at a Time!


Built with the tools and technologies:

Streamlit · Python · Pydantic


Table of Contents

  • Overview
  • Main Idea
  • Features
  • Project Structure
  • Project Index
  • Getting Started
    • Prerequisites
    • Installation
    • Usage
  • Project Roadmap
  • Contributing
  • License
  • Acknowledgments

Overview

The researchAssistant_App is an open-source project designed to streamline the research process. It serves as a virtual research assistant, capable of conducting interviews, generating questions, and synthesizing information into comprehensive reports. Key features include AI analyst personas, web search capabilities, and a user-friendly interface. This tool is ideal for researchers, students, and professionals seeking efficient and organized data collection and analysis.

  • 🌟 Most Important Features:

    • 🌐 Multi-Agent Architecture using LangGraph:

      • The app employs a multi-agent architecture powered by LangGraph, enabling parallel research across various perspectives.
      • Users can select from a single agent to multiple agents, allowing for flexibility based on the desired depth and breadth of research.
      • This architecture enhances the ability to gather diverse insights simultaneously, making the research process more efficient and comprehensive.
    • 🔍 Tool Calling for LLM Agents:

      • The app includes a tool calling functionality that empowers LLM agents to perform web searches for the most recent documents and information.
      • This feature ensures that agents can access up-to-date resources, enhancing the quality and relevance of the information gathered.
      • Additionally, agents can retrieve results from Wikipedia, providing a reliable source of general knowledge and context for the research topic.
    • 🔗 Managing LangGraph and Streamlit States:

      • Both LangGraph and Streamlit maintain their own state stores, which can complicate the integration and management of data between the two systems.
      • Developing a cohesive solution to synchronize these states is a non-trivial challenge that requires careful consideration of data flow and user interactions.
      • Addressing this issue is crucial for ensuring a seamless user experience and maintaining the integrity of the research process.
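
As a rough illustration of the state-management point above, one common pattern is to compile the LangGraph once, keep it in st.session_state, and reuse a thread id so Streamlit reruns resume the same graph state. This is a minimal sketch with a stub graph, not the app's actual code:

    # Minimal sketch: a stub LangGraph kept in Streamlit's session state so both
    # state stores stay aligned across reruns. The node logic is a placeholder.
    import streamlit as st
    from typing import TypedDict
    from langgraph.graph import StateGraph, START, END
    from langgraph.checkpoint.memory import MemorySaver

    class ResearchState(TypedDict):
        topic: str
        report: str

    def write_report(state: ResearchState) -> dict:
        return {"report": f"Report on {state['topic']}"}  # the real app calls LLM agents here

    if "graph" not in st.session_state:
        builder = StateGraph(ResearchState)
        builder.add_node("write_report", write_report)
        builder.add_edge(START, "write_report")
        builder.add_edge("write_report", END)
        # Compile once with a checkpointer; the thread id ties every rerun to the same graph state.
        st.session_state.graph = builder.compile(checkpointer=MemorySaver())
        st.session_state.thread = {"configurable": {"thread_id": "1"}}

    topic = st.text_input("Research topic")
    if st.button("Run") and topic:
        result = st.session_state.graph.invoke({"topic": topic}, st.session_state.thread)
        st.write(result["report"])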

🎲 A demo version of the app is hosted here.


Main Idea

🎯 Purpose

  • Facilitate comprehensive research on a specific topic from multiple perspectives.
  • Utilize Interview personas to simulate diverse viewpoints during the research process.

🛠️ User Interaction

1. Selection of Interview Agents

  • Users can choose the number of Interview agents to engage for their query.
  • Options range from a single agent to multiple agents, depending on the depth of research desired.

2. Persona Generation

  • The app generates unique personas for each selected agent, reflecting different fields, backgrounds, and expertise.
  • Users can review and customize the generated personas based on their research needs.

3. Feedback Mechanism

  • Users can provide feedback to include additional interviewers from different fields or backgrounds.
  • This allows for a more tailored and relevant research experience.

🗣️ Interview Process

1. Engagement with Expert LLM Agent

  • Once the user is satisfied with the personas, the agents conduct an interview with an expert LLM agent.
  • The LLM agent has access to web search and Wikipedia search tools to provide accurate and up-to-date information (a rough sketch of this tool setup appears after this list).

2. Interview Dynamics

  • The interview continues for a predetermined number of turns, as specified by the user at the beginning.
  • The interview may also conclude earlier if the agents are satisfied with the responses received.
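
To make the tool-calling step concrete, a minimal sketch of an expert agent bound to web and Wikipedia search tools might look like the following. The model name and query are illustrative assumptions; the app's actual wiring lives in utils/interview_utils.py and utils/webSearchTool.py:

    # Sketch: bind Tavily web search and Wikipedia search to the expert LLM.
    from langchain_openai import ChatOpenAI
    from langchain_community.tools.tavily_search import TavilySearchResults
    from langchain_community.tools import WikipediaQueryRun
    from langchain_community.utilities import WikipediaAPIWrapper

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)   # model choice is illustrative
    web_search = TavilySearchResults(max_results=3)        # reads TAVILY_API_KEY from the environment
    wiki_search = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

    # Binding the tools lets the expert decide when to search during an interview turn.
    expert = llm.bind_tools([web_search, wiki_search])
    reply = expert.invoke("What are the newest findings on solid-state batteries?")
    print(reply.tool_calls or reply.content)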

📄 Report Compilation

1. Findings Consolidation

  • After the interviews, the findings from all personas are compiled into a comprehensive report.

2. Structured Report Creation

  • Three specialized agents are assigned to write distinct sections of the report:
    • Introduction: Summarizes the research topic and objectives.
    • Body: Presents detailed findings, insights, and perspectives gathered from the interviews.
    • Conclusion: Offers a synthesis of the research findings and potential implications.
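
As an illustration of how the three writer agents could be prompted, here is a sketch only; the app's real section instructions live in models/constants.py and utils/writer_utils.py, and the prompts below are placeholders:

    # Sketch: three separate LLM calls produce the introduction, body, and conclusion.
    from langchain_openai import ChatOpenAI
    from langchain_core.messages import SystemMessage, HumanMessage

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    def write_section(instructions: str, interview_notes: str) -> str:
        messages = [SystemMessage(content=instructions), HumanMessage(content=interview_notes)]
        return llm.invoke(messages).content

    notes = "..."  # consolidated findings from all interviews
    intro = write_section("Write a short introduction covering the topic and objectives.", notes)
    body = write_section("Present the detailed findings and perspectives from the interviews.", notes)
    conclusion = write_section("Synthesize the findings and their implications.", notes)
    report = "\n\n".join([intro, body, conclusion])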

🌈 User Benefits

  • Gain insights from diverse perspectives, enhancing the depth and breadth of research.
  • Streamlined process for gathering and organizing information efficiently.
  • Customizable experience tailored to individual research needs and preferences.

🚀 Future Enhancements

  • Potential integration of additional data sources and research tools.
  • Continuous improvement of persona generation algorithms for more nuanced perspectives.
  • User feedback will be actively sought to refine and enhance app functionality.

📈 The graph of the app's workflow can be seen below:

[app graph image]


Features

⚙️ Architecture
  • The project is structured around a main entry point app.py which manages the application's state, user input, and interactions with language models and web search tools.
  • The architecture includes a models directory that defines data structures for managing the state of research analysis.
  • Utilities for web searching, question generation, and report writing are leveraged to ensure a structured and efficient process for information gathering and synthesis.
🔩 Code Quality
  • The codebase is written in Python, with a focus on readability and maintainability.
  • It uses pydantic for data validation and typing_extensions for additional typing capabilities, enhancing the code's robustness and clarity.
  • The project follows good practices of error handling.
📄 Documentation
  • The project's primary language is Python.
  • Installation, usage, and test commands are well-documented, providing clear instructions for setting up and running the project.
  • The LICENSE.txt file contains the GNU General Public License (GPL) version 3, outlining the terms and conditions for copying, modifying, and distributing the software.
🔌 Integrations
  • The project integrates with large language models (LLMs) from either OpenAI or Google, as seen in utils/llms.py.
  • It also supports the Tavily search tool for web searching, as seen in utils/webSearchTool.py.
  • User interface elements are managed using Streamlit, a Python library for data apps.
🧩 Modularity
  • The project is highly modular, with separate utilities for web searching, question generation, report writing, and more.
  • It has a models directory that encapsulates data structures for managing the state of research analysis.
  • Each utility module has a specific role, enhancing the code's readability and maintainability.
⚡️ Performance
  • The app is able to check for valid API keys.
  • It is capable of generating research-level answers and citing their sources.

Project Structure

└── researchAssistant_App.git/
    ├── LICENSE.txt
    ├── README.md
    ├── app.py
    ├── models
    │   ├── constants.py
    │   ├── graph.py
    │   ├── models.py
    │   └── states.py
    ├── requirements.txt
    └── utils
        ├── analyst_utils.py
        ├── interview_utils.py
        ├── llms.py
        ├── print_helpers.py
        ├── sidebar.py
        ├── webSearchTool.py
        └── writer_utils.py

Project Index

RESEARCHASSISTANT_APP.GIT/
__root__
LICENSE.txt - The file 'LICENSE.txt' is a crucial part of the project's codebase
- It contains the GNU General Public License (GPL) version 3, which outlines the terms and conditions for copying, modifying, and distributing the software
- This license ensures the software remains free and open-source, allowing users to share and modify the program while maintaining the freedom of software distribution
- It's not directly involved in the functionality or architecture of the software but serves as a legal framework that governs its use and distribution.
app.py - App.py serves as the main entry point for a research assistant application
- It manages the application's state, user input, and interactions with language models and web search tools
- The application allows users to ask questions, receive responses from simulated analysts, provide feedback, and receive a final report
- It also handles API key management and user interface elements such as forms, buttons, and containers.
requirements.txt - Requirements.txt manages the dependencies for the project, specifying the exact versions of the libraries needed
- It ensures consistent environment setup across different stages of the project, including langchain modules, pydantic for data validation, python-dotenv for environment variable management, streamlit for data apps, and typing extensions for additional typing capabilities.
models
states.py - The 'states.py' in the 'models' directory defines data structures for managing the state of research analysis
- It includes classes for generating analysts, managing the research graph, and conducting interviews
- These classes handle tasks such as tracking research topics, managing analyst feedback, and structuring the final report.
graph.py - The 'graph.py' in the 'models' directory constructs a state graph for a research assistant system
- It outlines the flow of operations from creating analysts, conducting interviews, to writing and finalizing reports
- It leverages various utilities for web searching, question generation, and report writing, ensuring a structured and efficient process for information gathering and synthesis.
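
A simplified, linear sketch of the flow this file describes is shown below. The node names mirror the description above but are illustrative, the node bodies are stubs, and the real graph also handles feedback loops and parallel interviews:

    # Sketch: wiring the research flow as a LangGraph StateGraph.
    from typing import TypedDict
    from langgraph.graph import StateGraph, START, END

    class ResearchGraphState(TypedDict):
        topic: str
        final_report: str

    def stub(state: ResearchGraphState) -> dict:
        return {}  # placeholder node body, writes no state updates

    builder = StateGraph(ResearchGraphState)
    for name in ["create_analysts", "human_feedback", "conduct_interview",
                 "write_report", "finalize_report"]:
        builder.add_node(name, stub)

    builder.add_edge(START, "create_analysts")
    builder.add_edge("create_analysts", "human_feedback")
    builder.add_edge("human_feedback", "conduct_interview")
    builder.add_edge("conduct_interview", "write_report")
    builder.add_edge("write_report", "finalize_report")
    builder.add_edge("finalize_report", END)
    graph = builder.compile()
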
constants.py - The 'constants.py' file in the 'models' directory serves as a repository for instructions and guidelines used across the project
- It contains predefined instructions for various tasks such as creating AI analyst personas, conducting interviews, writing search queries, answering questions, and crafting different sections of a report
- These instructions are used to guide the behavior of different components within the system.
models.py - Models.py defines data models for the project, specifically the Analyst, Perspectives, and SearchQuery classes
- These classes represent an analyst's information, a collection of analysts, and a search query respectively
- They are crucial for data validation, serialization, and documentation in the project's architecture.
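
The rough shape of these models, based on the analyst fields displayed elsewhere in the app, might look as follows; field names and descriptions here are assumptions rather than the file's actual code:

    # Sketch of the data models described above.
    from typing import List
    from pydantic import BaseModel, Field

    class Analyst(BaseModel):
        name: str = Field(description="Name of the analyst persona")
        affiliation: str = Field(description="Primary affiliation of the analyst")
        role: str = Field(description="Role of the analyst in the research")
        description: str = Field(description="Focus, concerns, and motives of the analyst")

    class Perspectives(BaseModel):
        analysts: List[Analyst] = Field(description="Analysts covering the research topic")

    class SearchQuery(BaseModel):
        search_query: str = Field(description="Query for web or Wikipedia search")
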
utils
webSearchTool.py - WebSearchTool.py serves as a utility for instantiating a web search tool
- It primarily supports the Tavily search tool, using an API key and a maximum result limit as parameters
- The code also includes error handling for incorrect or missing API keys.
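
A minimal sketch of such a helper is given below; the function name and signature are assumptions that mirror the described behavior, not the file's actual code:

    # Sketch: instantiate the Tavily search tool with an API key and a result limit.
    import os
    from langchain_community.tools.tavily_search import TavilySearchResults

    def get_web_search_tool(api_key: str, max_results: int = 3) -> TavilySearchResults:
        if not api_key:
            raise ValueError("A Tavily API key is required for web search.")
        os.environ["TAVILY_API_KEY"] = api_key  # the tool reads the key from the environment
        return TavilySearchResults(max_results=max_results)
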
llms.py - The llms.py utility instantiates large language models (LLMs) from either OpenAI or Google
- It validates the provided API keys, sets the model parameters, and handles exceptions, ensuring the correct and efficient operation of the chosen LLM within the broader codebase.
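
A minimal sketch of provider selection follows; the helper signature and model names are assumptions, not taken from the file:

    # Sketch: instantiate an LLM from OpenAI or Google after a basic key check.
    from langchain_openai import ChatOpenAI
    from langchain_google_genai import ChatGoogleGenerativeAI

    def get_llm(provider: str, api_key: str, temperature: float = 0):
        if not api_key:
            raise ValueError("An API key is required to instantiate the LLM.")
        if provider == "OpenAI":
            return ChatOpenAI(model="gpt-4o-mini", api_key=api_key, temperature=temperature)
        if provider == "Google":
            return ChatGoogleGenerativeAI(model="gemini-1.5-flash",
                                          google_api_key=api_key, temperature=temperature)
        raise ValueError(f"Unsupported provider: {provider}")
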
analyst_utils.py - Analyst_utils.py contributes to the project by managing the creation of analysts and handling human feedback within the GenerateAnalystsState
- It enforces structured output, generates system messages, and determines the next node to execute based on the presence of human feedback
- It also plays a role in ending the process when necessary.
interview_utils.py - The 'interview_utils.py' module in the project serves as a utility for conducting simulated interviews
- It generates questions, searches the web and Wikipedia for relevant information, formulates answers, and saves the interview data
- It also routes messages between the question and answer phases and initiates all interviews in parallel using the Send API.
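
The parallel fan-out via LangGraph's Send API might look roughly like this; the state keys and node names are illustrative:

    # Sketch: start one interview branch per analyst, all run in parallel by LangGraph.
    from langgraph.constants import Send

    def initiate_all_interviews(state: dict):
        return [
            Send("conduct_interview", {"analyst": analyst, "topic": state["topic"]})
            for analyst in state["analysts"]
        ]

    # Used as a conditional edge, e.g.:
    # builder.add_conditional_edges("human_feedback", initiate_all_interviews, ["conduct_interview"])
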
writer_utils.py - Writer_utils.py is a utility module in the project that focuses on generating various sections of a research report
- It includes functions to write individual sections, an introduction, a conclusion, and a final report based on the state of the research graph
- The module interacts with the language model to generate content and format it into a cohesive report.
sidebar.py - The 'sidebar.py' module creates the user interface sidebar for the Streamlit web application
- It allows users to select and configure large language models (LLMs) and web search tools from providers like OpenAI and Tavily
- The module also handles the input of API keys and other parameters for these services.
print_helpers.py - Print_helpers.py, located in the utils directory, serves as a utility module for displaying analyst data
- It empties a given container and populates it with formatted information about each analyst, including their name, affiliation, role, and description
- A separate function for Streamlit integration is also defined but currently unimplemented.

Getting Started

Prerequisites

Before getting started with researchAssistant_App, ensure your runtime environment meets the following requirements:

  • Programming Language: Python
  • Package Manager: Pip

Installation

Install researchAssistant_App from source:

  1. Clone the repository:

❯ git clone https://github.com/jeet-ss/researchAssistant_App.git

  2. Navigate to the project directory:

❯ cd researchAssistant_App

  3. Install the project dependencies using pip:

❯ pip install -r requirements.txt

Usage

Run researchAssistant_App with Streamlit:

❯ streamlit run app.py
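
API keys can be entered in the app's sidebar at runtime. Since python-dotenv is among the dependencies, a local .env file along these lines should also work; the exact variable names are assumptions, not taken from the app's code:

    OPENAI_API_KEY=your-openai-key
    GOOGLE_API_KEY=your-google-key
    TAVILY_API_KEY=your-tavily-key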

Project Roadmap

  • Task 1: Implement Interview based Agents.
  • Task 2: Implement Query Generator and Answer agent.

Contributing

  • 💬 Join the Discussions: Share your insights, provide feedback, or ask questions.
  • 🐛 Report Issues: Submit bugs found or log feature requests for the researchAssistant_App project.
  • 💡 Submit Pull Requests: Review open PRs, and submit your own PRs.
Contributing Guidelines
  1. Fork the Repository: Start by forking the project repository to your GitHub account.
  2. Clone Locally: Clone the forked repository to your local machine using a git client.
    git clone https://github.com/jeet-ss/researchAssistant_App.git
  3. Create a New Branch: Always work on a new branch, giving it a descriptive name.
    git checkout -b new-feature-x
  4. Make Your Changes: Develop and test your changes locally.
  5. Commit Your Changes: Commit with a clear message describing your updates.
    git commit -m 'Implemented new feature x.'
  6. Push to GitHub: Push the changes to your forked repository.
    git push origin new-feature-x
  7. Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
  8. Review: Once your PR is reviewed and approved, it will be merged into the main branch. Congratulations on your contribution!
Contributor Graph


License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0). For more details, refer to the LICENSE.txt file.


Acknowledgments

  • List any resources, contributors, inspiration, etc. here.
