RESEARCH ASSISTANT APP

Empowering Research, One Query at a Time!


Built with the tools and technologies:

Streamlit · Python · Pydantic


Table of Contents

  • Overview
  • Main Idea
  • Features
  • Project Structure
  • Project Index
  • Getting Started
    • Prerequisites
    • Installation
    • Usage
  • Project Roadmap
  • Contributing
  • License
  • Acknowledgments

Overview

The researchAssistant_App is an open-source project designed to streamline the research process. It serves as a virtual research assistant, capable of conducting interviews, generating questions, and synthesizing information into comprehensive reports. Key features include AI analyst personas, web search capabilities, and a user-friendly interface. This tool is ideal for researchers, students, and professionals seeking efficient and organized data collection and analysis.

  • 🌟 Most Important Features:

    • 🌐 Multi-Agent Architecture using LangGraph:

      • The app employs a multi-agent architecture powered by LangGraph, enabling parallel research across various perspectives.
      • Users can select from a single agent to multiple agents, allowing for flexibility based on the desired depth and breadth of research.
      • This architecture enhances the ability to gather diverse insights simultaneously, making the research process more efficient and comprehensive.
    • 🔍 Tool Calling for LLM Agents:

      • The app includes a tool calling functionality that empowers LLM agents to perform web searches for the most recent documents and information.
      • This feature ensures that agents can access up-to-date resources, enhancing the quality and relevance of the information gathered.
      • Additionally, agents can retrieve results from Wikipedia, providing a reliable source of general knowledge and context for the research topic.
    • 🔗 Managing LangGraph and Streamlit States:

      • Both LangGraph and Streamlit maintain their own state stores, which can complicate the integration and management of data between the two systems.
      • Developing a cohesive solution to synchronize these states is a non-trivial challenge that requires careful consideration of data flow and user interactions.
      • Addressing this issue is crucial for ensuring a seamless user experience and maintaining the integrity of the research process.
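
As a rough illustration of the state-management point above, one common pattern is to compile the LangGraph once, keep it in st.session_state, and reuse a thread id so Streamlit reruns resume the same graph state. This is a minimal sketch with a stub graph, not the app's actual code:

    # Minimal sketch: a stub LangGraph kept in Streamlit's session state so both
    # state stores stay aligned across reruns. The node logic is a placeholder.
    import streamlit as st
    from typing import TypedDict
    from langgraph.graph import StateGraph, START, END
    from langgraph.checkpoint.memory import MemorySaver

    class ResearchState(TypedDict):
        topic: str
        report: str

    def write_report(state: ResearchState) -> dict:
        return {"report": f"Report on {state['topic']}"}  # the real app calls LLM agents here

    if "graph" not in st.session_state:
        builder = StateGraph(ResearchState)
        builder.add_node("write_report", write_report)
        builder.add_edge(START, "write_report")
        builder.add_edge("write_report", END)
        # Compile once with a checkpointer; the thread id ties every rerun to the same graph state.
        st.session_state.graph = builder.compile(checkpointer=MemorySaver())
        st.session_state.thread = {"configurable": {"thread_id": "1"}}

    topic = st.text_input("Research topic")
    if st.button("Run") and topic:
        result = st.session_state.graph.invoke({"topic": topic}, st.session_state.thread)
        st.write(result["report"])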

🎲 A demo version of the app is hosted here.


Main Idea

🎯 Purpose

  • Facilitate comprehensive research on a specific topic from multiple perspectives.
  • Utilize Interview personas to simulate diverse viewpoints during the research process.

🛠️ User Interaction

1. Selection of Interview Agents

  • Users can choose the number of Interview agents to engage for their query.
  • Options range from a single agent to multiple agents, depending on the depth of research desired.

2. Persona Generation

  • The app generates unique personas for each selected agent, reflecting different fields, backgrounds, and expertise.
  • Users can review and customize the generated personas based on their research needs.

3. Feedback Mechanism

  • Users can provide feedback to include additional interviewers from different fields or backgrounds.
  • This allows for a more tailored and relevant research experience.

🗣️ Interview Process

1. Engagement with Expert LLM Agent

  • Once the user is satisfied with the personas, the agents conduct an interview with an expert LLM agent.
  • The LLM agent has access to web search and Wikipedia search tools to provide accurate and up-to-date information (a rough sketch of this tool setup appears after this list).

2. Interview Dynamics

  • The interview continues for a predetermined number of turns, as specified by the user at the beginning.
  • The interview may also conclude earlier if the agents are satisfied with the responses received.
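
To make the tool-calling step concrete, a minimal sketch of an expert agent bound to web and Wikipedia search tools might look like the following. The model name and query are illustrative assumptions; the app's actual wiring lives in utils/interview_utils.py and utils/webSearchTool.py:

    # Sketch: bind Tavily web search and Wikipedia search to the expert LLM.
    from langchain_openai import ChatOpenAI
    from langchain_community.tools.tavily_search import TavilySearchResults
    from langchain_community.tools import WikipediaQueryRun
    from langchain_community.utilities import WikipediaAPIWrapper

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)   # model choice is illustrative
    web_search = TavilySearchResults(max_results=3)        # reads TAVILY_API_KEY from the environment
    wiki_search = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

    # Binding the tools lets the expert decide when to search during an interview turn.
    expert = llm.bind_tools([web_search, wiki_search])
    reply = expert.invoke("What are the newest findings on solid-state batteries?")
    print(reply.tool_calls or reply.content)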

📄 Report Compilation

1. Findings Consolidation

  • After the interviews, the findings from all personas are compiled into a comprehensive report.

2. Structured Report Creation

  • Three specialized agents are assigned to write distinct sections of the report:
    • Introduction: Summarizes the research topic and objectives.
    • Body: Presents detailed findings, insights, and perspectives gathered from the interviews.
    • Conclusion: Offers a synthesis of the research findings and potential implications.
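
As an illustration of how the three writer agents could be prompted, here is a sketch only; the app's real section instructions live in models/constants.py and utils/writer_utils.py, and the prompts below are placeholders:

    # Sketch: three separate LLM calls produce the introduction, body, and conclusion.
    from langchain_openai import ChatOpenAI
    from langchain_core.messages import SystemMessage, HumanMessage

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    def write_section(instructions: str, interview_notes: str) -> str:
        messages = [SystemMessage(content=instructions), HumanMessage(content=interview_notes)]
        return llm.invoke(messages).content

    notes = "..."  # consolidated findings from all interviews
    intro = write_section("Write a short introduction covering the topic and objectives.", notes)
    body = write_section("Present the detailed findings and perspectives from the interviews.", notes)
    conclusion = write_section("Synthesize the findings and their implications.", notes)
    report = "\n\n".join([intro, body, conclusion])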

🌈 User Benefits

  • Gain insights from diverse perspectives, enhancing the depth and breadth of research.
  • Streamlined process for gathering and organizing information efficiently.
  • Customizable experience tailored to individual research needs and preferences.

🚀 Future Enhancements

  • Potential integration of additional data sources and research tools.
  • Continuous improvement of persona generation algorithms for more nuanced perspectives.
  • User feedback will be actively sought to refine and enhance app functionality.

📈 The graph of the app's workflow can be seen below:

[app graph image]


Features

⚙️ Architecture
  • The project is structured around a main entry point app.py which manages the application's state, user input, and interactions with language models and web search tools.
  • The architecture includes a models directory that defines data structures for managing the state of research analysis.
  • Utilities for web searching, question generation, and report writing are leveraged to ensure a structured and efficient process for information gathering and synthesis.
🔩 Code Quality
  • The codebase is written in Python, with a focus on readability and maintainability.
  • It uses pydantic for data validation and typing_extensions for additional typing capabilities, enhancing the code's robustness and clarity.
  • The project follows good practices of error handling.
📄 Documentation
  • The project's primary language is Python.
  • Installation, usage, and test commands are well-documented, providing clear instructions for setting up and running the project.
  • The LICENSE.txt file contains the GNU General Public License (GPL) version 3, outlining the terms and conditions for copying, modifying, and distributing the software.
🔌 Integrations
  • The project integrates with large language models (LLMs) from either OpenAI or Google, as seen in utils/llms.py.
  • It also supports the Tavily search tool for web searching, as seen in utils/webSearchTool.py.
  • User interface elements are managed using Streamlit, a Python library for data apps.
🧩 Modularity
  • The project is highly modular, with separate utilities for web searching, question generation, report writing, and more.
  • It has a models directory that encapsulates data structures for managing the state of research analysis.
  • Each utility module has a specific role, enhancing the code's readability and maintainability.
⚡️ Performance
  • The app is able to check for valid API keys.
  • It is capable of generating research-level answers and citing their sources.

Project Structure

└── researchAssistant_App.git/
    ├── LICENSE.txt
    ├── README.md
    ├── app.py
    ├── models
    │   ├── constants.py
    │   ├── graph.py
    │   ├── models.py
    │   └── states.py
    ├── requirements.txt
    └── utils
        ├── analyst_utils.py
        ├── interview_utils.py
        ├── llms.py
        ├── print_helpers.py
        ├── sidebar.py
        ├── webSearchTool.py
        └── writer_utils.py

Project Index

RESEARCHASSISTANT_APP.GIT/
__root__
LICENSE.txt - The file 'LICENSE.txt' is a crucial part of the project's codebase
- It contains the GNU General Public License (GPL) version 3, which outlines the terms and conditions for copying, modifying, and distributing the software
- This license ensures the software remains free and open-source, allowing users to share and modify the program while maintaining the freedom of software distribution
- It's not directly involved in the functionality or architecture of the software but serves as a legal framework that governs its use and distribution.
app.py - App.py serves as the main entry point for a research assistant application
- It manages the application's state, user input, and interactions with language models and web search tools
- The application allows users to ask questions, receive responses from simulated analysts, provide feedback, and receive a final report
- It also handles API key management and user interface elements such as forms, buttons, and containers.
requirements.txt - Requirements.txt manages the dependencies for the project, specifying the exact versions of the libraries needed
- It ensures consistent environment setup across different stages of the project, including langchain modules, pydantic for data validation, python-dotenv for environment variable management, streamlit for data apps, and typing extensions for additional typing capabilities.
models
states.py - The 'states.py' in the 'models' directory defines data structures for managing the state of research analysis
- It includes classes for generating analysts, managing the research graph, and conducting interviews
- These classes handle tasks such as tracking research topics, managing analyst feedback, and structuring the final report.
graph.py - The 'graph.py' in the 'models' directory constructs a state graph for a research assistant system
- It outlines the flow of operations from creating analysts, conducting interviews, to writing and finalizing reports
- It leverages various utilities for web searching, question generation, and report writing, ensuring a structured and efficient process for information gathering and synthesis.
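
A simplified, linear sketch of the flow this file describes is shown below. The node names mirror the description above but are illustrative, the node bodies are stubs, and the real graph also handles feedback loops and parallel interviews:

    # Sketch: wiring the research flow as a LangGraph StateGraph.
    from typing import TypedDict
    from langgraph.graph import StateGraph, START, END

    class ResearchGraphState(TypedDict):
        topic: str
        final_report: str

    def stub(state: ResearchGraphState) -> dict:
        return {}  # placeholder node body, writes no state updates

    builder = StateGraph(ResearchGraphState)
    for name in ["create_analysts", "human_feedback", "conduct_interview",
                 "write_report", "finalize_report"]:
        builder.add_node(name, stub)

    builder.add_edge(START, "create_analysts")
    builder.add_edge("create_analysts", "human_feedback")
    builder.add_edge("human_feedback", "conduct_interview")
    builder.add_edge("conduct_interview", "write_report")
    builder.add_edge("write_report", "finalize_report")
    builder.add_edge("finalize_report", END)
    graph = builder.compile()
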
constants.py - The 'constants.py' file in the 'models' directory serves as a repository for instructions and guidelines used across the project
- It contains predefined instructions for various tasks such as creating AI analyst personas, conducting interviews, writing search queries, answering questions, and crafting different sections of a report
- These instructions are used to guide the behavior of different components within the system.
models.py - Models.py defines data models for the project, specifically the Analyst, Perspectives, and SearchQuery classes
- These classes represent an analyst's information, a collection of analysts, and a search query respectively
- They are crucial for data validation, serialization, and documentation in the project's architecture.
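
The rough shape of these models, based on the analyst fields displayed elsewhere in the app, might look as follows; field names and descriptions here are assumptions rather than the file's actual code:

    # Sketch of the data models described above.
    from typing import List
    from pydantic import BaseModel, Field

    class Analyst(BaseModel):
        name: str = Field(description="Name of the analyst persona")
        affiliation: str = Field(description="Primary affiliation of the analyst")
        role: str = Field(description="Role of the analyst in the research")
        description: str = Field(description="Focus, concerns, and motives of the analyst")

    class Perspectives(BaseModel):
        analysts: List[Analyst] = Field(description="Analysts covering the research topic")

    class SearchQuery(BaseModel):
        search_query: str = Field(description="Query for web or Wikipedia search")
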
utils
webSearchTool.py - WebSearchTool.py serves as a utility for instantiating a web search tool
- It primarily supports the Tavily search tool, using an API key and a maximum result limit as parameters
- The code also includes error handling for incorrect or missing API keys.
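
A minimal sketch of such a helper is given below; the function name and signature are assumptions that mirror the described behavior, not the file's actual code:

    # Sketch: instantiate the Tavily search tool with an API key and a result limit.
    import os
    from langchain_community.tools.tavily_search import TavilySearchResults

    def get_web_search_tool(api_key: str, max_results: int = 3) -> TavilySearchResults:
        if not api_key:
            raise ValueError("A Tavily API key is required for web search.")
        os.environ["TAVILY_API_KEY"] = api_key  # the tool reads the key from the environment
        return TavilySearchResults(max_results=max_results)
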
llms.py - The llms.py utility instantiates large language models (LLMs) from either OpenAI or Google
- It validates the provided API keys, sets the model parameters, and handles exceptions, ensuring the correct and efficient operation of the chosen LLM within the broader codebase.
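
A minimal sketch of provider selection follows; the helper signature and model names are assumptions, not taken from the file:

    # Sketch: instantiate an LLM from OpenAI or Google after a basic key check.
    from langchain_openai import ChatOpenAI
    from langchain_google_genai import ChatGoogleGenerativeAI

    def get_llm(provider: str, api_key: str, temperature: float = 0):
        if not api_key:
            raise ValueError("An API key is required to instantiate the LLM.")
        if provider == "OpenAI":
            return ChatOpenAI(model="gpt-4o-mini", api_key=api_key, temperature=temperature)
        if provider == "Google":
            return ChatGoogleGenerativeAI(model="gemini-1.5-flash",
                                          google_api_key=api_key, temperature=temperature)
        raise ValueError(f"Unsupported provider: {provider}")
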
analyst_utils.py - Analyst_utils.py contributes to the project by managing the creation of analysts and handling human feedback within the GenerateAnalystsState
- It enforces structured output, generates system messages, and determines the next node to execute based on the presence of human feedback
- It also plays a role in ending the process when necessary.
interview_utils.py - The 'interview_utils.py' module in the project serves as a utility for conducting simulated interviews
- It generates questions, searches the web and Wikipedia for relevant information, formulates answers, and saves the interview data
- It also routes messages between the question and answer phases and initiates all interviews in parallel using the Send API.
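
The parallel fan-out via LangGraph's Send API might look roughly like this; the state keys and node names are illustrative:

    # Sketch: start one interview branch per analyst, all run in parallel by LangGraph.
    from langgraph.constants import Send

    def initiate_all_interviews(state: dict):
        return [
            Send("conduct_interview", {"analyst": analyst, "topic": state["topic"]})
            for analyst in state["analysts"]
        ]

    # Used as a conditional edge, e.g.:
    # builder.add_conditional_edges("human_feedback", initiate_all_interviews, ["conduct_interview"])
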
writer_utils.py - Writer_utils.py is a utility module in the project that focuses on generating various sections of a research report
- It includes functions to write individual sections, an introduction, a conclusion, and a final report based on the state of the research graph
- The module interacts with the language model to generate content and format it into a cohesive report.
sidebar.py - The 'sidebar.py' module creates the user interface sidebar for the Streamlit web application
- It allows users to select and configure large language models (LLMs) and web search tools from providers like OpenAI and Tavily
- The module also handles the input of API keys and other parameters for these services.
print_helpers.py - Print_helpers.py, located in the utils directory, serves as a utility module for displaying analyst data
- It empties a given container and populates it with formatted information about each analyst, including their name, affiliation, role, and description
- A separate function for Streamlit integration is also defined but currently unimplemented.

Getting Started

Prerequisites

Before getting started with researchAssistant_App, ensure your runtime environment meets the following requirements:

  • Programming Language: Python
  • Package Manager: Pip

Installation

Install researchAssistant_App from source:

  1. Clone the repository:

❯ git clone https://github.com/jeet-ss/researchAssistant_App.git

  2. Navigate to the project directory:

❯ cd researchAssistant_App

  3. Install the project dependencies using pip:

❯ pip install -r requirements.txt

Usage

Run researchAssistant_App with Streamlit:

❯ streamlit run app.py
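
API keys can be entered in the app's sidebar at runtime. Since python-dotenv is among the dependencies, a local .env file along these lines should also work; the exact variable names are assumptions, not taken from the app's code:

    OPENAI_API_KEY=your-openai-key
    GOOGLE_API_KEY=your-google-key
    TAVILY_API_KEY=your-tavily-key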

Project Roadmap

  • Task 1: Implement Interview based Agents.
  • Task 2: Implement Query Generator and Answer agent.

Contributing

  • 💬 Join the Discussions: Share your insights, provide feedback, or ask questions.
  • 🐛 Report Issues: Submit bugs found or log feature requests for the researchAssistant_App project.
  • 💡 Submit Pull Requests: Review open PRs, and submit your own PRs.
Contributing Guidelines
  1. Fork the Repository: Start by forking the project repository to your GitHub account.
  2. Clone Locally: Clone the forked repository to your local machine using a git client.
    git clone https://github.com/jeet-ss/researchAssistant_App.git
  3. Create a New Branch: Always work on a new branch, giving it a descriptive name.
    git checkout -b new-feature-x
  4. Make Your Changes: Develop and test your changes locally.
  5. Commit Your Changes: Commit with a clear message describing your updates.
    git commit -m 'Implemented new feature x.'
  6. Push to GitHub: Push the changes to your forked repository.
    git push origin new-feature-x
  7. Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
  8. Review: Once your PR is reviewed and approved, it will be merged into the main branch. Congratulations on your contribution!
Contributor Graph


License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0). For more details, refer to the LICENSE.txt file.


Acknowledgments

  • List any resources, contributors, inspiration, etc. here.
