GPT-Researcher Tools for LangChain

Introduction
- Key Features
Installation and Setup
Usage Examples
- LocalGPTResearcher Example
- WebGPTResearcher Example
Chaining with Other Components and Agentic Systems
- Using AgentExecutor with WebGPTResearcher
- Simple Sequential Chaining of WebGPTResearcher
Building from Base Class
Performance Considerations
Links and References
Contribution Guide

Introduction

The LocalGPTResearcher and WebGPTResearcher tools conduct in-depth research on a specific topic or query and use GPT models to generate detailed reports. LocalGPTResearcher draws on local data files, while WebGPTResearcher retrieves information from the web.

Key Features

🔬 The LocalGPTResearcher can work with various local file formats such as PDF, Word documents, CSVs, and more.
🛜 The WebGPTResearcher fetches data directly from the internet, making it suitable for up-to-date information gathering.
📝 Generates research, outline, resource, and lesson reports from local documents and web sources
📜 Can generate long and detailed research reports (over 2K words)
🌐 Aggregates over 20 web sources per research to form objective and factual conclusions
🖥️ Includes an easy-to-use web interface (HTML/CSS/JS)
🔍 Scrapes web sources with JavaScript support
📂 Tracks the web sources it has visited and used, and keeps their context
📄 Export research reports to PDF, Word and more...

Installation and Setup

Prerequisites

Ensure you have Python 3 installed on your system.

Installation

Install the necessary packages using pip:

pip install gpt-researcher

Environment Variables

For LocalGPTResearcher, you need to set the following environment variables:

export DOC_PATH=/path/to/your/documents
export OPENAI_API_KEY=your-openai-api-key
export TAVILY_API_KEY=your-tavily-api-key

For WebGPTResearcher, only the OPENAI_API_KEY and TAVILY_API_KEY are required:

export OPENAI_API_KEY=your-openai-api-key
export TAVILY_API_KEY=your-tavily-api-key
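
Before constructing either tool, it can help to fail fast when a variable is missing. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` and the two tuples are hypothetical and are not part of gpt-researcher or LangChain.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Variable names taken from the setup instructions above.
WEB_VARS = ("OPENAI_API_KEY", "TAVILY_API_KEY")
LOCAL_VARS = WEB_VARS + ("DOC_PATH",)


def missing_env_vars(
    required: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(LOCAL_VARS)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```

Passing `WEB_VARS` instead of `LOCAL_VARS` checks only the two keys that WebGPTResearcher needs.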

Usage Examples

LocalGPTResearcher Example

This example demonstrates how to use LocalGPTResearcher to generate a report based on local documents.

from libs.community.langchain_community.tools.gpt_researcher.tool import LocalGPTResearcher  # This will be changed after successful PR

# Initialize the tool
researcher_local = LocalGPTResearcher(report_type="research_report")
# You can also define it as `researcher_local = LocalGPTResearcher()` - default report_type is research_report.

# Run a query
query = "What do the demographics of Apple Inc. look like?"
report = researcher_local.invoke({"query":query})

print("Generated Report:", report)

WebGPTResearcher Example

This example shows how to use WebGPTResearcher to generate a report based on web data.

from libs.community.langchain_community.tools.gpt_researcher.tool import WebGPTResearcher  # This will be changed after successful PR

# Initialize the tool
researcher_web = WebGPTResearcher(report_type="research_report") # report_type="research_report" is optional as the default value is `research_report`

# Run a query
query = "What are the latest advancements in AI?"
report = researcher_web.invoke({"query":query})

print("Generated Report:", report)
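
Printing is fine for a quick check, but a long report is usually easier to read from a file. The sketch below assumes the value returned by `invoke` is a plain string, as the examples above treat it; `save_report` is a hypothetical helper, not part of the tools.

```python
import re
from pathlib import Path


def save_report(report: str, query: str, out_dir: str = "reports") -> Path:
    """Write the report to <out_dir>/<slug-of-query>.md and return the path."""
    # Build a filesystem-friendly file name from the query text.
    slug = re.sub(r"[^a-z0-9]+", "-", query.lower()).strip("-") or "report"
    path = Path(out_dir) / f"{slug[:60].rstrip('-')}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(report, encoding="utf-8")
    return path
```

With the example above, `save_report(report, query)` would write the report under `reports/` using a file name derived from the query.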

Chaining with Other Components and Agentic Systems

Example: Using `AgentExecutor` with `WebGPTResearcher`

Let us see how to build an AgentExecutor wrapper that uses an LLM and our tool to write an essay and provide a citation/signature at the end of the report.

from libs.community.langchain_community.tools.gpt_researcher.tool import WebGPTResearcher  # This will be changed after successful PR

from langchain import hub
from langchain_core.tools import Tool
from langchain.agents import AgentExecutor, create_react_agent
from langchain_openai import ChatOpenAI


# Use the WebGPTResearcher tool with AgentExecutor to carry out a task that requires decision making:
# 1. Build a ReAct agent, which decides its next action by reasoning.
# 2. Give the agent two tools: WebGPTResearcher and a dummy tool that returns a signature to append to the text.
# 3. Wrap the agent and tools in an AgentExecutor and ask the question.
# Expected result: a long research report that ends with the signature.


# Plain function that returns a citation/signature; it is wrapped in a Tool below.
def citation_provider(text: str) -> str:
    """Provides a citation or signature"""
    return "\n- Written by GPT-Makesh\nThanks for reading!\n"


# Create the WebGPTResearcher tool
researcher_web = WebGPTResearcher(report_type="research_report")

# Initialize tools and components
tools = [
    researcher_web,
    Tool(
        name="citation_tool",
        func=citation_provider,
        description="Useful for when you need to add citation or signature at the end of text",
    ),
]

# Create an LLM
llm = ChatOpenAI(model="gpt-4o")
prompt = hub.pull("hwchase17/react")

# Create the ReAct agent using the create_react_agent function
agent = create_react_agent(
    llm=llm,
    tools=tools,
    prompt=prompt,
    stop_sequence=True,
)

# Wrap the components inside an AgentExecutor
agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)

# Run the agent
question = "What are the recent advancements in AI? Provide a citation for your report too."
response = agent_executor.invoke({"input": question})
print("Agent Response:", response)
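
`AgentExecutor.invoke` returns a dictionary that contains the original inputs plus an `output` entry, so printing the whole response shows more than the report. The sketch below pulls out just the final text; `final_answer` is a hypothetical helper, and `example_response` is a hand-written stand-in rather than real agent output.

```python
from typing import Any, Mapping


def final_answer(response: Mapping[str, Any]) -> str:
    """Return the agent's final text from an AgentExecutor-style result dict."""
    output = response.get("output")
    if output is None:
        raise KeyError("response has no 'output' entry")
    return str(output).strip()


# Stand-in for a real result; an actual run would come from agent_executor.invoke(...).
example_response = {
    "input": "What are the recent advancements in AI?",
    "output": "Report text...\n- Written by GPT-Makesh\nThanks for reading!\n",
}
print(final_answer(example_response))
```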

Example: Simple Sequential Chaining of `WebGPTResearcher`

Let us build a chain of runnables in which a researcher writes a report and a grader then scores it.

from libs.community.langchain_community.tools.gpt_researcher.tool import WebGPTResearcher

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda
from langchain_openai import ChatOpenAI

# Use WebGPTResearcher to write an essay and an LLM to grade it, chained with LCEL (LangChain Expression Language):
# 1. Use the researcher to write an essay
# 2. Pass the essay through a chat prompt template (a runnable) to a grader that scores it
# 3. Parse the output as a string


# Create a ChatOpenAI model
grader = ChatOpenAI(model="gpt-4o")
researcher_tool = WebGPTResearcher()
prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an essay grader. Give a brief score out of 10."),
        ("human", "The essay: {essay}"),
    ]
)

# Wrap the WebGPTResearcher tool in a RunnableLambda.
# The report is returned under the "essay" key that the prompt template expects.
researcher = RunnableLambda(lambda x: {"essay": researcher_tool.invoke(x)})

# Create the combined chain using LangChain Expression Language (LCEL)
chain = researcher | prompt_template | grader | StrOutputParser()

# Run the chain
result = chain.invoke({"query": "What are the recent advancements in AI?"})

# Output
print(result)
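
To see how data moves through such a chain without calling any API, the following pure-Python sketch imitates the `|` composition with stub stages. The `Step` class is a hypothetical stand-in written for illustration; it is not LangChain's runnable implementation, and the stub outputs are made up.

```python
from typing import Any, Callable


class Step:
    """Tiny stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Stub stages; each mirrors one link of the chain above.
researcher = Step(lambda inputs: {"essay": f"Report on: {inputs['query']}"})
prompt = Step(lambda inputs: f"The essay: {inputs['essay']}")
grader = Step(lambda text: f"7/10 for <{text}>")

chain = researcher | prompt | grader
print(chain.invoke({"query": "recent advancements in AI"}))
```

Each stage receives exactly what the previous one returned, which is why the researcher stage has to produce a value the prompt stage can consume.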

Building from Base Class

Extending `BaseGPTResearcher`

You can create custom tools by extending the BaseGPTResearcher class. Here's an example:

from libs.community.langchain_community.tools.gpt_researcher.tool import BaseGPTResearcher
# ReportType must also be imported; its module is not shown here.

class CustomGPTResearcher(BaseGPTResearcher):
    name: str = ""
    description: str = ""

    def __init__(self, report_type: ReportType = ReportType.RESEARCH):
        super().__init__(report_type=report_type, report_source="web")

    # Override or extend methods as needed: `_run()` must be implemented, `_arun()` is optional.

API reference: (GPT Researcher tool)[link]

Building CustomGPTResearcher

You can define a custom GPTR tool as shown below:

import asyncio
from typing import Optional, Type

from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import BaseTool
from gpt_researcher import GPTResearcher


class GPTRInput(BaseModel):
    """Input schema for the GPT-Researcher tool."""
    query: str = Field(description="The search query for the research")

class MyGPTResearcher(BaseTool):
    name: str = "custom_gpt_researcher"
    description: str = "Base tool for researching and producing detailed information about a topic or query using the internet."
    args_schema: Type[BaseModel] = GPTRInput

    async def get_report(self, query: str) -> str:
        try:
            researcher = GPTResearcher(
                query=query,
                report_type="research_report",
                report_source="web",
                verbose=False
            )
            await researcher.conduct_research()
            report = await researcher.write_report()
            return report
        except Exception as e:
            # Chain the original exception so the full traceback is preserved.
            raise ValueError(f"Error generating report: {e}") from e

    def _run(
            self, 
            query: str, 
            run_manager: Optional[CallbackManagerForToolRun] = None
        ) -> str:
        answer = asyncio.run(self.get_report(query=query))
        answer += "\n\n- By GPT-Makesh.\nThanks for reading!"
        return answer

my_researcher = MyGPTResearcher()
report = my_researcher.invoke({"query": "What are the recent advancements in AI?"})
print(report)
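
One caveat with calling `asyncio.run` inside `_run`: it raises a `RuntimeError` when an event loop is already running in the same thread, as is typically the case in a notebook. The sketch below shows one possible workaround using only the standard library; `run_coroutine_sync` and `fake_report` are hypothetical names, and `fake_report` merely stands in for `get_report`.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


def run_coroutine_sync(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run is safe.
        return asyncio.run(coro)
    # A loop is already running: run the coroutine on a fresh loop
    # in a worker thread and block until the result is ready.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def fake_report(query: str) -> str:
    # Hypothetical stand-in for MyGPTResearcher.get_report.
    await asyncio.sleep(0)
    return f"report for {query}"


print(run_coroutine_sync(fake_report("AI")))
```

Inside `_run`, the call would become `run_coroutine_sync(self.get_report(query=query))`. Note that the fallback path blocks the calling thread until the report is done, so an async-native `_arun` may be the better choice in fully asynchronous applications.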

Off-the-Shelf Usage

Alternatively, you can use the provided tools off the shelf, without modification.

from libs.community.langchain_community.tools.gpt_researcher.tool import WebGPTResearcher, LocalGPTResearcher

# Use LocalGPTResearcher
researcher_local = LocalGPTResearcher(report_type="research_report")
report = researcher_local.invoke({"query": "What can you tell me about the company?"})

# Use WebGPTResearcher
researcher_web = WebGPTResearcher(report_type="research_report")
report = researcher_web.invoke({'query':"What are the latest advancements in AI?"})

Performance Considerations

Time and Cost Estimates: The tools are optimized for performance and cost, using models like gpt-4o-mini and gpt-4o (128K context) only when necessary. The average research task takes about 3 minutes and costs approximately $0.005.
Usage Limitations: Be aware of potential limitations such as maximum query length and data size when working with large local datasets or complex web queries.
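
For planning a batch of queries, the averages quoted above can be turned into a rough estimate. This is a back-of-the-envelope sketch: the constants come from the figures in this section, while the `parallel` parameter is a hypothetical knob that assumes several research tasks can run at the same time, which this document does not confirm.

```python
# Figures quoted above; treat them as rough averages, not guarantees.
AVG_MINUTES_PER_TASK = 3
AVG_COST_PER_TASK_USD = 0.005


def estimate_batch(num_tasks: int, parallel: int = 1) -> dict:
    """Estimate wall-clock minutes and total cost for a batch of research tasks."""
    if num_tasks < 0 or parallel < 1:
        raise ValueError("num_tasks must be >= 0 and parallel must be >= 1")
    # Tasks run in waves of `parallel`; ceiling division gives the wave count.
    waves = -(-num_tasks // parallel)
    return {
        "minutes": waves * AVG_MINUTES_PER_TASK,
        "cost_usd": round(num_tasks * AVG_COST_PER_TASK_USD, 4),
    }


print(estimate_batch(100, parallel=5))
```

Actual time and cost depend on the query, the report type, and the models in use.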

Links and References

GPT-Researcher Documentation: For a comprehensive guide, visit GPT-Researcher Documentation.
GitHub Repository: Explore the code and contribute at GPT-Researcher on GitHub.

Contribution Guide

We welcome contributions to improve and extend the GPT-Researcher tools. Visit the GitHub repository to get started with contributing.
