Harrison/nebula graph (langchain-ai#5865)
Co-authored-by: Wey Gu <weyl.gu@gmail.com>
Co-authored-by: chenweisomebody <chenweisomebody@gmail.com>
Commit 4cc7808 (parent: b3dc57d). 9 changed files with 712 additions and 4 deletions.
# NebulaGraphQAChain

This notebook shows how to use LLMs to provide a natural language interface to the NebulaGraph database.

You will need a running NebulaGraph cluster. You can start a containerized cluster with the following script:

```bash
curl -fsSL nebula-up.siwei.io/install.sh | bash
```

Other options are:
- Install as a [Docker Desktop Extension](https://www.docker.com/blog/distributed-cloud-native-graph-database-nebulagraph-docker-extension/). See [here](https://docs.nebula-graph.io/3.5.0/2.quick-start/1.quick-start-workflow/)
- NebulaGraph Cloud Service. See [here](https://www.nebula-graph.io/cloud)
- Deploy from package, source code, or via Kubernetes. See [here](https://docs.nebula-graph.io/)

Once the cluster is running, we can create the SPACE and SCHEMA for the database.

```python
%pip install ipython-ngql
%load_ext ngql

# Connect the ngql Jupyter extension to NebulaGraph
%ngql --address 127.0.0.1 --port 9669 --user root --password nebula
# Create a new space
%ngql CREATE SPACE IF NOT EXISTS langchain(partition_num=1, replica_factor=1, vid_type=fixed_string(128));
```

```python
# Wait a few seconds for the space to be created.
%ngql USE langchain;
```

Create the schema. For the full dataset, refer [here](https://www.siwei.io/en/nebulagraph-etl-dbt/).

```python
%%ngql
CREATE TAG IF NOT EXISTS movie(name string);
CREATE TAG IF NOT EXISTS person(name string, birthdate string);
CREATE EDGE IF NOT EXISTS acted_in();
CREATE TAG INDEX IF NOT EXISTS person_index ON person(name(128));
CREATE TAG INDEX IF NOT EXISTS movie_index ON movie(name(128));
```

Wait for schema creation to complete, then insert some data.

```python
%%ngql
INSERT VERTEX person(name, birthdate) VALUES "Al Pacino":("Al Pacino", "1940-04-25");
INSERT VERTEX movie(name) VALUES "The Godfather II":("The Godfather II");
INSERT VERTEX movie(name) VALUES "The Godfather Coda: The Death of Michael Corleone":("The Godfather Coda: The Death of Michael Corleone");
INSERT EDGE acted_in() VALUES "Al Pacino"->"The Godfather II":();
INSERT EDGE acted_in() VALUES "Al Pacino"->"The Godfather Coda: The Death of Michael Corleone":();
```

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import NebulaGraphQAChain
from langchain.graphs import NebulaGraph
```

```python
graph = NebulaGraph(
    space="langchain",
    username="root",
    password="nebula",
    address="127.0.0.1",
    port=9669,
    session_pool_size=30,
)
```

## Refresh graph schema information

If the database schema changes, you can refresh the schema information needed to generate nGQL statements.

```python
# graph.refresh_schema()
```

```python
print(graph.get_schema)
```

Output:

```
Node properties: [{'tag': 'movie', 'properties': [('name', 'string')]}, {'tag': 'person', 'properties': [('name', 'string'), ('birthdate', 'string')]}]
Edge properties: [{'edge': 'acted_in', 'properties': []}]
Relationships: ['(:person)-[:acted_in]->(:movie)']
```

## Querying the graph

We can now use the graph Cypher QA chain to ask questions of the graph.

```python
chain = NebulaGraphQAChain.from_llm(
    ChatOpenAI(temperature=0), graph=graph, verbose=True
)
```

```python
chain.run("Who played in The Godfather II?")
```

Output:

```
> Entering new NebulaGraphQAChain chain...
Generated nGQL:
MATCH (p:`person`)-[:acted_in]->(m:`movie`) WHERE m.`movie`.`name` == 'The Godfather II'
RETURN p.`person`.`name`
Full Context:
{'p.person.name': ['Al Pacino']}

> Finished chain.

'Al Pacino played in The Godfather II.'
```
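The end-to-end flow of the chain (generate nGQL from the schema and the question, run it against the graph, then phrase an answer from the query results) can be sketched with stand-in functions. `generate_ngql`, `run_query`, and `phrase_answer` below are illustrative stubs standing in for the two LLM calls and the database round-trip; they are not part of the langchain API.

```python
# Sketch of the NebulaGraphQAChain control flow, with stubs in place of
# the LLM and the NebulaGraph cluster (illustrative names, not the real API).

def generate_ngql(question: str, schema: str) -> str:
    # Stand-in for the nGQL-generation LLM call (question + schema -> query).
    return (
        "MATCH (p:`person`)-[:acted_in]->(m:`movie`) "
        "WHERE m.`movie`.`name` == 'The Godfather II' "
        "RETURN p.`person`.`name`"
    )

def run_query(ngql: str) -> dict:
    # Stand-in for graph.query(); returns the context seen in the notebook.
    return {"p.person.name": ["Al Pacino"]}

def phrase_answer(question: str, context: dict) -> str:
    # Stand-in for the QA LLM call that turns raw results into prose.
    names = ", ".join(context["p.person.name"])
    movie = question.removeprefix("Who played in ").rstrip("?")
    return f"{names} played in {movie}."

schema = "(:person)-[:acted_in]->(:movie)"
question = "Who played in The Godfather II?"
ngql = generate_ngql(question, schema)
context = run_query(ngql)
print(phrase_answer(question, context))  # Al Pacino played in The Godfather II.
```

The real chain performs the same three steps, but both `generate_ngql` and `phrase_answer` are LLM calls driven by prompt templates, and `run_query` is a live call to the cluster.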
"""Question answering over a graph.""" | ||
from __future__ import annotations | ||
|
||
from typing import Any, Dict, List, Optional | ||
|
||
from pydantic import Field | ||
|
||
from langchain.base_language import BaseLanguageModel | ||
from langchain.callbacks.manager import CallbackManagerForChainRun | ||
from langchain.chains.base import Chain | ||
from langchain.chains.graph_qa.prompts import CYPHER_QA_PROMPT, NGQL_GENERATION_PROMPT | ||
from langchain.chains.llm import LLMChain | ||
from langchain.graphs.nebula_graph import NebulaGraph | ||
from langchain.prompts.base import BasePromptTemplate | ||
|
||
|
||
class NebulaGraphQAChain(Chain): | ||
"""Chain for question-answering against a graph by generating nGQL statements.""" | ||
|
||
graph: NebulaGraph = Field(exclude=True) | ||
ngql_generation_chain: LLMChain | ||
qa_chain: LLMChain | ||
input_key: str = "query" #: :meta private: | ||
output_key: str = "result" #: :meta private: | ||
|
||
@property | ||
def input_keys(self) -> List[str]: | ||
"""Return the input keys. | ||
:meta private: | ||
""" | ||
return [self.input_key] | ||
|
||
@property | ||
def output_keys(self) -> List[str]: | ||
"""Return the output keys. | ||
:meta private: | ||
""" | ||
_output_keys = [self.output_key] | ||
return _output_keys | ||
|
||
@classmethod | ||
def from_llm( | ||
cls, | ||
llm: BaseLanguageModel, | ||
*, | ||
qa_prompt: BasePromptTemplate = CYPHER_QA_PROMPT, | ||
ngql_prompt: BasePromptTemplate = NGQL_GENERATION_PROMPT, | ||
**kwargs: Any, | ||
) -> NebulaGraphQAChain: | ||
"""Initialize from LLM.""" | ||
qa_chain = LLMChain(llm=llm, prompt=qa_prompt) | ||
ngql_generation_chain = LLMChain(llm=llm, prompt=ngql_prompt) | ||
|
||
return cls( | ||
qa_chain=qa_chain, | ||
ngql_generation_chain=ngql_generation_chain, | ||
**kwargs, | ||
) | ||
|
||
def _call( | ||
self, | ||
inputs: Dict[str, Any], | ||
run_manager: Optional[CallbackManagerForChainRun] = None, | ||
) -> Dict[str, str]: | ||
"""Generate nGQL statement, use it to look up in db and answer question.""" | ||
_run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager() | ||
callbacks = _run_manager.get_child() | ||
question = inputs[self.input_key] | ||
|
||
generated_ngql = self.ngql_generation_chain.run( | ||
{"question": question, "schema": self.graph.get_schema}, callbacks=callbacks | ||
) | ||
|
||
_run_manager.on_text("Generated nGQL:", end="\n", verbose=self.verbose) | ||
_run_manager.on_text( | ||
generated_ngql, color="green", end="\n", verbose=self.verbose | ||
) | ||
context = self.graph.query(generated_ngql) | ||
|
||
_run_manager.on_text("Full Context:", end="\n", verbose=self.verbose) | ||
_run_manager.on_text( | ||
str(context), color="green", end="\n", verbose=self.verbose | ||
) | ||
|
||
result = self.qa_chain( | ||
{"question": question, "context": context}, | ||
callbacks=callbacks, | ||
) | ||
return {self.output_key: result[self.qa_chain.output_key]} |
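`_call` fills two prompts: the generation prompt receives `question` and `schema`, and the QA prompt receives `question` and `context`. A minimal sketch of that wiring, assuming plain `str.format` templates in place of the real `NGQL_GENERATION_PROMPT` and `CYPHER_QA_PROMPT` (whose exact wording is not shown in this diff):

```python
# Hypothetical templates standing in for NGQL_GENERATION_PROMPT and
# CYPHER_QA_PROMPT; the real ones are langchain PromptTemplate objects.
NGQL_TEMPLATE = (
    "Task: write an nGQL statement to query a graph database.\n"
    "Schema:\n{schema}\n"
    "Question: {question}\n"
)
QA_TEMPLATE = (
    "Answer the question using only the provided context.\n"
    "Context: {context}\n"
    "Question: {question}\n"
)

schema = "(:person)-[:acted_in]->(:movie)"
question = "Who played in The Godfather II?"

# First LLM call: question + schema -> nGQL statement.
gen_prompt = NGQL_TEMPLATE.format(schema=schema, question=question)

# Second LLM call: question + query results -> natural-language answer.
context = {"p.person.name": ["Al Pacino"]}
qa_prompt = QA_TEMPLATE.format(context=context, question=question)

print(gen_prompt)
print(qa_prompt)
```

Keeping the two prompts as separate `LLMChain`s is what lets `from_llm` accept custom `ngql_prompt` and `qa_prompt` overrides independently.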