Add neo4j vector index docs #2895

Merged on Oct 13, 2023 (3 commits)
@@ -0,0 +1,52 @@
# Neo4j Vector Index

Neo4j is an open-source graph database with integrated support for vector similarity search.
It supports:

- Approximate nearest neighbor search
- Euclidean similarity and cosine similarity
- Hybrid search combining vector and keyword searches
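
The two similarity measures listed above can be sketched in a few lines of TypeScript. This is only an illustration of the underlying math, not Neo4j's implementation: the helper names `dot`, `cosineSimilarity`, and `euclideanDistance` are hypothetical, and the score Neo4j reports for a match may be normalized differently.

```typescript
// Hypothetical helpers for illustration; not part of LangChain or Neo4j.

// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  // The length check above guarantees b[i] is defined.
  return a.reduce((sum, value, i) => sum + value * b[i]!, 0);
}

// Cosine of the angle between a and b: 1 for the same direction, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  const norm = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  if (norm === 0) {
    throw new Error("Cosine similarity is undefined for zero vectors");
  }
  return dot(a, b) / norm;
}

// Straight-line distance between a and b: 0 for identical vectors.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  return Math.sqrt(a.reduce((sum, value, i) => sum + (value - b[i]!) ** 2, 0));
}

console.log(cosineSimilarity([1, 0], [0, 1]));
console.log(euclideanDistance([0, 0], [3, 4]));
```

Cosine similarity depends only on the angle between the vectors, while Euclidean distance is also sensitive to their magnitudes, so the two can rank the same candidates differently.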

## Setup

To work with Neo4j Vector Index, you need to install the `neo4j-driver` package:

```bash npm2yarn
npm install neo4j-driver
```

### Set up a self-hosted `Neo4j` instance with `docker-compose`

`Neo4j` provides a prebuilt Docker image that can be used to quickly set up a self-hosted Neo4j database instance.
Create a file named `docker-compose.yml` with the following contents:

import CodeBlock from "@theme/CodeBlock";
import DockerExample from "@examples/indexes/vector_stores/neo4j_vector/docker-compose.example.yml";

<CodeBlock language="yml" name="docker-compose.yml">
{DockerExample}
</CodeBlock>
Comment on lines +18 to +28
Contributor

This could just be

docker run -p7474:7474 -p7687:7687 --env NEO4J_AUTH=neo4j/pleaseletmein neo4j

But maybe using Docker Compose is the preferred way?

Contributor Author

Everybody else uses the docker compose, so I went with that


Then, in the same directory, run `docker compose up` to start the container.

You can find more information on how to set up `Neo4j` on their [website](https://neo4j.com/docs/operations-manual/current/installation/).

## Usage

import Example from "@examples/indexes/vector_stores/neo4j_vector/neo4j_vector.ts";

The following is a complete example of using `Neo4jVectorIndex`:

<CodeBlock language="typescript">{Example}</CodeBlock>

### Use the `retrievalQuery` parameter to customize responses

import RetrievalExample from "@examples/indexes/vector_stores/neo4j_vector/neo4j_vector_retrieval.ts";

<CodeBlock language="typescript">{RetrievalExample}</CodeBlock>
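
The comment in the example above names three mandatory columns that a `retrievalQuery` must return: `text`, `score`, and `metadata`. As a rough sketch of why those names matter, the snippet below maps one returned row onto a document and its score. The `RetrievalRow` and `SimpleDocument` types and the `toDocument` helper are hypothetical and exist only for illustration; how the library itself processes rows is not shown in this PR.

```typescript
// Hypothetical shapes for illustration; these are not the library's own types.
interface RetrievalRow {
  text: string;
  score: number;
  metadata: Record<string, unknown>;
}

interface SimpleDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

const MANDATORY_COLUMNS = ["text", "score", "metadata"] as const;

// Converts one query row into a [document, score] pair.
// Only checks that the mandatory columns are present; it does not validate their types.
function toDocument(row: Record<string, unknown>): [SimpleDocument, number] {
  for (const column of MANDATORY_COLUMNS) {
    if (!(column in row)) {
      throw new Error(`retrievalQuery must return a "${column}" column`);
    }
  }
  const { text, score, metadata } = row as unknown as RetrievalRow;
  return [{ pageContent: text, metadata }, score];
}

// A row shaped like the output of:
//   RETURN node.text AS text, score, {a: node.a * 2} AS metadata
const exampleRow = { text: "Cat drinks milk", score: 0.9, metadata: { a: 2 } };
const [exampleDocument, exampleScore] = toDocument(exampleRow);
console.log(exampleDocument, exampleScore);
```

Under this sketch, a query that renames or drops one of the three columns fails immediately with a descriptive error.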

### Instantiate `Neo4jVectorIndex` from an existing graph

import ExistingGraphExample from "@examples/indexes/vector_stores/neo4j_vector/neo4j_vector_existinggraph.ts";

<CodeBlock language="typescript">{ExistingGraphExample}</CodeBlock>
@@ -0,0 +1,8 @@
services:
  database:
    image: neo4j
    ports:
      - 7687:7687
      - 7474:7474
    environment:
      - NEO4J_AUTH=neo4j/pleaseletmein
36 changes: 36 additions & 0 deletions examples/src/indexes/vector_stores/neo4j_vector/neo4j_vector.ts
@@ -0,0 +1,36 @@
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { Neo4jVectorIndex } from "langchain/vectorstores/neo4j_vector";

// Configuration object for Neo4j connection and other related settings
const config = {
  url: "bolt://localhost:7687", // URL for the Neo4j instance
  username: "neo4j", // Username for Neo4j authentication
  password: "pleaseletmein", // Password for Neo4j authentication
  indexName: "vector", // Name of the vector index
  keywordIndexName: "keyword", // Name of the keyword index if using hybrid search
  searchType: "vector" as const, // Type of search (e.g., vector, hybrid)
  nodeLabel: "Chunk", // Label for the nodes in the graph
  textNodeProperty: "text", // Property of the node containing text
  embeddingNodeProperty: "embedding", // Property of the node containing embedding
};

const documents = [
  { pageContent: "what's this", metadata: { a: 2 } },
  { pageContent: "Cat drinks milk", metadata: { a: 1 } },
];

const neo4jVectorIndex = await Neo4jVectorIndex.fromDocuments(
  documents,
  new OpenAIEmbeddings(),
  config
);

const results = await neo4jVectorIndex.similaritySearch("water", 1);

console.log(results);

/*
[ Document { pageContent: 'Cat drinks milk', metadata: { a: 1 } } ]
*/

// Close the connection on the instance created above, not on the class
await neo4jVectorIndex.close();
@@ -0,0 +1,35 @@
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { Neo4jVectorIndex } from "langchain/vectorstores/neo4j_vector";

/**
 * `fromExistingGraph` Method:
 *
 * Description:
 * This method initializes a `Neo4jVectorIndex` instance using an existing graph in the Neo4j database.
 * It's designed to work with nodes that already have textual properties but might not have embeddings.
 * The method will compute and store embeddings for nodes that lack them.
 *
 * Note:
 * This method is particularly useful when you have a pre-existing graph with textual data and you want
 * to enhance it with vector embeddings for similarity searches without altering the original data structure.
 */

// Configuration object for Neo4j connection and other related settings
const config = {
  url: "bolt://localhost:7687", // URL for the Neo4j instance
  username: "neo4j", // Username for Neo4j authentication
  password: "pleaseletmein", // Password for Neo4j authentication
  indexName: "wikipedia", // Name of the vector index
  nodeLabel: "Wikipedia", // Label of the existing nodes to index
  textNodeProperties: ["title", "description"], // Node properties containing the text to embed
  embeddingNodeProperty: "embedding", // Property of the node containing embedding
  searchType: "hybrid" as const, // Type of search (e.g., vector, hybrid)
};

// You should have a populated Neo4j database to use this method
const neo4jVectorIndex = await Neo4jVectorIndex.fromExistingGraph(
  new OpenAIEmbeddings(),
  config
);

// Close the connection on the instance created above, not on the class
await neo4jVectorIndex.close();
@@ -0,0 +1,59 @@
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { Neo4jVectorIndex } from "langchain/vectorstores/neo4j_vector";

/*
 * The retrievalQuery is a customizable Cypher query fragment used in the Neo4jVectorIndex class to define how
 * search results should be retrieved and presented from the Neo4j database. It allows developers to specify
 * the format and structure of the data returned after a similarity search.
 *
 * Mandatory columns for `retrievalQuery`:
 *
 * 1. text:
 *    - Description: Represents the textual content of the node.
 *    - Type: String
 *
 * 2. score:
 *    - Description: Represents the similarity score of the node in relation to the search query. A
 *      higher score indicates a closer match.
 *    - Type: Float (ranging between 0 and 1, where 1 is a perfect match)
 *
 * 3. metadata:
 *    - Description: Contains additional properties and information about the node. This can include
 *      any other attributes of the node that might be relevant to the application.
 *    - Type: Object (key-value pairs)
 *    - Example: { "id": "12345", "category": "Books", "author": "John Doe" }
 *
 * Note: While you can customize the `retrievalQuery` to fetch additional columns or perform
 * transformations, never omit the mandatory columns. The names of these columns (`text`, `score`,
 * and `metadata`) must remain unchanged. Renaming them might lead to errors or unexpected behavior.
 */

// Configuration object for Neo4j connection and other related settings
const config = {
  url: "bolt://localhost:7687", // URL for the Neo4j instance
  username: "neo4j", // Username for Neo4j authentication
  password: "pleaseletmein", // Password for Neo4j authentication
  retrievalQuery: `
    RETURN node.text AS text, score, {a: node.a * 2} AS metadata
  `,
};

const documents = [
  { pageContent: "what's this", metadata: { a: 2 } },
  { pageContent: "Cat drinks milk", metadata: { a: 1 } },
];

const neo4jVectorIndex = await Neo4jVectorIndex.fromDocuments(
  documents,
  new OpenAIEmbeddings(),
  config
);

const results = await neo4jVectorIndex.similaritySearch("water", 1);

console.log(results);

/*
[ Document { pageContent: 'Cat drinks milk', metadata: { a: 2 } } ]
*/

// Close the connection on the instance created above, not on the class
await neo4jVectorIndex.close();