@langchain/community module "chromadb" throws if filter for search is not defined #7181

Closed
5 tasks done
commenthol opened this issue Nov 11, 2024 · 2 comments · Fixed by #7183
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@commenthol
Contributor

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

import { Chroma } from '@langchain/community/vectorstores/chroma'
import { OllamaEmbeddings } from '@langchain/ollama'

const documents = [
  {
    id: '1',
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: {}
  }
]

const embeddings = new OllamaEmbeddings({
  model: 'nomic-embed-text:latest'
})
const vectorStore = new Chroma(embeddings, {
  url: 'http://localhost:8000',
  collectionName: 'issue',
  collectionMetadata: {
    'hnsw:space': 'cosine'
  }
})
await vectorStore.addDocuments(documents, { ids: ['1'] })

const results = await vectorStore.similaritySearch('biology', 1)

Error Message and Stack Trace (if applicable)

ChromaClientError: Bad request to http://localhost:8000/api/v1/collections/e20e78e4-e278-4e14-aafc-04736871e8b8/query with status: Bad Request
    at chromaFetch (file://issue/node_modules/.pnpm/chromadb@1.9.2_openai@4.71.1_zod@3.23.8_/node_modules/chromadb/dist/chromadb.mjs:2604:17)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async Collection.query (file://issue/node_modules/.pnpm/chromadb@1.9.2_openai@4.71.1_zod@3.23.8_/node_modules/chromadb/dist/chromadb.mjs:2295:12)
    at async Chroma.similaritySearchVectorWithScore (file://issue/node_modules/.pnpm/@langchain+community@0.3.12_@ibm-cloud+watsonx-ai@1.1.2_@langchain+core@0.3.17_openai@4.71.1__57zubc44l3hvik5mkoauqn25qi/node_modules/@langchain/community/dist/vectorstores/chroma.js:321:24)
    at async Chroma.similaritySearch (file://issue/node_modules/.pnpm/@langchain+core@0.3.17_openai@4.71.1_zod@3.23.8_/node_modules/@langchain/core/dist/vectorstores.js:104:25)
    at async file://issue/issue.js:24:17 {
  cause: undefined
}

Description

// L348 ff
async similaritySearchVectorWithScore(
    query: number[],
    k: number,
    filter?: this["FilterType"]
  ) {
    if (filter && this.filter) {
      throw new Error("cannot provide both `filter` and `this.filter`");
    }
    const _filter = filter ?? this.filter;
    // Added: only build a `where` clause when a filter is actually defined
    const where = _filter === undefined ? undefined : { ..._filter };

    const collection = await this.ensureCollection();

    // similaritySearchVectorWithScore supports one query vector at a time
    // chroma supports multiple query vectors at a time
    const result = await collection.query({
      queryEmbeddings: query,
      nResults: k,
      // Removed: `where: { ..._filter }` (spreads to `{}` when no filter is set)
      where
    });

System Info

platform: mac

npm version
{
  npm: '10.9.0',
  node: '22.11.0',

package.json

  "dependencies": {
    "@langchain/community": "^0.3.12",
    "@langchain/core": "^0.3.17",
    "@langchain/ollama": "^0.1.2",
    "chromadb": "^1.9.2",
    "langchain": "^0.3.5"
}

dosubot bot commented Nov 11, 2024

Hey @commenthol! I'm here to help you with any bugs, questions, or contributions. Let's dig into this together!

The issue you're encountering seems to be related to how the similaritySearchVectorWithScore method in the Chroma module handles the filter parameter. By default, if no filter is provided, the method should perform a similarity search without any filtering criteria, considering all available vectors in the collection [1].

The proposed code change you mentioned aims to address this by setting the where clause to undefined if no filter is provided, which aligns with the expected behavior. This change should prevent the 400 Bad Request error when no filter is specified, allowing the search to proceed without filtering.

If you implement this change, it should resolve the issue by ensuring that the where clause is only applied when a filter is explicitly defined. This approach maintains the flexibility of querying without mandatory filters, which seems to be the intended functionality.
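For illustration, here is a minimal standalone sketch of the filter handling discussed above. It is not the actual `@langchain/community` code: `Where`, `resolveWhere`, and `resolveWhereOld` are hypothetical names used only for this example. It shows that spreading an undefined filter produces an empty object, and how the guard from the proposed change avoids sending one.

```typescript
// Hypothetical filter type for illustration; the real store defines its own FilterType.
type Where = Record<string, unknown>;

// Sketch of the proposed behaviour: return `undefined` when no filter is defined.
function resolveWhere(filter?: Where, defaultFilter?: Where): Where | undefined {
  if (filter && defaultFilter) {
    throw new Error("cannot provide both `filter` and `this.filter`");
  }
  const merged = filter ?? defaultFilter;
  // Spreading `undefined` yields `{}`, so guard before copying.
  return merged === undefined ? undefined : { ...merged };
}

// Sketch of the previous behaviour: always spreads, so a missing filter becomes `{}`.
function resolveWhereOld(filter?: Where, defaultFilter?: Where): Where {
  const merged = filter ?? defaultFilter;
  return { ...merged };
}

console.log(resolveWhereOld());              // {}
console.log(resolveWhere());                 // undefined
console.log(resolveWhere({ source: "a" }));  // { source: 'a' }
```

With the guard in place, a search without a filter passes `where: undefined` instead of `where: {}`, which per the report above is what avoids the Bad Request response.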


@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Nov 11, 2024
commenthol added a commit to commenthol/langchainjs that referenced this issue Nov 11, 2024
commenthol added a commit to commenthol/langchainjs that referenced this issue Nov 11, 2024
@jacoblee93
Collaborator

Oh dear, thank you!
