[Bug]: PineconeVectorStore.get_nodes() does not return embedding values #17627

Open
LHFO94 opened this issue Jan 25, 2025 · 4 comments
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@LHFO94
Contributor

LHFO94 commented Jan 25, 2025

Bug Description

Calling .get_nodes() on a PineconeVectorStore returns nodes without their embedding values, even though the embeddings are stored in the index.

Versions

llama-index-core == 0.12.11
llama-index-vector-stores-pinecone == 0.4.2

Version

0.12.11

Steps to Reproduce

import os

from dotenv import load_dotenv
from llama_index.core.vector_stores import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
)
from llama_index.vector_stores.pinecone import PineconeVectorStore

load_dotenv()

index = PineconeVectorStore.from_params(
    api_key=os.getenv("PINECONE_API_KEY"),
    index_name="documents",
    namespace="dails",
)

doc_id_filter = MetadataFilters(
    filters=[
        MetadataFilter(
            key="doc_id",
            value="02023R1803-20250101",
            operator=FilterOperator.EQ,
        )
    ],
)

nodes = index.get_nodes(filters=doc_id_filter)
for node in nodes:
    print(
        f"nodeid={node.node_id}, ref_doc_id={node.ref_doc_id}, embeddings={node.embedding}"
    )

The code above produces the following output:

nodeid=0, ref_doc_id=02023R1803-20250101, embeddings=None
nodeid=1, ref_doc_id=02023R1803-20250101, embeddings=None
nodeid=2, ref_doc_id=02023R1803-20250101, embeddings=None
nodeid=3, ref_doc_id=02023R1803-20250101, embeddings=None
nodeid=4, ref_doc_id=02023R1803-20250101, embeddings=None

The issue lies in the metadata_dict_to_node function, which doesn't correctly parse the response from the Pinecone index query. If this is acknowledged as a bug, I would be happy to provide a fix.

Relevant Logs/Tracebacks

@LHFO94 LHFO94 added bug Something isn't working triage Issue needs to be triaged/prioritized labels Jan 25, 2025

dosubot bot commented Jan 25, 2025

Hey @LHFO94! I'm here to help you with any bugs, questions, or contributions. Let me know how I can assist you!

I found a similar closed issue titled "[Bug]: Metadata is missing from vector store results" which was resolved by a pull request that fixed a variable name clash for legacy vector indexes [1].

In your case, the issue seems to be related to the metadata_dict_to_node function not correctly parsing the response from the Pinecone index query. The get_nodes method should assign the embedding values from the Pinecone query response to the node.embedding attribute. The embedding values are included in the match.values field of the response [2].

You might want to check if the metadata_dict_to_node function is correctly handling the embedding values and ensure that the get_nodes method is properly assigning these values to the nodes.
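
As a rough illustration of the point above, here is a minimal sketch of copying the vector from each match onto the node. It does not use the real Pinecone or LlamaIndex classes: `Match`, `Node`, and `nodes_from_matches` are hypothetical stand-ins invented for this example, and only the `match.values` field name comes from the discussion.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Match:
    """Hypothetical stand-in for one entry of a Pinecone query response."""
    id: str
    values: List[float] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Node:
    """Hypothetical, heavily simplified stand-in for a LlamaIndex node."""
    node_id: str
    metadata: Dict[str, Any]
    embedding: Optional[List[float]] = None


def nodes_from_matches(matches: List[Match]) -> List[Node]:
    """Build nodes from matches, carrying the vector over explicitly."""
    nodes = []
    for match in matches:
        # Rebuilding a node from metadata alone leaves embedding as None.
        node = Node(node_id=match.id, metadata=dict(match.metadata))
        # Assign the vector only when the response actually contains one.
        node.embedding = list(match.values) if match.values else None
        nodes.append(node)
    return nodes


matches = [
    Match(id="0", values=[0.1, 0.2], metadata={"doc_id": "02023R1803-20250101"}),
    Match(id="1", metadata={"doc_id": "02023R1803-20250101"}),
]
nodes = nodes_from_matches(matches)
for node in nodes:
    print(f"nodeid={node.node_id}, embeddings={node.embedding}")
```

In this sketch the second match has no values, so its node keeps `embedding=None`, which mirrors the output reported in the issue.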


@LHFO94
Contributor Author

LHFO94 commented Jan 25, 2025

As stated in my PR the metadata_dict_to_node function does not work properly. The reference to the PR supplied is unrelated.

@logan-markewich
Collaborator

Eh, it's a cross between a bug and a feature. Returning embeddings can eat up a ton of memory.

I'd prefer some kind of bool option
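
One way such an option could look, as a sketch only: the flag defaults to off so the memory cost is opt-in. `FAKE_INDEX`, `get_nodes`, and the `include_embeddings` parameter name are all hypothetical and are not taken from the actual PineconeVectorStore API.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical in-memory index: node id -> (embedding, metadata).
FAKE_INDEX: Dict[str, Tuple[List[float], Dict[str, Any]]] = {
    "0": ([0.1, 0.2, 0.3], {"doc_id": "02023R1803-20250101"}),
    "1": ([0.4, 0.5, 0.6], {"doc_id": "02023R1803-20250101"}),
    "2": ([0.7, 0.8, 0.9], {"doc_id": "other"}),
}


def get_nodes(doc_id: str, include_embeddings: bool = False) -> List[Dict[str, Any]]:
    """Return nodes for a document; attach vectors only when asked to."""
    results = []
    for node_id, (embedding, metadata) in FAKE_INDEX.items():
        if metadata.get("doc_id") != doc_id:
            continue
        results.append(
            {
                "node_id": node_id,
                "metadata": metadata,
                # Default stays None so callers do not pay for the vectors.
                "embedding": list(embedding) if include_embeddings else None,
            }
        )
    return results


light = get_nodes("02023R1803-20250101")
full = get_nodes("02023R1803-20250101", include_embeddings=True)
print(light[0]["embedding"], full[0]["embedding"])
```

Keeping the default at `False` preserves the current behavior for existing callers, while anyone who needs the vectors can request them explicitly.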

@LHFO94
Contributor Author

LHFO94 commented Jan 26, 2025

Something like this? #17635
