Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sparql connector default graph #2761

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

guizsantos
Copy link

@guizsantos guizsantos commented Apr 12, 2024

Summary of changes

This PR adds the functionality to pass the default_graph parameter from within kwargs. This will be particularly useful to the LangChain RdfGraph object usage. I am pushing a PR there that will leverage the funcionality from this PR.

Furthermore, the default_graph parameter which is passed directly to the query method has priority over the default_graph parameter set from kwargs, which should be aligned with users expected behavior and offer backwards compatibilty.

Finally, I am not sure an Issue should be created and linked or new tests should be introduced, this is actually my first ever PR on a public repository, so any guidance is appreciated.

Checklist

  • Checked that there aren't other open pull requests for
    the same change.
  • Checked that all tests and type checking passes.
  • If the change adds new features or changes the RDFLib public API:
    • Created an issue to discuss the change and get in-principle agreement.
    • Considered adding an example in ./examples.
  • If the change has a potential impact on users of this project: the changes should have no impact
  • Considered granting push permissions to the PR branch,
    so maintainers can fix minor issues and keep your PR up to date.

Copy link
Contributor

@WhiteGobo WhiteGobo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I am not of the maintainers but) this looks like a good functionality.

I would catch default_graph in __init__(..., default_graph=None, ...) and save it as self.default_graph=default_graph and later default to self.default_graph instead of self.kwargs.get("default_graph"). Because currently self.kwargs is only used as params for request.

I would consider adding short docstring for SPARQLConnector.query and.update that default_graph defaults to self.kwargs["default_graph"].
A test for this functionality might be a good idea as well.

@guizsantos
Copy link
Author

guizsantos commented Apr 21, 2024

Thanks for the feedback @WhiteGobo , I have little experience with the usage of rdflib so I went for the most straightforward way to pass the parameter over. However, I agree with your thoughts there... I will update the PR accordinly.

In the matter of testing, though, I am not sure how to proceed there. Let me take a look at the current tests and try to figure it out on my own. I will report back here.

baskaryan added a commit to langchain-ai/langchain that referenced this pull request Apr 27, 2024
This introduces `store_kwargs` which behaves similarly to `graph_kwargs`
on the `RdfGraph` object, which will enable users to pass `headers` and
other arguments to the underlying `SPARQLStore` object. I have also made
a [PR in `rdflib` to support passing
`default_graph`](RDFLib/rdflib#2761).

Example usage:
```python
from langchain_community.graphs import RdfGraph

graph = RdfGraph(
    query_endpoint="http://localhost/sparql",
    standard="rdf",
    store_kwargs=dict(
        default_graph="http://example.com/mygraph"
    )
)
```

<!--If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.-->

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
@nicholascar
Copy link
Member

nicholascar commented Apr 27, 2024

@guizsantos I'm a maintainer.

Just echoing @WhiteGobo: can you please add a short docstring? If you don't it won't be obvious at the function signature level that query() and update() can work with kwargs in addition to the function parameter option.

Oh yes: we will need a test. This is simple functionality but it's a rule of this repository that all new additions have a test. This change should be able to have a very simple test created for it that tests for 1. no default graph, default_graph and kwargs "default_graph" options.

@coveralls
Copy link

Coverage Status

coverage: 91.029% (+0.001%) from 91.028%
when pulling d58032f on guizsantos:sparql-connector-default-graph
into 8dabb87 on RDFLib:main.

Remove extra blank line that Ruff is complaining about.
@coveralls
Copy link

coveralls commented Jul 24, 2024

Coverage Status

coverage: 90.651% (+0.001%) from 90.65%
when pulling 561d5e1 on guizsantos:sparql-connector-default-graph
into 14d1006 on RDFLib:main.

@nicholascar
Copy link
Member

@guizsantos would you be able to provide any updates for this PR now that the GitHub testing mechanism is working properly again?

@nicholascar nicholascar self-requested a review July 25, 2024 02:00
Copy link
Member

@nicholascar nicholascar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per previous comment about DocString

@nicholascar nicholascar added the awaiting feedback More feedback is needed from the author of the PR or Issue. label Jul 25, 2024
@ashleysommer
Copy link
Contributor

I just noticed one potentially concerning issue with this PR.

The Docstring for Query and Update in the SPARQL Connector states:

Any additional keyword arguments will be passed to to the request, and can be used to setup timeouts etc.

So I was checking whether there might be a problem caused if the default_graph parameter also gets forwarded to the Request instance. However it seems the way the Request is constructed must've changed since the docstring was written because in the current code only kwargs["headers"] and kwargs["params"] are used, so the comment about "all other kwargs" and "can be used to set up timeouts, etc" are both incorrect.

@guizsantos Are you able to fix up the docstring for both query and update methods reflect the actual purpose of kwargs, including the new default_graph parameter.

@nicholascar nicholascar self-assigned this Sep 29, 2024
@nicholascar nicholascar removed the awaiting feedback More feedback is needed from the author of the PR or Issue. label Sep 29, 2024
@nicholascar
Copy link
Member

I've assigned myself to add the doc string that @guizsantos hasn't added before this gets merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants