Implement an identifier mapping service #41

cthoyt · 2023-03-04T23:21:07Z

References biopragmatics/bioregistry#686.

This pull request implements the identifier mapping service described in SPARQL-enabled identifier conversion with Identifiers.org. The goal of such a service is to act as an interoperability in SPARQL queries that federate data from multiple places and potentially use different IRIs for the same things.

This can be demonstrated in the following SPARQL query for proteins in a specific model in the BioModels Database and their associated domains in the UniProt:

PREFIX biomodel: <http://identifiers.org/biomodels.db/>
PREFIX bqbio: <http://biomodels.net/biology-qualifiers#>
PREFIX sbmlrdf: <http://identifiers.org/biomodels.vocabulary#>
PREFIX up: <http://purl.uniprot.org/core/>

SELECT DISTINCT ?protein ?protein_domain
WHERE {
    # The first part of this query extracts the proteins appearing in an RDF serialization
    # of the BioModels database (see https://www.ebi.ac.uk/biomodels/BIOMD0000000372) on
    # insulin/glucose feedback. Note that modelers call entities appearing in compartmental
    # models "species", and this does not refer to taxa.
    biomodel:BIOMD0000000372 sbmlrdf:species/bqbio:isVersionOf ?biomodels_protein .

    # The second part of this query maps BioModels protein IRIs to UniProt protein IRIs
    # using service XXX - that's what we're implementing here.
    SERVICE <XXX> {
        ?biomodels_protein owl:sameAs ?uniprot_protein.
    }

    # The third part of this query gets links between UniProt proteins and their
    # domains. Since the service maps between the BioModels query, this only gets
    # us relevant protein domains to the insulin/glucose model.
    SERVICE <http://beta.sparql.uniprot.org/sparql> {
        ?uniprot_protein a up:Protein;
            up:organism taxon:9606;
            rdfs:seeAlso ?protein_domain.
    }
}

The SPARQL endpoint running at the web address XXX takes in the bound values for ?biomodels_protein one at a time and dynamically generates triples with owl:sameAs as the predicate mapping and other equivalent IRIs (based on the definition of the converter) as the objects. This allows for gluing together multiple services that use different URIs for the same entities - in this example, there are two ways of referring to UniProt Proteins:

The BioModels database example represents a SBML model on insulin-glucose feedback and uses legacy Identifiers.org URIs for proteins such as http://identifiers.org/uniprot/P01308.
The first-part UniProt database uses its own PURLs such as https://purl.uniprot.org/uniprot/P01308.

Implementation Notes

By @JervenBolleman's suggestion in biopragmatics/bioregistry#686 (comment), this was done by overriding the triples method of RDFLib's graph data structure.

Therefore, any arbitrary (extended) prefix map loaded in a curies.Converter can be used to run this service. The main use will be to deploy this as part of the Bioregistry, but it's nice that it can be reused for any arbitrary use case when implemented in this package (which is lower level and more generic than the Bioregistry)

Follow-up

Wrap this so it can be easily mounted on to the Flask and FastAPI apps (taking inspiration from https://github.com/vemonet/rdflib-endpoint)
Deploy this in the Bioregistry at an address like https://bioregistry.io/services/mapping (don't pick something generic like https://bioregistry.io/sparql since we might want to deploy the Bioregisry RDF later)

Alternate/Past Ideas

Hack in the query parser

# -*- coding: utf-8 -*-
# type: ignore

"""
Custom re-implementation of RDFLib's algebra, so we can
more efficiently join values clause first. This all has been done
to change the following one line:

.. code-block:: python

    if q.valuesClause:
        M = Join(p2=M, p1=ToMultiSet(translateValues(q.valuesClause)))

to

.. code-block:: python

    if q.valuesClause:
        M = Join(p1=ToMultiSet(translateValues(q.valuesClause)), p2=M)

Code in this module has been copied (mostly verbatim, save some style
changes) from RDFLib (https://github.com/RDFLib/rdflib) under the
BSD-3-Clause license (https://github.com/RDFLib/rdflib/blob/main/LICENSE)

Huge thanks to the RDFLib developers.
"""

import functools
from typing import List, Mapping, Optional, Set, Tuple

from pyparsing import ParseResults
from rdflib.plugins.sparql.algebra import (
    Extend,
    Filter,
    Group,
    Join,
    OrderBy,
    Project,
    ToMultiSet,
    _addVars,
    _findVars,
    _hasAggregate,
    _simplifyFilters,
    _traverse,
    _traverseAgg,
    analyse,
    simplify,
    translateAggregates,
    translateGroupGraphPattern,
    translatePath,
    translatePName,
    translatePrologue,
    translateValues,
    traverse,
    triples,
)
from rdflib.plugins.sparql.evaluate import evalQuery
from rdflib.plugins.sparql.operators import and_
from rdflib.plugins.sparql.parser import parseQuery
from rdflib.plugins.sparql.parserutils import CompValue
from rdflib.plugins.sparql.processor import SPARQLProcessor
from rdflib.plugins.sparql.sparql import Query
from rdflib.term import Variable

__all__ = ["CustomSPARQLProcessor"]


class CustomSPARQLProcessor(SPARQLProcessor):
    """A custom SPARQL processor.

    This processor is verbatim from :class:`rdflib.plugins.sparql.processor.SPARQLProcessor`
    except it switches in a custom version of :func:`rdflib.plugins.sparql.algebra.translateQuery`
    which in turn is verbatim other than switching out a custom version of
    :func:`rdflib.plugins.sparql.algebra.translate`, which in turn is verbatim other than changing
    the join order for VALUES clauses.
    """

    def query(self, query, initBindings=None, initNs=None, base=None, DEBUG=False):
        """Evaluate a SPARQL query on this processor's graph."""
        if not isinstance(query, Query):
            parsetree = parseQuery(query)
            query = _custom_translate_query(parsetree, base, initNs or {})
        return evalQuery(self.graph, query, initBindings or {}, base)


# The following code was copied verbatim from
# https://github.com/RDFLib/rdflib/blob/a146e0a066df3f85176226c9e826bd181389049c/rdflib/plugins/sparql/algebra.py#L920-L954
# except for changing calls to ``translate()`` (part of RDFLib) to
# ``_custom_translate()`` (see custom implementation below)


def _custom_translate_query(
    q: ParseResults,
    base: Optional[str] = None,
    init_ns: Optional[Mapping[str, str]] = None,
) -> Query:
    """
    Translate a query-parsetree to a SPARQL Algebra Expression

    Return a rdflib.plugins.sparql.sparql.Query object
    """

    # We get in: (prologue, query)

    prologue = translatePrologue(q[0], base, init_ns)

    # absolutize/resolve prefixes
    q[1] = traverse(q[1], visitPost=functools.partial(translatePName, prologue=prologue))

    P, PV = _custom_translate(q[1])
    datasetClause = q[1].datasetClause
    if q[1].name == "ConstructQuery":
        template = triples(q[1].template) if q[1].template else None

        res = CompValue(q[1].name, p=P, template=template, datasetClause=datasetClause)
    else:
        res = CompValue(q[1].name, p=P, datasetClause=datasetClause, PV=PV)

    res = traverse(res, visitPost=simplify)
    _traverseAgg(res, visitor=analyse)
    _traverseAgg(res, _addVars)

    return Query(prologue, res)


# The following code was copied with minor modifications (noted in caps) from
# https://github.com/RDFLib/rdflib/blob/a146e0a066df3f85176226c9e826bd181389049c/rdflib/plugins/sparql/algebra.py#L630-L767


def _custom_translate(q: CompValue) -> Tuple[CompValue, List[Variable]]:
    """
    http://www.w3.org/TR/sparql11-query/#convertSolMod
    """
    _traverse(q, _simplifyFilters)

    q.where = traverse(q.where, visitPost=translatePath)

    # TODO: Var scope test
    VS: Set[Variable] = set()
    traverse(q.where, functools.partial(_findVars, res=VS))

    # all query types have a where part
    M = translateGroupGraphPattern(q.where)

    aggregate = False
    if q.groupby:
        conditions = []
        # convert "GROUP BY (?expr as ?var)" to an Extend
        for c in q.groupby.condition:
            if isinstance(c, CompValue) and c.name == "GroupAs":
                M = Extend(M, c.expr, c.var)
                c = c.var
            conditions.append(c)

        M = Group(p=M, expr=conditions)
        aggregate = True
    elif (
        traverse(q.having, _hasAggregate, complete=False)
        or traverse(q.orderby, _hasAggregate, complete=False)
        or any(
            traverse(x.expr, _hasAggregate, complete=False) for x in q.projection or [] if x.evar
        )
    ):
        # if any aggregate is used, implicit group by
        M = Group(p=M)
        aggregate = True

    if aggregate:
        M, E = translateAggregates(q, M)
    else:
        E = []

    # HAVING
    if q.having:
        M = Filter(expr=and_(*q.having.condition), p=M)

    # VALUES
    if q.valuesClause:
        # THIS IS THE LINE WE CHANGED IN :mod:`curies`
        M = Join(p2=M, p1=ToMultiSet(translateValues(q.valuesClause)))

    if not q.projection:
        # select *
        PV = list(VS)
    else:
        PV = list()
        for v in q.projection:
            if v.var:
                if v not in PV:
                    PV.append(v.var)
            elif v.evar:
                if v not in PV:
                    PV.append(v.evar)

                E.append((v.expr, v.evar))
            else:
                raise Exception("I expected a var or evar here!")

    for e, v in E:
        M = Extend(M, e, v)

    # ORDER BY
    if q.orderby:
        M = OrderBy(
            M,
            [CompValue("OrderCondition", expr=c.expr, order=c.order) for c in q.orderby.condition],
        )

    # PROJECT
    M = Project(M, PV)

    if q.modifier:
        if q.modifier == "DISTINCT":
            M = CompValue("Distinct", p=M)
        elif q.modifier == "REDUCED":
            M = CompValue("Reduced", p=M)

    if q.limitoffset:
        offset = 0
        if q.limitoffset.offset is not None:
            offset = q.limitoffset.offset.toPython()

        if q.limitoffset.limit is not None:
            M = CompValue("Slice", p=M, start=offset, length=q.limitoffset.limit.toPython())
        else:
            M = CompValue("Slice", p=M, start=offset)

    return M, PV

Hack the RDF generator for services

# type: ignore

import re
from textwrap import dedent

from pyparsing import ParseException
from rdflib.plugins.sparql import evaluate, parser
from rdflib.plugins.sparql.sparql import QueryContext


def _buildQueryStringForServiceCall(ctx: QueryContext, match: re.Match) -> str:
    service_query = match.group(2)
    try:
        parser.parseQuery(service_query)
    except ParseException:
        prefixes = "\n".join(
            f"PREFIX {prefix}: {ns.n3()}"
            for prefix, ns in ctx.prologue.namespace_manager.store.namespaces()
        )
        body = dedent(
            f"""\
            SELECT REDUCED * WHERE {{
                {_get_init(ctx)}
                {service_query.strip()}
            }}
        """
        )
        return f"{prefixes}\n{body}"
    else:
        return service_query + _get_init(ctx)


def _get_init(ctx):
    sol = ctx.solution()
    if not len(sol):
        return ""
    variables = " ".join([v.n3() for v in sol])
    variables_bound = " ".join([ctx.get(v).n3() for v in sol])
    return f"VALUES (" + variables + ") {(" + variables_bound + ")}"


def monkey_patch_service_call():
    evaluate._buildQueryStringForServiceCall = _buildQueryStringForServiceCall

References biopragmatics/bioregistry#686.

codecov-commenter · 2023-03-04T23:43:59Z

Codecov Report

Merging #41 (8a0e86e) into main (4233832) will decrease coverage by 0.24%.
The diff coverage is 98.79%.

@@             Coverage Diff             @@
##              main      #41      +/-   ##
===========================================
- Coverage   100.00%   99.76%   -0.24%     
===========================================
  Files            5        7       +2     
  Lines          339      422      +83     
  Branches        76       95      +19     
===========================================
+ Hits           339      421      +82     
- Partials         0        1       +1

Impacted Files	Coverage Δ
src/curies/mapping_service/rdflib_custom.py	`95.65% <95.65%> (ø)`
src/curies/mapping_service/__init__.py	`100.00% <100.00%> (ø)`

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

JervenBolleman · 2023-03-05T09:05:02Z

Great work! I did want to note that the generic RDF store can be in the same endpoint as the mapping service. I can try to have a look at it this afternoon.

cthoyt · 2023-03-05T09:09:18Z

The first pass implementation seemed to go well, but after I set this up to run as a service, I noticed the following problem. Given a stupid simple triple store with two triples:

https://www.ebi.ac.uk/chebi/searchId.do?chebiId=1 rdfs:label "dummy label"
http://purl.obolibrary.org/obo/CHEBI_1 rdfs:label http://purl.obolibrary.org/obo/CHEBI_2

And the following query (Which should return the dummy label of ChEBI:1 after mapping the URIs):

SELECT ?label WHERE {
    ?child rdfs:subClassOf <http://purl.obolibrary.org/obo/CHEBI_2> .
    SERVICE <http://127.0.0.1:5000/sparql> {
        ?child owl:sameAs ?child_mapped .
    }
    ?child_mapped rdfs:label ?label .
}

I get a query to the service that looks like

SELECT REDUCED * WHERE {
    ?child owl:sameAs ?child_mapped .
}
VALUES (?child) {
    (<http://purl.obolibrary.org/obo/CHEBI_1>) 
    (<http://purl.obolibrary.org/obo/CHEBI_2>)
}

I'm not sure if this is even valid SPARQL since the VALUES comes outside the WHERE clause, but at the moment I'm stumped how overriding RDFLib's Graph.triples is able to deal with this.

How to Reproduce

First, run the service with python -m curies.mapping_service (for now, there is an example app running there backed by a small example extended prefix map useful for ChEBI IRIs)

Code to run the example

from rdflib import RDFS, Graph, Literal, URIRef
from tabulate import tabulate


def main():
    graph = Graph()
    graph.add(
        (
            URIRef("https://www.ebi.ac.uk/chebi/searchId.do?chebiId=1"),
            RDFS.label,
            Literal("label 1"),
        )
    )
    graph.add(
        (
            URIRef("http://purl.obolibrary.org/obo/CHEBI_1"),
            RDFS.subClassOf,
            URIRef("http://purl.obolibrary.org/obo/CHEBI_2"),
        )
    )
    # Get labels of children of 2
    res = graph.query(
        """
    SELECT ?label WHERE {
        ?child rdfs:subClassOf <http://purl.obolibrary.org/obo/CHEBI_2> .
        SERVICE <http://127.0.0.1:5000/sparql> {
            ?child owl:sameAs ?child_mapped .
        }
        ?child_mapped rdfs:label ?label .
    }
    """
    )

    print(tabulate(list(res)))


if __name__ == "__main__":
    main()

JervenBolleman · 2023-03-05T09:59:32Z

The query is valid, but broken.

SELECT REDUCED * WHERE {
    ?child owl:sameAs ?child_mapped .
}
VALUES (?s) {
    (<http://purl.obolibrary.org/obo/CHEBI_1>) 
    (<http://purl.obolibrary.org/obo/CHEBI_2>)
}

Should be

SELECT REDUCED * WHERE {
    ?child owl:sameAs ?child_mapped .
}
VALUES (?child) { #note the difference here
    (<http://purl.obolibrary.org/obo/CHEBI_1>) 
    (<http://purl.obolibrary.org/obo/CHEBI_2>)
}

That looks like an error in the SPARQL engine sending the query. The values outside is valid and usually only done in federated queries.

So as in the broken query ?s is used nothing is passed to the ?child variable and therefore nothing is available for the triples method to work on.

cthoyt · 2023-03-05T18:36:06Z

The s was only a typo when I changed the variable names to make better context. The issue is with the values on the outside, the triples function in rdflib doesn’t get passed anything to let it know what the actual values are

cthoyt · 2023-03-05T19:43:05Z

Apparently the way the service SPARQL is getting generated is based on the following code: https://github.com/RDFLib/rdflib/blob/d2c9edc7a3db57e1b447c2c4ba06e1838b7ada0c/rdflib/plugins/sparql/evaluate.py#L372-L393

still - the issue is if the VALUES is not inside the WHERE, then there is no information passed to Graph.triples that we can work with to dynamically generate results. I'm still not sure if there's a reason it appears after instead of inside the WHERE clause

JervenBolleman · 2023-03-05T19:48:59Z

The final values is SPARQL ok. The question is why that does not give a join with the inline data being the feeding algebra element. It's been a while since I did any python rdflib development, and I had not migrated my development environment. The error is certainly not in your code.

At most I need to add in an optimizer like step to make sure inline data has higher priority (more likely to be left side of a join).

At least that is what I think. Let me get a debugger on it and have a look.

Looking at it if the order of p1 and p2 is inverted at line 707 of algebra.py of rdflib the code works fine.

I think it is valid optimization to by default join from the known bindings into the local store, and I think it is worth raising this on the rdflib mailinglist/issue tracker.

cthoyt · 2023-03-06T09:27:07Z

I think RDFLib/rdflib#2125 might be related. I monkey patched the rdflib code for the function I mentioned with

def _buildQueryStringForServiceCall(ctx: QueryContext, match: re.Match) -> str:
    service_query = match.group(2)
    try:
        parser.parseQuery(service_query)
    except ParseException:
        prefixes = "\n".join(
            f"PREFIX {prefix}: {ns.n3()}"
            for prefix, ns in ctx.prologue.namespace_manager.store.namespaces()
        )
        body = dedent(f"""\
            SELECT REDUCED * WHERE {{
                {_get_init(ctx)}
                {service_query.strip()}
            }}
        """)
        return f"{prefixes}\n{body}"
    else:
        return service_query + _get_init(ctx)


def _get_init(ctx):
    sol = ctx.solution()
    if not len(sol):
        return ""
    variables = " ".join([v.n3() for v in sol])
    variables_bound = " ".join([ctx.get(v).n3() for v in sol])
    return "VALUES (" + variables + ") {(" + variables_bound + ")}"

and now it creates SPARQL that is formatted ~~correctly to properly bind the values~~ that works better with the default RDFLib SPARQLProcessor to bind the values. However, like you mentioned, if the previous way is still "valid" SPARQL then a solution with the way the query is optimized/executed would probably be preferable. Looking at your solution for inverting the join order, this would be much more simple :)

JervenBolleman · 2023-03-06T10:07:45Z

@cthoyt I asked on stackoverflow.

I suspect the best way is to make our own extension of the SPARQLProcessor that does this join reordering before evaluation.

cthoyt · 2023-03-06T12:03:10Z

This seems doable, but because the SPARQLProcessor delegates to some module level functions, I am worried there will be no way to do this other than with lots of code duplication

JervenBolleman · 2023-03-06T12:15:52Z

This seems doable, but because the SPARQLProcessor delegates to some module level functions, I am worried there will be no way to do this other than with lots of code duplication

Yeah, let's give the stackoverflow a few days to see if we get a reply. RDFlib developers prefer to answer there, but if needed we can try the the dev mailing list.

vemonet · 2023-03-08T09:42:43Z

Hi @cthoyt and @JervenBolleman , I had a look at the implementation and might be able to give some insights on exposing the RDFLib graph as a SPARQL endpoint

The main thing rdflib-endpoint does is to take a rdflib model and defines an API endpoint that handles all stuff a SPARQL endpoint is expected to handle to pass queries to the RDFLib graph: query through GET and POST, content negotiation through Accept headers, etc

Afaik those things are required if you want your API endpoint to be considered as a valid API endpoint by others SPARQL endpoints. So that those SPARQL endpoints are able to send and resolve SERVICE queries to your endpoint

Deploying an endpoint from your custom CURIEServiceGraph with rdflib-endpoint should be quite straightforward:

from rdflib_endpoint import SparqlEndpoint

curieG = CURIEServiceGraph()

app = SparqlEndpoint(
    graph=curieG,
    cors_enabled=True,
    # Metadata used for the SPARQL service description and Swagger UI:
    title="SPARQL endpoint to serve CURIE mappings",
    description="A SPARQL endpoint to serve CURIE mappings",
    version="0.1.0",
    public_url='https://your-endpoint-url/sparql',
)

Then run the app with uvicorn in development, and gunicorn in production (for async workers)

I can add it with tests if you want @cthoyt , but since you have already everything setup you might want to do it directly, let me know what you prefer

cthoyt · 2023-03-08T13:40:23Z

@vemonet we've got a minimal version of this implemented in https://github.com/cthoyt/curies/blob/1c05478ff12764e5fb30d70799e9fd8984fa1ab4/src/curies/mapping_service.py#L178-L207 - I think the best way to go right now is to focus on making the SPARQL and the service get interpreted correctly, then in a follow-up we can get fancy and compliant. Thanks for the comments!

JervenBolleman · 2023-03-08T21:28:28Z

@cthoyt I had a thought about the normal bioregistry.io sparql endpoint being both this special mapping one and the normal one. Pull request regarding that added as material for inspiration :)

cthoyt · 2023-03-10T11:55:46Z

@JervenBolleman I didn't see any activity on your stack overflow post. do you want to try messaging the RDFLib dev mailing list?

JervenBolleman · 2023-03-10T11:58:41Z

I tried their matrix first, and if that does not get a reply. dev mailing list it is.

cthoyt · 2023-03-10T13:20:47Z

@JervenBolleman in the mean time, I have vendored the algebra code and made the modification you suggested. Interestingly, this works on py37 and locally, but the py311 test shows the subjects and objects are getting out of order

cthoyt · 2023-03-12T09:15:47Z

See alternative approach for optimizing query after it’s been parsed: RDFLib/rdflib#2257

cthoyt · 2023-03-15T08:49:36Z

@JervenBolleman in b243e33, I implemented the post-processor you suggested in RDFLib/rdflib#2257. It seems to work non-deterministically - sometimes it's fine and other times, it returns results with the subjects and objects switched. Do you think you know what might be going on?

JervenBolleman · 2023-03-15T10:05:38Z

@cthoyt not at first sight. I will try to have a look at it this week but no promises.

well I can reproduce the issue with subject, object switched with it being ok in one run but not the other.

I see, I expect this is due to the iteration order in the _stmt not being stable in the test code.

cthoyt · 2023-03-15T21:31:09Z

well I can reproduce the issue with subject, object switched with it being ok in one run but not the other.

I see, I expect this is due to the iteration order in the _stmt not being stable in the test code.

This is some really interesting insight! I assumed that the ResultRow objects were like tuples that corresponded to the order of the variables in the query, but maybe that's not the case. I updated the implementation in 32d1044 - hopefully this leads to deterministic tests passing :)

If so, I will call this PR finished

This will allow for hacking in a custom SPARQL processor, that e.g., rewrites some nodes as we demonstrated in biopragmatics/curies#41

Closes #686 This adds the URI mapping service implemented in biopragmatics/curies#41. It will allow for SPARQL queries to be written that call the Bioregistry as a service for generating URI mappings (e.g., between OBO PURLs, Identifiers.org URIs, and first-party URIs whose URI prefixes are stored in the Bioregistry). Here's a simplified example that doesn't require any triple store, and can be directly executed with RDFLib: ```sparql SELECT DISTINCT ?s ?o WHERE { VALUES ?s { <http://purl.obolibrary.org/obo/CHEBI_24867> <http://purl.obolibrary.org/obo/CHEBI_24868> } SERVICE <https://bioregistry.io/sparql> { ?s owl:sameAs ?o } } ``` returns the following (some not shown, you should get the idea): | subject | object | |---------------------------------------|------------------------------------------------- | | http://purl.obolibrary.org/obo/CHEBI_24867 | http://purl.obolibrary.org/obo/CHEBI_24867 | | http://purl.obolibrary.org/obo/CHEBI_24867 | http://identifiers.org/chebi/24867 | | http://purl.obolibrary.org/obo/CHEBI_24867 | https://www.ebi.ac.uk/chebi/searchId.do?chebiId=24867 | | ... | ... | This is built on top of the [`curies.Converter.expand_pair_all`](https://curies.readthedocs.io/en/latest/api/curies.Converter.html#curies.Converter.expand_pair_all), which itself is populated by all of the URI format strings available in the Bioregistry. To see examples of the possible ChEBI URIs, see https://bioregistry.io/registry/chebi.

cthoyt added 6 commits March 5, 2023 00:07

Add initial mapping service

eec5ff6

References biopragmatics/bioregistry#686.

Update mapping_service.py

a92c385

Update mapping_service.py

c841991

Update tests

bd0aa7c

Update test_mapping_service.py

89a9f8b

Test last functionality

4fdfea4

Update test_mapping_service.py

3fa482d

cthoyt added 4 commits March 5, 2023 10:09

Update test_mapping_service.py

5f280f3

Update mapping_service.py

33f6800

Add flask app

e33487b

Update mapping_service.py

1c05478

cthoyt added 3 commits March 10, 2023 13:28

Add custom processor

6004018

Reorganize code

e71b14b

Reorganize and add tests

1b10fc8

cthoyt added 3 commits March 10, 2023 13:49

Cleanup

8553143

Update rdflib_custom.py

2c6c210

Consolidate

b4ef1d7

Update tests and config

8bb58f9

Add Jerven post-processor

b243e33

CI

dfbde68

cthoyt added 3 commits March 15, 2023 15:32

Clean up implementation and tests

a301d0c

Flake8

9a37305

Make rows deterministic

32d1044

cthoyt added 3 commits March 15, 2023 23:08

Finish full testing and CI

8ba8f26

Clean up

8a0e86e

Add docs and coverage

ffa0c33

cthoyt merged commit aefcc72 into main Mar 15, 2023

cthoyt deleted the mapping-service branch March 15, 2023 22:37

cthoyt added a commit to cthoyt/rdflib-endpoint that referenced this pull request Mar 15, 2023

Add ability to specify custom parser

0deca37

This will allow for hacking in a custom SPARQL processor, that e.g., rewrites some nodes as we demonstrated in biopragmatics/curies#41

This was referenced Mar 15, 2023

Add ability to specify custom parser vemonet/rdflib-endpoint#6

Merged

Implement URI mapping service biopragmatics/bioregistry#773

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement an identifier mapping service #41

Implement an identifier mapping service #41

cthoyt commented Mar 4, 2023 •

edited

Loading

codecov-commenter commented Mar 4, 2023 •

edited

Loading

JervenBolleman commented Mar 5, 2023

cthoyt commented Mar 5, 2023 •

edited

Loading

JervenBolleman commented Mar 5, 2023 •

edited

Loading

cthoyt commented Mar 5, 2023

cthoyt commented Mar 5, 2023

JervenBolleman commented Mar 5, 2023 •

edited

Loading

cthoyt commented Mar 6, 2023 •

edited

Loading

JervenBolleman commented Mar 6, 2023 •

edited by cthoyt

Loading

cthoyt commented Mar 6, 2023

JervenBolleman commented Mar 6, 2023

vemonet commented Mar 8, 2023 •

edited

Loading

cthoyt commented Mar 8, 2023

JervenBolleman commented Mar 8, 2023

cthoyt commented Mar 10, 2023

JervenBolleman commented Mar 10, 2023

cthoyt commented Mar 10, 2023

cthoyt commented Mar 12, 2023

cthoyt commented Mar 15, 2023

JervenBolleman commented Mar 15, 2023 •

edited

Loading

cthoyt commented Mar 15, 2023

Implement an identifier mapping service #41

Implement an identifier mapping service #41

Conversation

cthoyt commented Mar 4, 2023 • edited Loading

Implementation Notes

Follow-up

Alternate/Past Ideas

Hack in the query parser

Hack the RDF generator for services

codecov-commenter commented Mar 4, 2023 • edited Loading

Codecov Report

JervenBolleman commented Mar 5, 2023

cthoyt commented Mar 5, 2023 • edited Loading

How to Reproduce

JervenBolleman commented Mar 5, 2023 • edited Loading

cthoyt commented Mar 5, 2023

cthoyt commented Mar 5, 2023

JervenBolleman commented Mar 5, 2023 • edited Loading

cthoyt commented Mar 6, 2023 • edited Loading

JervenBolleman commented Mar 6, 2023 • edited by cthoyt Loading

cthoyt commented Mar 6, 2023

JervenBolleman commented Mar 6, 2023

vemonet commented Mar 8, 2023 • edited Loading

cthoyt commented Mar 8, 2023

JervenBolleman commented Mar 8, 2023

cthoyt commented Mar 10, 2023

JervenBolleman commented Mar 10, 2023

cthoyt commented Mar 10, 2023

cthoyt commented Mar 12, 2023

cthoyt commented Mar 15, 2023

JervenBolleman commented Mar 15, 2023 • edited Loading

cthoyt commented Mar 15, 2023

cthoyt commented Mar 4, 2023 •

edited

Loading

codecov-commenter commented Mar 4, 2023 •

edited

Loading

cthoyt commented Mar 5, 2023 •

edited

Loading

JervenBolleman commented Mar 5, 2023 •

edited

Loading

JervenBolleman commented Mar 5, 2023 •

edited

Loading

cthoyt commented Mar 6, 2023 •

edited

Loading

JervenBolleman commented Mar 6, 2023 •

edited by cthoyt

Loading

vemonet commented Mar 8, 2023 •

edited

Loading

JervenBolleman commented Mar 15, 2023 •

edited

Loading