Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DOI and WikipediaVersioned Extractor #1049

Merged

Conversation

shawntanzk
Copy link
Contributor

Adding DOI and WikipediaVersioned (two things used quite often in a few of the obo foundry ontologies)
Followed patterns from searching repo - probably needs a proper looking over to make sure I didn't do anything stupid.
Tagging @hkir-dev & @matentzn
Related to obophenotype/uberon#1874

@matentzn
Copy link

Have you tested this or just guessed it from the source code?

Can we add

OMIM, OMIMPS, DOI, ORCID?

Or better yet, add support for an external prefix map?

@shawntanzk
Copy link
Contributor Author

@matentzn guess it lol - I can try building it, but currently don't have eclipse installed, and not sure even if I did I really know how to handle stuff. Was mostly doing this cause I had some free time and thought I'd try to take some workload off @hkir-dev (was hoping there was already mapping, but after search realised it was this way, so while I was at it, thought I'd do these changes following pattern, hence really needing proper looking over).

Or better yet, add support for an external prefix map?

100% would be keen for this, but it might be beyond my skillset

@hkir-dev
Copy link
Collaborator

Good progress @shawntanzk . Seems we need to fix the doi regex (though seems it is a little difficult to have a perfectly working one). We can work and test together.

@matentzn providing an external map might be the ultimate solution, but it will require a lot of changes (config menu updates, config file validation and processing etc.) It would be good to go iteratively and have a first version with the current design.

@matentzn
Copy link

Sound good @hkir-dev, can you perhaps add then OMIM, OMIMPS, DOI, ORCID, Orphanet? This would make some of our powerusers very happy!

@shawntanzk
Copy link
Contributor Author

@hkir-dev
Copy link
Collaborator

Unit and manual tests completed, new extractors are working as expected.

@hkir-dev
Copy link
Collaborator

hkir-dev commented Mar 25, 2022

But observed a malfunction in the already existing OboFoundryLinkExtractor.
UBERON:0007329 is redirected to identifiers.org/UBERON:0007329 instead of purl.obolibrary.org/obo/UBERON_0007329

I suspect about the order of IdentifiersDotOrgLinkExtractor and OboFoundryLinkExtractor but seems this order changed intentionally: dd514de

Independent from the current PR, obo link extraction requires further analysis and a separate bug report.

@shawntanzk
Copy link
Contributor Author

Thanks @hkir-dev, see you cleaned up some of my silly mistakes too heh, thanks heaps!

Copy link

@matentzn matentzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent @hkir-dev! Looks great!

@matthewhorridge matthewhorridge merged commit 0aab1be into protegeproject:master May 23, 2022
@matthewhorridge
Copy link
Contributor

This is a great contribution! Thanks very much @shawntanzk. We really appreciate it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants