[WIP] feat: use author known identifiers in import API #10110
===This is a WIP and is not ready for review===
This should be squash-merged to avoid conflicts with #10092, which this PR is split off from (I am not a git commit history expert). It depends on #10092 being merged first, because it uses a new author method introduced in that PR.
Model update PR: internetarchive/openlibrary-client#419
Closes #9448
Closes #9411
Technical
TODO: unit tests
TODO: how do we flag conflicts to librarians?
The import model is expanded with additional logic that writes to the author's `remote_ids` when they are detected in the incoming JSON object, and searches Infogami against those remote IDs to detect whether the author already exists. The incoming `author` dict can store an optional `remote_ids` field to contain these (e.g. VIAF stored in `author["remote_ids"]["viaf"]`), except for the OL ID, which is not a remote identifier, so it is expected at `author["key"]`.
Issues:
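As an illustration of the matching described in the Technical paragraph, here is a minimal sketch that uses an in-memory list in place of Infogami. The function names (`find_existing_author`, `merge_remote_ids`), the sample records, and the `/authors/OL…A` key format are assumptions made for this example and are not taken from the PR's code. In particular, keeping the existing value when an incoming identifier disagrees is only a placeholder, since how to flag conflicts is still an open TODO.

```python
from typing import Optional

# Hypothetical stand-in for author records stored in Infogami.
EXISTING_AUTHORS = [
    {"key": "/authors/OL1A", "name": "Author One", "remote_ids": {"viaf": "111"}},
    {"key": "/authors/OL2A", "name": "Author Two", "remote_ids": {"wikidata": "Q2"}},
]


def find_existing_author(incoming: dict, existing: list) -> Optional[dict]:
    """Return the first stored author matching the incoming dict, or None."""
    # The OL ID is not a remote identifier: it is read from author["key"]
    # and compared against the stored key directly.
    key = incoming.get("key")
    if key:
        for record in existing:
            if record["key"] == key:
                return record

    # Every other identifier is read from author["remote_ids"][<name>].
    for id_name, id_value in incoming.get("remote_ids", {}).items():
        for record in existing:
            if record.get("remote_ids", {}).get(id_name) == id_value:
                return record
    return None


def merge_remote_ids(record: dict, incoming: dict) -> dict:
    """Combine remote_ids, adding identifiers the stored record lacks."""
    merged = dict(record.get("remote_ids", {}))
    for id_name, id_value in incoming.get("remote_ids", {}).items():
        # Assumption: on a conflict the stored value wins. A real
        # implementation would need to surface the conflict instead.
        merged.setdefault(id_name, id_value)
    return merged
```

With these sample records, an incoming author that carries only `{"remote_ids": {"viaf": "111"}}` resolves to `/authors/OL1A` even without a name, which mirrors the missing-name VIAF test described under Testing.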
Testing
Tested using the output from #9674.
To test the import, I wasn't sure how to hit /api/import with user credentials, so I disabled the `if not can_write():` condition in openlibrary/plugins/importapi/code.py as well as the `if not account_key:` condition in openlibrary/catalog/add_book/__init__.py, and copy-pasted the printed JSON records into a Postman request body.
Example:
This responds successfully with:
Viewing this author key at http://localhost:8080/authors/OL15A shows that the strong identifiers were imported correctly:
Editing the author verifies this as well:
I then created a test book record whose author uses the same VIAF but has a missing name:
The response shows that the author was successfully matched to an existing one by VIAF:
This also works for OL IDs, which use a slightly different fetch query than the other strong identifiers do:
Importing optional cover images also works:
I added support for all identifiers found in identifiers.yml, except for Inventaire, which I couldn't find in Wikidata:
Screenshot
Stakeholders