Add script for linking lcgft/lcsh terms in saogf terms #1070

Open: lrosenstrom wants to merge 6 commits into base: develop

@lrosenstrom lrosenstrom commented Jan 12, 2022

Labels were mapped by querying https://id.loc.gov/authorities/subjectheadings/label/... in a bash script.

TBD
Change closeMatch -> exactMatch?
There is a mix of https://id.loc.gov/authorities/genreForms, https://id.loc.gov/authorities/subjects, https://id.loc.gov/authorities/childrensSubjects etc in the label to URI mapping. Do we care?
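
One way to see how mixed the mapping is would be to count target URIs per authority scheme. The following is a minimal sketch, not the script in this PR: `scheme_of` and `count_schemes` are hypothetical helper names, and the sample entries are taken from URIs quoted in this conversation.

```python
from collections import Counter
from urllib.parse import urlparse


def scheme_of(uri):
    """Return the authority scheme segment of an id.loc.gov URI, or None."""
    parts = urlparse(uri).path.strip("/").split("/")
    # Expected path shape (assumption): authorities/<scheme>/<identifier>
    if len(parts) >= 3 and parts[0] == "authorities":
        return parts[1]
    return None


def count_schemes(mapping):
    """Count how many labels point into each scheme; unresolved values count as None."""
    return Counter(scheme_of(uri) for uri in mapping.values())


sample = {
    "Sprechstimme": "https://id.loc.gov/authorities/subjects/sh2008000832",
    "String octets": "https://id.loc.gov/authorities/subjects/sh85129020",
    "Darkwave (Music)": "https://id.loc.gov/authorities/genreForms/gf2014026759",
    "Tango": "?",  # unresolved placeholder, as in the mapping at the time
}
counts = count_schemes(sample)
```

Entries counted under `None` are the ones that still need a manual decision.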

Corrected manually (spelling mistakes etc):

Darkwawe (music) -> Darkwave (Music) => https://id.loc.gov/authorities/genreForms/gf2014026759.html (pm14bfp71ldshtn)
Taraab -> Taraab (Music) => https://id.loc.gov/authorities/genreForms/gf2014027123.html (wt7bjpmf1s02t5n)
Occational verse -> Occasional verse => https://id.loc.gov/authorities/genreForms/gf2014026460.html (khw087h30db2179)
Alternative histories (fiction) -> Alternative histories (Fiction) => https://id.loc.gov/authorities/genreForms/gf2014026220.html (0xbdhl4j3g28rlz)
Straight-edge -> Straight-edge (Music) => https://id.loc.gov/authorities/subjects/sh2003002058.html (42gkrt7n4r3cqv8)
Indians Music -> Indians--Music => https://id.loc.gov/authorities/subjects/sh85065058.html (86lpwfvs37m8dj2)
Minutes (records) -> Minutes (Records) => https://id.loc.gov/authorities/genreForms/gf2014026128.html (0xbfnmfj240rck1)
Tango -> Tangos (Music) => https://id.loc.gov/authorities/genreForms/gf2014027127.html (97mqx3jt059lds6)
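
The manual corrections listed above could be kept as a small lookup table that is applied before the label query. This is only a sketch under that assumption: `CORRECTIONS`, `corrected_label` and `strip_html` are hypothetical names, and the pairs are copied from the list in this description.

```python
# Misspelled or unqualified label as found in the data -> label as used by LC.
CORRECTIONS = {
    "Darkwawe (music)": "Darkwave (Music)",
    "Taraab": "Taraab (Music)",
    "Occational verse": "Occasional verse",
    "Alternative histories (fiction)": "Alternative histories (Fiction)",
    "Straight-edge": "Straight-edge (Music)",
    "Indians Music": "Indians--Music",
    "Minutes (records)": "Minutes (Records)",
    "Tango": "Tangos (Music)",
}


def corrected_label(label):
    """Return the corrected label, or the label unchanged if no correction is listed."""
    return CORRECTIONS.get(label, label)


def strip_html(url):
    """Drop a trailing '.html' so the page URL becomes the plain term URI."""
    return url[: -len(".html")] if url.endswith(".html") else url
```

Labels without an entry pass through untouched, so the table only has to list the exceptions.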

olovy commented Jan 12, 2022

There is a mix of https://id.loc.gov/authorities/genreForms, https://id.loc.gov/authorities/subjects, https://id.loc.gov/authorities/childrensSubjects etc in the label to URI mapping. Do we care?

Yes!
The subjects should be replaced with links to genre/form terms.
e.g.
"Historical fiction" https://id.loc.gov/authorities/genreForms/gf2014026370
"Graphic novels" https://id.loc.gov/authorities/genreForms/gf2014026362

I don't know what the best/easiest way to find them mechanically is, though.
https://id.loc.gov/search/?q=Graphic+novels&q=cs%3Ahttp%3A%2F%2Fid.loc.gov%2Fauthorities%2FgenreForms

lrosenstrom commented Jan 12, 2022

The ones that are available as genreForm can be obtained from the label with e.g. https://id.loc.gov/authorities/genreForms/label/Graphic%20novels. I don't think all of them are, however. I'll try to quantify that statement...
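
A rough sketch of that lookup, preferring genre/form terms and falling back to subjects, might look like the following. It is not the bash script from this PR. The scheme order is an assumption based on the discussion, and `fetch` is a hypothetical callable (supplied by the caller) that takes a label URL and returns the resolved term URI or `None`; no particular HTTP behaviour of id.loc.gov is assumed here.

```python
from urllib.parse import quote

BASE = "https://id.loc.gov/authorities"
# Assumed preference order: genre/form terms first, subject headings as fallback.
SCHEMES = ("genreForms", "subjects")


def label_url(scheme, label):
    """Build a label-lookup URL with the label percent-encoded."""
    return f"{BASE}/{scheme}/label/{quote(label, safe='')}"


def resolve(label, fetch):
    """Try each scheme in order and return the first URI that `fetch` finds."""
    for scheme in SCHEMES:
        uri = fetch(label_url(scheme, label))
        if uri:
            return uri
    return None


# Offline stand-in for a real HTTP lookup, for illustration only.
known = {
    label_url("genreForms", "Graphic novels"):
        "https://id.loc.gov/authorities/genreForms/gf2014026362",
    label_url("subjects", "Sprechstimme"):
        "https://id.loc.gov/authorities/subjects/sh2008000832",
}
graphic_novels = resolve("Graphic novels", known.get)
sprechstimme = resolve("Sprechstimme", known.get)
```

Injecting `fetch` keeps the ordering logic testable without network access; labels that resolve in neither scheme come back as `None` and can be listed for manual review.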

olovy commented Jan 12, 2022

lrosenstrom commented Jan 12, 2022

Cleaned up the mapping to use genre/form terms when available. Some music terms are only available as subjects, see the script. Whoever finds the correct URI for "Tango" (used in 97mqx3jt059lds6) gets a cookie!
Edit: Or should we remove the lcsh music terms from gf-musiktermer? And only link to lc in the sao subject heading version? I think there is a gf and sh version of most terms: https://libris.kb.se/katalogisering/search/libris?q=te%20deum%20laudamus&_limit=20&%40type=Concept&_sort=

"Sprechstimme" : "https://id.loc.gov/authorities/subjects/sh2008000832",
"Straight-edge" : "https://id.loc.gov/authorities/subjects/sh2003002058",
"String octets" : "https://id.loc.gov/authorities/subjects/sh85129020",
"Tango" : "?",

@olovy olovy Jan 12, 2022

lrosenstrom (Contributor Author)

Nice!

"Double bass" : "https://id.loc.gov/authorities/subjects/sh85039150",
"Guitar" : "https://id.loc.gov/authorities/subjects/sh85057803",
"Indians Music" : "https://id.loc.gov/authorities/subjects/sh85065058",
"Mass (Music)" : "https://id.loc.gov/authorities/subjects/sh85081852",

lrosenstrom (Contributor Author)

The usage here comes from https://libris.kb.se/katalogisering/pm14bd3742fkn0b where both "Mass (Music)" and "Masses" are used, so let's keep it that way and use both variants for now.

olovy commented Jan 12, 2022

Change closeMatch -> exactMatch?

exactMatch for lcgft and closeMatch for lcsh?

(maybe we should closeMatch sao at some point as well...?)
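
If that split were adopted, the match property could be derived from the target URI. The sketch below only illustrates the proposal; `match_property` is a hypothetical helper, and treating any other scheme as undecided is an assumption.

```python
def match_property(uri):
    """Pick a match property from the authority scheme in the URI."""
    if "/authorities/genreForms/" in uri:
        return "exactMatch"  # lcgft term
    if "/authorities/subjects/" in uri:
        return "closeMatch"  # lcsh heading
    return None  # other schemes (e.g. childrensSubjects) left undecided


gf = match_property("https://id.loc.gov/authorities/genreForms/gf2014026370")
sh = match_property("https://id.loc.gov/authorities/subjects/sh85057803")
```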

lrosenstrom commented Jan 13, 2022

Added more mappings for blank nodes where inScheme = lcgft.
These are a bit unclear:
Sports programs (used in nl03bf463042n22)
Surveys (used in vs69jmvd2pts31z)
