Cleaning up of low information agents #4903
Comments
also please list any other agent cleaning steps that need to happen for #4554 to be successful |
You don't need to do anything - say GO! and I go and we're done.... I think people want to do things - lots of cleanup seems to have happened in the other thread (which is what led to the idea of just getting rid of the clutter that led to those situations). I'm happy to do whatever I can to facilitate that, just let me know what you need. Here are 34961 collector (table, not role) or less agents created more than a year ago. temp_agent_clean_fp(1).csv.zip There are an additional 6820 low-information agents created within the last year. FYI there are 95973 total agents at the moment. |
Thank you for clarifying that. I thought something had to be done on our end for you to be able to move forward, but now I understand. So our big steps here are: we set a deadline, and once it's past the deadline, these agents get moved into verbatim agents and their remarks get moved into a remarks field for the attribute? It's a long list and there is no way the agents committee can tackle them all, but my hope is that we can help some collections who would really like their agents to be "fixed" before the merge. Would it be better to focus on the old or new ones? Or maybe it doesn't matter? |
But we will also communicate to these collections that just because the agent has been put into verbatim agent, nothing is lost and there are tools/workflows to help them clean them up and add them as an actual agent |
Sounds reasonable, or I can break things into chunks, move some out of the way while we're still working on others, WHATEVER.
Let me know if you need a different view of the data.
Yep, that's the intention.
Yep. Worst case we do absolutely nothing, which still puts cleanup in the context of more than bare strings and seems like a significant improvement to me. Best case, #4872 works as hoped, this all becomes a click (which might be automated). |
Here are a few for @wellerjes @droberts49 Aileen Alvarez (person: 21302067) These are all listed as "Student participant in the Chicago Academy of Sciences summer TEENS program." in remarks. If you want to keep them as agents - I suggest creating a project and adding them all to it. I'm happy to do this for you if you want! Exploring further, there are a whole BUNCH of students in this group that really would be a nice project instead, or a series of related projects if this is some kind of annual thing. @dustymc I guess group membership would be something that keeps an agent an agent? I really don't like the groups, but perhaps they do serve some purpose as they apparently have here. |
Groups are just awful (no metadata) relationships, prioritizing #4555 would simplify things. |
I can help a ton with that - Groups are fairly easy to convert to projects - the problem then becomes all the activity of the group and how we manage that. So when the group agent is a collector - OOF |
is there a way for me to get a list of all agents that are connected to my collections? |
I don't think so, but if you'll elaborate on that I can probably pull them. |
any agents that are associated with ? |
From the data above (straightforward) or from anywhere (not straightforward, needs an issue)? |
but the data above doesn't say which collections the agents are attributed to, there are a lot of them that have NMMNH in the remarks though that I can work on, but I'm curious if there are many others that don't. I'll make a new issue |
See my instructions in your other issue - I've been working on UTEP:Herp agents and already found some cool stuff! Check out Ernest A. Liner (who I also added to Bionomia) and Eugene D. Fleharty. It's fun to figure people out! I also spent some time on random Joneses and was able to use remarks to add other data to some of them - still a long way to go in that list though.... |
I'm all for cleaning up agents, but if 'low quality' agents (only initials plus last name) get moved to verbatim, what happens to the collector/preparator agent? I hope it's not getting changed to 'unknown', as someone with a last name but only first/middle initials is certainly known more than someone with only sets of initials. ??? |
'low quality' is "just names and acting only as collector," the format of the name is not involved.
It will be removed. There is no data loss, the 'verbatim agent' attribute can carry all of the information a names-only Agent can carry. There will be tools to "upgrade" verbatim if more information becomes available, and the intention of this Issue is to recover anything that was entered incorrectly - eg, https://arctos.database.museum/agent/21345767 (entered today) would have been on the chopping block because it had only remarks, @Jegelewicz created a relationship from those remarks, it's no longer "just strings" and therefore will not be involved in any cleanup. Lots more discussion in #4554, https://github.com/ArctosDB/newsletter/issues/166#issuecomment-1211368414 will become an article. |
I wasn't able to attend the issues meeting and just read through the notes. Can someone clarify what will happen to agent_remarks for an agent with remarks but no relationships/addresses/transactions? Will remarks get transferred to verbatim agent attribute remarks? (they won't disappear correct?) |
Yes - the remarks and also I think aka's will be placed in the remark for the verbatim agent. |
Are these remarks then visible on the catalog record page - some are loooong and/or pre-date the option to add "curatorial remarks" and probably don't need to be publicly displayed? |
@AJLinn I think @dustymc decided to go wild last week and merge a bunch of stuff, but I really want to wait on most merges until AFTER any string-only agents are converted to verbatim agents so that whatever people are using NOW is what they get THEN. Also, just in case you have missed out: any agent that has only names and remarks and is ONLY involved in collector roles will be converted to verbatim agent around the first of next year. Any agents that you want to keep that are in this group need to have either a status, relationship, or address added to keep them agent-worthy. Also note that you can no longer add an agent without one of these things. This is a plea to PLEASE allow the collections to do this work! Mass-merging stuff now is going to mean people losing verbatim information that they have currently recorded in agents. |
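The rule described above can be sketched in a few lines of Python. This is only an illustration of the criteria as stated in this thread, not Arctos code: the `Agent` class, its field names, and `is_low_information` are hypothetical, and "collector" here stands in for any activity recorded in the collector table.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical, simplified agent record (not the Arctos schema)."""
    name: str
    remarks: str = ""
    statuses: list = field(default_factory=list)
    relationships: list = field(default_factory=list)
    addresses: list = field(default_factory=list)
    # Kinds of activity the agent is used in, e.g. {"collector", "determiner"}.
    roles: set = field(default_factory=set)


def is_low_information(agent: Agent) -> bool:
    """True if the agent is names/remarks only and used only as collector (or not at all)."""
    has_identifying_data = bool(
        agent.statuses or agent.relationships or agent.addresses
    )
    only_collector_or_less = agent.roles <= {"collector"}
    return not has_identifying_data and only_collector_or_less


strings_only = Agent("A. Smith", remarks="collected in 1950s", roles={"collector"})
with_address = Agent("Litho-Krome Company", addresses=["Columbus, Georgia"],
                     roles={"collector"})

print(is_low_information(strings_only))  # candidate for verbatim agent
print(is_low_information(with_address))  # stays an agent
```

Under this sketch, adding any one status, relationship, or address is enough to take an agent off the list, which matches the advice given to collections above.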
Thanks @Jegelewicz - I've had to miss out on a bunch of Arctos stuff because of an 8-week-long seminar I've been involved with (now completed) and have not been tracking agent stuff as closely as I should have. I am fully invested in doing this work but agree, we MUST have time to prioritize the work and it's not going to be a quick fix! Forcing us to drop all of the other time-critical priorities to do these fixes before things get merged is not going to engender a positive working environment! It's going to lead to mistakes and pissed-off users, NOT the intended improvement of agent records. For example, for just UAM:EH, dealing with 918 names on this Google spreadsheet, assuming it takes an average of 5 minutes per name (some will take much more effort while others will be faster), means 76.5 hours of work!!! I assume others have equally as many entries to review and none of us have two weeks of dedicated time to devote only to this task.
1000-times YES! |
Yep - I can get through about 20-25 in an hour. FWIW I have been spending some time each week just transferring information from remarks to the various status, relationship and address fields, so I am trying to help everyone get some of this done before the deadline. Also, @ArctosDB/agents-committee is meeting half an hour early each month to work on this too. |
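For anyone who wants to redo the effort estimate for their own list, here is a small sketch of the arithmetic using the two figures quoted above (5 minutes per name, and 20-25 names per hour). The function names are made up for illustration; swap in your own count of names.

```python
def hours_at_minutes_per_name(names: int, minutes_per_name: float) -> float:
    """Total hours if each name takes a fixed number of minutes on average."""
    return names * minutes_per_name / 60


def hours_at_names_per_hour(names: int, names_per_hour: float) -> float:
    """Total hours if a fixed number of names can be handled per hour."""
    return names / names_per_hour


names = 918  # the UAM:EH count mentioned above

print(hours_at_minutes_per_name(names, 5))   # the 5-minutes-per-name estimate
print(hours_at_names_per_hour(names, 20))    # slower end of 20-25 per hour
print(hours_at_names_per_hour(names, 25))    # faster end of 20-25 per hour
```

At 20-25 names per hour the same 918 names come out to roughly 37 to 46 hours, so the estimate depends heavily on how long a typical name takes.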
That's what's happening, #4930 |
I'll work on adding more data to the UAM:Art agent profiles in the list. It can be challenging to research some of our more obscure artists, but I will do what I can. If an agent is a determiner of an attribute, does this disqualify them from being changed into a verbatim agent? Can you clarify, is the list shared in this issue all of the "low quality agents", or are there more? |
Right now, yes.
All as of the day it was made - but more agents get added every day... |
Hi all- I had a full on panic attack about this and thought we were going
to have to leave Arctos and find a different database for art collections.
As we’ve talked about before it is very normal to have very little
information about an artist (sometimes just an initial) but it still has
enormous value and is a lot more data than nothing. The idea of losing that
was so distressing- the creator field is probably our most
legally/ethically important field to track in the collection. I was
distraught that we would need to add a birth date because for living
artists (living humans) they have the right to privacy about their birth
date and are not required to provide it to us. Karinna has calmed me down
by explaining that we can add “associated with X collection” to keep them
from getting merged.
Can I get confirmation that our creator agents with low data will not be
erased/lost if they are formally associated with our collection? And that
these are all the agents? There won’t be more in the future that will be
erased if I’m not keeping tabs on Arctos discussions? This feels utterly
terrifying to me.
I haven’t read this paper and its recommendations in detail but it seems
relevant to this conversation and I’m wondering if the working group has a
policy/statement about ethics/privacy for people who have personal
info in Arctos? Or maybe something to develop if we don’t?
https://mdsoar.org/handle/11603/14397
|
@marecaguthrie don't panic! We aren't asking anyone to add any information that isn't already in a biography - just to add it in some more appropriate places! I will make it a personal mission to look through your agents to ensure that you know if any of them are headed for the verbatim agent attribute. BUT even that would not lose anything! If an agent is "verbatimized" to a verbatim agent attribute any remark associated with the agent (your biography) will also go into the verbatim agent remark and they could be "upgraded" to an agent at any time there is something other than name or remark to identify them.
We really don't and we should but we do encumber certain agent information (all addresses except ORCiD, Wikidata and Library of Congress as those are already public). I will start an issue in the internal repo for this. |
There is still some very fundamental misunderstanding, or miscommunication, or misSOMETHING at play here. Nothing can be lost; the defining characteristic of a verbatim agent is that the information fits in that structure without loss. "Goes by single initial, prefers anonymity" is a great fit for verbatim agents; what we're doing does what you say you need to do much better than what we're coming from (where "A." would assuredly get credited with a bunch of unrelated low-information activity, and then probably changed to fit those misattributed data) possibly can. |
@marecaguthrie here is an example - Litho-Krome Company Before I did anything, this agent only had the following information Had I left it alone, instead of this on the catalog record: You would have seen
BUT I just took from remarks the "address" Columbus, Georgia and BOOM, now this is worthy of remaining an agent. With a few clicks, I was able to find their LinkedIn and a Bloomberg page, which both included a FULL address - and it appears this company is closed according to Google (plus their website is up for grabs). So, nothing that isn't already public was needed in order to "agentify" this strings-only agent, and now nothing in your records will change, nor will their public agent page except for the addition of the URLs, which are already public. But also, the agent is more complete and others can tell if it is the same Litho-Krome Company they have in their data or if there is a new incarnation of this company name. Hope that helps! |
So to summarize (because I know this can be a bit confusing, and we've been working on this for a long time):
@dustymc @Jegelewicz @lin-fred @droberts49 do I have the summary right? |
My only objection is around the categorization of "down-graded." "Verbatimizing" is a lateral move, functionally equivalent to any other approach. Bigger-picture it should result in a much more information rich environment where things like duplicates (which prevent giving proper credit) are much less likely to exist, so while the path may not be direct I think the end result is inevitably an up-grading. |
@dustymc I changed the wording. What do you think? |
Nice, one more request - consider changing "You can search verbatim agents" to "You can search verbatim agents; no functionality is lost" |
I think we should be good to go for the summary. Here is a clean version of it so we can link to this comment when we are discussing the issue. I'll try to keep track of developments and add to the summary as things come up: So to summarize (because I know this can be a bit confusing, and we've been working on this for a long time):
|
I finished going through and adding more data for the UAM:Art agents on the list attached in this issue. |
Assigning an archival database student to help. |
Thanks for explaining! That is reassuring!
|
I think we're done here. |
@dustymc can you post a list of agents that only have remarks you'd like us to start looking through?