Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new command: annotate #292

Merged
merged 15 commits into from
Aug 1, 2022
Merged

Add new command: annotate #292

merged 15 commits into from
Aug 1, 2022

Conversation

hrshdhgd
Copy link
Contributor

Fixes #280

@hrshdhgd hrshdhgd requested a review from matentzn July 18, 2022 22:12
matentzn
matentzn previously approved these changes Jul 19, 2022
sssom/io.py Outdated Show resolved Hide resolved
sssom/io.py Show resolved Hide resolved
sssom/io.py Outdated Show resolved Hide resolved
sssom/io.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@matentzn matentzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry meant to comment before, not Approve, just make sure you read the comments. Great job :)

@hrshdhgd hrshdhgd requested a review from matentzn July 19, 2022 20:23
if len(v) == 1:
msdf.metadata[k] = v[0]
else:
msdf.metadata[k] = list(v)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this is that if have have to creators in the metadata and want to add a third one.. it wont work. Would it make sense to make the method strictly additive for multi valued metadata? Maybe not. Do you have an opinion on this? Maybe it is correct the way it is..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! We will have to think about it. When is it additive and when is it replacement? Should there be a flag?

Copy link
Contributor Author

@hrshdhgd hrshdhgd Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what I have implemented now is if a slot to be annotated is multivalued , add new values to the list. But what if the entire slot value needs to be changed? How does the user indicate that?

Possible solutions:

  • create a new method remove_annotation which does exactly the opposite of annotate? It seems inconvenient in my opinion.
  • Have a keyword in params called remove which will be handled separately and would look like this:
params = {
            "mapping_set_id": ("http://w3id.org/my/mapping.sssom.tsv",),
            "mapping_set_version": ("2021-01-01",),
            "creator_id": ("orcid:2345"),
            "remove": { "creator_id": "orcid:1234" }
        }

Basically , remove will be a dict with key = slot to be edited and value will be the one to be removed.

  • Have a --remove CLI option that lists the metadata that needs to be removed. The only allowed values here would be MULTIVALUED_SLOTS or maybe all slots)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about a boolean flag --replace-multivalued which defaults to false, and if it is true, it will remove any existing values from a multivalued value?

Copy link
Contributor Author

@hrshdhgd hrshdhgd Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what if the users want to retain some of the contents of the list and add new stuff? They'll have to type the ones they want again. If that's the case, the first solution would have worked and was a simple one too.

Copy link
Contributor Author

@hrshdhgd hrshdhgd Jul 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I've implemented replace_multivalued. I've added a TODO comment next to the click option. If we can think of a cleaner way to do this, we can always go back and improve the code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. good one :)

matentzn
matentzn previously approved these changes Jul 20, 2022
Copy link
Collaborator

@matentzn matentzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! THANK you! The only tiny thing that is slightly unsatisfactory, is the fact that the sanitation of the kwars happens during annotate_file, but nothing keeps an API user to pass a broken "meta" dict into augment_metadata method.

@hrshdhgd
Copy link
Contributor Author

Added a function are_params_slots that checks for slots in both annotate_file and augment_metadata

@hrshdhgd hrshdhgd requested a review from matentzn July 29, 2022 18:32
@hrshdhgd hrshdhgd merged commit 13e2b3a into master Aug 1, 2022
@hrshdhgd hrshdhgd deleted the issue-280 branch August 1, 2022 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New method sssom annotate
2 participants