-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cleanup: resolve pylint warnings #149
Conversation
- Update regex for input str - Validate that expected reference sequence matches actual reference sequence
@ahwagner I think it'd be good to get your review on this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. I made a few notes of possible changes but nothing important. Some of these ignore statements are probably fine to make QC pass but are things that would be good to refactor later (eg functions that are too long or have too many arguments).
I will also say that the amount of times we have to ignore the protected-access warning suggests that maybe we should just disable it project-wide.
In my original comment for the PR, I made a note about resolving the Sure, I'll go ahead and add |
per @ahwagner : |
@ahwagner made your suggested change of renaming the import name. Let me know if there's anything else that needs to be changed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks technically sound to me. I am less certain if these changes are procedurally correct. Specifically, see my comments about the localize mapping to localize_named_feature.
If you are convinced this mapping is correct, then please let @reece and me know so we can do a deeper code review.
allele_sl = ga4gh.vrs.models.Variation(**allele.as_dict()) | ||
del allele_sl._id | ||
allele_sl.location = self.localize(allele.location) | ||
allele_sl.location = self.localize_named_feature(allele.location, assembly_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right to me by the names and descriptions of these methods. Why would updating a sequence location (parent method) have anything to do with a named feature (child method)? My understanding of named features are annotated regions on a sequence, e.g. gene symbols, protein domains, named exons.
I would need to dig deeper into the code to understand if this is actually correct; and if so we need to reconsider the names and descriptions of these functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding of this and the notebook, these methods converts a Chromsome Location or Gene Location to a Sequence Location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@korikuzma is correct: the "localization" process is intended to convert a named feature to coordinates. I proposed (and I thought that you and Larry agreed) that we needed to be able to project all features onto a common coordinate system. This operation is necessary in order to identify overlapping features, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think I misunderstood; I thought we were "localizing" a sequence location here, but understand now the idea is to localize a feature or chromosome location. That seems appropriately named to me. Thanks @korikuzma and @reece.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I didn't actually run it. Please see comments inline.
allele_sl = ga4gh.vrs.models.Variation(**allele.as_dict()) | ||
del allele_sl._id | ||
allele_sl.location = self.localize(allele.location) | ||
allele_sl.location = self.localize_named_feature(allele.location, assembly_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@korikuzma is correct: the "localization" process is intended to convert a named feature to coordinates. I proposed (and I thought that you and Larry agreed) that we needed to be able to project all features onto a common coordinate system. This operation is necessary in order to identify overlapping features, for example.
Close #97
Notes:
Pylint static analysis
too-many
warnings, I just disabled. We can fix this in a later issue if we want to do more code clean up.disable=unused-argument
in translate_from methods will be fixed in Updatetranslate_from
signature + handling different from_translators #150fixme
for pylint. It seemed silly to comment it out with a disablenormalize
todo_normalize
inTranslator
due toRedefining name 'normalize' from outer scope (line 21) (redefined-outer-name)
. We can rename to something else if we don't likedo_normalize
. I am bad at naming.localize_allele
was calling a method that did not exist (localize
), so usedlocalize_named_feature
which seemed like what it should be calling._get_coords
usedchr_cb_map
but the docs said "return (start,end) of bandcb
in mapm
", so switched to usingm
.redefined-outer-name
. Open to better var names. (naming is hard x2)