Broad api calls #20
Conversation
…ut value; this is being cached for single spectra
…; create a function to dynamically assign top-level conversion methods to Converters
Code looks good; the only problem is that asyncstdlib is not on conda!
Seems fine. I've added simple test data input and output files.
The efficiency of the double caching with both `execute_job_with_cache` and the `@lru_cache` method decorator is unclear to me. Testing with a significant dataset could clarify this, but is not required at the moment in my opinion.
Beware that the default lru_cache maxsize is unlimited.
Oh, and we have a 500MB .msp file in the UMSA library for some serious testing: https://umsa.cerit-sc.cz/library/list#folders/F1c84aa7fc4490e6d/datasets/9a34fd777b6c8572
@@ -6,6 +9,7 @@ class Converter:
    def __init__(self, session):
        self.session = session

    @lru_cache
Are you sure `self` is hashable here? It contains the session.
You are right; we have to make sure the default hash based on id is sufficient here. We will look into that in #1 (cache testing).
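To illustrate the point under discussion: `@lru_cache` on a method includes `self` in the cache key, so `self` must be hashable. A minimal sketch (the `Session` class and `lookup` method are hypothetical stand-ins, not the project's actual code):

```python
from functools import lru_cache

class Session:
    # Hypothetical stand-in for the real session object;
    # made explicitly unhashable here, like e.g. a dict would be.
    __hash__ = None

class Converter:
    def __init__(self, session):
        self.session = session

    @lru_cache
    def lookup(self, key):
        # `self` is part of the cache key. Converter defines no
        # __eq__/__hash__, so Python falls back to the default
        # identity-based hash, and caching works even though
        # self.session itself is unhashable.
        # Caveat: the cache holds a reference to self, which keeps
        # Converter instances alive for the lifetime of the cache.
        return key.upper()

c = Converter(Session())
assert c.lookup("abc") == "ABC"
assert c.lookup("abc") == "ABC"          # second call is a cache hit
assert Converter.lookup.cache_info().hits == 1
```

So the default id-based hash is indeed sufficient as long as two distinct `Converter` instances should never share cache entries.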
The current design requires specification of type
It seems like the default is 128 (the default arg value).
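For the record, both behaviors mentioned above exist in `functools.lru_cache`: the bare decorator defaults to `maxsize=128`, while `maxsize=None` gives the unbounded cache. A quick check:

```python
from functools import lru_cache

@lru_cache                 # bare decorator: equivalent to lru_cache(maxsize=128)
def square(x):
    return x * x

@lru_cache(maxsize=None)   # the truly unbounded variant
def cube(x):
    return x * x * x

assert square.cache_info().maxsize == 128
assert cube.cache_info().maxsize is None
```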
Improvement of API call efficiency: where possible, broader calls were created (instead of obtaining a single result, e.g. InChIKey, we obtain multiple attributes at once). Results are stored in a cache for the single spectrum's data, as the obtained result can be reused for another job. This was implemented in `Annotator.execute_job_with_cache`.

Another cache for async requests was implemented using an asynchronous version of lru_cache. Closes #8.

Additionally, the implementation of individual services was simplified where possible (mostly by separating calls and parsing), but this caused issue #19.