Issue 214 lru cache mem leak #240
Codecov Report
Coverage diff, master vs. #240:
            master    #240     +/-
Coverage    75.54%  73.01%  -2.53%
Files          109     109
Lines         7586    7598     +12
Hits          5731    5548    -183
Misses        1855    2050    +195
The errors thrown above are the DCOR issues you, @paulmueller, were talking about, right?
Just two minor changes ❤️
dclab/rtdc_dataset/core.py
      return length
      # Try to get the length from the feature sizes
    - keys = list(self._events.keys())
    + keys = list(self._events.keys()) + self.features_basin
Could you please add a separate case for self.features_basin here?
I fear that accessing this property might take very long and might not be required, i.e.
    keys = list(self._events.keys())
    for kk in keys:
        ...
    keysb = self.features_basin
    if keysb:
        return len(self[keysb[0]])
    raise ValueError...
The iteration over self._events is just a precaution, since some internal features do not implement len properly. But for basin features this should not happen.
When a dataset is created without any features of its own, only referencing features of its basin, there are no key-value pairs in self._events, so this would return a length of 0 despite the basin having multiple entries.
So I only need to look at the self.features_basin list if self._events.keys() is empty.
Similar to what you suggest, I could first check
    keys = list(self._events.keys())
    if not keys:
        keys = self.features_basin
    for kk in keys:
        ...
Would that make sense?
Yes, that would be a good solution 👍
As a one-liner I could also do:
    keys = list(self._events.keys()) or self.features_basin
Here the part after or would only be evaluated if self._events.keys() is empty. I think it looks a bit nicer and shorter.
Which do you prefer?
Yes, the or looks nicer.
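The fallback agreed on in this thread can be sketched roughly as follows. This is a minimal, hypothetical stand-in and not the actual dclab code: the class ToyDataset, the basin_lookups counter, and the body of the loop (which is elided in the discussion above) are assumptions made for illustration. It shows that the right-hand side of `or` is only evaluated when the dataset has no features of its own, so the potentially slow features_basin property is not touched otherwise.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset; not the dclab implementation."""

    def __init__(self, events, basin_features):
        self._events = events                  # features stored in the dataset
        self._basin_features = basin_features  # features only in the basin
        self.basin_lookups = 0                 # counts features_basin accesses

    @property
    def features_basin(self):
        # Assumed to be expensive in practice (e.g. a remote lookup).
        self.basin_lookups += 1
        return list(self._basin_features)

    def __getitem__(self, key):
        if key in self._events:
            return self._events[key]
        return self._basin_features[key]

    def __len__(self):
        # `or` short-circuits: features_basin is only read if keys is empty.
        keys = list(self._events.keys()) or self.features_basin
        for kk in keys:
            try:
                return len(self[kk])
            except TypeError:
                # Feature does not implement len properly; try the next one.
                continue
        raise ValueError("Could not determine size of dataset")


own = ToyDataset({"area_um": [1.0, 2.0, 3.0]}, {"deform": [0.1] * 5})
basin_only = ToyDataset({}, {"deform": [0.1] * 5})

own_length = len(own)                # uses own features, no basin lookup
basin_length = len(basin_only)       # falls back to the basin features
print(own_length, own.basin_lookups)
print(basin_length, basin_only.basin_lookups)
```

With own features present, the length is taken from them and the basin property is never read; for the basin-only dataset the property is read exactly once.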
CHANGELOG
@@ -1,4 +1,5 @@
0.54.3
 - ref: replace lru_cache on instance methods with cache attributes
I think this is a "fix". Please reference the issue number.
Sounds good!
The failing tests are not related to this PR.
Looks good to me. Can I merge?
Yes!
In this MR we refactor all functools.lru_cache decorators on instance methods, since they introduce a memory leak.
Currently, the only class that still contains lru_cache decorators on instance methods is DCORTraceItem in dclab/rtdc_dataset/fmt_dcor/events.py. Future refactoring steps will remove these.
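To illustrate why lru_cache on an instance method leaks, here is a minimal sketch that is independent of dclab. The classes LeakyItem and CachedItem and the helper is_collected_after_use are hypothetical names chosen for this example. The decorator's cache lives on the function object stored in the class and uses self as part of the key, so every instance that ever called the method stays strongly referenced. A per-instance cache attribute (one possible reading of the "cache attributes" named in the changelog entry) is released together with the instance.

```python
import functools
import gc
import weakref


class LeakyItem:
    """Hypothetical class: lru_cache on a method keeps `self` alive."""

    @functools.lru_cache(maxsize=32)
    def compute(self, index):
        return index * 2


class CachedItem:
    """Hypothetical replacement using a per-instance cache attribute."""

    def __init__(self):
        self._cache = {}

    def compute(self, index):
        if index not in self._cache:
            self._cache[index] = index * 2
        return self._cache[index]


def is_collected_after_use(cls):
    """Return True if an instance is freed once its last reference is dropped."""
    obj = cls()
    obj.compute(1)
    ref = weakref.ref(obj)
    del obj
    gc.collect()
    return ref() is None


leaky_freed = is_collected_after_use(LeakyItem)
cached_freed = is_collected_after_use(CachedItem)
print(leaky_freed, cached_freed)  # prints: False True
```

The leaked LeakyItem instance is only released once its entry is evicted from the cache or LeakyItem.compute.cache_clear() is called.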