-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEAT-#4501: Add RSS Memory Profiling to Modin Logging #4502
Conversation
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one fix.
Codecov Report
@@ Coverage Diff @@
## master #4502 +/- ##
==========================================
+ Coverage 86.64% 89.37% +2.73%
==========================================
Files 226 229 +3
Lines 18298 21229 +2931
==========================================
+ Hits 15854 18974 +3120
+ Misses 2444 2255 -189
📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more |
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For logging memory specifically, I think we should probably use this: https://docs.python.org/3/library/logging.handlers.html#rotatingfilehandler
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
Signed-off-by: Naren Krishna <naren@ponder.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a quick question, but looks good to me otherwise! I think instead of writing over the file maybe it would be better to truncate the middle part? So that way users have start and end?
modin/logging/config.py
Outdated
logfile = RotatingFileHandler( | ||
filename=log_filename, | ||
mode="a", | ||
maxBytes=100000000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have maxBytes
set so high?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I set it to 10MB for each file (since it is being stored locally I think this is reasonable), and logging a max of 10GB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the 10GB max?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.python.org/3/library/logging.handlers.html#rotatingfilehandler -- because backupCount
is set to 100. At maximum you get 100 files of 10MB each. If people are running logs at that level anyhow and need support, they would likely reach out to us somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like 100 MB (10^8)? Is that what you meant to put? I think something like int(1e8)
would be more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be better to make this configurable instead of hard coded.
I think setting a default max of 10GB of logs is way too high. This is 10GB per Modin job, not total. I prefer to cap it at 100MB per job by default. If more is needed, they can set a configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So 10MB per file, and a max of 10 files, for a total of 100MB per job? Was looking through Google and the max MB per file for logs people recommend for a local filesystem is somewhere in the 10-50MB range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
modin/logging/config.py
Outdated
logfile = RotatingFileHandler( | ||
filename=log_filename, | ||
mode="a", | ||
maxBytes=100000000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be better to make this configurable instead of hard coded.
I think setting a default max of 10GB of logs is way too high. This is 10GB per Modin job, not total. I prefer to cap it at 100MB per job by default. If more is needed, they can set a configuration.
Signed-off-by: Naren Krishna <naren@ponder.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, @naren-ponder thanks!
I want to separate the different kinds of logs in a separate PR. Will open an issue for that.
Signed-off-by: Naren Krishna naren@ponder.io
What do these changes do?
Add RSS Memory Profiling at
LogMemoryInterval
interval to Modin logging. This helps with out-of-core measurements.flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst
is up-to-date