[data] Basic structured logging #47210
Signed-off-by: Matthew Owen <mowen@anyscale.com>
Signed-off-by: Matthew Owen <mowen@anyscale.com>
Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: Matthew Owen <omatthew98@berkeley.edu>
Co-authored-by: Hao Chen <chenh1024@gmail.com> Signed-off-by: Matthew Owen <omatthew98@berkeley.edu>
18a921c to 7c6313d
@@ -0,0 +1,34 @@
version: 1
What is this YAML file format? Is this something specific to Ray Data? Is there documentation about this?
The YAML file follows Python's logging configuration dictionary schema, so it is specific to Python rather than to Ray Data (more docs here). We just load the YAML directly with something like
    import logging.config
    import yaml

    with open(config_path) as file:
        config = yaml.safe_load(file)
    logging.config.dictConfig(config)
We could probably do a better job explaining that this can be used in the brief logging docs here.
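For readers unfamiliar with the format, here is a minimal sketch of the kind of dictionary that `logging.config.dictConfig` accepts. The dictionary stands in for what `yaml.safe_load` would return for a small YAML file, so the example runs without PyYAML. The formatter, handler, and logger names are hypothetical and are not the ones used in this PR.

```python
import io
import logging
import logging.config

# A stream we control so the example's output can be inspected.
log_stream = io.StringIO()

# Equivalent of a small YAML file after yaml.safe_load(); all names here
# ("plain", "buffer", "example.data") are illustrative assumptions.
config = {
    "version": 1,  # the only schema version logging.config currently accepts
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "buffer": {
            "class": "logging.StreamHandler",
            "formatter": "plain",
            "stream": log_stream,
        },
    },
    "loggers": {
        "example.data": {
            "level": "INFO",
            "handlers": ["buffer"],
            "propagate": False,
        },
    },
}

logging.config.dictConfig(config)
logging.getLogger("example.data").info("dataset execution started")
```

In a YAML file the same structure would be written as nested mappings, with the top-level `version: 1` key seen in the diff above.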
maybe add a comment in this file explaining the origin/format of this YAML file?
No blocker for me, but we should probably add some tests to ensure the behavior, and follow up with product and doc changes.
Also, maybe out of scope, but right now everything is configured through env vars. Does it make sense to create a logging config so users can configure other things, such as log level or log file locations?
Users could create their own
Yep, I understand there are essentially unlimited degrees of freedom for users to define their own YAML, but then they would have to specify everything on their own, right? Just wondering if there is something in between for a user who just needs a small tweak. But I totally understand if there is no such use case for Ray Data; no need to over-engineer this.
Signed-off-by: Matthew Owen <mowen@anyscale.com>
LGTM!
I think to keep things simple we will leave this as is. My understanding is that the vast majority of users will use the default settings (maybe not even JSON file logging), and those who do need something custom will likely have the expertise to set things up in whatever way they desire. We can revisit in the future if this becomes a pain point.
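For the "small tweak" case raised above, one possible middle ground is to start from a default configuration dictionary and override a single field before applying it. This is a minimal sketch under stated assumptions: `DEFAULT_CONFIG` and `with_log_level` are hypothetical names and are not part of this PR.

```python
import copy
import logging
import logging.config

# Stand-in for the dictionary loaded from a default YAML file (hypothetical).
DEFAULT_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler", "level": "INFO"},
    },
    "loggers": {
        "example.data": {"level": "INFO", "handlers": ["console"]},
    },
}


def with_log_level(base_config: dict, logger_name: str, level: str) -> dict:
    """Return a copy of base_config with one logger's level replaced."""
    tweaked = copy.deepcopy(base_config)  # leave the shared default untouched
    tweaked["loggers"][logger_name]["level"] = level
    return tweaked


custom_config = with_log_level(DEFAULT_CONFIG, "example.data", "DEBUG")
logging.config.dictConfig(custom_config)
```

The deep copy keeps the default dictionary reusable, so a user only states the one value they want to change.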
python/ray/data/_internal/logging.py
Outdated
    os.path.join(os.path.dirname(__file__), "logging_json.yaml")
)

# Environment variable to specify the encoding of the log messages
Also add a comment listing which options are available?
python/ray/data/_internal/logging.py
Outdated
environment variable. If the variable isn't set, this function loads the
"logging.yaml" file that is adjacent to this module.
Docstring is slightly out-of-date with the code
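The lookup described by the docstring and by the encoding variable can be sketched roughly as follows. The environment variable names, the precedence between them, and the function name are assumptions made for illustration; only the two YAML file names and the TEXT/JSON options come from this PR.

```python
import os
from typing import Mapping

# Hypothetical variable names; the PR's actual names are not shown here.
CONFIG_PATH_ENV = "EXAMPLE_LOGGING_CONFIG"
ENCODING_ENV = "EXAMPLE_LOG_ENCODING"


def resolve_config_path(env: Mapping[str, str], module_dir: str) -> str:
    """Pick a logging config file: explicit path, then encoding, then default."""
    explicit_path = env.get(CONFIG_PATH_ENV)
    if explicit_path:
        # Assumed precedence: a user-supplied config file wins outright.
        return explicit_path

    encoding = env.get(ENCODING_ENV, "TEXT").upper()
    if encoding == "JSON":
        return os.path.join(module_dir, "logging_json.yaml")
    if encoding == "TEXT":
        return os.path.join(module_dir, "logging.yaml")
    raise ValueError(f"Unsupported log encoding: {encoding!r}")
```

Passing the environment in as a mapping (for example `os.environ`) keeps the function easy to test and makes the set of accepted options explicit, which is what the review comments above ask the code comments and docstring to spell out.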
Signed-off-by: Matthew Owen <mowen@anyscale.com>
Signed-off-by: Matthew Owen <mowen@anyscale.com>
Why are these changes needed?
Adds structured logging to Ray Data. This will allow users to configure logging to use any of the following:
Examples:
Code snippet:
JSON logging (new)
Console output:
ray-data.log:
TEXT logging (unchanged)
Console output:
ray-data.log:
Related issue number
Checks
git commit -s) in this PR.
scripts/format.sh to lint the changes in this PR.
method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.