Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid modifying log level globally #1944

Merged
merged 3 commits into from
Nov 13, 2023

Conversation

droctothorpe
Copy link
Contributor

What this PR does / why we need it:
This PR eliminates the code that was setting the log level to INFO globally, as documented in this issue.

The get_job_logs implementation needs to be addressed before merging, but I figured it would be easier to talk about with a proper PR @johnugeorge.

Do you want it to not print to stdout by default and just return a data structure that looks like this?

{
    pod1: logs,
    pod2: logs,
    pod3: logs
}

Which issue(s) this PR fixes
Fixes # #1942

@coveralls
Copy link

Pull Request Test Coverage Report for Build 6804550532

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 4 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.03%) to 42.859%

Files with Coverage Reduction New Missed Lines %
pkg/controller.v1/mpi/mpijob_controller.go 4 81.05%
Totals Coverage Status
Change from base Build 6723816999: -0.03%
Covered Lines: 3751
Relevant Lines: 8752

💛 - Coveralls

@johnugeorge
Copy link
Member

Sounds good to me.

@droctothorpe
Copy link
Contributor Author

droctothorpe commented Nov 9, 2023

@johnugeorge I updated get_job_logs to return Dict[str, str].

There's one corner case, which is when follow is set to True. It seems like it's meant to function similar to kubectl logs <pod> -f, where emitting to stdout is the whole point, so I applied a print statement to that particular fork in the logic exclusively.

@coveralls
Copy link

coveralls commented Nov 9, 2023

Pull Request Test Coverage Report for Build 6851207814

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.01%) to 42.882%

Files with Coverage Reduction New Missed Lines %
pkg/controller.v1/mpi/mpijob_controller.go 2 81.24%
Totals Coverage Status
Change from base Build 6723816999: -0.01%
Covered Lines: 3753
Relevant Lines: 8752

💛 - Coveralls

@johnugeorge
Copy link
Member

@droctothorpe Tests failed

Copy link
Member

@kuizhiqing kuizhiqing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

BTW, for the format part, we can do something like this

logger = logging.getLogger(__name__)
# logger.setLevel(logging.INFO)
formatter = logging.Formatter(
    fmt='%(name)s %(levelname)s %(asctime)s %(message)s')
ch = logging.StreamHandler()
ch.setFormatter(formatter)
logger.addHandler(ch)

@droctothorpe
Copy link
Contributor Author

droctothorpe commented Nov 13, 2023

LGTM

BTW, for the format part, we can do something like this

logger = logging.getLogger(__name__)
# logger.setLevel(logging.INFO)
formatter = logging.Formatter(
    fmt='%(name)s %(levelname)s %(asctime)s %(message)s')
ch = logging.StreamHandler()
ch.setFormatter(formatter)
logger.addHandler(ch)

@kuizhiqing that's neat! Previously, the format was actually stripping all of the metadata:

logging.basicConfig(format="%(message)s")

I didn't implement the original code, but I suspect it was implemented to make logging function more like print (i.e. no structured log metadata). This PR eliminates that pattern, so modifying the logging format should no longer be necessary, but it's good to know that this is an option.

@droctothorpe
Copy link
Contributor Author

/retest

@droctothorpe
Copy link
Contributor Author

droctothorpe commented Nov 13, 2023

@johnugeorge I don't have retest privileges apparently 🙃 .

@johnugeorge
Copy link
Member

Thanks @kuizhiqing for review.
Thanks @droctothorpe for adding the fix.

/lgtm
/approve

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: droctothorpe, johnugeorge, kuizhiqing

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 230bfb4 into kubeflow:master Nov 13, 2023
32 checks passed
@@ -849,7 +853,7 @@ def get_job_logs(
if logline is None:
finished[index] = True
break
logging.info("[Pod %s]: %s", pods[index], logline)
print(f"[Pod {pods[index]}]: {logline}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for updating the logging.
@droctothorpe Do we want to use print here instead of logging ?
Maybe we should keep our logging in SDK consistent ?
For example, KFP uses logging.info also: https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/client/client.py#L470
And they also setup logging here: https://github.com/kubeflow/pipelines/blob/a9279843946183429f6572516acee6523de36e53/sdk/python/kfp/cli/__main__.py#L23

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @andreyvelich! I'm still wrapping my head around some of these nuances so take these answers with a grain of salt (Python logging is surprisingly complex).


Do we want to use print here instead of logging ?

print is more performant and recommended by the Python docs:

image

Source: https://docs.python.org/3/howto/logging.html#when-to-use-logging


And they also setup logging here:
https://github.com/kubeflow/pipelines/blob/a9279843946183429f6572516acee6523de36e53/sdk/python/kfp/cli/__main__.py#L23

Note that they only do this in the CLI. That's because the CLI is a user-facing API so to speak, i.e. it's not meant to be consumed by other libraries that have their own opinions on how logging should be configured. From the Python docs:

It is strongly advised that you do not add any handlers other than NullHandler to your library’s loggers. This is because the configuration of handlers is the prerogative of the application developer who uses your library. The application developer knows their target audience and what handlers are most appropriate for their application: if you add handlers ‘under the hood’, you might well interfere with their ability to carry out unit tests and deliver logs which suit their requirements.

Source: https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library

Copy link
Member

@andreyvelich andreyvelich Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, that makes sense, thanks for sharing these links!
I agree, that we should have appropriate logger for the SDK (e.g. logging.getLogger(__name__)) to not use root logger, but I still think that we should use logging to log some data for the user.

The problem with print is that user can't identify from which library the output was generated, but with logging we can configure it.

What do you think about this:

  • We can introduce a new parameter to TrainingClient() called verbose, and we can configure this parameter for various levels of logging: https://docs.python.org/3/library/logging.html#levels. We can start with (INFO, WARNING, and ERROR).
  • Depends on this parameter we can configure our logger in the constructor. For example, for verbose=10
self.logger = logging.getLogger(__name__)
if verbose == 20:
  self.logger.setLevel(logging.INFO)
  • And then we are going to use self.logger.info() or self.logger.warning() or self.logger.error() when it is required to print some data.

  • If user doesn't want to see any logs, they can always override it as follows:

logger = logging.getLogger("kubeflow.training.api.training_client")
logger.setLevel(logging.NOTSET)

Or provide verbose=0 as TrainingClient() parameter:

client = TrainingClient(verbose=0)

WDYT @droctothorpe @johnugeorge @kuizhiqing ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants