Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to inject a context manager around TaskProcess.run #2449

Merged
merged 2 commits into from
Jul 11, 2018

Conversation

ulzha
Copy link
Contributor

@ulzha ulzha commented Jun 28, 2018

Description

Adds a configurable context manager around TaskProcess.

Motivation and Context

The intent with this is to enable customizable collection of logs per task (stdout and stderr output from any Task's run() method, including any subprocesses spawned by it), as opposed to having to read through the worker's logs where outputs from all the tasks are interspersed.

Inspired by CaptureOutput from the opensource capturer package.

Such configurable context managers could be useful extension points for "plugins" made for other purposes too. (Related: #1897)

Have you tested this? If so, how?

Has unittests.

@ulzha ulzha force-pushed the capturer branch 2 times, most recently from fe52cf3 to 5c986c9 Compare July 2, 2018 19:12
@ulzha ulzha changed the title (WIP) Capture stdout and stderr by task Capture stdout and stderr by task Jul 2, 2018
@ulzha ulzha changed the title Capture stdout and stderr by task Allow to inject a context manager around TaskProcess.run Jul 2, 2018
The intent with this is to enable collection of logs (stdout and stderr output) per task, as opposed to having to read through the worker's logs where all the task outputs are interspersed.
Copy link
Contributor

@Tarrasch Tarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So cool! Have you tested this in real code yet?

luigi/worker.py Outdated
@@ -258,6 +259,27 @@ def terminate(self):
return super(TaskProcess, self).terminate()


# TODO be composable with arbitrarily many custom context managers?
# Introduce a convention shared for extension points other than TaskProcess?
# Use https://docs.openstack.org/stevedore?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could just link to our issue with the full debate/context on plugins?

@@ -419,6 +441,12 @@ class worker(Config):
force_multiprocessing = BoolParameter(default=False,
description='If true, use multiprocessing also when '
'running with 1 worker')
task_process_context = Parameter(default=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider making this a "list" by making a comma-separate name of classnames?

But I guess users can make their single context-manager itself have multiple ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly, that would be the next step I could think of; about to add in a separate PR.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will throw a warning because None is not allowed in string parameter:

UserWarning: Parameter "task_process_context" with value "None" is not of type string.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will throw a warning because None is not allowed in string parameter:

UserWarning: Parameter "task_process_context" with value "None" is not of type string.

I wonder why can't it be defaulted to "" instead, like in

task_process_context = Parameter(default="",

@ulzha any thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, you can leave the None default and make it an OptionalParameter

luigi/worker.py Outdated

def run(self):
if self.context:
logger.debug('Instantiating ' + self.context)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something you can grep easier like "Importing module for "?

@Tarrasch
Copy link
Contributor

Feel free to merge this. Perhaps someone else from Spotify should review too?

@NatashaL
Copy link
Contributor

So cool! Have you tested this in real code yet?

Yes, tested 👍

@NatashaL NatashaL merged commit b791d57 into spotify:master Jul 11, 2018
@ulzha ulzha deleted the capturer branch July 11, 2018 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants