Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regular output on stdout cut off after running parallel task #104

Closed
veewee opened this issue Mar 2, 2020 · 11 comments
Closed

Regular output on stdout cut off after running parallel task #104

veewee opened this issue Mar 2, 2020 · 11 comments

Comments

@veewee
Copy link

veewee commented Mar 2, 2020

Hello,

First things first: Thanks for your packages!

I'm trying to make a part of an existing tool async first.
However, when mixing parallel with regular symfony console output, the output gets cut off.
You can test it with this boilerplate code (assuming you have symfony/console in your dependencies)
I currently disabled the extensions, so parallel is using the internal php process implementation.

<?php
require 'vendor/autoload.php';

use function Amp\ParallelFunctions\parallel;
use function Amp\Promise\wait;

$input = new \Symfony\Component\Console\Input\ArgvInput();
$output = new \Symfony\Component\Console\Output\ConsoleOutput();
$style = new \Symfony\Component\Console\Style\SymfonyStyle($input, $output);

$somethingLong = array_map(
    static function ($i) { return $i.': lorem ipsum ....'; },
    range(0,999)
);

$style->writeln($somethingLong);
wait(
    parallel(function () {
        return 1;
    })()
);
$style->writeln($somethingLong);

As you can see: the first output list is complete.
However, the second output list behaves in an inconsistent strange way.

For example:

502: lorem ipsum ....
503: lorem i776: lorem ipsum ....
777: lorem ipsum ....
778: lorem ipsum ....

From time to time, it also cuts off the end:

942: lorem ipsum ....
943: lorem ipsum ....
944: lorem ipsum ....
945: lorem i%

My best guess is that this is related to the output streams being piped to an amphp/byte-stream:
https://github.com/amphp/parallel/blob/master/lib/Worker/Internal/WorkerProcess.php#L47-L48

If I replace the Symfony code with a low-level php implementation, it fails in a more consistent way:

$style = new class {
    /**
     * @var bool|resource
     */
    private $stream;

    public function __construct()
    {
        $this->stream = fopen('php://stdout', 'r+'); // same goes for :  STDOUT
    }

    public function writeln(array $messages)
    {
        fwrite($this->stream, implode(PHP_EOL, $messages));
    }
};
45: lorem ipsum ....
46: lorem ipsum ....
# end of command

Do you have any suggestion on how to solve this issue or how to unpipe the streams after running the async actions?

The reporting of the tool I am writing is done by symfony console instead of amphp at the moment. Would love to keep it like that

(More context about the actual implementation is available here : phpro/grumphp#741 - but the provided snippet should be sufficient to reproduce)

@kelunik
Copy link
Member

kelunik commented Mar 2, 2020

I guess stream_set_blocking(STDOUT, true) after the parallel part solves the issue?

@veewee
Copy link
Author

veewee commented Mar 2, 2020

Thanks for the fast answer!
Indeed @kelunik : that solves the issue.

Is this something that should be added to this package or rather something I should add to my implementation? It is probably a better DX if the package cleans up the stream changes once it's done?

@trowski
Copy link
Member

trowski commented Mar 3, 2020

Hmm… I'm not sure there's an easy way to know when we're done writing to STDOUT, as that doesn't happen in a single place and the assumption was made that you'd write to STDOUT using Amp's streams, accessible through Amp\ByteStream\getStdout().

If that doesn't work for you, I guess I would recommend setting blocking to true on STDOUT once you know you're done with Amp.

The Worker API attempted to make parallelizing tasks simple with a very minimal API and to make debugging easy and obvious. One of the choices we made was piping STDOUT/STDERR to the main process. You can have much more control over the behavior of child processes/threads if you use the Context API, which allows you you write a script that is executed in parallel.

@veewee
Copy link
Author

veewee commented Mar 4, 2020

Thanks for the detailed feedback! It are valid assumptions to make.
I will dive a bit deeper into the Context API and determine which of the 2 solutions fits our needs best.

@veewee veewee closed this as completed Mar 4, 2020
@kelunik
Copy link
Member

kelunik commented Mar 4, 2020

We could probably provide some option to disable the automatic piping?

@veewee
Copy link
Author

veewee commented Mar 4, 2020

That could be a nice solution to the problem as well...

@kelunik
Copy link
Member

kelunik commented Mar 5, 2020

@veewee Do you want to provide a PR?

@veewee
Copy link
Author

veewee commented Mar 6, 2020

Hello @kelunik,

Any thoughts on how to implement this?
I was investigating 2 paths:

  • Adding an env var (which results in a not so transparent api, but is probably the easiest solution)
  • Using the environment class for this : I am not sure why this is being used. It is never accessed ATM?

@kelunik
Copy link
Member

kelunik commented Mar 6, 2020

The environment class is meant to be used by tasks in the workers, so not suitable here, as we like to change the parent behavior.

My first thought was a global toggle like a function that's toggleable, but initialized by an environment variable or so. But I think it's better to introduce some kind of StreamHandler interface, that can be used to disable or customize the piping behavior.

The current piping happens in an internal class users shouldn't touch, but I guess we can pass such a stream handler down from DefaultPool.

ByteStream\pipe($stdout, ByteStream\getStdout());

Something like that maybe?

<?php

interface OutputSink {
    function getStdout(Worker $worker);

    function getStderr(Worker $worker);
}

@veewee
Copy link
Author

veewee commented Mar 6, 2020

Since I dont know the internals of this package well enough, I'm afraid adding the feature as you describe is a bit out of my league at the moment.

@kelunik
Copy link
Member

kelunik commented Mar 6, 2020

@veewee Could you open another issue then, so we don't forget about it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants