Skip to content
This repository has been archived by the owner on Jan 29, 2020. It is now read-only.

#223 - SapiStreamEmitter::emit() - high memory usage - memory exhaust #224

Conversation

fcabralpacheco
Copy link
Contributor

No description provided.

@Ocramius
Copy link
Member

Besides CS, this looks good to me!

Wondering if there's a way to measure memory peak usage, but I guess there isn't a proper approach...

Probably something to be tried with memory_get_peak_usage()

@Ocramius Ocramius added the bug label Jan 14, 2017
@Ocramius Ocramius self-requested a review January 14, 2017 15:38
@Ocramius Ocramius added this to the 1.3.9 milestone Jan 14, 2017
@fcabralpacheco
Copy link
Contributor Author

I was wondering the same thing about a better way to measure peak memory usage.
I tried to implement something using the "memory_get_peak_usage" and "memory_get_usage".
With first function I think it is strange the behavior that the test would have, because it would be necessary to allocate memory until the last known peak, make the "emit" call, and check the new peak. In this situation if some previous test consumed a lot of memory, the new test would reach a higher usage peak than previous.
Trying to find a better way I used the "memory_get_usage", which usualy is used for function profiling.
It was necessary to check at each tick the memory consumption, because PHP deallocates the memory at the end of the call, and the local memory usage must be subtracted from peak (prophecy records all returns, making memory usage twice).
In the end, the test can detect two incorrect behaviors of the "emit" method (in readable streams):

  • when the whole body of the response is stored in memory;
  • when the "emit" method ignores the value of $maxBufferLength;

I configured the test data to keep memory usage lower (~ 8M x 2) in this particular test.

In addition, I made small changes to previous tests to make it easier to find the reason for the test failures (not throwing unecessary exception).

@fcabralpacheco
Copy link
Contributor Author

The test does not work on hhvm.
Maybe I can check if the tests are running on an hhvm (HHVM_VERSION), but I think it's not correct.

@Ocramius
Copy link
Member

Looking good! Will need to check it out locally and verify it before merging.

@fcabralpacheco
Copy link
Contributor Author

I found another way to track memory usage, I was already using ob_start callback (with small chunck size) to lower memory usage, excluding new output buffer contents (returning null string). Now it also calls the closure that checks the memory consumption.

@fcabralpacheco
Copy link
Contributor Author

I've kept memory track closure apart of output_callback (closure call closure) because of two have different objectives, and the first one can be used in another way (register_tick_function, ob_start, or in read, toString, getContents).

@Ocramius Ocramius self-assigned this Jan 17, 2017
@Ocramius
Copy link
Member

Linking #223

@Ocramius Ocramius changed the title Fix issue #223 #223 - SapiStreamEmitter::emit() - high memory usage - memory exhaust Jan 17, 2017
Ocramius added a commit that referenced this pull request Jan 17, 2017
Ocramius added a commit that referenced this pull request Jan 17, 2017
Ocramius added a commit that referenced this pull request Jan 17, 2017
@Ocramius Ocramius closed this in ca95165 Jan 17, 2017
Ocramius added a commit that referenced this pull request Jan 17, 2017
Ocramius added a commit that referenced this pull request Jan 17, 2017
Ocramius added a commit that referenced this pull request Jan 17, 2017
Ocramius added a commit that referenced this pull request Jan 17, 2017
@Ocramius
Copy link
Member

@fcabralpacheco thanks! Merge, added some fixups (only CS), and preparing a release 👍

@fcabralpacheco
Copy link
Contributor Author

@Ocramius Thanks for CS fixups. I'll try to improve next time.

Just one detail, it's missing two returns, after line 66 and 96 of file SapiStreamEmitter.php.

The tests were successful because in readable stream they are heavy based on CallbackStream, which detach the callback after a successful call of "getContents" or "__toString", ending in a "eof" condition (maybe I can improve the first tests I wrote to fail if it detects "read" calls in non-readable streams).

Also a comment, in "emitBodyRange" may be that someone in the future believes that it is necessary to skip the bytes of the stream until reach first position (in non seekable and readable streams).
I think that in this case, it is necessary to consider that the need of "seek(0)" or "seek($first)" is due to the fact that the emitters can be stacked, using the stream more than once if necessary.
But when the stream is non seekable (and readable) you are usually reading from a source where the cost of a seek is too high or not acceptable (e.g.: network streaming), and for performance reasons it is more usual to retrieve and deliver the stream in the right position (first byte of range) to the emitter and for just one utilization.
In the end, in most cases skipping the first bytes would mean that for each connection the application would retrieve and discard the first bytes of the stream without any utility.

@Ocramius
Copy link
Member

Ocramius commented Jan 17, 2017

Argh, indeed, I need to adapt the tests. Will revert the patch and check why the tests aren't failing with the missing return statements.

@Ocramius Ocramius reopened this Jan 17, 2017
@Ocramius Ocramius modified the milestones: 1.3.10, 1.3.9 Jan 17, 2017
@Ocramius
Copy link
Member

I think a few never expectations need to be added to the suite :-\

@fcabralpacheco
Copy link
Contributor Author

I think adding more data sets to tests "testEmitBody" and "testEmitBodyRange" with "$expectedReads = 0" and returning false to the method "isReadable" in this situation may be sufficient to detect this kind of problem.

@fcabralpacheco
Copy link
Contributor Author

@Ocramius
My last thought don't work.
I just realized something.
If the stream is properly implemented, "getContents" or "__toString" always ends in "eof" condition.
That way the missing returns only causes 1 or 2 more checks (on if or while) before return.

@fcabralpacheco
Copy link
Contributor Author

fcabralpacheco commented Jan 18, 2017

@Ocramius

Please check the new PR, I improved the tests a little bit. Thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants