PHP memory leak #1128
Relevant unit test:
A historical PR that added that test: |
I checked a few quick ideas to give the next person some entry points and potentially save a few hours of looking for these: I just quickly tested whether increasing the http://localhost:5400/website-server/?url=/crash.php
diff --git a/packages/playground/remote/src/lib/worker-thread.ts b/packages/playground/remote/src/lib/worker-thread.ts
index cfc77fea2..06b78152d 100644
--- a/packages/playground/remote/src/lib/worker-thread.ts
+++ b/packages/playground/remote/src/lib/worker-thread.ts
@@ -241,6 +241,7 @@ const [setApiReady, setAPIError] = exposeAPI(
try {
php.initializeRuntime(await recreateRuntime());
+ php.setPhpIniEntry('memory_limit', '256M');
if (startupOptions.sapiName) {
await php.setSapiName(startupOptions.sapiName);
@@ -286,6 +287,20 @@ try {
'playground-includes/wp_http_dummy.php': transportDummy,
'playground-includes/wp_http_fetch.php': transportFetch,
});
+ php.writeFile('/wordpress/crash.php', `<?php
+ function useAllMemory() {
+ echo "Initial memory usage: " . (memory_get_usage()/(1024*1024)) . ' MB <br>';
+ // ini_set('memory_limit', '1024MB'); // The php limit seems to be 128MB but it doesn't affect the results.
+ $data = '';
+
+ while (true) {
+ $data .= str_repeat('a', 1024 * 1024); // Increase string size by 1MB in each iteration
+ echo "* " . (memory_get_usage()/(1024*1024)) . ' MB <br>';
+ }
+ }
+ useAllMemory();
+ die();`)
+
if (virtualOpfsDir) {
await bindOpfs({
Which is weird considering how:
and wordpress-playground/packages/php-wasm/compile/php/Dockerfile Lines 839 to 840 in 06f27eb
I was, however, able to use a bit more memory before crashing by not extracting WordPress files into MEMFS – I think HEAP is used to store their content: |
@adamziel Thanks for doing all this work to help narrow the possibilities. |
I am looking at whether we can just build PHP with debug info and step through it in the browser. This page suggests it is possible at least sometimes: So far, I've asked emscripten to build with debug info via the Now I'm looking at whether we can step through it in Chrome. The original source files may be required, and that might be a challenge since the build is done within Docker. We'll see. |
With Chrome Canary, after tweaking the Dockerfile to add the With some tweaks we should be able to fix this. Then we can hopefully step through the debugger to observe where and why PHP thinks it is out of memory. |
@brandonpayton check |
@adamziel thanks for the pointer. I only ran into source map things in the Dockerfile after beginning to add DWARF info. Then I saw something that suggested the DWARF route would offer richer info while debugging. That promise hasn't panned out yet, but I'm working on it. Thankfully, to address the missing source files, the debugging extension has a setting to map paths, designed for cases like ours (they even mention Docker-based builds). Here's what I'm seeing so far with the crash. We get the call stack, which may not be super useful, but it does help us identify where the "Allowed memory size" error is occurring (there are 3-4 places in PHP 8.0 where that error is reported). Unfortunately, variable values are rarely available in the debugger, so I'm working on that. It's impressive that the reflected call stack includes the JS functions that invoked the Wasm export. |
I tried the latest emscripten version to see whether it might produce more complete debug info, but the variable values are still largely unavailable. One interesting thing is that, after building PHP 8.0 with the latest emcc, the error changed to simply "Out of memory", which comes a bit later than the error seen above. It seems to indicate that the limit checking said everything was OK but the actual allocation failed. This might just be due to changing emscripten options/behavior leading to different preprocessor behavior during build (i.e., maybe the memory limit checking didn't run at all). Either way, we don't have to have var values in the debugger right now. As a next step, I plan to patch PHP 8.0 to augment the error message, including whatever info we want from within |
I know it probably doesn't matter here, but I was curious and found this page suggesting that variable names in the inspector are possible: https://developer.chrome.com/docs/devtools/wasm Also: https://www.perplexity.ai/search/Debug-wasm-variable-0L7fbRR3T0iM2LqSMnt39g |
I switched back to emscripten 3.1.43 and am thankfully still seeing var values in the debugger. It works best when I let the page load Playground and open the dev tools after. If dev tools are opened before loading Playground for the first time, no php-src source files are listed. Setting memory limit to 1024MB via
Planning to dig into this more on Monday. |
Even without repeated The following version of the test script:
<?php
ini_set('memory_limit', '64M');
function useAllMemory() {
echo "Initial memory limit: " . ini_get( 'memory_limit' ) . '<br>';
echo "Initial memory usage: " . (memory_get_usage()/(1024*1024)) . ' MB <br>';
echo "<br>";
$data = '';
$counter = 0;
$total_strlen = 0;
$tail = str_repeat('a', 64 * 1024); // Increase string size by 64KB in each iteration
for ($counter = 0; $counter < 1000; $counter++) {
$data .= $tail;
$usage = memory_get_usage();
$strlen = strlen($data);
$total_strlen += $strlen;
echo "* iteration: $counter <br>";
echo "* strlen(): " . ($strlen/(1024*1024)) . " MB <br>";
echo "* memory_get_usage(): " . ($usage/(1024*1024)) . " MB <br>";
echo "* aggregate strlen()'s: " . ($total_strlen/(1024*1024)) . " MB <br>";
echo "<br>";
}
}
useAllMemory();
echo "Completed without OOM<br>";
die();
This script printed the following results:
PHP reports just 14.5MB of memory used, yet somehow it is unable (or believes it is unable) to allocate more memory. For next steps, I'm planning to:
Also, planning to read this article for ideas: |
This appears to be the function PHP calls to make room for the concatenation. I've highlighted the line requesting allocation. It is part of the call stack where the OOM occurs, and it is the furthest up the stack I could get before getting into PHP opcode processing. The dev tools aren't showing which opcode, or I would share that here. |
And this is where PHP asks the OS for memory and detects a failure: Now to see what Emscripten is doing with the mmap() call... |
NOTE: When OOM occurs, the debugger shows that the requested memory size is about 14MB. |
Memory allocated with mmap() is freed with munmap(). PHP mostly ignores failed munmap() calls, except that it logs to STDERR when ZEND_MM_ERROR is defined: I went to define ZEND_MM_ERROR so we could see munmap() failures in the debugger and found this, which implies there were such failures before we silenced them:
Next: Confirm that these failures are occurring for this OOM and, if so, try to find out why. |
Yep. This is what I see in the console after setting ZEND_MM_ERROR=1. Some progress on getting visibility into Emscripten library behavior: I haven't read or understood the implementation yet, but at least we can take a look. Planning to continue with this tomorrow. Other notes:
PHP does appear to use anonymous regions, but I'm not yet sure whether PHP attempts partial unmapping. Something to check. For tomorrow:
|
This is top-notch work @brandonpayton! |
❤️ thanks, @adamziel!
I instrumented PHP 8.3 to print the address and size of each mmap() and munmap() attempt, and PHP does appear to attempt partial unmapping. From one of the test runs (full log here), we can see the first munmap() failure occurs when attempting to unmap with a smaller size than previously mapped memory at the same address.
Specifically, this is the first munmap() failure:
And the second munmap() failure is an attempt to partially unmap within the same memory space allocated earlier at address 0x6320000 and continuing until address 0x6720000.
Note that PHP is requesting a partial unmapping, something the emscripten libs apparently don't support. And when it fails, it does nothing (but does log to stderr if ZEND_MM_ERROR is truthy). I don't yet know what we can do about this. Here are some questions we can pursue. Do any others come to mind?
Tomorrow, I plan to start by walking through more of php-src to see if I missed other memory allocation options and to see whether providing custom memory storage is a possible solution. |
I’ve had a similar experience with a bunch of socket reading options and they were relatively easy to add with a patch. If we’re lucky, this would also be the case here. |
That's good to hear! OK, I'll take a look at the Emscripten libs first. It would be ideal not to have to customize PHP behavior but rather have the underlying platform behave as expected. |
Emscripten's mmap() and munmap() rely upon a generic API supported by multiple allocators:
// Direct access to the system allocator. Use these to access that underlying
// allocator when intercepting/wrapping the allocator API. Works with both
// dlmalloc and emmalloc.
void *emscripten_builtin_memalign(size_t alignment, size_t size);
void *emscripten_builtin_malloc(size_t size);
void emscripten_builtin_free(void *ptr);
And emscripten simulates anonymous mmap() and munmap() using But it seems like unmapping support could be added to an allocator. An allocator tracks allocated and free regions, and AFAICT, all a partial unmapping should do is take an allocated region and break it into a free region and one or more allocated regions. Conceptually partial unmapping can do one of four things:
At a high level, this seems doable. Doing it sounds like a lot of fun, but a simpler solution would be better. I've spent a fair amount of time thinking about this issue, re-reading php-src code, and testing various ideas (e.g., playing with declared alignments in an attempt to avoid reasons PHP munmaps). So far, adding partial unmapping support is my only idea. If we go ahead with this...
Our current allocator is dlmalloc, a generic malloc implementation from elsewhere that was adopted by Emscripten, and it tracks chunks using a combination of circular doubly-linked lists and a form of trees (specifically, tries). In our current version of Emscripten, its source file is ~6400 lines. But Emscripten also has its own minimalistic allocator called "emmalloc". It's billed as a "Simple minimalistic but efficient sbrk()-based malloc/free that works in singlethreaded and multithreaded builds.". It "subdivides regions of free space into distinct circular doubly linked lists, where each linked list
If we're going to add support for partial unmapping of anonymous regions, it is probably best to try first with the simpler allocator. I'm not sure which is the better allocator for PHP/WP performance, but for a PoC, it seems better to pick the simpler starting point. In very brief testing, emmalloc seems fine when rendering a WP home page, and if the performance is comparable for the average case, it's also the smaller implementation. |
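The bookkeeping behind "break an allocated region into a free region and one or more allocated regions" can be sketched independently of any allocator. The C example below is only an illustration: `region_t` and `split_region` are hypothetical names, and the four outcomes it distinguishes (whole region, head, tail, middle) are one possible enumeration, not necessarily the four cases meant in the comment above. It does not touch dlmalloc or emmalloc internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end). */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} region_t;

/* Computes which parts of `mapped` stay allocated after unmapping `hole`.
 * Writes 0, 1, or 2 surviving regions to `out` and returns that count:
 *   0 -> the whole region was unmapped
 *   1 -> the head or the tail was trimmed
 *   2 -> a hole was punched in the middle
 * Returns -1 if `hole` is empty or not fully inside `mapped`. */
static int split_region(region_t mapped, region_t hole, region_t out[2])
{
    assert(mapped.start < mapped.end);

    if (hole.start >= hole.end ||
        hole.start < mapped.start || hole.end > mapped.end) {
        return -1;
    }

    int n = 0;
    if (hole.start > mapped.start) {
        /* Part before the hole stays allocated. */
        out[n].start = mapped.start;
        out[n].end = hole.start;
        n++;
    }
    if (hole.end < mapped.end) {
        /* Part after the hole stays allocated. */
        out[n].start = hole.end;
        out[n].end = mapped.end;
        n++;
    }
    return n;
}
```

In a real allocator the hard part is not this arithmetic but updating the free lists and region headers consistently, which is where the choice between dlmalloc and emmalloc matters.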
There are some updates on our continued investigation here and here. The state of this issue is:
|
@adamziel I haven't finished testing yet, but updating wasm_memory_storage to zero allocated memory looks like it might solve the issue. Apparently many mmap() implementations zero memory for anonymous mmap. And PHP may depend on this behavior. I'm currently rebuilding all php-wasm versions for a real PR, but you can check out c304fec under #1220 for the potential fix. The unit tests are passing there, but the e2e tests are not. I'm not yet sure if the failures are related to the memory leak fix. |
This PR attempts to fix the memory leak reported in #1128 and is an iteration on PR #1189 which had problems with "memory out of bounds" errors.
## What problem is it solving?
It stops PHP from leaking memory throughout the runtime of a script and hopefully stops memory out of bounds errors by zeroing all memory given to PHP.
## How is the problem addressed?
- By avoiding mmap()/munmap() which have incomplete implementations in Emscripten
- By using posix_memalign() to allocate memory instead and manually zeroing the memory before handing it to PHP
## Testing Instructions
- Observe CI test results
- Use `npm run dev` and exercise Playground locally in the browser
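The approach described in this PR, aligned allocation followed by explicit zeroing, can be sketched in plain C as follows. This is a minimal illustration under stated assumptions and not the code merged in the PR: `sketch_chunk_alloc` and `sketch_chunk_free` are hypothetical names, and the real fix plugs into PHP's memory storage hooks rather than standing alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns `size` zeroed bytes aligned to `alignment`,
 * or NULL on failure. `alignment` must be a power of two and a multiple of
 * sizeof(void *), as posix_memalign() requires. */
static void *sketch_chunk_alloc(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return NULL;
    }

    /* Anonymous mmap() hands out zero-filled memory, while posix_memalign()
     * does not guarantee it, so zero the chunk before handing it over. */
    memset(ptr, 0, size);
    return ptr;
}

/* Hypothetical counterpart: the chunk is always released as a whole, so no
 * partial unmapping is ever requested from the platform. */
static void sketch_chunk_free(void *chunk)
{
    free(chunk);
}
```

The trade-off is an extra memset() per chunk in exchange for not depending on mmap()/munmap() semantics that the platform implements only partially.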
We merged a potential fix in #1229. Let's see how it goes and close this if all is indeed well. |
@brandonpayton with #1229 merged, the original reproduction scenario still triggers the out of memory error in a browser. I can see the node.js test is passing, that's weird! |
@adamziel the original reproduction scenario always triggers an out of memory error because it uses an infinite loop that concatenates strings. Is this what you are testing with?
while (true) {
    $data .= str_repeat('a', 1024 * 1024); // Increase string size by 1MB in each iteration
    echo "* " . (memory_get_usage()/(1024*1024)) . ' MB <br>';
}
I wish we could see the output prior to the fatal, but now, the error is just printed to a console with no output shown. I will see if we can change that back so we actually see output prior to error. |
It looks like that behavior probably has to do with this PR: The main PHP script output used to print in the browser, but now, when the script exits with an error, the error is printed to the console with no partial response shown in the content pane. For at least the web interface, it seems like showing a partial response to the user is more helpful than showing nothing. Planning to file a bug for this. cc @bgrgicak |
@adamziel, it turns out I am able to reproduce this failure as well with the original repro and the web version. It fails after making a 50MB string, even though there should be much more memory available. Digging into this... |
@adamziel, I think what we are now seeing is expected. When I test in the browser, the PHP memory limit is 128M, and the "Allowed memory size exhausted" happens when the attempted allocation would surpass the allowed memory size. From the end of the test:
The limit is 128M, and the new allocation would add ~55MB to about 75MB making about 130MB, which exceeds the configured limit and triggers the error. Full log: https://gist.github.com/brandonpayton/d4239a0da6828647ed73770da95db043 Related to the memory limit, do we expect it to be 128M, and if so, would it make sense to choose a higher default? |
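The arithmetic above can be checked with a small model of a memory limit test. The C sketch below is a simplified illustration and not PHP's allocator code: `would_exceed_limit` and `MIB` are hypothetical names, and the figures are the rounded values quoted in this comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MIB ((size_t)1024 * 1024)

/* Returns true when granting `request` more bytes on top of `in_use` bytes
 * would push total usage past `limit`. */
static bool would_exceed_limit(size_t in_use, size_t request, size_t limit)
{
    assert(in_use <= limit);
    /* Compare against the remaining headroom so that in_use + request
     * cannot overflow. */
    return request > limit - in_use;
}
```

With about 75MB in use and a ~55MB request, `would_exceed_limit(75 * MIB, 55 * MIB, 128 * MIB)` is true because only 53MB of headroom remains, while the same request under a 256M limit passes.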
What, whaaaat, you're right! I increased the memory_limit and I was able to get that loop to allocate 200M – yay! Exemplary work on this one @brandonpayton! 🎉 Let's close and discuss the larger memory limit separately – we might want to also consider lowering the overall HEAP size, I think it's ~2GB now? |
## What is this PR doing?
Increases PHP memory limit to 256M in web browsers to unblock more memory-hungry use-cases.
Related:
* #1128
## Testing instructions
Go to http://127.0.0.1:5400/website-server/?url=/phpinfo.php and confirm the memory limit says `256 M`
## What does this PR do?
This PR restores displaying the PHP output when a PHP error is encountered. It does that by attaching `response` to the error thrown by BasePHP and augmenting the Comlink transfer handler to pass that response between workers through postMessage.
## What problem does it solve?
Prior to be0e783, script output was still shown for Fatal Errors. For example, when reproducing the memory-related errors under #1128, we used to see something like the following where page output was shown along with the Fatal Error that ended execution. But now that we are no longer returning a PHPResponse when there is a non-zero exit code, the content is not updated when running the same script. The error message is printed to the console, but the partial content is not visible to the user. Instead the previous content is left in place as if the script had not run at all.
Closes #1231
## Testing instructions
Ensure the E2E tests pass
@brandonpayton You are awesome!! I think this was one of the hardest problems to solve. ❤️ I tested it on Node.js and I can see each request always allocates the maximum memory.
Thank you!! |
Thank you, @sejas! |
The issue below was reported by @sejas. It negatively impacts the user experience of Playground and wp-now. Let's make sure:
- php.ini defaults to, say, 1GB memory.
- php_wasm.c doesn't contribute to the memory leak problem. PHP itself suffers from one, see pm.max_requests. Is php_wasm.c making it worse?
A few memory leaks were already patched in this repo, find old PRs for more context.
What @sejas reported:
I've created a PHP function to make it much easier to benchmark the memory usage:
I'm currently adding it to the index.php, but you can also try the plugin in Studio or wp-now.
Here are my results:
Observe that the site screenshot displays almost 60MB of maximum memory.
The next (2nd) page load displays 29MB
The third page load 4.4MB
and then 2.4 MB
and 1.3 MB
Screenshot:
cc @brandonpayton