-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Context lost after running in loop in an interval. GPU memory leak? #572
Comments
I believe you are running out of memory, and the context crashes. You need to cleanup your previous result by using |
Is this new? In gpu.js 2.0.4, I didn't needed to call |
This is new functionality, here is a version of the jsfiddle that cleans up textures: https://jsfiddle.net/robertleeplummerjr/Lhqrod8f/1/
Currently, no. In your exact use case there is no need to use pipeline, turning it off will revert to the previous behavior. Why was it changed? Textures that were outputs from a single kernel would all be overwritten, and passing a texture output to a kernel that it came from would need to be cloned. In short, it was for GPU.js to mimic javascript better (you now can just pass the same texture into the kernel from which it came), and to be more performant (textures are automatically recycled internally). |
@robertleeplummerjr What do you mean with "be more performant (textures are automatically recycled internally)"? When I do not use pipeline, calling a kernel is very slow, because the texture data is read from the gpu into js. |
Yes, you can use them, it is just when you no longer need a texture, go ahead and delete it.
Here is a use the use case where the input and output textures are recycled: const gpu = new GPU();
const onesKernel = gpu.createKernel(function() { return 1; }, { output: [100, 100], pipeline: true });
const addOneKernel = gpu.createKernel(function(input) {
return input[this.thread.y][this.thread.x] + 1;
}, { output: [100, 100], pipeline: true });
const ones = onesKernel();
const twos = addOneKernel(ones);
// NOTE: here is where things get tricky!
const threes = addOneKernel(twos); Why is this "tricky"? Because twos.delete(); This doesn't actually delete the internal GL texture, it removes a reference to it, and allows GPU.js to recycle textures. It won't delete the internal textures until there are no references to them. |
It looks like, that every time a kernel is called, a new texture is created, instead that textures are recycled. Is that right? |
@robertleeplummerjr Normally, I think, a new texture should only be created if a kernel is called the first time, or gets it output as input, otherwise you have a memory leak, if you not manually delete every texture. |
It will only copy a texture if it is sent in as an argument. However, if the texture has not been deleted, a new texture is created for the output. Thus the memory leak you potentially are experiencing.
We're open to ideas, proposals that could help the community are welcomed. |
@robertleeplummerjr I would be happy, if GPU.js would get a configuration setting, to configure the old behaviour, because it looks like the new behaviour, to create always a new texture and delete textures, costs more performance than the old one. |
Lets just come up with a setting name for this behavior, and I'll get it in. Previously it was called 'immutable', perhaps just resurrecting this property would be ideal. |
I'm not good with names, perhaps:
I think "jsLikeOutput" is not a good name, but I don't have a better name now. But I think it is best, to have both settings. What do you think @robertleeplummerjr ? |
Performance should only be affected if cloning is being overly done. The other thing you can do is use |
I'm experiencing the same performance drawback with the newer versions of gpu.js. Following example is blistering fast with version 2.0:
Using this code with version 2.7 will lead to unnecessary texture cloning. Adding t2.delete() and t1.delete() to the loop will avoid the cloning - but there's still a performance penalty of around 10% compared to version 2.0. Any idea why that is? In case the old flag 'immutable' would bring back the old behaviour I would like to vote for resurrecting it. :) |
Ok, I think we have enough of a need here to go ahead and bring back immutable. |
FYI, already have it working locally, just have to fix some broken tests. #NobodyLeftBehind! |
@CipSoft-Components, I've forked and updated your jsfiddle: https://jsfiddle.net/robertleeplummerjr/0pqcsvy8/ Ty for including a detailed example. |
Profiled as well in chrome, and I couldn't even see the GPU memory allocations. As well the memory initially used by the kernel shrank and stabilized over about a 5 minute run with intervals set to 60 milliseconds. I believe we are set here. Please let me know if this becomes an issue again. |
Wow, that was quick! Thank you for the awesome support!! I just tested the new version 2.8. It's working right away with my old code. So v2.8 with default options behaves exactly like v2.0. Great! However for some reason v2.0 is still around 10% faster than v2.8 for me. You can test it with my code example above (works with v2.0 and v.2.8 without modifications). Any idea why that is? |
Ok, I've done some more performance testing with the different versions. Version 2.3.1 is the last version without any performance penalty. Version 2.4 which introduced the new memory pipeline management shows the 10% performance drawback (this version also needs texture.delete() to be compatible). |
Thank you! Now it works like before, with the benefits of all the changes from the versions in between. |
I'll see if I can run a profile to identify the areas where this penalty is introduced today or tomorrow, but if you wanted to help as a contributor, we welcome it. My guess is here:
We simply need to add the check if |
@Steve73 since I knew it to do a but of work that is completely irrelevant to mutable kernels, I went ahead and added the boolean to the related if statements and released it as a quick fix/patch: https://github.com/gpujs/gpu.js/releases/tag/2.8.1 Can you test performance and see what it is like now for us now? TY again for your hard work! |
Just tested version 2.8.1. Unfortunately, the problem ist still there. Are there any other relevant changes in 2.4 that could be the reason? |
I'm profiling with chrome's performance monitor. I looked a bit more into it and found that in v2.3.1 _setupOutputTexture() doesn't get called. In v2.8.1 it is called and quite expensive. In v2.8.1 the code looks like this:
In v2.3.1 the same code looked like this:
What I see is, that in v2.8.1 the "if (this.immutable) {" is missing several times. |
Fantastic! I will implement by tomorrow morning if you don't test/beat me to it. |
Ok, though the lines you mention are not 1 to 1, I did find that I was still cloning textures and that has been resolved in https://www.npmjs.com/package/gpu.js/v/2.8.2 Added unit tests: gpu.js/test/internal/texture-recycling.js Line 419 in 359e471
|
Can you retry and let me know your results? |
That helps considerably. Worked on this last night and plan to today as well. |
I have found the culprit, and one I had not seen prior, I believe the net result will be faster than v2.3.1. I should have a fix out later today, or tomorrow. |
feat: introduce WebGL._replaceOutputTexture and WebGL._replaceSubOutputTextures to cut down on resource usage feat: All supportable Math.methods added fix: Safari not able to render texture arguments feat: CPU gets a pipeline that acts like GPU with/without immutable
Released v2.9.0: https://github.com/gpujs/gpu.js/releases/tag/2.9.0 |
Script used for measuring: const size = 4096;
const gpu = new GPU({ mode: 'gpu' });
const kernel = gpu.createKernel(function(v) {
return v[this.thread.x] + 1;
}, {
output: [size],
pipeline: true,
immutable: true,
});
console.time('run');
let lastResult = null;
let result = kernel(new Float32Array(size));
for (let i = 0; i < 10000; i++) {
result = kernel(lastResult = result);
if (lastResult.delete) {
lastResult.delete();
}
}
console.log(result.toArray ? result.toArray() : result);
console.timeEnd('run'); You have helped me sincerely make this project faster. I cannot BELIEVE the difference in performance. |
Nearly 18 times faster! |
Thank you & great work! v2.9.0 is now even faster than v2.3.1 in my test case. Congrats! Unfortunately my actual application with v2.9.0 is still somewhat slower than with 2.3.1. Also I discovered some unexpected behavior with the newer versions. I'll look into it... |
I've pinned down the unexpected behaviour to following bug.
v2.9.0 shows following output:
This is incorrect. All entries should be 1. The bug seems to occur when using a kernel with small output size and pipeline disabled, followed by a kernel with bigger output size and pipeline enabled. |
Reproduced, working to resolve. |
I've created #586 for tracking. |
Fyi fixed. |
Awesome! Now everything works in v2.9.1. Thank you for your perfect support! |
feat: introduce WebGL._replaceOutputTexture and WebGL._replaceSubOutputTextures to cut down on resource usage feat: All supportable Math.methods added fix: Safari not able to render texture arguments feat: CPU gets a pipeline that acts like GPU with/without immutable
What is wrong?
WebGL context lost. When running a kernel in an interval in a loop many times, after some time the WebGL context get lost, an nothing works any more.
If you open the task manager from windows, in the tab Performance in the GPU section, you can see, that the gpu memory rises, then stays very high for a while, and the drops when the context is lost.
Where does it happen?
In GPU.js in the browser on windows, when running in an interval and loop.
How do we replicate the issue?
Here is a jsfiddle, to replicate the issue: https://jsfiddle.net/zbo8q3wv/
I used a special argument type in this example, but the issue appears also without any arguments.
How important is this (1-5)?
5
Expected behavior (i.e. solution)
Running smoothly without gpu memory leak.
Other Comments
I tried to update my gpu.js from 2.0.4 to 2.6.5, because of a bug when loading an image into a kernel, and afterwards use other kernels with arguments of the type "Array1D(2)". In 2.0.4 If the arguments have more than 2 entries, the entries are strange ordered and so on. I tested this issue against gpu.js 2.6.5, and realised, that it is fixed, but found this issue, when I was really updating my project.
Here also a jsfiddle, with the same code as above, but with gpu.js 2.0.4: https://jsfiddle.net/w0Ljs735/
If you look here at the gpu memory, you will see, that it not rises at all.
The text was updated successfully, but these errors were encountered: