Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to NAN 2 (NodeJS 4) #182

Merged
merged 1 commit into from
Oct 4, 2015
Merged

Upgrade to NAN 2 (NodeJS 4) #182

merged 1 commit into from
Oct 4, 2015

Conversation

kant2002
Copy link
Contributor

All trivial migration to Node.JS finished, but there some memory leaks which cause tests to fail.

Anybody interested could continue based on this work.
This is related to #169 and #179

Windows not fully working (still WIP)
Linux & MacOS fine.

@kant2002 kant2002 closed this Sep 16, 2015
@kant2002 kant2002 reopened this Sep 16, 2015
@kant2002 kant2002 changed the title WIP for Node 4.0 port WIP for Node 4.0 port (all tests passed) Sep 21, 2015
@cadecairos
Copy link

Great work on this @kant2002 👍

@EyalAr hopefully you can take a look at this pretty soon :)

@kant2002
Copy link
Contributor Author

Not only me, @weareu help a lot

@antoinepairet
Copy link

@unbornchikken You mentioned in #169 you could help porting lwip to NAN 2
Could you take a look at the work performed by @kant2002 and @weareu
Thanks for the hard work!

@unbornchikken
Copy link

I did my port #180 but unfortunately that not builds because of some png lib shit, that I doesn't have the time to fix. Master was a little mess I believe when I forked.

@antoinepairet
Copy link

@unbornchikken Thanks for your quick reply.
@kant2002 What can I do to help making progress on this?

@EyalAr
Copy link
Owner

EyalAr commented Sep 27, 2015

Thanks everybody for pitching in!
@unbornchikken Recently I was in the process of upgrading libpng to a later version, but I haven't finished fixing all the issues there.
It might be worth reverting those changes in order to isolate the work needed to this PR.

@unbornchikken
Copy link

@EyalAr Np, I really didn't have the time to fix that, I'm pretty busy with my Workflow 4 Node module that we're starting to use it in production, that's why I asked for help. Since @kant2002 and @weareu did his own port with passed tests, there is nothing to get done. Or, there is? @antoinepairet I really don't get what's the issue there. Could anyone enlighten me on this please?

@kant2002
Copy link
Contributor Author

Test suite on Windows is falling midway. So there additional memory leak probably, or this is libpng porting issue. I don't sure. I believe PR that this could be merged, this solve pain of 2/3 users, and then Windows testing continued.

@weareu
Copy link
Contributor

weareu commented Sep 27, 2015

I did successfully upgrade libpng to latest with passed tests. But reverted as I didn't think it relevant. Give me a day or so to put back the upgrade and find a windows machine to test on.

@weareu
Copy link
Contributor

weareu commented Sep 27, 2015

Getting confused. Libpng is latest. ibpng-1.6.18
CImg is not. This I'll upgrade and still find windows to test on.

@antoinepairet
Copy link

@weareu I have a virtual machine with Windows 7 64 bits. If you point me to the repo/commit you would like to test, I can build and run mocha tests.
Thanks to everyone involved.

@EyalAr
Copy link
Owner

EyalAr commented Sep 28, 2015

There are AppVeyor tests for Windows machines, which seem to pass now. They fail on master (in which I started the libpng upgrade attempt).
It'd be good if this PR also updates Travis and AppVeyor configuration to test on node4 (and 0.12 while we're at it).
Once we have node4 passing on the CIs and I do a code review I think we'll be ready to merge and release a new version (which I know @joannajw has been waiting for. Sorry for the delay).

Speaking of releasing a new version, I really appreciate the job you guys have done here, please let me know if you'd be interested in further helping to maintain this project.

@kant2002
Copy link
Contributor Author

More information about Windows build.

  • I'm on Windows 10, Node 4.1.1 x64, Node-gyp 3.0.3
  • Tests are flacky now. Sometimes they are pass, sometimes they are not. And looks like this is depends on the compiler

When build with VS2015, following tests periodically fails:

  1. lwip.open arguments validation -> pixelbuffer-> with correct width and height for 1 channel
  2. lwip.open arguments validation -> pixelbuffer-> with correct width and height for 2 channel
  3. lwip.open arguments validation -> pixelbuffer-> with correct width and height for 3 channel
  4. lwip.open -> raw pixel buffer -> grayscale image
  5. lwip.open -> raw pixel buffer -> grayscale image with alpha
  6. lwip.open -> raw pixel buffer -> rgb image

When rebuild with VS2013 it also half of the time fails.
Seems to be that VS2013 version fails more often then VS2015. At least on VS2015 build I usually stuck on lwip.open -> raw pixel buffer tests, but on VS2013 I happy when I manage to pass lwip.open arguments validation -> pixelbuffer, and VS2013 rarely finish, but VS2015 finishes half of the time for me.

@kant2002
Copy link
Contributor Author

@EyalAr I also want to add support for VS2015, for Node > 4.0 and test both VS2013 and VS2015 in Travis/AppVeyor. Do you think this is reasonable?

Also could someone give me idea why Travis is failing now?

@EyalAr
Copy link
Owner

EyalAr commented Sep 28, 2015

@kant2002
Sure, let's add VS2015 to the test suite, as many Windows users do use VS2015.

Windows support has always been problematic with native node modules.
Eventually I want to avoid local compilation completely by compiling once on the CI and having end users download the compiled binaries automatically while doing npm install lwip. See issue #83.

Regarding Travis, it seems to be related to travis-ci/travis-ci#4771 and g++ version.

@weareu
Copy link
Contributor

weareu commented Sep 28, 2015

Ok tested on windows, seems like heap is being corrupted by both Cimg and Giflib, having a hard time tracking down the actual culprit as I think the triggers thus far is symptomatic. Upgraded CImg, but that didn't make a difference. Seems like the buffer is being re-used in some instances where it was wrapped. Strapped on some windows memory checking/debugging tools (deleaker, Dr Memory, App Verifier, etc), but they are giving me so many issues with either CImg or Giflib. Pnglib seems fine. But again, hard to tell.

What I haven't checked is if the current version is passing tests on windows on Node 0.10/11/12, can anyone confirm?

@unbornchikken
Copy link

@weareu where does those buffers originated from? They wraps some native memory, or they are created with the Buffer constructor, and used its memory at native side? Because if the latter, then the issue is that new Buffer implementation is backed be v8's ArrayBuffer, and its memory is backed by the GC, hence it's moveable.

@antoinepairet
Copy link

I did some testing on Windows 7 64bits, node 4.1.1 32 bits. I run the tests 20 times (all output in a gist):

  • 6 successful full runs
  • 14 crashes (no apparent pattern, different tests)

If it is an issue with the new implementation of Buffer in node v4, we should observe the same random behaviour on OSX, no? But I could not make the tests crash on OSX. It thus seems to be o windows-only issue.

Let me know how I can help.

regards

@kant2002
Copy link
Contributor Author

Amount of crashes could be reduced by commenting out line @ca3a44acda503ff9ec3fe87ef8b18a406363deb1, that resolve lwip.open -> raw pixel buffer issues. Actually I believe that this is just replace crashes with memory leaks.

@unbornchikken The buffers originates from the NodeJS side and their content used by the native code.

@kant2002
Copy link
Contributor Author

I was try to copy memory to the array and uncomment delete _cimg which cause the crashes, but no luck. I prefer having special issue for that - #185 which should be fixed. Since that code was already present, than we could made new release with support for Node4 and then concentrate on fixing the issue. Or alternatively if this is Windows only issue then we could put this under #define, so on Mac and Linux would be no memory leaks.

@antoinepairet
Copy link

@kant2002 thanks for digging in further. Unfortunately, removing delete _cimg is not fixing random crashes on Windows. I still observe random crashes when running npm test. Something new that I did not observe before is: npm test to complete (no crash) but test failures (stress tests).
I created a Gist with the details of another batch of 20 test runs.

rgds

@e00dan
Copy link

e00dan commented Sep 29, 2015

Fingers crossed for this PR and I'm trying to already switch to your branch.

@kant2002
Copy link
Contributor Author

@antoinepairet Thanks for sharing your crashing tests, I also find that tests are failing after the commit. Under pressure with other projects, so sorry for noise. My next step would be add two tests which are falling to the stress tests, and try execute only them, hopefully this allow me speedup testing.

Since when I copy content of the buffer to the heap and then pass copied data, tests fails at the same rate, so I suspect that error could be in the another place, and not in the LwipImage::New method.

@weareu
Copy link
Contributor

weareu commented Sep 29, 2015

I've found the issue. Major change though. Will commit and update my PR. Then we can discuss.

@kant2002
Copy link
Contributor Author

Cool @weareu. I just find out that when testing on raw pixel buffer 1200x1200 instead of 120x120 issue appear instantly, not sure if this is helps much now.

@EyalAr EyalAr changed the title Upgrade to NAN 2 Upgrade to NAN 2 (NodeJS 4) Oct 2, 2015
@weareu
Copy link
Contributor

weareu commented Oct 2, 2015

@EyalAr your repo, not a problem for me. I would've supported implementing correct channels for smaller buffers and much faster processing in those instances as well but maybe in future as I can imagine pulling channels through to decoders/encoders is some more effort. Also, larger than 4 channels will still fail if kept this way, maybe provide an error when > 4? Some of the changes from @kant2002 wrt the PNG_SIMPLIFIED_WRITE_SUPPORTED and included files for decoder/encoder I cannot answer to, I thought it was merged from master. So hopefully he can help with why on those.

@EyalAr
Copy link
Owner

EyalAr commented Oct 2, 2015

@weareu I agree. Converting all buffers to 4 channels may not be the most elegant and efficient solution, but it works for now.
It'd be great if we can think of other alternatives.
But in order to get this PR merged soon, we should probably do it separately.

@weareu
Copy link
Contributor

weareu commented Oct 2, 2015

@EyalAr ah I see the error on larger than 3 channel buffers in makeRgbaBuffer 4 will fall through so 5+ will give an error then. thats fine.

@weareu
Copy link
Contributor

weareu commented Oct 2, 2015

Cool in that case I'll revert and update my pr to Kant. He can check the minor issues on his side. If we are all merged up I'll add my Channel changes back and pull it through to all encoders/decoders so that channels is what the image says and create another PR. Will help to clean things up as well on the commit side.

@kant2002
Copy link
Contributor Author

kant2002 commented Oct 2, 2015

I rebase over master and @EyalAr and @weareu what do you thin. Should I try run tests without b89363b (Fix CImg memcpy overflow caused by N_CHANNELS 4) ?

@weareu
Copy link
Contributor

weareu commented Oct 2, 2015

@kant2002 yep try it. Remember to make sure delete stays in the destructor.

@EyalAr
Copy link
Owner

EyalAr commented Oct 2, 2015

Notice that AppVeyor doesn't run the tests. It only builds the library.
Once the N_CHANNELS changes are reverted, tests on Travis should pass.

This should be the last one.
@EyalAr
Copy link
Owner

EyalAr commented Oct 3, 2015

Great, we have Travis passing now.
I also tested on a Windows7 VM with VS2013 and Node 4.0.0 and 4.1.0, and everything passed.

@antoinepairet Would you please test on your Windows machine as well?

@unbornchikken Would you please do a quick review of the NAN2 related code changes?

@kant2002 There are two lines commented out in pnglibconf.h (one in decoder and one in encoder), would you please explain why? (I commented on those lines above)

Thanks everybody.

@@ -16,11 +16,12 @@ void RotateWorker::Execute () {
const float nangle = cimg::mod(_degs, 360.0f);
if (cimg::mod(nangle, 90.0f) != 0) {
CImg<unsigned char> * res;
int channels = _cimg->spectrum();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weareu revert here as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the unclean commit here, really apologise, not sure how that happened. I can revert, commit, push, merge, pull, then kant can merge and update or, this code is probably better, has no impact on hardcoding or not and definitely needs to come back later. What do you think?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries.
I like the code to be consistent, so if we leave it like this, we have to make similar changes in other places in the code.
I agree that this code gives us more flexibility regarding the number of channels, but we need to design those changes properly, as they affect other parts of the library.
Let's revert for now, and later decide how we properly bring this back across the library.
Thanks :)

@unbornchikken
Copy link

Hi, thank you for trusting me on this. I'm gonna take a look later today.

@kant2002
Copy link
Contributor Author

kant2002 commented Oct 3, 2015

@EyalAr I have issues building encoder and decoder with these defines without pngwrite.c and pngread.c

It produce errors like that:

*lwip_decoder*
...
png.obj : error LNK2001: unresolved external symbol png_destroy_read_struct

*lwip_encoder*
...
png.obj : error LNK2001: unresolved external symbol png_destroy_write_struct

@EyalAr
Copy link
Owner

EyalAr commented Oct 3, 2015

@kant2002 You're right. Same issue for me only on Windows. OK, let's leave it like that.

@unbornchikken
Copy link

Looks good.

@antoinepairet
Copy link

@EyalAr
Here my test results for kant2002@f37cc0e , Virtual Machine running under Parallels (Version 10.3.0 (29227)). Win 7 64 bits. Visual Studio 2015 Community ed.

I have made 12 runs. 3 failed (2nd, 6th and 12th)

Results of failing runs:

  444 passing (2m)
  1 failing

  1) image.batch toBuffer png interlaced fast compression should succeed:
     Error: timeout of 10000ms exceeded. Ensure the done() callback is being called in this test.
      at null.<anonymous> (C:\Users\apairet\sources_windows\lwip\node_modules\mocha\lib\runnable.js:189:19)
  444 passing (2m)
  1 failing

  1) stress tests 25 random manipulations on one image (1) should succeed:
     Error: Unable to resize image
      at Error (native)
 444 passing (2m)
 1 failing

 1) stress tests 7 random manipulations for 50 images (in parallel) should succeed:
    Error: Unable to resize image
     at Error (native)

I am currently downloading Visual Studio 2013. Then I will rerun a batch of tests after having installed with the following option:

npm install --msvs_version=2013

I should have the results for VS2013 by tomorrow.

Thanks to all for the hard work. Regards.

@antoinepairet
Copy link

I just made 30 tests run after having installed with --msvs_version=2013, 1 failed (10th):

  444 passing (3m)
  1 failing

  1) stress tests 7 random manipulations for 50 images (in parallel) should succeed:
     Error: Unable to resize image
      at Error (native)

regards

@EyalAr
Copy link
Owner

EyalAr commented Oct 4, 2015

Thanks @unbornchikken.

Thanks @antoinepairet. I'm not concerned about tests failing because of timeout (I increased the timeout in 8f3df6a to avoid this in the future). The native error (Unable to resize image) is a bit more concerning. I assume it's caused by insufficient memory on the VM, but I can't reproduce it. Anyway, should look out for it in case it comes back.

@kant2002, @weareu I merged all your commits into the version/0.0.8 branch, and will merge everything into master and release a new version after all tests pass.

Thanks everybody!

@EyalAr EyalAr merged commit f37cc0e into EyalAr:master Oct 4, 2015
@bexoss
Copy link

bexoss commented Apr 1, 2016

So.... what is conclusion? I have same problem in node 4.2 , Ubuntu 14. I got an same issue in latest version. Is it solved or in progress?

@Undefitied
Copy link

bexoss +

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants