Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Destroying Napi::ThreadSafeFunction instances causing segfault, but Linux only? #592

Closed
csheely opened this issue Nov 6, 2019 · 9 comments

Comments

@csheely
Copy link

csheely commented Nov 6, 2019

I have Node.js code that appears to work fine on Windows but that often crashes on Linux. If I'm doing something incorrectly, I can't see it, so I'm hoping someone can assist me.

  • Windows 10, Visual Studio 2017
  • Ubuntu 18.04, gcc 7.4.0
  • NAPI_VERSION = 4
  • Node version 10.17 and 12.13 (no difference in behavior)

I've distilled my module down and posted it here for reference: https://github.com/csheely/napi-threadsafe-function. See the bottom of the README for the commands to build the module and run the test app (pretty simple).

The module exposes a Foo class that is instantiated from JavaScript code. The Foo object's registerBar() method creates and returns a BarInterface object, and a Napi::Function argument is stored as a Napi::ThreadSafeFunction in a BarActivityReceiver helper class (will be called from other threads in the "real' app when activity occurs).

When I destroy the BarActivityReceiver object, I call Napi::ThreadSafeFunction::Release() on the thread safe function member, as the documentation states.

On Linux, if I create and then destroy one or two of these objects at a time, the code runs but eventually crashes after a few minutes following the destruction of the objects.

If I create and destroy 10 objects at a time (which is what my test app does), it appears to always crash the first time, after the 10th object is destroyed. But only on Linux.

The root folder of my repo contains valgrind output from a debug build that seems to show things go awry due to the Napi::ThreadSafeFunction destructor.

Thanks in advance for any assistance you can provide!

-Chris

@KevinEady
Copy link
Contributor

Hi @csheely ,

So we recently refactored the ThreadSafeFunction to remove the std::unique_ptr holding the underlying napi_threadsafe_function in #546 which has not been published yet (see #570 for the issue tracking the next release). Looking at your valgrind:

==3324==    at 0x4C3123B: operator delete(void*) (vg_replace_malloc.c:576)
==3324==    by 0x97BBB6C: std::default_delete<napi_threadsafe_function__*>::operator()(napi_threadsafe_function__**) const (unique_ptr.h:78)
==3324==    by 0x97BB7D6: std::unique_ptr<napi_threadsafe_function__*, std::default_delete<napi_threadsafe_function__*> >::~unique_ptr() (unique_ptr.h:268)
==3324==    by 0x97BB763: Napi::ThreadSafeFunction::~ThreadSafeFunction() (napi.h:1834)

This means you're probably using the published version 1.7.1.

If possible, can you try running your test with the current version of master? You should just be able to update your package.json to point to the git repo link. Does this solve the problem?

@gperinazzo
Copy link

I had the same problem (invalid write of 8 bytes at FinalizeFinalizeWrapperWithDataAndContext due to it being already deleted by the unique_ptr).

It works with the code currently on master.

@csheely
Copy link
Author

csheely commented Nov 6, 2019

Hi @KevinEady,

You are correct, I've been using version 1.7.1.

I was able to pull down the current version of master from the repo, and it does seem to solve the problem. At least it's running for a lot longer than it used to on Linux.

Thanks very much for the pointer, and we'll anxiously await the release of 2.0.0.

@KevinEady
Copy link
Contributor

Great news @csheely , I suggest we keep this issue open until release of 2.0.0 and you can verify that it works with the new published release as well.

@csheely
Copy link
Author

csheely commented Nov 7, 2019

Sounds good @KevinEady. Thanks!

@KevinEady
Copy link
Contributor

Hi @csheely ,

Looks like 2.0.0 was recently released. Can you try updating your package.json to this new major version and see if this resolves your problem now?

@csheely
Copy link
Author

csheely commented Dec 4, 2019

Hi @KevinEady, I'm away this week but will try when I get back Monday. Thanks for the heads-up!

@csheely
Copy link
Author

csheely commented Dec 11, 2019

@KevinEady sorry for the delay. I've tested 2.0.0 against my simplified example code referenced above, and it looks like the problem was addressed by the new release.

When testing with my actual code it looks like I'm still having some issues, but they don't look exactly the same as this one and may be the result of my own code.

I suggest we mark this issue as resolved and close it. I'll continue troubleshooting my other problems and write up new issues if I can trace them back to the new release.

@KevinEady
Copy link
Contributor

Thanks for the update! Closing issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants