Some examples fail on macOS #8

Closed
beaugunderson opened this issue Jan 6, 2017 · 27 comments
Labels: c++ (C++ engineering related), python (Python engineering related), welcome contribution

Comments

@beaugunderson
Contributor

They all seem to fail with the same assertion:

$ python bubbles.py
asset_manager.h@(Ln 20): Assertion Failed. [Asset has been expired]

My taichi is compiled against Python 2.7.13, boost 1.6.13, embree 2.13.0, and tbb 4.4-20161128.

The examples that do work are:

  • geometry
  • scoping
  • trees

@yuanming-hu
Member

Thank you for pointing this out. It's an interesting issue, since bubbles.py runs well in my Windows, OS X, and Ubuntu environments. Could you please provide more clues about which line in the Python script fails? I guess it's around line 20 of bubbles.py, where the Perlin noise texture is used as an asset and registered with AssetManager.

yuanming-hu self-assigned this Jan 6, 2017

@beaugunderson
Contributor Author

beaugunderson commented Jan 6, 2017

Yes, you are right, it's the perlin texture. :)

I tried not defining the texture and not setting color_map=texture.id on the plane but got this error:

$ python bubbles.py
Assertion failed: (color_sampler != nullptr), function initialize, file /Users/beau/p/taichi/taichi/src/surface_material/surface_material.cpp, line 52.
Abort trap: 6

I then tried to set a color on the plane but got the same as above.

Removing the plane entirely from the scene does allow the bubbles to render, though.

@yuanming-hu
Member

How about now? I changed texture.id to texture in line 21. When Python garbage-collects an object, it is also deleted on the C++ side, which results in expired assets. It's strange that different versions of Python have different GC behaviour, though.

@beaugunderson
Contributor Author

beaugunderson commented Jan 6, 2017

Hmm, same issue when passing texture instead of texture.id (I tried using f0b5d67)

@beaugunderson
Contributor Author

Also noting that I was able to get material_balls.py to work by removing all textures, masks, and maps (except for the envmap, which I'm guessing isn't handled by the asset manager?)

@yuanming-hu
Member

It's strange that textures do not work. I'll probably deal with it after work today, thanks.

yuanming-hu added the python (Python engineering related) and c++ (C++ engineering related) labels Jan 6, 2017

@yuanming-hu
Member

Hi beaugunderson, I still cannot reproduce the error on my side. However, I added some code so that the C++ exception is passed to Python, which should give us more information about where the Python script fails.

If you don't mind, please run the new code again and paste the Python call stack from the crash here.

Thank you very much.

@beaugunderson
Contributor Author

Here's the result:

$ python bubbles.py
asset_manager.h@(Ln 20): Assertion Failed. [Asset has been expired]
Traceback (most recent call last):
  File "bubbles.py", line 20, in create_scene
    texture = (Texture('perlin') + 1).fract()
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/texture.py", line 38, in __add__
    return Texture("linear_op", alpha=1, tex1=self, beta=1, tex2=other)
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/texture.py", line 11, in __init__
    self.c.initialize(P(**kwargs))
RuntimeError: Assertion failed.
Traceback (most recent call last):
  File "bubbles.py", line 49, in <module>
    renderer.initialize(preset='pt', scene=create_scene())
  File "bubbles.py", line 41, in create_scene
    scene.add_mesh(mesh)
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/scene.py", line 31, in __exit__
    raise exc_val
RuntimeError: Assertion failed.

@beaugunderson
Contributor Author

Happy to do more debugging here, just let me know what you need :)

@beaugunderson
Contributor Author

I added a couple of print statements near texture.py line 11 to print name and kwargs before and after the call to asset_ptr_to_id:

$ python bubbles.py
perlin {}
perlin {}
const {'value': (1, 1, 1, 1)}
const {'value': (1, 1, 1, 1)}
linear_op {'alpha': 1, 'beta': 1, 'tex1': <taichi.visual.texture.Texture instance at 0x111c5cab8>, 'tex2': <taichi.visual.texture.Texture instance at 0x111c5cb00>}
linear_op {'alpha': 1, 'beta': 1, 'tex1': 0, 'tex2': 1}
asset_manager.h@(Ln 20): Assertion Failed. [Asset has been expired]

@beaugunderson
Contributor Author

It looks like part of this may be due to boost's handling of std::shared_ptr vs. boost::shared_ptr?

@beaugunderson
Contributor Author

std::shared_ptr support was added to boost 1.63 but only when compiled in C++11 mode, which is not the default for brew install boost; I will try compiling in C++11 mode now...

@beaugunderson
Contributor Author

beaugunderson commented Jan 6, 2017

Sadly brew reinstall boost --c++11 did not fix the issue.

@beaugunderson
Contributor Author

I also tried changing all instances of std::shared_ptr, std::weak_ptr, std::make_shared, and std::static_pointer_cast to their boost:: equivalents but that had no effect either.

@beaugunderson
Contributor Author

Because we know that the weak_ptr prematurely expiring is the issue, I was able to make some progress in running the examples by storing a shared_ptr in AssetManager instead... Wish I could be more helpful but I'm new to C++11 and boost.python :)
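
The difference between a registry of weak references and a registry of owning references can be sketched in Python with the standard weakref module. This is only an analogy and not taichi's code: Asset, WeakRegistry, StrongRegistry, and lookup_after_drop are hypothetical names, and the real AssetManager is C++ built on std::weak_ptr.

```python
import gc
import weakref


class Asset(object):
    """Hypothetical stand-in for a texture object."""

    def __init__(self, name):
        self.name = name


class WeakRegistry(object):
    """Stores weak references, loosely like a registry of std::weak_ptr."""

    def __init__(self):
        self._refs = []

    def insert(self, asset):
        self._refs.append(weakref.ref(asset))
        return len(self._refs) - 1

    def get(self, asset_id):
        asset = self._refs[asset_id]()  # None once the asset has been freed
        if asset is None:
            raise RuntimeError("Asset has been expired")
        return asset


class StrongRegistry(object):
    """Stores the assets themselves, loosely like std::shared_ptr."""

    def __init__(self):
        self._assets = []

    def insert(self, asset):
        self._assets.append(asset)
        return len(self._assets) - 1

    def get(self, asset_id):
        return self._assets[asset_id]


def lookup_after_drop(registry):
    """Register an asset, drop the caller's only reference, look it up."""
    asset = Asset("perlin")
    asset_id = registry.insert(asset)
    del asset     # the registry is now the only possible owner
    gc.collect()  # do not rely on reference counting alone
    try:
        return registry.get(asset_id).name
    except RuntimeError as exc:
        return str(exc)
```

With WeakRegistry the lookup fails as soon as the caller lets go of the object; with StrongRegistry the asset lives as long as the registry does, at the cost of never being freed while it stays registered.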

@yuanming-hu
Member

Thank you so much for trying these possibilities. Based on your observations, asset_ptr_to_id seems suspicious.

Could you please try replacing the __init__ function of Texture (starting from Ln. 7 of texture.py) with the following?

    def __init__(self, name, **kwargs):
        _kwargs_backup = kwargs.copy()
        if isinstance(name, str):
            self.c = tc_core.create_texture(name)
            kwargs = asset_manager.asset_ptr_to_id(kwargs)
            self.c.initialize(P(**kwargs))
        else:
            self.c = name
        self.id = tc_core.register_texture(self.c)
        print 'name:', name
        print 'original:', kwargs
        print 'backup:', _kwargs_backup
        print

My guess is that asset_ptr_to_id replaced the original Texture with its id. The Texture object was thereby disposed (along with Texture.c, which is a std::shared_ptr), leaving an expired std::weak_ptr. What I did here is hold an extra reference to these objects (Texture in Python) in _kwargs_backup.

Hope it helps.

@beaugunderson
Contributor Author

I tried a variation of that as well to no effect; here is the output from your method:

$ python bubbles.py
kwargs: {}
name: perlin
original: {}
backup: {}

kwargs: {'value': (1, 1, 1, 1)}
name: const
original: {'value': (1, 1, 1, 1)}
backup: {'value': (1, 1, 1, 1)}

kwargs: {'alpha': 1, 'beta': 1, 'tex1': <taichi.visual.texture.Texture instance at 0x14c8aecf8>, 'tex2': <taichi.visual.texture.Texture instance at 0x14c8aed40>}
asset_manager.h@(Ln 20): Assertion Failed. [Asset has been expired]
Traceback (most recent call last):
  File "bubbles.py", line 20, in create_scene
    texture = (Texture('perlin') + 1).fract()
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/texture.py", line 48, in __add__
    return Texture("linear_op", alpha=1, tex1=self, beta=1, tex2=other)
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/texture.py", line 15, in __init__
    self.c.initialize(P(**kwargs))
RuntimeError: Assertion failed.
Traceback (most recent call last):
  File "bubbles.py", line 49, in <module>
    renderer.initialize(preset='pt', scene=create_scene())
  File "bubbles.py", line 41, in create_scene
    scene.add_mesh(mesh)
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/scene.py", line 32, in __exit__
    raise exc_val
RuntimeError: Assertion failed.

(I also print 'kwargs:', kwargs at the beginning of the method)

@yuanming-hu
Member

It's indeed a strange problem... Maybe a good way to debug is to add some debug output to the destructor of taichi::Texture in C++, so that we know when it is destroyed (and the weak_ptr expires). I'll deal with it later. Sorry to get you so involved in this issue.

@iamyoukou

@beaugunderson Do those examples still fail for you? They work well on my Mac OS X 10.11.6.

@zq317157782

I got the same issue

@beaugunderson
Contributor Author

@iamyoukou yes, I still have the issue (I just updated to latest git and recompiled to test)

@iamyoukou

@beaugunderson It's weird. bubbles.py works fine for me, but tube_in_cube.py and vcm.py fail with an assertion.

aibm8:rendering YJ-work$ python2.7 tube_in_cube.py 
Assertion failed: (color_sampler != nullptr), function initialize, file /Users/YJ-work/Desktop/taichi_root/taichi/src/surface_material/surface_material.cpp, line 213.
Abort trap: 6
aibm8:rendering YJ-work$ python2.7 vcm.py 
Assertion failed: (color_sampler != nullptr), function initialize, file /Users/YJ-work/Desktop/taichi_root/taichi/src/surface_material/surface_material.cpp, line 213.
Abort trap: 6

@yuanming-hu
Member

Hi guys, sorry for the delayed response. I added some code to the ctor and dtor of taichi::Texture to explicitly print out this process, like

DEBUG: Texture 0x7fe49a6ce468 creating.
Configures: 
 * value = (0.280035,0.28,0.92)

so that we have a better idea of when these textures are created and disposed. On my Mac, when I run bubbles.py, I get all the creation messages but no disposal messages. I guess you may see some disposals, which would make the weak_ptr expire. Please run the latest bubbles.py and paste some output here. Thanks!

@iamyoukou I fixed tube_in_cube.py and vcm.py, which crashed for other reasons.
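
The tracing idea described above (log construction and destruction, then compare that with the moment a weak reference expires) can be sketched in Python. This is a hedged analogy rather than the C++ code that was added to taichi::Texture: TracedTexture, events, and trace_lifetime are hypothetical names, and weakref.finalize requires Python 3.4 or newer.

```python
import gc
import weakref

events = []  # chronological log of lifetime events


class TracedTexture(object):
    """Hypothetical object that logs its own creation and disposal."""

    def __init__(self, name):
        self.name = name
        events.append("creating " + name)
        # The callback runs once, when this object is garbage-collected.
        weakref.finalize(self, events.append, "disposed " + name)


def trace_lifetime():
    """Return whether a weak reference is expired before and after the drop."""
    tex = TracedTexture("perlin")
    ref = weakref.ref(tex)
    expired_before = ref() is None
    del tex       # drop the only strong reference
    gc.collect()
    expired_after = ref() is None
    return expired_before, expired_after
```

In this single-owner sketch the weak reference expires exactly when the disposal message is logged, so a log in which the reference is reported as expired before any disposal message would point away from ordinary object destruction.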

@beaugunderson
Contributor Author

@IteratorAdvance hmm the results are very odd!

it seems like the pointer is expired without the texture being disposed?

Creating texture: perlin
DEBUG: Texture 0x7fb4a7c274e8 creating.
Configures:
Creating texture: const
DEBUG: Texture 0x7fb4a7c26e38 creating.
Configures:
 * value = (1, 1, 1, 1)
Creating texture: linear_op
DEBUG: Texture 0x7fb4a7c273c8 creating.
Configures:
 * alpha = 1
 * beta = 1
 * tex1 = 0
 * tex2 = 1
asset_manager.h@(Ln 20): Assertion Failed. [Asset has been expired]
Traceback (most recent call last):
  File "bubbles.py", line 20, in create_scene
    texture = (Texture('perlin') + 1).fract()
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/texture.py", line 39, in __add__
    return Texture("linear_op", alpha=1, tex1=self, beta=1, tex2=other)
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/texture.py", line 12, in __init__
    self.c.initialize(P(**kwargs))
RuntimeError: Assertion failed.
Traceback (most recent call last):
  File "bubbles.py", line 49, in <module>
    renderer.initialize(preset='pt', scene=create_scene())
  File "bubbles.py", line 41, in create_scene
    scene.add_mesh(mesh)
  File "/Users/beau/p/taichi/taichi/python/taichi/visual/scene.py", line 31, in __exit__
    raise exc_val
RuntimeError: Assertion failed.
DEBUG: Texture 0x7fb4a7c273c8 disposed.
DEBUG: Texture 0x7fb4a7c274e8 disposed.
DEBUG: Texture 0x7fb4a7c26e38 disposed.

@iamyoukou

bubbles.py

YedeMacBook-Pro:rendering YJ-work$ python bubbles.py 
Creating texture: perlin
DEBUG: Texture 0x7fa2515733a8 creating.
Configures: 
Creating texture: const
DEBUG: Texture 0x7fa25157c4b8 creating.
Configures: 
 * value = (1, 1, 1, 1)
Creating texture: linear_op
DEBUG: Texture 0x7fa25157dfd8 creating.
Configures: 
 * alpha = 1
 * beta = 1
 * tex1 = 0
 * tex2 = 1
Creating texture: fract
DEBUG: Texture 0x7fa25157d238 creating.
Configures: 
 * tex = 2
DEBUG: Texture 0x7fa25157dbd8 creating.
Configures: 
 * value = (0.1,0.08,0.08)
DEBUG: Texture 0x7fa25157e4b8 creating.
Configures: 
 * value = (1,1,1)
DEBUG: Texture 0x7fa25157e708 creating.
Configures: 
 * value = (0.28,0.92,0.758026)
DEBUG: Texture 0x7fa25157f948 creating.
Configures: 
 * value = (0.28,0.92,0.758026)
DEBUG: Texture 0x7fa251613288 creating.
Configures: 
 * value = (0.92,0.28,0.590734)
DEBUG: Texture 0x7fa2516456c8 creating.
Configures: 
 * value = (0.92,0.28,0.590734)
DEBUG: Texture 0x7fa251740c98 creating.
Configures: 
 * value = (0.92,0.28,0.66191)
DEBUG: Texture 0x7fa2517496f8 creating.
Configures: 
 * value = (0.92,0.28,0.66191)
DEBUG: Texture 0x7fa251631518 creating.
Configures: 
 * value = (0.28,0.643029,0.92)
DEBUG: Texture 0x7fa251606be8 creating.
Configures: 
 * value = (0.28,0.643029,0.92)
DEBUG: Texture 0x7fa251486748 creating.
Configures: 
 * value = (0.28,0.92,0.846726)
DEBUG: Texture 0x7fa2514940b8 creating.
Configures: 
 * value = (0.28,0.92,0.846726)
DEBUG: Texture 0x7fa2514879b8 creating.
Configures: 
 * value = (0.689488,0.92,0.28)
DEBUG: Texture 0x7fa251486678 creating.
Configures: 
 * value = (0.689488,0.92,0.28)
DEBUG: Texture 0x7fa25148afd8 creating.
Configures: 
 * value = (0.77292,0.28,0.92)
DEBUG: Texture 0x7fa251489b38 creating.
Configures: 
 * value = (0.77292,0.28,0.92)
DEBUG: Texture 0x7fa25157fb98 creating.
Configures: 
 * value = (0.28,0.744459,0.92)
DEBUG: Texture 0x7fa25157fa18 creating.
Configures: 
 * value = (0.28,0.744459,0.92)
DEBUG: Texture 0x7fa25157e948 creating.
Configures: 
 * value = (0.313126,0.28,0.92)
DEBUG: Texture 0x7fa25157eef8 creating.
Configures: 
 * value = (0.313126,0.28,0.92)
DEBUG: Texture 0x7fa25157fb18 creating.
Configures: 
 * value = (0.92,0.28,0.328118)
DEBUG: Texture 0x7fa25157f058 creating.
Configures: 
 * value = (0.92,0.28,0.328118)
DEBUG: Texture 0x7fa251644f08 creating.
Configures: 
 * value = (0.475327,0.92,0.28)
DEBUG: Texture 0x7fa251645008 creating.
Configures: 
 * value = (0.475327,0.92,0.28)
DEBUG: Texture 0x7fa25157e7f8 creating.
Configures: 
 * value = (0.28,0.895481,0.92)
DEBUG: Texture 0x7fa25157e828 creating.
Configures: 
 * value = (0.28,0.895481,0.92)
DEBUG: Texture 0x7fa25174a718 creating.
Configures: 
 * value = (0.28,0.334592,0.92)
DEBUG: Texture 0x7fa25174a7e8 creating.
Configures: 
 * value = (0.28,0.334592,0.92)
DEBUG: Texture 0x7fa251633418 creating.
Configures: 
 * value = (0.28,0.92,0.684025)
DEBUG: Texture 0x7fa251633518 creating.
Configures: 
 * value = (0.28,0.92,0.684025)
DEBUG: Texture 0x7fa251498ed8 creating.
Configures: 
 * value = (0.28,0.867459,0.92)
DEBUG: Texture 0x7fa251492698 creating.
Configures: 
 * value = (0.28,0.867459,0.92)
DEBUG: Texture 0x7fa25174a848 creating.
Configures: 
 * value = (0.73369,0.92,0.28)
DEBUG: Texture 0x7fa251749a28 creating.
Configures: 
 * value = (0.73369,0.92,0.28)
DEBUG: Texture 0x7fa251637a68 creating.
Configures: 
 * value = (0.92,0.546126,0.28)
DEBUG: Texture 0x7fa251637b68 creating.
Configures: 
 * value = (0.92,0.546126,0.28)
DEBUG: Texture 0x7fa251740d78 creating.
Configures: 
 * value = (0.92,0.67768,0.28)
DEBUG: Texture 0x7fa251749a58 creating.
Configures: 
 * value = (0.92,0.67768,0.28)
DEBUG: Texture 0x7fa25164b118 creating.
Configures: 
 * value = (0.307248,0.28,0.92)
DEBUG: Texture 0x7fa25164b1e8 creating.
Configures: 
 * value = (0.307248,0.28,0.92)
DEBUG: Texture 0x7fa25157f498 creating.
Configures: 
 * value = (0.92,0.28,0.753858)
DEBUG: Texture 0x7fa25157f4c8 creating.
Configures: 
 * value = (0.92,0.28,0.753858)
DEBUG: Texture 0x7fa2514253d8 creating.
Configures: 
 * value = (0.92,0.28,0.65878)
DEBUG: Texture 0x7fa2514866a8 creating.
Configures: 
 * value = (0.92,0.28,0.65878)
DEBUG: Texture 0x7fa25164b248 creating.
Configures: 
 * value = (0.28,0.861503,0.92)
DEBUG: Texture 0x7fa2516389a8 creating.
Configures: 
 * value = (0.28,0.861503,0.92)
DEBUG: Texture 0x7fa251498f08 creating.
Configures: 
 * value = (0.92,0.28,0.558986)
DEBUG: Texture 0x7fa251498f38 creating.
Configures: 
 * value = (0.92,0.28,0.558986)
DEBUG: Texture 0x7fa251645a08 creating.
Configures: 
 * value = (0.28,0.495654,0.92)
DEBUG: Texture 0x7fa251637568 creating.
Configures: 
 * value = (0.28,0.495654,0.92)
DEBUG: Texture 0x7fa25157f598 creating.
Configures: 
 * value = (0.30667,0.28,0.92)
DEBUG: Texture 0x7fa25157e858 creating.
Configures: 
 * value = (0.30667,0.28,0.92)
DEBUG: Texture 0x7fa25157fc48 creating.
Configures: 
 * value = (0.656047,0.92,0.28)
DEBUG: Texture 0x7fa25157fc78 creating.
Configures: 
 * value = (0.656047,0.92,0.28)
DEBUG: Texture 0x7fa25157ff58 creating.
Configures: 
 * value = (0.28,0.92,0.32669)
DEBUG: Texture 0x7fa25157ff88 creating.
Configures: 
 * value = (0.28,0.92,0.32669)
DEBUG: Texture 0x7fa251499508 creating.
Configures: 
 * value = (0.68898,0.28,0.92)
DEBUG: Texture 0x7fa251499538 creating.
Configures: 
 * value = (0.68898,0.28,0.92)
DEBUG: Texture 0x7fa2516375c8 creating.
Configures: 
 * value = (0.28,0.92,0.355142)
DEBUG: Texture 0x7fa2516375f8 creating.
Configures: 
 * value = (0.28,0.92,0.355142)
DEBUG: Texture 0x7fa2515803d8 creating.
Configures: 
 * value = (0.28,0.515833,0.92)
DEBUG: Texture 0x7fa251580098 creating.
Configures: 
 * value = (0.28,0.515833,0.92)
Scene loaded. Triangle count: 600006
stage 1
time: 1.02565097809

works fine.

@IteratorAdvance Thanks. vcm.py and tube_in_cube.py now work fine.

@b-z

b-z commented Jan 23, 2017

My errors were the same as @beaugunderson's (above ⤴️)

And geometry/trees/scoping work fine

@yuanming-hu
Member

It's indeed a weird bug, and I managed to reproduce it on my system by using the same versions of Python and boost as @beaugunderson.

I'm sorry that I was busy with other things recently, but the good news is that I fixed this bug.

In Ln. 38-41 in include/taichi/common/asset_manager.h, I changed

    template <typename T>
    static int insert_asset(const std::shared_ptr<T> &ptr) {
        return get_instance().insert_asset_<T>(ptr);
    }

to

    template <typename T>
    static int insert_asset(std::shared_ptr<T> &ptr) {
        return get_instance().insert_asset_<T>(ptr);
    }

(I removed the const modifier before the parameter.)

and it works.

It's probably because boost or python changed something that affects the ref_count mechanism of std::shared_ptr.

If you guys know what's going on here, please let me know.

It works on my Mac, and please let me know if it works for you @beaugunderson @zq317157782 @b-z . If not, please feel free to reopen this issue.

Sorry for the delay in fixing this, and thank you all for your feedback and your debugging effort.

feisuzhu added a commit that referenced this issue Jan 12, 2023
Related Issue: #7140 

### Brief Summary

On macOS, when a test worker fails hard (abort, EXC_BAD_ACCESS, etc.),
backward_cpp's signal handler re-raises the signal and then catches it
again, creating an infinite loop. At that point the offending process
cannot be terminated easily (except with SIGKILL); it eats CPU cycles
and blocks the test runner.

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00000001a04f0e28 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #2: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #3: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #4: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #5: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #6: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #7: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #8: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #9: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #10: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #11: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #12: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #13: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #14: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #15: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #16: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #17: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #18: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #19: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #20: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #21: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #22: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #23: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #24: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #25: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #26: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #27: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #28: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #29: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #30: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #31: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #32: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #33: 0x00000001a046b454 libsystem_c.dylib`abort + 124
    frame #34: 0x0000000100194fc0 python`os_abort + 12
    frame #35: 0x00000001000758a8 python`cfunction_vectorcall_NOARGS + 324
    frame #36: 0x00000001001140f0 python`call_function + 460
    frame #37: 0x000000010011086c python`_PyEval_EvalFrameDefault + 27176
    frame #38: 0x00000001000287e4 python`function_code_fastcall + 128
    frame #39: 0x0000000100028008 python`PyVectorcall_Call + 120
    frame #40: 0x0000000100110b20 python`_PyEval_EvalFrameDefault + 27868
    frame #41: 0x000000010010982c python`_PyEval_EvalCodeWithName + 3008
    frame #42: 0x0000000100028948 python`_PyFunction_Vectorcall + 208
    frame #43: 0x0000000100028008 python`PyVectorcall_Call + 120
```
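
The re-raise loop visible in the backtrace above can be reproduced in miniature in Python. This is a sketch under stated assumptions, not backward_cpp's code and not necessarily what the referenced commit changes: SIGUSR1 stands in for SIGABRT so the demo does not kill the process, all handler names are hypothetical, MAX_REENTRIES is an artificial cap so the looping variant terminates, and signal.raise_signal requires Python 3.8 or newer on a POSIX system.

```python
import signal

SIG = signal.SIGUSR1  # harmless stand-in for SIGABRT (POSIX only)
MAX_REENTRIES = 3     # artificial cap so the looping demo terminates

log = []


def final_handler(signum, frame):
    """Stand-in for the default action that should end the process."""
    log.append("final")


def looping_handler(signum, frame):
    """Re-raises while still installed, so it catches its own signal."""
    log.append("caught")
    if len(log) < MAX_REENTRIES:
        signal.raise_signal(signum)


def resetting_handler(signum, frame):
    """Hands the signal over to another handler before re-raising."""
    log.append("caught")
    signal.signal(signum, final_handler)
    signal.raise_signal(signum)


def run(handler):
    """Install the handler, raise the signal once, and return the log."""
    del log[:]
    signal.signal(SIG, handler)
    signal.raise_signal(SIG)
    return list(log)
```

Without the cap, looping_handler would keep receiving its own signal, which matches the repeating raise / sig_handler / _sigtramp frames in the backtrace. One common way to break such a loop is to change the signal's disposition before re-raising, as resetting_handler does.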
lin-hitonami pushed a commit that referenced this issue Jan 12, 2023
Related Issue: #7140 

### Brief Summary

On macOS, when a test worker fails hard (abort, EXC_BAD_ACCESS, etc.),
backward_cpp's signal handler re-raises the signal and then catches it
again, creating an infinite loop. At that point the offending process
cannot be terminated easily (except with SIGKILL); it eats CPU cycles
and blocks the test runner.

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00000001a04f0e28 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #2: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #3: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #4: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #5: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #6: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #7: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #8: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #9: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #10: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #11: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #12: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #13: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #14: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #15: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #16: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #17: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #18: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #19: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #20: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #21: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #22: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #23: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #24: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #25: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #26: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #27: 0x00000001283a0848 taichi_python.cpython-38-darwin.so`backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 28
    frame #28: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #29: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #30: 0x00000001a0402e10 libsystem_c.dylib`raise + 32
    frame #31: 0x00000001a056ec44 libsystem_platform.dylib`_sigtramp + 56
    frame #32: 0x00000001a052343c libsystem_pthread.dylib`pthread_kill + 292
    frame #33: 0x00000001a046b454 libsystem_c.dylib`abort + 124
    frame #34: 0x0000000100194fc0 python`os_abort + 12
    frame #35: 0x00000001000758a8 python`cfunction_vectorcall_NOARGS + 324
    frame #36: 0x00000001001140f0 python`call_function + 460
    frame #37: 0x000000010011086c python`_PyEval_EvalFrameDefault + 27176
    frame #38: 0x00000001000287e4 python`function_code_fastcall + 128
    frame #39: 0x0000000100028008 python`PyVectorcall_Call + 120
    frame #40: 0x0000000100110b20 python`_PyEval_EvalFrameDefault + 27868
    frame #41: 0x000000010010982c python`_PyEval_EvalCodeWithName + 3008
    frame #42: 0x0000000100028948 python`_PyFunction_Vectorcall + 208
    frame #43: 0x0000000100028008 python`PyVectorcall_Call + 120
```
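The loop in the backtrace above comes from a handler that re-raises a signal while it is still installed for that signal. One common way to avoid this is to restore the default disposition before re-raising. The sketch below only illustrates that general pattern and is not necessarily what the referenced commit does. It assumes a POSIX system, uses SIGTERM as a stand-in for the fatal signal, and the names `WORKER_SOURCE`, `handler`, and `run_worker` are hypothetical.

```python
import signal
import subprocess
import sys
import textwrap

# Source of a hypothetical worker process. SIGTERM stands in for the fatal
# signal (SIGABRT in the backtrace) to avoid core dumps and crash reports.
WORKER_SOURCE = textwrap.dedent(
    """
    import os
    import signal
    import sys
    import time

    def handler(signum, frame):
        # Report the failure (a real handler would print a stack trace here).
        sys.stderr.write("caught signal %d\\n" % signum)
        sys.stderr.flush()
        # Restore the default action first, so the re-raised signal
        # terminates the process instead of re-entering this handler.
        signal.signal(signum, signal.SIG_DFL)
        os.kill(os.getpid(), signum)

    signal.signal(signal.SIGTERM, handler)
    os.kill(os.getpid(), signal.SIGTERM)
    time.sleep(5)  # only reached if the signal was not delivered
    sys.exit(1)
    """
)


def run_worker():
    """Run the worker; return (returncode, stderr text)."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        stderr=subprocess.PIPE,
        timeout=30,
    )
    return result.returncode, result.stderr.decode()


if __name__ == "__main__":
    returncode, stderr = run_worker()
    print(returncode, stderr.strip())
```

On POSIX, `subprocess` reports a child killed by a signal as a negative return code, so a test runner can see that the worker died from the signal rather than waiting on a spinning process.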
quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023
Related Issue: taichi-dev#7140 
