Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hard to trace errors if raise multiple errors on MT #8299

Closed
firejox opened this issue Oct 9, 2019 · 5 comments · Fixed by #14220
Closed

Hard to trace errors if raise multiple errors on MT #8299

firejox opened this issue Oct 9, 2019 · 5 comments · Fixed by #14220

Comments

@firejox
Copy link
Contributor

firejox commented Oct 9, 2019

The code

4.times do                                                                                                                                                                    
  spawn do                                                                                                                                                                    
    raise "Error here"                                                                                                                                                        
  end                                                                                                                                                                         
end                                                                                                                                                                           
                                                                                                                                                                              
sleep 3 

The output

Unhandled exception in spawn: Error hereUnhandled exception in spawn: Error here (Exception)                                                                                  
 (Exception)                                                                                                                                                                  
Unhandled exception in spawn: Error here (Exception)                                                                                                                          
Unhandled exception in spawn: Error here (Exception)                                                                                                                          
  from /home/developer/workspace/src/fiber.cr:255:3 in 'run'                                                                                                                  
  from /home/developer/workspace/src/fiber.cr:255:3 in 'run'                                                                                                                  
  from /home/developer/workspace/src/fiber.cr:48:34 in '->'                                                                                                                   
  from ???                                                                                                                                                                    
  from /home/developer/workspace/src/fiber.cr:48:34 in '->'                                                                                                                   
  from ???                                                                                                                                                                    
  from /home/developer/workspace/src/fiber.cr:255:3 in 'run'                                                                                                                  
  from /home/developer/workspace/src/fiber.cr:48:34 in '->'                                                                                                                   
  from ???                                                                                                                                                                    
  from /home/developer/workspace/src/fiber.cr:255:3 in 'run'                                                                                                                  
  from /home/developer/workspace/src/fiber.cr:48:34 in '->'                                                                                                                   
  from ???

Crystal version

Crystal 0.32.0-dev [0505d1c2e] (2019-10-09)                                                                                                                                   
                                                                                                                                                                              
LLVM: 8.0.1                                                                                                                                                                   
Default target: x86_64-pc-linux-gnu
@firejox firejox changed the title Hard to trace error messages if raise multiple errors on MT Hard to trace errors if raise multiple errors on MT Oct 9, 2019
@jhass
Copy link
Member

jhass commented Oct 9, 2019

I guess we should just have a global lock around the exception printer to stdout. It's a rare thing to happen anyways so shouldn't incur any performance penalty, I hope

@ysbaddaden
Copy link
Contributor

Unwinding the stack, extracting the file:line info from DWARF tables are already a high cost. Adding a mutex to print errors shouldn't add that much. Especially since in production apps it should be redirected to a logger anyway.

@firejox
Copy link
Contributor Author

firejox commented Oct 9, 2019

I think my code is not good enough to indicate the problem. The mess of error log is bad, not the worst. The worst is that errors can generate error. I have no idea to design a minimal straightforward code for the worst case, but we can take the issue #8288 as an example. Sometimes I got the error when I ran this code.

[0x55863b6e1bb6] *CallStack::print_backtrace:Int32 +118                                                                                                                       
[0x55863b6d3019] __crystal_sigfault_handler +217                                                                                                                              
[0x55863b73aec0] sigfault_handler +40                                                                                                                                         
[0x7fc08a7444d0] ???                                                                                                                                                          
[0x7fc08a771587] GC_malloc_kind_global +87                                                                                                                                    
[0x55863b6e9f36] *GC::malloc<UInt64>:Pointer(Void) +6                                                                                                                         
[0x55863b6c4dde] __crystal_malloc64 +14                                                                                                                                       
[0x55863b719c88] *Pointer(Tuple(String, Int32, Int32, Int32)) +72                                                                                                             
[0x55863b7197e7] *Array(Tuple(String, Int32, Int32, Int32)) +183                                                                                                              
[0x55863b719712] *Array(Tuple(String, Int32, Int32, Int32)) +82                                                                                                               
[0x55863b7191e0] *Debug::DWARF::LineNumbers::Sequence#initialize:Array(UInt8) +208                                                                                            
[0x55863b718f80] *Debug::DWARF::LineNumbers::Sequence::new:Debug::DWARF::LineNumbers::Sequence +560                                                                           
[0x55863b713cbc] *Debug::DWARF::LineNumbers#decode_sequences<(UInt32 | UInt64)>:Nil +380                                                                                      
[0x55863b713b2c] *Debug::DWARF::LineNumbers#initialize<IO::FileDescriptor+, (UInt32 | UInt64)>:Nil +108                                                                       
[0x55863b713a84] *Debug::DWARF::LineNumbers::new<IO::FileDescriptor+, (UInt32 | UInt64)>:Debug::DWARF::LineNumbers +132                                                       
[0x55863b6daeb8] *CallStack::read_dwarf_sections:(Array(Tuple(UInt64, UInt64, String)) | Nil) +9928                                                                           
[0x55863b6d85ed] *CallStack::decode_line_number<UInt64>:Tuple(String, Int32, Int32) +45                                                                                       
[0x55863b6d7d78] *CallStack#decode_backtrace:Array(String) +296                                                                                                               
[0x55863b6d7c32] *CallStack#printable_backtrace:Array(String) +50                                                                                                             
[0x55863b72d8fd] *Exception+ +77                                                                                                                                              
[0x55863b72d748] *Exception+ +120                                                                                                                                             
[0x55863b6f5647] *Fiber#run:(IO::FileDescriptor | Nil) +583                                                                                                                   
[0x55863b6d2b06] ~proc2Proc(Fiber, (IO::FileDescriptor | Nil)) +6                                                                                                             
[0x0] ??? 

This error is generated by other errors. The bugs which we want to resolve are hided by new error, and it is extremely hard to debug.

@ysbaddaden The cost of mutex depends on how frequently the mutex is competed by different threads/fibers. There is no reason to frequently unwind stacks in production app.

@HertzDevil
Copy link
Contributor

On my M2 I can't reproduce the interleaved messages anymore, but this is what I get:

Unhandled exception in spawn: Error here (Exception)
  from src/crystal/compiler_rt/divmod128.cr:63 in '->'
  from src/fiber.cr:146:11 in 'run'
  from src/fiber.cr:98:34 in '->'
Unhandled exception in spawn: Error here (Exception)
  from src/crystal/compiler_rt/divmod128.cr:63 in '->'
  from src/fiber.cr:146:11 in 'run'
  from src/fiber.cr:98:34 in '->'
Unhandled exception in spawn: Error here (Exception)
  from src/crystal/compiler_rt/divmod128.cr:63 in '->'
  from src/fiber.cr:146:11 in 'run'
  from src/fiber.cr:98:34 in '->'
Unhandled exception in spawn: Error here (Exception)
  from src/crystal/compiler_rt/divmod128.cr:63 in '->'
  from src/fiber.cr:146:11 in 'run'
  from src/fiber.cr:98:34 in '->'

src/crystal/compiler_rt/divmod128.cr seems to be the last file required before src/crystal/main.cr. Other than that I have no idea why it would show up here.

@ysbaddaden
Copy link
Contributor

I do reproduce on Linux x86_64 with MT enabled. The problem is that different threads compete to write their message with many write to STDERR. We might work around the issue by writing the whole message into a buffer, then writing once to STDERR?

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants