Segfault while connecting from multiple processes with Postgres 12 #311
Comments
Confirming the same issue. I ran into it while testing an app on macOS after upgrading from Postgres 11 to Postgres 12 (via brew). It works for a single connection but crashes with multiple connections, e.g. it crashes a Rails app when it tries to load models.
The only thing that changed: I upgraded Postgres from 11 to 12.1. |
@jessebs I expect |
Yes, that did avoid the issue. |
I've reported this upstream: https://postgr.es/m/93f7379b-2e2f-db0c-980e-07ebd5de92ff%40crunchydata.com |
It looks like in Rails 5, this can be fixed by adding Unfortunately Rails 4 doesn't pass |
Thanks a lot @trevorcreech, ENV variable PGGSSENCMODE="disable" worked for me. |
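For anyone who prefers to set the variable from Ruby rather than the shell, here is a minimal sketch. It assumes the code runs at boot, before any connection is opened, so that libpq sees the variable when it connects. The helper name `disable_gss_encryption!` is hypothetical, and restricting it to macOS is an assumption based on where most reports in this thread come from.

```ruby
# Hypothetical helper: defaults PGGSSENCMODE to "disable" on macOS only.
# Assumption: it is called before the first PG connection is made.
def disable_gss_encryption!(env = ENV, platform = RUBY_PLATFORM)
  # Leave other platforms untouched.
  return env unless platform =~ /darwin/

  # Keep an explicit setting if the user already chose one.
  env['PGGSSENCMODE'] ||= 'disable'
  env
end

disable_gss_encryption!
```

The `env` and `platform` arguments exist only so the behaviour can be checked without touching the real environment.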
I'm a bit confused - based on my understanding of the problem, I would expect that closing the connection before the fork would resolve it but it does not. Any understanding of why that's the case? |
It's not about shared handles or anything. It seems that macOS libraries are generally not safe to use after fork() (due to libdispatch.) It's a known issue over at Homebrew. |
Another workaround that I found is by not setting any value to the host, or just omit the
# 1
Process.fork do
  PG.connect(:host => '', :user => 'username', :dbname => 'test_db')
end
# 2
Process.fork do
  PG.connect(:user => 'username', :dbname => 'test_db')
end
If you're in Rails, you can just do the same thing with the
# 1: This will crash
development:
  adapter: postgresql
  database: my_database
  host: localhost
# 2: This won't crash and will still connect to localhost just fine
development:
  adapter: postgresql
  database: my_database
  host:
So far I noticed that I only need this workaround if the host is |
@alienxp03 Yes, this also skips GSS encryption by connecting over Unix socket (which is only available when client and server are on the same machine.) Depending on how the server is configured, Unix socket may involve different authentication than TCP socket. |
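As a rough illustration of when the Unix socket path applies: libpq's documentation says an unset or empty host, or one that looks like an absolute path, selects a Unix-domain socket on platforms that have them. The helper below is a hypothetical approximation of that rule. It deliberately ignores `PGHOST`, `hostaddr`, and Windows, all of which can change the outcome.

```ruby
# Hypothetical approximation of libpq's host rule; not part of the pg gem.
# Ignores PGHOST, hostaddr and platforms without Unix-domain sockets.
def unix_socket_host?(host)
  host.nil? || host.empty? || host.start_with?('/')
end

puts unix_socket_host?(nil)          # host omitted
puts unix_socket_host?('')           # host left blank
puts unix_socket_host?('/tmp')       # socket directory
puts unix_socket_host?('localhost')  # TCP, so GSS negotiation may be attempted
```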
Just letting you guys know that this also happens on Ruby 2.3.1, using Adding |
If you installed PostgreSQL through Homebrew, reinstalling should fix it, thanks to this PR: Homebrew/homebrew-core#47494
brew uninstall postgresql
# then kill all postgresql processes or restart machine
brew install postgresql |
segfault.txt postgresql 14.8 (Homebrew) My workaround was to add |
Encountered this issue today on Ventura 13.2.1. Workaround was also |
same here, came up just today after an auto-upgrade from brew. maybe a regression? @ankane's solution did not work for me :( so i had to go the route with |
Same issue here. Although not ideal, in a Rails app, setting nothing for the
development:
  host:
  adapter: postgresql
  encoding: unicode
  database: <%= ENV['DATABASE_NAME'] %>
  pool: 5
This is a temporary solution and by no means ideal. |
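To see what these two Rails-side options look like once parsed, here is a small sketch using Ruby's standard YAML library. The `gssencmode: disable` key is the setting another commenter in this thread reports trying. Whether Rails forwards it to libpq depends on the Rails version, so treat that part as an assumption. The database name is a placeholder.

```ruby
require 'yaml'

# Variant 1: blank host, which makes libpq fall back to the Unix socket.
blank_host = YAML.safe_load(<<~YML)
  development:
    adapter: postgresql
    database: my_database
    host:
YML

# Variant 2: keep TCP to localhost but ask libpq to skip GSS encryption.
gss_disabled = YAML.safe_load(<<~YML)
  development:
    adapter: postgresql
    database: my_database
    host: localhost
    gssencmode: disable
YML

p blank_host['development']['host']          # a blank value parses as nil
p gss_disabled['development']['gssencmode']  # a plain string, not a boolean
```

The second variant keeps TCP authentication behaviour unchanged, which may matter if the server treats socket and TCP connections differently.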
@konung it seems people are experiencing this again. Can we reopen this issue? |
This is likely happening because My colleague shared this relevant backtrace: "vmRegionInfo" : "0x103e00abc is not in any region. Bytes after previous region: 2749 Bytes before following region: 62788\n REGION TYPE START - END [ VSIZE] PRT\/MAX SHRMOD REGION DETAIL\n MALLOC_TINY 103d00000-103e00000 [ 1024K] rw-\/rwx SM=PRV \n---> GAP OF 0x10000 BYTES\n VM_ALLOCATE 103e10000-103e20000 [ 64K] rw-\/rwx SM=COW ",
"exception" : {"codes":"0x0000000000000001, 0x0000000103e00abc","rawCodes":[1,4359981756],"type":"EXC_BAD_ACCESS","signal":"SIGABRT","subtype":"KERN_INVALID_ADDRESS at 0x0000000103e00abc"},
"vmregioninfo" : "0x103e00abc is not in any region. Bytes after previous region: 2749 Bytes before following region: 62788\n REGION TYPE START - END [ VSIZE] PRT\/MAX SHRMOD REGION DETAIL\n MALLOC_TINY 103d00000-103e00000 [ 1024K] rw-\/rwx SM=PRV \n---> GAP OF 0x10000 BYTES\n VM_ALLOCATE 103e10000-103e20000 [ 64K] rw-\/rwx SM=COW ",
"asi" : {"CoreFoundation":["*** multi-threaded process forked ***"],"libsystem_c.dylib":["crashed on child side of fork pre-exec"]},
"extMods" : {"caller":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"system":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"targeted":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"warnings":0},
"faultingThread" : 0,
"threads" : [{"triggered":true,"id":139232,"threadState":{"x":[{"value":0},{"value":0},{"value":0},{"value":0},{"value":20680267530240},{"value":4410931412992},{"value":144},{"value":0},{"value":10969316120912989786},{"value":10969316121450844250},{"value":2},{"value":4294967293},{"value":1099511627776},{"value":0},{"value":0},{"value":0},{"value":328},{"value":8131268448},{"value":0},{"value":6},{"value":8052481536,"symbolLocation":0,"symbol":"_main_thread"},{"value":771},{"value":8052481760,"symbolLocation":224,"symbol":"_main_thread"},{"value":4303728177,"symbolLocation":33915,"symbol":"hex_table"},{"value":95},{"value":0},{"value":0},{"value":0},{"value":0}],"flavor":"ARM_THREAD_STATE64","lr":{"value":6525926440},"cpsr":{"value":1073745920},"fp":{"value":6171233216},"sp":{"value":6171233184},"esr":{"value":1442840704,"description":" Address size fault"},"pc":{"value":6525699876,"matchesCrashFrame":1},"far":{"value":4701421568}},"queue":"com.apple.main-thread","frames":[{"imageOffset":38692,"symbol":"__pthread_kill","symbolLocation":8,"imageIndex":63},{"imageOffset":27688,"symbol":"pthread_kill","symbolLocation":288,"imageIndex":64},{"imageOffset":486120,"symbol":"abort","symbolLocation":180,"imageIndex":65},{"imageOffset":556468,"symbol":"die","symbolLocation":12,"imageIndex":1},{"imageOffset":556916,"symbol":"rb_bug_for_fatal_signal","symbolLocation":448,"imageIndex":1},{"imageOffset":1792864,"symbol":"sigsegv","symbolLocation":96,"imageIndex":1},{"imageOffset":14884,"symbol":"_sigtramp","symbolLocation":56,"imageIndex":66},{"imageOffset":18052,"symbol":"_os_log_preferences_refresh","symbolLocation":36,"imageIndex":67},{"imageOffset":20748,"symbol":"os_log_type_enabled","symbolLocation":712,"imageIndex":67},{"imageOffset":44020,"symbol":"_xpc_connection_activate_if_needed","symbolLocation":152,"imageIndex":68},{"imageOffset":54464,"symbol":"xpc_connection_resume","symbolLocation":92,"imageIndex":68},{"imageOffset":51696,"symbol":"get_primary_name","symbolLocati
on":152,"imageIndex":38},{"imageOffset":50396,"symbol":"api_macos_ptcursor_next","symbolLocation":240,"imageIndex":38},{"imageOffset":38792,"symbol":"krb5_cccol_cursor_next","symbolLocation":76,"imageIndex":38},{"imageOffset":39536,"symbol":"krb5_cccol_have_content","symbolLocation":92,"imageIndex":38},{"imageOffset":88212,"symbol":"acquire_cred_context","symbolLocation":1664,"imageIndex":37},{"imageOffset":86428,"symbol":"acquire_cred_from","symbolLocation":688,"imageIndex":37},{"imageOffset":29056,"symbol":"gss_add_cred_from","symbolLocation":624,"imageIndex":37},{"imageOffset":28104,"symbol":"gss_acquire_cred_from","symbolLocation":400,"imageIndex":37},{"imageOffset":27692,"symbol":"gss_acquire_cred","symbolLocation":36,"imageIndex":37},{"imageOffset":105780,"symbol":"pg_GSS_have_cred_cache","symbolLocation":60,"imageIndex":34},{"imageOffset":34308,"symbol":"PQconnectPoll","symbolLocation":4416,"imageIndex":34},{"imageOffset":10032,"symbol":"gvl_PQconnectPoll_skeleton","symbolLocation":24,"imageIndex":33},{"imageOffset":2034236,"symbol":"rb_nogvl","symbolLocation":268,"imageIndex":1},{"imageOffset":9992,"symbol":"gvl_PQconnectPoll","symbolLocation":44,"imageIndex":33},{"imageOffset":33088,"symbol":"pgconn_connect_poll","symbolLocation":48,"imageIndex":33},{"imageOffset":2361060,"symbol":"vm_call_cfunc_with_frame","symbolLocation":232,"imageIndex":1},{"imageOffset":2363212,"symbol":"vm_call_symbol","symbolLocation":572,"imageIndex":1},{"imageOffset":2252892,"symbol":"vm_exec_core","symbolLocation":8132,"imageIndex":1},{"imageOffset":2325636,"symbol":"rb_vm_exec","symbolLocation":2092,"imageIndex":1}, As explained in https://blog.phusion.nl/2017/10/13/why-ruby-app-servers-break-on-macos-high-sierra-and-what-can-be-done-about-it/, this happens because the macOS system calls are not thread-safe, so when they get called in a fork a seg fault may occur. I suspect there's not much this library can do about it, but there are workarounds:
Without GSSAPI (installed with
|
|
@Meekohi if you are using rails, try adding |
I did give that a shot as well with no luck. After more experiments I found that this was only affecting me on the new Amazon Linux AMI2023, and does not affect the older Amazon Linux AMI2, so I've switched back to that for now. I assume there is some newer library (maybe libpq?) the new AMI is using that somehow exercises this bug. |
We're crashing again (on macOS) since Homebrew/homebrew-core#132976.
This has occurred before, from https://bugs.ruby-lang.org/issues/16239:
> This is a fatal interaction between the PostgreSQL 12 client libraries
> and the GSS implementation provided by macOS. This is being tracked in
> the pg gem at ged/ruby-pg#311
More in ged/ruby-pg#311 (comment)
Docs on PGGSSENCMODE: https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNECT-GSSENCMODE
OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES has been a workaround in the past (puma/puma#1421) but it shouldn't be used without care: Homebrew/homebrew-core#137909 (comment)
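The linked libpq docs describe `gssencmode` as a per-connection parameter, so it can also be set on a single connection instead of process-wide. A minimal sketch follows, assuming a pg gem and libpq new enough to know the parameter (PostgreSQL 12 or later). The helper name `with_gss_disabled` is hypothetical, and the connection details are the placeholders used elsewhere in this thread.

```ruby
# Hypothetical helper: adds gssencmode to a params hash without mutating it.
# A caller-supplied :gssencmode wins over the default.
def with_gss_disabled(params)
  { gssencmode: 'disable' }.merge(params)
end

params = with_gss_disabled(host: 'localhost', user: 'username', dbname: 'test_db')
p params

# With the pg gem loaded and a server running, the hash would be used as:
#   conn = PG.connect(params)
```

Scoping the setting to one connection avoids changing behaviour for other libpq clients in the same process.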
This fixed my issue. Thank you so much! |
This just does not work for me. I've even tried the following
before_fork do
  require 'pg'
  # Force a connection and query to PG
  ActiveRecord::Base.connection.migration_context.current_version
  # Work around macOS 10.13 and later being very picky about `fork` usage and
  # interactions with Objective-C code see: https://github.com/puma/puma/issues/1421
  if /darwin/ =~ RUBY_PLATFORM
    require 'fiddle'
    # Dynamically load Foundation.framework, ~implicitly~ initialising the
    # Objective-C runtime before any forking happens in Puma
    Fiddle.dlopen('/System/Library/Frameworks/Foundation.framework/Foundation')
  end
  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord)
end
I can set Ruby: |
The actual command is You probably knew that, but I wanted to clarify for others. |
@stevenharman Is it possible there's a different seg fault happening (such as #555)? I should note that if I modify the reproduction script in #311 (comment) to incorporate your change, this seems to fix the GSSAPI seg fault for me:
#!/usr/bin/env ruby
require 'pg'
PG.connect(:host => 'localhost', :user => 'username', :dbname => 'test_db')
if /darwin/ =~ RUBY_PLATFORM
  require 'fiddle'
  # Dynamically load Foundation.framework, ~implicitly~ initialising the
  # Objective-C runtime before any forking happens in Puma
  Fiddle.dlopen('/System/Library/Frameworks/Foundation.framework/Foundation')
end
Process.fork do
  PG.connect(:host => 'localhost', :user => 'username', :dbname => 'test_db')
end
Maybe if you post the stack trace and crash dump we can confirm. |
this worked for me! |
Resolving Segmentation Faults with PostgreSQL 12, Ruby 2.6.5, and Rails ^6.0.0 on Ubuntu 22.04
If you are using PostgreSQL 12, Ruby 2.6.5, and Rails ^6.0.0 on Ubuntu 22.04 and encounter a segmentation fault, this guide is for you.
Issue
By default, Ubuntu 22.04 uses OpenSSL 3. However, your Ruby version requires OpenSSL 1.0.1 or 1.1.1, which Ubuntu does not natively support.
Solution
There is no need to modify OpenSSL. Instead, follow these steps:
Additional Help
If your environment doesn't allow you to install Ruby or you encounter other issues related to versions, feel free to reach out to me for personalized assistance. For any further problems or personalized help, please contact me directly. |
I'm running Postgres 12.1, Ruby 2.7.4 and pg 1.4.6 on MX Linux and I'm facing the segfault. I tried gssencmode: disable but still no luck. |
try this work for me |
I'm running Postgres 12.1, Ruby 2.6.5 and pg 1.1.4 on MacOS Catalina. This also happens on Ruby 2.5.1 and 2.5.7 as well as MacOS Mojave. It does not happen with Postgres 10 or 11.
Running the following small script will cause a segfault.
See the attached segfault.txt and crash_report.txt