Segfault while connecting from multiple processes with Postgres 12 #311

Closed
jessebs opened this issue Dec 3, 2019 · 29 comments

Comments

@jessebs

jessebs commented Dec 3, 2019

I'm running Postgres 12.1, Ruby 2.6.5 and pg 1.1.4 on MacOS Catalina. This also happens on Ruby 2.5.1 and 2.5.7 as well as MacOS Mojave. It does not happen with Postgres 10 or 11.

Running the following small script will cause a segfault.

#!/usr/bin/env ruby
require 'pg'

PG.connect(:host => 'localhost', :user => 'username', :dbname => 'test_db')

Process.fork do
  PG.connect(:host => 'localhost', :user => 'username', :dbname => 'test_db')
end

See the attached segfault.txt and crash_report.txt

@konung

konung commented Dec 3, 2019

Confirming the same issue. I ran into it while testing an app on macOS after upgrading from Postgres 11 to Postgres 12 (via brew). It works for a single connection but crashes with multiple connections, e.g. it crashes a Rails app when trying to load models.

gem 'pg' => PG::VERSION = "1.1.4"
rails 6.0.1
ruby 2.6.5 (also tried 2.5.3 with the same result)
psql (PostgreSQL) 12.1
MacOS Catalina

The only thing that changed: I upgraded Postgres from 11 to 12.1.

@cbandy
Contributor

cbandy commented Dec 3, 2019

@jessebs I expect gssencmode: "disable" will avoid the crash. Can you please verify?
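For anyone wanting to try this against the reproduction script, here is a minimal sketch. The host, user, and dbname values are the placeholders from the report; `connection_params` and `reproduce_with_gss_disabled` are hypothetical helper names, not part of the pg gem. It assumes libpq 12 or newer, since older client libraries do not know the `gssencmode` option.

```ruby
# Connection settings from the reproduction script, plus gssencmode.
# `connection_params` is a hypothetical helper, not part of the pg gem.
def connection_params(overrides = {})
  {
    host: 'localhost',
    user: 'username',
    dbname: 'test_db',
    gssencmode: 'disable' # libpq 12+ option; skips the GSSAPI encryption attempt
  }.merge(overrides)
end

# Hypothetical helper: same shape as the original script, connecting once in
# the parent and once in a forked child. Needs the pg gem and a reachable server.
def reproduce_with_gss_disabled
  require 'pg'

  PG.connect(connection_params)
  pid = Process.fork { PG.connect(connection_params) }
  Process.wait(pid)
  $? # Process::Status of the child
end
```

Calling `reproduce_with_gss_disabled` should return a status whose `success?` is true if the workaround applies to your setup.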

@jessebs
Author

jessebs commented Dec 3, 2019

Yes, that did avoid the issue.

@cbandy
Contributor

cbandy commented Dec 3, 2019

I've reported this upstream: https://postgr.es/m/93f7379b-2e2f-db0c-980e-07ebd5de92ff%40crunchydata.com

@trevorcreech

It looks like in Rails 5, this can be fixed by adding gssencmode: disable as a connection parameter in database.yml

Unfortunately Rails 4 doesn't pass gssencmode along to pg (because VALID_CONN_PARAMS is hardcoded until Rails 5). We were able to fix this by setting a PGGSSENCMODE="disable" environment variable.
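If setting the variable in the shell is inconvenient, the same default can be applied from Ruby before the first connection is opened. This is only a sketch: `default_gssencmode!` is a hypothetical helper, and where it gets called (for example near the top of `config/boot.rb`) is an assumption about your app. libpq reads `PGGSSENCMODE` when a connection is opened and the connection parameters do not set `gssencmode` themselves.

```ruby
# Hypothetical helper: set PGGSSENCMODE unless a value was already chosen.
# Accepts any Hash-like object so it can be exercised without touching ENV.
def default_gssencmode!(env = ENV, value = 'disable')
  env['PGGSSENCMODE'] ||= value
end

# Example call, to be placed before any database connection is made:
# default_gssencmode!
```

Using `||=` keeps any value that was exported deliberately, so the helper only fills in a missing setting.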

@miguelpeniche

Thanks a lot @trevorcreech, ENV variable PGGSSENCMODE="disable" worked for me.

@jessebs
Author

jessebs commented Dec 6, 2019

I'm a bit confused - based on my understanding of the problem, I would expect that closing the connection before the fork would resolve it but it does not.

Any understanding of why that's the case?

@cbandy
Contributor

cbandy commented Dec 6, 2019

It's not about shared handles or anything. It seems that macOS libraries are generally not safe to use after fork() (due to libdispatch.) It's a known issue over at Homebrew.
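When testing whether a given workaround helps, it can be useful to see how the forked child actually ended. The following is a rough sketch with a hypothetical helper name, `child_outcome`; it only uses `Process.fork` and `Process.wait2` from Ruby's standard library and makes no PostgreSQL calls itself.

```ruby
# Hypothetical helper: run a block in a forked child and describe how it ended.
def child_outcome(&block)
  pid = Process.fork(&block)
  _, status = Process.wait2(pid)

  if status.signaled?
    "killed by signal #{status.termsig}"
  else
    "exited with status #{status.exitstatus}"
  end
end

# Example (needs the pg gem and a server):
# puts child_outcome { PG.connect(host: 'localhost', user: 'username', dbname: 'test_db') }
```

A child that connects cleanly should report "exited with status 0", while a crashing child should report the signal that killed it.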

@alienxp03

alienxp03 commented Dec 11, 2019

Another workaround that I found is to set the host to an empty value, or to omit the host param entirely.

# 1
Process.fork do
  PG.connect(:host => '', :user => 'username', :dbname => 'test_db')
end

# 2
Process.fork do
  PG.connect(:user => 'username', :dbname => 'test_db')
end

If you're in Rails, you can just do the same thing with the host params.

# 1: This will crash
development:
  adapter: postgresql
  database: my_database
  host: localhost

# 2: This won't crash and will still connect to localhost just fine
development:
  adapter: postgresql
  database: my_database
  host:

So far I noticed that I only need this workaround if the host is localhost or 127.0.0.1.

@cbandy
Contributor

cbandy commented Dec 11, 2019

@alienxp03 Yes, this also skips GSS encryption by connecting over Unix socket (which is only available when client and server are on the same machine.) Depending on how the server is configured, Unix socket may involve different authentication than TCP socket.
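To make the distinction concrete, here is a small sketch of how the host setting maps to the transport. `unix_socket?` is a hypothetical helper, and it is a simplification: it ignores `PGHOST`, `PGHOSTADDR`, service files, and abstract socket names. It relies on the documented libpq behaviour that an empty or missing host, or a host that looks like an absolute path, selects a Unix-domain socket.

```ruby
# Hypothetical helper: guess whether libpq would use a Unix-domain socket
# for these connection parameters (environment variables are not considered).
def unix_socket?(params)
  # A numeric hostaddr forces a TCP connection.
  return false unless params[:hostaddr].to_s.empty?

  host = params[:host].to_s
  host.empty? || host.start_with?('/')
end

unix_socket?({ host: 'localhost', user: 'username', dbname: 'test_db' }) # TCP
unix_socket?({ host: '', user: 'username', dbname: 'test_db' })          # Unix socket
unix_socket?({ user: 'username', dbname: 'test_db' })                    # Unix socket
```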

@TiuTalk

TiuTalk commented Dec 11, 2019

Just letting you guys know that this also happens on Ruby 2.3.1, using pg 0.19.0, with postgresql stable 12.1 from Homebrew.

Adding gssencmode: disable to the database.yml fixed the segfaults, thanks @trevorcreech

@ankane

ankane commented Jan 21, 2020

If you installed PostgreSQL through Homebrew, reinstalling should fix it, thanks to this PR: Homebrew/homebrew-core#47494

brew uninstall postgresql
# then kill all postgresql processes or restart machine
brew install postgresql

@abepark01

abepark01 commented Jun 8, 2023

segfault.txt
I encountered this issue today on Mac OS Ventura 13.8 when running a rails server locally on port 3000.

postgresql 14.8 (Homebrew)
ruby 3.2.2 (installed via rbenv)
pg 1.5.3
rails 7.0.5

My workaround was to add export PGGSSENCMODE="disable" to my .zshrc file

@kefimochi

Encountered this issue today on Ventura 13.2.1. My workaround was also to add export PGGSSENCMODE="disable" to my .zshrc file. Thanks everyone!

@jk779

jk779 commented Jun 14, 2023

same here, came up just today after an auto-upgrade from brew. maybe a regression?

@ankane's solution did not work for me :(

so i had to go the route with PGGSSENCMODE="disable" too, which i don't really like :/

@DaniG2k

DaniG2k commented Jun 15, 2023

Same issue here. Although not ideal, in a Rails app, setting nothing for the host: value like the following in database.yml seems to do the trick:

development:
  host:
  adapter: postgresql
  encoding: unicode
  database: <%= ENV['DATABASE_NAME'] %>
  pool: 5

This is a temporary solution and by no means ideal.

@daande

daande commented Jun 16, 2023

@konung it seems people are experiencing this again. Can we reopen this issue?

@stanhu

stanhu commented Jun 27, 2023

This is likely happening because krb5 v1.21 shipped with this change: krb5/krb5#1221

My colleague shared this relevant backtrace:

  "vmRegionInfo" : "0x103e00abc is not in any region.  Bytes after previous region: 2749  Bytes before following region: 62788\n      REGION TYPE                    START - END         [ VSIZE] PRT\/MAX SHRMOD  REGION DETAIL\n      MALLOC_TINY                 103d00000-103e00000    [ 1024K] rw-\/rwx SM=PRV  \n--->  GAP OF 0x10000 BYTES\n      VM_ALLOCATE                 103e10000-103e20000    [   64K] rw-\/rwx SM=COW  ",
  "exception" : {"codes":"0x0000000000000001, 0x0000000103e00abc","rawCodes":[1,4359981756],"type":"EXC_BAD_ACCESS","signal":"SIGABRT","subtype":"KERN_INVALID_ADDRESS at 0x0000000103e00abc"},
  "vmregioninfo" : "0x103e00abc is not in any region.  Bytes after previous region: 2749  Bytes before following region: 62788\n      REGION TYPE                    START - END         [ VSIZE] PRT\/MAX SHRMOD  REGION DETAIL\n      MALLOC_TINY                 103d00000-103e00000    [ 1024K] rw-\/rwx SM=PRV  \n--->  GAP OF 0x10000 BYTES\n      VM_ALLOCATE                 103e10000-103e20000    [   64K] rw-\/rwx SM=COW  ",
  "asi" : {"CoreFoundation":["*** multi-threaded process forked ***"],"libsystem_c.dylib":["crashed on child side of fork pre-exec"]},
  "extMods" : {"caller":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"system":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"targeted":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"warnings":0},
  "faultingThread" : 0,
  "threads" : [{"triggered":true,"id":139232,"threadState":{"x":[{"value":0},{"value":0},{"value":0},{"value":0},{"value":20680267530240},{"value":4410931412992},{"value":144},{"value":0},{"value":10969316120912989786},{"value":10969316121450844250},{"value":2},{"value":4294967293},{"value":1099511627776},{"value":0},{"value":0},{"value":0},{"value":328},{"value":8131268448},{"value":0},{"value":6},{"value":8052481536,"symbolLocation":0,"symbol":"_main_thread"},{"value":771},{"value":8052481760,"symbolLocation":224,"symbol":"_main_thread"},{"value":4303728177,"symbolLocation":33915,"symbol":"hex_table"},{"value":95},{"value":0},{"value":0},{"value":0},{"value":0}],"flavor":"ARM_THREAD_STATE64","lr":{"value":6525926440},"cpsr":{"value":1073745920},"fp":{"value":6171233216},"sp":{"value":6171233184},"esr":{"value":1442840704,"description":" Address size fault"},"pc":{"value":6525699876,"matchesCrashFrame":1},"far":{"value":4701421568}},"queue":"com.apple.main-thread","frames":[{"imageOffset":38692,"symbol":"__pthread_kill","symbolLocation":8,"imageIndex":63},{"imageOffset":27688,"symbol":"pthread_kill","symbolLocation":288,"imageIndex":64},{"imageOffset":486120,"symbol":"abort","symbolLocation":180,"imageIndex":65},{"imageOffset":556468,"symbol":"die","symbolLocation":12,"imageIndex":1},{"imageOffset":556916,"symbol":"rb_bug_for_fatal_signal","symbolLocation":448,"imageIndex":1},{"imageOffset":1792864,"symbol":"sigsegv","symbolLocation":96,"imageIndex":1},{"imageOffset":14884,"symbol":"_sigtramp","symbolLocation":56,"imageIndex":66},{"imageOffset":18052,"symbol":"_os_log_preferences_refresh","symbolLocation":36,"imageIndex":67},{"imageOffset":20748,"symbol":"os_log_type_enabled","symbolLocation":712,"imageIndex":67},{"imageOffset":44020,"symbol":"_xpc_connection_activate_if_needed","symbolLocation":152,"imageIndex":68},{"imageOffset":54464,"symbol":"xpc_connection_resume","symbolLocation":92,"imageIndex":68},{"imageOffset":51696,"symbol":"get_primary_name","symbolLocation":152,"imageIndex":38},{"imageOffset":50396,"symbol":"api_macos_ptcursor_next","symbolLocation":240,"imageIndex":38},{"imageOffset":38792,"symbol":"krb5_cccol_cursor_next","symbolLocation":76,"imageIndex":38},{"imageOffset":39536,"symbol":"krb5_cccol_have_content","symbolLocation":92,"imageIndex":38},{"imageOffset":88212,"symbol":"acquire_cred_context","symbolLocation":1664,"imageIndex":37},{"imageOffset":86428,"symbol":"acquire_cred_from","symbolLocation":688,"imageIndex":37},{"imageOffset":29056,"symbol":"gss_add_cred_from","symbolLocation":624,"imageIndex":37},{"imageOffset":28104,"symbol":"gss_acquire_cred_from","symbolLocation":400,"imageIndex":37},{"imageOffset":27692,"symbol":"gss_acquire_cred","symbolLocation":36,"imageIndex":37},{"imageOffset":105780,"symbol":"pg_GSS_have_cred_cache","symbolLocation":60,"imageIndex":34},{"imageOffset":34308,"symbol":"PQconnectPoll","symbolLocation":4416,"imageIndex":34},{"imageOffset":10032,"symbol":"gvl_PQconnectPoll_skeleton","symbolLocation":24,"imageIndex":33},{"imageOffset":2034236,"symbol":"rb_nogvl","symbolLocation":268,"imageIndex":1},{"imageOffset":9992,"symbol":"gvl_PQconnectPoll","symbolLocation":44,"imageIndex":33},{"imageOffset":33088,"symbol":"pgconn_connect_poll","symbolLocation":48,"imageIndex":33},{"imageOffset":2361060,"symbol":"vm_call_cfunc_with_frame","symbolLocation":232,"imageIndex":1},{"imageOffset":2363212,"symbol":"vm_call_symbol","symbolLocation":572,"imageIndex":1},{"imageOffset":2252892,"symbol":"vm_exec_core","symbolLocation":8132,"imageIndex":1},{"imageOffset":2325636,"symbol":"rb_vm_exec","symbolLocation":2092,"imageIndex":1},

As explained in https://blog.phusion.nl/2017/10/13/why-ruby-app-servers-break-on-macos-high-sierra-and-what-can-be-done-about-it/, this happens because some macOS system libraries are not fork-safe: when they are called in the child of a forked multi-threaded process, a segfault may occur.

I suspect there's not much this library can do about it, but there are workarounds:

  1. Disable GSSAPI via the PGGSSENCMODE=disable environment variable or pass gssencmode=disable in the connection string.
  2. Initiate the database connection by preloading the app before forking so the macOS system calls are invoked in the parent process. In Puma, the preload_app! config option does this.
  3. Use a PostgreSQL server that does not have --with-gssapi enabled. By default I believe the Homebrew version has this, but asdf does not install PostgreSQL with this. You can check by using otool -L:

Without GSSAPI (installed with asdf)

% otool -L ~/.asdf/installs/postgres/13.9/bin/postgres
/Users/stanhu/.asdf/installs/postgres/13.9/bin/postgres:
	/opt/homebrew/opt/openssl@3/lib/libssl.3.dylib (compatibility version 3.0.0, current version 3.0.0)
	/opt/homebrew/opt/openssl@3/lib/libcrypto.3.dylib (compatibility version 3.0.0, current version 3.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)

With GSSAPI (installed with Homebrew)

% otool -L /opt/homebrew/opt/postgresql@12/bin/postgres
/opt/homebrew/opt/postgresql@12/bin/postgres:
	/usr/lib/libxml2.2.dylib (compatibility version 10.0.0, current version 10.9.0)
	/usr/lib/libpam.2.dylib (compatibility version 3.0.0, current version 3.0.0)
	/opt/homebrew/opt/openssl@3/lib/libssl.3.dylib (compatibility version 3.0.0, current version 3.0.0)
	/opt/homebrew/opt/openssl@3/lib/libcrypto.3.dylib (compatibility version 3.0.0, current version 3.0.0)
	/opt/homebrew/opt/krb5/lib/libgssapi_krb5.2.2.dylib (compatibility version 2.0.0, current version 2.2.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)
	/System/Library/Frameworks/LDAP.framework/Versions/A/LDAP (compatibility version 1.0.0, current version 2.4.0)
	/opt/homebrew/opt/icu4c/lib/libicui18n.73.dylib (compatibility version 73.0.0, current version 73.2.0)
	/opt/homebrew/opt/icu4c/lib/libicuuc.73.dylib (compatibility version 73.0.0, current version 73.2.0)
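
The same check can be scripted. This is a sketch only: `links_gssapi?` is a hypothetical helper that looks for a library name containing "gssapi" in `otool -L` output, which matches the `libgssapi_krb5` line in the Homebrew listing above but may miss other GSS implementations.

```ruby
# Hypothetical helper: report whether `otool -L` output lists a GSSAPI library.
def links_gssapi?(otool_output)
  otool_output.each_line.any? { |line| line.match?(/gssapi/i) }
end

# Example on macOS, using the Homebrew path from the listing above:
# output = `otool -L /opt/homebrew/opt/postgresql@12/bin/postgres`
# puts links_gssapi?(output)
```

Since the backtrace above goes through PQconnectPoll, running the same check against the libpq library that the pg gem loads may be more direct; that is my assumption rather than something verified here.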

@Meekohi

Meekohi commented Jun 30, 2023

PGGSSENCMODE="disable" is not working for me, is there an earlier version of pg that is not affected?

@jdeff

jdeff commented Jun 30, 2023

@Meekohi if you are using rails, try adding gssencmode: disable to your db config in config/database.yml.

@Meekohi

Meekohi commented Jun 30, 2023

I did give that a shot as well with no luck. After more experiments I found that this was only affecting me on the new Amazon Linux AMI2023, and does not affect the older Amazon Linux AMI2, so I've switched back to that for now. I assume there is some newer library (maybe libpq?) the new AMI is using that somehow exercises this bug.

dentarg added a commit to Starkast/wikimum that referenced this issue Aug 17, 2023
We're crashing again (in macOS) since
Homebrew/homebrew-core#132976

This has occurred before, from https://bugs.ruby-lang.org/issues/16239

> This is a fatal interaction between the PostgreSQL 12 client libraries
> and the GSS implementation provided by macOS. This is being tracked in
> the pg gem at ged/ruby-pg#311

More in ged/ruby-pg#311 (comment)

Docs on PGGSSENCMODE
https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNECT-GSSENCMODE

OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES has been a workaround in the
past (puma/puma#1421) but it shouldn't be used
without care: Homebrew/homebrew-core#137909 (comment)
@Rhoxio

Rhoxio commented Sep 14, 2023

> It looks like in Rails 5, this can be fixed by adding gssencmode: disable as a connection parameter in database.yml
>
> Unfortunately Rails 4 doesn't pass gssencmode along to pg (because VALID_CONN_PARAMS is hardcoded until Rails 5). We were able to fix this by setting a PGGSSENCMODE="disable" environment variable.

This fixed my issue. Thank you so much!

@stevenharman

> 2. Initiate the database connection by preloading the app before forking so the macOS system calls are invoked in the parent process. In Puma, the preload_app! config option does this.

This just does not work for me. I've even tried the following before_fork hook in Puma:

before_fork do
  require 'pg'
  # Force a connection and query to PG
  ActiveRecord::Base.connection.migration_context.current_version

  # Work around macOS 10.13 and later being very picky about `fork` usage and
  # interactions with Objective-C code see:https://github.com/puma/puma/issues/1421
  if /darwin/ =~ RUBY_PLATFORM
    require 'fiddle'
    # Dynamically load Foundation.framework, ~implicitly~ initialising the
    # Objective-C runtime before any forking happens in Puma
    Fiddle.dlopen('/System/Library/Frameworks/Foundation.framework/Foundation')
  end

  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord)
end

I can set gssencmode: disable in database.yml, but that seems so heavy-handed, and preloading the app sure sounds like it should fix it.

Ruby: 3.0.6
pg: 1.5.4
Rails: 6.1.7.4

@docwhat

docwhat commented Dec 13, 2023

> PGGSSENCMODE="disable" is not working for me, is there an earlier version of pg that is not affected?

The actual command is export PGGSSENCMODE="disable". Without export, the variable is only visible to the shell, not to any commands (like rake, ruby, or rails).
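The difference is easy to demonstrate. The sketch below uses a hypothetical helper, `child_sees_gssencmode`, which runs a snippet in `sh` and then asks a child process (`printenv`) what it sees. It assumes a POSIX `sh` and `printenv` are on the PATH, and it starts from an environment where the variable is unset.

```ruby
require 'open3'

# Hypothetical helper: run a shell snippet, then ask a child process
# (printenv) whether it can see PGGSSENCMODE. Returns what the child printed.
def child_sees_gssencmode(shell_setup)
  clean_env = { 'PGGSSENCMODE' => nil } # nil removes the variable for this run
  output, _status = Open3.capture2(clean_env, 'sh', '-c', "#{shell_setup}; printenv PGGSSENCMODE")
  output.strip
end

child_sees_gssencmode('PGGSSENCMODE=disable')        # => ""
child_sees_gssencmode('export PGGSSENCMODE=disable') # => "disable"
```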

You probably knew that, but I wanted to clarify for others.

@stanhu

stanhu commented Jan 8, 2024

> This just does not work for me. I've even tried the following before_fork hook in Puma:

@stevenharman Is it possible there's a different seg fault happening (such as #555)? I should note that if I modify the reproduction script in #311 (comment) to incorporate your change, this seems to fix the GSSAPI seg fault for me:

#!/usr/bin/env ruby
require 'pg'

PG.connect(:host => 'localhost', :user => 'username', :dbname => 'test_db')

if /darwin/ =~ RUBY_PLATFORM
  require 'fiddle'
  # Dynamically load Foundation.framework, ~implicitly~ initialising the
  # Objective-C runtime before any forking happens in Puma
  Fiddle.dlopen('/System/Library/Frameworks/Foundation.framework/Foundation')
end

Process.fork do
  PG.connect(:host => 'localhost', :user => 'username', :dbname => 'test_db')
end

Maybe if you post the stack trace and crash dump we can confirm.

@napster235

> @Meekohi if you are using rails, try adding gssencmode: disable to your db config in config/database.yml.

this worked for me!

@PrathameshSurve

PrathameshSurve commented Jun 27, 2024

Resolving Segmentation Faults with PostgreSQL 12, Ruby 2.6.5, and Rails ^6.0.0 on Ubuntu 22.04

If you are using PostgreSQL 12, Ruby 2.6.5, and Rails ^6.0.0 on Ubuntu 22.04 and encounter a segmentation fault, this guide is for you.

Issue

By default, Ubuntu 22.04 uses OpenSSL 3. However, your Ruby version requires OpenSSL 1.0.1 or 1.1.1, which Ubuntu does not natively support.

Solution

There is no need to modify OpenSSL. Instead, follow these steps:

  1. Check your database.yml file:

    development:
      adapter: postgresql
      encoding: unicode
      host: 
      port: 5432
      pool: 5
      database: database_name
      username: username
      password: password
    
    test:
      adapter: postgresql
      encoding: unicode
      host: 
      port: 5432
      pool: 5
      database: database_name
      username: username
      password: password
  2. Modify the host field:

    Replace host: localhost with host:, leaving it blank. This adjustment can help resolve the segmentation fault.

Additional Help

If your environment doesn't allow you to install Ruby or you encounter other issues related to versions, feel free to reach out to me for personalized assistance.


For any further problems or personalized help, please contact me directly.

@BrijeshSajeev

I'm running Postgres 12.1, Ruby 2.7.4 and pg 1.4.6 on MX Linux and I'm hitting the segfault. I tried gssencmode: disable, but still no luck.

@bichou

bichou commented Jul 15, 2024

> I'm running Postgres 12.1, Ruby 2.7.4 and pg 1.4.6 on MX-linux and i'm facing the segfault and i tried gssencmode: disable but still no luck.

Try this; it worked for me:
systemctl stop postgresql.service
msfconsole
After that it works fine.
