2.0.8 crashes in update_galera_set_read_only() #2393

Closed
alexjfisher opened this issue Nov 13, 2019 · 6 comments

@alexjfisher

After upgrading from 2.0.7 to 2.0.8, ProxySQL crashes shortly after startup.

This is on RHEL 7.7 using the official package.

I think all relevant configuration is probably already in the log file, but in addition, my setup looks a bit like...

  • 2 galera hostgroups, 2 servers in each (clusters also have a garbd node for quorum).
  • Asynchronous replication is configured between the clusters, and one cluster has all its nodes set to READ_ONLY (this is how I'm handling site failover).
  • I have a query rule that directs users to the 'correct' writer hostgroup (i.e. the one in the cluster that doesn't have all its nodes set as read-only). The destination_hostgroup is updated by a scheduler script. To fail over between sites, I set the current live site's nodes to read-only=1 and, shortly afterwards, set the new site's nodes to read-only=0. The scheduler script notices the changes in the runtime_mysql_servers table and updates the query rules accordingly.
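
The decision the scheduler script makes can be sketched roughly as follows. This is only an illustration, not the actual script: the function name, the row layout, hostgroup 11 and the 10.0.0.1 address are all hypothetical, and it assumes that read-only nodes have already been moved out of their writer hostgroup by the Galera monitor, so that a writer hostgroup with no ONLINE server belongs to the passive site.

```python
from collections import defaultdict


def pick_writer_hostgroup(rows, candidate_writer_hostgroups):
    """Return the first candidate writer hostgroup that has an ONLINE server.

    rows: iterable of (hostgroup_id, hostname, port, status) tuples, for
    example the result of a SELECT on runtime_mysql_servers (assumed layout).
    """
    online = defaultdict(int)
    for hostgroup_id, _hostname, _port, status in rows:
        if status == "ONLINE":
            online[hostgroup_id] += 1
    for hostgroup_id in candidate_writer_hostgroups:
        if online[hostgroup_id] > 0:
            return hostgroup_id
    return None  # no cluster currently accepts writes


# Hypothetical snapshot: the node from the log has been moved to hostgroup 3,
# so writer hostgroup 1 is empty and hostgroup 11 (made up) is the live writer.
rows = [
    (3, "192.168.6.74", 3306, "ONLINE"),
    (11, "10.0.0.1", 3306, "ONLINE"),
]
print(pick_writer_hostgroup(rows, [1, 11]))  # 11
```

The returned value would then be written to the query rule's destination_hostgroup; how that update is issued is left out here.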

I install and configure ProxySQL using Puppet with the puppet/proxysql module, which I help maintain.

The full log ends with...

2019-11-13 13:24:18 MySQL_HostGroups_Manager.cpp:4447:update_galera_set_read_only(): [WARNING] Galera: setting host 192.168.6.74:3306 (part of cluster with writer_hostgroup=1) in read_only because: read_only=YES
2019-11-13 13:24:18 sqlite3db.cpp:61:execute(): [ERROR] SQLITE error: no such function: hostgroup_id --- UPDATE OR IGNORE mysql_servers_incoming SET hostgroup_id=3 WHERE hostname='192.168.6.74' AND port=3306 AND hostgroup_id (1, 2)
Error: signal 11:
/usr/bin/proxysql(_Z13crash_handleri+0x1a)[0x4992da]
/lib64/libc.so.6(+0x363f0)[0x7f3bf183b3f0]
/lib64/libc.so.6(_IO_vfprintf+0x4a79)[0x7f3bf1852069]
/lib64/libc.so.6(vsprintf+0x6b)[0x7f3bf187642b]
/lib64/libc.so.6(sprintf+0x87)[0x7f3bf18585c7]
/usr/bin/proxysql(_ZN24MySQL_HostGroups_Manager27update_galera_set_read_onlyEPciiS0_+0x2bf)[0x4c244f]
/usr/bin/proxysql(_Z21monitor_galera_threadPv+0x13b3)[0x559803]
/usr/bin/proxysql(_ZN14ConsumerThread3runEv+0xf7)[0x55b7d7]
/lib64/libpthread.so.0(+0x7ea5)[0x7f3bf2a1bea5]
/lib64/libc.so.6(clone+0x6d)[0x7f3bf19038cd]
2019-11-13 13:24:18 main.cpp:1396:ProxySQL_daemonize_phase3(): [ERROR] ProxySQL crashed. Restarting!
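
The SQLITE error in the log is what SQLite reports when the IN keyword is missing: `hostgroup_id (1, 2)` is parsed as a call to a function named hostgroup_id. A minimal reproduction with Python's sqlite3 module, using a simplified stand-in for the mysql_servers_incoming table (the three-column schema is an assumption, not ProxySQL's real one):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema; the real table has more columns.
conn.execute(
    "CREATE TABLE mysql_servers_incoming (hostgroup_id INT, hostname TEXT, port INT)"
)
conn.execute("INSERT INTO mysql_servers_incoming VALUES (1, '192.168.6.74', 3306)")

# Statement copied from the log line above.
broken = (
    "UPDATE OR IGNORE mysql_servers_incoming SET hostgroup_id=3 "
    "WHERE hostname='192.168.6.74' AND port=3306 AND hostgroup_id (1, 2)"
)
# Same statement with the missing IN keyword restored.
fixed = broken.replace("hostgroup_id (1, 2)", "hostgroup_id IN (1, 2)")

try:
    conn.execute(broken)
    broken_error = None
except sqlite3.OperationalError as exc:
    broken_error = str(exc)

updated = conn.execute(fixed).rowcount

print(broken_error)  # no such function: hostgroup_id
print(updated)       # 1
```

This only accounts for the [ERROR] line; the segmentation fault that follows happens in a later sprintf call.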

I can attach a core dump encrypted with gpg, or mail it privately if needed. I did have a quick look with gdb, and maybe the following is enough on its own.

Core was generated by `/usr/bin/proxysql --reload -c /etc/proxysql.cnf'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f3bf1852069 in _IO_vfprintf_internal (s=s@entry=0x7f3bed7e0d80, format=<optimized out>, 
    format@entry=0x8e1498 "DELETE FROM mysql_servers_incoming WHERE hostname='%s' AND port=%d AND hostgroup_id in (%d, %d) FROM mysql_galera_hostgroups WHERE writer_hostgroup=%d)", 
    ap=ap@entry=0x7f3bed7e0ea8) at vfprintf.c:1635
1635		  process_string_arg (((struct printf_spec *) NULL));
(gdb) bt
#0  0x00007f3bf1852069 in _IO_vfprintf_internal (s=s@entry=0x7f3bed7e0d80, format=<optimized out>, 
    format@entry=0x8e1498 "DELETE FROM mysql_servers_incoming WHERE hostname='%s' AND port=%d AND hostgroup_id in (%d, %d) FROM mysql_galera_hostgroups WHERE writer_hostgroup=%d)", 
    ap=ap@entry=0x7f3bed7e0ea8) at vfprintf.c:1635
#1  0x00007f3bf187642b in __IO_vsprintf (string=0x7f3bedad9d00 "DELETE FROM mysql_servers_incoming WHERE hostname='up_id=3 WHERE hostname='192.168.6.74' AND port=3306 AND hostgroup_id (1, 2)", 
    format=0x8e1498 "DELETE FROM mysql_servers_incoming WHERE hostname='%s' AND port=%d AND hostgroup_id in (%d, %d) FROM mysql_galera_hostgroups WHERE writer_hostgroup=%d)", 
    args=args@entry=0x7f3bed7e0ea8) at iovsprintf.c:42
#2  0x00007f3bf18585c7 in __sprintf (s=<optimized out>, format=<optimized out>) at sprintf.c:32
#3  0x00000000004c244f in MySQL_HostGroups_Manager::update_galera_set_read_only (this=0x7f3bf143a500, _hostname=0x7f3bee821100 "192.168.6.74", _port=3306, _writer_hostgroup=1, 
    _error=_error@entry=0x9101a5 "read_only=YES") at MySQL_HostGroups_Manager.cpp:4458
#4  0x0000000000559803 in monitor_galera_thread (arg=<optimized out>) at MySQL_Monitor.cpp:1827
#5  0x000000000055b7d7 in ConsumerThread::run (this=0x7f3bee732a70) at MySQL_Monitor.cpp:83
#6  0x00007f3bf2a1bea5 in start_thread (arg=0x7f3bed7e3700) at pthread_create.c:307
#7  0x00007f3bf19038cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
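
As a rough aid for reading the backtrace, the format string from frame #0 can be inspected mechanically. sprintf needs one argument of the matching type for every conversion in the format, and passing too few or the wrong types is undefined behaviour in C. The backtrace does not show which arguments were actually passed, so this sketch only reports what the format string expects:

```python
import re

# Format string copied from frame #0 of the backtrace above.
fmt = (
    "DELETE FROM mysql_servers_incoming WHERE hostname='%s' AND port=%d "
    "AND hostgroup_id in (%d, %d) FROM mysql_galera_hostgroups "
    "WHERE writer_hostgroup=%d)"
)

conversions = re.findall(r"%[sd]", fmt)
open_parens = fmt.count("(")
close_parens = fmt.count(")")

print(conversions)                # ['%s', '%d', '%d', '%d', '%d']
print(open_parens, close_parens)  # 1 2
```

The format expects one string and four integers. Its parentheses are also unbalanced, and the trailing `FROM mysql_galera_hostgroups WHERE writer_hostgroup=%d)` is not valid in a DELETE statement, so it appears to be a leftover from an earlier version of the query.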

#2290 seems to be the most likely change between 2.0.7 and 2.0.8 that might be causing this issue. @acampoh - is there anything obvious you can spot?

Many thanks,
Alex

@acampoh

acampoh commented Nov 13, 2019

Yep, it looks like those are the guilty lines.

I'm so sorry about that. I'll make a patch ASAP.

Thanks for the report.

@acampoh

acampoh commented Nov 13, 2019

Here is the patch: #2394

Sorry for the bug.

@themightystephen

Encountered the same issue here. Thanks for patching.

For now, I've disabled ProxySQL on my cluster until the fix is available in a new release (ProxySQL is more of a nice-to-have in my case). It appeared to be causing occasional "Connection refused" errors, which I assume occurred when connections were attempted between ProxySQL crashing and restarting.

@renecannao
Contributor

Fixed in 2.0.9. Closing.

@themightystephen

Excellent, thank you!
