fatal error: concurrent map writes #1962
Can you share more details about the OPA configuration and logs leading up to the panic? I only see the stack trace log.
This happens during everyday operation, but I think it relates to the Status API with multiple bundles.
Previously it would pass a reference to the status map on the plugin, which is potentially dangerous because the map can be changed (this happens explicitly on plugin reconfigure). This now gives each bulk listener its own copy of the map. Fixes: open-policy-agent#1962 Signed-off-by: Patrick East <east.patrick@gmail.com>
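The fix described above hands each listener a copy of the status map instead of a shared reference. A minimal sketch of that copy-before-handoff idea (the `Status` type and `snapshot` helper here are hypothetical stand-ins, not OPA's actual code):

```go
package main

import "fmt"

// Status is a hypothetical stand-in for OPA's per-bundle status value.
type Status struct{ Message string }

// snapshot returns a shallow copy of the status map, so each listener
// receives its own map and a later plugin reconfigure cannot mutate
// the map a listener is still reading.
func snapshot(src map[string]*Status) map[string]*Status {
	dst := make(map[string]*Status, len(src))
	for name, s := range src {
		dst[name] = s
	}
	return dst
}

func main() {
	live := map[string]*Status{"bundle-a": {Message: "ok"}}
	forListener := snapshot(live)
	delete(live, "bundle-a") // simulates a reconfigure dropping a bundle
	fmt.Println(len(forListener)) // the listener's copy is unaffected: 1
}
```

Note this is a shallow copy: the map structure is no longer shared, which is enough to avoid concurrent writes to the map itself.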
#1991 makes some speculative changes (it fixes at least one issue). @yuvalamar, any chance you can try that change out? I haven't been able to repro the issue in my environments.
Re-opening until we get some verification that #1991 fixes it.
Was able to reproduce this by using ~1000 bundles (each with its own unique service) and updating the discovery config to remove bundles (dropping it to ~980). The trick appeared to be having enough bundles, updated at a high enough frequency, that a status update message was constantly being sent. This pretty much guaranteed that the reconfiguration triggered by a discovery update would modify the status map while the status plugin was trying to read it. With a build including the changes from #1991 it no longer reproduced.
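The access pattern in that reproduction can be sketched as below (names are hypothetical, not OPA's actual code): many goroutines updating bundle statuses while a reader inspects the same map. Here an `RWMutex` makes the sketch run cleanly; removing the locks makes the program a candidate for Go's `fatal error: concurrent map writes` under enough concurrency.

```go
package main

import (
	"fmt"
	"sync"
)

// statusStore guards a bundle-status map so that writers (frequent
// bundle updates / reconfigure) cannot race with readers.
type statusStore struct {
	mu       sync.RWMutex
	statuses map[string]string
}

func (s *statusStore) set(name, msg string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.statuses[name] = msg
}

func (s *statusStore) count() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.statuses)
}

func main() {
	store := &statusStore{statuses: map[string]string{}}
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ { // ~1000 bundles, as in the reproduction
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			store.set(fmt.Sprintf("bundle-%d", n), "updated")
		}(i)
	}
	wg.Wait()
	fmt.Println(store.count()) // all 1000 writes land safely: 1000
}
```

Running an unguarded variant under `go run -race` is a reliable way to surface this class of bug before it panics in production.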
Expected Behavior
OPA should not panic and exit.
Actual Behavior
OPA panics with `fatal error: concurrent map writes` and exits.
Steps to Reproduce the Problem
I think the bug is in:

```go
func (m *Manager) Client(name string) rest.Client { return m.services[name] }
```
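If that unguarded lookup can run while reconfiguration replaces entries, one conventional remedy is to guard the map with a lock. A hypothetical illustration (a string URL stands in for `rest.Client`, and this is not OPA's actual fix, which copied the status map instead):

```go
package main

import (
	"fmt"
	"sync"
)

// Manager is a sketch of a lock-guarded service registry: Client holds a
// read lock while looking up a service, and Reconfigure holds the write
// lock while swapping the map, so the two can never race.
type Manager struct {
	mu       sync.RWMutex
	services map[string]string
}

func (m *Manager) Client(name string) string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return m.services[name]
}

func (m *Manager) Reconfigure(services map[string]string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.services = services
}

func main() {
	m := &Manager{services: map[string]string{}}
	m.Reconfigure(map[string]string{"acmecorp": "https://example.com"})
	fmt.Println(m.Client("acmecorp")) // prints https://example.com
}
```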
I attached the full log with the panic stack. We had 19 OPA restarts in production in the last 24 hours.
panic.log
Additional Info
OPA version 15.0