Fix relcache callback handling causing crashes #4193
Codecov Report
@@ Coverage Diff @@
## main #4193 +/- ##
==========================================
- Coverage 90.77% 90.75% -0.02%
==========================================
Files 215 215
Lines 39384 39464 +80
==========================================
+ Hits 35750 35815 +65
- Misses 3634 3649 +15
I tested manually these two things:
A few comments, but I need to continue checking.
Approving this since the last comment is not critical.
Will look into this PR more seriously as well.
We should probably have a disclaimer in comments above every function that is potentially called in syscache callback context. Something in the spirit of:
Other than the above LGTM.
Add a TAP test that checks that the extensions state is updated across concurrent sessions/backends when the extension is "dropped" or "created".
@@ -425,8 +425,7 @@ fdw_relinfo_create(PlannerInfo *root, RelOptInfo *rel, Oid server_oid, Oid local
     */
    fpinfo->fdw_startup_cost = DEFAULT_FDW_STARTUP_COST;
    fpinfo->fdw_tuple_cost = DEFAULT_FDW_TUPLE_COST;
-   Assert(ts_extension_oid != InvalidOid);
-   fpinfo->shippable_extensions = list_make1_oid(ts_extension_oid);
+   fpinfo->shippable_extensions = list_make1_oid(ts_extension_get_oid());
Maybe add this here:
Assert(OidIsValid(fpinfo->shippable_extensions));
I removed the assertion since the function ts_extension_get_oid() cannot return an invalid OID.
endforeach(P_FILE)

set(MODULE_PATHNAME "$libdir/timescaledb-${PROJECT_VERSION_MOD}")
configure_file(functions.sql.in functions.sql)
All .sql files we have that use @MODULE_PATHNAME@ don't have an .in extension. We're not very consistent. Also, I think this is the first test script that actually uses @MODULE_PATHNAME@.
Some test sql files use :MODULE_PATHNAME, because it is set at runtime via psql variables. The solution I use here is different and sets the @MODULE_PATHNAME@ to replace it at configuration (cmake) time. This is also why I added the .in extension. I think both are valid approaches, but the runtime-option isn't easily available for TAP tests.
Also, this sql file isn't strictly a test, but rather a library file. But not sure that matters.
If we are inconsistent elsewhere, we can fix that separately.
This release is a patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
This release is a patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4176 Fix remote EXPLAIN with parameterized queries
* timescale#4181 Fix spelling errors and omissions
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes
* timescale#4199 Remove signal-unsafe calls from signal handlers
* timescale#4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
Fix a crash that could corrupt indexes when running VACUUM FULL pg_class.
The crash happens when caches are queried/updated within a cache
invalidation function, which can lead to corruption and recursive
cache invalidations.
To solve the issue, make sure the relcache invalidation callback is
simple and never invokes the relcache or syscache directly or
indirectly.
Some background: The extension is preloaded and thus has planner
hooks installed irrespective of whether the extension is actually
installed or not in the current database. However, the hooks need to
be disabled as long as the extension is not installed. To avoid always
having to dynamically check for the presence of the extension, the
state is cached in the session.
However, the cached state needs to be updated if the extension changes
(altered/dropped/created). Therefore, the relcache invalidation
callback mechanism is (ab)used in TimescaleDB to signal updates to the
extension state across all active backends.
The signaling is implemented by installing a dummy table as part of
the extension and any invalidation on the relid for that table signals
a change in the extension state. However, as of this change, the
actual state is no longer determined in the callback itself, since it
requires use of the relcache and causes the bad behavior. Therefore,
the only thing that remains in the callback after this change is to
reset the extension state.
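
As a rough illustration of a callback that does nothing but reset state, here is a minimal sketch in plain C. It is a toy model and not the code from this PR: `Oid` and `InvalidOid` are stand-ins for the PostgreSQL definitions, and `ExtState`, `dummy_relid`, `ext_state_set_installed`, `ext_state_is_cached` and `ext_relcache_callback` are hypothetical names. A real relcache callback also receives a `Datum` argument, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;            /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)

/* Hypothetical session-local cache of the extension state. */
typedef enum
{
    EXT_STATE_UNKNOWN,               /* must be resolved on demand */
    EXT_STATE_INSTALLED              /* dummy table exists, relid known */
} ExtState;

static ExtState ext_state = EXT_STATE_UNKNOWN;
static Oid dummy_relid = InvalidOid;

/* Record a state that was resolved outside of any callback. */
void
ext_state_set_installed(Oid relid)
{
    assert(relid != InvalidOid);
    ext_state = EXT_STATE_INSTALLED;
    dummy_relid = relid;
}

bool
ext_state_is_cached(void)
{
    return ext_state == EXT_STATE_INSTALLED;
}

/*
 * Invalidation callback. It only compares and assigns local variables;
 * it performs no catalog or cache lookups, so it cannot re-enter the
 * caches that are being invalidated.
 */
void
ext_relcache_callback(Oid relid)
{
    /* InvalidOid stands for "all relations were invalidated". */
    if (relid == InvalidOid || relid == dummy_relid)
    {
        ext_state = EXT_STATE_UNKNOWN;
        dummy_relid = InvalidOid;
    }
}
```

In this model an invalidation for an unrelated relid leaves the cached state alone, while an invalidation for the dummy table (or a full reset) drops it, and the next caller has to resolve the state again.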
The actual state is instead resolved on-demand, but can still be
cached when the extension is in the installed state and the dummy
table is present with a known relid. However, if the extension is not
installed, the extension state can no longer be cached as there is no
way to signal other backends that the state should be reset when they
don't know the dummy table's relid, and cannot resolve it from within
the callback itself.
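
The asymmetry between the installed and not-installed cases can be sketched as follows. This is again a toy model under stated assumptions, not the actual implementation: `catalog_dummy_relid` simulates the catalog, and `lookup_dummy_relid`, `extension_is_installed`, `reset_cached_state` and `catalog_lookups` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;            /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)

/* Toy "catalog": relid of the dummy table, InvalidOid if not installed. */
static Oid catalog_dummy_relid = InvalidOid;
static int catalog_lookups = 0;      /* counts simulated catalog lookups */

/* Hypothetical stand-in for a real catalog lookup. */
static Oid
lookup_dummy_relid(void)
{
    catalog_lookups++;
    return catalog_dummy_relid;
}

/* Session-local cache; InvalidOid means "nothing cached". */
static Oid cached_dummy_relid = InvalidOid;

/* Resolve the state on demand, outside of any invalidation callback. */
bool
extension_is_installed(void)
{
    if (cached_dummy_relid != InvalidOid)
        return true;                 /* cached "installed" state */

    /*
     * Only a positive result is cached. A cached negative result could
     * go stale, because without a known relid no invalidation would
     * match in the callback.
     */
    cached_dummy_relid = lookup_dummy_relid();
    return cached_dummy_relid != InvalidOid;
}

/* What the invalidation callback is reduced to: drop the cached relid. */
void
reset_cached_state(Oid relid)
{
    if (relid == InvalidOid || relid == cached_dummy_relid)
        cached_dummy_relid = InvalidOid;
}
```

The trade-off visible in this sketch is that a session without the extension pays for a lookup on every check, whereas a session with the extension installed pays once until the next invalidation.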
Fixes #3924
Disable-Check: Commit-Count