-
Notifications
You must be signed in to change notification settings - Fork 903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix recursion in cache processing #2259
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2259 +/- ##
==========================================
+ Coverage 90.10% 90.22% +0.12%
==========================================
Files 211 212 +1
Lines 34290 34297 +7
==========================================
+ Hits 30897 30946 +49
+ Misses 3393 3351 -42
Continue to review full report at Codecov.
|
@cevian would be good to have a test case that reproduces the bug and shows that this is issue is fixed. |
@erimatnor In theory I agree, but I haven't been able to come up with a reproducible case. This works most of the time as-is. It's some internal Postgres state that causes this behavior to be triggered. |
@erimatnor I spent a lot more time trying to reproduce and could not. It seems like the loop is within I think this only happens when there are a lot of msgs or something similar. I put my attempts at testing this in the fix_recursion_tests branch in my fork in case people want to get back to it at some point, but it feels like there is a benefit to merging this in the meantime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but agree with @erimatnor that it would be great to have a test case for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something is wrong with the cache code if we have to rely on these hacks to solve issues and I wonder how many more issues we have with the cache code. However, this solves the current problem and I do not see that this approach should cause any issues, so approving.
This seems to be a rare occurrence, so not sure how we can test this.
Leaving some more notes: I suspect this happens in the UNKNOWN state and in background workers. The backend causing the invalidation needs to commit or abort to cause invalidations to be sent to other backends, including bg workers. The bg workers has to also already be in a transaction when the messages are received, otherwise it will just happily stay in the unknown state. Another note: scanning for the namespace in the heap instead of through cache will not help. The backtrace shows that the recursion happens through LockRelationOid, which would be called for a pure-heap scan too. Notice that the backtrace is for a cache miss. |
Previously, cache invalidation could cause recursion in cache processing. PR timescale#1493 fixed this for the cache_invalidate_callback() -> ts_extension_invalidate() call path. But the call path cache_invalidate_callback() -> ts_extension_is_loaded() could still go into recursion. So, this PR moves the recursion-prevention logic into extension_update_state(), which is common to both call paths. Fixes timescale#2200.
This maintenance release contains bugfixes since the 1.7.2 release. We deem it high priority for upgrading. In particular the fixes contained in this maintenance release address bugs in compression, drop_chunks and the background worker scheduler. **Bugfixes** * timescale#2059 Improve infering start and stop arguments from gapfill query * timescale#2067 Support moving compressed chunks * timescale#2068 Apply SET TABLESPACE for compressed chunks * timescale#2090 Fix index creation with IF NOT EXISTS for existing indexes * timescale#2092 Fix delete on tables involving hypertables with compression * timescale#2164 Fix telemetry installed_time format * timescale#2184 Fix background worker scheduler memory consumption * timescale#2222 Fix `negative bitmapset member not allowed` in decompression * timescale#2255 Propagate privileges from hypertables to chunks * timescale#2256 Fix segfault in chunk_append with space partitioning * timescale#2259 Fix recursion in cache processing * timescale#2261 Lock dimension slice tuple when scanning **Thanks** * @akamensky for reporting an issue with drop_chunks and ChunkAppend with space partitioning * @dewetburger430 for reporting an issue with setting tablespace for compressed chunks * @fvannee for reporting an issue with cache invalidation * @nexces for reporting an issue with ChunkAppend on space-partitioned hypertables * @PichetGoulu for reporting an issue with index creation and IF NOT EXISTS * @sezaru for reporting an issue with background worker scheduler memory consumption
This maintenance release contains bugfixes since the 1.7.2 release. We deem it high priority for upgrading. In particular the fixes contained in this maintenance release address bugs in compression, drop_chunks and the background worker scheduler. **Bugfixes** * timescale#2059 Improve infering start and stop arguments from gapfill query * timescale#2067 Support moving compressed chunks * timescale#2068 Apply SET TABLESPACE for compressed chunks * timescale#2090 Fix index creation with IF NOT EXISTS for existing indexes * timescale#2092 Fix delete on tables involving hypertables with compression * timescale#2164 Fix telemetry installed_time format * timescale#2184 Fix background worker scheduler memory consumption * timescale#2222 Fix `negative bitmapset member not allowed` in decompression * timescale#2255 Propagate privileges from hypertables to chunks * timescale#2256 Fix segfault in chunk_append with space partitioning * timescale#2259 Fix recursion in cache processing * timescale#2261 Lock dimension slice tuple when scanning **Thanks** * @akamensky for reporting an issue with drop_chunks and ChunkAppend with space partitioning * @dewetburger430 for reporting an issue with setting tablespace for compressed chunks * @fvannee for reporting an issue with cache invalidation * @nexces for reporting an issue with ChunkAppend on space-partitioned hypertables * @PichetGoulu for reporting an issue with index creation and IF NOT EXISTS * @sezaru for reporting an issue with background worker scheduler memory consumption
This maintenance release contains bugfixes since the 1.7.2 release. We deem it high priority for upgrading. In particular the fixes contained in this maintenance release address issues in compression, drop_chunks and the background worker scheduler. **Bugfixes** * timescale#2059 Improve infering start and stop arguments from gapfill query * timescale#2067 Support moving compressed chunks * timescale#2068 Apply SET TABLESPACE for compressed chunks * timescale#2090 Fix index creation with IF NOT EXISTS for existing indexes * timescale#2092 Fix delete on tables involving hypertables with compression * timescale#2164 Fix telemetry installed_time format * timescale#2184 Fix background worker scheduler memory consumption * timescale#2222 Fix `negative bitmapset member not allowed` in decompression * timescale#2255 Propagate privileges from hypertables to chunks * timescale#2256 Fix segfault in chunk_append with space partitioning * timescale#2259 Fix recursion in cache processing * timescale#2261 Lock dimension slice tuple when scanning **Thanks** * @akamensky for reporting an issue with drop_chunks and ChunkAppend with space partitioning * @dewetburger430 for reporting an issue with setting tablespace for compressed chunks * @fvannee for reporting an issue with cache invalidation * @nexces for reporting an issue with ChunkAppend on space-partitioned hypertables * @PichetGoulu for reporting an issue with index creation and IF NOT EXISTS * @prathamesh-sonpatki for contributing a typo fix * @sezaru for reporting an issue with background worker scheduler memory consumption
This maintenance release contains bugfixes since the 1.7.2 release. We deem it high priority for upgrading. In particular the fixes contained in this maintenance release address issues in compression, drop_chunks and the background worker scheduler. **Bugfixes** * #2059 Improve infering start and stop arguments from gapfill query * #2067 Support moving compressed chunks * #2068 Apply SET TABLESPACE for compressed chunks * #2090 Fix index creation with IF NOT EXISTS for existing indexes * #2092 Fix delete on tables involving hypertables with compression * #2164 Fix telemetry installed_time format * #2184 Fix background worker scheduler memory consumption * #2222 Fix `negative bitmapset member not allowed` in decompression * #2255 Propagate privileges from hypertables to chunks * #2256 Fix segfault in chunk_append with space partitioning * #2259 Fix recursion in cache processing * #2261 Lock dimension slice tuple when scanning **Thanks** * @akamensky for reporting an issue with drop_chunks and ChunkAppend with space partitioning * @dewetburger430 for reporting an issue with setting tablespace for compressed chunks * @fvannee for reporting an issue with cache invalidation * @nexces for reporting an issue with ChunkAppend on space-partitioned hypertables * @PichetGoulu for reporting an issue with index creation and IF NOT EXISTS * @prathamesh-sonpatki for contributing a typo fix * @sezaru for reporting an issue with background worker scheduler memory consumption
This maintenance release contains bugfixes since the 1.7.2 release. We deem it high priority for upgrading. In particular the fixes contained in this maintenance release address issues in compression, drop_chunks and the background worker scheduler. **Bugfixes** * #2059 Improve infering start and stop arguments from gapfill query * #2067 Support moving compressed chunks * #2068 Apply SET TABLESPACE for compressed chunks * #2090 Fix index creation with IF NOT EXISTS for existing indexes * #2092 Fix delete on tables involving hypertables with compression * #2164 Fix telemetry installed_time format * #2184 Fix background worker scheduler memory consumption * #2222 Fix `negative bitmapset member not allowed` in decompression * #2255 Propagate privileges from hypertables to chunks * #2256 Fix segfault in chunk_append with space partitioning * #2259 Fix recursion in cache processing * #2261 Lock dimension slice tuple when scanning **Thanks** * @akamensky for reporting an issue with drop_chunks and ChunkAppend with space partitioning * @dewetburger430 for reporting an issue with setting tablespace for compressed chunks * @fvannee for reporting an issue with cache invalidation * @nexces for reporting an issue with ChunkAppend on space-partitioned hypertables * @PichetGoulu for reporting an issue with index creation and IF NOT EXISTS * @prathamesh-sonpatki for contributing a typo fix * @sezaru for reporting an issue with background worker scheduler memory consumption
Previously, cache invalidation could cause recursion in cache processing.
PR #1493 fixed this for the
call path. But the call path
could still go into recursion.
So, this PR moves the recursion-prevention logic into
extension_update_state(), which is common to both call paths.
Fixes #2200.