Release 2.6.1 #4213

RafiaSabih · 2022-04-05T12:26:39Z

2.6.1 (2022-04-11)
This release is a patch release. We recommend that you upgrade at the next available opportunity.

Bugfixes

#3974 Fix remote EXPLAIN with parameterized queries
#4122 Fix segfault on INSERT into distributed hypertable
#4142 Ignore invalid relid when deleting hypertable
#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
#4161 Fix memory handling during scans
#4186 Fix owner change for distributed hypertable
#4192 Abort sessions after extension reload
#4193 Fix relcache callback handling causing crashes

Thanks

@abrownsword for reporting a crash in the telemetry reporter
@daydayup863 for reporting an issue with remote EXPLAIN

On non-debug builds we might end up with an empty list of tests when generating the schedule. On cmake versions older than 3.14, trying to sort an empty list produces an error, so we check for an empty list here.

Since we now lock down search_path during update/downgrade, there are some additional requirements for writing SQL files.

We cache the Chunk structs in RelOptInfo private data. They are later used to estimate the chunk sizes, check which data nodes they belong to, et cetera. Looking up the chunks is expensive, so this change speeds up the planning.

When inserting into a distributed hypertable using a query on a distributed hypertable, a segfault would occur if all the chunks in the query were pruned.

This PR fixes the ON_END logic for distributed DDL execution by removing an old leftover check that marked those commands as unsupported. Fix: timescale#4106

As part of adding a scan iterator interface on top of the Scanner module (commit 8baaa98), the internal scanner state, which was previously private, was made public. Now that it is public, it makes more sense to make it part of the standard user-facing `ScannerCtx` struct, which also simplifies the code elsewhere.

Make the Scanner module more flexible by allowing optional control over when the scanned relation is opened and closed. Relations can then remain open over multiple scans, which can improve performance and efficiency. Closes timescale#2173

Chunk scan performance during querying is improved by avoiding repeated open and close of relations and indexes when joining chunk information from different metadata tables.

When executing a query on a hypertable, it is expanded to include all its child chunks. However, during the expansion, the chunks that don't match the query constraints should also be excluded. The following changes are made to make the scanning and exclusion more efficient:

* Ensure metadata relations and indexes are only opened once even though metadata for multiple chunks is scanned. This avoids doing repeated open and close of tables and indexes for each chunk scanned.
* Avoid interleaving scans of different relations, ensuring better data locality, and having, e.g., indexes warm in cache.
* Avoid unnecessary scans that repeat work already done.
* Ensure chunks are locked in a consistent order (based on Oid).

To enable the above changes, some refactoring was necessary. The chunk scans that happen during constraint exclusion are moved into separate source files (`chunk_scan.c`) for better structure and readability.

Some test outputs are affected due to the new ordering of chunks in append relations.
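The consistent lock order mentioned above can be illustrated with a small toy model. This is only a sketch in Python, not the code in `chunk_scan.c`: the `Chunk` class, `lock_chunks`, and `unlock_chunks` are hypothetical names, and a plain integer stands in for an Oid. One common reason for a fixed lock order is that two sessions acquiring the same locks in the same order cannot deadlock on each other.

```python
import threading


class Chunk:
    """Hypothetical stand-in for a chunk: an Oid-like integer plus a lock."""

    def __init__(self, oid):
        self.oid = oid
        self.lock = threading.Lock()


def lock_chunks(chunks):
    """Acquire every chunk lock in ascending Oid order and return that order."""
    ordered = sorted(chunks, key=lambda c: c.oid)
    for chunk in ordered:
        chunk.lock.acquire()
    return ordered


def unlock_chunks(ordered):
    """Release the locks in reverse acquisition order."""
    for chunk in reversed(ordered):
        chunk.lock.release()


# The input order is arbitrary; the acquisition order is not.
chunks = [Chunk(30), Chunk(10), Chunk(20)]
held = lock_chunks(chunks)
lock_order = [c.oid for c in held]
unlock_chunks(held)
```

Sorting by Oid also explains why the order of chunks in append relations can change in test outputs.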

Scan functions cannot be called in a per-tuple memory context, as they might allocate data that needs to live until the end of the scan. Fix this in a couple of places to ensure correct memory handling. Fixes timescale#4148, timescale#4145

PostgreSQL scan functions might allocate memory that needs to live for the duration of the scan. This also applies to functions that are called during the scan, such as getting the next tuple. To avoid situations where such functions are accidentally called in, e.g., a short-lived per-tuple context, add an explicit scan memory context to the Scanner interface that wraps the PostgreSQL scan API.
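The lifetime problem can be modeled outside PostgreSQL. The following Python sketch is a toy, not the Scanner API: `MemoryContext`, `Scanner`, and `is_valid` are hypothetical names, and a generation counter stands in for freeing memory on a context reset. It shows why scan state belongs in a scan-lifetime context while per-tuple data can be discarded on every iteration.

```python
class MemoryContext:
    """Toy arena: everything allocated here becomes invalid on reset()."""

    def __init__(self, name):
        self.name = name
        self.generation = 0

    def alloc(self, value):
        return {"value": value, "ctx": self, "generation": self.generation}

    def reset(self):
        self.generation += 1


def is_valid(obj):
    """True while the owning context has not been reset since allocation."""
    return obj["generation"] == obj["ctx"].generation


class Scanner:
    """Hypothetical scanner that owns an explicit scan-lifetime context."""

    def __init__(self, rows):
        self.rows = rows
        self.scan_ctx = MemoryContext("scan")
        self.tuple_ctx = MemoryContext("per-tuple")
        # Scan state must survive until the scan ends, so it lives in scan_ctx.
        self.state = self.scan_ctx.alloc({"pos": 0})

    def next_tuple(self):
        # Per-tuple data is thrown away before each new tuple is produced.
        self.tuple_ctx.reset()
        assert is_valid(self.state), "scan state was freed mid-scan"
        pos = self.state["value"]["pos"]
        if pos >= len(self.rows):
            return None
        self.state["value"]["pos"] = pos + 1
        return self.tuple_ctx.alloc(self.rows[pos])


scanner = Scanner(["a", "b", "c"])
seen = []
while True:
    tup = scanner.next_tuple()
    if tup is None:
        break
    seen.append(tup["value"])
```

Had `self.state` been allocated in `tuple_ctx`, the first reset would have invalidated it and the assertion in `next_tuple` would fail.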

When running `performDeletion` it is necessary to have a valid relation id, but a lookup using `ts_hypertable_get_by_id` might return a hypertable entry pointing to a table that does not exist because it has been deleted previously. In this case, only the catalog entry should be removed, and it is not necessary to delete the actual table. This scenario can occur if both the hypertable and a compressed table are deleted as part of running a `sql_drop` event, for example, if a compressed hypertable is defined inside an extension. In this case, the compressed hypertable (indeed all tables) will be deleted first, and the lookup of the compressed hypertable will find it in the metadata, but a lookup of the actual table will fail since the table does not exist. Fixes timescale#4140
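The decision described here is small enough to sketch. The Python below is a toy model under stated assumptions: `catalog` and `tables` are plain dictionaries standing in for the metadata and the actual relations, and `drop_hypertable` is a hypothetical helper, not a TimescaleDB function. Only the value 0 for an invalid Oid mirrors PostgreSQL's `InvalidOid`.

```python
INVALID_OID = 0  # same value as PostgreSQL's InvalidOid


def drop_hypertable(catalog, tables, hypertable_id):
    """Always remove the catalog entry; drop the table only if it still exists."""
    entry = catalog.pop(hypertable_id)
    relid = entry["relid"] if entry["relid"] in tables else INVALID_OID
    if relid != INVALID_OID:
        del tables[relid]
        return "dropped table and catalog entry"
    # The table is already gone, so there is nothing to delete but metadata.
    return "dropped catalog entry only"


# Hypertable 2 still has a catalog entry, but its table (1002) is already gone.
catalog = {1: {"relid": 1001}, 2: {"relid": 1002}}
tables = {1001: "hypertable"}

result_missing = drop_hypertable(catalog, tables, 2)
result_present = drop_hypertable(catalog, tables, 1)
```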

In certain multi-node queries, we end up using a parameterized query on the data nodes. If "timescaledb.enable_remote_explain" is enabled, we run an EXPLAIN on the data node with the remote query. EXPLAIN doesn't work with parameterized queries, so we check for that case and avoid invoking a remote EXPLAIN if so. Fixes timescale#3974. Reported and test case provided by @daydayup863.
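As a rough illustration of the guard, the Python sketch below decides whether a remote EXPLAIN should be attempted. It is an assumption-laden toy: `should_run_remote_explain` is a hypothetical function, and it detects parameters by searching the SQL text for `$1`-style placeholders, which is a simplification (it ignores string literals and dollar-quoted bodies) and is not necessarily how the actual fix detects them.

```python
import re

# Matches positional parameters such as $1 or $12. Text matching is a
# simplification chosen for this sketch only.
PARAM_RE = re.compile(r"\$\d+")


def should_run_remote_explain(remote_sql, enable_remote_explain):
    """Return True only if the setting is on and the query has no parameters."""
    if not enable_remote_explain:
        return False
    return PARAM_RE.search(remote_sql) is None


parameterized = should_run_remote_explain("SELECT * FROM t WHERE id = $1", True)
plain = should_run_remote_explain("SELECT * FROM t WHERE id = 5", True)
disabled = should_run_remote_explain("SELECT * FROM t WHERE id = 5", False)
```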

Functions `elog` and `ereport` are unsafe to use in signal handlers since they call `malloc`. This commit removes them from signal handlers. Fixes timescale#4200

If a session is started and loads (and caches, by OID) functions in the extension to use them in, for example, a `SELECT` query on a continuous aggregate, the extension will be marked as loaded internally.

If an `ALTER EXTENSION` is then executed in a separate session, it will update `pg_extension` to hold the new version, and any other sessions will see this as the new version, including the session that already loaded the previous version of the shared library.

Since the pre-update session has already loaded some functions from the old version, running the same queries with the old named functions will trigger a reload of the new version of the shared library to get the new functions (same name, but different OID). Because the library has already been loaded in a different version, this triggers an error that GUC variables are re-defined. Further queries after that will then corrupt the database, causing a crash.

This commit fixes this by recording the version loaded, rather than whether it has been loaded, and checking that the version did not change after a query has been analyzed (in the `post_analyze_hook`). If the version changed, it will generate a fatal error to force an abort of the session. Fixes timescale#4191
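The change from a loaded flag to a recorded version can be sketched as follows. This Python model is illustrative only: `Session`, `FatalError`, `load_extension`, and `post_analyze_check` are hypothetical names, and the version strings are passed in by hand rather than read from `pg_extension`.

```python
class FatalError(Exception):
    """Stands in for a FATAL error that aborts the session."""


class Session:
    def __init__(self):
        # Record *which* version was loaded, not merely that one was loaded.
        self.loaded_version = None

    def load_extension(self, catalog_version):
        if self.loaded_version is None:
            self.loaded_version = catalog_version

    def post_analyze_check(self, catalog_version):
        """Called after query analysis; abort if the version changed."""
        if self.loaded_version is not None and self.loaded_version != catalog_version:
            raise FatalError(
                "extension changed from %s to %s, session must restart"
                % (self.loaded_version, catalog_version)
            )


session = Session()
session.load_extension("2.6.0")
session.post_analyze_check("2.6.0")  # same version: no error

# Another session runs ALTER EXTENSION, so the catalog now reports 2.6.1.
try:
    session.post_analyze_check("2.6.1")
    aborted = False
except FatalError:
    aborted = True
```

A boolean flag could not distinguish the second call from the first, which is why the version itself is stored.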

Fix a crash that could corrupt indexes when running VACUUM FULL pg_class. The crash happens when caches are queried/updated within a cache invalidation function, which can lead to corruption and recursive cache invalidations. To solve the issue, make sure the relcache invalidation callback is simple and never invokes the relcache or syscache directly or indirectly.

Some background: the extension is preloaded and thus has planner hooks installed irrespective of whether the extension is actually installed or not in the current database. However, the hooks need to be disabled as long as the extension is not installed. To avoid always having to dynamically check for the presence of the extension, the state is cached in the session. However, the cached state needs to be updated if the extension changes (altered/dropped/created). Therefore, the relcache invalidation callback mechanism is (ab)used in TimescaleDB to signal updates to the extension state across all active backends. The signaling is implemented by installing a dummy table as part of the extension, and any invalidation on the relid for that table signals a change in the extension state.

However, as of this change, the actual state is no longer determined in the callback itself, since that requires use of the relcache and causes the bad behavior. Therefore, the only thing that remains in the callback after this change is to reset the extension state. The actual state is instead resolved on demand, but can still be cached when the extension is in the installed state and the dummy table is present with a known relid. However, if the extension is not installed, the extension state can no longer be cached, as there is no way to signal other backends that the state should be reset when they don't know the dummy table's relid and cannot resolve it from within the callback itself. Fixes timescale#3924
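The split between a trivial callback and on-demand resolution can be modeled in a few lines. The Python below is a sketch, not the code in `src/extension.c`: `ExtensionState`, `invalidation_callback`, `current`, and the `catalog_lookup` function are all hypothetical, and a counter stands in for the cost of a catalog lookup.

```python
UNKNOWN, NOT_INSTALLED, INSTALLED = "unknown", "not_installed", "installed"


class ExtensionState:
    def __init__(self, catalog_lookup):
        # catalog_lookup is assumed to return the dummy table's relid or None.
        self._lookup = catalog_lookup
        self.state = UNKNOWN
        self.proxy_relid = None

    def invalidation_callback(self, relid):
        # Keep the callback trivial: reset only, no catalog or cache access.
        if relid == self.proxy_relid:
            self.state = UNKNOWN

    def current(self):
        """Resolve the state on demand, outside the callback."""
        if self.state == INSTALLED:
            return self.state  # cached: the relid is known, so resets reach us
        relid = self._lookup()
        if relid is None:
            # Not cached: without a relid nothing could signal a later change.
            self.proxy_relid = None
            return NOT_INSTALLED
        self.proxy_relid = relid
        self.state = INSTALLED
        return self.state


database = {"relid": 42}
lookups = []


def catalog_lookup():
    lookups.append(1)
    return database["relid"]


ext = ExtensionState(catalog_lookup)
first = ext.current()
second = ext.current()          # served from the cached state
lookups_while_installed = len(lookups)

ext.invalidation_callback(7)    # unrelated relation: state is kept
state_after_unrelated = ext.state
ext.invalidation_callback(42)   # dummy table invalidated: state is reset
database["relid"] = None        # the extension was dropped
after_drop = ext.current()
ext.current()                   # not cached, so this looks up again
```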

Add a TAP test that checks that the extension state is updated across concurrent sessions/backends when the extension is "dropped" or "created".

Stop throwing an exception with the message "column of relation already exists" when running the command ALTER TABLE ... ADD COLUMN IF NOT EXISTS ... on compressed hypertables. Fix timescale#4087

Allow the ALTER TABLE OWNER TO command to be used with distributed hypertables. Fix timescale#4180

The function `tsl_finalize_agg_ffunc` modified the aggregation state by setting `trans_value` to the final result when computing the final value. Since the state can be re-used several times, there could be several calls to the finalization function, and the finalization function would be confused when passed a final value instead of an aggregation state transition value. This commit fixes this by not modifying `trans_value` when computing the final value and instead just returning it (or the original `trans_value` if there is no finalization function). Fixes timescale#3248
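The property being restored is that finalization is a read-only operation on the state. The Python sketch below shows this with an average aggregate. It is a toy: `AggState`, `transition`, `finalize`, and `average` are hypothetical names, and only the field name `trans_value` is taken from the description above.

```python
class AggState:
    def __init__(self):
        self.trans_value = (0, 0)  # (running sum, row count)


def transition(state, x):
    """Fold one input value into the transition state."""
    total, count = state.trans_value
    state.trans_value = (total + x, count + 1)


def finalize(state, final_fn=None):
    """Compute the result without modifying state.trans_value."""
    if final_fn is None:
        return state.trans_value
    return final_fn(state.trans_value)


def average(trans_value):
    total, count = trans_value
    return total / count if count else None


state = AggState()
for x in (2, 4, 6):
    transition(state, x)

first = finalize(state, average)
second = finalize(state, average)  # safe: the state was not overwritten
```

If `finalize` had stored its result back into `trans_value`, the second call would pass a float to `average`, which expects a `(sum, count)` pair.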

codecov · 2022-04-08T07:31:32Z

Codecov Report

Merging #4213 (fc6d8a4) into 2.6.x (23962c8) will increase coverage by 0.01%.
The diff coverage is 97.74%.

❗ Current head fc6d8a4 differs from pull request most recent head 943da93. Consider uploading reports for the commit 943da93 to get more accurate results

@@            Coverage Diff             @@
##            2.6.x    #4213      +/-   ##
==========================================
+ Coverage   90.67%   90.68%   +0.01%     
==========================================
  Files         215      216       +1     
  Lines       39060    39312     +252     
==========================================
+ Hits        35418    35652     +234     
- Misses       3642     3660      +18

Impacted Files	Coverage Δ
src/chunk.h	`100.00% <ø> (ø)`
src/planner.c	`94.95% <ø> (-0.02%)`	⬇️
src/planner.h	`100.00% <ø> (ø)`
src/ts_catalog/catalog.h	`100.00% <ø> (ø)`
tsl/src/continuous_aggs/invalidation.c	`96.10% <ø> (ø)`
tsl/src/continuous_aggs/materialize.c	`71.42% <0.00%> (ø)`
tsl/src/fdw/data_node_scan_plan.c	`98.21% <ø> (ø)`
src/extension.c	`86.27% <90.00%> (-4.02%)`	⬇️
src/loader/loader.c	`92.94% <94.11%> (+0.05%)`	⬆️
src/plan_expand_hypertable.c	`93.94% <95.65%> (-0.11%)`	⬇️
... and 36 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 23962c8...943da93.

Add concurrent_query_and_drop_chunks to ignore-list and fix C compiler warning.

This release is a patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4176 Fix remote EXPLAIN with parameterized queries
* timescale#4181 Fix spelling errors and omissions
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes
* timescale#4199 Remove signal-unsafe calls from signal handlers
* timescale#4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain

RafiaSabih requested a review from a team as a code owner April 5, 2022 12:26

RafiaSabih requested review from josesahad and akuzm and removed request for a team April 5, 2022 12:26

RafiaSabih force-pushed the release-2.6.1 branch 7 times, most recently from aeca377 to a081921 Compare April 7, 2022 13:32

mkindahl changed the base branch from main to 2.6.x April 7, 2022 16:00

svenklemm and others added 18 commits April 8, 2022 08:56

Fix non-debug build with older cmake

37775fe

Post release 2.6.0

2e0dafa

Document requirements for statements in sql files

786e008

Cache chunk data when performing chunk exclusion

8cd565b

Fix segfault on INSERT in distributed hypertables

b2d1e67

Fix RENAME TO/SET SCHEMA on distributed hypertable

d710e52

Fix memory handling during scans

e52cd11

Remove signal-unsafe calls from signal handlers

29d0e65

Update comments to Postgresql standard style

a40080a

Fix spelling errors and omissions

d4f68ca

erimatnor and others added 4 commits April 8, 2022 09:20

Add TAP tests for extension state

e8e6dd0

Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable

85502a2

Fix owner change for distributed hypertable

e63fac4

RafiaSabih force-pushed the release-2.6.1 branch from a081921 to 7dead4f Compare April 8, 2022 07:22

mkindahl force-pushed the release-2.6.1 branch from 7dead4f to fc6d8a4 Compare April 8, 2022 11:52

horzsolt removed the request for review from josesahad April 8, 2022 11:54

horzsolt assigned mkindahl and RafiaSabih and unassigned mkindahl Apr 8, 2022

horzsolt requested a review from mkindahl April 8, 2022 11:55

RafiaSabih force-pushed the release-2.6.1 branch 3 times, most recently from c448377 to 6446d91 Compare April 11, 2022 09:51

Markos Fountoulakis and others added 4 commits April 11, 2022 17:16

Fix regressions found in nightly CI

97df916

Fix memory handling during scans

e15438a

svenklemm force-pushed the release-2.6.1 branch from 6446d91 to 943da93 Compare April 11, 2022 15:29

svenklemm approved these changes Apr 11, 2022

svenklemm merged commit 9ae47c6 into timescale:2.6.x Apr 11, 2022
