Release 2.6.1 #4213

Merged
merged 26 commits into from
Apr 11, 2022

RafiaSabih

2.6.1 (2022-04-11)
This release is a patch release. We recommend that you upgrade at the next available opportunity.

Bugfixes

#3974 Fix remote EXPLAIN with parameterized queries
#4122 Fix segfault on INSERT into distributed hypertable
#4142 Ignore invalid relid when deleting hypertable
#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
#4161 Fix memory handling during scans
#4186 Fix owner change for distributed hypertable
#4192 Abort sessions after extension reload
#4193 Fix relcache callback handling causing crashes
Thanks

@abrownsword for reporting a crash in the telemetry reporter
@daydayup863 for reporting issue with remote explain

@RafiaSabih RafiaSabih requested a review from a team as a code owner April 5, 2022 12:26
@RafiaSabih RafiaSabih requested review from josesahad and akuzm and removed request for a team April 5, 2022 12:26
@RafiaSabih RafiaSabih force-pushed the release-2.6.1 branch 7 times, most recently from aeca377 to a081921 Compare April 7, 2022 13:32
@mkindahl mkindahl changed the base branch from main to 2.6.x April 7, 2022 16:00
svenklemm and others added 18 commits April 8, 2022 08:56
On non-debug builds we might end up with an empty list of tests
when generating the schedule. On older cmake versions < 3.14
trying to sort an empty list will produce an error, so
we check for empty list here.
Since we now lock down search_path during update/downgrade there
are some additional requirements for writing sql files.
We cache the Chunk structs in RelOptInfo private data. They are later
used to estimate the chunk sizes, check which data nodes they belong
to, et cetera. Looking up the chunks is expensive, so this change
speeds up the planning.
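
A minimal sketch of the caching idea, written in Python rather than the extension's C. The names `PlannerRel`, `lookup_chunk`, and `get_chunk` are hypothetical and do not correspond to TimescaleDB functions; the point is only that the expensive lookup runs once and later planning steps reuse the cached struct.

```python
from dataclasses import dataclass, field

LOOKUPS = []  # records every expensive lookup, for illustration only


@dataclass
class Chunk:
    chunk_id: int
    data_nodes: tuple


def lookup_chunk(chunk_id):
    """Stand-in for an expensive catalog lookup (hypothetical)."""
    LOOKUPS.append(chunk_id)
    return Chunk(chunk_id=chunk_id, data_nodes=("dn1", "dn2"))


@dataclass
class PlannerRel:
    """Stand-in for per-relation planner info with a private data slot."""
    chunk_id: int
    private: dict = field(default_factory=dict)


def get_chunk(rel):
    # Look the chunk up only on first use; afterwards reuse the cached struct.
    if "chunk" not in rel.private:
        rel.private["chunk"] = lookup_chunk(rel.chunk_id)
    return rel.private["chunk"]


rel = PlannerRel(chunk_id=7)
first = get_chunk(rel)   # performs the lookup
second = get_chunk(rel)  # served from the cache
```

Size estimation and data node checks would both call `get_chunk` in this sketch, so neither repeats the lookup.
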
When inserting into a distributed hypertable using a query on a
distributed hypertable, a segfault would occur if all the chunks
in the query were pruned.
This PR fixes the ON_END logic for distributed DDL execution
by removing an old leftover check that marked those commands
as unsupported.

Fix: timescale#4106
As part of adding a scan iterator interface on top of the Scanner
module (commit 8baaa98), the previously private internal scanner state
was made public. Now that it is public, it makes more sense to make it
part of the standard user-facing `ScannerCtx` struct, which also
simplifies the code elsewhere.
Make the Scanner module more flexible by allowing optional control
over when the scanned relation is opened and closed. Relations can
then remain open over multiple scans, which can improve performance
and efficiency.

Closes timescale#2173
Chunk scan performance during querying is improved by avoiding
repeated open and close of relations and indexes when joining chunk
information from different metadata tables.

When executing a query on a hypertable, it is expanded to include all
its children chunks. However, during the expansion, the chunks that
don't match the query constraints should also be excluded. The
following changes are made to make the scanning and exclusion more
efficient:

* Ensure metadata relations and indexes are only opened once even
  though metadata for multiple chunks is scanned. This avoids doing
  repeated open and close of tables and indexes for each chunk
  scanned.
* Avoid interleaving scans of different relations, ensuring better
  data locality, and having, e.g., indexes warm in cache.
* Avoid unnecessary scans that repeat work already done.
* Ensure chunks are locked in a consistent order (based on Oid).
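
The last point, consistent lock ordering, can be sketched as follows. This is an illustrative Python model, not the extension's C code: `ChunkStub`, `lock_in_oid_order`, and `unlock_all` are hypothetical names, and `threading.Lock` merely stands in for a relation lock.

```python
import threading


class ChunkStub:
    """Hypothetical stand-in for a chunk: an object id plus a lock."""

    def __init__(self, oid):
        self.oid = oid
        self.lock = threading.Lock()


def lock_in_oid_order(chunks):
    # Sorting by Oid gives every caller the same acquisition order, so two
    # sessions locking overlapping sets cannot wait on each other in a cycle.
    ordered = sorted(chunks, key=lambda c: c.oid)
    for chunk in ordered:
        chunk.lock.acquire()
    return ordered


def unlock_all(ordered):
    for chunk in reversed(ordered):
        chunk.lock.release()


a, b, c = ChunkStub(30), ChunkStub(10), ChunkStub(20)
held = lock_in_oid_order([a, b, c])
order = [chunk.oid for chunk in held]
unlock_all(held)
```

A side effect of sorting, as the commit message notes for the real change, is that the order of chunks seen downstream changes too.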

To enable the above changes, some refactoring was necessary. The chunk
scans that happen during constraint exclusion are moved into separate
source files (`chunk_scan.c`) for better structure and readability.

Some test outputs are affected due to the new ordering of chunks in
append relations.
Scan functions cannot be called on a per-tuple memory context as they
might allocate data that need to live until the end of the scan. Fix
this in a couple of places to ensure correct memory handling.

Fixes timescale#4148, timescale#4145
PostgreSQL scan functions might allocate memory that needs to live for
the duration of the scan. This applies also to functions that are
called during the scan, such as getting the next tuple. To avoid
situations when such functions are accidentally called on, e.g., a
short-lived per-tuple context, add an explicit scan memory context to
the Scanner interface that wraps the PostgreSQL scan API.
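
To illustrate why a dedicated scan context helps, here is a toy Python model of memory contexts. It is a sketch under assumptions: `MemoryContext` and `Scanner` below are simplified stand-ins, not the PostgreSQL or TimescaleDB APIs, and "freeing" is modeled by flipping a `valid` flag.

```python
class MemoryContext:
    """Toy arena: objects allocated here become invalid when it is reset."""

    def __init__(self, name):
        self.name = name
        self.live = []

    def alloc(self, value):
        cell = {"value": value, "valid": True}
        self.live.append(cell)
        return cell

    def reset(self):
        for cell in self.live:
            cell["valid"] = False
        self.live.clear()


class Scanner:
    """Hypothetical scanner that owns a context living as long as the scan."""

    def __init__(self, rows):
        self.rows = rows
        self.scan_mctx = MemoryContext("scan")
        self.state = None

    def start(self):
        # Scan state is allocated in the scan context, never in whatever
        # short-lived context the caller happens to be running in.
        self.state = self.scan_mctx.alloc({"pos": 0})

    def next(self):
        pos = self.state["value"]["pos"]
        if pos >= len(self.rows):
            return None
        self.state["value"]["pos"] = pos + 1
        return self.rows[pos]

    def end(self):
        self.scan_mctx.reset()


per_tuple = MemoryContext("per-tuple")
scanner = Scanner(["r1", "r2", "r3"])
scanner.start()
seen = []
state_valid_during_scan = True
while (row := scanner.next()) is not None:
    per_tuple.alloc(row)   # per-tuple scratch data
    per_tuple.reset()      # freed after every tuple
    state_valid_during_scan &= scanner.state["valid"]
    seen.append(row)
scanner.end()
```

Resetting the per-tuple context after every row never touches the scan state, which is released only when the scan ends.
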
When running `performDeletion` it is necessary to have a valid relation
id, but when doing a lookup using `ts_hypertable_get_by_id` this might
actually return a hypertable entry pointing to a table that does not
exist because it has been deleted previously. In this case, only the
catalog entry should be removed, but it is not necessary to delete the
actual table.

This scenario can occur if both the hypertable and a compressed table
are deleted as part of running a `sql_drop` event, for example, if a
compressed hypertable is defined inside an extension. In this case, the
compressed hypertable (indeed all tables) will be deleted first, and
the lookup of the compressed hypertable will find it in the metadata
but a lookup of the actual table will fail since the table does not
exist.
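
The intended behavior can be sketched in Python as below. All names are hypothetical (`delete_hypertable`, the `catalog` dict, the `tables` set), and `INVALID_OID = 0` is an assumed sentinel for this sketch only; the real code works against PostgreSQL catalogs.

```python
INVALID_OID = 0  # assumed sentinel for "no such relation"


def delete_hypertable(catalog, tables, hypertable_id):
    """Remove the catalog entry; drop the table only if it still exists."""
    entry = catalog.pop(hypertable_id)
    relid = entry["relid"] if entry["relid"] in tables else INVALID_OID
    if relid != INVALID_OID:
        tables.remove(relid)  # stand-in for dropping the actual table
        return "dropped table and catalog entry"
    return "dropped catalog entry only"


# Hypertable 2 still has a catalog entry, but its table is already gone.
catalog = {1: {"relid": 1001}, 2: {"relid": 1002}}
tables = {1001}

result_existing = delete_hypertable(catalog, tables, 1)
result_missing = delete_hypertable(catalog, tables, 2)
```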

Fixes timescale#4140
In certain multi-node queries, we end up using a parameterized query
on the datanodes. If "timescaledb.enable_remote_explain" is enabled we
run an EXPLAIN on the datanode with the remote query. EXPLAIN doesn't
work with parameterized queries. So, we check for that case and avoid
invoking a remote EXPLAIN if so.
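
A rough Python sketch of the guard follows. How the extension actually detects parameters is not described here; the regular expression on the query text is an assumption made for illustration, and `remote_explain` and `fake_data_node` are hypothetical helpers.

```python
import re

# Matches PostgreSQL-style positional parameters such as $1 or $12.
PARAM_RE = re.compile(r"\$\d+")


def is_parameterized(sql):
    # Deliberately naive: a "$1" inside a string literal would also match.
    return PARAM_RE.search(sql) is not None


def remote_explain(sql, run_explain):
    """Return EXPLAIN output, or None when the query has parameters."""
    if is_parameterized(sql):
        return None
    return run_explain("EXPLAIN " + sql)


def fake_data_node(statement):
    # Hypothetical stand-in for sending a statement to a data node.
    return "plan for: " + statement


plain = remote_explain("SELECT * FROM metrics WHERE device = 1", fake_data_node)
skipped = remote_explain("SELECT * FROM metrics WHERE device = $1", fake_data_node)
```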

Fixes timescale#3974

Reported and test case provided by @daydayup863
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes timescale#4200
If a session is started and loads (and caches, by OID) functions in the
extension to use them in, for example, a `SELECT` query on a continuous
aggregate, the extension will be marked as loaded internally.

If an `ALTER EXTENSION` is then executed in a separate session, it will
update `pg_extension` to hold the new version, and any other sessions
will see this as the new version, including the session that already
loaded the previous version of the shared library.

Since the pre-update session has already loaded some functions from
the old version, running the same queries again will trigger a load of
the new version of the shared library to get the new functions (same
name, but different OID). Because a different version of the library
has already been loaded, this triggers an error that GUC variables are
redefined.

Further queries after that will then corrupt the database causing a
crash.

This commit fixes this by recording the version loaded, rather than
whether it has been loaded, and checking that the version did not
change after a query has been analyzed (in the `post_analyze_hook`).
If the version changed, it will generate a fatal error to force an
abort of the session.
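
The check can be modeled in a few lines of Python. This is a sketch, not the extension's code: `Session`, `FatalError`, and `post_analyze` are hypothetical names, and the version strings are illustrative values only.

```python
class FatalError(Exception):
    """Stand-in for an error that terminates the session."""


class Session:
    def __init__(self):
        # Record which version was loaded, not merely whether one was.
        self.loaded_version = None

    def load_extension(self, catalog_version):
        if self.loaded_version is None:
            self.loaded_version = catalog_version

    def post_analyze(self, catalog_version):
        # Runs after each query is analyzed.
        self.load_extension(catalog_version)
        if self.loaded_version != catalog_version:
            raise FatalError(
                f"extension changed from {self.loaded_version} "
                f"to {catalog_version}; reconnect required"
            )


session = Session()
session.post_analyze("2.6.0")      # first query loads 2.6.0
session.post_analyze("2.6.0")      # unchanged version: fine

try:
    session.post_analyze("2.6.1")  # another session ran ALTER EXTENSION
    aborted = False
except FatalError:
    aborted = True
```

With only a boolean "loaded" flag, the third call could not tell that the catalog had moved on; storing the version makes the mismatch detectable.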

Fixes timescale#4191
Fix a crash that could corrupt indexes when running VACUUM FULL
pg_class.

The crash happens when caches are queried/updated within a cache
invalidation function, which can lead to corruption and recursive
cache invalidations.

To solve the issue, make sure the relcache invalidation callback is
simple and never invokes the relcache or syscache directly or
indirectly.

Some background: The extension is preloaded and thus has planner
hooks installed irrespective of whether the extension is actually
installed or not in the current database. However, the hooks need to
be disabled as long as the extension is not installed. To avoid always
having to dynamically check for the presence of the extension, the
state is cached in the session.

However, the cached state needs to be updated if the extension changes
(altered/dropped/created). Therefore, the relcache invalidation
callback mechanism is (ab)used in TimescaleDB to signal updates to the
extension state across all active backends.

The signaling is implemented by installing a dummy table as part of
the extension and any invalidation on the relid for that table signals
a change in the extension state. However, as of this change, the
actual state is no longer determined in the callback itself, since it
requires use of the relcache and causes the bad behavior. Therefore,
the only thing that remains in the callback after this change is to
reset the extension state.

The actual state is instead resolved on-demand, but can still be
cached when the extension is in the installed state and the dummy
table is present with a known relid. However, if the extension is not
installed, the extension state can no longer be cached as there is no
way to signal other backends that the state should be reset when they
don't know the dummy table's relid, and cannot resolve it from within
the callback itself.
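
The resulting split between a trivial callback and on-demand resolution might look like the Python model below. It is a simplified sketch with hypothetical names (`Backend`, `ExtState`, a shared `catalog` dict standing in for the system catalogs); the relid values are arbitrary.

```python
from enum import Enum


class ExtState(Enum):
    UNKNOWN = "unknown"
    NOT_INSTALLED = "not installed"
    INSTALLED = "installed"


class Backend:
    def __init__(self, catalog):
        self.catalog = catalog          # shared dict standing in for catalogs
        self.state = ExtState.UNKNOWN
        self.proxy_relid = None         # relid of the dummy table, if known

    def invalidation_callback(self, relid):
        # Kept trivial on purpose: no catalog or cache access in here.
        if relid == self.proxy_relid:
            self.state = ExtState.UNKNOWN
            self.proxy_relid = None

    def extension_state(self):
        # Cached only in the installed state, where the dummy table's relid
        # is known and an invalidation on it can reset the cache.
        if self.state is ExtState.INSTALLED:
            return self.state
        relid = self.catalog.get("proxy_relid")   # on-demand resolution
        if relid is None:
            return ExtState.NOT_INSTALLED         # not cached
        self.proxy_relid = relid
        self.state = ExtState.INSTALLED
        return self.state


catalog = {"proxy_relid": 4242}
backend = Backend(catalog)
before = backend.extension_state()      # resolves and caches INSTALLED

del catalog["proxy_relid"]              # extension dropped elsewhere
backend.invalidation_callback(4242)     # callback only resets the state
after_drop = backend.extension_state()  # re-resolved on demand

catalog["proxy_relid"] = 4343           # extension created again
after_create = backend.extension_state()
```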

Fixes timescale#3924
erimatnor and others added 4 commits April 8, 2022 09:20
Add a TAP test that checks that the extension's state is updated across
concurrent sessions/backends when the extension is "dropped" or
"created".
Stop throwing an exception with the message "column of relation already
exists" when running the command ALTER TABLE ... ADD COLUMN IF NOT
EXISTS ... on compressed hypertables.

Fix timescale#4087
Allow the ALTER TABLE ... OWNER TO command to be used with distributed
hypertables.

Fix timescale#4180
The function `tsl_finalize_agg_ffunc` modified the aggregation state by
setting `trans_value` to the final result when computing the final
value. Since the state can be re-used several times, there could be
several calls to the finalization function, and the finalization
function would be confused when passed a final value instead of an
aggregation state transition value.

This commit fixes this by not modifying the `trans_value` when
computing the final value and instead just returns it (or the original
`trans_value` if there is no finalization function).
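
The difference between the two behaviors can be shown with a small Python model. It is only a sketch: `AggState`, `finalize_mutating`, and `finalize_pure` are hypothetical names, and a list of values stands in for the real transition state.

```python
class AggState:
    def __init__(self):
        self.trans_value = []  # transition state: values seen so far


def transition(state, value):
    state.trans_value.append(value)
    return state


def finalize_mutating(state, finalfn):
    # Problematic pattern: overwrites the transition state with the result.
    state.trans_value = finalfn(state.trans_value)
    return state.trans_value


def finalize_pure(state, finalfn=None):
    # Leaves the state untouched, so it can be finalized again later.
    if finalfn is None:
        return state.trans_value
    return finalfn(state.trans_value)


def average(values):
    return sum(values) / len(values)


state = AggState()
for v in (2, 4, 6):
    transition(state, v)

first = finalize_pure(state, average)
second = finalize_pure(state, average)   # same state, same answer

broken = AggState()
for v in (2, 4, 6):
    transition(broken, v)
finalize_mutating(broken, average)
try:
    finalize_mutating(broken, average)   # state is now a float, not a list
    second_call_failed = False
except TypeError:
    second_call_failed = True
```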

Fixes timescale#3248
@codecov

codecov bot commented Apr 8, 2022

Codecov Report

Merging #4213 (fc6d8a4) into 2.6.x (23962c8) will increase coverage by 0.01%.
The diff coverage is 97.74%.

❗ Current head fc6d8a4 differs from pull request most recent head 943da93. Consider uploading reports for the commit 943da93 to get more accurate results

@@            Coverage Diff             @@
##            2.6.x    #4213      +/-   ##
==========================================
+ Coverage   90.67%   90.68%   +0.01%     
==========================================
  Files         215      216       +1     
  Lines       39060    39312     +252     
==========================================
+ Hits        35418    35652     +234     
- Misses       3642     3660      +18     

| Impacted Files | Coverage Δ |
|---|---|
| src/chunk.h | 100.00% <ø> (ø) |
| src/planner.c | 94.95% <ø> (-0.02%) ⬇️ |
| src/planner.h | 100.00% <ø> (ø) |
| src/ts_catalog/catalog.h | 100.00% <ø> (ø) |
| tsl/src/continuous_aggs/invalidation.c | 96.10% <ø> (ø) |
| tsl/src/continuous_aggs/materialize.c | 71.42% <0.00%> (ø) |
| tsl/src/fdw/data_node_scan_plan.c | 98.21% <ø> (ø) |
| src/extension.c | 86.27% <90.00%> (-4.02%) ⬇️ |
| src/loader/loader.c | 92.94% <94.11%> (+0.05%) ⬆️ |
| src/plan_expand_hypertable.c | 93.94% <95.65%> (-0.11%) ⬇️ |
| ... and 36 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23962c8...943da93. Read the comment docs.

@horzsolt horzsolt removed the request for review from josesahad April 8, 2022 11:54
@horzsolt horzsolt assigned mkindahl and RafiaSabih and unassigned mkindahl Apr 8, 2022
@horzsolt horzsolt requested a review from mkindahl April 8, 2022 11:55
@RafiaSabih RafiaSabih force-pushed the release-2.6.1 branch 3 times, most recently from c448377 to 6446d91 Compare April 11, 2022 09:51
Markos Fountoulakis and others added 4 commits April 11, 2022 17:16
Add concurrent_query_and_drop_chunks to ignore-list and fix C compiler
warning.
This release is a patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4176 Fix remote EXPLAIN with parameterized queries
* timescale#4181 Fix spelling errors and omissions
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes
* timescale#4199 Remove signal-unsafe calls from signal handlers
* timescale#4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
@svenklemm svenklemm merged commit 9ae47c6 into timescale:2.6.x Apr 11, 2022
8 participants