Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory handling during scans #4161

Merged
merged 2 commits into from
Mar 18, 2022

Conversation

erimatnor
Copy link
Contributor

@erimatnor erimatnor commented Mar 12, 2022

Scan functions cannot be called on a per-tuple memory context as they
might allocate data that need to live until the end of the scan. Fix
this in a couple of places to ensure correct memory handling.

Fixes #4148, #4145
Disable-Check: Commit-Count

@erimatnor erimatnor linked an issue Mar 12, 2022 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Mar 12, 2022

Codecov Report

Merging #4161 (533abcf) into main (ce3e04a) will increase coverage by 0.03%.
The diff coverage is 97.36%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4161      +/-   ##
==========================================
+ Coverage   90.66%   90.69%   +0.03%     
==========================================
  Files         215      215              
  Lines       39362    39382      +20     
==========================================
+ Hits        35686    35718      +32     
+ Misses       3676     3664      -12     
Impacted Files Coverage Δ
src/scan_iterator.h 100.00% <ø> (ø)
src/scanner.c 94.70% <96.42%> (+0.19%) ⬆️
src/chunk_scan.c 98.66% <100.00%> (ø)
.../constraint_aware_append/constraint_aware_append.c 92.53% <100.00%> (+0.03%) ⬆️
src/scan_iterator.c 94.11% <100.00%> (ø)
src/telemetry/stats.c 97.67% <100.00%> (-0.03%) ⬇️
tsl/src/nodes/data_node_dispatch.c 97.10% <0.00%> (-0.25%) ⬇️
tsl/src/bgw_policy/job.c 88.01% <0.00%> (-0.05%) ⬇️
src/bgw/job.c 93.22% <0.00%> (+0.28%) ⬆️
tsl/src/reorder.c 85.17% <0.00%> (+0.29%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a759b2b...533abcf. Read the comment docs.

@@ -252,9 +252,9 @@ ts_chunk_scan_by_constraints(const Hyperspace *hs, const List *dimension_vecs,
prev_chunk_oid = chunk->table_id;
}

MemoryContextSwitchTo(work_mcxt);
/* Only one chunk should match */
Assert(ts_scan_iterator_next(&chunk_it) == NULL);
Copy link
Contributor

@pmwkaa pmwkaa Mar 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this is not related to this PR, but I think placing working code inside the Assert statement might break it for the Release build or any other one, which is built without debug enabled, etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It verifies that only one row is returned, which is the assumption we make since we scan on the primary key. In debug, this is verified to be true, but the check isn't really needed if the assumption is true, which is why it is elided in release builds.

@erimatnor erimatnor force-pushed the fix-scanning-crashes branch from 671efe5 to 34dead2 Compare March 14, 2022 08:32
@erimatnor erimatnor marked this pull request as ready for review March 14, 2022 08:48
@erimatnor erimatnor requested a review from a team as a code owner March 14, 2022 08:48
@erimatnor erimatnor requested review from berkley, afiskon, duncan-tsdb, akuzm, mkindahl, nikkhils, svenklemm and a team and removed request for a team, berkley, afiskon and duncan-tsdb March 14, 2022 08:48
@erimatnor
Copy link
Contributor Author

This should fix the recent Sqlsmith crashes, but we should double-check that the CI job works as expected.

@erimatnor
Copy link
Contributor Author

Would like to merge #4164 first and then run the Sqlsmith tests on this PR before merging.

@erimatnor erimatnor added the bug label Mar 15, 2022
@erimatnor erimatnor self-assigned this Mar 15, 2022
src/telemetry/stats.c Outdated Show resolved Hide resolved
@erimatnor erimatnor force-pushed the fix-scanning-crashes branch from 34dead2 to ab6bbf5 Compare March 16, 2022 10:15
Copy link
Member

@svenklemm svenklemm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldnt reproduce the issue from #4148 with this applied.

Copy link
Contributor

@mkindahl mkindahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor nits, but nothing critical.

src/scanner.c Show resolved Hide resolved
src/scanner.c Show resolved Hide resolved
Comment on lines +415 to +417
if (ts_scanner_limit_reached(ctx))
is_valid = false;
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
if (ts_scanner_limit_reached(ctx))
is_valid = false;
else
if (ts_scanner_limit_reached(ctx))
{
is_valid = false;
}
else

src/telemetry/stats.c Outdated Show resolved Hide resolved
@erimatnor erimatnor requested a review from pmwkaa March 17, 2022 09:05
@erimatnor
Copy link
Contributor Author

Couldnt reproduce the issue from #4148 with this applied.

FYI, the CI checks also include the SQLsmith run on this commit below.

@erimatnor erimatnor force-pushed the fix-scanning-crashes branch from ab6bbf5 to 1965994 Compare March 17, 2022 09:15
Scan functions cannot be called on a per-tuple memory context as they
might allocate data that need to live until the end of the scan. Fix
this in a couple of places to ensure correct memory handling.

Fixes timescale#4148, timescale#4145
PostgreSQL scan functions might allocate memory that needs to live for
the duration of the scan. This applies also to functions that are
called during the scan, such as getting the next tuple. To avoid
situations when such functions are accidentally called on, e.g., a
short-lived per-tuple context, add a explicit scan memory context to
the Scanner interface that wraps the PostgreSQL scan API.
@erimatnor erimatnor force-pushed the fix-scanning-crashes branch from 1965994 to 533abcf Compare March 17, 2022 19:58
@erimatnor erimatnor merged commit 846878c into timescale:main Mar 18, 2022
@erimatnor erimatnor deleted the fix-scanning-crashes branch March 18, 2022 07:06
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 5, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
This was referenced Apr 5, 2022
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 6, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#3974 Fix remote EXPLAIN with parameterized queries
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* #3974 Fix remote EXPLAIN with parameterized queries
* #4122 Fix segfault on INSERT into distributed hypertable
* #4142 Ignore invalid relid when deleting hypertable
* #4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* #4161 Fix memory handling during scans
* #4186 Fix owner change for distributed hypertable
* #4192 Abort sessions after extension reload
* #4193 Fix relcache callback handling causing crashes

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 8, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4176 Fix remote EXPLAIN with parameterized queries
* timescale#4181 Fix spelling errors and omissions
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes
* timescale#4199 Remove signal-unsafe calls from signal handlers
* timescale#4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 11, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4176 Fix remote EXPLAIN with parameterized queries
* timescale#4181 Fix spelling errors and omissions
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes
* timescale#4199 Remove signal-unsafe calls from signal handlers
* timescale#4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 11, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4176 Fix remote EXPLAIN with parameterized queries
* timescale#4181 Fix spelling errors and omissions
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes
* timescale#4199 Remove signal-unsafe calls from signal handlers
* timescale#4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
RafiaSabih added a commit to RafiaSabih/timescaledb that referenced this pull request Apr 11, 2022
This release is patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4176 Fix remote EXPLAIN with parameterized queries
* timescale#4181 Fix spelling errors and omissions
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes
* timescale#4199 Remove signal-unsafe calls from signal handlers
* timescale#4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
svenklemm pushed a commit to RafiaSabih/timescaledb that referenced this pull request Apr 11, 2022
This release is a patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* timescale#4122 Fix segfault on INSERT into distributed hypertable
* timescale#4142 Ignore invalid relid when deleting hypertable
* timescale#4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* timescale#4161 Fix memory handling during scans
* timescale#4176 Fix remote EXPLAIN with parameterized queries
* timescale#4181 Fix spelling errors and omissions
* timescale#4186 Fix owner change for distributed hypertable
* timescale#4192 Abort sessions after extension reload
* timescale#4193 Fix relcache callback handling causing crashes
* timescale#4199 Remove signal-unsafe calls from signal handlers
* timescale#4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
svenklemm pushed a commit that referenced this pull request Apr 11, 2022
This release is a patch release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* #4121 Fix RENAME TO/SET SCHEMA on distributed hypertable
* #4122 Fix segfault on INSERT into distributed hypertable
* #4142 Ignore invalid relid when deleting hypertable
* #4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* #4161 Fix memory handling during scans
* #4176 Fix remote EXPLAIN with parameterized queries
* #4181 Fix spelling errors and omissions
* #4186 Fix owner change for distributed hypertable
* #4192 Abort sessions after extension reload
* #4193 Fix relcache callback handling causing crashes
* #4199 Remove signal-unsafe calls from signal handlers
* #4219 Do not modify aggregation state in finalize

**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Crash in planner for JOIN queries. [Bug]: Telemetry Reporter terminated segmentation fault
4 participants