[Bug]: Timescale segfaults when backfilling data #6540
Comments
Hi @iliastsa, thank you for reaching out. Could you share the schema and hypertable definition for the hypertable that is hitting the segfault? Is that table compressed?
Sure, here is the DDL + TimescaleDB / compression settings:
create table logs (
column1 bigint not null,
column2 int not null,
column3 int not null,
column4 int not null,
column5 int,
column6 bool not null,
column7 bytea not null,
column8 bytea,
column9 bytea,
column10 bytea,
column11 bytea,
column12 bytea null,
primary key (column1, column2, column3)
);
select create_hypertable('logs', 'column1', chunk_time_interval => 300000, create_default_indexes => false);
alter table logs set (
timescaledb.compress,
timescaledb.compress_segmentby = 'column7',
timescaledb.compress_orderby = 'column1 desc, column2 desc, column4 desc'
);
Hello @iliastsa, we are trying to reproduce the error by inserting into, deleting from, and COPYing into compressed chunks, but unfortunately we do not have a reproduction case so far.
Yeah, I've also tried to reproduce it locally with inserts/deletes/COPYs but can't get it to crash. I don't have a core dump; I'll try to get one when we encounter the crash again.
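For anyone else trying to reproduce this, the attempts described above might look roughly like the sketch below. It is only an illustration against the `logs` schema posted earlier, not a confirmed reproduction: the row counts, the column values, and the file name `backfill.csv` are all made up, and the last line is a psql meta-command rather than plain SQL.

```sql
-- Hypothetical reproduction attempt; values and row counts are assumptions.
-- Fill roughly three chunks (chunk_time_interval is 300000 on column1).
INSERT INTO logs (column1, column2, column3, column4, column6, column7)
SELECT g, 0, 0, 0, true, '\x01'::bytea
FROM generate_series(1, 600000) AS g;

-- Compress every chunk of the hypertable that is not compressed yet.
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('logs') AS c;

-- Backfill: COPY rows whose column1 values fall inside the compressed chunks.
-- backfill.csv is a hypothetical file; run this line from psql.
\copy logs (column1, column2, column3, column4, column6, column7) FROM 'backfill.csv' WITH (FORMAT csv)
```

If the file contains rows that collide with the primary key, a duplicate key error is the expected outcome; a server crash is not.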
We are suddenly having what seems to be the same issue. This system has been running for 2 months with no issue and suddenly started getting this:
The duplicate key error is semi-expected - the issue is it should not crash postgres!
Backtrace from core dump:
Postgres version: 14.9
OK, it turns out that in our case we were actually still on TimescaleDB 2.11.0. Running
...except that the issue has reappeared after upgrading to 2.14.2.
Is the stack trace on 2.14.2 different? There was a similar problem that was fixed in 2.12, which is probably what you hit initially: #6117. But if it still fails on 2.14.2, you may be hitting something different now. If you can put together a snippet of data on which it always reproduces, that would be perfect, because I was not able to reproduce it or figure out a possible cause after some experiments.
Unfortunately this was all on a live system and I didn't manage to obtain a stack trace for 2.14.2.
This release contains bug fixes since the 2.15.0 release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#6540 Segmentation fault when backfilling data with COPY into a compressed chunk
* timescale#6858 Before update trigger not working correctly
* timescale#6908 Fix gapfill with timezone behaviour around dst switches
* timescale#6911 Fix dropped chunk metadata removal in update script
* timescale#6940 Fix `pg_upgrade` failure by removing `regprocedure` from catalog table
* timescale#6957 Fix segfault in UNION queries with ordering on compressed chunks

**Thanks**
* @DiAifU, @kiddhombre and @intermittentnrg for reporting issues with gapfill and daylight saving time
* @edgarzamora for reporting issue with update triggers
* @hongquan for reporting an issue with the update script
* @iliastsa and @SystemParadox for reporting an issue with COPY into a compressed chunk
This release contains performance improvements and bug fixes since the 2.15.0 release. Best practice is to upgrade at the next available opportunity.

**Migrating from self-hosted TimescaleDB v2.14.x and earlier**
After you run `ALTER EXTENSION`, you must run [this SQL script](https://github.com/timescale/timescaledb-extras/blob/master/utils/2.15.X-fix_hypertable_foreign_keys.sql). For more details, see the following pull request [#6797](#6797).
If you are migrating from TimescaleDB v2.15.0, no changes are required.

**Bugfixes**
* #6540: Segmentation fault when you backfill data using COPY into a compressed chunk.
* #6858: `BEFORE UPDATE` trigger not working correctly.
* #6908: Fix `time_bucket_gapfill()` with timezone behaviour around daylight savings time (DST) switches.
* #6911: Fix dropped chunk metadata removal in the update script.
* #6940: Fix `pg_upgrade` failure by removing `regprocedure` from the catalog table.
* #6957: Fix the `segfault` in UNION queries that contain ordering on compressed chunks.

**Thanks**
* @DiAifU, @kiddhombre and @intermittentnrg for reporting the issues with gapfill and daylight saving time.
* @edgarzamora for reporting the issue with update triggers.
* @hongquan for reporting the issue with the update script.
* @iliastsa and @SystemParadox for reporting the issue with COPY into a compressed chunk.
What type of bug is this?
Crash
What subsystems and features are affected?
Data ingestion
What happened?
When backfilling data into a hypertable, we get a segfault and the server goes into recovery mode. We've encountered this multiple times, on all TimescaleDB versions since we started using it (which was at v2.11.1 if I'm not mistaken).
We've also noticed that dropping old chunks before backfilling sometimes let the backfill progress further.
TimescaleDB version affected
2.13.1
PostgreSQL version used
15.5
What operating system did you use?
Ubuntu 22.04 LTS x64
What installation method did you use?
Deb/Apt
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
How can we reproduce the bug?