Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable real time aggregation for CAggs with joins #5243

Merged
merged 1 commit into from
Feb 10, 2023

Conversation

RafiaSabih
Copy link
Contributor

No description provided.

@RafiaSabih RafiaSabih self-assigned this Jan 30, 2023
@codecov
Copy link

codecov bot commented Jan 30, 2023

Codecov Report

Merging #5243 (fb3ad7d) into main (206056c) will increase coverage by 0.00%.
The diff coverage is n/a.

❗ Current head fb3ad7d differs from pull request most recent head 350af01. Consider uploading reports for the commit 350af01 to get more accurate results

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #5243   +/-   ##
=======================================
  Coverage   89.35%   89.35%           
=======================================
  Files         225      225           
  Lines       51925    51925           
=======================================
+ Hits        46396    46399    +3     
+ Misses       5529     5526    -3     
Impacted Files Coverage Δ
tsl/src/nodes/data_node_dispatch.c 93.24% <0.00%> (+0.21%) ⬆️
tsl/src/reorder.c 85.71% <0.00%> (+0.21%) ⬆️
src/bgw/job_stat.c 92.18% <0.00%> (+0.31%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8132908...350af01. Read the comment docs.

Copy link
Contributor

@fabriziomello fabriziomello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to show in tests all the CAggs definition (i.e. \d+ conditions_summary_daily_realtime)

Comment on lines 191 to 251
DETAIL: drop cascades to table _timescaledb_internal._hyper_6_35_chunk
drop cascades to table _timescaledb_internal._hyper_6_36_chunk
DETAIL: drop cascades to table _timescaledb_internal._hyper_7_37_chunk
drop cascades to table _timescaledb_internal._hyper_7_38_chunk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you want to turn off the detailed output for this part of the test. It will change more times, and potentially generate false positives. It is sufficient to know the number of objects dropped.

Comment on lines 334 to 360
NOTICE: drop cascades to 9 other objects
DETAIL: drop cascades to view _timescaledb_internal._partial_view_3
NOTICE: drop cascades to 12 other objects
DETAIL: drop cascades to view conditions_summary_daily_realtime
drop cascades to view _timescaledb_internal._partial_view_3
drop cascades to view _timescaledb_internal._direct_view_3
drop cascades to view _timescaledb_internal._partial_view_4
drop cascades to view _timescaledb_internal._direct_view_4
drop cascades to view conditions_summary_daily_2
drop cascades to view _timescaledb_internal._partial_view_5
drop cascades to view _timescaledb_internal._direct_view_5
drop cascades to view _timescaledb_internal._partial_view_7
drop cascades to view _timescaledb_internal._direct_view_7
drop cascades to view conditions_summary_daily_2
drop cascades to view _timescaledb_internal._partial_view_6
drop cascades to view _timescaledb_internal._direct_view_6
drop cascades to view _timescaledb_internal._partial_view_8
drop cascades to view _timescaledb_internal._direct_view_8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, this do not add anything to the test, so you can just as well remove this output by not printing the detailed error output.

Comment on lines 45 to 56
--Create a cagg with join between a hypertable and a normal table
-- with equality condition on inner join type
CREATE MATERIALIZED VIEW conditions_summary_daily_realtime
WITH (timescaledb.continuous) AS
SELECT time_bucket(INTERVAL '1 day', day) AS bucket,
AVG(temperature),
MAX(temperature),
MIN(temperature),
name
FROM conditions, devices
WHERE conditions.device_id = devices.device_id
GROUP BY name, bucket;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @fabriziomello here. Show the definition of the cagg so that the test demonstrates that one with real-time aggregate was actually created.

@RafiaSabih RafiaSabih force-pushed the real_time_joins branch 4 times, most recently from 350af01 to 1f4efa0 Compare February 7, 2023 11:31
Copy link
Contributor

@jnidzwetzki jnidzwetzki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good to me. Please see the comment on the test improvement and add a proper commit/PR description and (if needed) also a changelog entry.

devices
WHERE conditions.device_id = devices.device_id AND devices.device_id >= COALESCE(_timescaledb_internal.to_date(_timescaledb_internal.cagg_watermark(3)), '-infinity'::date)
GROUP BY devices.name, (time_bucket('@ 1 day'::interval, conditions.day));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you insert data into the real-time part of the CAGG and do a SELECT on the CAGG to show that real-time aggregation works?

@RafiaSabih RafiaSabih force-pushed the real_time_joins branch 3 times, most recently from fc57aa3 to 328e7d8 Compare February 9, 2023 09:34
@horzsolt horzsolt added this to the TimescaleDB 2.10 milestone Feb 9, 2023
Copy link
Contributor

@mkindahl mkindahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Comment on lines 3383 to 3384
* If there is join in CAgg definition then adjust varno
* to get time column from hypertable only.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which hypertable? Both the materialized table and the "underlying" table are hypertables. Or are you thinking about the hypertable in the join? Would be good to clarify which one it is.

Comment on lines 325 to 326
DETAIL: drop cascades to table _timescaledb_internal._hyper_8_41_chunk
drop cascades to table _timescaledb_internal._hyper_8_42_chunk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would suggest to not print the detailed message here, so set verbosity to terse.

@RafiaSabih RafiaSabih enabled auto-merge (rebase) February 10, 2023 11:05
@RafiaSabih RafiaSabih merged commit ece15d6 into timescale:main Feb 10, 2023
@RafiaSabih RafiaSabih deleted the real_time_joins branch February 10, 2023 20:56
@mahipv mahipv mentioned this pull request Feb 14, 2023
mahipv added a commit that referenced this pull request Feb 20, 2023
This release contains new features and bug fixes since the 2.9.3
release.

This release is high priority for upgrade. We strongly recommend that
you upgrade as soon as possible.

**Features**
* #5241 Allow RETURNING clause when inserting into compressed chunks
* #5245 Manage life-cycle of connections via memory contexts
* #5246 Make connection establishment interruptible
* #5253 Make data node command execution interruptible
* #5243 Enable real-time aggregation for continuous aggregates with
joins
* #5262 Extend enabling compression on a continuous aggregrate with
'compress_segmentby' and 'compress_orderby' parameters

**Bugfixes**
* #4926 Fix corruption when inserting into compressed chunks
* #5218 Add role-level security to job error log
* #5214 Fix use of prepared statement in async module
* #5290 Compression can't be enabled on continuous aggregates when
segmentby/orderby columns need quotation
* #5239 Fix next_start calculation for fixed schedules
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants