Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(aws_cloudwatch_logs sink): add configurable log retention #18865

Merged

Conversation

AndrewChubatiuk
Copy link
Contributor

Fixes #12038

@AndrewChubatiuk AndrewChubatiuk requested a review from a team October 17, 2023 15:34
@netlify
Copy link

netlify bot commented Oct 17, 2023

👷 Deploy request for vector-project pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 97d882b

@bits-bot
Copy link

bits-bot commented Oct 17, 2023

CLA assistant check
All committers have signed the CLA.

@netlify
Copy link

netlify bot commented Oct 17, 2023

Deploy Preview for vrl-playground canceled.

Name Link
🔨 Latest commit 97d882b
🔍 Latest deploy log https://app.netlify.com/sites/vrl-playground/deploys/6540f9863a80780008c35ff8

@github-actions github-actions bot added the domain: sinks Anything related to the Vector's sinks label Oct 17, 2023
@AndrewChubatiuk AndrewChubatiuk force-pushed the cloudwatch-logs-retention branch from 21d9a58 to f8cdef9 Compare October 17, 2023 15:53
@neuronull neuronull changed the title enable cloudwatch logs sink retention feat(aws_cloudwatch_logs sink): add configurable log retention Oct 17, 2023
@neuronull neuronull added the sink: aws_cloudwatch_logs Anything `aws_cloudwatch_logs` sink related label Oct 17, 2023
Copy link
Contributor

@StephenWakely StephenWakely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I have a couple of queries around CloudwatchFuture. I'm not a massive expert in Cloudwatch, so it's possible they may just arise from my misunderstanding.

As an aside, that future really needs to be rewritten using async/await, but that's probably for another day.

src/sinks/aws_cloudwatch_logs/request.rs Show resolved Hide resolved
Comment on lines 140 to 141
self.state =
State::DeleteRetentionPolicy(self.client.delete_retention_policy());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to not delete the retention policy if one has already been specified. This is the current behaviour and people may be surprised if retention policies start being removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

src/sinks/aws_cloudwatch_logs/request.rs Outdated Show resolved Hide resolved
Comment on lines 353 to 352
.limit(1)
.log_group_name_prefix(group_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this sufficient?

  • cloudwatch has log group x1, vector looking for x, api call returns x1
  • cloudwatch has log groups x AND x1, vector looking for x, we HOPE it returns x, but feels like we're depending on some implementation detail for this.

Copy link
Contributor Author

@AndrewChubatiuk AndrewChubatiuk Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added filtering for describe_streams and removed describe_groups since it's not needed anymore

@AndrewChubatiuk AndrewChubatiuk force-pushed the cloudwatch-logs-retention branch from 33da68f to f637e89 Compare October 20, 2023 16:10
@AndrewChubatiuk AndrewChubatiuk force-pushed the cloudwatch-logs-retention branch from f637e89 to f649cf9 Compare October 20, 2023 16:26
@AndrewChubatiuk
Copy link
Contributor Author

@StephenWakely @srstrickland
already made changes according to your comments. is it good to be merged?

Copy link
Contributor

@StephenWakely StephenWakely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think there's just one more change that needs to be made and then it's good to go.

Thanks for your patience in dealing with this.

src/sinks/aws_cloudwatch_logs/request.rs Outdated Show resolved Hide resolved
Co-authored-by: Stephen Wakely <stephen@lisp.space>

info!(message = "Retention policy updated for stream.", name = %self.client.stream_name);

self.state = State::CreateStream(self.client.create_log_stream());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs formatting.. Nearly there..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@StephenWakely any other changes needed?

@StephenWakely
Copy link
Contributor

Ah, we face some clippy errors now.

error: consider choosing a more descriptive name
Error:    --> src/sinks/aws_cloudwatch_logs/request.rs:239:28
    |
239 |                         Ok(__) => {}
    |                            ^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#just_underscores_and_digits
note: the lint level is defined here
   --> src/lib.rs:7:9
    |
7   | #![deny(warnings)]
    |         ^^^^^^^^
    = note: `#[deny(clippy::just_underscores_and_digits)]` implied by `#[deny(warnings)]`

error: called `filter(..).next()` on an `Iterator`. This is more succinctly expressed by calling `.find(..)` instead
Error:    --> src/sinks/aws_cloudwatch_logs/request.rs:135:43
    |
135 |                       if let Some(stream) = response
    |  ___________________________________________^
136 | |                         .log_streams
137 | |                         .ok_or(CloudwatchError::NoStreamsFound)?
138 | |                         .into_iter()
...   |
141 | |                         })
142 | |                         .next()
    | |_______________________________^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#filter_next
    = note: `#[deny(clippy::filter_next)]` implied by `#[deny(warnings)]`

error: using `clone` on type `i32` which implements the `Copy` trait
Error:    --> src/sinks/aws_cloudwatch_logs/request.rs:343:30
    |
343 |         let retention_days = self.retention_days.clone();
    |                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try removing the `clone` call: `self.retention_days`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#clone_on_copy
    = note: `#[deny(clippy::clone_on_copy)]` implied by `#[deny(warnings)]`

Can you take a look? You can run cargo vdev check rust to run clippy against the project to check the errors are fixed.

@AndrewChubatiuk
Copy link
Contributor Author

Ah, we face some clippy errors now.

error: consider choosing a more descriptive name
Error:    --> src/sinks/aws_cloudwatch_logs/request.rs:239:28
    |
239 |                         Ok(__) => {}
    |                            ^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#just_underscores_and_digits
note: the lint level is defined here
   --> src/lib.rs:7:9
    |
7   | #![deny(warnings)]
    |         ^^^^^^^^
    = note: `#[deny(clippy::just_underscores_and_digits)]` implied by `#[deny(warnings)]`

error: called `filter(..).next()` on an `Iterator`. This is more succinctly expressed by calling `.find(..)` instead
Error:    --> src/sinks/aws_cloudwatch_logs/request.rs:135:43
    |
135 |                       if let Some(stream) = response
    |  ___________________________________________^
136 | |                         .log_streams
137 | |                         .ok_or(CloudwatchError::NoStreamsFound)?
138 | |                         .into_iter()
...   |
141 | |                         })
142 | |                         .next()
    | |_______________________________^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#filter_next
    = note: `#[deny(clippy::filter_next)]` implied by `#[deny(warnings)]`

error: using `clone` on type `i32` which implements the `Copy` trait
Error:    --> src/sinks/aws_cloudwatch_logs/request.rs:343:30
    |
343 |         let retention_days = self.retention_days.clone();
    |                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try removing the `clone` call: `self.retention_days`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#clone_on_copy
    = note: `#[deny(clippy::clone_on_copy)]` implied by `#[deny(warnings)]`

Can you take a look? You can run cargo vdev check rust to run clippy against the project to check the errors are fixed.

done

@StephenWakely
Copy link
Contributor

Ok... Hopefully final check. The docs are out of date since we have added a new configuration option. You need to run

make generate-component-docs

and check in the resulting changes.

@AndrewChubatiuk AndrewChubatiuk requested review from a team as code owners November 8, 2023 21:17
@github-actions github-actions bot added the domain: external docs Anything related to Vector's external, public documentation label Nov 8, 2023
@AndrewChubatiuk
Copy link
Contributor Author

@StephenWakely
Please check again

Copy link
Contributor

@buraizu buraizu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, approving with one minor update requested

@@ -639,6 +639,20 @@ base: components: sinks: aws_cloudwatch_logs: configuration: {
}
}
}
retention: {
description: "Retention policy configuration for AWS Cloudwatch Log Group"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: "Retention policy configuration for AWS Cloudwatch Log Group"
description: "Retention policy configuration for AWS CloudWatch Log Group"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buraizu done

Copy link
Contributor

@StephenWakely StephenWakely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. We are getting some document errors now. These errors should have been highlighted when you generated the component docs, but for some reason they weren't.

Thanks for your patience. We really need to improve the process with these checks.

#[derive(Clone, Debug, Default)]
/// Retention policy configuration for AWS CloudWatch Log Group
pub struct Retention {
#[configurable(derived)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[configurable(derived)]
/// Whether or not to set a retention policy when creating a new Log Group.

#[serde(default)]
pub enabled: bool,

#[configurable(derived)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[configurable(derived)]
/// If retention is enabled, the number of days to retain logs for.

deserialize_with = "retention_days",
skip_serializing_if = "crate::serde::skip_serializing_if_default"
)]
pub days: i32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub days: i32,
pub days: u32,

Unfortunately, the docs aren't able to work with i32. u32 should be fine here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@StephenWakely
Copy link
Contributor

Can you also run make generate-component-docs to update the docs with these changes.

@AndrewChubatiuk
Copy link
Contributor Author

Can you also run make generate-component-docs to update the docs with these changes.

done

Copy link
Contributor

@StephenWakely StephenWakely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thank you very much for seeing this through.

@StephenWakely StephenWakely added this pull request to the merge queue Nov 10, 2023
Copy link

Regression Detector Results

Run ID: f4c49727-8e08-4c97-885b-b1bddc57f5af
Baseline: 473b720
Comparison: 104984f
Total vector CPUs: 7

Explanation

A regression test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine quickly if vector performance is changed and to what degree by a pull request.

Because a target's optimization goal performance in each experiment will vary somewhat each time it is run, we can only estimate mean differences in optimization goal relative to the baseline target. We express these differences as a percentage change relative to the baseline target, denoted "Δ mean %". These estimates are made to a precision that balances accuracy and cost control. We represent this precision as a 90.00% confidence interval denoted "Δ mean % CI": there is a 90.00% chance that the true value of "Δ mean %" is in that interval.

We decide whether a change in performance is a "regression" -- a change worth investigating further -- if both of the following two criteria are true:

  1. The estimated |Δ mean %| ≥ 5.00%. This criterion intends to answer the question "Does the estimated change in mean optimization goal performance have a meaningful impact on your customers?". We assume that when |Δ mean %| < 5.00%, the impact on your customers is not meaningful. We also assume that a performance change in optimization goal is worth investigating whether it is an increase or decrease, so long as the magnitude of the change is sufficiently large.

  2. Zero is not in the 90.00% confidence interval "Δ mean % CI" about "Δ mean %". This statement is equivalent to saying that there is at least a 90.00% chance that the mean difference in optimization goal is not zero. This criterion intends to answer the question, "Is there a statistically significant difference in mean optimization goal performance?". It also means there is no more than a 10.00% chance this criterion reports a statistically significant difference when the true difference in mean optimization goal is zero -- a "false positive". We assume you are willing to accept a 10.00% chance of inaccurately detecting a change in performance when no true difference exists.

The table below, if present, lists those experiments that have experienced a statistically significant change in mean optimization goal performance between baseline and comparison SHAs with 90.00% confidence OR have been detected as newly erratic. Negative values of "Δ mean %" mean that baseline is faster, whereas positive values of "Δ mean %" mean that comparison is faster. Results that do not exhibit more than a ±5.00% change in their mean optimization goal are discarded. An experiment is erratic if its coefficient of variation is greater than 0.1. The abbreviated table will be omitted if no interesting change is observed.

No interesting changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%.

Fine details of change detection per experiment.
experiment goal Δ mean % Δ mean % CI confidence
otlp_http_to_blackhole ingress throughput +4.53 [+4.38, +4.68] 100.00%
socket_to_socket_blackhole ingress throughput +3.90 [+3.83, +3.98] 100.00%
fluent_elasticsearch ingress throughput +2.63 [+2.18, +3.09] 100.00%
syslog_log2metric_splunk_hec_metrics ingress throughput +1.01 [+0.88, +1.14] 100.00%
http_to_http_acks ingress throughput +0.74 [-0.57, +2.05] 64.77%
file_to_blackhole egress throughput +0.59 [-1.90, +3.07] 30.32%
datadog_agent_remap_datadog_logs ingress throughput +0.56 [+0.47, +0.66] 100.00%
syslog_loki ingress throughput +0.52 [+0.50, +0.55] 100.00%
http_text_to_http_json ingress throughput +0.23 [+0.12, +0.33] 99.93%
syslog_humio_logs ingress throughput +0.21 [+0.13, +0.29] 100.00%
syslog_regex_logs2metric_ddmetrics ingress throughput +0.12 [+0.03, +0.22] 97.45%
http_to_http_noack ingress throughput +0.09 [-0.01, +0.20] 84.56%
http_to_http_json ingress throughput +0.04 [-0.04, +0.11] 59.71%
splunk_hec_indexer_ack_blackhole ingress throughput +0.00 [-0.14, +0.14] 1.08%
splunk_hec_to_splunk_hec_logs_acks ingress throughput -0.00 [-0.14, +0.14] 2.11%
splunk_hec_to_splunk_hec_logs_noack ingress throughput -0.02 [-0.13, +0.10] 20.31%
datadog_agent_remap_datadog_logs_acks ingress throughput -0.06 [-0.13, +0.02] 78.94%
syslog_splunk_hec_logs ingress throughput -0.07 [-0.12, -0.02] 97.14%
http_to_s3 ingress throughput -0.07 [-0.34, +0.20] 34.21%
enterprise_http_to_http ingress throughput -0.09 [-0.18, -0.00] 90.82%
otlp_grpc_to_blackhole ingress throughput -0.14 [-0.24, -0.04] 97.90%
splunk_hec_route_s3 ingress throughput -0.20 [-0.71, +0.31] 48.49%
datadog_agent_remap_blackhole ingress throughput -0.26 [-0.34, -0.17] 100.00%
datadog_agent_remap_blackhole_acks ingress throughput -0.98 [-1.07, -0.90] 100.00%
syslog_log2metric_humio_metrics ingress throughput -1.09 [-1.17, -1.01] 100.00%

Merged via the queue into vectordotdev:master with commit 104984f Nov 10, 2023
38 checks passed
@AndrewChubatiuk AndrewChubatiuk deleted the cloudwatch-logs-retention branch November 10, 2023 15:44
pront pushed a commit to dygfloyd/vector that referenced this pull request Nov 15, 2023
…rdotdev#18865)

* enable cloudwatch logs sink retention

* MR comments

Co-authored-by: Stephen Wakely <stephen@lisp.space>

* make fmt

* fixed warnings

* generated docs

* pr comments

* pr comments

* generated docs

---------

Co-authored-by: Stephen Wakely <stephen@lisp.space>
@noamcohen97
Copy link

When is this supposed to be released?

@StephenWakely
Copy link
Contributor

When is this supposed to be released?

We are aiming to release v0.35 later today (8th Jan) which will include this change.

AndrooTheChen pushed a commit to discord/vector that referenced this pull request Sep 23, 2024
…rdotdev#18865)

* enable cloudwatch logs sink retention

* MR comments

Co-authored-by: Stephen Wakely <stephen@lisp.space>

* make fmt

* fixed warnings

* generated docs

* pr comments

* pr comments

* generated docs

---------

Co-authored-by: Stephen Wakely <stephen@lisp.space>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: external docs Anything related to Vector's external, public documentation domain: sinks Anything related to the Vector's sinks sink: aws_cloudwatch_logs Anything `aws_cloudwatch_logs` sink related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[aws_cloudwatch_logs sink] Log retention is not configurable
7 participants