-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enhancement(pulsar sink): Refactor to use StreamSink #14345
Conversation
This commit heavily refactors the Pulsar Sink to use the StreamSink interface and is modeled after the Kafka Sink. It also adds additional features that bring it in line with Kafka Sink feature set. This includes: * Refactoring to use StreamSink instead of Sink interace. See vectordotdev#9261 * Supports dynamic topics using a topic template * Refactor configurations in advance of adding Pulsar source * Rework message parsing to support logs and metrics, with support for dynamic keys and properties This work is heavily modeled after Kafka sink. This means there has been some duplication of some utility code. However, it has not been refactored to remove the duplication as there wasn't a clear pattern of where such shared code should be put. Additionally, this refactor seems to be much simpler by using StreamSink but does require some workarounds limitations in the Pulsar client library by wrapping certain resources in Arc<Mutex> that *may* have performance implications. I am not famaliar enough to know if there might be some efficiencies by structuring this differently. Remaining work: * Add a few more tests
✅ Deploy Preview for vector-project canceled.
|
Soak Test ResultsBaseline: 3c8f8de ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial feedback, mostly on the configurable. Will need to spend a chunk of time reviewing the bulk of the changes.
Co-authored-by: Spencer Gilbert <Spencer.Gilbert@gmail.com>
Co-authored-by: Spencer Gilbert <Spencer.Gilbert@gmail.com>
Co-authored-by: Spencer Gilbert <Spencer.Gilbert@gmail.com>
Co-authored-by: Spencer Gilbert <Spencer.Gilbert@gmail.com>
Co-authored-by: Spencer Gilbert <Spencer.Gilbert@gmail.com>
Co-authored-by: Spencer Gilbert <Spencer.Gilbert@gmail.com>
Soak Test ResultsBaseline: 28113af ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
Soak Test ResultsBaseline: fd9e733 ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey thanks for the contribution!
It looks like this PR is getting close.
This doesn't happen very often but we actually have two PRs that are touching the same component, and in this case one PR would be overwritten by your changes.
The good news is I believe it should be fairly straightforward to satisfy the other PR's logic within this PR of yours, if you are open to it.
The other conflicting PR is #14779
, where there is a need to expose retry logic settings.
Luckily we have provisions to do this in the new style sink framework.
I've left a couple comments below showing where that would be added.
If you could include that in your PR it would be awesome! 🙏
Hey @neuronull, sorry I lost the thread on this one! Yes I would like to get this in, working on it right now |
✅ Deploy Preview for vrl-playground ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
Co-authored-by: neuronull <kyle.criddle@datadoghq.com>
This changes the Pulsar sink to use the new helpers like: * Request Builder * Tower Request / Retry Settings * Improved size_of However, it is not currently complining and I need some help
Hey @neuronull, I have worked on refactoring this to use the tower request and also use some of the newer mechanisms for request builder, retry logic, and the byte counting... However, I have a snag... and some of my rust inexperience is showing. To keep this PR clear, I haven't pushed the in-progress changes here, but instead to https://github.com/addisonj/vector/tree/pulsar_sink_improve_help I think I am pretty close to have it compiling... but hitting a snag that I am struggling to make sense of:
Comparing to other sinks, I don't see any others that need a clone on passing the service. I also don't understand what I am missing in implementing the traits, as they seems like they should all be internal to the tower service. I am going to keep poking at this but if you have any ideas, I would appreciate it! |
Thanks for tackling the changes! I'll take a look at this and get back to you. |
To solve the immediate problem outlined above, I believe you just need to derive #[derive(Clone)]
pub struct PulsarService<Exe: Executor> {
// ...
} |
Regression Detector ResultsRun ID: 16996c3d-20d8-42f5-9eb4-567abb6b14cf ExplanationA regression test is an integrated performance test for The table below, if present, lists those experiments that have experienced a statistically significant change in mean optimization goal performance between baseline and comparison SHAs with 90.00% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±5.00% change in their mean optimization goal are discarded. An experiment is erratic if its coefficient of variation is greater than 0.1. The abbreviated table will be omitted if no interesting change is observed. No interesting changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%. Fine details of change detection per experiment.
|
Regression Detector ResultsRun ID: e8089e9d-6368-40b6-b658-c4e6d2aa0088 ExplanationA regression test is an integrated performance test for The table below, if present, lists those experiments that have experienced a statistically significant change in mean optimization goal performance between baseline and comparison SHAs with 90.00% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±5.00% change in their mean optimization goal are discarded. An experiment is erratic if its coefficient of variation is greater than 0.1. The abbreviated table will be omitted if no interesting change is observed. No interesting changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%. Fine details of change detection per experiment.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable, I don't have any issues! Excited to see this make it in :)
✅ Deploy Preview for vrl-playground canceled.
|
Regression Detector ResultsRun ID: 0e981b33-098b-460b-a9fe-6d1234089bf8 ExplanationA regression test is an integrated performance test for The table below, if present, lists those experiments that have experienced a statistically significant change in mean optimization goal performance between baseline and comparison SHAs with 90.00% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±5.00% change in their mean optimization goal are discarded. An experiment is erratic if its coefficient of variation is greater than 0.1. The abbreviated table will be omitted if no interesting change is observed. No interesting changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%. Fine details of change detection per experiment.
|
Regression Detector ResultsRun ID: 4a63dcbd-a9e2-4169-b89d-c5d275781b2e ExplanationA regression test is an integrated performance test for The table below, if present, lists those experiments that have experienced a statistically significant change in mean optimization goal performance between baseline and comparison SHAs with 90.00% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±5.00% change in their mean optimization goal are discarded. An experiment is erratic if its coefficient of variation is greater than 0.1. The abbreviated table will be omitted if no interesting change is observed. No interesting changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%. Fine details of change detection per experiment.
|
At last, merging this one in. |
Woohoo! thanks for getting this over the line. Will try and do some testing of this as we have some pretty high scale use cases |
# [1.13.0](answerbook/vector@v1.12.1...v1.13.0) (2023-09-13) ### Bug Fixes * **appsignal sink**: Add TLS config option (vectordotdev#17122) [198068c](answerbook/vector@198068c) - GitHub * **buffers**: correctly handle partial writes in reader seek during initialization (vectordotdev#17099) [a791595](answerbook/vector@a791595) - GitHub * **config**: recurse through schema refs when determining eligibility for unevaluated properties (vectordotdev#17150) [71d1bf6](answerbook/vector@71d1bf6) - GitHub * **docker_logs source**: Support tcp schema [e1c0c02](answerbook/vector@e1c0c02) - GitHub * **elasticsearch sink**: Elasticsearch sink with api_version set to "auto" does not recognize the API version of ES6 as V6 (vectordotdev#17226) (vectordotdev#17227) [9b6ef24](answerbook/vector@9b6ef24) - GitHub * **gcp_stackdriver_metrics sink**: Call function to regenerate auth token (vectordotdev#17297) [bf7904b](answerbook/vector@bf7904b) - GitHub * **influxdb_logs**: encode influx line when no tags present (vectordotdev#17029) [c3aa14f](answerbook/vector@c3aa14f) - GitHub * **reduce transform**: Revert flushing on interval change to `expire_metrics_ms` (vectordotdev#17084) [e86b155](answerbook/vector@e86b155) - GitHub * **releasing**: Fix globbing of release artifacts for GitHub (vectordotdev#17114) [7fe089c](answerbook/vector@7fe089c) - GitHub * **schemas**: Dont panic with non object field kinds (vectordotdev#17140) [1e43208](answerbook/vector@1e43208) - GitHub ### Chores * (syslog source): add source_ip to some syslog tests (vectordotdev#17235) [29c34c0](answerbook/vector@29c34c0) - GitHub * add note to DEVELOPING.md re panics (vectordotdev#17277) [03e905e](answerbook/vector@03e905e) - GitHub * Add UX note about encoding of log_schema keys (vectordotdev#17266) [dc6e54c](answerbook/vector@dc6e54c) - GitHub * **administration**: add `appsignal` to codeowners (vectordotdev#17127) [7b15d19](answerbook/vector@7b15d19) - GitHub * **buffer**: tidy up some of the module level docs for `disk_v2` (vectordotdev#17093) [edaa612](answerbook/vector@edaa612) - GitHub * **ci**: bump docker/metadata-action from 4.3.0 to 4.4.0 (vectordotdev#17170) [854d71e](answerbook/vector@854d71e) - GitHub * **ci**: Disable `appsignal` integration test until CA issues are resolved (vectordotdev#17109) [f3b5d42](answerbook/vector@f3b5d42) - GitHub * **ci**: Disable scheduled runs of Baseline Timings workflow (vectordotdev#17281) [4335b0a](answerbook/vector@4335b0a) - GitHub * **ci**: Fix event assertions for `aws_ec2_metadata` transform (vectordotdev#17413) [da36fb6](answerbook/vector@da36fb6) - GitHub * **ci**: Increase timeout for integration tests (vectordotdev#17326) [e1f125a](answerbook/vector@e1f125a) - GitHub * **ci**: Increase timeout for integration tests to 30m (vectordotdev#17350) [5d3f619](answerbook/vector@5d3f619) - GitHub * **ci**: re-enable `appsignal` integration test (vectordotdev#17111) [48fc574](answerbook/vector@48fc574) - GitHub * **ci**: Remove ci-sweep tasks (vectordotdev#17415) [5c33f99](answerbook/vector@5c33f99) - GitHub * **ci**: remove unnecessary dep install (vectordotdev#17128) [f56d1ef](answerbook/vector@f56d1ef) - GitHub * **ci**: Try to fix apt retries (vectordotdev#17393) [6b3db04](answerbook/vector@6b3db04) - GitHub * **ci**: update unsupported ubuntu version runners (vectordotdev#17113) [e7c4815](answerbook/vector@e7c4815) - GitHub * **ci**: use python v3.8 in ubuntu 20.04 runner (vectordotdev#17116) [7a40c81](answerbook/vector@7a40c81) - GitHub * **config**: begin laying out primitives for programmatically querying schema (vectordotdev#17130) [aad8115](answerbook/vector@aad8115) - GitHub * **config**: emit human-friendly version of enum variant/property names in schema (vectordotdev#17171) [3b38ba8](answerbook/vector@3b38ba8) - GitHub * **config**: improve config schema output with more precise `unevaluatedProperties` + schema ref flattening (vectordotdev#17026) [2d72f82](answerbook/vector@2d72f82) - GitHub * **deps**: Add 3rd party license file and CI checks (vectordotdev#17344) [7350e1a](answerbook/vector@7350e1a) - GitHub * **deps**: bump anyhow from 1.0.70 to 1.0.71 (vectordotdev#17300) [6a5af3b](answerbook/vector@6a5af3b) - GitHub * **deps**: bump assert_cmd from 2.0.10 to 2.0.11 (vectordotdev#17290) [c4784fd](answerbook/vector@c4784fd) - GitHub * **deps**: bump async-compression from 0.3.15 to 0.4.0 (vectordotdev#17365) [b9aac47](answerbook/vector@b9aac47) - GitHub * **deps**: bump async-graphql from 5.0.7 to 5.0.8 (vectordotdev#17357) [05a4f17](answerbook/vector@05a4f17) - GitHub * **deps**: bump async-graphql-warp from 5.0.7 to 5.0.8 (vectordotdev#17367) [693584e](answerbook/vector@693584e) - GitHub * **deps**: bump async-stream from 0.3.4 to 0.3.5 (vectordotdev#17076) [c29c817](answerbook/vector@c29c817) - GitHub * **deps**: bump aws-sigv4 from 0.55.0 to 0.55.1 (vectordotdev#17138) [dbb3f25](answerbook/vector@dbb3f25) - GitHub * **deps**: bump axum from 0.6.12 to 0.6.18 (vectordotdev#17257) [41ac76e](answerbook/vector@41ac76e) - GitHub * **deps**: bump cached from 0.42.0 to 0.43.0 (vectordotdev#17118) [f90b3b3](answerbook/vector@f90b3b3) - GitHub * **deps**: bump chrono-tz from 0.8.1 to 0.8.2 (vectordotdev#17088) [623b838](answerbook/vector@623b838) - GitHub * **deps**: bump clap_complete from 4.2.0 to 4.2.1 (vectordotdev#17229) [d286d16](answerbook/vector@d286d16) - GitHub * **deps**: bump clap_complete from 4.2.1 to 4.2.2 (vectordotdev#17359) [565668e](answerbook/vector@565668e) - GitHub * **deps**: bump clap_complete from 4.2.2 to 4.2.3 (vectordotdev#17383) [111cd07](answerbook/vector@111cd07) - GitHub * **deps**: bump console-subscriber from 0.1.8 to 0.1.9 (vectordotdev#17358) [97b862c](answerbook/vector@97b862c) - GitHub * **deps**: bump directories from 5.0.0 to 5.0.1 (vectordotdev#17271) [be69f5f](answerbook/vector@be69f5f) - GitHub * **deps**: bump dunce from 1.0.3 to 1.0.4 (vectordotdev#17244) [cfc387d](answerbook/vector@cfc387d) - GitHub * **deps**: bump enumflags2 from 0.7.5 to 0.7.6 (vectordotdev#17079) [cbc17be](answerbook/vector@cbc17be) - GitHub * **deps**: bump enumflags2 from 0.7.6 to 0.7.7 (vectordotdev#17206) [c80c5eb](answerbook/vector@c80c5eb) - GitHub * **deps**: bump flate2 from 1.0.25 to 1.0.26 (vectordotdev#17320) [ef13370](answerbook/vector@ef13370) - GitHub * **deps**: bump getrandom from 0.2.8 to 0.2.9 (vectordotdev#17101) [d53240b](answerbook/vector@d53240b) - GitHub * **deps**: bump h2 from 0.3.18 to 0.3.19 (vectordotdev#17388) [6088abd](answerbook/vector@6088abd) - GitHub * **deps**: bump hashlink from 0.8.1 to 0.8.2 (vectordotdev#17419) [01b3cd7](answerbook/vector@01b3cd7) - GitHub * **deps**: bump hyper from 0.14.25 to 0.14.26 (vectordotdev#17347) [c43dcfd](answerbook/vector@c43dcfd) - GitHub * **deps**: bump inventory from 0.3.5 to 0.3.6 (vectordotdev#17401) [5b5ad16](answerbook/vector@5b5ad16) - GitHub * **deps**: bump libc from 0.2.140 to 0.2.141 (vectordotdev#17104) [dd9608a](answerbook/vector@dd9608a) - GitHub * **deps**: bump libc from 0.2.141 to 0.2.142 (vectordotdev#17273) [bc618a2](answerbook/vector@bc618a2) - GitHub * **deps**: bump libc from 0.2.142 to 0.2.143 (vectordotdev#17338) [6afe206](answerbook/vector@6afe206) - GitHub * **deps**: bump libc from 0.2.143 to 0.2.144 (vectordotdev#17346) [99b8dc1](answerbook/vector@99b8dc1) - GitHub * **deps**: bump memmap2 from 0.5.10 to 0.6.0 (vectordotdev#17355) [dae0c6a](answerbook/vector@dae0c6a) - GitHub * **deps**: bump memmap2 from 0.6.0 to 0.6.1 (vectordotdev#17364) [58ba741](answerbook/vector@58ba741) - GitHub * **deps**: bump metrics, metrics-tracing-context, metrics-util (vectordotdev#17336) [9a723e3](answerbook/vector@9a723e3) - GitHub * **deps**: bump mlua from 0.8.8 to 0.8.9 (vectordotdev#17423) [57f8bd4](answerbook/vector@57f8bd4) - GitHub * **deps**: bump mock_instant from 0.2.1 to 0.3.0 (vectordotdev#17210) [40c9afc](answerbook/vector@40c9afc) - GitHub * **deps**: bump mongodb from 2.4.0 to 2.5.0 (vectordotdev#17337) [64f4f69](answerbook/vector@64f4f69) - GitHub * **deps**: bump nkeys from 0.2.0 to 0.3.0 (vectordotdev#17421) [3320eda](answerbook/vector@3320eda) - GitHub * **deps**: bump notify from 5.1.0 to 6.0.0 (vectordotdev#17422) [58603b9](answerbook/vector@58603b9) - GitHub * **deps**: bump num_enum from 0.5.11 to 0.6.0 (vectordotdev#17106) [42f298b](answerbook/vector@42f298b) - GitHub * **deps**: bump num_enum from 0.6.0 to 0.6.1 (vectordotdev#17272) [f696e7b](answerbook/vector@f696e7b) - GitHub * **deps**: bump opendal from 0.30.5 to 0.31.0 (vectordotdev#17119) [8762563](answerbook/vector@8762563) - GitHub * **deps**: bump opendal from 0.31.0 to 0.33.2 (vectordotdev#17286) [3d41931](answerbook/vector@3d41931) - GitHub * **deps**: bump opendal from 0.33.2 to 0.34.0 (vectordotdev#17354) [ae602da](answerbook/vector@ae602da) - GitHub * **deps**: bump openssl from 0.10.48 to 0.10.50 (vectordotdev#17087) [9a56ed8](answerbook/vector@9a56ed8) - GitHub * **deps**: bump openssl from 0.10.50 to 0.10.52 (vectordotdev#17299) [0ecceb3](answerbook/vector@0ecceb3) - GitHub * **deps**: bump pin-project from 1.0.12 to 1.1.0 (vectordotdev#17385) [e8d3002](answerbook/vector@e8d3002) - GitHub * **deps**: bump prettydiff from 0.6.2 to 0.6.4 (vectordotdev#17089) [e090610](answerbook/vector@e090610) - GitHub * **deps**: bump prettydiff from 0.6.2 to 0.6.4 (vectordotdev#17315) [a1ec68d](answerbook/vector@a1ec68d) - GitHub * **deps**: bump proc-macro2 from 1.0.55 to 1.0.56 (vectordotdev#17103) [6f74523](answerbook/vector@6f74523) - GitHub * **deps**: bump proc-macro2 from 1.0.56 to 1.0.57 (vectordotdev#17400) [a6e1ae7](answerbook/vector@a6e1ae7) - GitHub * **deps**: bump prost-build from 0.11.8 to 0.11.9 (vectordotdev#17260) [a88aba4](answerbook/vector@a88aba4) - GitHub * **deps**: bump quote from 1.0.26 to 1.0.27 (vectordotdev#17348) [f81ff18](answerbook/vector@f81ff18) - GitHub * **deps**: bump rdkafka from 0.29.0 to 0.30.0 (vectordotdev#17387) [9703188](answerbook/vector@9703188) - GitHub * **deps**: bump regex from 1.7.3 to 1.8.1 (vectordotdev#17222) [410aa3c](answerbook/vector@410aa3c) - GitHub * **deps**: bump reqwest from 0.11.16 to 0.11.17 (vectordotdev#17316) [09176ec](answerbook/vector@09176ec) - GitHub * **deps**: bump security-framework from 2.8.2 to 2.9.0 (vectordotdev#17386) [1287168](answerbook/vector@1287168) - GitHub * **deps**: bump serde from 1.0.159 to 1.0.160 (vectordotdev#17270) [036ad4a](answerbook/vector@036ad4a) - GitHub * **deps**: bump serde from 1.0.160 to 1.0.162 (vectordotdev#17317) [79e97a2](answerbook/vector@79e97a2) - GitHub * **deps**: bump serde from 1.0.162 to 1.0.163 (vectordotdev#17366) [9852c17](answerbook/vector@9852c17) - GitHub * **deps**: bump serde_json from 1.0.95 to 1.0.96 (vectordotdev#17258) [7570bb3](answerbook/vector@7570bb3) - GitHub * **deps**: bump serde_with from 2.3.1 to 2.3.2 (vectordotdev#17090) [adbf4d5](answerbook/vector@adbf4d5) - GitHub * **deps**: bump serde_yaml from 0.9.19 to 0.9.21 (vectordotdev#17120) [d6f2625](answerbook/vector@d6f2625) - GitHub * **deps**: bump socket2 from 0.4.7 to 0.5.2 (vectordotdev#17121) [db39d83](answerbook/vector@db39d83) - GitHub * **deps**: bump socket2 from 0.5.2 to 0.5.3 (vectordotdev#17384) [ac51b8a](answerbook/vector@ac51b8a) - GitHub * **deps**: bump syslog from 6.0.1 to 6.1.0 (vectordotdev#17301) [61e6154](answerbook/vector@61e6154) - GitHub * **deps**: bump tokio from 1.27.0 to 1.28.0 (vectordotdev#17231) [8067f84](answerbook/vector@8067f84) - GitHub * **deps**: bump tokio from 1.28.0 to 1.28.1 (vectordotdev#17368) [ae6a51b](answerbook/vector@ae6a51b) - GitHub * **deps**: bump tokio-stream from 0.1.12 to 0.1.14 (vectordotdev#17339) [80c8247](answerbook/vector@80c8247) - GitHub * **deps**: bump tokio-tungstenite from 0.18.0 to 0.19.0 (vectordotdev#17404) [ae1dd6e](answerbook/vector@ae1dd6e) - GitHub * **deps**: bump tonic from 0.8.3 to 0.9.1 (vectordotdev#17077) [eafba69](answerbook/vector@eafba69) - GitHub * **deps**: bump tonic from 0.9.1 to 0.9.2 (vectordotdev#17221) [aa9cbd0](answerbook/vector@aa9cbd0) - GitHub * **deps**: bump tonic-build from 0.8.4 to 0.9.2 (vectordotdev#17274) [e0a07c6](answerbook/vector@e0a07c6) - GitHub * **deps**: bump tracing-subscriber from 0.3.16 to 0.3.17 (vectordotdev#17268) [1406c08](answerbook/vector@1406c08) - GitHub * **deps**: bump typetag from 0.2.7 to 0.2.8 (vectordotdev#17302) [c8e0e5f](answerbook/vector@c8e0e5f) - GitHub * **deps**: bump uuid from 1.3.0 to 1.3.1 (vectordotdev#17091) [9cc2f1d](answerbook/vector@9cc2f1d) - GitHub * **deps**: bump uuid from 1.3.0 to 1.3.2 (vectordotdev#17256) [bc6f7fd](answerbook/vector@bc6f7fd) - GitHub * **deps**: bump uuid from 1.3.2 to 1.3.3 (vectordotdev#17403) [3a3fe63](answerbook/vector@3a3fe63) - GitHub * **deps**: bump warp from 0.3.4 to 0.3.5 (vectordotdev#17288) [d8c1f12](answerbook/vector@d8c1f12) - GitHub * **deps**: bump wasm-bindgen from 0.2.84 to 0.2.85 (vectordotdev#17356) [ea24b4d](answerbook/vector@ea24b4d) - GitHub * **deps**: bump wasm-bindgen from 0.2.85 to 0.2.86 (vectordotdev#17402) [0518176](answerbook/vector@0518176) - GitHub * **deps**: bump wiremock from 0.5.17 to 0.5.18 (vectordotdev#17092) [51312aa](answerbook/vector@51312aa) - GitHub * **deps**: Fix up missing license (vectordotdev#17379) [a2b8903](answerbook/vector@a2b8903) - GitHub * **deps**: Reset dependencies bumped by a61dea1 (vectordotdev#17100) [887d6d7](answerbook/vector@887d6d7) - GitHub * **deps**: true up cargo.lock (vectordotdev#17149) [10fce65](answerbook/vector@10fce65) - GitHub * **deps**: Update h2 (vectordotdev#17189) [a2882f3](answerbook/vector@a2882f3) - GitHub * **deps**: Upgrade cue to 0.5.0 (vectordotdev#17204) [d396320](answerbook/vector@d396320) - GitHub * **deps**: Upgrade Debian to bullseye for distroless image (vectordotdev#17160) [c304a8c](answerbook/vector@c304a8c) - GitHub * **deps**: Upgrade rust to 1.69.0 (vectordotdev#17194) [ef15696](answerbook/vector@ef15696) - GitHub * **dev**: Add note about generating licenses to CONTRIBUTING.md (vectordotdev#17410) [539f379](answerbook/vector@539f379) - GitHub * **dev**: ignore `.helix` dir (vectordotdev#17203) [32a935b](answerbook/vector@32a935b) - GitHub * **dev**: Install the correct `mold` based on CPU architecture (vectordotdev#17248) [4b80c71](answerbook/vector@4b80c71) - GitHub * **dev**: remove editors from gitignore (vectordotdev#17267) [61c0d76](answerbook/vector@61c0d76) - GitHub * **docs**: Add Enterprise link and update Support link (vectordotdev#17408) [5184d50](answerbook/vector@5184d50) - GitHub * **docs**: Add missing 0.28.2 version [38607cd](answerbook/vector@38607cd) - Jesse Szwedko * **docs**: Clarify `key_field` for `sample` and `throttle` transforms (vectordotdev#17372) [d1e5588](answerbook/vector@d1e5588) - GitHub * **docs**: Document event type conditions (vectordotdev#17311) [a9c8dc8](answerbook/vector@a9c8dc8) - GitHub * **docs**: make doc style edits (vectordotdev#17155) [65a8856](answerbook/vector@65a8856) - GitHub * **docs**: Remove trailing, unmatched quote (vectordotdev#17163) [3c92556](answerbook/vector@3c92556) - GitHub * **docs**: Remove unneeded `yaml` dependency from website (vectordotdev#17215) [752d424](answerbook/vector@752d424) - GitHub * **docs**: Update component statuses 2023Q2 (vectordotdev#17362) [22cda94](answerbook/vector@22cda94) - GitHub * **docs**: update the `v0.28.0` upgrade guide with note about `datadog_logs` sink `hostname` key (vectordotdev#17156) [c169131](answerbook/vector@c169131) - GitHub * **external docs**: correctly mark some sinks as stateful (vectordotdev#17085) [64d560d](answerbook/vector@64d560d) - GitHub * **loki sink**: warn on label expansions and collisions (vectordotdev#17052) [f06692b](answerbook/vector@f06692b) - GitHub * **pulsar**: pulsar-rs bump to v5.1.1 (vectordotdev#17159) [68b54a9](answerbook/vector@68b54a9) - GitHub * Re-add transform definitions (vectordotdev#17152) [9031d0f](answerbook/vector@9031d0f) - GitHub * Regen docs for sample and throttle (vectordotdev#17390) [6c57ca0](answerbook/vector@6c57ca0) - GitHub * **releasing**: Add known issues fixed by 0.29.1 (vectordotdev#17218) [40d543a](answerbook/vector@40d543a) - GitHub * **releasing**: Bump Vector version to 0.30.0 (vectordotdev#17134) [3834612](answerbook/vector@3834612) - GitHub * **releasing**: Fix homebrew release script (vectordotdev#17131) [cfbf233](answerbook/vector@cfbf233) - Jesse Szwedko * **releasing**: Fix release channels (vectordotdev#17133) [58b44e8](answerbook/vector@58b44e8) - Jesse Szwedko * **releasing**: Prepare v0.28.2 release [a61dea1](answerbook/vector@a61dea1) - Jesse Szwedko * **releasing**: Prepare v0.29.0 release [4bf6805](answerbook/vector@4bf6805) - Jesse Szwedko * **releasing**: Prepare v0.30.0 release [38c3f0b](answerbook/vector@38c3f0b) - Jesse Szwedko * **releasing**: Regenerate Kubernetes manifests for 0.21.2 (vectordotdev#17108) [fd13d64](answerbook/vector@fd13d64) - GitHub * **releasing**: Regenerate manifests for 0.21.1 chart (vectordotdev#17187) [1f0de6b](answerbook/vector@1f0de6b) - GitHub * **releasing**: Regenerate manifests for 0.22.0 chart (vectordotdev#17135) [e7ea0a8](answerbook/vector@e7ea0a8) - GitHub * **releasing**: update patch release template with extra step details [27c3526](answerbook/vector@27c3526) - GitHub * Remove skaffold from project (vectordotdev#17145) [d245927](answerbook/vector@d245927) - GitHub * remove transform type coercion (vectordotdev#17411) [b6c7e0a](answerbook/vector@b6c7e0a) - GitHub * Revert transform definitions (vectordotdev#17146) [05a3f44](answerbook/vector@05a3f44) - GitHub * **socket source**: Remove deprecated `max_length` setting from `tcp` and `unix` modes. (vectordotdev#17162) [9ecfc8c](answerbook/vector@9ecfc8c) - GitHub * **syslog source**: remove the remove of source_ip (vectordotdev#17184) [5dff0ed](answerbook/vector@5dff0ed) - GitHub * **topology**: Transform outputs hash table of OutputId -> Definition (vectordotdev#17059) [1bdb24d](answerbook/vector@1bdb24d) - GitHub * Upgrade `VRL` to `0.3.0` (vectordotdev#17325) [4911d36](answerbook/vector@4911d36) - GitHub ### Features * adds 'metric_name' field to internal logs for the tag_cardinality_limit component (vectordotdev#17295) [4317340](answerbook/vector@4317340) - GitHub * **codecs**: Add full codec support to AWS S3 source/sink (vectordotdev#17098) [d648c86](answerbook/vector@d648c86) - GitHub * **kubernetes_logs**: use kube-apiserver cache for list requests (vectordotdev#17095) [e46fae7](answerbook/vector@e46fae7) - GitHub * Merge with upstream v0.30.0 [6b93177](answerbook/vector@6b93177) - GitHub [LOG-17997](https://logdna.atlassian.net/browse/LOG-17997) * **new sink**: Initial `datadog_events` sink (vectordotdev#7678) [fef3810](answerbook/vector@fef3810) - Jesse Szwedko * Update VRL library [6ace1e6](answerbook/vector@6ace1e6) - Jorge Bay ### Miscellaneous * Merge branch 'master' [d4b35bb](answerbook/vector@d4b35bb) - Jorge Bay * Merge tag 'v0.30.0' into update-upstream [ee2f300](answerbook/vector@ee2f300) - Jorge Bay * Merge commit '9031d0faa2811187874364e1b5a3305c9ed0c0da' into update-upstream [c12faae](answerbook/vector@c12faae) - Jorge Bay * Improve tokio::select behavior for kafka source and finalizers (vectordotdev#17380) [d4df21c](answerbook/vector@d4df21c) - GitHub * Prepare v0.29.1 release [21beed7](answerbook/vector@21beed7) - Kyle Criddle * Add a quickfix to handle special capitalization cases (vectordotdev#17141) [ba63e21](answerbook/vector@ba63e21) - GitHub * Adjust doc comment locations (vectordotdev#17154) [730c938](answerbook/vector@730c938) - GitHub * **amqp sink**: Support AMQ Properties (content-type) in AMQP sink (vectordotdev#17174) [c10d30b](answerbook/vector@c10d30b) - GitHub * **aws provider**: Let `region` be configured for default authentication (vectordotdev#17414) [c81ad30](answerbook/vector@c81ad30) - GitHub * **core**: add unit test (ignored) for support for encoding special chars in `ProxyConfig` (vectordotdev#17148) [5247972](answerbook/vector@5247972) - GitHub * **loki sink**: Fix formatting in labels example (vectordotdev#17396) [f3734e8](answerbook/vector@f3734e8) - GitHub * **observability**: Log underlying error for unhandled HTTP errors (vectordotdev#17327) [bf8376c](answerbook/vector@bf8376c) - GitHub * **observability**: Update internal log rate limiting messages (vectordotdev#17394) [1951535](answerbook/vector@1951535) - GitHub * **pulsar sink**: Refactor to use StreamSink (vectordotdev#14345) [1e97a2f](answerbook/vector@1e97a2f) - GitHub * **topology**: Add source id to metadata (vectordotdev#17369) [c683999](answerbook/vector@c683999) - GitHub * **vdev**: Load compose files and inject network block (vectordotdev#17025) [5d88655](answerbook/vector@5d88655) - GitHub
This commit heavily refactors the Pulsar Sink to use the StreamSink interface and is modeled after the Kafka Sink.
It also adds additional features that bring it in line with Kafka Sink feature set.
This includes:
This work is heavily modeled after Kafka sink. This means there has been some duplication of some utility code. However, it has not been refactored to remove the duplication as there wasn't a clear pattern of where such shared code should be put.
Additionally, this refactor seems to be much simpler by using StreamSink but does require some workarounds limitations in the Pulsar client library by wrapping certain resources in Arc that may have performance implications. I am not famaliar enough to know if there might be some efficiencies by structuring this differently.
Remaining work: