Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: filter events support #3882

Merged
merged 5 commits into from
Oct 3, 2023
Merged

feat: filter events support #3882

merged 5 commits into from
Oct 3, 2023

Conversation

chandumlg
Copy link
Member

@chandumlg chandumlg commented Sep 18, 2023

Description

This pr adds the following features:

  1. If transformer returns a statusCode: 298 after transforming an event, that event will be treated as a filtered event. Job (corresponding to that event) will be marked as filtered in jobsdb and the same state is forwarded to reporting service.

  2. If transformer returns a statusCode: 299 after transforming an event (esp: during router transform), that event will be treated as a successful one and router doesn't need to perform any more actions. Job will be marked as succeeded in jobsdb.

  3. Event filter module in processor, instead of silently filtering the events, returns the filtered events with statusCode: 298. Processor will handle them accordingly.

More details: https://www.notion.so/rudderstacks/Introduce-new-code-b-w-transformer-server-to-filter-out-events-8e8faea134474ac08bd85d63d00eceab

Linear Ticket

https://linear.app/rudderstack/issue/INT-515/implement-server-side-changes-to-accept-new-status-code-from

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

@codecov
Copy link

codecov bot commented Sep 18, 2023

Codecov Report

Attention: 32 lines in your changes are missing coverage. Please review.

Comparison is base (1fcabdd) 70.20% compared to head (ebbae96) 70.22%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3882      +/-   ##
==========================================
+ Coverage   70.20%   70.22%   +0.01%     
==========================================
  Files         357      357              
  Lines       53160    53244      +84     
==========================================
+ Hits        37321    37388      +67     
- Misses      13585    13596      +11     
- Partials     2254     2260       +6     
Files Coverage Δ
jobsdb/jobsdb.go 73.23% <ø> (ø)
router/handle.go 71.89% <100.00%> (+0.87%) ⬆️
services/rsources/stats_collector.go 90.90% <100.00%> (ø)
processor/eventfilter/eventfilter.go 92.07% <95.00%> (+0.30%) ⬆️
processor/processor.go 88.41% <97.53%> (+0.87%) ⬆️
processor/transformer/transformer.go 89.61% <70.00%> (-0.78%) ⬇️
router/transformer/transformer.go 69.47% <66.66%> (-0.32%) ⬇️
processor/trackingplan.go 56.81% <50.00%> (-0.22%) ⬇️
router/worker.go 81.91% <60.00%> (-0.50%) ⬇️
router/internal/eventorder/eventorder.go 87.71% <15.38%> (-4.22%) ⬇️

... and 10 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@chandumlg chandumlg force-pushed the feat.filter-events branch 2 times, most recently from 50a9ec1 to 04bcd43 Compare September 24, 2023 23:38
@chandumlg chandumlg marked this pull request as ready for review September 24, 2023 23:39
@chandumlg chandumlg changed the title feat: filter events feat: filter events support Sep 24, 2023
jobsdb/jobsdb.go Show resolved Hide resolved
utils/types/types.go Outdated Show resolved Hide resolved

var filtered, failed []transformer.TransformerResponse
for _, event := range response.FailedEvents {
if event.StatusCode == types.FilterEventCode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the filtered event code be returned by the user transformer as well in case the event is dropped?
IMO to be confident we need integration tests, with an actual transformer, reporting enabled (but not sending reports to a remote server) to verify that filtered events are being recorded in reporting in various stages of the processor's pipeline:

  • 1 scenario for message filter
  • 1 scenario for destination transformation
  • 1 scenario for user transformation
  • 1 scenario for tracking plan(?)

cc @pChondros

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We can use the same filtered event code for dropped events during user transformation as well. When the changes for that are done by the respective team, rudder-server will report them as filtered. (This will be communicated to them. CC @sivashanmukh.) Till then, things will continue to work as is.

router/worker.go Outdated Show resolved Hide resolved
processor/processor.go Outdated Show resolved Hide resolved
processor/processor.go Outdated Show resolved Hide resolved
@@ -1187,26 +1230,26 @@ func (proc *Handle) getFailedEventJobs(response transformer.Response, commonMeta
)
payload, err := jsonfast.Marshal(messages)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea why we are storing a single job containing an array of failed payloads to proc_errors? @cisse21?
And we are using this single job to calculate dropped events as well. Wouldn't that cause skewed rsources metrics @Sidddddarth?

@@ -806,7 +821,7 @@ func (w *worker) sendRouterResponseCountStat(status *jobsdb.JobStatusT, destinat
destinationTag := misc.GetTagName(destination.ID, destination.Name)
var alert bool
alert = w.allowRouterAbortedAlert(errorAt)
if status.JobState == jobsdb.Succeeded.State {
if status.JobState == jobsdb.Succeeded.State || status.JobState == jobsdb.Filtered.State {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any adaptations needed in sendEventDeliveryStat?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, No.

router response counts -> responses router received for an event. respStatusCode label will reveal about the status of the delivery.

event delivery stats -> stats for successfully delivered events (count & time spent by the event in the pipeline since it is received.)

@chandumlg chandumlg merged commit 7ead8a9 into master Oct 3, 2023
@chandumlg chandumlg deleted the feat.filter-events branch October 3, 2023 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants