-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[kbn/journeys] fix hanging on telemetry call & improve logging #175194
[kbn/journeys] fix hanging on telemetry call & improve logging #175194
Conversation
const id = ++this.telemetryTrackerCount; | ||
this.log.info(`Waiting for telemetry request #${id} to complete`); | ||
return Rx.of(requestSuccess$).pipe( | ||
Rx.timeout(60_000), | ||
Rx.catchError((error) => { | ||
if (error instanceof Error && error.name === 'TimeoutError') { | ||
this.log.error(`Timeout error occurred: ${error.message}`); | ||
} | ||
// Rethrow the error if it's not a TimeoutError | ||
return Rx.throwError(() => new Error(error)); | ||
}), | ||
Rx.tap({ | ||
complete: () => this.log.debug(`Telemetry request #${id} complete`), | ||
complete: () => this.log.info(`Telemetry request #${id} complete`), | ||
error: (err) => this.log.error(`Telemetry request was not processed: ${err.message}`), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a few changes to make it work correctly:
telemetryTrackerCount
was always returning1
since it was misplaced before observables.- delete
requestFailure$
since we don't have to do anything in case of Telemetry request failure event. We fully rely on kibana service to make things the right way. - adding timeout of 60 sec to issue error notification in case of no new emitted events;
- log messages on both complete & error
└-> Go to Transform Wizard
└-> "before each" hook: global before each for "Go to Transform Wizard"
│ debg APM DEBUG start span {"span":"fe0fad39231fa33e","parent":"9cf0f1c9d8df9a6a","trace":"ada4a00b5cac22401cd1ed060e28e4f2","name":"step: Go to Transform Wizard","type":"step","subtype":null,"action":null}
│ debg APM DEBUG supersedeWithSpanRunContext(<Span fe0fad39231fa33e>)
│ info Waiting for telemetry request #2 to complete
│ info Telemetry request #2 complete
│ info Waiting for telemetry request #3 to complete
│ info Telemetry request #3 complete
│ info Waiting for telemetry request #4 to complete
│ info Telemetry request #4 complete
│ debg APM DEBUG ended span {"span":"fe0fad39231fa33e","parent":"9cf0f1c9d8df9a6a","trace":"ada4a00b5cac22401cd1ed060e28e4f2","name":"step: Go to Transform Wizard","type":"step","subtype":null,"action":null}
│ debg APM DEBUG addEndedSpan(step: Go to Transform Wizard)
│ debg APM DEBUG encoding span {"span":"fe0fad39231fa33e","parent":"9cf0f1c9d8df9a6a","trace":"ada4a00b5cac22401cd1ed060e28e4f2","name":"step: Go to Transform Wizard","type":"step"}
└- ✓ pass (1.0m)
async waitForListViewTable() { | ||
await this.page.waitForSelector(subj('table-is-ready'), { state: 'visible' }); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving to page object and updating tests for consistency.
await page.waitForSelector(subj('userMenuButton'), { | ||
state: 'attached', | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve waiting for home page after login
/ci |
💚 Build Succeeded
Metrics [docs]
History
To update your PR or re-run it, just comment with: cc @dmlemeshko |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ic#175194) ## Summary This PR fixes the issue causing (mostly) [login journey](https://buildkite.com/elastic/kibana-single-user-performance/builds/12398#018d1149-cc2e-4591-a61c-176768081e2c) stuck for 14 min waiting for Telemetry call response. <img width="964" alt="Screenshot 2024-01-22 at 11 12 24" src="https://github.com/elastic/kibana/assets/10977896/8cadc2ec-ee84-42f6-8a0c-ad949367429c"> I believe the issue was in how we handle the Observables for request events. I added extra comment in the particular code change. I no longer can reproduce it, all the events are reported correctly: <img width="964" alt="image" src="https://github.com/elastic/kibana/assets/10977896/fa2c4b27-dcf2-480b-a07f-aeb23045149a"> Logs cleaning is to log in console only performance metrics event but not all EBT elements. Also not to report some browser errors that not Kibana specific. Testing: run the following script 3-4 times ``` PERFORMANCE_ENABLE_TELEMETRY=1 node scripts/run_performance.js --journey-path x-pack/performance/journeys/login.ts ``` - script is completed without delays (e.g. doesn't hang on after hook in TEST phase) - telemetry requests are logged with correct counter and all finished, e.g. `Waiting for telemetry request elastic#2 to complete` is followed by `Telemetry request elastic#2 complete` - only events started with `Report event "performance_metric"` are in console output (cherry picked from commit 8be528e)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
…ging (#175194) (#175247) # Backport This will backport the following commits from `main` to `8.12`: - [[kbn/journeys] fix hanging on telemetry call & improve logging (#175194)](#175194) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Dzmitry Lemechko","email":"dzmitry.lemechko@elastic.co"},"sourceCommit":{"committedDate":"2024-01-22T17:18:10Z","message":"[kbn/journeys] fix hanging on telemetry call & improve logging (#175194)\n\n## Summary\r\n\r\nThis PR fixes the issue causing (mostly) [login\r\njourney](https://buildkite.com/elastic/kibana-single-user-performance/builds/12398#018d1149-cc2e-4591-a61c-176768081e2c)\r\nstuck for 14 min waiting for Telemetry call response.\r\n\r\n\r\n<img width=\"964\" alt=\"Screenshot 2024-01-22 at 11 12 24\"\r\nsrc=\"https://github.com/elastic/kibana/assets/10977896/8cadc2ec-ee84-42f6-8a0c-ad949367429c\">\r\n\r\nI believe the issue was in how we handle the Observables for request\r\nevents. I added extra comment in the particular code change.\r\n\r\nI no longer can reproduce it, all the events are reported correctly:\r\n<img width=\"964\" alt=\"image\"\r\nsrc=\"https://github.com/elastic/kibana/assets/10977896/fa2c4b27-dcf2-480b-a07f-aeb23045149a\">\r\n\r\nLogs cleaning is to log in console only performance metrics event but\r\nnot all EBT elements. Also not to report some browser errors that not\r\nKibana specific.\r\n\r\n\r\nTesting:\r\n\r\nrun the following script 3-4 times\r\n```\r\nPERFORMANCE_ENABLE_TELEMETRY=1 node scripts/run_performance.js --journey-path x-pack/performance/journeys/login.ts\r\n```\r\n\r\n- script is completed without delays (e.g. doesn't hang on after hook in\r\nTEST phase)\r\n- telemetry requests are logged with correct counter and all finished,\r\ne.g. `Waiting for telemetry request #2 to complete` is followed by\r\n`Telemetry request #2 complete`\r\n- only events started with `Report event \"performance_metric\"` are in\r\nconsole output","sha":"8be528efb3169b630f44b33912adaf63bdea2a25","branchLabelMapping":{"^v8.13.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v8.12.1","v8.13.0"],"title":"[kbn/journeys] fix hanging on telemetry call & improve logging","number":175194,"url":"https://github.com/elastic/kibana/pull/175194","mergeCommit":{"message":"[kbn/journeys] fix hanging on telemetry call & improve logging (#175194)\n\n## Summary\r\n\r\nThis PR fixes the issue causing (mostly) [login\r\njourney](https://buildkite.com/elastic/kibana-single-user-performance/builds/12398#018d1149-cc2e-4591-a61c-176768081e2c)\r\nstuck for 14 min waiting for Telemetry call response.\r\n\r\n\r\n<img width=\"964\" alt=\"Screenshot 2024-01-22 at 11 12 24\"\r\nsrc=\"https://github.com/elastic/kibana/assets/10977896/8cadc2ec-ee84-42f6-8a0c-ad949367429c\">\r\n\r\nI believe the issue was in how we handle the Observables for request\r\nevents. I added extra comment in the particular code change.\r\n\r\nI no longer can reproduce it, all the events are reported correctly:\r\n<img width=\"964\" alt=\"image\"\r\nsrc=\"https://github.com/elastic/kibana/assets/10977896/fa2c4b27-dcf2-480b-a07f-aeb23045149a\">\r\n\r\nLogs cleaning is to log in console only performance metrics event but\r\nnot all EBT elements. Also not to report some browser errors that not\r\nKibana specific.\r\n\r\n\r\nTesting:\r\n\r\nrun the following script 3-4 times\r\n```\r\nPERFORMANCE_ENABLE_TELEMETRY=1 node scripts/run_performance.js --journey-path x-pack/performance/journeys/login.ts\r\n```\r\n\r\n- script is completed without delays (e.g. doesn't hang on after hook in\r\nTEST phase)\r\n- telemetry requests are logged with correct counter and all finished,\r\ne.g. `Waiting for telemetry request #2 to complete` is followed by\r\n`Telemetry request #2 complete`\r\n- only events started with `Report event \"performance_metric\"` are in\r\nconsole output","sha":"8be528efb3169b630f44b33912adaf63bdea2a25"}},"sourceBranch":"main","suggestedTargetBranches":["8.12"],"targetPullRequestStates":[{"branch":"8.12","label":"v8.12.1","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.13.0","branchLabelMappingKey":"^v8.13.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/175194","number":175194,"mergeCommit":{"message":"[kbn/journeys] fix hanging on telemetry call & improve logging (#175194)\n\n## Summary\r\n\r\nThis PR fixes the issue causing (mostly) [login\r\njourney](https://buildkite.com/elastic/kibana-single-user-performance/builds/12398#018d1149-cc2e-4591-a61c-176768081e2c)\r\nstuck for 14 min waiting for Telemetry call response.\r\n\r\n\r\n<img width=\"964\" alt=\"Screenshot 2024-01-22 at 11 12 24\"\r\nsrc=\"https://github.com/elastic/kibana/assets/10977896/8cadc2ec-ee84-42f6-8a0c-ad949367429c\">\r\n\r\nI believe the issue was in how we handle the Observables for request\r\nevents. I added extra comment in the particular code change.\r\n\r\nI no longer can reproduce it, all the events are reported correctly:\r\n<img width=\"964\" alt=\"image\"\r\nsrc=\"https://github.com/elastic/kibana/assets/10977896/fa2c4b27-dcf2-480b-a07f-aeb23045149a\">\r\n\r\nLogs cleaning is to log in console only performance metrics event but\r\nnot all EBT elements. Also not to report some browser errors that not\r\nKibana specific.\r\n\r\n\r\nTesting:\r\n\r\nrun the following script 3-4 times\r\n```\r\nPERFORMANCE_ENABLE_TELEMETRY=1 node scripts/run_performance.js --journey-path x-pack/performance/journeys/login.ts\r\n```\r\n\r\n- script is completed without delays (e.g. doesn't hang on after hook in\r\nTEST phase)\r\n- telemetry requests are logged with correct counter and all finished,\r\ne.g. `Waiting for telemetry request #2 to complete` is followed by\r\n`Telemetry request #2 complete`\r\n- only events started with `Report event \"performance_metric\"` are in\r\nconsole output","sha":"8be528efb3169b630f44b33912adaf63bdea2a25"}}]}] BACKPORT--> Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
…ic#175194) ## Summary This PR fixes the issue causing (mostly) [login journey](https://buildkite.com/elastic/kibana-single-user-performance/builds/12398#018d1149-cc2e-4591-a61c-176768081e2c) stuck for 14 min waiting for Telemetry call response. <img width="964" alt="Screenshot 2024-01-22 at 11 12 24" src="https://github.com/elastic/kibana/assets/10977896/8cadc2ec-ee84-42f6-8a0c-ad949367429c"> I believe the issue was in how we handle the Observables for request events. I added extra comment in the particular code change. I no longer can reproduce it, all the events are reported correctly: <img width="964" alt="image" src="https://github.com/elastic/kibana/assets/10977896/fa2c4b27-dcf2-480b-a07f-aeb23045149a"> Logs cleaning is to log in console only performance metrics event but not all EBT elements. Also not to report some browser errors that not Kibana specific. Testing: run the following script 3-4 times ``` PERFORMANCE_ENABLE_TELEMETRY=1 node scripts/run_performance.js --journey-path x-pack/performance/journeys/login.ts ``` - script is completed without delays (e.g. doesn't hang on after hook in TEST phase) - telemetry requests are logged with correct counter and all finished, e.g. `Waiting for telemetry request elastic#2 to complete` is followed by `Telemetry request elastic#2 complete` - only events started with `Report event "performance_metric"` are in console output
Summary
This PR fixes the issue causing (mostly) login journey stuck for 14 min waiting for Telemetry call response.
I believe the issue was in how we handle the Observables for request events. I added extra comment in the particular code change.
I no longer can reproduce it, all the events are reported correctly:
Logs cleaning is to log in console only performance metrics event but not all EBT elements. Also not to report some browser errors that not Kibana specific.
Testing:
run the following script 3-4 times
Waiting for telemetry request #2 to complete
is followed byTelemetry request #2 complete
Report event "performance_metric"
are in console output