[kbn-ftr-common-functional-services] extend retry service #178660
/ci |
…ana into ftr-services/extend-retry
/ci |
packages/kbn-ftr-common-functional-services/services/retry/retry_for_success.ts
packages/kbn-ftr-common-functional-services/services/retry/retry.ts
LGTM
Also tried a few scenarios locally and everything worked fine ✔️
Thanks for extending retryService functionality 🚀
onFailure(
  lastError,
  // optionally extend error message with description
  `reached the limit of attempts${addText(description)}: ${
😍
log,
});
{
retryCount: MAX_RETRIES,
Looks much nicer by removing the log service and using the retryService function directly 💯
What would be the default values for these configs in case we don't provide it directly? Are they configurable at the test config level?
What would be the default values for these configs in case we don't provide it directly? Are they configurable at the test config level?
I believe hardcoding a default `retryCount` value (e.g. to `2`) will lead to misuse: just as `retry.waitForWithTimeout` takes a `timeout` value, `tryWithRetries` requires `options.retryCount` for each call.
A few points:
- I strongly believe that if `retryCount` is not defined explicitly, `retry.try` is the correct function to use.
- Wrapping too many API calls with `retry.tryWithRetries` may hide instability of the product. By explicitly setting `retryCount`, folks take more responsibility for evaluating each case individually.
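The point about requiring `retryCount` on every call can be expressed in the option type itself. The following is only a minimal sketch: `TryWithRetriesOptionsSketch`, `resolveOptions` and the default values are hypothetical names and numbers chosen for illustration, not the actual FTR definitions.

```typescript
// Hypothetical option shape: retryCount is mandatory, the other fields are optional.
interface TryWithRetriesOptionsSketch {
  retryCount: number;
  retryDelay?: number; // delay between attempts, in ms
  timeout?: number; // overall time budget, in ms
}

// Fills in the optional fields only; the defaults here are assumed values,
// not taken from the FTR source. There is deliberately no default for retryCount.
function resolveOptions(
  options: TryWithRetriesOptionsSketch
): Required<TryWithRetriesOptionsSketch> {
  return {
    retryCount: options.retryCount,
    retryDelay: options.retryDelay ?? 200,
    timeout: options.timeout ?? 30000,
  };
}
```

With a shape like this, omitting `retryCount` is a compile-time error, so each call site has to state how many attempts it considers acceptable.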
@dmlemeshko thank you for extending the `retryForSuccess<T>()` utility function 👍
The changes look good to me besides a couple of things I left comments about. In general, I'm curious whether you considered using `p-retry` for the retrying functionality?
while (true) {
  if (retryCount) {
    // Use retryCount as an optional condition
    if (++attemptCounter > retryCount) {
nit: It's possible to get rid of the nested `if`s by merging the conditions:
if (retryCount && ++attemptCounter > retryCount) {
  // ...
}
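To check that the merged condition behaves like the nested version, here is a small sketch; `makeLimitCheck` is a hypothetical helper written only for this illustration. Because `&&` short-circuits, the counter is incremented only when `retryCount` is truthy, which matches the nested `if`.

```typescript
// Returns a function that reports whether the attempt limit has been exceeded.
// When retryCount is undefined (or 0), the counter is never touched and the
// check always returns false, i.e. there is no attempt limit.
function makeLimitCheck(retryCount?: number): () => boolean {
  let attemptCounter = 0;
  return () => Boolean(retryCount && ++attemptCounter > retryCount);
}

const limited = makeLimitCheck(2);
const firstCall = limited(); // counter is 1, 1 > 2 is false
const secondCall = limited(); // counter is 2, 2 > 2 is false
const thirdCall = limited(); // counter is 3, 3 > 2 is true
```

One consequence worth noting: with either form, `retryCount: 0` is treated the same as "no limit", since `0` is falsy.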
/**
 * Use to retry block {options.retryCount} times within {options.timeout} period and return block result
 * @param description block description
 * @param block retrying operation
nit: If I understood correctly, `block` here refers to a code block. I'd say it's quite a general term. Maybe rename it to something like `action` or `retriableFn`.
if (Date.now() - start > timeout) {
  await onFailure(lastError);
  onFailure(lastError, `reached timeout ${timeout} ms${addText(description)}`);
  throw new Error('expected onFailure() option to throw an error');
} else if (lastError && criticalWebDriverErrors.includes(lastError.name)) {
nit: It could be refactored to omit the `else`. Since the previous `if` code block throws an error, `else` is optional here.
if (retryCount) {
  // Use retryCount as an optional condition
  if (++attemptCounter > retryCount) {
    onFailure(
What happens if `onFailure` doesn't break the `while` loop? The default implementation throws an error, but what if a custom `onFailure` handler doesn't?
Good catch! I will make a fix. Still, you can't pass custom `onFailure` logic from outside the Retry service.
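One way to guard against a handler that returns instead of throwing is to follow the call with an unconditional `throw`, as the timeout branch in the diff already does. The sketch below assumes a hypothetical `failOrThrow` helper and `OnFailure` type; only the error message is taken from the diff above.

```typescript
// Hypothetical handler type: receives the last error and a human-readable reason.
type OnFailure = (lastError: Error | undefined, reason: string) => void;

// Calls the handler and guarantees that control never returns to the retry loop.
// The return type `never` lets the compiler verify that callers cannot continue.
function failOrThrow(onFailure: OnFailure, lastError: Error | undefined, reason: string): never {
  onFailure(lastError, reason);
  // Reached only if the handler returned normally instead of throwing.
  throw new Error('expected onFailure() option to throw an error');
}
```

Using the same helper for both the timeout branch and the attempt-limit branch would keep the `while (true)` loop from spinning forever, whatever the handler does.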
  retryDelay?: number;
  retryCount?: number;
}

export async function retryForSuccess<T>(log: ToolingLog, options: Options<T>) {
Have you investigated whether `p-retry` is applicable here? Additionally, an exponential backoff timeout may work better.
We had a chat with @maryam-saeidi about p-retry and the issues she faced while using it. This PR is actually meant to help migrate away from `p-retry`.
I think the p-retry implementation is more complex than the FTR retry service, but since we never had stability issues with FTR retry (folks question logging & interface, understood :) ), I would keep the logic as simple as possible.
As for logging and capabilities, I still believe we can collaborate on PRs like this one to achieve improvements.
@dmlemeshko Could you share the issues found with `p-retry`?
My observation is that `p-retry` is quite a popular library with 14M weekly downloads, so a lot of bugs should have been fixed already. On top of that, `p-retry` provides flexible configuration, such as an exponential backoff retry timeout, while it can still be configured linearly. It also allows aborting retries. Checking Kibana's codebase, it's not hard to see that it's used in multiple packages and plugins.
Having a comment for `retryForSuccess` with some explanation of its improvements over `p-retry` would be really helpful for future maintenance.
@maximpn Here is the ticket in which I explained the challenge we faced. In our case, we didn't have any log of the retry attempts, which made it hard to understand whether a single request timed out or there were multiple requests and the overall attempt failed due to exponential backoff. (We tested it locally by throwing an error in the test function and setting the retry count to 10.)
In general, I think it would be better to rely on only one library/package for retry purposes, for easier maintenance and a better understanding of the tool.
@maryam-saeidi thanks for the details 👍
we didn't have any log about the retry attempts
Yes, it sounds like a bug. Or the exponential timeout was too high, so the test was interrupted due to a timeout. Btw, you could configure `p-retry` to have a constant retry interval.
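To make the constant-versus-exponential discussion concrete, the sketch below computes the waits between attempts for a given growth factor. `delaySchedule` is a hypothetical helper, not p-retry's API; as far as I know p-retry exposes a similar `factor` option, but treat that mapping as an assumption.

```typescript
// Computes the wait before each retry: baseDelayMs * factor^i, capped at maxDelayMs.
// factor = 1 gives a constant interval, factor = 2 doubles the wait each time.
function delaySchedule(
  attempts: number,
  baseDelayMs: number,
  factor: number,
  maxDelayMs: number = Infinity
): number[] {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(baseDelayMs * Math.pow(factor, i), maxDelayMs)
  );
}

const total = (delays: number[]) => delays.reduce((sum, d) => sum + d, 0);

const constantWaits = delaySchedule(4, 100, 1); // [100, 100, 100, 100]
const exponentialWaits = delaySchedule(4, 100, 2); // [100, 200, 400, 800]
// Ten doubling retries starting at 1 s add up to 1023 s of waiting alone,
// which can exceed a test timeout before the last attempt ever runs.
const tenDoublingRetriesMs = total(delaySchedule(10, 1000, 2));
```

This is the scenario described above: without per-attempt logging it is hard to tell a single slow request apart from several attempts separated by growing waits.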
Thanks @dmlemeshko for this work! 🚀
Glad I could contribute to improve the FTR testing experience
💚 Build Succeeded
…8660)

## Summary

I took the retry wrapper created by [jpdjere](https://github.com/jpdjere) in elastic#173998 and extended `retryForSuccess<T>` with the capabilities required to provide the same functionality.

This PR:

1) extends the retry service with a new function `tryWithRetries<T> => : Promise<T>` to retry a block `options.retryCount` times within the `options.timeout` period and return the block result

```ts
const response = await retry.tryWithRetries<SearchResponse>(
  'search request',
  async () => {
    const response = await supertest
      .post(`/internal/search/es`)
      .set(ELASTIC_HTTP_VERSION_HEADER, '1')
      .send({
        params: {
          body: {
            query: {
              match_all: {},
            },
          },
        },
      })
      .expect(200);
    return response.body as SearchResponse;
  },
  {
    retryCount: 4,
    retryDelay: 100, // optional
    timeout: 30000, // optional
  }
);
```

2) removes the `utils/retry.ts` wrapper and migrates tests to the FTR Retry service
3) adds descriptions to the Retry service functions explaining the default use case

What the failures look like:

- when the timeout is reached before the retry count limit
```
Error: retry.tryWithRetries reached timeout 11000 ms waiting for 'run console request'
Error: expected '# Click the Variables button, above, to create your own variables.\nGET ${exampleVariable1} // _search\n{\n "query": {\n "${exampleVariable2}": {} // match_all\n }\n}' to sort of equal '5'
    at Assertion.assert (expect.js:100:11)
    at Assertion.eql (expect.js:244:8)
    at _console.ts:47:41
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at runAttempt (retry_with_retries.ts:29:15)
    at retryWithRetries (retry_with_retries.ts:97:21)
    at RetryService.tryForTime (retry.ts:38:12)
    at Context.<anonymous> (_console.ts:44:7)
    at Object.apply (wrap_function.js:73:16)
```
- when the retry count limit is reached before the timeout
```
Error: retry.tryWithRetries reached the limit of attempts waiting for 'run console request': 2 out of 2
Error: expected '# Click the Variables button, above, to create your own variables.\nGET ${exampleVariable1} // _search\n{\n "query": {\n "${exampleVariable2}": {} // match_all\n }\n}' to sort of equal '5'
    at Assertion.assert (expect.js:100:11)
    at Assertion.eql (expect.js:244:8)
    at _console.ts:47:41
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at runAttempt (retry_for_success.ts:29:15)
    at retryForSuccess (retry_for_success.ts:97:21)
    at RetryService.tryWithRetries (retry.ts:115:12)
    at Context.<anonymous> (_console.ts:44:7)
    at Object.apply (wrap_function.js:73:16)
```

(cherry picked from commit 277b3fb)
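For readers who want a feel for how an attempt limit and a time budget interact, here is a simplified, self-contained sketch. It is not the FTR implementation: `tryWithRetriesSketch`, `RetrySketchOptions`, the default values and the log wording are all assumptions made for this illustration.

```typescript
// Hypothetical options, mirroring the shape used in the example above.
interface RetrySketchOptions {
  retryCount: number;
  retryDelay?: number; // ms between attempts (assumed default below)
  timeout?: number; // overall budget in ms (assumed default below)
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `block` up to retryCount times, stopping early if the time budget is spent.
async function tryWithRetriesSketch<T>(
  description: string,
  block: () => Promise<T>,
  { retryCount, retryDelay = 200, timeout = 30000 }: RetrySketchOptions
): Promise<T> {
  const start = Date.now();
  let lastError: unknown;
  for (let attempt = 1; attempt <= retryCount; attempt++) {
    if (Date.now() - start > timeout) {
      throw new Error(
        `reached timeout ${timeout} ms waiting for '${description}': ${String(lastError)}`
      );
    }
    try {
      return await block();
    } catch (error) {
      lastError = error;
      // Logging every attempt makes it possible to tell one slow call from many failed ones.
      console.debug(`--- ${description} failed: ${String(error)} - Attempt #: ${attempt}`);
    }
    if (attempt < retryCount) await sleep(retryDelay);
  }
  throw new Error(
    `reached the limit of attempts waiting for '${description}': ` +
      `${retryCount} out of ${retryCount}: ${String(lastError)}`
  );
}
```

Whichever limit is hit first decides the error message, which corresponds to the two failure outputs quoted in the commit message.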
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
Questions? Please refer to the Backport tool documentation
…8660) (#178978)

# Backport

This will backport the following commits from `main` to `8.13`:
- [[kbn-ftr-common-functional-services] extend retry service (#178660)](#178660)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
## Summary

Resolves: #192201

- Expose `TryWithRetriesOptions`
- Tune timeouts to pass ci
- Add attempt count debug info to `retry/retry_for_success.ts`
- Helps with tuning timeout params
- Move exposure of `AlertingApiProvider` from `x-pack/test_serverless/api_integration/services/index.ts` -> `x-pack/test_serverless/shared/services/deployment_agnostic_services.ts`
- This exposes the alerting api under Deployment Agnostic Services (DA), and DA is exposed within `x-pack/test_serverless/functional/services/index.ts` (Shared Services [Serverless])
- Collapse helper script functions into just another object literal stanza within `AlertingApiProvider`
- Update all references
- Refactor alerting api to use `retry` service, instead of p-retry (following [this pr](#178660))

### Additional debug logging

Run in debug mode (add `-v`):

```
node scripts/functional_tests \
  --config x-pack/test_serverless/api_integration/test_suites/search/common_configs/config.group1.ts \
  --grep "Summary actions" -v
```

#### After

```
│ sill retry.tryWithRetries('Alerting API - waitForDocumentInIndex, retryOptions: {"retryCount":5,"retryDelay":200}', [object AsyncFunction], [object Object])
│ debg --- retry.tryWithRetries error: index_not_found_exception
│ Root causes:
│ index_not_found_exception: no such index [alert-action-es-query] - Attempt #: 1
│ sill es.search([object Object])
│ debg --- retry.tryWithRetries failed again with the same message... - Attempt #: 2
│ sill es.search([object Object])
│ debg --- retry.tryWithRetries failed again with the same message... - Attempt #: 3
│ sill es.search([object Object])
│ debg --- retry.tryWithRetries failed again with the same message... - Attempt #: 4
│ sill es.search([object Object])
│ debg --- retry.tryWithRetries failed again with the same message... - Attempt #: 5
...
// Msg after all attempts fail:
│ Error: retry.tryWithRetries reached the limit of attempts waiting for 'Alerting API - waitForDocumentInIndex, retryOptions: {"retryCount":5,"retryDelay":200}': 5 out of 5
│ ResponseError: index_not_found_exception
│ Root causes:
│ index_not_found_exception: no such index [alert-action-es-query]
│ at SniffingTransport._request (node_modules/@elastic/transport/src/Transport.ts:601:17)
│ at processTicksAndRejections (node:internal/process/task_queues:95:5)
│ at /Users/trezworkbox/dev/main.worktrees/cleanup-alerting-api/node_modules/@elastic/transport/src/Transport.ts:704:22
│ at SniffingTransport.request (node_modules/@elastic/transport/src/Transport.ts:701:14)
│ at Proxy.SearchApi (node_modules/@elastic/elasticsearch/src/api/api/search.ts:96:10)
│ at alerting_api.ts:123:28
│ at runAttempt (retry_for_success.ts:30:15)
│ at retryForSuccess (retry_for_success.ts:99:21)
│ at Proxy.tryWithRetries (retry.ts:113:12)
│ at Object.waitForDocumentInIndex (alerting_api.ts:120:14)
│ at Context.<anonymous> (summary_actions.ts:146:20)
│ at Object.apply (wrap_function.js:74:16)
│ at Object.apply (wrap_function.js:74:16)
│ at onFailure (retry_for_success.ts:18:9)
│ at retryForSuccess (retry_for_success.ts:75:7)
│ at Proxy.tryWithRetries (retry.ts:113:12)
│ at Object.waitForDocumentInIndex (alerting_api.ts:120:14)
│ at Context.<anonymous> (summary_actions.ts:146:20)
│ at Object.apply (wrap_function.js:74:16)
│ at Object.apply (wrap_function.js:74:16)
```

### Notes

Was put back in draft due to the additional scope detailed in the issue linked above.

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
(cherry picked from commit 9d22e8c)
Summary
I took the retry wrapper created by jpdjere in #173998 and extended `retryForSuccess<T>` with the capabilities required to provide the same functionality.
This PR:
1) extends the retry service with a new function `tryWithRetries<T> => : Promise<T>` to retry a block `options.retryCount` times within the `options.timeout` period and return the block result
2) removes the `utils/retry.ts` wrapper and migrates tests to the FTR Retry service
3) adds descriptions to the Retry service functions explaining the default use case
What the failures look like: