diff --git a/.github/workflows/continous-integration.yml b/.github/workflows/continous-integration.yml
index b50753b1dffb..587ad2ad26e4 100644
--- a/.github/workflows/continous-integration.yml
+++ b/.github/workflows/continous-integration.yml
@@ -290,6 +290,7 @@ jobs:
       - name: Prevent race condition in poetry build
         # More context about race condition during poetry build can be found here:
         # https://github.com/python-poetry/poetry/issues/7611#issuecomment-1747836233
+        if: needs.changes.outputs.backend == 'true'
         run: |
           poetry config installer.max-workers 1
diff --git a/docs/docs/monitoring/load-testing-guidelines.mdx b/docs/docs/monitoring/load-testing-guidelines.mdx
index ff40486853b5..a794d73639da 100644
--- a/docs/docs/monitoring/load-testing-guidelines.mdx
+++ b/docs/docs/monitoring/load-testing-guidelines.mdx
@@ -12,12 +12,26 @@ In order to gather metrics on our system's ability to handle increased loads and
 In each test case we spawned the following number of concurrent users at peak concurrency using a [spawn rate](https://docs.locust.io/en/1.5.0/configuration.html#all-available-configuration-options) of 1000 users per second.
 In our tests we used the Rasa [HTTP-API](https://rasa.com/docs/rasa/pages/http-api) and the [Locust](https://locust.io/) open source load testing tool.
+
 | Users        | CPU                              | Memory |
 |--------------|----------------------------------|--------|
 | Up to 50,000 | 6vCPU                            | 16 GB  |
 | Up to 80,000 | 6vCPU, with almost 90% CPU usage | 16 GB  |
+### Some recommendations to improve latency
+- Sanic workers must be mapped 1:1 to CPU for both Rasa Pro and the Rasa Action Server
+- Create `async` actions to avoid any blocking I/O
+- `enable_selective_domain: true`: the domain is only sent for actions that need it, which significantly trims the payload between the two pods
+- Consider using compute-efficient machines on the cloud that are optimized for high-performance computing, such as the C5 instances on AWS.
+  However, as they are low on memory, models need to be kept lightweight.
+
+
+| Machine                        | Rasa Pro                                      | Rasa Action Server                           |
+|--------------------------------|-----------------------------------------------|----------------------------------------------|
+| AWS C5 or Azure F or Gcloud C2 | 3-7 vCPU, 10-16 GB memory, 3-7 Sanic threads  | 3-7 vCPU, 2-12 GB memory, 3-7 Sanic threads  |
+
+
 ### Debugging bot related issues while scaling up
 To test the Rasa [HTTP-API](https://rasa.com/docs/rasa/pages/http-api) ability to handle a large number of concurrent user activity we used the Rasa Pro [tracing](./tracing.mdx) capability
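For context on the `enable_selective_domain: true` recommendation: this flag is configured on the action endpoint in `endpoints.yml`. A minimal sketch, assuming a locally running action server (the URL is a placeholder):

```yaml
action_endpoint:
  url: "http://localhost:5055/webhook"
  # Send the domain only to custom actions that declare they need it,
  # shrinking the request payload between Rasa Pro and the action server.
  enable_selective_domain: true
```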
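A note on the "`async` actions" recommendation in the patch above: the point is to keep the Sanic event loop free while waiting on I/O. The sketch below uses only the standard-library `asyncio` (not the `rasa_sdk` action API) to illustrate the effect under that assumption: ten simulated 0.1-second I/O waits overlap and finish in roughly 0.1 s total, whereas blocking calls would run them back to back.

```python
import asyncio
import time


async def call_external_service(i: int) -> int:
    # Stand-in for non-blocking I/O (e.g. an async HTTP request from a
    # custom action). A blocking call such as time.sleep() here would
    # stall the whole event loop and serialize every concurrent request
    # handled by this worker.
    await asyncio.sleep(0.1)
    return i


async def handle_requests(n: int):
    start = time.perf_counter()
    # All n waits run concurrently on one event loop.
    results = await asyncio.gather(*(call_external_service(i) for i in range(n)))
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(handle_requests(10))
print(results, round(elapsed, 2))
```

The same principle applies inside a custom action's `run` method: awaiting an async client keeps the worker responsive to other conversations.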