EPMRPP-91158 || Updates according to the new async reporting approach #757

Merged
merged 7 commits into from
Jun 25, 2024
83 changes: 37 additions & 46 deletions docs/developers-guides/AsynchronousReporting.mdx
@@ -2,40 +2,42 @@

### Overview

Asynchronous reporting implemented using [AMQP 0-9-1](https://www.rabbitmq.com/tutorials/amqp-concepts.html) protocol based on
[RabbitMq](https://www.rabbitmq.com) message broker.
The main idea of the async reporting is to give a response back immediately after a server that is receiving a request from a client.
So, using this approach, a client is not blocked and doesn't wait until a server processes the request.
Asynchronous reporting is set up using the [AMQP 0-9-1](https://www.rabbitmq.com/tutorials/amqp-concepts.html) protocol with
[RabbitMq](https://www.rabbitmq.com) as the message broker.
The main idea is to respond to the client immediately after the server receives a request. This way, the client isn't blocked and doesn't have to wait for the server to process the request.
Additionally, it acts as a load balancer for requests, storing them in queues until the backend is free to process them.

### Simple scheme of interactions between RabbitMq and API
### Scheme of interactions between RabbitMq and API

***Difference between ID and UUID***

`ID` is a physical identificator of an entity generated automatically by a database at the moment of saving.
`UUID` is a virtual identificator of the entity. Can be specified in a request or, if not present, in a request generated automatically at the moment the
`api` accepts the request.
Each entity has both `ID` and `UUID`. `ID` is used to perform the CRUD operations with an entity that is ***already saved in db***.
`UUID` is used to build the child-parent relationships between entities at the client side during reporting.
In case of synchronous reporting, any response from `api` is returned ***after*** handling of the request and saving the entity in a database.
In case of asynchronous reporting, any response from `api` is returned ***before*** handling of the request and saving the entity in a database.
`ID` is a numerical identifier for an entity, automatically generated by the database at the moment of saving.
`UUID` is a string virtual identifier for an entity. `UUID` can be generated on the client side and provided with a request. If it is not provided,
it is generated automatically when the `API` accepts the request.
Each entity has both `ID` and `UUID`. `ID` is used to perform the CRUD operations on an entity that is ***already saved in db***.
`UUID` is used to build the child-parent relationships between entities on the client side during reporting.
In case of synchronous reporting, any response from `API` is returned ***after*** the request is handled and the entity is saved in the database.
In case of asynchronous reporting, any response from `API` is returned ***before*** the request is handled and the entity is saved in the database and ***after*** the request is published to the queue.
The responses in both modes look the same:
```json
{
"id": "cd64d5eb-fea1-4e7e-8a5a-69998ac5620f"
}
```
`id` property of the response is actually an `UUID`. This is due to backward compatibility.
So when you have this uuid and want to update, delete etc. entity you should get a physical `id` from the db first.
The `id` property in the response is actually a `UUID`. This is for backward compatibility.
Therefore, when you have this `UUID` and want to update or delete the entity, you first need to retrieve the physical `ID`.
It can be done via `API`:
* [Get specified launch by UUID](https://reportportal.io/docs/api/service-api/get-launch-by-uuid-using-get)
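
As a rough illustration, resolving the physical `ID` from a `UUID` might look like the sketch below. The endpoint path layout, the base URL, the project name, and the shape of the sample response are assumptions made for illustration; the linked API reference remains the source of truth.

```python
import json
from urllib.parse import quote


def launch_by_uuid_url(base_url: str, project: str, launch_uuid: str) -> str:
    """Build the URL of the 'get launch by UUID' call.

    Assumption: the endpoint follows the /api/v1/{project}/launch/uuid/{uuid} layout.
    """
    return (
        f"{base_url.rstrip('/')}/api/v1/"
        f"{quote(project, safe='')}/launch/uuid/{quote(launch_uuid, safe='')}"
    )


def physical_id(response_body: str) -> int:
    """Extract the database ID from a launch resource returned by the API."""
    launch = json.loads(response_body)
    return int(launch["id"])


# Hypothetical response body; a real launch resource contains more fields.
sample = '{"id": 42, "uuid": "cd64d5eb-fea1-4e7e-8a5a-69998ac5620f"}'
url = launch_by_uuid_url(
    "http://localhost:8080", "my_project", "cd64d5eb-fea1-4e7e-8a5a-69998ac5620f"
)
launch_id = physical_id(sample)
```

The physical `ID` obtained this way can then be used in update or delete requests, provided the consumer has already stored the entity.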

***Simple asynchronous reporting scheme***
***Asynchronous reporting scheme***


* **Step 1**
`API` receives HTTP request from `client`. `Controller` checks permissions and throws the request to `producer`.
`API` receives an HTTP request from the `client` at the reporting controller. The `Controller` verifies permissions and calls the `producer` logic.
* **Step 2**
`Producer` validates business rules if necessary, generates UUID (virtual id) for an entity and returns it back to the `controller`,
composes a message for `RabbitMq` and sends it to the specified queue.
After that, a `controller` returns HTTP response to the `client` that contains UUID. **At the moment, the physical entity may not be created yet!**
`Producer` validates business rules if necessary, generates a UUID (virtual id) if one is not provided in the request,
builds a message for `RabbitMq`, and sends it to an exchange of the `x-consistent-hash` type.
After the message is sent, the `controller` returns an HTTP response containing the UUID to the `client`. **At this moment, the physical entity may not yet exist in the database!**
* **Step 3**
`Consumer` starts processing the message as soon as it is received from `RabbitMq`.
After a successful processing, the entity will be stored in a database and obtain a physical id.
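
The three steps above can be sketched with an in-memory stand-in. This is not the service's implementation: the queue, the `database` dictionary, and the function names are hypothetical and only show that the UUID is returned before the entity is stored.

```python
import queue
import uuid

message_queue = queue.Queue()  # stands in for a RabbitMQ queue
database = {}                  # stands in for the database: physical id -> entity
_next_id = 1


def producer(request: dict) -> str:
    """Steps 1-2: assign a UUID if missing, publish the message, return the UUID."""
    entity_uuid = request.get("uuid") or str(uuid.uuid4())
    message_queue.put({**request, "uuid": entity_uuid})
    return entity_uuid


def consumer() -> None:
    """Step 3: process queued messages and store each entity under a physical id."""
    global _next_id
    while not message_queue.empty():
        message = message_queue.get()
        database[_next_id] = message
        _next_id += 1


returned_uuid = producer({"name": "demo launch"})
stored_before = len(database)  # the client already has the UUID, nothing is stored yet
consumer()
stored_after = len(database)   # the entity now exists and has a physical id
```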
@@ -90,37 +92,28 @@ Requests and responses have no differences with sync ones but there are some spe
| rp.amqp.user | RP_AMQP_USER | rabbitmq |
| rp.amqp.pass | RP_AMQP_PASS | rabbitmq |
| rp.amqp.addresses | RP_AMQP_ADDRESSES | amqp://rabbitmq:rabbitmq@rabbitmq:5672 |
| rp.amqp.queues | RP_AMQP_QUEUES | 10 |
| rp.amqp.queuesPerPod | | 10 |
| reporting.queues.count | REPORTING_QUEUES_COUNT | 10 |


`rp.amqp.host` - Hostname of RabbitMq service.
`rp.amqp.port` - Port of RabbitMq service.
`rp.amqp.user` - Username to connect to RabbitMq service.
`rp.amqp.pass` - User password to connect to RabbitMq service.
`rp.amqp.addresses` - Full address to connect to RabbitMq service.
`rp.amqp.queues` - Number of queues to be processed by this service-api.
`rp.amqp.queuesPerPod` - Cluster configuration parameter. Number of queues to be processed by this service-api pod
(default effectively infinite).
:::note
should correlate with number QUEUE_AMOUNT & number of service-api pods being started in cluster.
:::
`reporting.queues.count` - Number of queues to be processed by this service-api.
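
For illustration only, a script that needs the same connection details could read them from the environment as sketched below. The function name is hypothetical, the defaults are copied from the table above, and the sketch assumes `RP_AMQP_ADDRESSES` holds a single address.

```python
import os
from urllib.parse import urlsplit

# Default taken from the configuration table above.
DEFAULT_ADDRESS = "amqp://rabbitmq:rabbitmq@rabbitmq:5672"


def amqp_settings(env=os.environ) -> dict:
    """Split the AMQP address into its parts and read the queue count."""
    address = env.get("RP_AMQP_ADDRESSES", DEFAULT_ADDRESS)
    parts = urlsplit(address)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "password": parts.password,
        "queues": int(env.get("REPORTING_QUEUES_COUNT", "10")),
    }


# With an empty environment the documented defaults apply.
settings = amqp_settings({})
```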

#### Exchanges and queues for reporting

`API` produces two reporting exchanges - `reporting` and `reporting.retry`. Exchange `reporting` contains queues for storing messages
from the requests. Exchange `reporting.retry` contains queues for storing messages that were consumed exceptionally from the queues in `reporting`
exchange. The amount of the queues in the exchanges depends on `rp.amqp.queues` parameter. Exchange `reporting` has `N` queues with names
`reporting.0 ... reporting.N`. Exchange `reporting.retry` has `N+1` queues with the names `reporting.retry.0 ... reporting.retry.N` and `reporting.dlq`.
In case message from `reporting.retry` was consumed with exception more than 10 times, the message will be stored in reporting.dlq which is
[dead letter queue](https://www.rabbitmq.com/dlx.html).
The `API` creates two exchanges: `e.reporting` and `e.reporting.retry`. The `e.reporting` exchange is linked to queues that handle messages from requests, while the `e.reporting.retry` exchange is linked to retry queues and manages messages rejected from the main reporting queues.
The number of queues in these exchanges depends on the `REPORTING_QUEUES_COUNT` env variable. The `e.reporting` exchange has `N` queues named `q.reporting.id.0` to `q.reporting.id.N`. The `e.reporting.retry` exchange has 2 queues named `q.retry.reporting` and `q.retry.reporting.ttl`.
If a message from `e.reporting.retry` is consumed and throws an exception more than 10 times, it will be moved to a separate queue named `q.parkingLot.reporting`, where it will be stored for 7 days for manual error analysis.
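
The retry rule described above can be expressed as a small decision function. This is a minimal sketch under the assumption that the number of failed attempts is tracked per message; the function name and the way the counter is obtained are hypothetical.

```python
MAX_RETRIES = 10  # limit stated in the text above
RETRY_QUEUE = "q.retry.reporting"
PARKING_LOT_QUEUE = "q.parkingLot.reporting"


def next_queue(failed_attempts: int) -> str:
    """Pick the queue for a message that has just failed processing again."""
    if failed_attempts > MAX_RETRIES:
        # Retries are exhausted: park the message for manual error analysis.
        return PARKING_LOT_QUEUE
    return RETRY_QUEUE
```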

<MediaViewer src={require('./img/async/ExchangesQueues.png')} alt="Exchanges Queues" />

#### Scheme

All requests (items, logs) related to the same launch will be stored in the same RabbitMQ queue.
It is achieved using the following algorithm that maps launch uuid to a queue key:

<MediaViewer src={require('./img/async/UuidQueusMapping.png')} alt="Uuid Queus Mapping" />
All requests (items, logs) related to the same launch will be stored in the same RabbitMQ queue.
This is achieved by using an exchange that maps messages to queues using the `Consistent Hashing` algorithm.
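
To show why consistent hashing keeps a launch in one queue, here is a simplified hash ring. It is not RabbitMQ's actual implementation: the hash function, the number of ring points, the queue list, and the use of the launch UUID as the routing key are all assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, queues, points_per_queue=100):
        # Each queue owns several points on the ring to spread the load evenly.
        self._ring = sorted(
            (_hash(f"{name}#{i}"), name)
            for name in queues
            for i in range(points_per_queue)
        )
        self._keys = [position for position, _ in self._ring]

    def queue_for(self, routing_key: str) -> str:
        """Return the queue owning the first ring point after the key's hash."""
        index = bisect.bisect(self._keys, _hash(routing_key)) % len(self._ring)
        return self._ring[index][1]


# Hypothetical queue list for a queue count of 10.
queues = [f"q.reporting.id.{i}" for i in range(10)]
ring = HashRing(queues)
launch_uuid = "cd64d5eb-fea1-4e7e-8a5a-69998ac5620f"
first = ring.queue_for(launch_uuid)
second = ring.queue_for(launch_uuid)  # same key, same queue
```

Because the queue depends only on the routing key, every item and log that carries the same launch UUID lands in the same queue.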

Messages in the queue don't have a strict order but they are stored mostly in the same order as they arrive from `client`.
This ensures a minimal amount of exceptions (causing the sending of such messages to the retry queue) caused by cases when a child is handled before its own parent.
@@ -129,16 +122,12 @@ Consuming scheme:

<MediaViewer src={require('./img/async/Consuming.png')} alt="Consuming" />

`(!)` Possible exceptions that may be thrown and lead to sending the message to the retry queue:
`(!)` Messages that fail with any unmanaged exception are moved to `q.parkingLot.reporting` for manual analysis.
Possible exceptions that may be thrown and lead to moving the message to the retry queue:
* On start launch/test item:
* User not found.
* Entity not found. Parent entity not found.
* Bad request. Start time of the child item is earlier than the parent start time, trying to report a child under a retry item, trying
to report a non-nested step under a nested step parent, trying to rerun a launch that does not exist.
* On finish launch/test item:
* Entity not found. Entity that has to be finished not found in database or parent entity not found (for test items).
* Bad request. User tries to finish already finished entity. Finish time is earlier than start time.
* Access denied. User tries to finish an entity they do not own, or an entity under a project they do not own.
* On log creation:
* Entity not found. Trying to create a log for a non-existent launch/test item.

@@ -149,8 +138,10 @@ If the order is not broken, launch finish request will be handled when there are
<MediaViewer src={require('./img/async/FinishLaunch.png')} alt="Finish Launch" />

`(!)` This is a main difference in the reporting mechanism between ReportPortal versions 4 and 5.
In case the launch finish request is not last in the queue it will be finished anyway.
But all the next requests under the launch will be handled as soon as they get to the consumer and the launch statistics will be updated.
So it is possible to report items under already finished launch.
If the launch finish request is not the last in the queue, the launch will be finished anyway.
However, all subsequent requests related to that launch will be handled as they reach the consumer, and the launch statistics will be updated accordingly.
This means it is possible to report items under an already finished launch.
Events associated with the launch finish will be processed as soon as the launch finish is handled.
Items processed after the launch finish will not be included in post-launch handling processes such as 'Auto Analysis' and 'Quality Gates.'


Binary file modified docs/developers-guides/img/async/Consuming.png
Binary file modified docs/developers-guides/img/async/ExchangesQueues.png
Binary file not shown.