Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎨 Reduces response time of catalog/services listing entrypoint #5273

Merged
merged 7 commits into from
Jan 29, 2024

Conversation

pcrespov
Copy link
Member

@pcrespov pcrespov commented Jan 25, 2024

What do these changes do?

This PR analyses and provides a temporary solution for the incident #5267 that reports very long delays in GET /catalog/services entrypoint. In addition the front-end can at time spawn up to 17 parallel calls to this entrypoint point (probably due to some retry mechanism)

image

We are well aware that the original design does not scale with the amount of services. A proper resolution of this issue therefore requires a redesign that should incorporate at least pagination and lighter item objects with bounded fields.

For the moment we decided to go with a strategic TTL cache that will reduce the overhead of postprocessing every service item in the webserver. Note that the webserver does not just forward the service object provided by the catalog service but also computes and adds some extra information (e.g. units) to it. Using a profiler reveals that replace_service_inputs incurs a considerable computation overhead. Therefore, with the last increase in the number of services this has caused the large delays reported in #5267.

This is the benchark results and profiler before the cache was added
image
and this is after
image
Finally here we can see several calls to the entrypoint. After the first, subsequent calls take much less time

image

Regarding the empty lists, I also noticed that this comes directly in the response of the catalog service (and not from the webserver). Probably is due to an issue of the caches on the multiple replicas there. Nonetheless this point has not been addressed here

image

Details

  • ⬆️ New cachetools to cache sync functions (we have aiocache to cache async functions`
  • ⬆️ Adds pytest-benchmark to benchmark the replace_service_inputs
  • ⬆️ Adds msgpack mainly to remove log warning message of some libraries (e.g. aiocache) that uses this as a default if it is available.

Related issue/s

How to test

Driving test services/web/server/tests/unit/isolated/test_catalog_models.py

Dev Checklist

DevOps

@pcrespov pcrespov self-assigned this Jan 25, 2024
@pcrespov pcrespov added this to the This is Sparta! milestone Jan 25, 2024
@pcrespov pcrespov added the a:webserver issue related to the webserver service label Jan 25, 2024
@pcrespov pcrespov force-pushed the is5267/catalog-service-list branch from 19a522f to b4b39c9 Compare January 25, 2024 19:04
Copy link

codecov bot commented Jan 25, 2024

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (4fa4917) 87.2% compared to head (508a986) 88.7%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #5273     +/-   ##
========================================
+ Coverage    87.2%   88.7%   +1.4%     
========================================
  Files        1308    1213     -95     
  Lines       53550   49548   -4002     
  Branches     1170    1024    -146     
========================================
- Hits        46749   43960   -2789     
+ Misses       6552    5367   -1185     
+ Partials      249     221     -28     
Flag Coverage Δ
integrationtests 65.2% <61.5%> (-0.1%) ⬇️
unittests 87.1% <92.3%> (+2.0%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...rver/src/simcore_service_webserver/catalog/_api.py 42.3% <ø> (ø)
...r/src/simcore_service_webserver/catalog/_models.py 64.9% <100.0%> (-26.4%) ⬇️
...er/src/simcore_service_webserver/catalog/client.py 51.1% <0.0%> (ø)

... and 113 files with indirect coverage changes

@pcrespov pcrespov force-pushed the is5267/catalog-service-list branch from 8d46419 to 8c11867 Compare January 26, 2024 13:54
@pcrespov pcrespov changed the title WIP 🎨 Is5267/catalog service list 🎨 Reduces response time of catalog/services listing entrypoint Jan 26, 2024
@pcrespov pcrespov marked this pull request as ready for review January 26, 2024 14:25
@pcrespov pcrespov enabled auto-merge (squash) January 26, 2024 15:37
@pcrespov pcrespov force-pushed the is5267/catalog-service-list branch from bc876b9 to 508a986 Compare January 28, 2024 17:23
Copy link

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Copy link
Contributor

@matusdrobuliak66 matusdrobuliak66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@pcrespov pcrespov merged commit f7365ce into ITISFoundation:master Jan 29, 2024
55 checks passed
@pcrespov pcrespov deleted the is5267/catalog-service-list branch January 29, 2024 08:14
@matusdrobuliak66 matusdrobuliak66 mentioned this pull request Feb 14, 2024
39 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:webserver issue related to the webserver service
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants