Poor performances on create/edit tenant field with high number of tenants #14512

Closed
lchabert opened this issue Dec 13, 2023 · 1 comment · Fixed by #14526
Labels: severity: low · status: accepted · type: bug

Comments


lchabert commented Dec 13, 2023

Deployment Type

Self-hosted

NetBox Version

v3.6.6

Python Version

3.11

Steps to Reproduce

  • Create around 85k tenants in a single tenant group
  • Create around 150k prefixes linked to different tenants
  • Create some sites linked to different tenants
  • Then edit one of the prefixes and click on the tenant selector.

Expected Behavior

A response time of one second or less would be reasonable.
Alternatively, the searchable select could be asynchronous, allowing the user to enter characters while the first request runs in the background.

Observed Behavior

With a PostgreSQL 16 database, 16 GB of RAM, and 4 CPUs, it takes 3.5 seconds to respond and give the user the opportunity to enter search characters.

As far as I can get through the debug process, I can see this kind of SQL query with a long running time (>3 s):

SELECT 
  DISTINCT "tenancy_tenant"."id", 
  "tenancy_tenant"."created", 
  "tenancy_tenant"."last_updated", 
  "tenancy_tenant"."custom_field_data", 
  "tenancy_tenant"."description", 
  "tenancy_tenant"."comments", 
  "tenancy_tenant"."name", 
  "tenancy_tenant"."slug", 
  "tenancy_tenant"."group_id", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "circuits_circuit" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "circuit_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "dcim_device" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "device_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "ipam_ipaddress" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "ipaddress_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "ipam_prefix" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "prefix_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "dcim_rack" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "rack_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "dcim_site" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "site_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "virtualization_virtualmachine" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "virtualmachine_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "ipam_vlan" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "vlan_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "ipam_vrf" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "vrf_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "virtualization_cluster" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "cluster_count" 
FROM 
  "tenancy_tenant" 
WHERE 
  "tenancy_tenant"."group_id" IN (
    SELECT 
      U0."id" 
    FROM 
      "tenancy_tenantgroup" U0 
    WHERE 
      U0."id" = 4
  ) 
ORDER BY 
  "tenancy_tenant"."name" ASC 
LIMIT 
  50

These queries run a COUNT subquery for each kind of related object, so with 85k tenants each count is executed 85k times.
From my point of view, displaying a dynamic selector with a custom display field does not require counting every related object. (A sketch of how such a query arises follows.)
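
For context, annotations like these are typically built as correlated subqueries in the Django ORM. Below is a minimal sketch of how such SQL can be generated; the model names (Tenant, Prefix, Site) and the count_related helper are illustrative assumptions, not NetBox's actual code:

from django.db.models import Count, IntegerField, OuterRef, Subquery
from django.db.models.functions import Coalesce

def count_related(model, field_name):
    # Correlated subquery: COUNT(*) of `model` rows whose FK points at the
    # outer row, grouped by the FK and wrapped in COALESCE(..., 0).
    subquery = (
        model.objects
        .filter(**{field_name: OuterRef('pk')})
        .order_by()
        .values(field_name)          # GROUP BY tenant_id
        .annotate(c=Count('*'))
        .values('c')
    )
    return Coalesce(Subquery(subquery, output_field=IntegerField()), 0)

# One such annotation per related object type reproduces the SELECT above:
queryset = Tenant.objects.annotate(
    prefix_count=count_related(Prefix, 'tenant'),
    site_count=count_related(Site, 'tenant'),
    # ...and so on for circuits, devices, IP addresses, VLANs, etc.
)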

To be more specific, on the PostgreSQL server the query is executed against RAM-buffered data, with no hits on the system SSD. Cache usage is around 30%, and the objects are cache hits.

We already had a discussion about this on Slack: https://netdev-community.slack.com/archives/C01P0FRSXRV/p1702476390699559

lchabert added the type: bug label on Dec 13, 2023
DanSheps (Member) commented Dec 13, 2023

We had a discussion on Slack regarding this earlier this morning.

TL;DR: the slowdown in the select fields is caused by API calls whose querysets carry annotations counting related items. Since these are brief API calls, the counts shouldn't be required.

Preliminary investigation suggests we could clear the annotations before resolving the query by calling self.queryset.query.annotations.clear()

We could also be more selective about it, depending on what the outcome of #14302 is (keeping only the relevant annotations instead). A sketch of the approach follows.

Removing the annotations in testing resulted in a roughly 50% improvement in performance.
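
As a rough illustration of that approach (the viewset and query-parameter names here are hypothetical, not the actual NetBox code), the annotations could be dropped for brief selector requests before the queryset is evaluated:

from rest_framework.viewsets import ModelViewSet

class TenantViewSet(ModelViewSet):
    # Count annotations as sketched above
    queryset = Tenant.objects.annotate(
        prefix_count=count_related(Prefix, 'tenant'),
        # ...other per-model count annotations
    )

    def get_queryset(self):
        queryset = super().get_queryset()
        if self.request.query_params.get('brief'):
            # Clearing the annotations removes the per-row COUNT subqueries
            # from the compiled SQL for selector-style requests.
            queryset.query.annotations.clear()
        return queryset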

DanSheps added the status: under review and severity: low labels on Dec 13, 2023
jeremystretch self-assigned this on Dec 14, 2023
jeremystretch added the status: accepted label and removed the status: under review label on Dec 14, 2023
The github-actions bot locked this issue as resolved and limited conversation to collaborators on Mar 14, 2024