feat(applications): add scalers for cpu and memory #1295

Merged
arealmaas merged 5 commits into main from feat/applications-scale-on-cpu on Oct 16, 2024


arealmaas
Collaborator

@arealmaas arealmaas commented Oct 14, 2024

Description

  • Enable automatic scaling based on CPU and memory for graphql and web-api (enduser only)
  • Set 70% utilization as the scaling threshold; this can be tweaked as we see how the services behave under load
  • Set a maximum of 10 replicas for now. We might need to increase this in yt01
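
To make the threshold concrete: assuming the CPU and memory scalers behave like the Kubernetes Horizontal Pod Autoscaler that KEDA builds on (an assumption about the platform, not something stated in this PR), the desired replica count would follow roughly the rule below. The numbers in the worked line are illustrative only.

```latex
\[
\text{desired} = \min\!\Bigl(\text{maxReplicas},\ \max\!\Bigl(\text{minReplicas},\
\Bigl\lceil \text{current} \times \frac{\text{utilization}}{\text{target}} \Bigr\rceil\Bigr)\Bigr)
\]
% Illustrative values: current = 2 replicas, average CPU utilization = 90, target = 70,
% minReplicas = 2, maxReplicas = 10
\[
\Bigl\lceil 2 \times \tfrac{90}{70} \Bigr\rceil = \lceil 2.57 \rceil = 3
\quad\Rightarrow\quad
\text{desired} = \min(10,\ \max(2,\ 3)) = 3
\]
```

With both a CPU rule and a memory rule in place, the larger of the two results would be expected to win under the same assumption.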

Related Issue(s)

Verification

  • Your code builds clean without any errors or warnings
  • Manual testing done (required)
  • Relevant automated test added (if you find this hard, leave it and we'll help out)

Documentation

  • Documentation is updated (either in docs-directory, Altinnpedia or a separate linked PR in altinn-studio-docs, if applicable)

Summary by CodeRabbit

  • New Features
    • Introduced dynamic scaling configuration for container apps, allowing for minimum and maximum replicas and custom scaling based on CPU and memory utilization.
  • Enhancements
    • Improved deployment flexibility with a centralized scaling parameter for container apps.

@arealmaas arealmaas marked this pull request as ready for review October 15, 2024 21:11
@arealmaas arealmaas requested review from a team as code owners October 15, 2024 21:11
Contributor

coderabbitai bot commented Oct 15, 2024

📝 Walkthrough

Walkthrough

This pull request introduces a new variable scale across multiple Bicep files to define scaling configurations for container applications. The scale variable specifies minimum and maximum replicas, along with custom scaling rules based on CPU and memory utilization thresholds. Additionally, the scale parameter is integrated into the containerApp module, allowing for a centralized and flexible scaling configuration. No other changes to existing parameters, resources, or modules are made.

Changes

| File Path | Change Summary |
|---|---|
| `.azure/applications/graphql/main.bicep` | Added variable `var scale` for scaling configuration with `minReplicas: 2`, `maxReplicas: 10`, and scaling rules. |
| `.azure/applications/web-api-eu/main.bicep` | Added variable `var scale` for scaling configuration with `minReplicas: 2`, `maxReplicas: 10`, and scaling rules. |
| `.azure/modules/containerApp/main.bicep` | Added parameter `param scale object = { minReplicas: 1, maxReplicas: 1 }` and updated resource reference to use `scale`. |
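
As a rough sketch of the module side summarized above: the parameter default comes from the change summary, while the resource API version, the other parameter names (`name`, `location`, `image`, `environmentId`), and the container settings are assumptions for illustration, not the repository's actual module.

```bicep
// Hypothetical, trimmed-down version of .azure/modules/containerApp/main.bicep.
// Only `scale` and its default are taken from the PR summary; everything else is assumed.
param name string
param location string = resourceGroup().location
param image string
param environmentId string

// A single fixed replica unless the caller passes its own scale object.
param scale object = {
  minReplicas: 1
  maxReplicas: 1
}

resource containerApp 'Microsoft.App/containerApps@2024-03-01' = {
  name: name
  location: location
  properties: {
    environmentId: environmentId
    template: {
      containers: [
        {
          name: name
          image: image
        }
      ]
      // The caller-supplied object is passed straight through to the resource.
      scale: scale
    }
  }
}
```

Keeping the default at 1:1 means callers that do not mention `scale` keep their previous single-replica behavior.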

Possibly related PRs

  • feat(service): deploy application in container apps #1303: This PR introduces a new parameter for user-assigned identity in the container app deployment, which is closely related to the scaling configuration changes made in the main PR, as both involve modifications to the container app's Bicep files.

Suggested reviewers

  • oskogstad

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1a93bda and 34afe72.

📒 Files selected for processing (1)
  • .azure/modules/containerApp/main.bicep (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .azure/modules/containerApp/main.bicep

@arealmaas arealmaas changed the title from "feat(applications): add scalers for cpu" to "feat(applications): add scalers for cpu and memory" Oct 15, 2024
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
.azure/modules/containerApp/main.bicep (1)

34-38: LGTM! Consider clarifying the comment about disabling scaling.

The new scale parameter is well-defined and aligns with the PR objectives. However, the comment "temp disable scaling by default for outbox scheduling" might benefit from additional context.

Could you provide more information about why scaling is temporarily disabled by default for outbox scheduling? This context could be valuable for future maintainers.

.azure/applications/graphql/main.bicep (1)

109-132: LGTM! Consider parameterizing scale values for flexibility.

The scale variable declaration looks good and aligns with the PR objectives. It sets the CPU usage limit to 70% as specified, and includes a similar rule for memory usage. The minimum of 2 replicas ensures high availability, and the maximum of 10 replicas matches the PR requirements.

To improve flexibility, consider parameterizing the scale values:

param minReplicas int = 2
param maxReplicas int = 10
param cpuThreshold int = 70
param memoryThreshold int = 70

var scale = {
  minReplicas: minReplicas
  maxReplicas: maxReplicas
  rules: [
    {
      custom: {
        type: 'cpu'
        metricType: 'Utilization'
        metadata: {
          value: string(cpuThreshold)
        }
      }
    }
    {
      custom: {
        type: 'memory'
        metricType: 'Utilization'
        metadata: {
          value: string(memoryThreshold)
        }
      }
    }
  ]
}

This approach would allow easier adjustments to the scaling configuration without modifying the template itself.
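
If the values were parameterized as suggested, an environment could override them from a parameters file. A minimal sketch, assuming Bicep's `.bicepparam` format is used and that the file sits next to the template; the file name and the chosen values are hypothetical.

```bicep
// Hypothetical file: .azure/applications/graphql/yt01.bicepparam
using './main.bicep'

// Override only what differs from the template defaults (placeholder values).
// Any other required parameters of main.bicep would have to be supplied here as well.
param maxReplicas = 20
param cpuThreshold = 60
```

Parameters left out of the file fall back to the defaults declared in the template, so `minReplicas` and `memoryThreshold` would stay at 2 and 70 here.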

.azure/applications/web-api-eu/main.bicep (1)

80-103: LGTM! Consider parameterizing threshold values for flexibility.

The scale variable declaration aligns well with the PR objectives. It sets the CPU usage limit to 70% as specified, includes a similar threshold for memory, and sets the maximum replicas to 10. The minimum of 2 replicas ensures high availability.

Consider parameterizing the threshold values (currently set to 70%) to allow for easier adjustments in different environments:

 var scale = {
   minReplicas: 2
   maxReplicas: 10
   rules: [
     {
       custom: {
         type: 'cpu'
         metricType: 'Utilization'
         metadata: {
-          value: '70'
+          value: '${cpuThreshold}'
         }
       }
     }
     {
       custom: {
         type: 'memory'
         metricType: 'Utilization'
         metadata: {
-          value: '70'
+          value: '${memoryThreshold}'
         }
       }
     }
   ]
 }

Then, add these parameters at the top of the file:

@description('CPU utilization threshold for scaling (percentage)')
@minValue(1)
@maxValue(100)
param cpuThreshold int = 70

@description('Memory utilization threshold for scaling (percentage)')
@minValue(1)
@maxValue(100)
param memoryThreshold int = 70

This change would make it easier to adjust thresholds for different environments or future tuning.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1ae4f41 and 1a93bda.

📒 Files selected for processing (3)
  • .azure/applications/graphql/main.bicep (2 hunks)
  • .azure/applications/web-api-eu/main.bicep (2 hunks)
  • .azure/modules/containerApp/main.bicep (2 hunks)
🧰 Additional context used
🔇 Additional comments (6)
.azure/modules/containerApp/main.bicep (2)

Line range hint 1-89: Overall, the changes look good and align with the PR objectives.

The implementation of the scale parameter and its usage in the containerApp resource provide the desired flexibility for configuring automatic scaling based on CPU and memory usage. This aligns well with the PR objectives for the GraphQL and web API services.

A few points to consider:

  1. The comment about temporarily disabling scaling by default could use more context.
  2. Ensure that the scaling configuration (including the 70% CPU usage limit mentioned in the PR objectives) is properly set when this module is used.
  3. Verify that the maximum of 10 revisions for scaling configurations (mentioned in the PR objectives) is implemented, possibly in another file or when this module is used.

72-72: LGTM! Scaling configuration implemented as intended.

The scale parameter is correctly used in the containerApp resource, allowing for flexible scaling configuration. This implementation aligns with the PR objectives of adding automatic scaling capabilities for the GraphQL and web API services.

To ensure the scaling configuration is correctly implemented across all relevant files, let's run the following verification script:

This script will help us confirm that the scale parameter is consistently used across the project and that no hardcoded scaling configurations remain.

.azure/applications/graphql/main.bicep (2)

154-154: LGTM! Correct implementation of the scale parameter.

The scale variable is correctly passed as a parameter to the containerApp module. This implementation allows the container app to utilize the defined scaling configuration during deployment, which aligns with the PR objectives.


Line range hint 109-154: Summary: Changes align well with PR objectives.

The modifications in this file are focused and effectively implement the automatic scaling capabilities for the GraphQL service as outlined in the PR objectives. The scale variable introduces CPU and memory-based scaling rules with a 70% threshold, and sets appropriate minimum (2) and maximum (10) replicas. These changes are correctly integrated into the containerApp module.

A few points to consider:

  1. The PR mentions that the CPU usage limit of 70% is adjustable. The suggested parameterization in the earlier comment would facilitate this adjustment.
  2. The PR notes that the maximum of 10 revisions might need to be increased in the yt01 environment. Ensure that this is addressed if necessary, possibly through environment-specific parameters.
  3. Remember to update the relevant documentation as mentioned in the PR objectives, either in the docs directory or through a separate PR in the Altinn Studio documentation repository.
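
One way the environment-specific ceiling from point 2 could look, as a sketch only: the `environment` parameter, the lookup table, and the value 20 are hypothetical and not taken from the repository.

```bicep
@description('Target environment name, for example yt01 (hypothetical parameter)')
param environment string

// Hypothetical per-environment overrides; the number is a placeholder.
var maxReplicasByEnvironment = {
  yt01: 20
}

// Fall back to the PR's value of 10 when no override is listed.
var maxReplicas = contains(maxReplicasByEnvironment, environment) ? maxReplicasByEnvironment[environment] : 10

output maxReplicas int = maxReplicas
```

The resulting `maxReplicas` could then replace the literal `10` in the `scale` variable.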

To ensure consistency across the codebase, let's check if similar changes have been applied to other services:

✅ Verification successful

Verification Complete: Scaling Configuration is Isolated and Correct

The scaling settings have been exclusively applied to the GraphQL service, aligning with the PR objectives. No similar scaling configurations were found in other Bicep files, ensuring that the changes do not inadvertently impact other services.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for similar scale configurations in other Bicep files

# Test: Search for scale configurations in other Bicep files
rg --type bicep -g 'var scale = \{' .azure

Length of output: 74


Script:

#!/bin/bash
# Description: Check for similar scale configurations in other Bicep files

# Find all .bicep files in the .azure directory and search for 'var scale = {'
fd --extension bicep .azure | xargs grep -H 'var scale = {'

Length of output: 61

.azure/applications/web-api-eu/main.bicep (2)

155-155: LGTM! Proper integration of scaling configuration.

The scale variable is correctly passed to the containerApp module, enabling dynamic scaling based on the defined criteria. This implementation aligns with the PR objectives of adding automatic scaling capabilities for the web API service.


Line range hint 1-174: Overall, the changes successfully implement the scaling configuration.

The implementation aligns well with the PR objectives, adding automatic scaling capabilities based on CPU and memory usage for the web API service. The code is clean, focused, and follows best practices.

To ensure these changes work as expected, please verify the following:

  1. Test the deployment in a non-production environment to confirm that the scaling rules are applied correctly.
  2. Monitor the CPU and memory usage of the deployed container app under various load conditions to ensure it scales as intended.
  3. Verify that the maximum number of replicas (10) is sufficient for your expected load, especially in the yt01 environment as mentioned in the PR objectives.

Run the following script to check for any other files that might need similar scaling configurations:

If other files are found, consider applying similar scaling configurations to maintain consistency across your infrastructure.

Collaborator

@MagnusSandgren MagnusSandgren left a comment

This one needs to wait a bit until service is up and running, unless scale.minReplicas = scale.maxReplicas = 1.

@arealmaas
Collaborator Author

I may have misunderstood, but isn't it only serviceowner-web-api that needs scale: 1? It's the only one that has RUN_OUTBOX_SCHEDULER=true

Or is it something else that determines that?

@oskogstad
Collaborator

> I may have misunderstood, but isn't it only serviceowner-web-api that needs scale: 1? It's the only one that has RUN_OUTBOX_SCHEDULER=true
>
> Or is it something else that determines that?

Nope, that's correct. This PR works fine; SO still gets the default 1:1
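
The behavior agreed on in this thread can be sketched as two module calls: one that omits `scale` and therefore keeps the module default of 1:1, and one that passes its own range. The module path matches the file named in the review; the parameter names other than `scale`, the deployment names, and the image values are hypothetical.

```bicep
// Hypothetical inputs, declared only to keep the sketch self-contained.
param environmentId string
param serviceOwnerImage string
param endUserImage string

// Range from the PR; the CPU and memory rules are left out of this sketch.
var scale = {
  minReplicas: 2
  maxReplicas: 10
}

// serviceowner-web-api: `scale` is not passed, so the module default
// (minReplicas: 1, maxReplicas: 1) still applies.
module serviceOwnerApi '../../modules/containerApp/main.bicep' = {
  name: 'serviceowner-web-api'
  params: {
    name: 'serviceowner-web-api'
    image: serviceOwnerImage
    environmentId: environmentId
  }
}

// enduser web-api: opts in to scaling by passing the variable.
module endUserApi '../../modules/containerApp/main.bicep' = {
  name: 'enduser-web-api'
  params: {
    name: 'enduser-web-api'
    image: endUserImage
    environmentId: environmentId
    scale: scale
  }
}
```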

oskogstad
oskogstad previously approved these changes Oct 16, 2024

@arealmaas arealmaas merged commit eb0f19b into main Oct 16, 2024
25 checks passed
@arealmaas arealmaas deleted the feat/applications-scale-on-cpu branch October 16, 2024 15:45
arealmaas pushed a commit that referenced this pull request Oct 17, 2024
🤖 I have created a release *beep* *boop*
---


## [1.25.0](v1.24.0...v1.25.0) (2024-10-17)


### Features

* **applications:** add scalers for cpu and memory ([#1295](#1295)) ([eb0f19b](eb0f19b))
* **infrastructure:** create new yt01 app environment ([#1291](#1291)) ([1a1ccc0](1a1ccc0))
* **service:** add permissions for service-bus ([#1305](#1305)) ([7bf4177](7bf4177))
* **service:** deploy application in container apps ([#1303](#1303)) ([a309044](a309044))


### Bug Fixes

* **applications:** add missing property for scale configuration ([3ffb724](3ffb724))
* **applications:** use correct scale configuration ([#1311](#1311)) ([b8fb3cc](b8fb3cc))
* Fix ID-porten acr claim parsing ([#1299](#1299)) ([8b8862f](8b8862f))
* **service:** ensure default credentials work ([#1306](#1306)) ([b1e6a14](b1e6a14))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).