fix(volumes): Miscellaneous fixes #2402

Merged
adityahase merged 22 commits into master from fix-volumes
Jan 2, 2025

adityahase
Member

Show disk size for single and multi-volume machines

  • For single-volume machines, use the only disk for billing/usage calculations.
  • Multi-volume machines have a relatively small (~10GB) root disk. This is excluded from billing/usage calculations.
  • Auto-extend data filesystems to fill available space.
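
The first two points can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Volume` class, its `role` field, and `billable_disk_size` are hypothetical names, not code from this PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Volume:
    # Hypothetical stand-in for one attached volume of a machine.
    device: str
    size: int  # size in GB
    role: str  # "root", "data", or "temporary" (assumed labels)


def billable_disk_size(volumes: List[Volume], has_data_volume: bool) -> int:
    """Return the disk size (GB) used for billing/usage calculations."""
    wanted_role = "data" if has_data_volume else "root"
    for volume in volumes:
        if volume.role == wanted_role:
            return volume.size
    raise ValueError(f"no {wanted_role} volume attached")


single = [Volume("/dev/nvme0n1", 100, "root")]
multi = [Volume("/dev/nvme0n1", 10, "root"), Volume("/dev/nvme1n1", 200, "data")]

print(billable_disk_size(single, has_data_volume=False))  # the only disk
print(billable_disk_size(multi, has_data_volume=True))    # root disk excluded
```

Selecting by role rather than by position keeps a temporary volume from being counted by accident.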

Metrics for all volumes

  • Server metrics show Disk IOPS and space usage for all mounted volumes ("real" filesystems).

Mountpoint-specific disk usage reactions

  • Handle disk usage alerts based on the mount point.
  • Find (and extend) the correct device based on the mount point.
  • Handle partitioned and unpartitioned volumes.
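
As a rough sketch of what handling partitioned and unpartitioned volumes can involve: a partitioned device usually needs its partition grown before the filesystem is extended, so the device path has to be split into a parent disk and a partition number. The helper name `split_device` and the set of device naming schemes covered here are assumptions for illustration, not code from this PR.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns covering common Linux block device names.
_PARTITION_PATTERNS = (
    # Base name ends in a digit (NVMe, MMC), so the kernel inserts a "p".
    re.compile(r"^(?P<disk>/dev/(?:nvme\d+n\d+|mmcblk\d+))p(?P<part>\d+)$"),
    # Base name ends in letters (SCSI, virtio, Xen), number is appended directly.
    re.compile(r"^(?P<disk>/dev/(?:sd|vd|xvd)[a-z]+)(?P<part>\d+)$"),
)


def split_device(device: str) -> Tuple[str, Optional[int]]:
    """Return (parent disk, partition number), or (device, None) if unpartitioned."""
    for pattern in _PARTITION_PATTERNS:
        match = pattern.match(device)
        if match:
            return match.group("disk"), int(match.group("part"))
    return device, None


for device in ("/dev/nvme0n1p1", "/dev/nvme1n1", "/dev/xvda1", "/dev/sdf"):
    print(device, split_device(device))
```

When the partition number is `None`, the filesystem sits directly on the disk and only the filesystem resize step applies.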

Explicitly handle machines with data volumes

Virtual Machine, Virtual Machine Image, Server, and Database Server DocTypes now have a has_data_volume field.

Use Press Job arguments for setting labels

```json
{
  "labels": {
    "alertname": "Disk Almost Full",
    "cluster": "Mumbai",
    "device": "/dev/nvme0n1p1",
    "fstype": "ext4",
    "instance": "n1.local.frappe.cloud",
    "job": "node",
    "mountpoint": "/",
    "severity": "critical"
  }
}
```

We need to use this to find the mountpoint of the disk that's almost full.
The default behavior remains unchanged, i.e. pick the first volume from the list.
Handles partitioned and un-partitioned devices.
Handles any device (unlike the hardcoded /dev/nvme0n1)
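
A minimal sketch of the lookup described above, assuming the alert labels are passed in as a plain dict. The `Volume` class, its `mountpoints` field, and `volume_for_alert` are hypothetical and only illustrate the fallback to the first volume.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Volume:
    # Hypothetical stand-in for one attached volume of a machine.
    volume_id: str
    mountpoints: Tuple[str, ...]
    size: int  # size in GB


def volume_for_alert(labels: Dict[str, str], volumes: List[Volume]) -> Volume:
    """Pick the volume behind the alert's mountpoint, else the first volume."""
    if not volumes:
        raise ValueError("machine has no volumes")
    mountpoint = labels.get("mountpoint")
    for volume in volumes:
        if mountpoint in volume.mountpoints:
            return volume
    # Default behavior: no (or unknown) mountpoint, use the first volume.
    return volumes[0]


volumes = [
    Volume("vol-root", ("/",), 10),
    Volume("vol-data", ("/opt/volumes/mariadb", "/opt/volumes/benches"), 200),
]
labels = {"alertname": "Disk Almost Full", "mountpoint": "/opt/volumes/mariadb"}
print(volume_for_alert(labels, volumes).volume_id)
print(volume_for_alert({}, volumes).volume_id)
```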
TODO: Free space and prediction calculations still use data from the / mount point
Filter for all three possible mountpoints. Root, MariaDB and Benches.
Show multiline charts for IOPS and disk space.
On single volume machines this defaults to /
On multi-volume machines defaults to /opt/volumes/mariadb and /opt/volumes/benches
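
The defaults above could be expressed along these lines. The function names are hypothetical, and the label matcher is only an assumption about how a Prometheus query might be filtered by mount point. It relies on these paths containing no regex metacharacters.

```python
from typing import List

ROOT = "/"
MARIADB = "/opt/volumes/mariadb"
BENCHES = "/opt/volumes/benches"


def default_mountpoints(has_data_volume: bool) -> List[str]:
    """Mountpoints charted by default for a machine."""
    if has_data_volume:
        return [MARIADB, BENCHES]
    return [ROOT]


def mountpoint_matcher(mountpoints: List[str]) -> str:
    """Build a PromQL regex label matcher for the given mountpoints."""
    return 'mountpoint=~"{}"'.format("|".join(mountpoints))


print(mountpoint_matcher(default_mountpoints(has_data_volume=False)))
print(mountpoint_matcher(default_mountpoints(has_data_volume=True)))
print(mountpoint_matcher([ROOT, MARIADB, BENCHES]))
```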

After this, the required-space calculations will query metrics for the correct mount point.
Before
VM.disk_size
- Set from Plan.size
- Used to set root volume size on boot
- Always fetched from VM.volumes[0].size

Now
A machine can have one or two disks (root and data), so we use two fields:
VM.root_disk_size
- Defaults to 10 GB
- Used to set root volume size on boot
- Fetched from get_root_volume()

VM.disk_size (Stand-in for VM.data_disk_size)
- Set based on Plan.size
- Used to set the data disk size on boot
- Fetched from get_root_volume()
Size and root_size serve the same purpose as VM.disk_size and VM.root_disk_size
Root Disk Size now stores the size of the root disk
Data volume in the Virtual Machine Image is very small (~10GB)

All plans use larger volumes (~25GB+), so we resize the filesystem

Root partition (and filesystem) seems to auto-extend on first boot
Before this, we relied on the number of volumes to determine whether a machine has a data volume

It's possible to have multiple volumes that aren't data volumes
(e.g. volumes for temporary data copying)

This field should only be edited in Virtual Machine DocType

Virtual Machine dictates whether a Server / Database Server has a data volume
Virtual Machine Image copies this value at the time of image creation
Virtual Machines created from an image inherit the value
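
The propagation rules above can be illustrated with a small sketch. These dataclasses are simplified, hypothetical stand-ins for the DocTypes, not the actual Press models.

```python
from dataclasses import dataclass


@dataclass
class VirtualMachine:
    name: str
    has_data_volume: bool = False  # the only place this flag is edited


@dataclass(frozen=True)
class VirtualMachineImage:
    source: str
    has_data_volume: bool  # copied once, at image creation time


@dataclass
class Server:
    virtual_machine: VirtualMachine

    @property
    def has_data_volume(self) -> bool:
        # The Virtual Machine dictates the value, so the server just reads it.
        return self.virtual_machine.has_data_volume


def create_image(vm: VirtualMachine) -> VirtualMachineImage:
    return VirtualMachineImage(source=vm.name, has_data_volume=vm.has_data_volume)


def create_vm_from_image(image: VirtualMachineImage, name: str) -> VirtualMachine:
    return VirtualMachine(name=name, has_data_volume=image.has_data_volume)


vm = VirtualMachine("m1", has_data_volume=True)
image = create_image(vm)
clone = create_vm_from_image(image, "m2")
vm.has_data_volume = False  # later edits do not affect the existing image
print(image.has_data_volume, clone.has_data_volume, Server(vm).has_data_volume)
```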

@adityahase merged commit 613e2a6 into master on Jan 2, 2025
5 checks passed
@adityahase deleted the fix-volumes branch on January 2, 2025 13:16

codecov bot commented Jan 2, 2025

Codecov Report

Attention: Patch coverage is 33.76623% with 102 lines in your changes missing coverage. Please review.

Project coverage is 38.45%. Comparing base (65b8ad7) to head (62d1457).
Report is 26 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...s/press/doctype/virtual_machine/virtual_machine.py | 30.18% | 37 Missing ⚠️ |
| press/press/doctype/server/server.py | 37.03% | 34 Missing ⚠️ |
| ...ype/virtual_machine_image/virtual_machine_image.py | 16.66% | 10 Missing ⚠️ |
| ...ertmanager_webhook_log/alertmanager_webhook_log.py | 25.00% | 9 Missing ⚠️ |
| ...s/press/doctype/database_server/database_server.py | 0.00% | 5 Missing ⚠️ |
| ...ype/prometheus_alert_rule/prometheus_alert_rule.py | 33.33% | 4 Missing ⚠️ |
| ...ual_machine_migration/virtual_machine_migration.py | 0.00% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2402      +/-   ##
==========================================
- Coverage   38.52%   38.45%   -0.07%     
==========================================
  Files         377      377              
  Lines       29286    29380      +94     
==========================================
+ Hits        11282    11299      +17     
- Misses      18004    18081      +77     
