Add preflight OS, CPU, RAM, Swap, and Filesystem checks #326

Open
Wants to merge 1 commit into base: devel


@Kushal-deb Kushal-deb commented Feb 3, 2025

  • Implemented OS, NIC, and other preflight checks to validate system requirements before Ceph cluster creation.

    • Checks include:
      • OS version (RHEL 9+ required)
      • SELinux enforcing mode
      • Firewalld installation and status
      • Required package availability (rpcbind, podman, firewalld)
      • Podman version check (>= 3.3)
      • RHEL software profile validation
      • Tuned profile check
      • CPU, RAM, swap, and filesystem checks (grouped under "other checks")
      • Whether jumbo frames are enabled
      • Whether the NIC is configured with DHCP or a static IP
      • Whether the NIC bandwidth is sufficient
      • Current NIC options in use (e.g. bonding, not bridged or virtual), collected and reported
      • Network latency (ping) to all hosts in the inventory file, checked and reported
      • Separate NICs for front-end and back-end networks
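
The task names in the run output (initialize a results list, run a check, store its result) suggest a check-then-record pattern. A minimal sketch of that pattern for the OS version check might look like the following. The variable name `preflight_results`, the helper variable `os_ok`, and the result keys are assumptions for illustration, not taken from the PR's playbook.

```yaml
# Hypothetical sketch: variable names and result keys are illustrative,
# not taken from the PR's playbook.
- name: Record a preflight check result (sketch)
  hosts: all
  gather_facts: true
  tasks:
    - name: Initialize preflight results list
      ansible.builtin.set_fact:
        preflight_results: []

    - name: Store OS check result
      vars:
        # True when the host reports RHEL with major version 9 or later.
        os_ok: >-
          {{ ansible_facts['distribution'] == 'RedHat'
             and (ansible_facts['distribution_major_version'] | int) >= 9 }}
      ansible.builtin.set_fact:
        # Append one result entry instead of failing immediately,
        # so every check can run before the final report.
        preflight_results: >-
          {{ preflight_results + [{
               'name': 'OS Version',
               'passed': os_ok | bool,
               'reason': 'N/A' if (os_ok | bool) else 'RHEL 9 or later is required'
             }] }}
```

Recording results rather than failing on the first problem lets a single run report every failed requirement at once, which is how the report in the output below behaves.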

Enhancements:

❯ ansible-playbook -i ~/ansible-inventory/inventory.ini cephadm-preflight.yml

PLAY [insecure_registries] *******************************************************************************************************************************************************************************************************************

TASK [fail if insecure_registry is undefined] ************************************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

PLAY [preflight] *****************************************************************************************************************************************************************************************************************************

TASK [fail when ceph_origin is custom with no repository defined] ****************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

TASK [fail if baseurl is not defined for ceph_custom_repositories] ***************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

PLAY [all] ***********************************************************************************************************************************************************************************************************************************

TASK [Initialize preflight results list] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Check if OS is RHEL 9+] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store OS check result] *****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ensure SELinux is set to Enforcing mode] ***********************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Retrieve SELinux status from ansible_facts] ********************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine SELinux Check Result] ********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine SELinux Failure Reason] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store SELinux check result] ************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ensure required packages are installed] ************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine Package Installation Check Result] *******************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine Package Installation Failure Reason] *****************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store Package Installation Result] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ensure firewalld is enabled and running] ***********************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine Firewalld Check Status] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store Firewalld check result] **********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Collect installed package facts] *******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Check if Podman is installed] **********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Extract Podman version] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define Podman Check Variables] *********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store Podman Installation Check] *******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ensure Podman is installed if missing (Fixable)] ***************************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

TASK [Validate RHEL software profile] ********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define RHEL Profile Check Result] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define RHEL Profile Check Reason] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store RHEL Profile check] **************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get current tuned profile] *************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define Tuned Profile Check Result] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define Tuned Profile Check Reason] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store Tuned Profile Check] *************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Check if CPU supports x86-64-v2 (AVX2 required for RHEL 9)] ****************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define CPU x86-64-v2 Check Variables] **************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store CPU Instruction Set Check Result] ************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get available CPU cores using Ansible facts] *******************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define CPU Core Check Variables] *******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store CPU Core Check] ******************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get total RAM using Ansible facts] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define RAM Check Variables] ************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store RAM Check Result] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get total swap space using Ansible facts] **********************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Calculate required swap space] *********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Set Swap Space Check Variables] ********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store Swap Space Check] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Check if /var is a separate partition] *************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Set /var Partition Check Variables] ****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store /var Partition Check] ************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get root filesystem size] **************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Set Root Filesystem Check Variables] ***************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store Root Filesystem Check] ***********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get active network interfaces] *********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Extract MTU for jumbo frames Check] ****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define jumbo frames Check Variables] ***************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store jumbo frames Check] **************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Extract NIC IP Configuration] **********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define NIC Configuration Check Variables] **********************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store NIC Configuration Check] *********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Extract NIC Bandwidth] *****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define NIC Bandwidth Check Variables] **************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store NIC Bandwidth Check] *************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get NIC Configuration Details] *********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define NIC Configuration Check Variables] **********************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store NIC Configuration Check] *********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ping all hosts in inventory] ***********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin] => (item=rhel-ceph-admin)

TASK [Define Network Latency Check Variables] ************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store Network Latency Check] ***********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Identify Front-End NIC] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Identify Back-End NIC] *****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define Front-End & Back-End NIC Separation Check Variables] ****************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store Front-End & Back-End NIC Separation Check] ***************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Generate Preflight Check Report] *******************************************************************************************************************************************************************************************************
changed: [rhel-ceph-admin -> localhost]

TASK [Display Preflight Check Report] ********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Show Report Summary] *******************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin] => 
  msg:
  - Preflight Check Report
  - ''
  - ==================================================
  - '    **System Checks**'
  - '--------------------------------------------------'
  - '- **OS Version**: ✅ Passed  -   **Reason:** N/A'
  - '- **SELinux**: ✅ Passed  -   **Reason:** N/A'
  - '- **Firewalld Running**: ✅ Passed  -   **Reason:** N/A'
  - '- **Podman Installed**: ✅ Passed  -   **Reason:** Podman version is 5.2.2'
  - '- **RHEL Profile**: ❌ Failed  -   **Reason:** Incorrect RHEL software profile. Expected: Server with File and Storage Server.'
  - '- **Tuned Profile**: ❌ Failed  -   **Reason:** Incorrect tuned profile. Expected: throughput-performance'
  - '- **CPU x86-64-v2**: ✅ Passed  -   **Reason:** N/A'
  - '- **CPU Cores >= 4**: ✅ Passed  -   **Reason:** N/A'
  - '- **Minimum RAM (8GB)**: ❌ Failed  -   **Reason:** System has only 7684 MB RAM, required: 8192MB'
  - '- **Swap Space (1.5x RAM)**: ❌ Failed  -   **Reason:** System has only 5119 MB Swap, required: 11526 MB'
  - '- **/var is a separate partition**: ❌ Failed  -   **Reason:** /var is not a separate partition'
  - '- **Root Filesystem >= 100GB**: ❌ Failed  -   **Reason:** Root FS is only 43GB, required: 100GB'
  - '- **Jumbo Frames Enabled**: ❌ Failed  -   **Reason:** MTU is 1500, recommended > 1500'
  - '- **NIC Static IP Configuration**: ❌ Failed  -   **Reason:** NIC is using DHCP, static IP is recommended'
  - '- **NIC Bandwidth (10GbE Recommended)**: ❌ Failed  -   **Reason:** NIC speed is -1 Mbps, recommended is 10GbE'
  - '- **NIC Configuration**: ❌ Failed  -   **Reason:** NIC options: lo, ens3'
  - '- **Network Latency**: ❌ Failed  -   **Reason:** Latency results: [''pong'']'
  - '- **Separate NICs for Frontend & Backend Networks**: ❌ Failed  -   **Reason:** Using same NIC for both front-end and back-end networks. Customers with large deployments should separate traffic for performance optimization.'
  - ''
  - ==================================================
  - '     **Summary**'
  - '--------------------------------------------------'
  - '❌ **Critical Failures Detected**: '
  - '   - RHEL Profile, Tuned Profile, Minimum RAM, Swap Space, /var Partition, Root Filesystem, Jumbo Frames, NIC Configuration, NIC Bandwidth, NIC Separation'
  - '   -   **Action Required**: Please fix the above issues before proceeding.'

TASK [Final Check - Fail if any critical checks failed] **************************************************************************************************************************************************************************************
fatal: [rhel-ceph-admin]: FAILED! => changed=false 
  msg: 'Preflight checks failed for the following: RHEL Profile, Tuned Profile, Minimum RAM, Swap Space, /var Partition, Root Filesystem, Jumbo Frames, NIC Configuration, NIC Bandwidth, NIC Separation. Please resolve these issues before proceeding.'

PLAY RECAP ***********************************************************************************************************************************************************************************************************************************
rhel-ceph-admin            : ok=70   changed=1    unreachable=0    failed=1    skipped=4    rescued=0    ignored=0   
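
The swap requirement in the report above follows from the "1.5x RAM" rule in the check label: 7684 MB × 1.5 = 11526 MB. A minimal sketch of that calculation with Ansible facts could look like this. The variable names `swap_factor` and `required_swap_mb` are assumptions, and the PR's actual tasks may differ.

```yaml
# Hypothetical sketch: variable names are illustrative, not from the PR.
- name: Swap sizing check (sketch)
  hosts: all
  gather_facts: true
  vars:
    swap_factor: 1.5  # assumption: multiplier taken from the "1.5x RAM" label
  tasks:
    - name: Calculate required swap space
      ansible.builtin.set_fact:
        # memtotal_mb is the total RAM in MB as reported by the setup module.
        required_swap_mb: "{{ (ansible_facts['memtotal_mb'] * swap_factor) | int }}"

    - name: Report swap shortfall
      ansible.builtin.debug:
        msg: >-
          System has only {{ ansible_facts['swaptotal_mb'] }} MB Swap,
          required: {{ required_swap_mb }} MB
      when: (ansible_facts['swaptotal_mb'] | int) < (required_swap_mb | int)
```

Note that `memtotal_mb` reflects the kernel's MemTotal, which is typically somewhat lower than the installed RAM. A host sized at a nominal 8 GB can therefore fall short of a strict 8192 MB threshold, which may be what the 7684 MB value in the report reflects.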

cephadm-preflight.yml Outdated Show resolved Hide resolved
Collaborator

@guits guits left a comment

Please avoid using ignore_errors: true as much as possible.
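
One common alternative to ignore_errors: true for read-only checks is to register the result and control failure explicitly with failed_when and changed_when. The sketch below illustrates the idea for the tuned profile check. It is an assumption about how such a task could be written, not the change that was made in this PR, and the variable names are hypothetical.

```yaml
# Hypothetical sketch: variable names are illustrative, not from the PR.
- name: Read-only check without ignore_errors (sketch)
  hosts: all
  gather_facts: false
  tasks:
    - name: Get current tuned profile
      ansible.builtin.command: tuned-adm active
      register: tuned_active
      changed_when: false  # a read-only query never changes the host
      failed_when: false   # a non-zero exit is recorded, not treated as fatal

    - name: Define Tuned Profile Check Result
      ansible.builtin.set_fact:
        # Passes only if the command succeeded and reported the expected profile.
        tuned_check_passed: >-
          {{ (tuned_active.rc | default(1)) == 0
             and 'throughput-performance' in (tuned_active.stdout | default('')) }}
```

Unlike ignore_errors: true, this keeps the task from showing up as failed in the run output while still making the pass or fail decision explicit in the next task.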

@Kushal-deb Kushal-deb force-pushed the implement_os_preflight_checks branch from 5d002a4 to 280a9cf Compare February 5, 2025 07:53
@Kushal-deb Kushal-deb requested a review from guits February 5, 2025 07:57
@Kushal-deb Kushal-deb force-pushed the implement_os_preflight_checks branch from 280a9cf to 2a6cb0f Compare February 5, 2025 14:30
@Kushal-deb Kushal-deb requested a review from guits February 5, 2025 14:41
@Kushal-deb Kushal-deb changed the title from "Add preflight OS checks" to "Add preflight OS and other checks" Feb 6, 2025
@Kushal-deb Kushal-deb force-pushed the implement_os_preflight_checks branch from 2a6cb0f to e837ef9 Compare February 6, 2025 11:38
@Kushal-deb Kushal-deb requested a review from guits February 6, 2025 11:39
@Kushal-deb Kushal-deb changed the title from "Add preflight OS and other checks" to "Add preflight OS , CPU, RAM, Swap, and Filesystem checks" Feb 6, 2025
@Kushal-deb Kushal-deb changed the title from "Add preflight OS , CPU, RAM, Swap, and Filesystem checks" to "Add preflight OS, CPU, RAM, Swap, and Filesystem checks" Feb 6, 2025
@Kushal-deb Kushal-deb closed this Feb 6, 2025
@Kushal-deb Kushal-deb deleted the implement_os_preflight_checks branch February 6, 2025 11:43
@Kushal-deb Kushal-deb restored the implement_os_preflight_checks branch February 6, 2025 11:43
@Kushal-deb Kushal-deb reopened this Feb 6, 2025
@Kushal-deb Kushal-deb force-pushed the implement_os_preflight_checks branch from e837ef9 to 6e47331 Compare February 6, 2025 17:19
@Kushal-deb Kushal-deb force-pushed the implement_os_preflight_checks branch from 6e47331 to 9546e44 Compare February 11, 2025 08:25
@Kushal-deb Kushal-deb requested a review from guits February 11, 2025 09:49
preflight_checks.yml Outdated Show resolved Hide resolved
- Implemented OS preflight checks to validate system requirements before Ceph cluster creation.
- Checks include:
  - OS version (RHEL 9+ required)
  - SELinux enforcing mode
  - Firewalld installation and status
  - Required package availability (rpcbind, podman, firewalld)
  - Podman version check (>= 3.3)
  - RHEL software profile validation
  - Tuned profile check
  - CPU, RAM, swap, and filesystem checks (grouped under "other checks")
  - Whether jumbo frames are enabled
  - Whether the NIC is configured with DHCP or a static IP
  - Whether the NIC bandwidth is sufficient
  - Current NIC options in use (e.g. bonding, not bridged or virtual), collected and reported
  - Network latency (ping) to all hosts in the inventory file, checked and reported
  - Separate NICs for front-end and back-end networks
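
As an illustration of the jumbo frames item in the list above, a check based on gathered facts could be sketched as follows. This assumes the host has a default IPv4 route, so that `ansible_facts['default_ipv4']` is populated, and it inspects only that interface. The variable name `nic_mtu` is hypothetical and the PR's actual tasks may differ.

```yaml
# Hypothetical sketch: assumes a default IPv4 route exists on the host.
- name: Jumbo frames check (sketch)
  hosts: all
  gather_facts: true
  tasks:
    - name: Extract MTU for jumbo frames Check
      ansible.builtin.set_fact:
        # MTU of the interface that carries the default IPv4 route.
        nic_mtu: "{{ ansible_facts['default_ipv4']['mtu'] | int }}"

    - name: Report jumbo frames status
      ansible.builtin.debug:
        msg: "MTU is {{ nic_mtu }}, recommended > 1500"
      when: (nic_mtu | int) <= 1500
```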
@Kushal-deb Kushal-deb force-pushed the implement_os_preflight_checks branch from 9546e44 to 39a250e Compare February 11, 2025 15:06
@Kushal-deb Kushal-deb requested a review from guits February 11, 2025 15:10

2 participants