[COST-5027] opt-in per namespace ROS OCP recommendations #355

Merged 26 commits on May 21, 2024

@maskarb (Member) commented May 15, 2024

  • The ROS queries all join on kube_namespace_labels{label_insights_cost_management_optimizations='true', namespace!~'kube-.*|openshift|openshift-.*'}, so results are limited to namespaces that carry this label.
  • Add a message when no namespaces are enabled for ROS.
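
To make the selector's semantics concrete, here is a minimal sketch in plain shell that mimics the two matchers on a list of namespaces. It is an illustration only, not how the operator evaluates the query: the function name opted_in_namespaces and the "namespace label_value" input format are hypothetical. PromQL regex matchers are fully anchored, which is why the pattern is wrapped in ^( ... )$ here.

```shell
# Hypothetical helper: reads "namespace label_value" pairs on stdin and prints
# the namespaces that the selector from this PR would keep.
opted_in_namespaces() {
  # Keep rows whose label value is exactly "true", then drop namespaces that
  # match the excluded pattern (anchored, as PromQL regex matchers are).
  awk '$2 == "true" { print $1 }' |
    grep -Ev '^(kube-.*|openshift|openshift-.*)$'
}

# Only koku-metrics-operator is both labeled and outside the excluded set.
printf '%s\n' \
  'koku-metrics-operator true' \
  'openshift-monitoring true' \
  'kube-system true' \
  'my-app false' |
  opted_in_namespaces
```

A labeled namespace under kube-* or openshift-* is still excluded, so applying the label there has no effect on ROS collection.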

Example report in which costmanagement-metrics-operator is the only namespace with the label enabled:
ros-openshift-202405.csv

Testing:
(Requires a cluster. One can be requested from ClusterBot with launch 4.15 aws,single-node.)

  1. Build an image:
$ VERSION=3.3.0
$ make docker-buildx IMG=quay.io/$USERNAME/koku-metrics-operator:v$VERSION; docker pull quay.io/$USERNAME/koku-metrics-operator:v$VERSION
  2. Install the CRD and deploy the operator:
$ make install
$ make deploy IMG=quay.io/$USERNAME/koku-metrics-operator:v$VERSION
  3. Create the following CR:
cat << EOF | oc apply -f -
apiVersion: koku-metrics-cfg.openshift.io/v1beta1
kind: KokuMetricsConfig
metadata:
  name: kokumetricscfg-sample-v1beta1
  namespace: koku-metrics-operator
spec:
  authentication: {}
  packaging: {}
  prometheus_config:
    collect_previous_data: false
    disable_metrics_collection_cost_management: false
    disable_metrics_collection_resource_optimization: false
  source:
    create_source: false
  upload:
    upload_toggle: false
EOF

This will create a CR that collects only the most recent hour of data. On a ClusterBot cluster, there likely won't be any data to collect during the first reconciliation anyway.

  4. Wait until the operator generates its first report.
  5. Apply the following label:

oc label namespaces koku-metrics-operator insights_cost_management_optimizations="true" --overwrite=true
  6. Check the previously gathered reports. Create the volume-shell pod and download the reports:
$ cat << EOF | oc apply -f -
kind: Pod
apiVersion: v1
metadata:
  name: volume-shell
  namespace: koku-metrics-operator
  labels:
    app: koku-metrics-operator
spec:
  volumes:
  - name: koku-metrics-operator-reports
    persistentVolumeClaim:
      claimName: koku-metrics-operator-data
  containers:
  - name: volume-shell
    image: busybox
    command: ['sleep', 'infinity']
    volumeMounts:
    - name: koku-metrics-operator-reports
      mountPath: /tmp/koku-metrics-operator-reports
EOF

(If this pod cannot mount the volume, set the node in the YAML to match the node the operator is running on.)

$ oc rsync volume-shell:/tmp/koku-metrics-operator-reports/ testing/tmp -n koku-metrics-operator

Notice that there is no ros-openshift-202405.csv file in tmp/data.

  7. Examine the CR and see the data collection message:

$ oc get kokumetricsconfig/kokumetricscfg-sample-v1beta1 -n koku-metrics-operator -o jsonpath='{.status.reports}' | jq   
{
  "data_collected": true,
  "data_collection_message": "No namespaces contain the `insights_cost_management_optimizations=\"true\"` label, so no resource optimization metrics were collected.",
  "last_hour_queried": "2024-05-17 19:00:00 - 2024-05-17 19:59:59",
  "report_month": "05"
}
  8. Allow another hour to pass so the next round of data is collected.
  9. Download the reports again. This time there should be a ros-openshift-202405.csv file in tmp/data containing line items only for the koku-metrics-operator namespace.
  10. The previous data_collection_message should also no longer appear:
{
  "data_collected": true,
  "last_hour_queried": "2024-05-17 20:00:00 - 2024-05-17 20:59:59",
  "report_month": "05"
}
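
The two manual checks in these steps (is a ROS report present, and does the status still carry the message) could also be scripted. The sketch below is one possible way to do that; the function names ros_report_present and has_collection_message are hypothetical and not part of the operator or oc. It assumes the data_collection_message key is simply absent from .status.reports once a labeled namespace has been collected, as in the output above.

```shell
# Hypothetical helpers for the manual checks above; not part of the operator.

# Succeeds if any ros-openshift-*.csv file exists under the given directory.
ros_report_present() {
  find "$1" -type f -name 'ros-openshift-*.csv' | grep -q .
}

# Succeeds if the given .status.reports JSON carries a data_collection_message.
has_collection_message() {
  printf '%s' "$1" | grep -q '"data_collection_message"'
}

# Sample input copied from the status output after the namespace was labeled.
labeled='{"data_collected": true, "last_hour_queried": "2024-05-17 20:00:00 - 2024-05-17 20:59:59", "report_month": "05"}'

if has_collection_message "$labeled"; then
  echo "ROS metrics were skipped"
else
  echo "no data collection message"
fi
```

Against a live cluster, the JSON argument would come from the oc get command in step 7, and ros_report_present would be pointed at the directory the reports were downloaded to.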

codecov bot commented May 15, 2024

Codecov Report

Attention: Patch coverage is 89.79592%, with 5 lines in your changes missing coverage. Please review.

Project coverage is 86.31%. Comparing base (16ae929) to head (707c78b).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #355      +/-   ##
==========================================
+ Coverage   85.85%   86.31%   +0.45%     
==========================================
  Files          13       13              
  Lines        2171     2206      +35     
==========================================
+ Hits         1864     1904      +40     
+ Misses        224      220       -4     
+ Partials       83       82       -1     
Flag       Coverage Δ
unittests  86.31% <89.79%> (+0.45%) ⬆️

Files                                                  Coverage Δ
internal/collector/prometheus.go                       97.43% <100.00%> (+0.21%) ⬆️
internal/controller/prometheus.go                      95.19% <100.00%> (+0.24%) ⬆️
internal/packaging/packaging.go                        82.50% <100.00%> (ø)
...nternal/controller/kokumetricsconfig_controller.go  86.13% <88.88%> (+1.75%) ⬆️
internal/collector/collector.go                        88.62% <80.00%> (-0.72%) ⬇️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 16ae929...707c78b.

@maskarb changed the title from "dynamic ros" to "[COST-5027] opt-in per namespace ROS OCP recommendations" on May 15, 2024
@maskarb marked this pull request as ready for review May 15, 2024 17:09
@maskarb requested a review from a team May 21, 2024 13:50
@maskarb marked this pull request as draft May 21, 2024 14:02
@maskarb marked this pull request as ready for review May 21, 2024 16:06
@maskarb merged commit 58145e2 into main May 21, 2024
10 checks passed
@maskarb deleted the ros-updates branch May 21, 2024 16:46