Fixes #RHIROS-1401 - Dropping csv records with missing resource usage… #144

patilsuraj767 · 2023-10-30T09:21:24Z

… metrics

upadhyeammit · 2023-11-01T13:27:01Z

internal/utils/aggregator.go

@@ -40,6 +42,33 @@ func Aggregate_data(df dataframe.DataFrame) dataframe.DataFrame {

 	df = df.Mutate(s.Col("X0")).Rename("k8s_object_type", "X0")
 	df = df.Mutate(s.Col("X1")).Rename("k8s_object_name", "X1")
+
+	// filter out only valid workload type
+	df = df.Filter(


Is it possible to pull this out into a separate function? I understand it's just a call to Filter, but Aggregate_data is already lengthy. With that, can we have tests as well? Looking at those, one would understand what counts as invalid.

Added test cases
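A minimal sketch of what such an extracted helper and its test data could look like. It is written against plain Go structs rather than the dataframe used in the PR so that it runs with only the standard library. The `record` type, its fields, and `filterValidRecords` are hypothetical names; the PR's actual helper is `filter_valid_csv_records`, and the assumption here is that a row is dropped when any of the three usage metrics is missing.

```go
package main

import "fmt"

// record is a hypothetical stand-in for one CSV row.
// A nil pointer means the metric was missing from that row.
type record struct {
	Workload    string
	CPUUsage    *float64
	MemoryUsage *float64
	MemoryRSS   *float64
}

// filterValidRecords keeps rows that carry all three usage metrics
// and reports how many rows were dropped (assumed validity rule).
func filterValidRecords(rows []record) ([]record, int) {
	valid := make([]record, 0, len(rows))
	for _, r := range rows {
		if r.CPUUsage == nil || r.MemoryUsage == nil || r.MemoryRSS == nil {
			continue
		}
		valid = append(valid, r)
	}
	return valid, len(rows) - len(valid)
}

// f returns a pointer to v, which keeps the sample data short.
func f(v float64) *float64 { return &v }

func main() {
	rows := []record{
		{Workload: "a", CPUUsage: f(0.5), MemoryUsage: f(128), MemoryRSS: f(64)},
		{Workload: "b", CPUUsage: nil, MemoryUsage: f(128), MemoryRSS: f(64)},
	}
	valid, dropped := filterValidRecords(rows)
	fmt.Println(len(valid), dropped) // prints: 1 1
}
```

Keeping the filter in its own function means a table of sample rows like the one in `main` can document what "invalid" means without reading through the aggregation code.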

upadhyeammit · 2023-11-01T13:31:49Z

internal/utils/aggregator.go

+	// Validation to check if metrics for cpuUsage, memoryUsage and memoryRSS are missing
+	df, no_of_dropped_records := filter_valid_csv_records(df)
+	if no_of_dropped_records != 0 {
+		invalidDataPoints.Add(float64(no_of_dropped_records))


If I understand correctly, filter_valid_csv_records is never going to return a float. Is it Prometheus that expects a float value?

Yes, the Prometheus Add method requires a float value.
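To illustrate the conversion discussed above, here is a small sketch that keeps the row count as an `int` and converts it only at the call site. `fakeCounter` and `recordDropped` are hypothetical stand-ins used so the example runs without the Prometheus client library; only the `Add(float64)` method shape is taken from the discussion.

```go
package main

import "fmt"

// fakeCounter is a hypothetical stand-in that mirrors the shape of a
// counter's Add(float64) method.
type fakeCounter struct{ value float64 }

// Add accumulates v into the counter.
func (c *fakeCounter) Add(v float64) { c.value += v }

// recordDropped converts the integer row count to float64 at the call
// site and skips the call when nothing was dropped.
func recordDropped(c *fakeCounter, dropped int) {
	if dropped != 0 {
		c.Add(float64(dropped))
	}
}

func main() {
	c := &fakeCounter{}
	recordDropped(c, 3)
	recordDropped(c, 0)
	fmt.Println(c.value) // prints: 3
}
```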

upadhyeammit · 2023-11-01T13:36:17Z

internal/utils/metrics.go

+		Name: "rosocp_invalid_datapoints_total",
+		Help: "The total number of invalid datapoints(rows) found in CSVs recevied",


nit:

Suggested change

-		Name: "rosocp_invalid_datapoints_total",
-		Help: "The total number of invalid datapoints(rows) found in CSVs recevied",
+		Name: "rosocp_total_invalid_datapoints",
+		Help: "The total number of invalid datapoints(rows) found in received CSVs",

According to the Prometheus metric naming conventions (https://prometheus.io/docs/practices/naming/), the suffix should describe the unit.
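As a rough illustration of the convention cited here, the Prometheus naming guide recommends that an accumulating count carry `total` as a suffix. The check below is a hypothetical helper, not part of Prometheus or of this PR; it only compares the two names from the suggestion against that one rule.

```go
package main

import (
	"fmt"
	"strings"
)

// hasCounterSuffix reports whether a metric name ends in "_total",
// the suffix recommended for accumulating counts.
// This helper is illustrative only.
func hasCounterSuffix(name string) bool {
	return strings.HasSuffix(name, "_total")
}

func main() {
	fmt.Println(hasCounterSuffix("rosocp_invalid_datapoints_total")) // prints: true
	fmt.Println(hasCounterSuffix("rosocp_total_invalid_datapoints")) // prints: false
}
```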

upadhyeammit · 2023-11-01T13:38:00Z

internal/utils/aggregator.go

+	df, no_of_dropped_records := filter_valid_csv_records(df)
+	if no_of_dropped_records != 0 {
+		invalidDataPoints.Add(float64(no_of_dropped_records))
+		log.Infof("Invalid records in CSV - %v", no_of_dropped_records)


Does this need to be at Info level? And do you think we should also print more about owner_name and workload?

Yes, it is good to have this at Info level so that, using the request_id, we can check in Kibana how many rows were dropped for a particular request/CSV. Our logging system logs all the request-related info by default: https://github.com/RedHatInsights/ros-ocp-backend/blob/main/internal/logging/logging.go#L66-L71

saltgen

@patilsuraj767 The Aggregate_data function seems quite long.
Suggestion: could you split the function into two, one that just filters the records and another that aggregates them?

patilsuraj767 · 2023-11-06T12:01:31Z

/retest

anurag03 · 2023-11-15T10:00:14Z

/retest

saltgen

@patilsuraj767 LGTM 👍 Thank you for adding the tests 👌

patilsuraj767 · 2023-11-15T11:08:03Z

/retest

Fixes #RHIROS-1401 - Dropping csv records with missing resource usage…

e6eb00c

… metrics

patilsuraj767 added the Ready for review label Oct 30, 2023

Fix logging

2dab9b1

patilsuraj767 requested review from kgaikwad and saltgen November 1, 2023 05:25

upadhyeammit reviewed Nov 1, 2023

saltgen requested changes Nov 2, 2023

Added unit test cases

1277252

patilsuraj767 requested a review from upadhyeammit November 6, 2023 05:32

upadhyeammit approved these changes Nov 6, 2023

saltgen approved these changes Nov 15, 2023

patilsuraj767 merged commit 85ac7b5 into RedHatInsights:main Nov 15, 2023
1 of 2 checks passed
