Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Support JSON format for file-metrics-collector #1744

Closed
tenzen-y opened this issue Nov 25, 2021 · 15 comments · Fixed by #1765
Closed

[Proposal] Support JSON format for file-metrics-collector #1744

tenzen-y opened this issue Nov 25, 2021 · 15 comments · Fixed by #1765
Assignees

Comments

@tenzen-y
Copy link
Member

tenzen-y commented Nov 25, 2021

/kind feature

Describe the solution you'd like
[A clear and concise description of what you want to happen.]

Motivation

Currently, it is difficult to parse JSON format files by file-metrics-collector using regexp filter since file-metrics-collector is designed to use TEXT format files.
I believe if file-metrics-collector supports JSON format files, we can be further made Katib powerful because we can make use of JSON format metrics files without regexp more easily.
Therefore, I would like to support JSON format in file-metrics-collector, such as the following example, which is split by newlines.

{"foo": “bar", “fiz": “buz"…}
{“foo": “bar", “fiz": “buz"…}
{“foo": “bar", “fiz": “buz"…}
{“foo": “bar", “fiz": “buz"…}
…

This JSON format is also used in cloudml-hypertune recommended for use in GCP AI Platform or Vertex AI.

If you use a custom container for training or if you want to perform hyperparameter tuning with a framework other than TensorFlow, then you must use the cloudml-hypertune Python package to report your hyperparameter metric to AI Platform Training.

https://cloud.google.com/ai-platform/training/docs/using-hyperparameter-tuning#other_machine_learning_frameworks_or_custom_containers

Design

I'm thinking of the following Kubernetes API and webhook. Also, file-metrics-collector collects values whoose key is spec.objective.objectiveMetricName and spec.objective.additionalMetricNames from the metrcs file if FileSystemFileFormat is set Json.

+ type FileSystemFileFormat string
+
+ const (
+   TextFormat    FileSystemFileFormat = "Text"
+   JsonFormat    FileSystemFileFormat = "Json"
+ )

type FileSystemPath struct {
  Path string                     `json:"path,omitempty"`
  Kind FileSystemKind             `json:"kind,omitempty"`
+ FileFormat FileSystemFileFormat `json:"fileFormat,omitempty"`
}
func (g *DefaultValidator) validateMetricsCollector(inst *experimentsv1beta1.Experiment) error {
  mcSpec := inst.Spec.MetricsCollectorSpec
  mcKind := mcSpec.Collector.Kind
  ...
  switch mcKind {
  ...
  case commonapiv1beta1.FileCollector:
    ...
+     fileFormat := mcSpec.Source.FileSystemPath.FileSytemFileFormat
+     if fileFormat == "" {
+       fileFormat = commonapiv1beta1.TextFormat
+     } else if fileFormat != commonapiv1beta1.TextFormat && fileFormat != commonapiv1beta1.JsonFormat {
+         return return fmt.Errorf("The format of the metrics file is required by .spec.metricsCollectorSpec.source.fileSystemPath.fileFormat.")
+     }
  ...

Does it sound good to you? @kubeflow/wg-automl-leads

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

@tenzen-y
Copy link
Member Author

/kind feature

@andreyvelich
Copy link
Member

Thank you for submitting this proposal @tenzen-y!
I have few questions.

because we can make use of JSON format metrics files without regexp more easily

Do you mean it will be easier for user to create such Experiments ? E.g. they don't need to manually add regex to Metrics Collector spec.

How we will change the parsing process in File Metrics Collector ?

Will we have special regex for JSON that we are going to use automatically, when user selects fileFormat=JSON ?

What do you think @gaocegege @johnugeorge ?

@tenzen-y
Copy link
Member Author

tenzen-y commented Dec 2, 2021

Sorry for the late reply. @andreyvelich

Do you mean it will be easier for user to create such Experiments ? E.g. they don't need to manually add regex to Metrics Collector spec.

Yes, users do not need to specify regexp using fileFormat=JSON.

Will we have special regex for JSON that we are going to use automatically, when user selects fileFormat=JSON ?

I think we can parse JSON files of the format shown in the proposal in the following script.

  • log.json
{"checkpoint_path": "", "global_step": "0", "loss": "0.22082142531871796", "timestamp": 1638422847.28721, "trial": "0"}
{"acc": "0.9349666833877563", "checkpoint_path": "", "global_step": "0", "timestamp": 1638422847.287801, "trial": "0"}
{"checkpoint_path": "", "global_step": "1", "loss": "0.1414974331855774", "timestamp": 1638422870.035161, "trial": "0"}
{"acc": "0.9586416482925415", "checkpoint_path": "", "global_step": "1", "timestamp": 1638422870.037459, "trial": "0"}
{"checkpoint_path": "", "global_step": "2", "loss": "0.10683439671993256", "timestamp": 1638422900.152162, "trial": "0"}
{"acc": "0.9688166379928589", "checkpoint_path": "", "global_step": "2", "timestamp": 1638422900.1529338, "trial": "0"}
{"checkpoint_path": "", "global_step": "3", "loss": "0.08619903773069382", "timestamp": 1638422927.9910269, "trial": "0"}
{"acc": "0.9747458100318909", "checkpoint_path": "", "global_step": "3", "timestamp": 1638422927.991675, "trial": "0"}
{"checkpoint_path": "", "global_step": "4", "loss": "0.07176543772220612", "timestamp": 1638422952.8473432, "trial": "0"}
{"acc": "0.9790566563606262", "checkpoint_path": "", "global_step": "4", "timestamp": 1638422952.848325, "trial": "0"}
  • parse.go
package main

import (
	"encoding/json"
	"fmt"

	"github.com/nxadm/tail"
)

func main() {
	t, _ := tail.TailFile("./log.json", tail.Config{Follow: true})
	metrics := []string{"loss"}

	for line := range t.Lines {

		logText := line.Text
		var jsonObj map[string]interface{}
		json.Unmarshal([]byte(logText), &jsonObj)

		for _, metric := range metrics {
			if _, exist := jsonObj[metric]; !exist {
				continue
			}
			fmt.Printf("%v\n", jsonObj[metric])
		}
	}
}

@andreyvelich
Copy link
Member

Sounds good.
Probably we should also refactor watchMetricsFile to support EarlyStopping in such Metrics Collector format.

WDYT about this proposal @gaocegege @johnugeorge ?

@tenzen-y
Copy link
Member Author

tenzen-y commented Dec 3, 2021

Probably we should also refactor watchMetricsFile to support EarlyStopping in such Metrics Collector format.

It makes sense. @andreyvelich

@tenzen-y
Copy link
Member Author

tenzen-y commented Dec 5, 2021

I'll wait for feedback from the community until December 10 (UTC+9).

@andreyvelich
Copy link
Member

@gaocegege @johnugeorge Please give your feedback on this proposal

@tenzen-y
Copy link
Member Author

I'll wait for feedback from the community until December 10 (UTC+9).

I'll postpone starting the implementation of this feature after December 18 (UTC+9) and wait for feedback from @gaocegege @johnugeorge

@gaocegege
Copy link
Member

Sorry for the late reply.

I am wondering how to use the JSON feature from the user side. Could you please give us an example, e.g. Experiment YAML?

@tenzen-y
Copy link
Member Author

tenzen-y commented Dec 10, 2021

Thanks for your reply! @gaocegege
I think users can use the below YAML, does it sound good?

apiVersion: kubeflow.org/v1beta1
kind: Experiment
metadata:
  namespace: kubeflow
  name: file-metrics-collector
spec:
  objective:
    type: maximize
    goal: 0.99
    objectiveMetricName: accuracy
    additionalMetricNames:
      - loss
  metricsCollectorSpec:
    source:
-     filter:
-       metricsFormat:
-         - "{metricName: ([\\w|-]+), metricValue: ((-?\\d+)(\\.\\d+)?)}"
      fileSystemPath:
        path: "/katib/mnist.log"
        kind: File
# omitempty; default=Text
+       fileFormat: Json
    collector:
      kind: File
...

@gaocegege
Copy link
Member

Gotcha, LGTM.

BTW, the variable should be JSON instead of Json.

I think we can go ahead! 🎉

@tenzen-y
Copy link
Member Author

Thank you for giving me feedback! @gaocegege

BTW, the variable should be JSON instead of Json.

It makes sense.

@tenzen-y
Copy link
Member Author

I'll wait for feedback from the community until December 10 (UTC+9).

I'll postpone starting the implementation of this feature after December 18 (UTC+9) and wait for feedback from @gaocegege @johnugeorge

I'd like to start the implementation of this feature!

/assign

@gaocegege
Copy link
Member

Please assign me when the PR is ready to review

Thanks for your contribution! 🎉 👍

@tenzen-y
Copy link
Member Author

Please assign me when the PR is ready to review

Thanks for your contribution! 🎉 👍

Sure, Thanks for your review! @gaocegege

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants