Update Prometheus monitoring docs for Training Operator #2254

Closed
andreyvelich opened this issue Sep 10, 2024 · 6 comments · Fixed by #2301

Comments

@andreyvelich
Member

As we discussed in this PR, we should update and move the Prometheus monitoring docs to the Kubeflow website: #2252.

cc @kubeflow/wg-training-leads

/good-first-issue
/area docs
/kind feature


@andreyvelich:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.


@sophie0730
Contributor

Hello,
I'm new to Kubeflow and interested in contributing to this issue. Before I begin, I have a few questions:

  1. Will the documentation be contributed to the kubeflow/website repository? (Will the PR be created in the kubeflow/website repo?)
  2. Should I draft a new version of the Prometheus monitoring doc, or can I directly move the document you provided to the website?
  3. If I need to draft a new version of the Prometheus monitoring doc, could you please direct me to resources that will help me fully understand this effort?

Thank you for your guidance. I'm looking forward to contributing to the project. @andreyvelich

@tenzen-y
Member

> As we discussed in this PR, we should update and move the Prometheus monitoring docs to the Kubeflow website: #2252.

+1

@tenzen-y
Member

> Will the documentation be contributed to the kubeflow/website repository? (Will the PR be created in the kubeflow/website repo?)

Yes.

> Should I draft a new version of the Prometheus monitoring doc, or can I directly move the document you provided to the website?

Reorganizing the existing descriptions and making the metric details more informative would be helpful.

@andreyvelich
Member Author

@tenzen-y is right. @sophie0730, feel free to submit a draft PR in the kubeflow/website repo to add docs about Prometheus metrics.
/assign @sophie0730

@sophie0730
Contributor

Thanks @tenzen-y and @andreyvelich! I'll raise a PR as soon as possible.

sophie0730 added a commit to sophie0730/training-operator that referenced this issue Oct 23, 2024
Signed-off-by: Sophie <sophy010017@gmail.com>