Restructuring Kubeflow docs proposal #440

Closed
rui-vas wants to merge 33 commits

rui-vas
Contributor

@rui-vas rui-vas commented Oct 29, 2020

Phase 1 - mapping out docs today.

See kubeflow/website#2293.

@kubeflow-bot

This change is Reviewable

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: RFMVasconcelos
To complete the pull request process, please assign james-jwu after the PR has been reviewed.
You can assign the PR to them by writing /assign @james-jwu in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-cla google-cla bot added the cla: yes label Oct 29, 2020
@rui-vas rui-vas changed the title [WIP] - Mapping Kubeflow website nav [WIP] - Mapping Kubeflow docs Oct 29, 2020
@rui-vas rui-vas changed the title [WIP] - Mapping Kubeflow docs [WIP] - Restructuring Kubeflow docs proposal Oct 29, 2020
Member

@andreyvelich andreyvelich left a comment

Thank you for doing this @RFMVasconcelos, I left a few comments.

proposals/restructure-KF-docs-proposal.md
- Using IBM Cloud Container Registry (ICR)
- Pipelines on IBM Cloud Kubernetes Service (IKS)
- End-to-end Kubeflow on IBM Cloud
- **Kubeflow Operator**
Member

@andreyvelich andreyvelich Oct 30, 2020

Is the Kubeflow Operator also part of the Kubeflow components?
Or would we say that kfctl is a tool to deploy, monitor, and manage the lifecycle of Kubeflow?

Contributor Author

Kubeflow operator seems to be deployment tooling, loosely connected to kfctl, though under the same WG umbrella. It could fall into a "Lifecycle management" section or under "Getting started", as in effect it is an alternative deployment method for OpenShift.

Member

I agree with that @RFMVasconcelos.
My personal thought is that all Lifecycle management tools should be under the same website section.

- Experiment with the Pipelines Samples
- Run a Cloud-specific Pipelines Tutorial
- Troubleshooting
- Reference
Member

Pipelines has its own Reference section.
Does it make sense to move the Pipelines reference under /docs/reference, where the other projects' references are located?

Contributor Author

It seems that Pipelines and Fairing are the only components that have reference docs outside of the Reference section. It might make sense to merge everything.

Member

I think this comes back to the question of splitting component applications onto separate sites. In that case, each component application's documentation will need to be self-contained.

Member

From the conversations in the community meetings that I have attended, it sounds like we are moving towards separate sites for each component. Would it make sense to add a separate file for modeling out what that restructure would look like if we split the component docs onto other sites? Or, should we model that in this doc?

Contributor Author

As for Reference, if we move to the separate domains method, I would argue that we should keep a copy of the reference on both, in the same manner as we currently do with manifests in the GitHub org.

@joeliedtke happy with either. I guess none of this is set in stone. With this document I aim to identify the quickest structural changes we can make to make the website clearer. We still need to rely on WG leads to make sure their lower-level docs structure makes sense.

Member

It looks like I can't edit files in this PR, so here is a quick sketch of what I have been thinking about for information architecture of Kubeflow.org after the component docs move to their own site:

  • Kubeflow Home Page
  • Getting started (same as current proposal, consider renaming to About Kubeflow)
  • Community (same as current proposal)
  • Components
    • Jupyter Notebooks (Single page that describes what this component is and where to learn more)
    • Central Dashboard (Single page that describes what this component is and where to learn more)
    • Metadata (Single page that describes what this component is and where to learn more)
    • Fairing (Single page that describes what this component is and where to learn more)
    • Feature Store (Single page that describes what this component is and where to learn more)
    • Frameworks for training (Single page that describes what this component is and where to learn more)
    • Hyperparameter Tuning (Single page that describes what this component is and where to learn more)
    • Kubeflow Pipelines (Single page that describes what this component is and where to learn more)
    • Tools for Serving (Single page that describes what this component is and where to learn more)
    • Multi-Tenancy (Single page that describes what this component is and where to learn more)
    • Nuclio functions (Single page that describes what this component is and where to learn more)
  • Docs
    • Deployment (same as current proposal)
    • Configuring Kubeflow (same as Setups in the current proposal)
    • Resources (Review content, if it is general to Kubeflow then keep this section)
    • Troubleshooting
    • Reference (Review content, if it is general to Kubeflow then keep this section)

The thought is to focus the Kubeflow.org site on describing the Kubeflow project, components, and deployment options. All of the docs for the component applications will be hosted on other sites.

Contributor Author

Hi @joeliedtke, thank you for the feedback!

Regarding whether docs are hosted on other sites, I think the downsides might be less consistency, driving people away from Kubeflow.org, and making it harder to ensure the websites are high quality. But despite that, I'm happy with what the community decides.

As per your suggestions:

  • I like the Kubeflow Home Page on the sidebar.
  • I think Docs is not a good word there, as all of these are docs, but I see the point of merging those 5 things under an umbrella.
  • I also like Configuring instead of Setups.

@thesuperzapper
Member

@RFMVasconcelos just to be clear, is this a proposal for a quick (but temporary) fix to the doc structure?

Because seeing the discussion in kubeflow/website#2293, it's very clear the community wants to split the docs for each component into their own websites/subdomains. (While obviously leaving general stuff in kubeflow.org, and linking to the new component websites)

@rui-vas
Contributor Author

rui-vas commented Dec 5, 2020

@thesuperzapper please see this comment. WDYT?

@rui-vas rui-vas changed the title [WIP] - Restructuring Kubeflow docs proposal Restructuring Kubeflow docs proposal Dec 6, 2020
@mameshini
Contributor

mameshini commented Jan 14, 2021 via email

@rui-vas
Contributor Author

rui-vas commented Jan 15, 2021

@joeliedtke, thank you for the feedback!

I currently have Tasks>Deployment>Kubeflow on X cloud. I now get your point that under Kubeflow on X cloud there is more than deployment.

@mameshini, thank you as well! I also see that this proposal does not have a good space for distributions yet.

I believe the best way to bring clarity then is through the separation of:
Platforms - Workstation, AWS, Azure, GCP, IBM Cloud
Distributions - MiniKF, AgileStacks, Charmed Kubeflow, OpenShift, Operator

This would look like:

  • Home
  • Getting Started
  • Community
  • Components
  • Platforms
    • Workstation
    • Kubeflow on AWS
    • Kubeflow on Azure
    • Kubeflow on Google Cloud
    • Kubeflow on IBM Cloud
  • Methods & Distros
    • kfctl
    • MiniKF
    • Kubeflow Operator
    • Distro X (new)
    • Distro Y (new)
    • ....
  • Troubleshooting
  • Resources
  • Reference

What are your thoughts on this?

It would be great to reach a consensus soon, so we can make this actionable, given that this effort is now 2.5 months old and no changes have taken effect :)

cc @jbottum @jlewi @aronchick @Bobgy @james-jwu @8bitmp3 @PatrickXYS @thesuperzapper @andreyvelich @cvenets

@mameshini
Contributor

@RFMVasconcelos your proposal for docs structure looks great to me, happy to help with the content.

@PatrickXYS
Member

Let's move forward with this doc proposal; overall LGTM.

Thank you for your contributions @RFMVasconcelos

@joeliedtke
Member

One of the challenges that I see in revising the information architecture at this time is that we have a couple of large changes within the project that are currently in progress, such as moving the component docs off of Kubeflow.org and moving towards providing different distributions of Kubeflow. In both cases, I don't think that we have enough information to make the best decisions, so I lean towards making smaller changes now and incrementally revising the plan as the larger changes within the project move forward.

For example, will Distributions be distinct from Platforms? Or, will we have individual distributions to support different cloud providers or service providers? Are the docs for the AgileStacks distribution the same on-prem and AWS, or is this a separate set of documentation?

If the documentation should support individual distributions, then we may not need a Platforms section (since content for a platform could live within the section for the distribution). If there is some content that is shared between distributions (for example, an AWS distribution and AgileStacks on AWS), then we may want to consider options for managing shared content within the Distributions section. (That said, this increases the complexity of managing the docs.)

I think that more discussions are required to understand the relationship between Distributions and Platforms going forward. So, I'm not sure that it makes sense to add Distributions and Platforms at this time. There is probably another set of questions around whether Kubeflow.org is the best place for documentation about distributions. Should Kubeflow.org describe the distributions that are available, help users determine which distribution is right for them, and then link the user to the distribution's docs? (Similar to the long-term plan for components.)

Also, please do not abbreviate Distributions to Distros in the docs.

@mameshini
Contributor

@joeliedtke Yes, Distributions are separate from Platforms. Distributions are based on upstream Kubeflow but complement it with additional installation tools, opinionated configurations, or integrations. For example, the OpenShift distribution of Kubeflow brings it together with other OpenShift tools. It only helps Kubeflow users to know about the available deployment options: A) do it yourself from the latest upstream on a selected platform (such as Azure), or B) select a distribution and follow distribution-specific deployment instructions. There is also an option to engage professional services. Kubeflow.org is an excellent place to list all available distributions, similar to kubernetes.io and many other open source projects. This was already discussed many times and I am surprised you are asking this question.

Distributions exist to make it easier to deploy a subset of Kubeflow components on a selected cloud platform, with available commercial support from a vendor. The detailed content specific to each distribution will consist of an overview page with links pointing to distribution-maintained documentation. In most cases the instructions to deploy a distribution will be platform-dependent. For example, there may be separate pages for on-prem, GCP, and AWS. But we don't have to worry about it; we can leave it to the distribution. We are building a community of users and companies that work together on making Kubeflow the best open source machine learning platform.

@rui-vas
Contributor Author

rui-vas commented Jan 20, 2021

@joeliedtke, I think we shouldn't be stuck until we have the perfect information to move forward.

Many people in the community have shown support for this initiative and, honestly, if we don't start now, we will continue driving users through a poor UX. We know that this hinders growth in the adoption of Kubeflow, which is a driver for sales in the organizations involved and hence feeds back into how much these orgs are able to invest in Kubeflow.

It is my belief that we should be impatient in providing a comprehensible Kubeflow user experience.

@rui-vas
Contributor Author

rui-vas commented Jan 20, 2021

@abhi-g @Bobgy @james-jwu @jlewi @richardsliu @theadactyl as the OWNERs able to approve this initiative, can you please review?

Many thanks! 🚀

@joeliedtke
Member

I'm not suggesting that we wait for a perfect solution. I am concerned about premature optimization. Making decisions about things that we don't fully understand is a good way to create future problems. I'm suggesting that we limit the scope and iteratively work towards improving the site.

@joeliedtke
Member

joeliedtke commented Jan 20, 2021

@mameshini To what extent are distributions and platforms distinct? For example, is all of the documentation for Kubeflow on AWS applicable to Kubeflow on AWS when deployed by AgileStacks? Should the documentation for an AgileStacks distribution specify everything that someone will need to know to install Kubeflow? Or, is there a subset of content for Kubeflow on AWS that is applicable to the AgileStacks distribution?

Analysis like this can help us determine whether we need both sections, and which techniques will be helpful in managing this content. (I'm not arguing that we don't need distributions; I'm wondering if platform-specific content may become part of the content for a distribution.)

@joeliedtke
Member

@RFMVasconcelos

Please change Distros to Distributions.

@mameshini
Contributor

@joeliedtke There is a subset of content for Kubeflow on AWS that is applicable to the AgileStacks distribution, but certainly there are some differences. Each distribution makes some opinionated choices to provide better integration or user experience. For example, the documentation on AWS walks a user through the steps to configure ACM, Cognito, and Auth0. For the AgileStacks distribution we decided to use Let's Encrypt, Dex, and LDAP to make it work the same way as on-prem. Therefore, information about installing Kubeflow can be very distribution- and platform-specific, while information about using Kubeflow should be largely common between distributions. We also provide a set of tutorials/examples that are based on the AgileStacks distribution, so we can expect S3 buckets, secrets, and configmaps to be available for the tutorials to work out of the box.

Platform-specific content (like Kubeflow on AWS) is in addition to distribution-specific content, and in most cases it will apply fully and accurately to the AgileStacks distribution. However, our goal is to maintain a distribution that is deployed using the AgileStacks Hub CLI and works consistently across cloud and on-prem. The AWS team is doing a great job maintaining AWS-specific documentation; however, they don't have a goal for this documentation to also work for on-prem deployments. Therefore, they are selecting some tools and configuration options that will not work on-prem or on GCP. Multiple distributions make different choices for installation and integration, which are fine-tuned for different Kubeflow end-user personas.

@joeliedtke
Member

@mameshini Sorry for the delay getting back to you on this.

To me this sounds like the platform documentation should exist as part of the distribution content. We could help users understand which distributions work on their platform of choice, but it sounds like the bulk of the platform content is distribution specific.

I would recommend the following changes:

  • Content in the Platform section should move under the appropriate distribution in the Distributions section.

  • In the Methods and Distributions section, I'm not sure that Methods is a meaningful term in this case. I think that Distributions is the key term, and we can help users find the correct distribution for their environment. I would recommend changing that section to Distributions.

  • Slightly off topic, I'm interpreting the order of the pages as specifying the primary page for each section. If so, there are a few changes to suggest:

    • Under Getting Started, make the current getting started page the primary page for this section. Currently, the getting started doc is getting nearly twice the traffic that the about Kubeflow doc does; however, they are both popular pages.
    • Add an index page to Components, Distributions, Resources, Reference, and Troubleshooting.

What links are you envisioning appearing in the top nav?

@mameshini
Contributor

Distributions documentation is in addition to platform-specific documentation. I agree that it's better to rename "Methods and Distributions" to simply "Distributions". Let's build it out and then we can fine-tune it later. Overall the proposal lgtm.

@rui-vas
Contributor Author

rui-vas commented Jan 28, 2021

@joeliedtke, thank you for all the comments.

I have removed "Methods &", leaving only "Distributions". As @mameshini suggests, let's take this piece by piece, and eventually we will get to the creation of things like more index pages :)

If in the meantime you'd like to start one of those efforts in parallel, we can definitely attempt that. I think this has to look like a list of issues, so tomorrow I will create a project with a list of issues and we can start making all of this better 🚀

@joeliedtke
Member

@RFMVasconcelos, to clarify, are you planning to add issues for the entire doc plan, or a project where we can create issues for different aspects of this project? (For example, moving Pipelines to Components, or reordering the content of the Components section.) I agree that it would be helpful to track this work in one place. I would currently recommend against creating all of the tasks now, since discussions are continuing on some aspects of this plan.

@thesuperzapper
Member

thesuperzapper commented Jan 29, 2021

@RFMVasconcelos @joeliedtke as discussed in this week's community meeting, we should get moving on the "Components" section refactor, because there is no controversy around it and we want WGs to start updating their docs for Kubeflow 1.3.

For all those following this thread, please go to kubeflow/website#2465 for the "Components" issue which discusses the specifics of how we will do the refactor.

Let's aim to get that refactor done within the next week.

@rui-vas
Contributor Author

rui-vas commented Jan 29, 2021

Sounds great to me as a start! We can start with Components.

I've started this project so we can track issues/PRs, break this beast into smaller tasks, and address them effectively.

@joeliedtke this is what I meant by project. There is no need to make all issues for the entire doc plan now; let's start small and soon we'll have improved the overall UX :)

@rui-vas
Contributor Author

rui-vas commented Jan 29, 2021

@joeliedtke @mameshini @thesuperzapper I have created the first set of issues & PRs all attached to this project, so we can get this up and running! 🚀

As per discussion in the last community meeting, adding "Methods" back in, as for example `kfctl` is not a distribution but rather an installation method.
@stale

stale bot commented Jun 2, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Jun 2, 2021
@stale stale bot closed this Jun 11, 2021