
Convert OLM V1 MVP share doc to markdown #166

Merged
merged 1 commit into from
Apr 18, 2023

Conversation

tlwu2013
Contributor

This commit adds the OLM v1 doc in markdown, converted from the shared doc: OLM V1 Product Requirements Document


@everettraven everettraven left a comment


I know this is a direct port from the Google doc, but I have a couple non-blocking comments:

docs/olmv1_mvp.md
## Functional Requirements
_Priority Rating: 1 highest, 2 medium, 3 lower (e.g. P2 = Medium Priority)_

**F1 - Extension catalogs (P1):** The existing OLM concepts around catalogs, packages, and channels are to be used as a basis for the functional requirements below.
Contributor

From a contributor perspective, I think it would be neat to set these up as check boxes so that when we have met these requirements we can check them off. Would be helpful to easily tell which things have been achieved and which have not.
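As a rough illustration of that suggestion (the requirement labels here are placeholders, not taken from the doc), GitHub-flavored markdown task lists would render as checkable boxes:

```markdown
## Functional Requirements

- [x] **F1 - Example requirement (P1):** a met requirement, rendered with a checked box
- [ ] **F2 - Example requirement (P1):** an unmet requirement, rendered with an empty box
```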

Contributor Author

I like the idea of having a visual checklist to track progress, but at the same time, I worry the rollout of certain items may occur in multiple phases, which could make it difficult to promptly check items off. As such, should we leave the bullet list as is for the time being and submit a pull request for an update once there are features that can be checked off?

Contributor

IMO an item should not be checked off until it is fully implemented. I'm fine with checking items off after the fact, if we are okay with that.

Member

I would at least make the individual items markdown headers so that we get link anchors out of them. But that can easily happen in a follow-up.
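For illustration (assuming GitHub's default anchor generation, which lowercases the header text, strips most punctuation, and replaces spaces with hyphens), a requirement promoted to a header would get a linkable anchor like:

```markdown
### F1 - Extension catalogs (P1)

<!-- GitHub derives an anchor from the header text, so elsewhere in the doc you can write: -->
See [F1](#f1---extension-catalogs-p1) for the catalog requirements.
```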


OLM has unique support for the specific needs of cluster extensions, which are commonly referred to as operators. These are classified as one or more Kubernetes controllers shipping with one or more API extensions (CustomResourceDefinitions) to provide additional functionality to the cluster (though there are deviations from this coupling of CRDs and controllers, discussed below). They are managed centrally by OLM running on the cluster, where OLM's functionality is itself implemented following the Kubernetes operator pattern.

OLM defines a lifecycle for these extensions in which they are installed (potentially causing other extensions to be installed as well), allow a limited set of configuration customizations at runtime, are updated along a path defined by the extension developer, and are eventually decommissioned and removed.
Member

Would it be helpful to refer to the extensions in this particular context directly as "operators" for easier understanding?

**B6 - Human-readable extension status information (P2):** Whenever OLM is in the process of reaching, has reached, or has failed to reach a desired state, it needs to update the user about what is happening / what has happened without assuming knowledge of OLM internals or implementation details.

**B7 - Scalability & Resource consumption (P1):** OLM is used on clusters with hundreds to thousands of namespaces and tenants. Its API controls, specifically for F2 and F12, need to be built in such a way that resource consumption scales linearly with usage and cluster size, and the overall resource usage envelope stays within manageable bounds that do not put cluster stability at risk, especially that of the API server. System memory in particular is a scarce resource.

Member

Do we also need to talk about safer installs and rollback in case an install is unsuccessful?

Comment on lines 107 to 109
- No additional tenancy model will be introduced at the control plane / API layer of Kubernetes upstream

- kcp doesn’t fundamentally change OLM's role and responsibilities around managing extensions (at least initially)
Member

@varshaprasad96 varshaprasad96 Apr 13, 2023


Do we need these assumptions explicitly mentioned for a user?


- kcp doesn’t fundamentally change OLM's role and responsibilities around managing extensions (at least initially)

- OLM will move to a descoped, cluster-wide singleton model for cluster extensions, extension management isn’t namespaced
Member

Suggested change on:
- OLM will move to a descoped, cluster-wide singleton model for cluster extensions, extension management isn’t namespaced

Could we simplify this to say that "OLM will consider extensions to be cluster scoped"? I understand where this is coming from, but I'm just wondering if an upstream user would be able to get the context easily.


## Constraints
- Only operator bundles with “AllNamespace” mode installation support can be lifecycled with the new APIs / flows in OLM 1.0
Member

Suggested change on:
- Only operator bundles with “AllNamespace” mode installation support can be lifecycled with the new APIs / flows in OLM 1.0

This was supposed to be Rukpak's limitation just for the time being, right? Or is this how it is going to be moving forward?
It seems weird to say that OLM v1 will never support single- or multiple-namespace mode.


# TODO
- Definition of "extension"
- Does OLM become ELM? Does this provide for provisioning bundles that do not add APIs?
Member

What's ELM? (Extension lifecycle manager?)

Comment on lines 121 to 130
## Migration
- A new set of APIs is introduced in parallel to the existing set of APIs

- Users opt-in to the new set of APIs, potentially resulting in a reinstall of their extension if required

- Extensions that are shipped with the current bundle format with AllNamespace mode can simply be reused with the new set of APIs and controls

- Extensions that do not support AllNamespace mode cannot be managed with the new APIs

- Migration scripting is provided to mass-convert existing installed extensions (“Subscription” / “OperatorGroup” objects) on existing clusters to the new OLM 1.0 model assuming they are compatible

- Operator authors that are also SRE/Managed PaaS administrators are incentivized to make their operator compatible with the requirements of OLM 1.0 to reap the operational benefits
Member

Wondering if migration from v0 to v1 is something we want to talk about now. Support for this is definitely necessary, but we still haven't looked into how we will do it. Can't we just say that it is necessary and needs to be worked on?

@@ -0,0 +1,136 @@
# OLM v1 MVP
Contributor Author

@tlwu2013 tlwu2013 Apr 14, 2023


The functional/behavioral requirements included in this doc are actually beyond the scope of the MVP. What do you all think if we title this as "OLM v1 Project Requirement Document (PRD)"?

Contributor

Maybe "OLMv1 Roadmap"? IMO it would be nice to break this down into different sections that include what is needed for the MVP and then what is planned for the future. Maybe a structure like:

```markdown
# OLM v1 Roadmap

## MVP

### Functional Requirements
### Behavioral Requirements

## Future Work

### Functional Requirements
### Behavioral Requirements
```

That way we can have a clear line between what is necessary for the MVP and what is not.

Contributor Author

I've changed the title and the filename to "roadmap" in this PR. Should we defer defining the MVP and Future Work scope to a future PR?

@tlwu2013
Contributor Author

hey folks, thanks for looking at this PR; I've quickly addressed some of the feedback. Since the intention of this PR is a direct port of the shared doc to markdown, so that other feature requirements can be discussed and captured in the upstream requirements markdown, should we merge this one first and address the feedback that needs extended discussion in later PRs? Thank you all!


@joelanford joelanford left a comment


This looks good to me. I agree that we should go ahead and merge this and then make further suggestions in follow-on PRs. Otherwise, I fear there will always be something debatable here that keeps this PR from merging. Let's have those individual debates in smaller PRs.

@tlwu2013 Could you amend your commit to get the DCO to pass? Instructions here: https://probot.github.io/apps/dco/
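For reference, the typical sequence those instructions describe is to add a Signed-off-by trailer to the existing commit and force-push (the branch name `v1-mvp` is taken from this PR; exact commands may vary with your setup):

```shell
# Amend the tip commit in place, appending a Signed-off-by trailer
# that matches your git user.name / user.email (what the DCO check verifies).
git commit --amend --no-edit --signoff

# Update the PR branch; --force-with-lease refuses to clobber remote
# changes you haven't fetched, unlike a plain --force.
git push --force-with-lease origin v1-mvp
```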

Signed-off-by: Tony Wu <tlwu2013@gmail.com>
@tlwu2013
Contributor Author

> This looks good to me. I agree that we should go ahead and merge this and then make further suggestions in follow-on PRs. Otherwise, I fear there will always be something debatable here that keeps this PR from merging. Let's have those individual debates in smaller PRs.
>
> @tlwu2013 Could you amend your commit to get the DCO to pass? Instructions here: https://probot.github.io/apps/dco/

Done, thanks @joelanford !

@joelanford joelanford merged commit ca22669 into operator-framework:main Apr 18, 2023
@tlwu2013 tlwu2013 deleted the v1-mvp branch April 18, 2023 21:07