Updating Windows KEP for GA #729

Merged
merged 10 commits into from
Jan 25, 2019

Conversation

michmike
Contributor

Updating the Windows KEP as we move it toward the implementable stage.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 25, 2019
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/pm sig/windows Categorizes an issue or PR as relevant to SIG Windows. labels Jan 25, 2019
@michmike
Contributor Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 25, 2019
- Some kubeadm work was done in the past to add Windows nodes to Kubernetes, but that effort has been dormant since. We will need to revisit that work and complete it in the future.
- Calico CNI for Pod networking
- Hyper-V isolation (Currently this is limited to 1 container per Pod and is an alpha feature)
- It is unclear whether the RuntimeClass proposal from sig-node will simplify scheduling Windows containers. We will work with sig-node on this.
Member

If this is still not well understood I don't think it needs to be included here

Contributor Author

Folks from sig-architecture will likely ask about this, which is why I included it here, to indicate that we will do more work on this.

Member

sounds good

Contributor

My meta-point here is that Windows stable shouldn't require supporting an alpha or beta feature. We should continue working on a plan for this alongside SIG-Node. I think this is ok as-is

Contributor

I think it might be clearer if the section was renamed to "Windows Node Roadmap" to make it explicit that "eventually" is beyond the scope of GA

Member

I asked about RuntimeClass back in November. :-)

@craiglpeters has a good point. I assume "eventually" is post-GA for all of these?

nodeSelector:
"beta.kubernetes.io/os": windows
tolerations:
- key: "Os"
Member

I think you actually want a lower-case os here to match the example.

"beta.kubernetes.io/os": windows
tolerations:
- key: "Os"
operator: "Equals"
Member

the operator is Equal not Equals

Contributor Author

good catch. updated.

@benmoss
Member

benmoss commented Jan 25, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 25, 2019
- Horizontal Pod Autoscaling
- Windows Server 2019 is the only Windows operating system we will support at GA timeframe. Note above that the host operating system version and the container base image need to match. This is a Windows limitation we cannot overcome.
- Customers can deploy a heterogeneous cluster, with Windows and Linux compute nodes side-by-side and schedule Docker containers on both operating systems. Of course, Windows Server containers have to be scheduled on Windows and Linux containers on Linux
- Out-of-tree Pod networking with [Azure-CNI](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md), [OVN-Kubernetes](https://github.com/openvswitch/ovn-kubernetes), [two CNI meta-plugins](https://github.com/containernetworking/plugins), [Flannel (VXLAN and Host-Gateway)](https://github.com/coreos/flannel)
Contributor

Isn't VXLAN support only in 1903 currently?

Contributor Author

@astrieanna by the time we GA, it will be supported for Server 2019

operator: "Equal"
value: "Windows"
effect: "NoSchedule"
```
Contributor

Because Windows containers are specific to the os version, does it make sense to have the taint/toleration include the windows version? While only 2019 is supported at GA, eventually there will be more versions of windows support (as new Windows versions are released). A version-specific taint could help containers land on the right nodes.

Contributor Author

We were going to add that in the docs, but I made the change here as well for additional clarity.
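
A minimal sketch of what a version-specific taint and its matching toleration could look like, assuming the `os=Win1809:NoSchedule` taint that the KEP text passes to `--register-with-taints`. The key and value come from that example and are not a settled naming convention; both documents below are fragments, not complete manifests.

```yaml
# Node side (fragment of a Node object): the taint as it would appear on a
# Windows Server 2019 (1809) node.
# Assumes the kubelet was started with --register-with-taints=os=Win1809:NoSchedule.
spec:
  taints:
  - key: "os"
    value: "Win1809"
    effect: "NoSchedule"
---
# Pod side (fragment of a Pod spec): a toleration that matches only that
# OS version, because operator "Equal" compares both key and value.
tolerations:
- key: "os"
  operator: "Equal"
  value: "Win1809"
  effect: "NoSchedule"
```

With this shape, a node running a later Windows release would carry a different taint value, and Pods built on the 1809 base image would not tolerate it.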

- All features and functionality under `What works today` are fully tested and vetted to be working by SIG-Windows
- SIG-Windows has high confidence in the stability and reliability of Windows Server containers on Kubernetes
- 100% green/passing conformance tests that are applicable to Windows (see the Testing Plan section for details on these tests)
- Comprehensive documentation that includes but is not limited to the following sections
Contributor

Is there a plan for where these docs will live?

Contributor Author

updated to include the location

Contributor

@craiglpeters - have you talked to SIG-Docs on this yet?

Contributor

@PatrickLang I have not. Just started conversation with internal team about docs. I'll reach out to sig-docs via slack today

Contributor

We need to have a plan for where the docs will live, and how they will be written. But finalizing those plans shouldn't be a prerequisite for calling this KEP /implementable

Co-Authored-By: michmike <michmike@users.noreply.github.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 25, 2019
@PatrickLang
Contributor

/lgtm

More PRs still coming. The final one will include a change to status:implementable.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 25, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: benmoss, michmike, PatrickLang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@PatrickLang
Contributor

/remove hold

@michmike
Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 25, 2019
@k8s-ci-robot k8s-ci-robot merged commit 3c177cd into kubernetes:master Jan 25, 2019
Member

@bgrant0607 bgrant0607 left a comment

Thanks for the updates


## Proposal

As of 29-11-2018 much of the work for enabling Windows nodes has already been completed. Both `kubelet` and `kube-proxy` have been adapted to work on Windows Server, and so the first goal of this KEP is largely already complete.

### What works today
- Windows-based containers can be created by kubelet, [provided the host OS version matches the container base image](https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/version-compatibility)
- ConfigMap, Secrets: as environment variables or volumes
- Pod (single or multiple containers per Pod with process isolation), Deployment, ReplicaSet
Member

It's confusing to mention Deployment and ReplicaSet here, and DaemonSet and StatefulSet below. Please discuss all the workload controllers adjacent to one another.

Do Job and CronJob have any issues? If not, please list them with ReplicaSet and Deployment.


## Proposal

As of 29-11-2018 much of the work for enabling Windows nodes has already been completed. Both `kubelet` and `kube-proxy` have been adapted to work on Windows Server, and so the first goal of this KEP is largely already complete.

### What works today
- Windows-based containers can be created by kubelet, [provided the host OS version matches the container base image](https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/version-compatibility)
- ConfigMap, Secrets: as environment variables or volumes
- Pod (single or multiple containers per Pod with process isolation), Deployment, ReplicaSet
- Service types NodePort, ClusterIP, LoadBalancer, and ExternalName
Member

Headless services?

Are there any DNS differences?

- Dockershim CRI
- Many<sup id="a1">[1]</sup> of the e2e conformance tests when run with [alternate Windows-based images](https://hub.docker.com/r/e2eteam/) which are being moved to [kubernetes-sigs/windows-testing](https://www.github.com/kubernetes-sigs/windows-testing)
- Persistent storage: FlexVolume with [SMB + iSCSI](https://github.com/Microsoft/K8s-Storage-Plugins/tree/master/flexvolume/windows), and in-tree AzureFile and AzureDisk providers
- Windows Server containers can take advantage of StatefulSet functionality for stateful applications and distributed systems
- Windows Pods can take advantage of DaemonSet, with the exception that privileged containers are not supported on Windows (more on that below)
Member

Above you mentioned "Windows server containers" and here "Windows pods". Is there any difference in meaning between the two?

Contributor Author

No difference. I will update the naming to be consistent.

- Resource limits
- Pod & container metrics
- Pod networking with [Azure-CNI](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md), [OVN-Kubernetes](https://github.com/openvswitch/ovn-kubernetes), [two CNI meta-plugins](https://github.com/containernetworking/plugins), [Flannel](https://github.com/coreos/flannel) and [Calico](https://github.com/projectcalico/calico)
- Horizontal Pod Autoscaling
Member

Are system OOMs reported?

Are there notable differences in Pod Status fields?



### What will never work (without underlying OS changes)
- Certain Pod functionality
- Privileged containers
- Privileged containers and other Pod security context privilege and access control settings
Member

I asked a bunch of other questions on the original KEP PR:
#676 (comment)
#676 (comment)
#676 (comment)
#676 (comment)
#676 (comment)

Contributor Author

@bgrant0607, which Linux capabilities specifically do you mean? These ones? https://kubernetes.io/docs/tasks/configure-pod-container/security-context/

## Graduation Criteria
#### Ensuring OS-specific workloads land on appropriate container host
As you can see below, we plan to document how Windows containers can be scheduled on the appropriate host using Taints and Tolerations. All nodes today have the following default labels
- beta.kubernetes.io/os = [windows|linux]
Member

It's worth noting the promotion of these to stable:
kubernetes/kubernetes#72929
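
For illustration, a selector written against the promoted label might look like the fragment below. It assumes the stable name is the beta name with the `beta.` prefix dropped (`kubernetes.io/os`); check the linked PR for the exact label names and the release in which they become available.

```yaml
# Fragment of a Pod spec. Assumption: the stable label is kubernetes.io/os,
# replacing beta.kubernetes.io/os once the promotion lands.
nodeSelector:
  "kubernetes.io/os": windows
```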


## Implementation History
However, we understand that in certain cases customers have a large number of pre-existing deployments for Linux containers. Since they will not want to change all deployments to add nodeSelectors, the alternative is to use Taints. Because the kubelet can set Taints during registration, it could easily be modified to automatically add a taint when running on Windows only (`--register-with-taints=os=Win1809:NoSchedule`). With a taint on all Windows nodes, nothing is scheduled on them unless it tolerates the taint, so existing Linux Pods stay off them. In order for a Windows Pod to be scheduled on a Windows node, it would need both the nodeSelector to choose Windows and a toleration.
Member

It's not just deployments, but also ecosystem off-the-shelf configurations, such as community Helm charts, and programmatic pod generation cases, such as with Operators. I think taints are going to be needed in most cases.
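
To make the combination concrete, here is a hedged sketch of a Windows Pod that carries both the nodeSelector and the toleration described in the KEP text. The Pod name, container name, image, and command are illustrative assumptions; the label and taint values are the ones quoted in this PR.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: windows-example          # hypothetical name
spec:
  containers:
  - name: app                    # hypothetical container name
    # Assumed image; the base image version must match the host OS version.
    image: mcr.microsoft.com/windows/servercore:ltsc2019
    command: ["ping", "-t", "localhost"]   # placeholder long-running process
  # Restricts the Pod to Windows nodes.
  nodeSelector:
    "beta.kubernetes.io/os": windows
  # Allows the Pod onto nodes registered with os=Win1809:NoSchedule.
  tolerations:
  - key: "os"
    operator: "Equal"
    value: "Win1809"
    effect: "NoSchedule"
```

Linux workloads generated by Helm charts or Operators need no change under this scheme: without the toleration they are kept off the tainted Windows nodes.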


## Graduation Criteria
- All features and functionality under `What works today` are fully tested and vetted to be working by SIG-Windows
Member

Is this section complete, or is @craiglpeters still working on it?

My previous comment:
#676 (comment)

Contributor Author

I made some more edits just now that I will be pushing through.

11. Advanced: How to use Hyper-V isolation (not a stable feature yet)
12. Advanced: How to build Kubernetes for Windows from source
13. Supported functionality (with examples where appropriate)
14. Known Limitations
Member

Do any node add-ons work, such as node problem detector?

- Horizontal Pod Autoscaling
- Windows Server 2019 is the only Windows operating system we will support at GA timeframe. Note above that the host operating system version and the container base image need to match. This is a Windows limitation we cannot overcome.
- Customers can deploy a heterogeneous cluster, with Windows and Linux compute nodes side-by-side and schedule Docker containers on both operating systems. Of course, Windows Server containers have to be scheduled on Windows and Linux containers on Linux
- Out-of-tree Pod networking with [Azure-CNI](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md), [OVN-Kubernetes](https://github.com/openvswitch/ovn-kubernetes), [two CNI meta-plugins](https://github.com/containernetworking/plugins), [Flannel (VXLAN and Host-Gateway)](https://github.com/coreos/flannel)
- Dockershim CRI
- Many<sup id="a1">[1]</sup> of the e2e conformance tests when run with [alternate Windows-based images](https://hub.docker.com/r/e2eteam/) which are being moved to [kubernetes-sigs/windows-testing](https://www.github.com/kubernetes-sigs/windows-testing)
- Persistent storage: FlexVolume with [SMB + iSCSI](https://github.com/Microsoft/K8s-Storage-Plugins/tree/master/flexvolume/windows), and in-tree AzureFile and AzureDisk providers
Member

@ddebroy ddebroy Jan 27, 2019

Some questions/notes from the storage perspective:

  1. Is there some in-tree code that specifically allows AzureDisk to work with Windows that is not present for other similar existing in-tree block/disk backed storage plugins like GCE PD/AWS EBS/etc?

  2. If GCE PD/AWS EBS and others are known to work with Windows workers, can they also be added here (along with Azure Disk) please for clarity?

  3. In the context of the CSI Migration initiative (the effort to have in-tree plugins shim out to CSI versions of the in-tree plugins over a couple of releases so that eventually the in-tree plugin code can be removed), lack of support for CSI node plugins for Windows 2019 may have an impact if EBS/GCE-PD in-tree works with Windows workers today but their CSI counterparts will not in the future (until Windows OS enhancements to support CSI node plugins like mount propagation, privileged containers, etc. are in).

  4. While SMB-based storage will be available (through the FlexVolume plugin and AzureFile), can the support for NFS-based storage be clarified? For example, are there any plans for an NFS FlexVolume plugin for Windows?

Member

For NFS, just came across kubernetes/kubernetes#56188 (comment). So sounds like NFS [#4 above] is beyond scope.

Contributor Author

We will try to get answers to your questions.


### Non-Goals

- Adding Windows support to all projects in the Kubernetes ecosystem (Cluster Lifecycle, etc)
- Enable the Kubernetes master components to run on Windows
Member

Is supporting LCOW a non goal?

Contributor Author

For now, yes. I will clarify.

@michmike
Contributor Author

@ddebroy and @bgrant0607 you asked some really good questions. we will find the answers and make the necessary updates.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
9 participants