
[1.9] Issue Burndown #38

Closed
enisoc opened this issue Nov 27, 2017 · 14 comments
Labels
sig/release Categorizes an issue or PR as relevant to SIG Release.

Comments

@enisoc
Member

enisoc commented Nov 27, 2017

This is a tracker for status updates on 1.9 issue burndown.

@jberkus
Contributor

jberkus commented Nov 27, 2017

Summary for Nov. 27

A slew of issues has been opened around failing tests, including:

New Test Failures (mostly e2e tests):
#56416
#56429
#56426
#56422
#56414

Old Test Failures in Progress:
#56262
#56235
#53020
#56091

Old Test Failures Not Getting Attention:
#56244
#56155
#56052

New Bugs:
#56357
#56399
#56357
#56348 (not clear that this should be in 1.9)

Old Bugs in Progress:
#56385
#56355

Old Bugs Not Getting Attention:

Non-release-blocker:
#47820

Special Cases:

#54551: Advanced Auditing (kubernetes/kubernetes#54551) really appears to be Not Ready for 1.9, depending on a bunch of PRs which either don't exist or aren't close to approval. Some of those PRs might be separable from a 1.9 feature, but I'm not clear on which those are. I've escalated to the feature/issue owner and sig/instrumentation. This includes some related issues:
#53020
#53006

#55978: this issue looks like it should be closed, given the code merges. However, I haven't been able to get sig/node or JingXu to confirm that.

#49480: CNI 0.6.0: this issue appears to have a huge chicken-and-egg problem with dependencies. It's making progress, but slowly. Escalated to sig/network.

@jberkus
Contributor

jberkus commented Nov 28, 2017

Today's status:

Summary for Nov. 28

20 total issues, so making progress

All "not getting attention" issues have been raised with the respective SIGs on Slack.

New Test Failures (<1 day old)
#54574 (old bug, just reopened)

Test Failures in Progress:
#56429
#56416
#56414
#56262
#56052
#55194
#47820

Test Failures Not Getting Attention:
#56244 (6 days)
#56426 (1 day)

New Bugs (<1 day old):

Bugs in Progress:
#56399
#56357
#56348

Bugs Not Getting Attention:
#56155 (3 days)
#55892 (5 days) (gave 24-hour deadline)

Special Cases:

#54551: Advanced Auditing (kubernetes/kubernetes#54551) Exception raised on burndown list. This includes some related issues:
#53020
#53006

#55978: has been closed.

#49480: CNI 0.6.0: now waiting on figuring out packaging/deployment. Receiving attention.

#56504: kubeadm tracker. I have not yet gone through the kubeadm issues for status. More later.

@jberkus
Contributor

jberkus commented Nov 29, 2017

So, based on today's meeting, let me give a breakdown of which issues appear to be blockers to me. Please note that, for many of the ones not receiving enough attention, I also don't have feedback from the SIG on how critical they are, so I'm guessing based on the description & reports back. Issues that I don't mention below are non-concerning.

Bugs & Test Failures

Red Tests/Bugs

Upgrade & Downgrade: Several failures, mostly on GKE. Downgrade appears generally broken.

  • #56244
  • #56426, #56429 & #56422 on GKE

Horizontal Pod Autoscaling may be failing, or the test may be broken; we can't tell (#54574).

Yellow:

Misc Bugs and Performance issues with proposed fixes:

  • #56357, polling problem, already has proposed fix
  • #55892, detach broken in AWS, already has proposed PR which needs adding

Tests that need test fixes:

  • #56416, network partition test, claim is that there's a defect in the test.
  • #56414

Performance Test issues, not possible to tell if these are actual performance issues or issues with the test platform:

  • #56052

Features

Blocking

Kubeadm just added a whole list of milestone items (#56504). However, it is opaque to me which of those is actually a blocker, and which is in progress.

CNI 0.6.0 (#49480): the code has been changed, but how this works for deployments and packaging has not been resolved.

Non-Blocking

AWS NVME Support: #56155, in progress, but if we don't have it, it's not a feature that worked before. See Exception request.

Advanced Audit Logging: includes #53020, #53006, #54551. This whole feature can be pushed to 1.10 if it's not ready on time. See Exception request.

@zacharysarah
Contributor

zacharysarah commented Nov 30, 2017

Docs

Friday, December 1 is the deadline for all docs PRs to be reviewable.

According to the feature tracking spreadsheet, these issues still need to open a docs PR (due 11/22):

kubernetes/enhancements#365
kubernetes/enhancements#353
kubernetes/enhancements#507
kubernetes/enhancements#178

/cc @enisoc @idvoretskyi

@jberkus
Contributor

jberkus commented Nov 30, 2017

Status update for Friday, Dec 1:

Status of PRs and Bugs

I'm adding PRs to this burndown, because I'm concerned about PRs which haven't yet been merged.

The good news is that we're down to 17 issues and 10 PRs. The bad news is that two of those issues are tracking issues. This status is updated before the extensions are due.

Red

bugs/failures which appear user-visible and for which no approved PR exists
PRs which appear required and do not have all sign-offs or are stalled

Issues:

  • #56594 Kubeadm test (new failure)
  • #56357 Performance Polling Issue (PR submitted, but see below)
  • #56244 Downgrade still failing (just got owner)
  • #56052 & #55194 Scale tests still failing. (Nagged issue owner/SIG)

PRs:

  • #56478 fix for polling issue, apparently has major bugs
    (not tagged with 1.9 milestone) (SIG polled)
  • #56513 upgrade/downgrade fixes
  • #56513 Kubeadm Upgrade/Downgrade bugs

Yellow

bugs/failures which either appear minor, or for which a PR in good shape exists
PRs which are approved but are waiting to be merged

Issues:

  • #56422 GKE Downgrade Failure (only affects downgrading 2 versions)
  • #56429 GKE ResourceQuota still failing (similar fix as per other GKE fixes)
  • #54574 Horiz Autoscaling Test Failure (PR submitted)
  • #56532 Storage errors (new bug, PR submitted)

PRs:

  • #56639 add test for volume resizing
  • #56598 fix PV bug
  • #56576 fix metadata agent
  • #56533 fix new storage errors
  • #50603 content negotiation bug (have queried owner because it doesn't appear
    to have anything tied to 1.9, suggested removal)

Green

issues which appear resolved but are not yet closed/removed.
PRs which are going to be removed from milestone but have not been

Issues:

  • #56416
  • #56414
  • #47820
  • #53006

Exception Status & Special Cases

Advanced Auditing: due today.
Issues: #54551, #53020
PRs: #56638

CNI 0.6.0: (#49480): see above. Also there seems to be no leadership on figuring this out.

AWS NVME Support: due today

Update Dashboard: due today, appears to be waiting only on owner approval.

Autoprobing Network-ID: due today, appears to just be waiting on owner approval.

@jberkus
Contributor

jberkus commented Dec 4, 2017

Status of PRs and Bugs

Down to 10 issues, and 14 PRs. Many of those PRs are random changes which happen
to be approved now and owners are hoping to get them into 1.9; none of the new
PRs look like blockers.

Remaining critical issues appear to be the Azure breakage, failure to test downgrade,
and CNI 0.6.0.

Red

bugs/failures which appear user-visible and for which no approved PR exists
PRs which appear required and do not have all sign-offs or are stalled

Issues:

PRs:

Yellow

bugs/failures which either appear minor, or for which a PR in good shape exists
PRs which are approved but are waiting to be merged

Issues:

PRs:

Green

issues which appear resolved but are not yet closed/removed.
PRs which are not approved for milestone
or are waiting on RC to merge

PRs:

Exception Status & Special Cases

Advanced Auditing: due 12/1

CNI 0.6.0: (kubernetes/kubernetes#49480): issue still lacks
an "owner" who is going to make sure that the deploy & packaging changes happen.

Kubeadm Changes (kubernetes/kubernetes#56504): according to SIG,
is almost complete with all changes in except for one open PR (kubernetes/kubernetes#56599)

AWS NVME Support: Completed

Update Dashboard: Completed

Autoprobing Network-ID: Completed

@jberkus
Contributor

jberkus commented Dec 6, 2017

Status of PRs and Bugs

Currently 13 issues (including new bugs). There are 16 PRs, but quite a number
of these are pending PRs which do not have SIG approval for the milestone. I've
put the unapproved PRs in "Green", because the idea is that most of them are
more likely to simply be dropped.

Remaining critical issues appear to be failure to test downgrade,
and CNI 0.6.0.

Red

bugs/failures which appear user-visible and for which no approved PR exists
PRs which appear required and do not have all sign-offs or are stalled

Issues:

PRs:

Yellow

bugs/failures which either appear minor, or for which a PR in good shape exists
PRs which are approved but are waiting to be merged

Issues:

PRs:

Green

issues which appear resolved but are not yet closed/removed.
PRs which are not approved for milestone
or are waiting on RC to merge

Issues:

PRs:

Exception Status & Special Cases

Advanced Auditing: due 12/1
Issues: kubernetes/kubernetes#55194 (tracking)
All PRs complete. Author says one more PR is required to fix one GCE-specific issue.

CNI 0.6.0: (kubernetes/kubernetes#49480): issue still lacks
an "owner" who is going to make sure that the deploy & packaging changes happen.

Kubeadm Changes (kubernetes/kubernetes#56504): according to SIG,
is almost complete with all changes in except for one open PR
(kubernetes/kubernetes#56599) which has to be finished
at release.

AWS NVME Support: Completed

Update Dashboard: Completed

Autoprobing Network-ID: Completed

@zacharysarah
Contributor

zacharysarah commented Dec 7, 2017

Docs

Status of PRs, cribbing shamelessly from @jberkus

Red

PRs which do not have a docs and/or tech review, or whose opener is unresponsive

| PR | Description |
| --- | --- |

Yellow

PRs which have docs/tech reviewers but require action prior to merging

| PR | Description |
| --- | --- |

Green

PRs that have been merged to the release megabranch, or otherwise require no pre-release action

| PR | Description |
| --- | --- |
| kubernetes/website#6066 | Done |
| kubernetes/website#6103 | Needs autogeneration (PR # pending) |
| kubernetes/website#6260 | Done |
| kubernetes/website#6371 | Done |
| kubernetes/website#6392 | Done |
| kubernetes/website#6415 | Done |
| kubernetes/website#6427 | Done |
| kubernetes/website#6463 | Done |
| kubernetes/website#6465 | Done |
| kubernetes/website#6474 | Done |
| kubernetes/website#6479 | Done |
| kubernetes/website#6485 | Done |
| kubernetes/website#6487 | Done |
| kubernetes/website#6494 | Done |
| kubernetes/website#6496 | Done |
| kubernetes/website#6504 | Done |
| kubernetes/website#6518 | Done |
| kubernetes/website#6519 | Merge after release |
| kubernetes/website#6520 | Merge after release |
| kubernetes/website#6521 | Merge after release |
| kubernetes/website#6522 | Merge after release |
| kubernetes/website#6536 | Done |
| kubernetes/website#6550 | Done |
| kubernetes/website#6553 | Done |
| kubernetes/website#6554 | Done |
| kubernetes/website#6555 | Done |
| kubernetes/website#6558 | Done |
| kubernetes/website#6650 | Done |
| kubernetes/kubernetes#56942 | Done |

@jberkus
Contributor

jberkus commented Dec 10, 2017

Status of PRs and Bugs

Currently 10 issues (including new bugs). There are 11 PRs. I've
put the unapproved PRs in "Green", because the idea is that most of them are
more likely to simply be dropped.

Remaining critical issues appear to be failure to test downgrade,
and CNI 0.6.0.

(and thanks Zachary for showing that I could be using markdown tables for this!)

Red

bugs/failures which appear user-visible and for which no approved PR exists
PRs which appear required and do not have all sign-offs or are stalled

Issues:

| issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56244 | Downgrade still failing | (debugging, no ETA) |
| kubernetes/kubernetes#56879 | Parallel downgrade failing | (newly escalated) |
| kubernetes/kubernetes#56426 | GKE downgrade failure | was resolved, now failing again, in progress |
| kubernetes/kubernetes#56416 | Network partition healing fail | had PR, fix didn't work |

PRs:

| PR | summary | status |
| --- | --- | --- |

Yellow

bugs/failures which either appear minor, or for which a PR in good shape exists
PRs which are approved but are waiting to be merged

Issues:

| issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56357 | Performance Polling Issue | ETA 1.9.1, not showstopper |
| kubernetes/kubernetes#47820 | Network Partition test fail | SIG says non-blocking but has not removed |
| kubernetes/kubernetes#53207 | Kubelet status update hang | newly escalated, has PR |

PRs:

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56824 | fix for Yaml quoting issue | tiny, approved |
| kubernetes/kubernetes#50603 | fix content negotiation bug | not ready, no ETA, non-blocker |
| kubernetes/kubernetes#56967 | update cadvisor | PR for issue #53207 2/3 approvals |
| kubernetes/kubernetes#56970 | cherry-pick for autoscaler | 1/3 approvals |
| kubernetes/kubernetes#56942 | doc updates for 1.9 | needs approval |
| kubernetes/kubernetes#56679 | fix azure storage delay | approved |

Green

issues which appear resolved but are not yet closed/removed.
PRs which are not approved for milestone
or are waiting on RC to merge

| issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56813 | CSI teardown issue | PR merged, close? |

PRs:

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56959 | fix lifecycle messaging | not approved for milestone, appears minor |
| kubernetes/kubernetes#56918 | cherry-pick to fix Azure LB issue | ready, not approved for milestone |
| kubernetes/kubernetes#56599 | Kubeadm version stamp | merge @ RC |
| kubernetes/kubernetes#56390 | text IPVS proxy | ready, not approved for milestone |
| kubernetes/kubernetes#55925 | Fix ServiceNodeExclusion version | ready, not approved for milestone |

Exception Status & Special Cases

Advanced Auditing: Complete

CNI 0.6.0: (kubernetes/kubernetes#49480): issue still lacks
an "owner" who is going to make sure that the deploy & packaging changes happen.

Kubeadm Changes (kubernetes/kubernetes#56504): according to SIG,
is almost complete with all changes in except for one open PR
(kubernetes/kubernetes#56599) which has to be finished
at release.

AWS NVME Support: Completed

Update Dashboard: Completed

Autoprobing Network-ID: Completed

@jberkus
Contributor

jberkus commented Dec 11, 2017

Status of PRs and Bugs

Currently 9 issues (including new bugs). There are 15 PRs, including several partial fixes
for outstanding issues. I've
put the unapproved PRs in "Green", because the idea is that most of them are
more likely to simply be dropped.

Remaining critical issues appear to be failure to test downgrade,
and CNI 0.6.0.

(and thanks Zachary for showing that I could be using markdown tables for this!)

Red

bugs/failures which appear user-visible and for which no approved PR exists
PRs which appear required and do not have all sign-offs or are stalled

Issues:

| issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56244 | Downgrade still failing | (debugging, no ETA) |
| kubernetes/kubernetes#56879 | Parallel downgrade failing | (newly escalated) |
| kubernetes/kubernetes#56426 | GKE downgrade failure | was resolved, now failing again, in progress |
| kubernetes/kubernetes#56416 | Network partition healing fail | had PR, fix didn't work |

PRs:

| PR | summary | status |
| --- | --- | --- |

Yellow

bugs/failures which either appear minor, or for which a PR in good shape exists
PRs which are approved but are waiting to be merged

Issues:

| issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56357 | Performance Polling Issue | ETA 1.9.1, not showstopper |
| kubernetes/kubernetes#53207 | Kubelet status update hang | newly escalated, has PR |

PRs:

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56824 | fix for Yaml quoting issue | tiny, approved |
| kubernetes/kubernetes#50603 | fix content negotiation bug | not ready, no ETA, non-blocker |
| kubernetes/kubernetes#56967 | update cadvisor | PR for issue #53207 2/3 approvals |
| kubernetes/kubernetes#56970 | cherry-pick for autoscaler | 1/3 approvals |
| kubernetes/kubernetes#56942 | doc updates for 1.9 | needs approval |
| kubernetes/kubernetes#56679 | fix azure storage delay | approved |
| kubernetes/kubernetes#57046 | heapster version bump | new, 2/3 approvals |
| kubernetes/kubernetes#57043 | kubeadm etcd downgrade fix | new, approved but needs retest |
| kubernetes/kubernetes#57023 | kubelet CRI defined in cadvisor | new, not approved for milestone, but author says it's a must-fix |

Green

issues which appear resolved but are not yet closed/removed.
PRs which are not approved for milestone
or are waiting on RC to merge

| issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#47820 | Network Partition test fail | SIG claims test removed |

PRs:

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56959 | fix lifecycle messaging | not approved for milestone, appears minor |
| kubernetes/kubernetes#56918 | cherry-pick to fix Azure LB issue | ready, not approved for milestone |
| kubernetes/kubernetes#56599 | Kubeadm version stamp | merge @ RC |
| kubernetes/kubernetes#56390 | text IPVS proxy | ready, not approved for milestone |
| kubernetes/kubernetes#55925 | Fix ServiceNodeExclusion version | ready, not approved for milestone |
| kubernetes/kubernetes#56858 | e2e test for custom metrics | ready, not approved for milestone |

Exception Status & Special Cases

Advanced Auditing: Complete

CNI 0.6.0: (kubernetes/kubernetes#49480): issue still lacks
an "owner" who is going to make sure that the deploy & packaging changes happen. There have been
some updates on this, but a critical outstanding issue is making sure that people can still install version 1.8 with the correct CNI.

Kubeadm Changes (kubernetes/kubernetes#56504): according to SIG,
is almost complete with all changes in except for one open PR
(kubernetes/kubernetes#56599) which has to be finished
at release.

AWS NVME Support: Completed

Update Dashboard: Completed

Autoprobing Network-ID: Completed

@jberkus
Contributor

jberkus commented Dec 12, 2017

PR status Update

This is the Sudden Death Overtime tracking of PRs

Red

critical PRs fixing major breakage issues with 1.9. Appropriate SIGs have been queried for status today.

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#57043 | kubeadm etcd downgrade fix | new, approved but needs retest |
| kubernetes/kubernetes#57023 | kubelet CRI defined in cadvisor | new, not approved for milestone, but author says it's a must-fix |
| kubernetes/kubernetes#56967 | update cadvisor | PR for issue #53207 2/3 approvals |

Yellow

PRs with status updates, which are non-blockers but have not been kicked out of 1.9

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56970 | cherry-pick for autoscaler | should merge automatically |
| kubernetes/kubernetes#56959 | fix lifecycle messaging | all approved but SIG owner used wrong labels, queries |
| kubernetes/kubernetes#50603 | fix content negotiation bug | approved, waiting for merge |
| kubernetes/kubernetes#56824 | fix for Yaml quoting issue | tiny, approved |
| kubernetes/kubernetes#56679 | fix azure storage delay | approved, waiting for merge |

Green

PRs which appear non-critical, have been bumped by downgrading them to priority/important-soon
or can't be bumped because they're not approved

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#55925 | Fix ServiceNodeExclusion version | downgraded, but bot not kicking it out? |
| kubernetes/kubernetes#56390 | text IPVS proxy | ready, not approved for milestone |
| kubernetes/kubernetes#56918 | azure nodeport probe | blocked by member of SIG |
| kubernetes/kubernetes#57052 | fix GCP permissions on kube-env | bumped to 1.9.1 |

@jberkus
Contributor

jberkus commented Dec 12, 2017

Ok, I should have created hard deadlines much earlier.

We're down to two PRs now:

Red

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#57023 | kubelet CRI defined in cadvisor | new, not approved for milestone, but author says it's a must-fix |

Yellow

None

Green

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56858 | Custom Metrics Test | not approved, not critical |

@jberkus
Contributor

jberkus commented Dec 12, 2017

Status of PRs and Bugs

Final burndown

Remaining critical issues appear to be failure to test downgrade,
and CNI 0.6.0.

Red

bugs/failures which appear critical and for which no approved PR exists
PRs which appear required and do not have all sign-offs or are stalled

Issues:

| issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56244 | Downgrade still failing | (debugging, no ETA) |
| kubernetes/kubernetes#56879 | Parallel downgrade failing | (newly escalated) |
| kubernetes/kubernetes#57047 | Stackdriver logging issue | severity unclear, no PR, related to downgrade fail |
| kubernetes/kubernetes#57005 | crio.sock mismatch | severity unclear, has unapproved PR |

PRs:

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#57023 | fix for 57005 | not ready, not approved |

Yellow

bugs/failures which either appear to be not showstoppers, or for which a PR in good shape exists
PRs which are approved but are waiting to be merged

Issues:

| issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56261 | Node deletion error | severity unclear, has unapproved PR |
| kubernetes/kubernetes#56522 | Windows proxy breakage | severity unclear, has PR |
| kubernetes/kubernetes#56052 | scale tests failing again | not sure if performance regression or test failure, waiting |

PRs:

| PR | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56622 | Fix for 56261 | approved, waiting to merge |
| kubernetes/kubernetes#56529 | Fix for | approved but needs milestone |

Green

non-blocker issues
PRs which are not approved for milestone
or are waiting on RC to merge

| issue | summary | status |
| --- | --- | --- |

PRs:

| PR | summary | status |
| --- | --- | --- |

Exception Status & Special Cases

Advanced Auditing: Complete

CNI 0.6.0: (kubernetes/kubernetes#49480): issue still lacks
an "owner" who is going to make sure that the deploy & packaging changes happen. There have been
some updates on this, but a critical outstanding issue is making sure that people can still install version 1.8 with the correct CNI.

Kubeadm Changes (kubernetes/kubernetes#56504): according to SIG,
is almost complete with all changes in except for one open PR
(kubernetes/kubernetes#56599) which has to be finished
at release.

AWS NVME Support: Completed

Update Dashboard: Completed

Autoprobing Network-ID: Completed

@jberkus
Contributor

jberkus commented Dec 14, 2017

Status of PRs and Bugs

Final Final Final burndown

Yay downgrade is fixed, now it's really just CNI

Giving up on usual format, because there's really just a handful of things to fix.

Stackdriver issue: has two PRs, both approved

| PR/issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#57047 | logging test failure | approved, has PR |
| kubernetes/kubernetes#57174 | cherrypick to fix issue | approved |
| kubernetes/kubernetes#57172 | PR to fix issue | approved |

CNI 0.6.0

| PR/issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#49480 | update to CNI 0.6.0 | almost complete |
| kubernetes/release#486 | fix rpm build for release | |

Scalability test flakiness

| PR/issue | summary | status |
| --- | --- | --- |
| kubernetes/kubernetes#56052 | scalability test failures | SIG treating this as test flake rather than real issue |

@enisoc enisoc closed this as completed Dec 15, 2017
@justaugustus justaugustus added the sig/release Categorizes an issue or PR as relevant to SIG Release. label Dec 9, 2019