Enable changing times on volume attach/detach reconciling sync to fixing impact to AWS #39551

Merged

Conversation

chrislovecnm
Contributor

@chrislovecnm chrislovecnm commented Jan 6, 2017

What this PR does / why we need it:

We are currently blocked by API timeouts with PV volumes. See #39526. This is a workaround, not a fix.

Special notes for your reviewer:

A second PR will be dropped with CLI cobra options in it, but we are starting with increasing the reconciliation periods. I am dropping this without major testing and will test on our AWS account. Will be marked WIP until I run smoke tests.

Release note:

Provide kubernetes-controller-manager flags to control volume attach/detach reconciler sync.  The duration of the syncs can be controlled, and the syncs can be shut off as well. 

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 6, 2017
@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jan 6, 2017
@patrickmcclory patrickmcclory left a comment

this doesn't actually fix the underlying problem but will unblock users who are seeing AWS API call flooding due to volume checks.

Member

@saad-ali saad-ali left a comment

Couple of early comments

// successive executions. This has been increased from every 100 ms to
// 1 minute since the timing has created an enormous amount of API traffic on
// such clouds as AWS.
reconcilerLoopPeriod time.Duration = 1 * time.Minute
Member

This doesn't increase or decrease the rate of API calls, and it would break a lot of things (by reducing the rate at which the attach/detach controller can react), so let's leave this alone.

Contributor Author

will do

reconcilerSyncDuration time.Duration = 5 * time.Second
// This has been increased from every 5 seconds to every 5 minutes since
// the timing has created an enormous amount of API traffic on such clouds as AWS.
reconcilerSyncDuration time.Duration = 5 * time.Minute
Member

Let's make this 1 minute instead of 5 (which will still reduce the rate 12x). There is a tradeoff: without this check we could potentially reintroduce the AWS bug of the wrong volume being attached.

Also maybe make this value configurable/enabled/disable in the same PR?

Contributor Author

@saad-ali that is the second PR that I wanted to drop, since I am not super familiar with wiring in the cobra options.

@chrislovecnm
Contributor Author

Will let this build run, and then wire in the cobra options. @saad-ali has asked for both in the same PR.

@chrislovecnm
Contributor Author

chrislovecnm commented Jan 7, 2017

A couple of unit tests are failing on osx. I am going to run some build tests tonight. Also need to update openapi.

@k8s-github-robot k8s-github-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 7, 2017
Member

@saad-ali saad-ali left a comment

Some preliminary comments.

@@ -80,6 +80,8 @@ func (s *CloudControllerManagerServer) AddFlags(fs *pflag.FlagSet) {
fs.Float32Var(&s.KubeAPIQPS, "kube-api-qps", s.KubeAPIQPS, "QPS to use while talking with kubernetes apiserver")
fs.Int32Var(&s.KubeAPIBurst, "kube-api-burst", s.KubeAPIBurst, "Burst to use while talking with kubernetes apiserver")
fs.DurationVar(&s.ControllerStartInterval.Duration, "controller-start-interval", s.ControllerStartInterval.Duration, "Interval between starting controller managers.")
fs.BoolVar(&s.DisableReconciliation, "disable-reconcile", false, "Disable Volume Reconcilation")
fs.DurationVar(&s.ReconcilerLoopPeriod, "reconcile-loop-period", 5 * time.Minute,"The wait time between volume reconciliation")
Member

Default 1 minute please :)

Contributor Author

That is really short. What is the side effect of setting it to 5?

Member

@gnufied gnufied Jan 7, 2017

I had another idea, but it may not work if ReconcilerLoopPeriod is declared as a Duration. We can keep one variable as an int32 or similar: if it is -1, the sync is disabled; otherwise the value is the sync interval in seconds.

Basically, the fewer knobs and switches we have, the better.

Contributor Author

Understand that, but I always look for the big red button. We are shutting down important code, and accidentally shutting it down would not be nice.

@@ -80,6 +80,8 @@ func (s *CloudControllerManagerServer) AddFlags(fs *pflag.FlagSet) {
fs.Float32Var(&s.KubeAPIQPS, "kube-api-qps", s.KubeAPIQPS, "QPS to use while talking with kubernetes apiserver")
fs.Int32Var(&s.KubeAPIBurst, "kube-api-burst", s.KubeAPIBurst, "Burst to use while talking with kubernetes apiserver")
fs.DurationVar(&s.ControllerStartInterval.Duration, "controller-start-interval", s.ControllerStartInterval.Duration, "Interval between starting controller managers.")
fs.BoolVar(&s.DisableReconciliation, "disable-reconcile", false, "Disable Volume Reconcilation")
Member

How about DisableAttachDetachReconcilerSync? So it is clear this flag is for the attach detach controller, and this is the sync subcomponent of the reconciler (not the whole reconciler). And ReconcilerSyncLoopPeriod?

Contributor Author

Done

if time.Since(rc.timeOfLastSync) > rc.syncDuration {

if rc.disableReconciliation {
glog.V(5).Info("Not reconciling volumes as reconciliation is disabled via the command line flag")
Member

How about: Skipping "attached volumes still attached" check since it is disabled via the command line.

Contributor Author

Let me know how my change strikes you :)

if rc.disableReconciliation {
glog.V(5).Info("Not reconciling volumes as reconciliation is disabled via the command line flag")
} else if time.Since(rc.timeOfLastSync) > rc.syncDuration {
glog.V(5).Info("Reconciling volumes")
Member

How about: Starting "attached volumes still attached" check.

Contributor Author

Let me know how my change strikes you :)

@chrislovecnm
Contributor Author

And I broke the build. Will fix on linux ... SGTM

@chrislovecnm
Contributor Author

Review please @kris-nova

// are still attached to the node and update the status if they are not.
if time.Since(rc.timeOfLastSync) > rc.syncDuration {

if rc.disableReconciliation {
Contributor

@jingxu97 jingxu97 Jan 7, 2017

We are not disabling reconciliation. The DisableAttachDetachReconcilerSync flag Saad suggested only disables rc.sync() below.

So rc.reconcile() should not be changed at all. The only change should be as below:
if !rc.DisableAttachDetachReconcilerSync {
rc.sync()
}

Contributor

I guess this might be the reason for the test failures.

Contributor Author

Fixed

@chrislovecnm
Contributor Author

chrislovecnm commented Jan 7, 2017

@jingxu97 / @saad-ali I noticed that ebf796a#diff-2b533d4422a255f4bd8521394e285679 shows that we have openapi auto-generated. What is the magic command for that??

@chrislovecnm chrislovecnm force-pushed the reconciler-time-increases branch from a5ea375 to 3aab5d2 Compare January 7, 2017 05:15
// This flag enables or disables reconcile. Is false by default, and thus enabled.
DisableReconciliation bool
// ReconcilerSyncDuration is the amount of time the reconciler sync states loop
// wait between successive executions. Is set to 5 min by default.
Member

These names should make it clearer these options are related to volume attach/detach reconciliation

Contributor Author

@saad-ali made some recommendations

// Reconciler runs a periodic loop to reconcile the desired state of the world with
// the actual state of the world by triggering attach detach operations.
// This flag enables or disables reconcile. Is false by default, and thus enabled.
DisableReconciliation bool
Member

What are the side effects of disabling reconciliation? Is reconciliation the mechanism that updates mounted config file and secret volumes when they change in the API?

Contributor Author

@saad-ali care to comment?


// reconcilerSyncDuration is the amount of time the reconciler sync states loop
// wait between successive executions
reconcilerSyncDuration time.Duration = 5 * time.Second
Member

Is this changing from 5 seconds to 5 minutes?

Contributor Author

Yes, @saad-ali and I are discussing it. @saad-ali is recommending 1 min, and my question is what the implication of 5 min is. Read the referenced issue for more information on why we are backing this off.

Member

If reconciliation has varying cost for different volume types, is a single tuning interval across all types what we want?

Contributor

Making reconcilerSyncDuration very long might cause the problem discussed in issue #33760 when state is out of sync. That issue explains the main reason we added the sync loop here. Since this PR makes it configurable, I think 1 min is good enough for most cases. This is a tradeoff between overhead and reducing the window in which state is out of sync.

Contributor Author

@chrislovecnm chrislovecnm Jan 7, 2017

What would you recommend @liggitt? We are planning to drop this in as an emergency release on Tuesday.

As I mentioned in #39526

TL;DR:

Anyone who is using 1.4.6 or 1.4.7 in AWS will exceed their rate limits with as few as 20 PVs attached to a cluster. We have one account that is at about 24k API calls per hour because of timeouts. This makes PVs unusable on AWS.

Contributor

I agree with Jordan, as a concerned lead I have no idea what the impacts or risks to this change are, and all I'm seeing is "might", "could", and "flake" on both sides of the fix / existing state.

Anything that needs to get cherry-picked that is important had better have a clear definition so people can understand them.

Please add that - removing the label until someone can summarize.

Contributor

Specifically, a comment was made:

this increases the chance of mounting the wrong volume to the pod

which is incredibly terrifying.

Member

@gnufied gnufied Jan 9, 2017

The wrong volume will only be mounted if:

  1. The volume is detached from the pod outside of the k8s system (for example, via the AWS console).
  2. The node is rebooted.

Scenario 1 is not a huge concern because we usually don't recommend that users detach volumes attached to nodes/pods.

Scenario 2 is a bit of a problem, and I am damned for saying this, but as long as the time it takes for a reboot to complete is longer than the time specified here, we would perhaps be okay. But even the original fix isn't a complete fix for #33760, as commented by @jingxu97.

Contributor

Was that an "OR" or an "AND"? If it's an AND, that resolves my concern (because I agree 1 is PIBKAC). If it's an OR, then that's a really important thing to document, and very scary, and should be part of the flag documentation.

Contributor

It is an "OR". @gnufied lists two possible scenarios in which the problem might happen.

@chrislovecnm
Contributor Author

@k8s-bot unit test this

@chrislovecnm chrislovecnm force-pushed the reconciler-time-increases branch from 3aab5d2 to 620477e Compare January 7, 2017 06:11
@chrislovecnm
Contributor Author

@saad-ali / @jingxu97 I am leaving this in WIP, until I run some base smoke test tomorrow. Will update the issue when I would like a final review.

@jingxu97
Contributor

jingxu97 commented Jan 7, 2017 via email

@@ -80,6 +80,8 @@ func (s *CloudControllerManagerServer) AddFlags(fs *pflag.FlagSet) {
fs.Float32Var(&s.KubeAPIQPS, "kube-api-qps", s.KubeAPIQPS, "QPS to use while talking with kubernetes apiserver")
fs.Int32Var(&s.KubeAPIBurst, "kube-api-burst", s.KubeAPIBurst, "Burst to use while talking with kubernetes apiserver")
fs.DurationVar(&s.ControllerStartInterval.Duration, "controller-start-interval", s.ControllerStartInterval.Duration, "Interval between starting controller managers.")
fs.BoolVar(&s.DisableAttachDetachReconcilerSync, "disable-attach-detach-reconcile", false, "Disable volume attach detach reconcilation")
fs.DurationVar(&s.ReconcilerSyncLoopPeriod, "reconcile-sync-loop-period", 5*time.Minute, "The wait time between volume attach detach reconciliation")
Contributor Author

@saad-ali / @jingxu97 et al., I am noticing that other durations are declared like ControllerStartInterval metav1.Duration. Does ReconcilerSyncLoopPeriod need to be declared this way as well? If so, how the heck do I set the default value?

Ah API machinery code .... Feel like such a n00b ... love it!

Contributor

I think you could set it up the way line 47 does:
NodeMonitorPeriod: metav1.Duration{Duration: 5 * time.Second},

Member

Yes, follow the pattern of the existing vars. Pass the current value in as the default for the flag. Define the default you want in NewCloudControllerManagerServer.

@chrislovecnm
Contributor Author

Review status: 0 of 8 files reviewed at latest revision, 10 unresolved discussions.


pkg/controller/volume/attachdetach/attach_detach_controller.go, line 68 at r1 (raw file):

Previously, chrislovecnm (Chris Love) wrote…

@saad-ali that is the second PR that I wanted to drop, since I am not super familiar with wiring in the cobra options.

Updated PR with cobra comments.



@chrislovecnm
Contributor Author

Thanks!



@chrislovecnm
Contributor Author


pkg/apis/componentconfig/types.go, line 779 at r3 (raw file):

Previously, chrislovecnm (Chris Love) wrote…

@saad-ali care to comment?

Jing responded to this.



@chrislovecnm
Contributor Author

@saad-ali I have a couple of other tweaks I will push in after this. Fixing times on two tests, and removing dead code in reconciler. I am letting e2e run now, but will be in shortly.

@saad-ali
Member

saad-ali commented Jan 9, 2017

@saad-ali I have a couple of other tweaks I will push in after this. Fixing times on two tests, and removing dead code in reconciler. I am letting e2e run now, but will be in shortly.

Ack.

and the option to shut off reconciliation.
@chrislovecnm chrislovecnm force-pushed the reconciler-time-increases branch from df5eb50 to a973c38 Compare January 9, 2017 23:48
@chrislovecnm
Contributor Author

@saad-ali last e2e was successful, now waiting on this one to run. The release notes are up to date now, and the changes that were requested are in.

@saad-ali
Member

saad-ali commented Jan 9, 2017

Cool, I'll take another look

@k8s-ci-robot
Contributor

Jenkins unit/integration failed for commit a973c38. Full PR test history.

The magic incantation to run this job again is @k8s-bot unit test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@@ -408,6 +408,10 @@ func StartControllers(controllers map[string]InitFunc, s *options.CMServer, root
go volumeController.Run(stop)
time.Sleep(wait.Jitter(s.ControllerStartInterval.Duration, ControllerStartJitter))

if s.ReconcilerSyncLoopPeriod.Duration < time.Second {
return fmt.Errorf("Duration time must be greater than one second as set via command line option reconcile-sync-loop-period. One minute is recommended.")
Member

Remove the recommendation part.

Contributor Author

Done

@@ -181,6 +182,8 @@ func (s *CMServer) AddFlags(fs *pflag.FlagSet) {
fs.Float32Var(&s.SecondaryNodeEvictionRate, "secondary-node-eviction-rate", 0.01, "Number of nodes per second on which pods are deleted in case of node failure when a zone is unhealthy (see --unhealthy-zone-threshold for definition of healthy/unhealthy). Zone refers to entire cluster in non-multizone clusters. This value is implicitly overridden to 0 if the cluster size is smaller than --large-cluster-size-threshold.")
fs.Int32Var(&s.LargeClusterSizeThreshold, "large-cluster-size-threshold", 50, "Number of nodes from which NodeController treats the cluster as large for the eviction logic purposes. --secondary-node-eviction-rate is implicitly overridden to 0 for clusters this size or smaller.")
fs.Float32Var(&s.UnhealthyZoneThreshold, "unhealthy-zone-threshold", 0.55, "Fraction of Nodes in a zone which needs to be not Ready (minimum 3) for zone to be treated as unhealthy. ")
fs.BoolVar(&s.DisableAttachDetachReconcilerSync, "disable-attach-detach-reconcile", false, "Disable volume attach detach reconciler sync. Disabling this may cause volumes to be mismatched with pods. Use wisely.")
Member

disable-attach-detach-reconcile -> disable-attach-detach-reconcile-sync

Contributor Author

Done

@@ -181,6 +182,8 @@ func (s *CMServer) AddFlags(fs *pflag.FlagSet) {
fs.Float32Var(&s.SecondaryNodeEvictionRate, "secondary-node-eviction-rate", 0.01, "Number of nodes per second on which pods are deleted in case of node failure when a zone is unhealthy (see --unhealthy-zone-threshold for definition of healthy/unhealthy). Zone refers to entire cluster in non-multizone clusters. This value is implicitly overridden to 0 if the cluster size is smaller than --large-cluster-size-threshold.")
fs.Int32Var(&s.LargeClusterSizeThreshold, "large-cluster-size-threshold", 50, "Number of nodes from which NodeController treats the cluster as large for the eviction logic purposes. --secondary-node-eviction-rate is implicitly overridden to 0 for clusters this size or smaller.")
fs.Float32Var(&s.UnhealthyZoneThreshold, "unhealthy-zone-threshold", 0.55, "Fraction of Nodes in a zone which needs to be not Ready (minimum 3) for zone to be treated as unhealthy. ")
fs.BoolVar(&s.DisableAttachDetachReconcilerSync, "disable-attach-detach-reconcile", false, "Disable volume attach detach reconciler sync. Disabling this may cause volumes to be mismatched with pods. Use wisely.")
fs.DurationVar(&s.ReconcilerSyncLoopPeriod.Duration, "attach-detach-reconcile-period", s.ReconcilerSyncLoopPeriod.Duration, "The reconciler sync wait time between volume attach detach. This duration must be larger than one second, and increasing this value from the default my allow for volume mismatches.")
Member

attach-detach-reconcile-period -> attach-detach-reconcile-sync-period

Member

default my allow for volume -> default my increase the likelihood of volume?

Contributor Author

typo fixing

@@ -635,3 +635,5 @@ garbage-collector-enabled
viper-config
log-lines-total
run-duration
disable-attach-detach-reconcile
attach-detach-reconcile-period
Member

Remember to update these after rename above.

Contributor Author

Done

// This flag enables or disables reconcile. Is false by default, and thus enabled.
DisableAttachDetachReconcilerSync bool
// ReconcilerSyncLoopPeriod is the amount of time the reconciler sync states loop
// wait between successive executions. Is set to 5 min by default.
Member

5 min -> 5 sec

Contributor Author

Done

@@ -57,6 +57,7 @@ func NewReconciler(
loopPeriod time.Duration,
maxWaitForUnmountDuration time.Duration,
syncDuration time.Duration,
disableReconciliation bool,
Member

disableReconciliation->disableReconcilerSync?

Member

Ditto for attach_detach_controller.go

Contributor Author

Done, and renamed a couple stragglers.

@chrislovecnm
Contributor Author

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/39551/pull-kubernetes-unit/12293/ <- same open API stuff that was failing over the weekend. No idea what is up

@k8s-ci-robot
Contributor

Jenkins Bazel Build failed for commit ac49139. Full PR test history.

The magic incantation to run this job again is @k8s-bot bazel test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@saad-ali
Member

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/39551/pull-kubernetes-unit/12293/ <- same open API stuff that was failing over the weekend. No idea what is up

The failing UT issue looks like #39604, which is marked as a flake, so it should pass if we run it often enough.

@saad-ali
Member

@k8s-bot bazel test this

@saad-ali
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 10, 2017
@saad-ali
Member

Marking P0 to get merged ASAP for tomorrow's release.

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 39628, 39551, 38746, 38352, 39607)

@k8s-github-robot k8s-github-robot merged commit 7c3fff1 into kubernetes:master Jan 10, 2017
@saad-ali
Member

Chris, this is merged. Can you please prepare the cherry pick to 1.5? Once that is complete, we can do the 1.4 cherry pick.

@saad-ali saad-ali added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Jan 10, 2017
jessfraz added a commit that referenced this pull request Jan 11, 2017
…51-upstream-release-1.4

Automated cherry pick of #39551 upstream release 1.4
k8s-github-robot pushed a commit that referenced this pull request Jan 11, 2017
…51-upstream-release-1.5

Automatic merge from submit-queue

Automated cherry pick of #39551 upstream release 1.5

Automated cherry pick of #39551 ("Increasing times on reconciling volumes fixing impact to AWS") to upstream release 1.5
@saad-ali saad-ali changed the title from "Increasing times on reconciling volumes fixing impact to AWS." to "Enable changing times on volume attach/detach reconciling sync to fixing impact to AWS" Jan 12, 2017
Labels
cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.