improve tunnel availability #375

Merged
merged 1 commit into from
Jul 21, 2021
14 changes: 14 additions & 0 deletions cmd/yurt-tunnel-agent/app/start.go
@@ -18,6 +18,7 @@ package app

import (
"fmt"
+ "time"

"github.com/openyurtio/openyurt/cmd/yurt-tunnel-agent/app/config"
"github.com/openyurtio/openyurt/cmd/yurt-tunnel-agent/app/options"
@@ -30,6 +31,8 @@ import (
"github.com/openyurtio/openyurt/pkg/yurttunnel/util"

"github.com/spf13/cobra"
+
+ "k8s.io/apimachinery/pkg/util/wait"
"k8s.io/client-go/util/certificate"
"k8s.io/klog/v2"
)
@@ -91,6 +94,17 @@ func Run(cfg *config.CompletedConfig, stopCh <-chan struct{}) error {
}
agentCertMgr.Start()

+ // 2.1. waiting for the certificate is generated
+ _ = wait.PollUntil(5*time.Second, func() (bool, error) {
+ if agentCertMgr.Current() != nil {
+ return true, nil
+ }
+ klog.Infof("certificate %s not signed, waiting...",
+ projectinfo.GetAgentName())
+ return false, nil
+ }, stopCh)
+ klog.Infof("certificate %s ok", projectinfo.GetAgentName())

// 3. generate a TLS configuration for securing the connection to server
tlsCfg, err := pki.GenTLSConfigUseCertMgrAndCA(agentCertMgr,
tunnelServerAddr, constants.YurttunnelCAFile)
2 changes: 1 addition & 1 deletion cmd/yurt-tunnel-server/app/options/options.go
@@ -150,7 +150,7 @@ func (o *ServerOptions) Config() (*config.Config, error) {
if err != nil {
return nil, err
}
- cfg.SharedInformerFactory = informers.NewSharedInformerFactory(cfg.Client, 10*time.Second)
+ cfg.SharedInformerFactory = informers.NewSharedInformerFactory(cfg.Client, 24*time.Hour)
Member

Why do you change the resync time?

Contributor Author

@aholic aholic Jul 5, 2021

In my situation, I saw thousands of tunnel certificate signing requests (CSRs) in pending/approved status. All of them were enqueued every 10 seconds, which filled up the work queue. As a result, the pending CSRs could not be approved (after pending for 24 hours, they were deleted automatically). So I did two things:

  1. Make the resync period longer. I don't think the resync period needs to be so short; resync is just a way to recover from unexpected situations.
  2. Filter before enqueueing: only enqueue pending tunnel CSRs.

Member

Understood. Alternatively, we may set it to 0 to disable resync.

Contributor Author

Yes, 0 is also OK. Let's just make it a relatively long period, in case of some unexpected situation.

Member

@Fei-Guo Setting a relatively long resync period (like 24 * time.Hour) is more reasonable than disabling resync, in case of some unexpected situation.
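
The "filter before enqueue" step discussed in this thread can be sketched without any Kubernetes dependencies. This is only an illustration under simplifying assumptions: `csr`, `queue`, and `enqueueIfPending` are hypothetical stand-ins for the real CSR object, the rate-limited work queue, and the PR's `enqueueObj`, and the boolean fields replace the real status-condition checks.

```go
package main

import "fmt"

// csr is a hypothetical, simplified stand-in for a CertificateSigningRequest.
type csr struct {
	name     string
	isTunnel bool // true if the request belongs to the tunnel component
	approved bool
	denied   bool
}

// queue is a hypothetical stand-in for the rate-limited work queue.
type queue struct{ keys []string }

func (q *queue) add(key string) { q.keys = append(q.keys, key) }

// enqueueIfPending adds the CSR to the queue only when it is a tunnel CSR
// that has been neither approved nor denied. It reports whether the key was
// enqueued.
func enqueueIfPending(q *queue, c csr) bool {
	if !c.isTunnel {
		return false // not ours, never enqueue
	}
	if c.approved || c.denied {
		return false // already decided, nothing left to do
	}
	q.add(c.name)
	return true
}

func main() {
	q := &queue{}
	items := []csr{
		{name: "tunnel-pending", isTunnel: true},
		{name: "tunnel-approved", isTunnel: true, approved: true},
		{name: "tunnel-denied", isTunnel: true, denied: true},
		{name: "other-pending"},
	}
	for _, c := range items {
		enqueueIfPending(q, c)
	}
	fmt.Println(q.keys) // prints [tunnel-pending]
}
```

With this filter, each resync adds only undecided tunnel CSRs to the queue, so the already approved or denied requests described above no longer compete for queue capacity. Returning immediately after the enqueue also keeps any "ignored" log message limited to the items that were actually skipped.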


klog.Infof("yurttunnel server config: %#+v", cfg)
return cfg, nil
25 changes: 22 additions & 3 deletions pkg/yurttunnel/pki/certmanager/csrapprover.go
@@ -108,7 +108,25 @@ func enqueueObj(wq workqueue.RateLimitingInterface, obj interface{}) {
runtime.HandleError(err)
return
}
- wq.AddRateLimited(key)
+
+ csr, ok := obj.(*certificates.CertificateSigningRequest)
+ if !ok {
+ klog.Errorf("%s is not a csr", key)
+ return
+ }
+
+ if !isYurttunelCSR(csr) {
+ klog.Infof("csr(%s) is not %s csr", csr.GetName(), projectinfo.GetTunnelName())
+ return
+ }
+
+ approved, denied := checkCertApprovalCondition(&csr.Status)
+ if !approved && !denied {
+ klog.Infof("non-approved and non-denied csr, enqueue: %s", key)
+ wq.AddRateLimited(key)
+ }
+
+ klog.V(4).Infof("approved or denied csr, ignore it: %s", key)
}

// NewCSRApprover creates a new YurttunnelCSRApprover
@@ -139,6 +157,7 @@ func approveYurttunnelCSR(
csrClient typev1beta1.CertificateSigningRequestInterface) error {
csr, ok := obj.(*certificates.CertificateSigningRequest)
if !ok {
+ klog.Infof("object is not csr: %v", obj)
return nil
}

@@ -149,12 +168,12 @@ func approveYurttunnelCSR(

approved, denied := checkCertApprovalCondition(&csr.Status)
if approved {
- klog.V(4).Infof("csr(%s) is approved", csr.GetName())
+ klog.Infof("csr(%s) is approved", csr.GetName())
return nil
}

if denied {
- klog.V(4).Infof("csr(%s) is denied", csr.GetName())
+ klog.Infof("csr(%s) is denied", csr.GetName())
return nil
}
