-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating to use IndexInformer #92
Conversation
- Removed reliance on the store for maintain state and migrated to the IndexInformer. This now will only cache "active" pods (no failed/completed pods will be returned) - Extracted role based fallback logic to it's own processor and removed from the Server/Event handlers - Migrated event handlers to the k8s package - Moved the `cmd` package back to the base to allow for easy builds and acknowlding the fact that the project itself is the unit of deployment and it's unlikely this would be used as a library in and of itself - Added in processor tests for the new RoleProcessor
Thanks @jrnt30, can you please revert this:
I will start reviewing when this is done. It's the "standard" directory structure for go projects and I would like to keep it as-is. Moreover it creates unnecessary noise in the PR making it harder to review. |
Conflicts: namespace.go pod.go server/server.go store/store.go
@jtblin Sure thing, that is done and conflicts merged with master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. Just a few nitpicks for now, I will be travelling so will continue the review during the weekend. Also it seems there are some tests failing in the build.
k8s/k8s.go
Outdated
@@ -3,6 +3,8 @@ package k8s | |||
import ( | |||
"time" | |||
|
|||
"fmt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put before "time" dep.
server/server.go
Outdated
"github.com/jtblin/kube2iam/iam" | ||
"github.com/jtblin/kube2iam/k8s" | ||
"github.com/jtblin/kube2iam/store" | ||
"github.com/jtblin/kube2iam/processor" | ||
"k8s.io/client-go/tools/cache" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the k8s.io dep to the other external deps i.e. after log "github.com/sirupsen/logrus"
.
processor/processor.go
Outdated
package processor | ||
|
||
import ( | ||
"fmt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep deps orderered:
- Std lib e.g. fmt, encoding/json
- external deps
- kube2iam deps
Thanks for the feedback. The imports structure makes a lot of sense, those are cleaned up. The tests themselves were an oversight when I moved main back into the That's been fixed as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience. Few more changes requested, it's a huge PR so it's expected, almost there. Good work overall! 👍
cover.sh
Outdated
@@ -11,13 +11,12 @@ echo "mode: count" > coverage.out | |||
# Initialize error tracking | |||
ERROR="" | |||
|
|||
declare -a packages=('' \ | |||
'cmd' \ | |||
declare -a packages=('cmd' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that precents root files to be tested. Please follow the original usage and add "processor" after "k8s".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there are no *.go
files now, Coveralls will complain if the ''
is left in. Moved the other to be sorted.
Makefile
Outdated
@@ -26,10 +26,10 @@ setup: | |||
gometalinter --install --update | |||
glide install --strip-vendor | |||
|
|||
build: *.go fmt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why changing that ooc? Was that creating any issue for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we removed the root *.go
files it didn't trigger properly.
k8s/k8s.go
Outdated
@@ -13,12 +14,18 @@ import ( | |||
|
|||
const ( | |||
// Resync period for the kube controller loop. | |||
resyncPeriod = 30 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please order alphabetically.
k8s/k8s.go
Outdated
) | ||
|
||
// Client represents a kubernetes client. | ||
type Client struct { | ||
*kubernetes.Clientset | ||
podIndexer cache.Indexer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please order alphabetically. It makes finding things easier.
namespaceController *cache.Controller
namespaceIndexer cache.Indexer
podController *cache.Controller
podIndexer cache.Indexer
k8s/k8s.go
Outdated
@@ -75,5 +137,7 @@ func NewClient(host, token string, insecure bool) (*Client, error) { | |||
if err != nil { | |||
return nil, err | |||
} | |||
return &Client{client}, nil | |||
k8s := &Client{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be simplify to: return &Client{Clientset: client}, nil
processor/processor.go
Outdated
ar := GetNamespaceRoleAnnotation(ns, r.namespaceKey) | ||
for _, role := range ar { | ||
if r.iam.RoleARN(role) == roleArn { | ||
log.Debugf("Role:%s on namespace:%s found.", roleArn, namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: space after :
processor/processor.go
Outdated
return true | ||
} | ||
} | ||
log.Warnf("Role: [%s] on namespace: [%s] not found.", roleArn, namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the [] are unnecessary and better use fields for structured logging. See *Fields convenience methods in other places.
processor/processor_test.go
Outdated
@@ -0,0 +1,241 @@ | |||
package processor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding all these tests! 👍
There's no test for GetRoleMapping
, maybe not strictly necessary though as you're testing the other methods but if not too painful could be nice.
server/server.go
Outdated
} | ||
|
||
if !synced { | ||
log.Fatalf("Attempted to wait for caches to be synced for [%d] however it is not done. Giving up.", defaultCacheSyncAttempts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit ... for %d attempts, however ...
server/server.go
Outdated
|
||
synced := false | ||
for i := 0; i < defaultCacheSyncAttempts && !synced; i++ { | ||
synced = cache.WaitForCacheSync(nil, podSynched, namespaceSynched) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, that should prevent the race condition for default role? 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was more around ensuring that both the pods and namespaces were indexed to ensure that you couldn't "sneak in" a role. This only happens @ startup time, so there is the possibility that you could get the default role immediately if the new pod hasn't been indexed.
What this PR aims to solve is a permanent caching of those incorrect roles.
We've tried running of this branch in our dev environments, but still get a lot of "default role" under load (our build-engineering team spinning up 100s of concurrent jobs) |
Thanks for the info @dadux, that's annoying. Looking at the code, it only falls back to the default role if the pod is found by its ip, there is no annotation, and there is a default role set. So if it's getting the default role when it shouldn't, it can only be because the cache has not been updated yet. The solutions I can think of are:
Anyhow let's not try to tackle that in this PR for now. We can look into it after. |
@jtblin Thanks for the guidance and constructive feedback, it's very helpful. I have addressed a majority of these issues. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I got distracted. I think it's the last nit apart from package naming. I'd suggest just going with "mappings" for now or something like that.
I was also waiting for @dadux to confirm that everything was looking good with the binary built from this branch.
Makefile
Outdated
@@ -29,7 +29,7 @@ setup: | |||
build: *.go fmt | |||
go build -o build/bin/$(ARCH)/$(BINARY_NAME) $(GOBUILD_VERSION_ARGS) github.com/jtblin/$(BINARY_NAME)/cmd | |||
|
|||
build-race: *.go fmt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can change this one back as well now.
processor/processor.go
Outdated
namespaceKey string | ||
namespaceRestriction bool | ||
iam *iam.Client | ||
kubeStore store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I meant changing the variable name from kubeStore
to store
in the struct as well pls.
I also know that @placydo was looking at it a bit. Are your experiencing ok so far? |
Changes Unknown when pulling c8e937a on jrnt30:index-refactor into ** on jtblin:master**. |
Changes Unknown when pulling 775c7cf on jrnt30:index-refactor into ** on jtblin:master**. |
I've pushed docker image |
Hey. Currently I am in middle of something. I will try to pull this image today and give it a try + test it over the weekend. Will come back with feedback asap. Thanks guyz! |
@struz did you guys see any issue running this on our cluster? |
return false | ||
} | ||
|
||
// DumpDebugInfo outputs all the roles by IP addresr. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/addresr/address/
Thanks a ton again @jrnt30 for the great work. This is now merged and released as |
NOTE: This is the start of a rebase/rework for #32. There ended up being some significant changes that made it difficult to integrate the changes easily and I ended up moving around a bit of the logic from store and server in an attempt to make it a bit more reusable and testable.
Very open to feedback but wanted to get some eyes on this from @jtblin and @mikekap if they have time to ensure I'm heading down the right path.
I have done some initial testing in our staging cluster but need to spend some more time next week going through some of the edge cases on in
server/
and test some cross-account roles testedmigrated to the IndexInformer. This now will only cache
"active" pods (no failed/completed pods will be returned)
and removed from the Server/Event handlers