This repository has been archived by the owner on Nov 7, 2019. It is now read-only.

aws-service-operator vs aws-service-broker #137

Open
prashantchitta opened this issue Nov 7, 2018 · 20 comments

Comments

@prashantchitta

prashantchitta commented Nov 7, 2018

What is the difference between aws-service-operator and aws-service-broker? I can't find anything on the internet that explains the difference between the two. Both are intended for creating AWS services from Kubernetes in a unified way.

I have to create S3 and DynamoDB resources in our K8s cluster running on AWS. Both options are available, and I want to know which is the right choice for us.

I explored aws-service-operator and I see a few things are missing for our use case, such as binding a policy to an S3 bucket, and also the issue we discussed previously in #126. So I want to see whether aws-service-broker solves this.

It would be great if you could provide some insights.

@nukepuppy

one is made by a random person or group of people.. the other (this one) is made by an aws developer..

@DavidTPate

That is incorrect @nukepuppy.

@prashantchitta from an outsider's view there are a few differences between them.

The AWS Service Broker is used to create resources on AWS which can then be used by workloads running in Kubernetes. For example, let's say I have an application which needs an S3 bucket and PostgreSQL (e.g. WordPress). I could use the Service Broker to create and manage these resources and set up my containers to connect to them. The final part of this is that Service Brokers offer more of a canned set of functionality, so I as a corporation might say "Our standard database is PostgreSQL configured in X, Y, Z methods. On GCP we launch it on Compute and on AWS we use RDS." and provide that functionality.

On the other hand, the Service Operator in this repository is usually more of a freeform type of resource which typically handles the SysOps function. I might, for example, set up an Operator which keeps a Route53 record up to date pointing to my current Ingress, and through this also handle the switchover in a blue-green type of deployment.

That said, there is definitely some overlap in the general sense of "I can request a resource and I get one" but there's some nuance in the use cases.

@pierreozoux

pierreozoux commented Dec 12, 2018

Ok, this will be an interesting discussion!

My Problem

I run a multitenant OpenShift cluster on AWS, and I need to let my users provision and operate an RDS instance from OpenShift.
(Nobody will get access to the AWS web UI or API.)

Another thing is that we are a bit special in our company, and we might have a different opinion of what an RDS instance is.

Solution space

From a lifecycle management perspective, the service broker looks perfect:

From a day-two operations perspective, it looks like I'll need an operator:

  • create an on demand snapshot of my instance
  • list only my snapshots (and not the snapshots of other users)

And for observability purposes, ideally, I'll need a Prometheus exporter for MySQL and also CloudWatch.

This is all very new I guess, but I'm really interested to hear what you have in mind!

related to:

@pierreozoux

I put more thought into it, and I wanted to publicly share my vision on this topic.

Just as a reminder, CloudFormation is the state of the art for declarative infrastructure. That's why both projects use it, and I also think it is the way to go.

Requirements for an RDS as a service:

Day 1:

As a developer, I need to:

  • provision an RDS instance
  • delete this instance
  • connect to it

Day 2:

As a developer, I need to:

  • update the instance type
  • make an on demand snapshot before a deployment
  • list my snapshots
  • create an RDS instance based on a snapshot
  • observe RDS health

Day 3:

As a ClusterOperator, I need to

  • isolate from a network standpoint various Namespaces accessing various RDS instances

Service operator + Service Broker

I think that the service operator and the service catalog are complementary.
The limitation comes from the service catalog being an opinionated API that covers only the lifecycle of a service, not side operations.
To be more precise, in our RDS example the service broker can:

  • CRUD an RDS instance
  • create and bind credentials

But it can't and will never be able to do anything specific to the service itself. For instance, if I want to list snapshots of an RDS instance, I'll never be able to do it from the service broker.

That's how I see the complementarity of both projects. For instance, and to relate to RDS, we'll probably need 3 operators:

  • a snapshot operator that is able to CRUD RDS snapshots.
  • an observability operator that would set up an rds_exporter to be able to observe RDS (MySQL + CloudWatch) and send metrics to Prometheus
  • a security operator that would configure istio to make sure of the network isolation

To ease the user experience, I would create a wrapper operator that deploys the necessary objects in Kubernetes from a single YAML file.

I'm curious to hear what you'd think about that approach.

@schlomo
Contributor

schlomo commented Jan 7, 2019

@pierreozoux IMHO there is much more to Day 1. Everything related to "it just works" from a developer perspective translates into more challenging automation from a service provider perspective:

  • if the instance fails, automatically launch from the last good backup (and therefore take care of backups)
  • if the AWS account fails (is hacked/deleted/whatever), recover my RDS with last good backup (together with my code running on K8S/OpenShift)
  • monitoring & alerting (who reacts?)
  • RDS autoscaling (up and down)
  • if I specified a "latest" DB version, update the database version as new versions become available. And roll back if it fails.

Your "developer view" just described the manual interactions, which is totally fine as a user story. The developer assumes that the service is always there no matter what goes wrong, and that takes extra effort that must be implemented in the set of operators.

About the design: I think it is too early to really decide upon the cut of the domain into different operators. I would start with the top level operator and see where there is a benefit from breaking out functionality into a dedicated operator that maybe also can be used for other stuff. Maybe breaking it out is even something better left for a later iteration and when you actually have that other thing that needs the same functionality :-)

@nilebox

nilebox commented Jan 15, 2019

The main technical difference is that aws-service-broker is implementing an OSB REST API (http://www.openservicebrokerapi.org/) that can be integrated with Kubernetes via Service Catalog (https://github.com/kubernetes-incubator/service-catalog), while aws-service-operator is a CRD controller that directly integrates with Kubernetes API.
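
To make the contrast concrete, here is a rough sketch of the same bucket request expressed both ways. The OSB route and the body field names (`service_id`, `plan_id`, `parameters`, and so on) follow the OSB specification; every ID, the API group `example.aws.dev/v1alpha1`, the kind `S3Bucket`, and the spec fields are placeholders chosen for illustration, not the real values used by either project.

```python
import uuid


def osb_provision_request(service_id, plan_id, parameters, namespace):
    """Build the HTTP call a platform (e.g. Service Catalog) sends to a broker."""
    instance_id = str(uuid.uuid4())
    return {
        "method": "PUT",
        "path": f"/v2/service_instances/{instance_id}",
        "headers": {"X-Broker-API-Version": "2.14"},
        "body": {
            "service_id": service_id,
            "plan_id": plan_id,
            # Legacy fields the spec still requires; values are placeholders.
            "organization_guid": namespace,
            "space_guid": namespace,
            "context": {"platform": "kubernetes", "namespace": namespace},
            "parameters": parameters,
        },
    }


def crd_manifest(name, namespace, spec):
    """Build a custom resource that a CRD controller would watch directly."""
    return {
        "apiVersion": "example.aws.dev/v1alpha1",  # hypothetical group/version
        "kind": "S3Bucket",  # hypothetical kind
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


request = osb_provision_request(
    "s3-service-id", "dev-plan-id", {"versioning": True}, "team-a"
)
manifest = crd_manifest("my-bucket", "team-a", {"versioning": True})
```

With OSB, the platform translates a catalog object into a REST call to an external broker; with a CRD, the object stored in the Kubernetes API is itself the request, and the controller reacts to it.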

Both can solve a similar set of problems, and the difference is mainly in which features have been implemented so far, if I understand correctly. To me these are competing projects from AWS, probably written by different teams, and whichever pattern wins in the long term will/may become a standard.

@pierreozoux

Also relevant to this discussion: https://blog.byte.builders/post/the-case-for-appbinding/
I'd love @tamalsaha to give his thoughts on this issue! ( I think he'd say go for operators! )

@christopherhein
Contributor

Hey all, sorry for the radio silence here; with KubeCon and the holidays this fell by the wayside. Being the author of the project, I'll give a little direction on why this project was created when we, the team I work on, also created the AWS Service Broker. These tools weren't created completely in silos.

Starting from the beginning: although these projects overlap in some core areas, there are a few key differences on each side of the fence.

  1. From a development/operations background I see a heavy benefit in using the raw CRD API to allow for easier deployment and usage of a tool like this. With the operator, from an ops perspective, you don't need to worry about deploying an additional etcd cluster and the service-catalog aggregated API server. For one, this means less to manage, which carries weight in large-scale systems that need to monitor, log and manage more complex components.

  2. The ability to leverage the built-ins within Kubernetes. As of today, things like OSB or any aggregated API server need to implement their own interpretation of the layers on top of the provisioning system: think RBAC, their own endpoint structure, resource (un)marshaling, validations, mutations, status handling, versioning, etc. This leads to disparate implementations or none. This was a huge driver for me. Using the AWS Service Operator, I can now model RBAC for provisioning and managing systems with the same components I use to manage my application workloads, and the same goes for a higher-level management system. For example, if you wanted to always validate that your dev teams configured your S3 buckets without public read, you could write a validating admission controller that checks the value of a specific field in the object and denies those requests. All in all, this allows you to highly customize your own environment.

  3. The OSB spec is a solid platform. It provides a great way to expose the provisioning and binding of those resources, so if your organization is already heading down this path, it makes sense to carry forward with that.

  4. OSBs have a form of playbooks; the operator, on the other hand, is free-form and allows you both to make your own mistakes and to highly customize the resources. With the AWS Service Broker you have access to plans. These plans help you provision something like a production RDS or a dev RDS instance without having to change much; with the Service Operator you'll need to understand a bit more about what you are setting.

  5. What @pierreozoux was saying about continuing this forward is, in general, the direction I'm hoping to eventually take this once the provisioning story is ironed out. I would love to see the continuation of Day 2-n components, either in this project or in others.
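
The "plans" idea from point 4 can be sketched as a set of canned defaults with only a narrow set of user overrides. The plan names, parameter names, and values below are made up for illustration and do not come from the AWS Service Broker's actual catalog.

```python
# Hypothetical plan catalog: each plan is a canned, pre-approved configuration.
PLANS = {
    "dev": {
        "instance_class": "db.t2.micro",
        "multi_az": False,
        "allocated_storage_gb": 20,
    },
    "production": {
        "instance_class": "db.m5.large",
        "multi_az": True,
        "allocated_storage_gb": 100,
    },
}

# Hypothetical: the only parameter a user may change on top of a plan.
USER_SETTABLE = {"allocated_storage_gb"}


def resolve_plan(plan_name, overrides=None):
    """Merge user overrides into a plan, rejecting anything the plan locks down."""
    if plan_name not in PLANS:
        raise KeyError(f"unknown plan: {plan_name}")
    overrides = overrides or {}
    rejected = set(overrides) - USER_SETTABLE
    if rejected:
        raise ValueError(
            f"plan {plan_name!r} does not allow overriding: {sorted(rejected)}"
        )
    return {**PLANS[plan_name], **overrides}


dev_db = resolve_plan("dev", {"allocated_storage_gb": 50})
```

A free-form operator resource would instead accept the full set of fields directly, which is more flexible but leaves the user responsible for every setting.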

Overall both projects are continuing forward right now, and it comes down to a preference for the level of integration you want. The operator is also still in its early days (released in October), and we have a healthy roadmap, including implementing the full CFN resource specification so that any supported CFN resource is exposed via a CRD, enabling highly complex systems to be managed using the control loop design. Ubernetes on AWS 👍 Check out #153 for more information on this.

Hopefully this helps! Given this is alpha I would expect there to be a little bit of churn as we continue to iron out more of the kinks.

@christopherhein christopherhein self-assigned this Jan 15, 2019
@christopherhein christopherhein added the question Issues without actions, just questions. label Jan 15, 2019
@schlomo
Contributor

schlomo commented Jan 15, 2019

@christopherhein thanks a lot for your view. Can you share your thoughts on the following point: only with the operator is there an active component in play that is able to react to changes happening in AWS or to take care of regular maintenance. Also, this active component can take a decision upon instantiation and use a backup instead of starting fresh if the resource deployed has backups from a previous life (e.g. automated disaster recovery).

AFAICT the OSB doesn't allow for this sort of active component but rather has a fairly static mapping and won't support any autonomous changes by the system itself after it was instantiated.

Am I wrong with this view? For me the lack of an active component is the main differentiator. The operator implements automated operations while the OSB implements automated provisioning - and relies on manual operations or other means of operations.

@christopherhein
Contributor

@schlomo you are correct that OSB doesn't currently have a mechanism for this. That's not to say that won't change, but as of today, yes.

I'm really interested in this lifecycle as the project continues to grow and develop. As of today we have a reconciliation loop, but it's one-sided: if there are changes on the k8s side it will reconcile with the CFN resources, but if something changed on the AWS side it would be out of sync. Last year we released Drift Detection for CloudFormation stacks, and I'm really looking forward to implementing this to create a bidirectional sync between Kubernetes and AWS. That way, if you have something using native Autoscaling, for example, and it changes a desired count of X, that update can be pulled back into Kubernetes and reflected in etcd.
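
The difference between the one-sided loop and a bidirectional sync can be sketched as a pure function over two snapshots of state. This is an illustration only: the field names and the `aws_owned` set are assumptions, and no CloudFormation or Kubernetes API is called.

```python
def reconcile(desired, observed, aws_owned=frozenset()):
    """Compare the spec stored in Kubernetes with the state observed in AWS.

    Returns (to_aws, to_k8s):
      to_aws: fields to push to AWS so it matches the Kubernetes spec
      to_k8s: fields to pull back into Kubernetes because AWS owns them
    With an empty aws_owned set this is the one-sided loop: Kubernetes
    always wins and AWS-side changes are simply overwritten.
    """
    to_aws, to_k8s = {}, {}
    for key in sorted(desired.keys() | observed.keys()):
        want, have = desired.get(key), observed.get(key)
        if want == have:
            continue
        if key in aws_owned:
            to_k8s[key] = have  # e.g. a count changed by native autoscaling
        else:
            to_aws[key] = want
    return to_aws, to_k8s


desired = {"desiredCount": 2, "instanceType": "t3.small"}
observed = {"desiredCount": 5, "instanceType": "t3.micro"}

one_sided = reconcile(desired, observed)
bidirectional = reconcile(desired, observed, aws_owned={"desiredCount"})
```

In the bidirectional case the autoscaled `desiredCount` is written back to the Kubernetes object instead of being reset, while `instanceType` is still driven from the Kubernetes side.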

@schlomo
Contributor

schlomo commented Jan 15, 2019

@christopherhein cool! You mentioned validating AWS resource configuration via an admission controller. How would you handle CF functions like References or Conditionals in such a scenario?

So far I can only think of using CF change sets to get a preview of the change after all the CF functions have run their course. Do you see another way? I think that an admission controller would be very challenging to implement, as it would only get the CF template with the functions but not the result of processing them.

@christopherhein
Contributor

christopherhein commented Jan 15, 2019

The validation I'm thinking of is more on the k8s side of things. So if you had a policy for your organization that stated no S3 buckets could have accessControl: PublicRead, you could easily formulate these policies and rules in code and make a dynamic admission controller just for your environment. If someone or something kubectl apply'd a manifest with that set, it could catch it and return an error.
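
As a rough sketch of that idea, the core of such a validating webhook is a function from an AdmissionReview request to an AdmissionReview response. The `uid`, `allowed`, and `status` fields are part of the Kubernetes admission API; the kind name `S3Bucket` and the placement of `accessControl` under `spec` are assumptions about the resource shape, and the HTTP/TLS serving part is left out.

```python
def review_bucket(admission_review):
    """Deny S3 bucket objects that request public read access."""
    request = admission_review["request"]
    obj = request.get("object") or {}
    spec = obj.get("spec") or {}

    response = {"uid": request["uid"], "allowed": True}
    # Assumed resource shape: kind S3Bucket with spec.accessControl.
    if obj.get("kind") == "S3Bucket" and spec.get("accessControl") == "PublicRead":
        response["allowed"] = False
        response["status"] = {
            "code": 403,
            "message": "S3 buckets must not use accessControl: PublicRead",
        }

    return {
        "apiVersion": admission_review.get("apiVersion", "admission.k8s.io/v1"),
        "kind": "AdmissionReview",
        "response": response,
    }


incoming = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {
        "uid": "example-uid",
        "object": {"kind": "S3Bucket", "spec": {"accessControl": "PublicRead"}},
    },
}
verdict = review_bucket(incoming)
```

Because the check only sees the Kubernetes object, it works for literal field values like this one; it does not evaluate anything that CloudFormation resolves later.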

From the CFN side you are correct it's more difficult.

@prashantchitta
Author

prashantchitta commented Jan 15, 2019

Personally I prefer aws-service-operator, as it is closer to the K8s way of doing things. CRDs are the way forward: create a CRD, then write a controller which listens to events and works out what needs to be done.

As for the lifecycle of a resource such as S3 or DynamoDB, aws-service-operator can be extended to include this functionality.

For example, I am currently looking to create S3 buckets and also assign bucket policies to them. Looking at both aws-service-operator and aws-service-broker, for me aws-service-operator is an easy choice. But neither project creates bucket policies. What I am doing currently is extending it to include bucket policies so that teams can define them and don't need to go to the AWS console for this.
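
For readers unfamiliar with bucket policies, here is a minimal sketch of what such an extension would ultimately have to produce: a policy document wrapped in a CloudFormation `AWS::S3::BucketPolicy` resource. The specific rule (deny non-TLS access) and the function names are illustrative choices, not something taken from either project.

```python
import json


def deny_insecure_transport_policy(bucket_name):
    """Build a bucket policy that denies any request not made over TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


def bucket_policy_resource(bucket_name):
    """Wrap the policy in a CloudFormation resource definition."""
    return {
        "Type": "AWS::S3::BucketPolicy",
        "Properties": {
            "Bucket": bucket_name,
            "PolicyDocument": deny_insecure_transport_policy(bucket_name),
        },
    }


resource = bucket_policy_resource("team-a-artifacts")
template_fragment = json.dumps({"TeamBucketPolicy": resource}, indent=2)
```

Since both projects provision through CloudFormation, exposing a resource like this is mostly a matter of mapping user-facing fields onto the `PolicyDocument`.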

@prashantchitta
Author

As for validation, OPA is a good project to look at. You can define policies and restrict whatever you want based on the policies you have defined.

@schlomo
Contributor

schlomo commented Jan 16, 2019

@prashantchitta I haven't had the chance to use OPA, could you tell us how OPA would be able to handle CF functions? I think that supporting CF with all its features is essential to delivering a useful product in this area.

@pierreozoux

Thanks a lot @christopherhein for your detailed insight.

Just a minor point about

From a development/operations background I see a heavy benefit in using the raw CRD API to allow for easier deployment and usage of a tool like this. With the operator, from an ops perspective, you don't need to worry about deploying an additional etcd cluster and the service-catalog aggregated API server. For one, this means less to manage, which carries weight in large-scale systems that need to monitor, log and manage more complex components.

I think that was true in the early days, but I don't think it still holds. In any case, on OpenShift it is transparent for us, and we don't need to maintain a separate etcd cluster.

But as for all the rest, I'm looking forward to seeing the operator develop! Looks promising! On our side I think we'll bet on operators. At the end of the day, this is my job: predict the future and bet on the right architecture :)

On my side, I think we can close the issue, but the philosophical discussion can continue.

@christopherhein
Contributor

christopherhein commented Jan 16, 2019

Cool @pierreozoux, I'm probably going to keep this issue open since I've been asked this multiple times. If anyone is interested in doing a tiny PR, it might be useful to add a link to this in the readme.adoc :)

@mattmcneeney

This thread is really interesting, and I appreciate you giving us the context for how this project fits in with the AWS Service Broker @christopherhein

@pierreozoux - I agree that Operators and OSBAPI can be complementary. There was one thing I wanted to point out though after reading this part of your comment:

But it can't and will never be able to do anything specific to the service itself.

You're completely right that, today, OSBAPI only handles basic lifecycle management (provisioning, updating, binding, unbinding, deprovisioning), and if you want to do anything else (e.g. creating snapshots) you need another mechanism to do so (e.g. an Operator). This is something that the OSBAPI group have discussed many times, and there is some work in progress that aims to allow service providers to add extensions to their brokers, which platforms could choose to expose to developers (think actions like backup, restore, etc). Hope that helps. I'm also keenly watching how this future is going to develop!
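
The five lifecycle operations listed above map onto a small, fixed set of routes in the OSB specification, which is why anything outside them (snapshots, restores) has no place to go today. A minimal sketch of that mapping, with `route_for` as a hypothetical helper name:

```python
# Lifecycle operations and their routes as defined by the OSB specification.
OSB_ROUTES = {
    "provision": ("PUT", "/v2/service_instances/{instance_id}"),
    "update": ("PATCH", "/v2/service_instances/{instance_id}"),
    "bind": (
        "PUT",
        "/v2/service_instances/{instance_id}/service_bindings/{binding_id}",
    ),
    "unbind": (
        "DELETE",
        "/v2/service_instances/{instance_id}/service_bindings/{binding_id}",
    ),
    "deprovision": ("DELETE", "/v2/service_instances/{instance_id}"),
}


def route_for(operation, **ids):
    """Return (method, path) for a lifecycle operation, or fail if it is not one."""
    if operation not in OSB_ROUTES:
        raise ValueError(f"{operation!r} is not an OSB lifecycle operation")
    method, template = OSB_ROUTES[operation]
    return method, template.format(**ids)


bind_call = route_for("bind", instance_id="db-1", binding_id="app-1")
```

Asking for `route_for("snapshot", instance_id="db-1")` raises a `ValueError`, which mirrors the gap the proposed broker extensions are meant to fill.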

@christopherhein christopherhein pinned this issue Mar 27, 2019
@schollii

@prashantchitta I have a question about what you say here:

What I am doing currently is extending it to include bucket policies so that teams can define them and don't need to go to the AWS console for this.

Can you elaborate on "extending it"? Do you mean you are forking the Git repo and adding code, or something else? If so, is there code you can share?

@bingosummer

@christopherhein aws-service-broker can bind the secrets of the service to the application as environment variables. Can aws-service-operator do this? How? Thanks.
