-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UPSTREAM: <carry>: Added support for TLS to MLMD GRPC Server #683
base: main
Are you sure you want to change the base?
Conversation
A new image has been built to help with testing out this PR: To use this image run the following: cd $(mktemp -d)
git clone git@github.com:opendatahub-io/data-science-pipelines-operator.git
cd data-science-pipelines-operator/
git fetch origin pull/683/head
git checkout -b pullrequest 85a6ca9ebb69f28e05780de54509b0d3855cb533
oc new-project opendatahub
make deploy IMG="quay.io/opendatahub/data-science-pipelines-operator:pr-683" More instructions here on how to deploy and test a Data Science Pipelines Application. |
c2cfd2e
to
488ff25
Compare
620fd03
to
777e81c
Compare
Signed-off-by: hbelmiro <helber.belmiro@gmail.com>
5adfe93
to
20dd747
Compare
Change to PR detected. A new PR build was completed. |
/verified |
// +kubebuilder:validation:Optional | ||
CertificateContents string `json:"certificateContents,omitempty"` | ||
// +kubebuilder:validation:Optional | ||
PrivateKeyContents string `json:"privateKeyContents,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love that the user needs to paste these in here. For apiserver, we are able to just use the paths to the OpenShift-provded certs like so:
data-science-pipelines-operator/config/internal/apiserver/default/deployment.yaml.tmpl
Lines 180 to 191 in e101b0f
name: ds-pipeline-api-server | |
command: ['/bin/apiserver'] | |
args: | |
- --config=/config | |
- -logtostderr=true | |
{{ if .APIServer.EnableSamplePipeline }} | |
- --sampleconfig=/config/sample_config.json | |
{{ end }} | |
{{ if .PodToPodTLS }} | |
- --tlsCertPath=/etc/tls/private/tls.crt | |
- --tlsCertKeyPath=/etc/tls/private/tls.key | |
{{ end }} |
could we not do something similar for here? Could we not read those file contents instead of having them be manually pasted here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is automatically set by the operator and is not for users to use (technically they can). I had to add these fields to be able to set the contents in the template.
Regarding setting the contents, I also don't like it, but the MLMD config requires that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: having to set the contents -- sure, no way around that.
I had to add these fields to be able to set the contents in the template.
Ok, I understand the reason. I really don't like adding fields to the API for this though. Hmm. How sure are we that there's no other way to feed things to the manifestival templating? That doesn't sound right to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think you could just add two strings roughly here
CustomKfpLauncherConfigMapData string |
and then populate them
and then use those to feed the templating engine (example:
data-science-pipelines-operator/config/internal/apiserver/default/kfp_launcher_config.yaml.tmpl
Line 4 in e101b0f
{{.CustomKfpLauncherConfigMapData}} |
and avoid changing the DSPA API
controllers/dspipeline_params.go
Outdated
func (p *DSPAParams) getMlmdGrpcCertificatesSecret(ctx context.Context, client client.Client) (v1.Secret, error) { | ||
secretName := types.NamespacedName{ | ||
Namespace: p.Namespace, | ||
Name: "ds-pipeline-metadata-grpc-tls-certs-" + p.Name, | ||
} | ||
secret := v1.Secret{} | ||
err := client.Get(ctx, secretName, &secret) | ||
return secret, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be a static helper method like we do for configmaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
controllers/dspipeline_params.go
Outdated
if apierrs.IsNotFound(err) && ignoreSecretNotFound { | ||
logger.Info("Ignoring secret not found due to DSPO_MLMD_IGNORE_AUTO_GENERATED_SECRET_NOT_FOUND_ERROR = true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a project admin, I think I might find this log message confusing.
Maybe something more user friendly like "Did not detect an MLMD GRPC Service-CA provisioned TLS kubernetes secret, continuing (DSPA is configured to ignore)."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a more broader point, I actually think we might want to report an error here instead, because we enter here at: if params.PodToPodTLS == true, in this case we expect the service-ca provided tls secret to exist, if it doesn't, something has gone wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #683 (comment).
ssl_config { | ||
server_cert: "{{.MLMD.GRPC.CertificateContents}}" | ||
server_key: "{{.MLMD.GRPC.PrivateKeyContents}}" | ||
client_verify: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to confirm, this is for mutual tls right? i.e. server also verifying client tls certs? I'm thinking maybe we should drop a comment here mentioning that? wdyt? in proto files comments can be added via //
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
controllers/dspipeline_params.go
Outdated
@@ -893,3 +898,29 @@ func (p *DSPAParams) ExtractParams(ctx context.Context, dsp *dspa.DataSciencePip | |||
|
|||
return nil | |||
} | |||
|
|||
func (p *DSPAParams) LoadMlmdCertificates(ctx context.Context, client client.Client, logger logr.Logger) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we keep ExtractParams()
at the bottom?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
DBConnectionTimeoutConfigName = "DSPO.HealthCheck.Database.ConnectionTimeout" | ||
RequeueTimeConfigName = "DSPO.RequeueTime" | ||
ApiServerIncludeOwnerReferenceConfigName = "DSPO.ApiServer.IncludeOwnerReference" | ||
MlmdIgnoreAutoGeneratedSecretNotFoundError = "DSPO.Mlmd.IgnoreAutoGeneratedSecretNotFoundError" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you elaborate on why this is needed? when do we want to ignore the secret?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll ignore the secret on tests. Since tests don't run on OpenShift, the secret is never created.
Should I add a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this might be a hack to get around our existing tests. ATM I'm thinking we should just not be doing PodtoPodTLS
testing that is is not an live openshift environment. See: https://issues.redhat.com/browse/RHOAIENG-11646, Can we just set this to false, and not require the creation of such secrets to begin with in this mode?
What I'd rather see is we add openshift ci, and add integration tests for when podtopodtls == true
Since this feature inherently requires a working live openshift ci environment, the tests for this should be integration tests in nature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have tests that have TLS enabled, for example
data-science-pipelines-operator/controllers/testdata/declarative/case_6/deploy/04_cr.yaml
Line 15 in cf1bd60
podToPodTLS: true |
err := r.ApplyDir(dsp, params, mlmdTemplatesDir+"/"+mlmdGrpcService) | ||
if err != nil { | ||
return err | ||
} | ||
|
||
if params.PodToPodTLS { | ||
err = params.LoadMlmdCertificates(ctx, r.Client, r.Log) | ||
if err != nil { | ||
return err | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, okay so I think you're doing this because we need the tls/key pair generated to submit into the protofile. One issue I see with this is that it could result in potential failures initially, because service-ca is creating these secrets asynchronously, and may not create it in time before we attempt to fetch it.
I'm wondering if it might be better to do this in stages:
- first reconcile and create the service
- ensure dspo triggers a reconcile when the service-ca tls secret is created
- in the new reconcile, detect that we are podtopodtls == true && secretCreated == true, then proceed with r.ApplyDir(mlmddir)
Interested to hear thoughts, @gregsheremeta / @hbelmiro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@HumairAK we would wait for the new reconcile indefinitely and never know if the secret was never created.
What about trying to LoadMlmdCertificates
and repeat until a timeout (maybe 30s?)? It would be simpler and we can log an error if the certificates are not found.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: hbelmiro <helber.belmiro@gmail.com>
Signed-off-by: hbelmiro <helber.belmiro@gmail.com>
Signed-off-by: hbelmiro <helber.belmiro@gmail.com>
Signed-off-by: hbelmiro <helber.belmiro@gmail.com>
Signed-off-by: hbelmiro <helber.belmiro@gmail.com>
Change to PR detected. A new PR build was completed. |
This reverts commit 692aad2.
Change to PR detected. A new PR build was completed. |
Signed-off-by: hbelmiro <helber.belmiro@gmail.com>
Signed-off-by: hbelmiro <helber.belmiro@gmail.com>
Change to PR detected. A new PR build was completed. |
The issue resolved by this Pull Request:
Resolves https://issues.redhat.com/browse/RHOAIENG-4971
This PR depends on:
Description of the changes:
Added support for TLS to MLMD GRPC Server
Testing instructions
There are 2 scenarios for testing:
TLS Enabled
Deploy the following DSPA
Run the sample pipeline
The run must complete successfully
TLS Disabled
Deploy the following DSPA
Run the sample pipeline
The run must complete successfully
Checklist