-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Jaeger CSV and Package for OLM integration and deployment of the … #173
Conversation
Codecov Report
@@ Coverage Diff @@
## master #173 +/- ##
======================================
Coverage 96.7% 96.7%
======================================
Files 32 32
Lines 1639 1639
======================================
Hits 1585 1585
Misses 41 41
Partials 13 13 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @awgreene)
a discussion (no related file):
This looks really cool, @awgreene! Thanks for your contribution. I just have a couple of questions, but nothing blocking.
deploy/bundle/jaeger.v1.8.2.clusterserviceversion.yaml, line 18 at r1 (raw file):
description: |- Provides monitoring and troubleshooting microservices-based distributed systems keywords: ['tracing', 'monitoring', 'troubleshooting', 'distributed']
What's the source for these keywords? Perhaps distributed
could be removed, or perhaps changed to distributed tracing
?
deploy/bundle/jaeger.v1.8.2.clusterserviceversion.yaml, line 45 at r1 (raw file):
strategy: deployment spec: permissions:
Can we reuse this from the role.yaml
? Or have a pre-commit hook that checks and blocks the commit if the contents diverge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @awgreene)
a discussion (no related file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
This looks really cool, @awgreene! Thanks for your contribution. I just have a couple of questions, but nothing blocking.
I have one more request: could you please add information to the CONTRIBUTING.adoc about how to run the scorecard tests?
Hello @jpkrohling, thanks for reviewing my PR.
I made a best effort attempt at adding the appropriate categories based on the content of the operator - I will follow your suggestion and remove
Good question - the CSV has to include the service account name and permissions. Creating a pre-commit hook that compares the permissions in the CSV and the openshift role (and vice verse) seems like an appropriate solution. I haven't made a pre-commit hook before, is there an easy way to add a hook that checks that one block of text exists in another?
I will work on updating the document today. |
It would probably be better to make it as a shell script, so that we can reuse it in the If the same YAML is expected in both files, only with different indentation, then the easiest solution would be to play with The hook itself is the easy part: just add a |
1db0375
to
509a481
Compare
@jpkrohling I have updated the CONTRIBUTING.adoc to include information about testing the CSV with the operator-sdk. Please let me know if you would like any changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @awgreene)
CONTRIBUTING.adoc, line 146 at r2 (raw file):
OLM also enforces some constraints on the components it manages in order to ensure a good user experience. The Jaeger community can provide and mantain a link:https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Documentation/design/building-your-csv.md/[ClusterServiceVersion (CSV) YAML] to integrate with OLM.
Not quite sure I understood this. Should this be "... provides and maintains a ..." instead?
CONTRIBUTING.adoc, line 151 at r2 (raw file):
[source,bash] ---- operator-sdk scorecard --cr-manifest deploy/cr-example.yaml --csv-path ./deploy/bundle/camel-k.v0.1.0.clusterserviceversion.yaml
This should probably be:
operator-sdk scorecard --cr-manifest deploy/examples/simplest.yaml --csv-path ./deploy/bundle/jaeger.v1.8.2.clusterserviceversion.yaml
But the command above fails for some attempts with:
Checking for existence of spec and status blocks in CR
Checking that operator actions are reflected in status
Checking that writing into CRs has an effect
FATA[0011] failed to get logs: container "scorecard-proxy" in pod "jaeger-operator-676d5f74b6-84kmd" is waiting to start: ContainerCreating
Eventually, I get this output:
Checking for existence of spec and status blocks in CR
Checking that operator actions are reflected in status
Checking that writing into CRs has an effect
Checking for CRD resources
Checking for existence of example CRs
Checking spec descriptors
Checking status descriptors
Basic Operator:
Spec Block Exists: 1/1 points
Status Block Exist: 1/1 points
Operator actions are reflected in status: 0/1 points
Writing into CRs has an effect: 1/1 points
OLM Integration:
Owned CRDs have resources listed: 0/1 points
CRs have at least 1 example: 1/1 points
Spec fields with descriptors: 0/12 points
Status fields with descriptors: N/A (depends on an earlier test that failed)
Total Score: 4/18 points
It would be nice to add the expected output to the doc as well, along with a description of what's necessary to avoid the error above.
Which version of the Operator SDK is required for this? If a version newer than 0.1.1 is required, we need to update a few places as well:
CONTRIBUTING.adoc
- Installing the Operator SDK command line tool.travis/install.sh
Gopkg.toml
(and the lock file, after runningdep ensure
)
deploy/bundle/jaeger.v1.8.2.clusterserviceversion.yaml, line 23 at r2 (raw file):
"strategy": "allInOne", "allInOne": { "image": "jaegertracing/all-in-one:1.8",
This demonstrates that the contents of this file will get outdated really fast if we don't add some sort of automated check... We are at 1.9 already :-)
This is a fast moving project! The operator-sdk introduced functionality that will generate a CSV based on the deploy files. If the deploy files change and a CSV already exist the sdk will update the changed values in the CSV. The hook could run the SDK command and check if the files don't match. Note: There is a pending PR to the SDK that need to be merged before this can be done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @awgreene and @jpkrohling)
CONTRIBUTING.adoc, line 146 at r2 (raw file):
provides and maintains a
Good point, I will update the contribution doc to reflect this suggestion.
CONTRIBUTING.adoc, line 151 at r2 (raw file):
operator-sdk scorecard --cr-manifest deploy/examples/simplest.yaml --csv-path ./deploy/bundle/jaeger.v1.8.2.clusterserviceversion.yaml
This is correct, I will update the contribution doc.
But the command above fails for some attempts with:
Checking for existence of spec and status blocks in CR
Checking that operator actions are reflected in status
Checking that writing into CRs has an effect
FATA[0011] failed to get logs: container "scorecard-proxy" in pod "jaeger-operator-676d5f74b6-84kmd" is waiting to start: ContainerCreating
This error appears to be the result of the Jaeger operator not coming up in time for the CR test - one can configure the timeout with an addition flag to the sdk:
$ operator-sdk scorecard --help
...
-init-timeout int Timeout for status block on CR to be created in seconds (default 10)
...
I will update the example command above to include a 30 second init-timeout to avoid this in the future.
Which version of the Operator SDK is required for this?
The scorecard feature is currently implemented in the master branch, a tagged release with the feature hasn't yet been released. For now, I have update this section of the document to suggest using the master branch and will update it once the version has been tagged.
deploy/bundle/jaeger.v1.8.2.clusterserviceversion.yaml, line 18 at r1 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
What's the source for these keywords? Perhaps
distributed
could be removed, or perhaps changed todistributed tracing
?
Done.
3da1e28
to
14952c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @awgreene)
a discussion (no related file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
I have one more request: could you please add information to the CONTRIBUTING.adoc about how to run the scorecard tests?
I think we are close. I'm just missing changes to the following files (from a previous comment):
CONTRIBUTING.adoc
- Installing the Operator SDK command line tool.travis/install.sh
Gopkg.toml
(and the lock file, after running dep ensure)
There are a couple of other non-blocking comments as well and I'm a bit uneasy to use master
instead of a tagged version, but considering that we won't fail the build because of this, I think it's OK to have it there as a "preview".
CONTRIBUTING.adoc, line 151 at r2 (raw file):
Previously, awgreene (Alexander Greene) wrote…
operator-sdk scorecard --cr-manifest deploy/examples/simplest.yaml --csv-path ./deploy/bundle/jaeger.v1.8.2.clusterserviceversion.yaml
This is correct, I will update the contribution doc.
But the command above fails for some attempts with:
Checking for existence of spec and status blocks in CR
Checking that operator actions are reflected in status
Checking that writing into CRs has an effect
FATA[0011] failed to get logs: container "scorecard-proxy" in pod "jaeger-operator-676d5f74b6-84kmd" is waiting to start: ContainerCreatingThis error appears to be the result of the Jaeger operator not coming up in time for the CR test - one can configure the timeout with an addition flag to the sdk:
$ operator-sdk scorecard --help ... -init-timeout int Timeout for status block on CR to be created in seconds (default 10) ...
I will update the example command above to include a 30 second init-timeout to avoid this in the future.
Which version of the Operator SDK is required for this?
The scorecard feature is currently implemented in the master branch, a tagged release with the feature hasn't yet been released. For now, I have update this section of the document to suggest using the master branch and will update it once the version has been tagged.
In a follow-up PR, I'll add this command as is to a Makefile
target, probably named scorecard
, so that developers just need to run make scorecard
.
Also, the timeout would probably need to be even higher, as I did get the same failure as before:
$ operator-sdk scorecard --cr-manifest deploy/examples/simplest.yaml --csv-path deploy/olm-catalog/jaeger-operator.csv.yaml --init-timeout 30
Checking for existence of spec and status blocks in CR
Checking that operator actions are reflected in status
Checking that writing into CRs has an effect
FATA[0031] failed to get logs: container "scorecard-proxy" in pod "jaeger-operator-7bdf49d479-rl8l8" is waiting to start: ContainerCreating
@awgreene: could you please confirm the following message isn't a real problem?
Unknown type for key (enabled) in spec: (<nil>)
Here's the full context (3rd output line):
$ operator-sdk scorecard --cr-manifest deploy/examples/simplest.yaml --csv-path deploy/olm-catalog/jaeger-operator.csv.yaml --init-timeout 30
Checking for existence of spec and status blocks in CR
Checking that operator actions are reflected in status
Unknown type for key (enabled) in spec: (<nil>)
Checking that writing into CRs has an effect
Checking for CRD resources
Checking for existence of example CRs
Checking spec descriptors
Checking status descriptors
Basic Operator:
Spec Block Exists: 1/1 points
Status Block Exist: 1/1 points
Operator actions are reflected in status: 0/1 points
Writing into CRs has an effect: 1/1 points
OLM Integration:
Owned CRDs have resources listed: 0/1 points
CRs have at least 1 example: 1/1 points
Spec fields with descriptors: 0/12 points
Status fields with descriptors: N/A (depends on an earlier test that failed)
Total Score: 4/18 points
deploy/olm-catalog/jaeger-operator.csv.yaml, line 16 at r3 (raw file):
"strategy": "allInOne", "allInOne": { "image": "jaegertracing/all-in-one:1.8",
This should also be bumped to 1.9
, but can be done in a follow-up PR.
deploy/olm-catalog/jaeger.package.yaml, line 4 at r3 (raw file):
channels: - name: alpha currentCSV: jaeger-operator.v1.8.2
Shouldn't this be 1.9.0
?
0220e60
to
c3cb162
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 4 files reviewed, 4 unresolved discussions (waiting on @jpkrohling)
a discussion (no related file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
I think we are close. I'm just missing changes to the following files (from a previous comment):
CONTRIBUTING.adoc
- Installing the Operator SDK command line tool.travis/install.sh
Gopkg.toml
(and the lock file, after running dep ensure)There are a couple of other non-blocking comments as well and I'm a bit uneasy to use
master
instead of a tagged version, but considering that we won't fail the build because of this, I think it's OK to have it there as a "preview".
I assume we can't update the .travis/install.sh
and Gopkg.toml
until the new tagged version is released right?
CONTRIBUTING.adoc, line 151 at r2 (raw file):
@awgreene: could you please confirm the following message isn't a real problem?
I spoke with Alex Pavel, the designer of the Operator-sdk scorecard feature, and he said that the error is a result of a couple of structs in the spec that have a field named "enabled" of type *bool
. If that is unset, that value is <nil>
. It might be a bit cleaner to no have pointers as fields for k8s CRs, and that might fix the problem but there may be a reason for using pointers instead of real bool
s, so it depends how the operator actually works. The "Operator actions are reflected in status" test is still a new feature and future development to improve the test is planned. This should not be seen as a huge issue for now.
deploy/olm-catalog/jaeger-operator.csv.yaml, line 16 at r3 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
This should also be bumped to
1.9
, but can be done in a follow-up PR.
I can make sure to update this now since I have to address the error in the package.
deploy/olm-catalog/jaeger.package.yaml, line 4 at r3 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
Shouldn't this be
1.9.0
?
It should be - I failed to update this after upgrading the CSV with the SDK.
…operator through OperatorHub - This pull requests introduces a CSV and Package which can be bundled together and used by the [Operator Lifecycle Manager](https://github.com/operator-framework/operator-lifecycle-manager) to install, manage, and upgrade the Jaeger Operator in a cluster. - The Jaeger operator versions available to all Kubernetes clusters using OLM can be updated by submitting a pull request to the [Community Operators GitHub repo](https://github.com/operator-framework/community-operators) that includes the latest CSVs, CRDs, and Packages. - The CSV was created based of the [documentation provided by OLM](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Documentation/design/building-your-csv.md) and the [documentation](https://github.com/operator-framework/community-operators/blob/master/docs/marketplace-required-csv-annotations.md) provided by [OperatorHub](https://github.com/operator-framework/operator-marketplace). - The operator and the CSV were tested using the [scorecard functionality recently introduced](operator-framework/operator-sdk#758) to the [Operator-sdk](https://github.com/operator-framework/operator-sdk). Currently, the Jaeger operator scores a 4/18 on the scorecard tests. OLM integration could be improved by future code changes to the Operator: - Addressing the operator-sdk scorecard tests that failed. - Adding additional information to the Jaeger CSV based on the [CSV documentation provided by OLM](https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Documentation/design/building-your-csv.md) - Adding additional information to the Jaeger CSV based on the [CSV documentation provided by OperatorHub](https://github.com/operator-framework/community-operators/blob/master/docs/marketplace-required-csv-annotations.md) Signed-off-by: awgreene <agreene@redhat.com>
deploy/olm-catalog/jaeger.package.yaml, line 4 at r3 (raw file): Previously, awgreene (Alexander Greene) wrote…
Done. |
deploy/olm-catalog/jaeger-operator.csv.yaml, line 16 at r3 (raw file): Previously, awgreene (Alexander Greene) wrote…
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @awgreene)
a discussion (no related file):
Previously, awgreene (Alexander Greene) wrote…
I assume we can't update the
.travis/install.sh
andGopkg.toml
until the new tagged version is released right?
I'd rather not, as we don't want the PR builds to break because of something outside of the PRs (like a bad commit to master
).
I think this for a first round. We may just need to perform a follow-up PR based on our day-to-day usage.
@objectiser WDYT?
CONTRIBUTING.adoc, line 151 at r2 (raw file):
Previously, awgreene (Alexander Greene) wrote…
@awgreene: could you please confirm the following message isn't a real problem?
I spoke with Alex Pavel, the designer of the Operator-sdk scorecard feature, and he said that the error is a result of a couple of structs in the spec that have a field named "enabled" of type
*bool
. If that is unset, that value is<nil>
. It might be a bit cleaner to no have pointers as fields for k8s CRs, and that might fix the problem but there may be a reason for using pointers instead of realbool
s, so it depends how the operator actually works. The "Operator actions are reflected in status" test is still a new feature and future development to improve the test is planned. This should not be seen as a huge issue for now.
*bool
is valid, because the absence of value has a meaning (not set), allowing the operator to apply a sensible default (true/false). I'll just ignore this message for now, but let me know if you want me to open an issue against the Operator SDK repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved
…operator through OperatorHub
This pull requests introduces a CSV and Package which can be bundled together and used by the Operator Lifecycle Manager to install, manage, and upgrade the Jaeger Operator in a cluster.
The Jaeger operator versions available to all Kubernetes clusters using OLM can be updated by submitting a pull request to the Community Operators GitHub repo that includes the latest CSVs, CRDs, and Packages.
The CSV was created based of the documentation provided by OLM and the documentation provided by OperatorHub.
The operator and the CSV were tested using the scorecard functionality recently introduced to the Operator-sdk. Currently, the Jaeger operator scores a 4/18 on the scorecard tests.
OLM integration could be improved by future code changes to the Operator: