Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SHIP-0024: Surfacing Errors to BuildRuns #30

Conversation

dalbar
Copy link
Member

@dalbar dalbar commented Sep 20, 2021

This ship proposes introducing error messages from low-level resources like TaskRuns into BuildRuns. So that users can better understand the underlying issues and third-party applications have an API to consume them.

Copy link
Member

@SaschaSchwarze0 SaschaSchwarze0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
Comment on lines 138 to 139
results. It requires iterating all steps and processing the message
field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all steps but the first one that failed = the one that we store in status.failedAt.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. However it introduces a dependency that UpdateBuildRunUsingTaskRunCondition has to be run before the error extracting logic.

ships/0024-surfacing-errors-to-buildrun.md Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
@dalbar dalbar force-pushed the 0024-surfacing-errors-to-buildrun branch 4 times, most recently from be2dac7 to e349b62 Compare October 5, 2021 13:59
Copy link
Member

@SaschaSchwarze0 SaschaSchwarze0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@openshift-ci
Copy link

openshift-ci bot commented Oct 5, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SaschaSchwarze0

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 5, 2021
@dalbar dalbar force-pushed the 0024-surfacing-errors-to-buildrun branch from e349b62 to 86c09d9 Compare October 5, 2021 14:06
@dalbar
Copy link
Member Author

dalbar commented Oct 11, 2021

Initial implementation for reference: shipwright-io/build#901

@SaschaSchwarze0
Copy link
Member

/assign coreydaley
/assign adambkaplan

Comment on lines +33 to +38
This ship provides a mechanism to surface errors in the build process
into BuildRun objects granting more visibility into
without being Tekton/Implementation specific. To
achieve this goal, we introduce two new generic results
`shp-error-reason` and `shp-error-message`, that a TaskRun can emit
and a BuildRun can pick up.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that these should be more generic, which would allow for surfacing not only error messages and reasons, but success messages and reasons (and anything in between).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in shipwright-io/build#901 (comment). I would rather prefer a dedicated field for errors. Do you see a specific use-case where we would need a generic message/reason that is not fitting into our current build results and also not into the possible future implementation of the strategy author results (see https://github.com/shipwright-io/community/blob/main/ships/0023-surface-results-in-buildruns.md#strategy-results-api-proposal).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So some analogous "prior art" that conforms with I think applies to what is laid out here:

  1. we are really talking about "communicating" information discovered by one controller that is of interest to another controller, where some sort of dependency between the controllers exists

  2. this is analogous in my opinion to what the openshift cluster version operator gathers from its dependent operators (where operator == controller + mgmt API)

  3. and if you look at that contract, you'll see something very similar. Consider: https://github.com/openshift/api/blob/release-4.8/operator/v1/types.go#L158-L164 ... note reason and error

  4. and more immediately / directly, look at what exists in shipwright that looks eerily similar: https://github.com/shipwright-io/build/blob/ac139dba9437bca165137856e18a593417945630/pkg/apis/build/v1alpha1/buildrun_types.go#L241-L261

So I see a successful pattern that has be reapplied a few times.

I would contend the focus on errors is a simplifying assumption that removes the need for type/status/last transition time from that operator type I noted above

the intermediate / non-terminal TaskRun statues are captured via other elements of the existing shipwright API

so in that sense, fwiw I at least am good with only introducing reason/message (again) in our API

that said, I do wonder about where in the build run status we could surface our failed reason/message

I'll see if it makes sense to dive into that as I progress with this

Copy link
Member

@gabemontero gabemontero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

general approval from me @dalbar

but I think there is one opportunity for making our status a bit more compact and organized

thanks

Comment on lines +33 to +38
This ship provides a mechanism to surface errors in the build process
into BuildRun objects granting more visibility into
without being Tekton/Implementation specific. To
achieve this goal, we introduce two new generic results
`shp-error-reason` and `shp-error-message`, that a TaskRun can emit
and a BuildRun can pick up.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So some analogous "prior art" that conforms with I think applies to what is laid out here:

  1. we are really talking about "communicating" information discovered by one controller that is of interest to another controller, where some sort of dependency between the controllers exists

  2. this is analogous in my opinion to what the openshift cluster version operator gathers from its dependent operators (where operator == controller + mgmt API)

  3. and if you look at that contract, you'll see something very similar. Consider: https://github.com/openshift/api/blob/release-4.8/operator/v1/types.go#L158-L164 ... note reason and error

  4. and more immediately / directly, look at what exists in shipwright that looks eerily similar: https://github.com/shipwright-io/build/blob/ac139dba9437bca165137856e18a593417945630/pkg/apis/build/v1alpha1/buildrun_types.go#L241-L261

So I see a successful pattern that has be reapplied a few times.

I would contend the focus on errors is a simplifying assumption that removes the need for type/status/last transition time from that operator type I noted above

the intermediate / non-terminal TaskRun statues are captured via other elements of the existing shipwright API

so in that sense, fwiw I at least am good with only introducing reason/message (again) in our API

that said, I do wonder about where in the build run status we could surface our failed reason/message

I'll see if it makes sense to dive into that as I progress with this

ships/0024-surfacing-errors-to-buildrun.md Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Show resolved Hide resolved
@dalbar dalbar force-pushed the 0024-surfacing-errors-to-buildrun branch from 86c09d9 to eaf6ab9 Compare October 13, 2021 17:17
@dalbar
Copy link
Member Author

dalbar commented Oct 14, 2021

The Ship now includes #30 (comment) and more alternatives including conditions.

Copy link
Member

@gabemontero gabemontero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some wording suggestions but otherwise I think we are basically there

ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
ships/0024-surfacing-errors-to-buildrun.md Outdated Show resolved Hide resolved
@dalbar dalbar force-pushed the 0024-surfacing-errors-to-buildrun branch 2 times, most recently from 1b0396e to 5e046a4 Compare October 14, 2021 18:14
@gabemontero
Copy link
Member

LGTM

I'll defer putting the actual label for now in case @SaschaSchwarze0 wants one more pass before this merges

thanks @dalbar !

Added initial draft for error surfacing ship.

Update ships/0024-surfacing-errors-to-buildrun.md

Co-authored-by: Sascha Schwarze <schwarzs@de.ibm.com>
@dalbar dalbar force-pushed the 0024-surfacing-errors-to-buildrun branch from 5e046a4 to c063203 Compare October 14, 2021 18:38
@SaschaSchwarze0
Copy link
Member

LGTM

I'll defer putting the actual label for now in case @SaschaSchwarze0 wants one more pass before this merges

thanks @dalbar !

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 15, 2021
@openshift-merge-robot openshift-merge-robot merged commit 8772511 into shipwright-io:main Oct 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants