Link git outputs between tasks #270

shashwathi · 2018-11-27T17:51:11Z

This PR implements outputs using a PVC that lives for the lifetime of the PipelineRun. I closed the old PR #255 in favor of this one. The design discussion is in that PR, so please refer to it for more details.

Note:

- Replaces the current multi-task PipelineRun e2e test with a fan-in and fan-out pipeline. (It is more complex than the existing e2e test and also covers all of its use cases.)
- Adds another example under examples to demonstrate the git resource dependency.
- Support for targetPath is in the pipeline task resource. The e2e test includes this use case as well.

What will not be included?

Support for images
This PR only includes support for git resource dependency. Support for images includes some design of metadata and I would like to follow up with another PR for that.

/assign @imjasonh
/assign @bobcatfish
/assign @pivotal-nader-ziada

bobcatfish

I'd like to start off by saying this is a great PR and thanks for putting all this thorough work in 😍

I especially appreciate your attention to test coverage and all of the well factored unit tests :D

Of course I've always got some feedback tho, my apologies 😅 ! Initial thoughts below:

Could this PR also include some info in the docs about how the outputs are passed b/w the tasks? maybe at https://github.com/knative/build-pipeline/blob/master/docs/using.md#creating-resources
- And some docs about the new targetPath resource type

I'm confused about the fact that in the e2e test, it looks like the two fanned-out tasks modify the same resource. Is that right? I thought we wanted to explicitly avoid that :S :S :S

> Replaces current e2e test for multiple tasks pipelinerun test with fan-in and fan-out pipeline. (More complex than existing e2e test and covers all use cases previous e2e test as well)

Since we haven't completed #168 yet, what order does the fan-in, fan-out end up executing in?

Quick plug for rewriting some of the commit messages, see https://github.com/knative/build-pipeline/blob/master/CONTRIBUTING.md#commit-messages (commit messages like "POC outputs", "Fix merge conflicts" could have more info - also might lend themselves to better messages if they were rebased together? and it would be great if the issue #s could be in some of the commit messages)

bobcatfish · 2018-11-27T18:29:21Z

test/helm_task_test.go

-					Name:       "workspace",
-					ProvidedBy: []string{createImageTaskName},
+					Name: "workspace",
+					//	ProvidedBy: []string{createImageTaskName}, //Yet to be implemented


could we add a TODO(some issue # here), e.g.:

// TODO(#148) add support for xyz

Do you mean you are going to implement this as part of this same PR?

bobcatfish · 2018-11-27T18:35:51Z

test/crd_checks.go

@@ -30,7 +30,7 @@ import (

 const (
 	interval = 1 * time.Second
-	timeout  = 5 * time.Minute
+	timeout  = 10 * time.Minute // TODO: timeout should be configurable via go test timeout


go test timeout is the timeout for all tests being run, while this timeout is for individual calls to PollImmediate, I don't think you can use go test timeout for this

which call to PollImmediate needs 10 minutes? That seems extremely long

if it's unavoidable to have a 10 minute timeout, I suggest that we change the interface to these functions to take a timeout parameter so we can use the timeout that is needed each time we call it (i.e. certainly most calls to Poll don't need 10 min)

also, if we DO need to up one of these timeouts to 10 min, we should update the test README (https://github.com/knative/build-pipeline/blob/master/test/README.md) to indicate that the go test default timeout won't cut it anymore

The new PipelineRun e2e test includes 4 tasks and takes up to 6 minutes (on my GKE cluster). I wanted to be on the safe side, so I increased the timeout to 10 minutes, but that is very specific to this e2e test. So I will probably just update the timeout for that test instead of the timeout for all CRD checks.

just bumping this before we merge

oh wait my bad, this is an outdated comment. ignore me!

bobcatfish · 2018-11-27T18:38:31Z

test/pipelinerun_test.go

+		}, &v1alpha1.Task{
+			ObjectMeta: metav1.ObjectMeta{
+				Namespace: namespace,
+				Name:      "check-create-files-exists",


hahaha i feel like ive seen this exact example in the concourse docs (as we slowly re-create concourse XD)

bobcatfish · 2018-11-27T18:42:44Z

test/pipelinerun_test.go

+					Name:    "read-something-else",
+					Image:   "ubuntu",
+					Command: []string{"cat"},
+					Args:    []string{"/workspace/readingspace/something-else"},


im a bit confused by what each task and each step in each task is doing - maybe we could use the names of the steps to make it a bit more clear, e.g.:

In the first task, "write-something" could become "write-task-1"

In the next two tasks that run in parallel, "read-stuff" could be something like "read-data-from-task-1", and "write-something-else" could be "write-data-from-task-2-1" or something

what do you think if we had the tasks themselves assert? i.e. have the step fail if the expected text doesnt exist, e.g. using grep

> what do you think if we had the tasks themselves assert? i.e. have the step fail if the expected text doesnt exist, e.g. using grep

I like the idea. I will update the tasks to assert contents.

> im a bit confused by what each task and each step in each task is doing - maybe we could use the names of the steps to make it a bit more clear

I named the tasks with "stuff" and "something" because the tasks are actually writing those same words. I could update them to the pattern you mentioned.

bobcatfish · 2018-11-27T18:45:57Z

pkg/reconciler/v1alpha1/pipelinerun/pipelinerun.go

-		})
-	}
+
+	resources.WrapSteps(&tr.Spec, pr.Spec.PipelineTaskResources, pt)


i really like breaking this logic into a separate function!! ❤️

bobcatfish · 2018-11-27T18:56:07Z

pkg/apis/pipeline/v1alpha1/taskrun_types.go

+	// +optional
+	PreBuiltSteps []TaskBuildStep `json:"preBuildSteps,omitempty"`
+	// +optional
+	PostBuiltSteps []TaskBuildStep `json:"postBuiltSteps,omitempty"`


what does Built mean in this context? I think this is intended to run after all the steps run, is that right? i think since this CRD isn't called Build anymore, using Built is a bit confusing, some other ideas:

PreSteps

BeforeSteps

InitializationSteps

SetupSteps

I leaned toward Built because thats the end object we are creating now. I like the name BeforeSteps for pre build steps, for post build steps we could go with AfterSteps.

bobcatfish · 2018-11-27T18:56:23Z

pkg/apis/pipeline/v1alpha1/taskrun_types.go

+}
+
+// TaskBuildStep contains information to construct build pre and post steps
+type TaskBuildStep struct {


again im not sure about calling this Build (maybe Im missing something tho)

How about TaskStep? I agree with moving away from associating with Build keyword

bobcatfish · 2018-11-27T18:57:30Z

pkg/apis/pipeline/v1alpha1/taskrun_types.go

+	// +optional
+	PVCName string `json:"pvcName,omitempty"`
+	// +optional
+	PreBuiltSteps []TaskBuildStep `json:"preBuildSteps,omitempty"`


At the moment it would look to a user of TaskRun as if these are fields we expect them to provide - do we? What would it look like if a user provided these fields? Can we document it?

If not, can we somehow make it obvious we don't want users to provide them?

Do we have to have these steps in the definition of the taskRun type as a field, can we not append them in the appropriate order while creating the taskRun

> If not, can we somehow make it obvious we don't want users to provide them?

How about adding validation in TaskRun: if the owner reference is of Kind PipelineRun, then pre and post build steps are allowed; otherwise return an error.
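To make the owner-reference proposal above concrete, here is a rough sketch. The `ownerRef` and `taskRun` types and the `validateGeneratedSteps` function are simplified, hypothetical stand-ins rather than the real API types, and this is only one way the check could be written.

```go
package main

import (
	"errors"
	"fmt"
)

// ownerRef and taskRun are simplified stand-ins for illustration only.
type ownerRef struct {
	Kind string
}

type taskRun struct {
	Owners    []ownerRef
	PreSteps  []string
	PostSteps []string
}

// validateGeneratedSteps rejects pre/post steps unless the TaskRun is
// owned by a PipelineRun, which is the rule proposed in the comment.
func validateGeneratedSteps(tr taskRun) error {
	if len(tr.PreSteps) == 0 && len(tr.PostSteps) == 0 {
		return nil // nothing to validate
	}
	for _, o := range tr.Owners {
		if o.Kind == "PipelineRun" {
			return nil
		}
	}
	return errors.New("pre/post steps may only be set on a TaskRun owned by a PipelineRun")
}

func main() {
	standalone := taskRun{PreSteps: []string{"copy-inputs"}}
	owned := taskRun{
		Owners:   []ownerRef{{Kind: "PipelineRun"}},
		PreSteps: []string{"copy-inputs"},
	}
	fmt.Println("standalone:", validateGeneratedSteps(standalone))
	fmt.Println("owned:", validateGeneratedSteps(owned))
}
```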

If we don't allow users to provide those fields, they shouldn't be in the spec (should they?), but I guess they are required for pipeline/pipelinerun to populate them… so… yeah, most likely a validation error could work there.

Is there a use case for a user to provide those in addition to normal steps?

> Is there a use case for a user to provide those in addition to normal steps?

@vdemeester: @shashwathi and i discussed, and we were thinking that maybe a use case where a user wants to debug tasks that are part of a pipeline in isolation? it might be a bit of a stretch tho!

i think adding paths to the interface instead of the builtSteps makes it a bit clearer hopefully!

bobcatfish · 2018-11-27T18:57:59Z

pkg/apis/pipeline/v1alpha1/taskrun_validation_test.go

+	tests := []struct {
+		name    string
+		steps   TaskBuildStep
+		wantErr *apis.FieldError


looks like we don't need wantErr, we have no error cases here (and id prefer those in a separate test anyway!)

bobcatfish · 2018-11-27T18:58:42Z

pkg/apis/pipeline/v1alpha1/pipelinerun_types_test.go

@@ -83,3 +83,24 @@ func TestInitializeConditions(t *testing.T) {
 		t.Fatalf("PipelineRun status getting reset")
 	}
 }
+
+func Test_GetPVC(t *testing.T) {


nader-ziada

Thanks for the PR @shashwathi

It's a little bit hard to follow, so I'm wondering if you can update the commit message and add some more comments

nader-ziada · 2018-11-27T18:31:17Z

examples/pipelines/kritis-resources.yaml

+    value: admin
+
+---


why do you need a new resource that has the same properties as existing resource?

nader-ziada · 2018-11-27T18:35:21Z

pkg/apis/pipeline/v1alpha1/resource_types.go

-	Type PipelineResourceType `json:"type"`
+	Name       string               `json:"name"`
+	Type       PipelineResourceType `json:"type"`
+	TargetPath string               `json:"targetPath"`


should be optional since it's not applicable to all types of resources
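As an illustration of what "optional" usually means for a Go API field, the sketch below adds `omitempty` to the JSON tag so an unset `targetPath` is left out of the serialized object. The `resource` struct is a simplified stand-in, not the actual type in resource_types.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resource is a simplified stand-in for the struct in the diff above.
type resource struct {
	Name string `json:"name"`
	Type string `json:"type"`
	// +optional
	TargetPath string `json:"targetPath,omitempty"`
}

// marshal serializes r and panics on error to keep the sketch short.
func marshal(r resource) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Without a target path the key is omitted entirely.
	fmt.Println(marshal(resource{Name: "workspace", Type: "git"}))
	// With a target path the key is emitted as usual.
	fmt.Println(marshal(resource{Name: "workspace", Type: "git", TargetPath: "src"}))
}
```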

nader-ziada · 2018-11-27T18:35:53Z

pkg/apis/pipeline/v1alpha1/resource_types.go

@@ -118,6 +119,8 @@ type PipelineResource struct {
 type TaskRunResource struct {
 	Name        string              `json:"name"`
 	ResourceRef PipelineResourceRef `json:"resourceRef"`
+	// +optional
+	SourcePaths []string `json:"sourcePaths"`


Can you add a comment explaining what this new field is for?

I do not think I am using that. I will go ahead and delete that field

nader-ziada · 2018-11-27T18:45:19Z

pkg/reconciler/v1alpha1/taskrun/resources/pre_post_build_step.go

+				}
+				newSteps = append(newSteps, []corev1.Container{{
+					Name:         fmt.Sprintf("source-mkdir-%s", source.Name),
+					Image:        "busybox",


I would prefer not to use busybox and instead use some base image controlled by the project

Can I follow up with another PR to fix image?

nader-ziada · 2018-11-27T19:09:21Z

pkg/apis/pipeline/v1alpha1/taskrun_types.go

+	// +optional
+	PVCName string `json:"pvcName,omitempty"`
+	// +optional
+	PreBuiltSteps []TaskBuildStep `json:"preBuildSteps,omitempty"`


Do we have to have these steps in the definition of the taskRun type as a field, can we not append them in the appropriate order while creating the taskRun

nader-ziada · 2018-11-27T19:11:53Z

test/helm_task_test.go

-					Name:       "workspace",
-					ProvidedBy: []string{createImageTaskName},
+					Name: "workspace",
+					//	ProvidedBy: []string{createImageTaskName}, //Yet to be implemented


Do you mean you are going to implement this as part of this same PR?

vdemeester

Looks good, most of my questions were already covered by @bobcatfish and @pivotal-nader-ziada 😉

vdemeester · 2018-11-28T11:28:36Z

pkg/apis/pipeline/v1alpha1/taskrun_types.go

+	// +optional
+	PVCName string `json:"pvcName,omitempty"`
+	// +optional
+	PreBuiltSteps []TaskBuildStep `json:"preBuildSteps,omitempty"`


If we don't allow users to provide those fields, they shouldn't be in the spec (should they?), but I guess they are required for pipeline/pipelinerun to populate them… so… yeah, most likely a validation error could work there.

Is there a use case for a user to provide those in addition to normal steps?

shashwathi · 2018-11-28T16:57:07Z

@pivotal-nader-ziada @bobcatfish @vdemeester

I updated the design in the most recent change. Please take a look and leave comments. Thanks to @bobcatfish for brainstorming on Slack.

- Updated the e2e test to assert contents.
- No more use of pre and post steps. taskrun.resource.paths is now used to pass this information to the TaskRun.
- Updated docs.
- Updated the commit message to include the issue number and more details.

bobcatfish · 2018-11-29T02:28:00Z

docs/Concepts.md

@@ -37,6 +37,7 @@ A Task is a collection of sequential steps you would want to run as part of your
 A task will run inside a container on your cluster. A Task declares:

 1. Inputs the task needs.
+Task input resource can provide `targetPath` to initialize resource in specific directory. Resource will be placed under `/workspace/targetPath`. If `targetPath` is not specified then resource will be initialized under `/workspace`.


the PR is looking great and I really like the interface change!

I find this part of the docs a bit hard to follow - maybe it would be more clear if it was in a different section, such as Resources or TaskRun, and maybe had an example?

I'm not 100% sure how to improve it tho, feel free to merge this and we can iterate on it.
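For readers following the docs line in the diff above, here is a small sketch of the path rule it describes: a resource lands under `/workspace/<targetPath>`, or directly under `/workspace` when no `targetPath` is given. The `resourceDir` function and the `workspaceDir` constant are hypothetical names; the real controller code may compute this differently.

```go
package main

import (
	"fmt"
	"path"
)

// workspaceDir is the base directory named in the docs change above.
const workspaceDir = "/workspace"

// resourceDir returns the directory a resource would be initialized in.
// path.Join ignores empty elements, so an empty targetPath yields
// workspaceDir itself.
func resourceDir(targetPath string) string {
	return path.Join(workspaceDir, targetPath)
}

func main() {
	fmt.Println(resourceDir(""))         // no targetPath specified
	fmt.Println(resourceDir("src/code")) // nested targetPath
}
```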

bobcatfish · 2018-11-29T02:29:41Z

test/crd_checks.go

@@ -73,12 +73,12 @@ func WaitForPodState(c *clients, name string, namespace string, inState func(r *
 // interval until inState returns `true` indicating it is done, returns an
 // error or timeout. desc will be used to name the metric that is emitted to
 // track how long it took for name to get into the state checked by inState.
-func WaitForPipelineRunState(c *clients, name string, inState func(r *v1alpha1.PipelineRun) (bool, error), desc string) error {
+func WaitForPipelineRunState(c *clients, name string, polltimeout time.Duration, inState func(r *v1alpha1.PipelineRun) (bool, error), desc string) error {


bobcatfish · 2018-11-29T02:30:03Z

test/pipelinerun_test.go

+					Name:    "read-from-task-0",
+					Image:   "ubuntu",
+					Command: []string{"bash"},
+					Args:    []string{"-c", "[[ stuff == $(cat /workspace/stuff) ]]"},


niiiiiice :D

bobcatfish · 2018-11-29T02:32:59Z

/lgtm

In case you want to make some tweaks to the docs - altho it's unlikely a user would want to use the paths directly, I think it would be nice to try to show how they could do that:

/hold

And also I'm still wondering about fan-out and fan-in, seems like we can end up with tasks running in parallel changing the same data?

shashwathi · 2018-11-29T03:09:56Z

> In case you want to make some tweaks to the docs - altho it's unlikely a user would want to use the paths directly, I think it would be nice to try to show how they could do that:

I will take another shot at docs tomorrow. 👍

> And also I'm still wondering about fan-out and fan-in, seems like we can end up with tasks running in parallel changing the same data?

Yes, but each task has its own copy of the resources, so even if tasks do modify them, I do not see why fan-out would be a problem. I agree that fan-in of resources could lead to undesired side effects.

I can see a use case where this feature is useful like using same git resource across whole pipeline but user has to be aware of the changes in each tasks which does not seem too unreasonable I think.

shashwathi · 2018-11-29T17:05:39Z

@bobcatfish : I have updated the docs to include examples. Please review that and let me know if you have any feedback.

- Implementation details: PipelineRun creates a PVC for the lifetime of the object and uses that PVC as scratch space to transfer git resources between tasks. This information is passed to the TaskRun via resource paths. Paths are an array of strings. In the case of inputs, these paths are treated as the new source of the pipeline resource. In the case of outputs, the paths are treated as the new destination directory.
- Update docs to include examples of paths.

Partially fixes tektoncd#148

knative-metrics-robot · 2018-11-29T17:24:55Z

The following is the coverage report on pkg/.
Say /test pull-knative-build-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/apis/pipeline/v1alpha1/taskrun_types.go | 0.0% | 55.6% | 55.6 |
| pkg/reconciler/v1alpha1/pipelinerun/pipelinerun.go | 82.7% | 77.2% | -5.5 |
| pkg/reconciler/v1alpha1/pipelinerun/resources/input_output_steps.go | Do not exist | 95.5% | |
| pkg/reconciler/v1alpha1/taskrun/resources/pre_post_build_step.go | Do not exist | 100.0% | |
| pkg/reconciler/v1alpha1/taskrun/taskrun.go | 73.2% | 74.1% | 0.9 |

bobcatfish · 2018-11-29T17:30:08Z

docs/Concepts.md

@@ -169,6 +199,66 @@ Creating a `TaskRun` will invoke a [Task](#task), running all of the steps until
 completion or failure. Creating a `TaskRun` will require satisfying all of the input
 requirements of the `Task`.

+`TaskRun` definition includes `inputs`, `outputs` for `Task` referred in spec.
+
+Input resource includes name and reference to pipeline resource and optionally `paths`. `paths` will be used by `TaskRun` as the resource's new source paths i.e., copy the resource from specified list of paths. `TaskRun` expects the folder and contents to be already present in specified paths. `paths` feature could be used to provide extra files or altered version of existing resource before execution of steps.


ah great, I get it now! thanks for the example above and the extra detail in the docs :D
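A rough sketch of the input `paths` behavior described in the docs change above: when `paths` is set, the resource is copied from those locations instead of being fetched from its original source. The `inputResource` type, the `sourcesFor` function, and the example PVC path are all hypothetical and only illustrate the selection rule.

```go
package main

import "fmt"

// inputResource is a simplified, hypothetical stand-in for a TaskRun input.
type inputResource struct {
	Name  string
	Paths []string
}

// sourcesFor returns where the resource content should come from.
// If paths are provided they replace the original source, as the docs
// text describes; otherwise the original source is used.
func sourcesFor(r inputResource, origin string) []string {
	if len(r.Paths) > 0 {
		return r.Paths
	}
	return []string{origin}
}

func main() {
	fresh := inputResource{Name: "workspace"}
	// The PVC path below is made up for illustration.
	linked := inputResource{Name: "workspace", Paths: []string{"/pvc/task-1/workspace"}}

	fmt.Println(sourcesFor(fresh, "git"))
	fmt.Println(sourcesFor(linked, "git"))
}
```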

bobcatfish · 2018-11-29T17:30:27Z

/lgtm
/meow space

knative-prow-robot · 2018-11-29T17:30:28Z

@bobcatfish:

In response to this:

/lgtm
/meow space

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

shashwathi · 2018-11-29T17:36:26Z

@bobcatfish : I need /approve label too.

bobcatfish · 2018-11-29T17:40:10Z

whoops, just took it for granted that @shashwathi had approver rights already, she's one of the top contributors!!

/approve

knative-prow-robot · 2018-11-29T17:40:18Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bobcatfish, shashwathi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [bobcatfish]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

bobcatfish · 2018-11-29T17:41:02Z

/hold cancel

knative-prow-robot assigned bobcatfish, imjasonh and nader-ziada Nov 27, 2018

knative-prow-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 27, 2018

knative-prow-robot requested review from aaron-prindle and imjasonh November 27, 2018 17:51

shashwathi mentioned this pull request Nov 27, 2018

[WIP] Link git resources from previous task #255

Closed

shashwathi force-pushed the output-with-tests branch from c744923 to 33d6996 Compare November 27, 2018 18:50

bobcatfish reviewed Nov 27, 2018

nader-ziada reviewed Nov 27, 2018

vdemeester reviewed Nov 28, 2018

shashwathi force-pushed the output-with-tests branch 3 times, most recently from 6ed542e to 7381dab Compare November 28, 2018 16:46

bobcatfish mentioned this pull request Nov 29, 2018

Run taskRun without a Task by adding TaskSpec #262

Merged

bobcatfish reviewed Nov 29, 2018

knative-prow-robot added lgtm Indicates that a PR is ready to be merged. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Nov 29, 2018

shashwathi force-pushed the output-with-tests branch from 7381dab to 71e86e4 Compare November 29, 2018 16:07

knative-prow-robot removed the lgtm Indicates that a PR is ready to be merged. label Nov 29, 2018

shashwathi force-pushed the output-with-tests branch from 71e86e4 to 95a17ab Compare November 29, 2018 17:23

bobcatfish reviewed Nov 29, 2018

knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 29, 2018

knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 29, 2018

knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 29, 2018

knative-prow-robot merged commit c747227 into tektoncd:master Nov 29, 2018

bobcatfish mentioned this pull request Nov 30, 2018

Design Output handling #124

Closed

bobcatfish mentioned this pull request Jan 9, 2019

TestPipelineRun/fan-in_and_fan-out is broken #375

Closed

mchmarny unassigned imjasonh and nader-ziada Mar 7, 2019
