Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add max-txn-size flag to gazctl #161

Merged
merged 2 commits into from
Feb 25, 2019
Merged

Conversation

jskelcy
Copy link

@jskelcy jskelcy commented Feb 21, 2019

This change is Reviewable

Connects #154
Connects #159

Please review commit by commit.
First commit:
There is an outstanding ticket (#154) to create a loopback server for testing against a consumer. This implements a basic loopback server in the model of the broker loopback server. This is a first pass implementation and sought to solve the problems introduced in writing tests for the second commit. As this testing tool is used more I expect it will be expanded upon.

Second commit:
Add a max transaction size flag to all of the apply/edit commands in gazctl.

@ghost ghost assigned jskelcy Feb 21, 2019
@ghost ghost added the in progress label Feb 21, 2019
@jskelcy jskelcy force-pushed the js/max_txn_size branch 5 times, most recently from ae1aede to 34e19ed Compare February 21, 2019 18:19
@jskelcy
Copy link
Author

jskelcy commented Feb 21, 2019


v2/cmd/gazctl/journals_apply.go, line 37 at r2 (raw file):

	var ctx = context.Background()
	resp, err := client.ApplyJournalsLimit(ctx, journalsCfg.Broker.JournalClient(ctx), req, cmd.MaxTxnSize)

@rupertchen Looking for some feedback here. I have made it so that within the ApplyFn we panic. It seemed like the idea was if you return an error from this function that idea is that the user should then be able to "clean up" the changes in the temp file, and in this case that is not going to be possible. Rather we want to panic with some error information and the user can try again from starting from scratch.

If that seems wrong to you let me know.

@jskelcy
Copy link
Author

jskelcy commented Feb 21, 2019


v2/pkg/client/list.go, line 109 at r2 (raw file):

// ShardClient.Apply call. Response validation or !OK status from Apply RPC are
// mapped to error. In the event of an error any unapplied changes will be
// available on |parentReq|.

I made a decision here that might be a little divisive and I would like to hear what people think. This function (and ApplyShardsLimit) have side effects wherein the parentReq is mutated as a result of this function. The idea being if there is an error of some kind of you can see what changes were not applied. I could see this helpful however I also acknowledge that functions with side effects can be confusing.

@jskelcy jskelcy changed the title Add max-txn-size flag to gazctl (not ready for review) Add max-txn-size flag to gazctl Feb 21, 2019
Copy link
Contributor

@rupertchen rupertchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review. Finished r1 and halfway through r2.

Reviewed 1 of 1 files at r1, 6 of 9 files at r2.
Reviewable status: 7 of 10 files reviewed, 4 unresolved discussions (waiting on @jskelcy and @rupertchen)


v2/cmd/gazctl/journals_apply.go, line 37 at r2 (raw file):

Previously, jskelcy (Jake Skelcy) wrote…

@rupertchen Looking for some feedback here. I have made it so that within the ApplyFn we panic. It seemed like the idea was if you return an error from this function that idea is that the user should then be able to "clean up" the changes in the temp file, and in this case that is not going to be possible. Rather we want to panic with some error information and the user can try again from starting from scratch.

If that seems wrong to you let me know.

You've got the right idea.

Aborting the edit makes sense to me. While we could place the user back in the editor, they should probably be looking at what the current state of the specs is. They'll be given the path to the YAML for the edit, so it'd be easy for them to re-run the exact same thing.


v2/cmd/gazctl/journals_apply.go, line 68 at r2 (raw file):

func tooManyOpsPanic() {
	log.Panic(`

If it is convenient to get numbers such as how many ops were sent and what the limit was, it'd be nice to include them in the log.


v2/pkg/client/list.go, line 109 at r2 (raw file):

Previously, jskelcy (Jake Skelcy) wrote…

I made a decision here that might be a little divisive and I would like to hear what people think. This function (and ApplyShardsLimit) have side effects wherein the parentReq is mutated as a result of this function. The idea being if there is an error of some kind of you can see what changes were not applied. I could see this helpful however I also acknowledge that functions with side effects can be confusing.

The side effect does make me wince a little. I took a crack at a different approach where I pass the slice of changes that weren't applied back in the error.

  • The original error is embedded into the new struct so it should be mostly usable as a drop-in replacement.
  • The identity check for a sentinel error is replaced by a check for support behavior that I find reads better.

jskelcy/gazette@js/max_txn_size...rupertchen:feedback/max_txn_size


v2/pkg/client/list.go, line 136 at r2 (raw file):

		var resp *pb.ApplyResponse
		var err error
		if resp, err = jc.Apply(pb.WithDispatchDefault(ctx), req, grpc.FailFast(false)); err != nil {

FailFast is deprecated, can we use WaitForReady (the opposite) instead?

Copy link
Contributor

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 7 of 10 files reviewed, 11 unresolved discussions (waiting on @jskelcy and @rupertchen)


v2/cmd/gazctl/journals_apply.go, line 68 at r2 (raw file):

Previously, rupertchen (Rupert Chen) wrote…

If it is convenient to get numbers such as how many ops were sent and what the limit was, it'd be nice to include them in the log.

the number of etcd cmps / ops is 1:1 with the number of specs in the yaml.

etcd doesn't provide a way to introspect the limit, that I'm aware of. the default is currently 128.


v2/cmd/gazctl/main.go, line 45 at r2 (raw file):

	SpecsPath  string `long:"specs" description:"Path to specifications file to apply. Stdin is used if not set"`
	DryRun     bool   `long:"dry-run" description:"Perform a dry-run of the apply"`
	MaxTxnSize int    `long:"max-txn-size" short:"m" default:"0" description:"Maxium number of transactions to be processed within a single batch"`

"maximum number of specs to be processed within an apply transaction. If 0, the default, all changes are issued in a single transaction".


v2/pkg/client/list.go, line 109 at r2 (raw file):

Previously, rupertchen (Rupert Chen) wrote…

The side effect does make me wince a little. I took a crack at a different approach where I pass the slice of changes that weren't applied back in the error.

  • The original error is embedded into the new struct so it should be mostly usable as a drop-in replacement.
  • The identity check for a sentinel error is replaced by a check for support behavior that I find reads better.

jskelcy/gazette@js/max_txn_size...rupertchen:feedback/max_txn_size

my gut is YAGNI. we shouldn't mutate the request, but can come back later and add this context to the returned error without breaking the API contract if it proves important to have.


v2/pkg/client/list.go, line 110 at r2 (raw file):
nits:

  1. the repo doesn't break lines in this way. Quoting "CodeReviewComments":

If you find that this produces lines that are too long, then change the names or the semantics and you'll probably get a good result.

eg parentReq => req, and maxTxnSize => limit.

(You can use scoped curReq / curResp or something similar within your loop.)

Recall that argument names are part of the documentation contract. As a reader, I have no idea what "parentReq" is (it's just my request?).

  1. I know I threw in ApplyJournalsLimit, but wondering if ApplyJournalsInBatches is a better name.

v2/pkg/client/list.go, line 129 at r2 (raw file):

		}

		var req = &pb.ApplyRequest{}

how about just maintaining an index into Changes, and slicing it on-demand with limit to craft the current iteration's request? no need to allocate / copy-loop.


v2/pkg/client/list.go, line 134 at r2 (raw file):

		}

		var resp *pb.ApplyResponse

nit:

var resp, err = jc.Apply(...)
if err != nil {

or

if resp, err := jc.Apply(...); err != nil {
} else if no-more-changes-remain {
   return resp, err
}

^ kind of nice cause it matches the pattern of returning resp in the other cases, and gets rid of the finalResp variable.


v2/pkg/consumer/consumer_stub.go, line 1 at r2 (raw file):

package consumer

this should be in a file suffixed with _test.go, or it'll otherwise be compiled into the release package.


v2/pkg/consumer/consumer_stub.go, line 51 at r2 (raw file):

}

// NewShardServerStub returns a ShardServerStub instance served by a local GRPC server.

nit:shardServerStub / newShardServerStub ?


v2/pkg/consumer/shard_api.go, line 207 at r2 (raw file):

// mapped to error. In the event of an error any unapplied changes will be
// available on |parentReq|.
func ApplyShardsLimit(

same comments as with ApplyJournalsLimit.

Copy link
Contributor

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 7 of 10 files reviewed, 11 unresolved discussions (waiting on @jskelcy and @rupertchen)


v2/pkg/client/list.go, line 134 at r2 (raw file):

		}

		var resp *pb.ApplyResponse

nit: how about

if resp, err := jc.Apply(...); err != nil {
} else if no-more-changes-remain {
   return resp, err
}

^ kind of nice cause it matches the pattern of returning resp in the other cases, and gets rid of the finalResp variable.

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019


v2/pkg/client/list.go, line 110 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

nits:

  1. the repo doesn't break lines in this way. Quoting "CodeReviewComments":

If you find that this produces lines that are too long, then change the names or the semantics and you'll probably get a good result.

eg parentReq => req, and maxTxnSize => limit.

(You can use scoped curReq / curResp or something similar within your loop.)

Recall that argument names are part of the documentation contract. As a reader, I have no idea what "parentReq" is (it's just my request?).

  1. I know I threw in ApplyJournalsLimit, but wondering if ApplyJournalsInBatches is a better name.

I think ApplyJournalsLimit works well. In InBatches feels a little cumbersome.

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019


v2/pkg/client/list.go, line 136 at r2 (raw file):

Previously, rupertchen (Rupert Chen) wrote…

FailFast is deprecated, can we use WaitForReady (the opposite) instead?

I believe we are on an older version of grpc which does not support WaitForReady. I will take a look at this.

Copy link
Contributor

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 7 of 10 files reviewed, 11 unresolved discussions (waiting on @jskelcy and @rupertchen)


v2/pkg/client/list.go, line 110 at r2 (raw file):

Previously, jskelcy (Jake Skelcy) wrote…

I think ApplyJournalsLimit works well. In InBatches feels a little cumbersome.

the trouble is, as a reader I don't know at a glance what Limit is or means. Does it only apply the first N changes? Does it return an error if there are more than N? I think the obvious parallel is io.LimitedReader, which returns EOF on reaching N -- and this seems to imply that not all specs are actually applied.

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019


v2/pkg/client/list.go, line 109 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

my gut is YAGNI. we shouldn't mutate the request, but can come back later and add this context to the returned error without breaking the API contract if it proves important to have.

Ok, I think given that YAGNI is a good philosophy I think we can punt on the surfacing unapplied changes until we need it. But there are a few nice things that I will borrow from your suggestion Rupert.

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019


v2/pkg/client/list.go, line 110 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

the trouble is, as a reader I don't know at a glance what Limit is or means. Does it only apply the first N changes? Does it return an error if there are more than N? I think the obvious parallel is io.LimitedReader, which returns EOF on reaching N -- and this seems to imply that not all specs are actually applied.

got it. The LimitedReader is a good parallel.

Copy link
Author

@jskelcy jskelcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 10 files reviewed, 7 unresolved discussions (waiting on @jgraettinger, @jskelcy, and @rupertchen)


v2/cmd/gazctl/main.go, line 45 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

"maximum number of specs to be processed within an apply transaction. If 0, the default, all changes are issued in a single transaction".

Done.


v2/pkg/client/list.go, line 110 at r2 (raw file):

Previously, jskelcy (Jake Skelcy) wrote…

got it. The LimitedReader is a good parallel.

Done.


v2/pkg/client/list.go, line 129 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

how about just maintaining an index into Changes, and slicing it on-demand with limit to craft the current iteration's request? no need to allocate / copy-loop.

Done.


v2/pkg/consumer/shard_api.go, line 207 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

same comments as with ApplyJournalsLimit.

Done.


v2/pkg/consumer/consumer_stub.go, line 1 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

this should be in a file suffixed with _test.go, or it'll otherwise be compiled into the release package.

Done.

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019


v2/cmd/gazctl/journals_apply.go, line 68 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

the number of etcd cmps / ops is 1:1 with the number of specs in the yaml.

etcd doesn't provide a way to introspect the limit, that I'm aware of. the default is currently 128.

I could see it potentially being difficult to figure out exactly how many specs are described in a yaml, so I can include the total number of changes as well as the default etcd config.

Copy link
Author

@jskelcy jskelcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 10 files reviewed, 7 unresolved discussions (waiting on @jgraettinger, @jskelcy, and @rupertchen)


v2/cmd/gazctl/journals_apply.go, line 68 at r2 (raw file):

Previously, jskelcy (Jake Skelcy) wrote…

I could see it potentially being difficult to figure out exactly how many specs are described in a yaml, so I can include the total number of changes as well as the default etcd config.

Done.

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019


v2/pkg/client/list.go, line 136 at r2 (raw file):

Previously, jskelcy (Jake Skelcy) wrote…

I believe we are on an older version of grpc which does not support WaitForReady. I will take a look at this.

It was a local env issue. Ill fix this up.

Copy link
Author

@jskelcy jskelcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 10 files reviewed, 7 unresolved discussions (waiting on @jgraettinger, @jskelcy, and @rupertchen)


v2/pkg/client/list.go, line 136 at r2 (raw file):

Previously, jskelcy (Jake Skelcy) wrote…

It was a local env issue. Ill fix this up.

Done.

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019

a discussion (no related file):
@rupertchen @jgraettinger addressed all of your comments PTAL.


Copy link
Contributor

@rupertchen rupertchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: A nit from me about docs, but I'm good with this.

Reviewed 10 of 10 files at r3, 9 of 9 files at r4.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @jgraettinger)


v2/cmd/gazctl/journals_apply.go, line 75 at r4 (raw file):

of transactionality and should be used with caution. Instead it is recomended 
that additional label selectors are used to limit the number of changes within
this operation.

Nit.

I think the warning to use --max-txn-size by with caution would be more valuable as part of the option's long description. If placed there, this panic message could be shortened to redirect readers to the help doc.

Copy link
Contributor

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple of comments below. I don't love the use of panic here, and think this is an opportunity to improve the editor API contract. And I'm pretty sure the ErrGRPCTooManyOps check doesn't work as advertised?

other stuff is small nits.

Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @jskelcy)


v2/cmd/gazctl/journals_apply.go, line 38 at r4 (raw file):

	var ctx = context.Background()
	resp, err := client.ApplyJournalsInBatches(ctx, journalsCfg.Broker.JournalClient(ctx), req, cmd.MaxTxnSize)
	if err == rpctypes.ErrGRPCTooManyOps {

are you sure this actually works? cause i'm scratching my head on how it would. this will come through as a grpc error, and would need to be unwrapped & mapped. How have you tested?


v2/cmd/gazctl/journals_edit.go, line 62 at r4 (raw file):

	var resp *pb.ApplyResponse
	var err error
	resp, err = client.ApplyJournalsInBatches(ctx, journalsCfg.Broker.JournalClient(ctx), req, cmd.MaxTxnSize)

nit: var resp, err = rather than extra declaration lines above

dave cheney recently recommend := if you're initializing a value, and var iff it's zero-valued. I'm down with that, if we want to switch, but we should be consistent.


v2/cmd/gazctl/shards_apply.go, line 34 at r4 (raw file):

	var ctx = context.Background()
	var resp *consumer.ApplyResponse

nit: same here.


v2/cmd/gazctl/shards_apply.go, line 37 at r4 (raw file):

	var err error
	resp, err = consumer.ApplyShardsInBatches(ctx, shardsCfg.Consumer.ShardClient(ctx), req, cmd.MaxTxnSize)
	if err == rpctypes.ErrTooManyOps {

we should't be panic-ing here. we're not in the editor loop. (and even there, panic-ing is gross. what else can we come up with? what about an editor.ApplyError interface { error ; ShouldAbort() bool } type which applyFn's return instead of an error ?


v2/cmd/gazctl/shards_edit.go, line 54 at r4 (raw file):

	var ctx = context.Background()
	var resp *consumer.ApplyResponse

nit: var declarations here too


v2/pkg/client/list.go, line 102 at r4 (raw file):

}

// ApplyJournalsInBatches is a helper function for applying changes to journals which

supernit: Documenting that it's a helper function isn't helpful :) (what's a "helper" function anyway?).

Copy link
Contributor

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 10 files at r3, 9 of 9 files at r4.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @jskelcy)

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019


v2/cmd/gazctl/journals_edit.go, line 62 at r4 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

nit: var resp, err = rather than extra declaration lines above

dave cheney recently recommend := if you're initializing a value, and var iff it's zero-valued. I'm down with that, if we want to switch, but we should be consistent.

I like the suggestion from dave cheney but I think given the current pattern we should stick to this.

@jskelcy
Copy link
Author

jskelcy commented Feb 25, 2019


v2/cmd/gazctl/shards_apply.go, line 37 at r4 (raw file):

Quoted 4 lines of code…
batches of at most the max transaction size, however this means a loss
of transactionality and should be used with caution. Instead it is recomended
that additional label selectors are used to limit the number of changes within
this operation.

Is there a hard and fast rule about why we should not be panic-ing? I saw we were using mbp.must and assumed that there was not an issue with a panic.

Copy link
Contributor

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @jgraettinger and @jskelcy)


v2/cmd/gazctl/shards_apply.go, line 37 at r4 (raw file):

Previously, jskelcy (Jake Skelcy) wrote…
batches of at most the max transaction size, however this means a loss
of transactionality and should be used with caution. Instead it is recomended
that additional label selectors are used to limit the number of changes within
this operation.

Is there a hard and fast rule about why we should not be panic-ing? I saw we were using mbp.must and assumed that there was not an issue with a panic.

Fair. I was thinking that editor is a library, and therefore shouldn't be panic-ing, but the source of the panic is not within editor but rather within gazctl & friends. I'm gonna walk this comment back, chalk it up to lack of sleep :)

@jskelcy jskelcy force-pushed the js/max_txn_size branch 5 times, most recently from cfe8880 to 5297974 Compare February 25, 2019 20:47
Copy link
Author

@jskelcy jskelcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 10 files reviewed, 4 unresolved discussions (waiting on @jgraettinger and @rupertchen)


v2/cmd/gazctl/journals_apply.go, line 38 at r4 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

are you sure this actually works? cause i'm scratching my head on how it would. this will come through as a grpc error, and would need to be unwrapped & mapped. How have you tested?

Done.


v2/cmd/gazctl/shards_apply.go, line 37 at r4 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

Fair. I was thinking that editor is a library, and therefore shouldn't be panic-ing, but the source of the panic is not within editor but rather within gazctl & friends. I'm gonna walk this comment back, chalk it up to lack of sleep :)

Done.


v2/pkg/client/list.go, line 102 at r4 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

supernit: Documenting that it's a helper function isn't helpful :) (what's a "helper" function anyway?).

Done.

@jskelcy jskelcy merged commit 08aa452 into gazette:master Feb 25, 2019
@ghost ghost removed the in progress label Feb 25, 2019
@jskelcy jskelcy deleted the js/max_txn_size branch February 25, 2019 22:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants