Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

call Job.validate when running tests under JobTest #1441

Merged
merged 1 commit into from
Sep 23, 2015

Conversation

tdyas
Copy link
Contributor

@tdyas tdyas commented Sep 22, 2015

This will help users (e.g., me) who actually override Job.validate in their Job's and expect to be able to write a unit test for that override. Plus this makes JobTest more closely match the behavior seen when actually running a Job.

This will help users (e.g., me) who actually override Job.validate in their Job's and
expect to be able to write a unit test for that override. Plus this makes JobTest
more closely match the behavior seen when actually running a Job.
@ianoc
Copy link
Collaborator

ianoc commented Sep 23, 2015

Seems reasonable/good to me. Thanks!

ianoc added a commit that referenced this pull request Sep 23, 2015
call Job.validate when running tests under JobTest
@ianoc ianoc merged commit e24bb67 into twitter:develop Sep 23, 2015
@johnynek
Copy link
Collaborator

I'm a little nervous about this. I'd rather an option to opt in to calling validate on JobTest. We've broken people before with innocuous looking default changes.

Adding something like .validate to JobTest which would turn this on would be safer.

@ianoc
Copy link
Collaborator

ianoc commented Sep 24, 2015

So this did come up as you thought it might @johnynek but basically only in 4 tests at twitter, i'm inclined to fix those. Maybe default opt in if its mostly safe? I don't mind making it opt-in either. Just that if its pretty much working like this a little bit more testing couldn't hurt

@tdyas
Copy link
Contributor Author

tdyas commented Sep 24, 2015

I'd be fine with making this opt-in.

In case it helps, the motivation for this change is that for jobs at my company that generate "HFile" key/value stores, I was adding validation logic to make sure callers did not forget to use a helper which ensures the keys are ordered as expected by the HFile format. validate in our FSJob class seemed like the best place to put that logic instead of putting it in run. I of course want to write a unit test for that logic (which requires validate be called).

Feel free to revert the change for now until I can put together an opt-in version.

@johnynek
Copy link
Collaborator

@ianoc good that it was only 4 jobs.

Good to fix those anyway.

So, what do we hope to test with validate in TestMode? Generally, validate is just checking that all the data needed for the flow is not obviously bad or missing.

It might make sense to make it opt-in since that clearly can't break anyone, though it was good to detect this.

By the way, is this different in Execution. It looks like we always call validateSources:
https://github.com/twitter/scalding/blob/develop/scalding-core/src/main/scala/com/twitter/scalding/Execution.scala#L466

So, I guess no one is using VersionedKeyValSource with Execution and tests (not even sure how the mocking can be done in an Execution with the whole TestMode, I guess it can't currently).

@tdyas
Copy link
Contributor Author

tdyas commented Sep 24, 2015

Opt-in version: #1444

Will let you all decide what version you prefer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants