Add withConfig api to allow running an execution with a transformed config #1489
Conversation
I think the cache key needs to include the config if we do this. Also, if we go this route, I would say we change …
I like this idea. It will help in many cases where we have to go wide on a source using splits map-side but want to write larger files to HDFS at the end of a job. Currently there is not much choice aside from using .shard
@@ -209,6 +209,9 @@ object Execution {
  override def join[T, U](t: Execution[T], u: Execution[U]): Execution[(T, U)] = t.zip(u)
}

def withConfig[T](ex: => Execution[T])(c: Config => Config): Execution[T] =
If this is lazy, the arg in the constructor needs to be too; otherwise this should not be lazy.
Looks great! (1 minor issue)
Killed the lazy; I don't think it's useful here.
Good to go now, @johnynek?
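The laziness point raised in review can be illustrated with a minimal, self-contained sketch (the names `deferred`, `forced`, and `Box` are hypothetical and not part of Scalding): a by-name (`=> A`) method parameter only defers evaluation if the place it flows into also stores it unevaluated. If the constructor (or a local `val`) is strict, the by-name argument is forced immediately and the laziness buys nothing.

```scala
// Hypothetical sketch, not Scalding code: shows when a by-name arg
// actually stays deferred.
object LazinessSketch {
  // Box stores a computation as an unevaluated thunk.
  final case class Box[A](thunk: () => A)

  var evaluated = false

  // By-name arg captured in a thunk: nothing runs until thunk() is called.
  def deferred[A](a: => A): Box[A] = Box(() => a)

  // By-name arg assigned to a strict val: it is forced right here,
  // despite the => in the signature.
  def forced[A](a: => A): Box[A] = { val v = a; Box(() => v) }

  def main(args: Array[String]): Unit = {
    evaluated = false
    val d = deferred { evaluated = true; 42 }
    println(evaluated)  // false: still deferred
    println(d.thunk())  // 42; only now is the block evaluated

    evaluated = false
    forced { evaluated = true; 42 }
    println(evaluated)  // true: forced immediately
  }
}
```

This is why "killed the lazy" is the right call: once the argument is stored strictly, keeping `=> Execution[T]` in the signature only obscures when evaluation happens.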
Add withConfig api to allow running an execution with a transformed config
Adds the ability to transform Configs (and hence args) for sub-sections of Execution flows. This can be useful for overriding Hadoop- or source-level options in subsections, giving the user more control over things like split sizes, memory used in mappers/reducers, combining small files, etc.
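The intended semantics can be sketched with a deliberately simplified model (this is not Scalding's real implementation; `Config` is modeled as a plain `Map` and `Execution[T]` as a function from config to result): `withConfig(ex)(c)` runs `ex` under the transformed config `c(outer)`, while sibling executions in the same flow still see the untouched outer config.

```scala
// Hypothetical, simplified model of the PR's withConfig combinator.
// Names mirror the PR, but Execution/Config here are toy stand-ins.
object WithConfigSketch {
  type Config = Map[String, String]

  final case class Execution[T](run: Config => T) {
    def map[U](f: T => U): Execution[U] = Execution(c => f(run(c)))
    def zip[U](that: Execution[U]): Execution[(T, U)] =
      Execution(c => (run(c), that.run(c)))
  }

  // An Execution that just observes the config it runs under.
  def getConfig: Execution[Config] = Execution(identity)

  // The combinator: ex sees c(outer); everything else sees outer unchanged.
  def withConfig[T](ex: Execution[T])(c: Config => Config): Execution[T] =
    Execution(outer => ex.run(c(outer)))

  def main(args: Array[String]): Unit = {
    val key = "mapreduce.input.split.size" // illustrative option name
    val readSplit: Execution[String] =
      getConfig.map(_.getOrElse(key, "default"))

    // Override the option only inside this sub-section of the flow.
    val tuned = withConfig(readSplit)(_ + (key -> "1g"))

    // The sibling readSplit still runs with the outer (empty) config.
    println(tuned.zip(readSplit).run(Map.empty)) // (1g,default)
  }
}
```

This also motivates the earlier review note about cache keys: two runs of the same logical Execution under different transformed configs can produce different results, so the transformed config must participate in the cache key.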