target multiple jobs from a sensor #4590

sryza · 2021-08-19T22:32:03Z

A use case we heard from a user

Multiple jobs are derived from a single graph, and each job processes data from a different data source. The user would like a single sensor to trigger all of them.

Prior to the graph/job/op changes, the user had a partition set for each of these jobs. With the graph/job/op changes, each partition set becomes a job.

Another hypothesized use case

Diverse jobs might all depend on the same event stream, and that event stream may be expensive to read from. E.g. there might be some external API that hosts data, and different jobs might do different things with that data. Querying that external API might be expensive, so you might want to have a single sensor responsible for polling it and dispatching to different jobs.
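To make the "poll once, dispatch to many jobs" idea concrete, here is a minimal plain-Python sketch. It deliberately does not use the Dagster API: `RunRequestSketch`, `fetch_events`, `JOB_FOR_KIND`, and the job names are all hypothetical stand-ins chosen for illustration, and the routing-by-event-kind rule is just one assumption about how events could map to jobs.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class RunRequestSketch:
    """Hypothetical stand-in for a run request that names its target job."""

    run_key: str
    job_name: str


# Hypothetical routing table: event kind -> name of the job that handles it.
JOB_FOR_KIND: Dict[str, str] = {
    "report": "reports_job",
    "alert": "alerts_job",
}


def fetch_events() -> List[dict]:
    """Placeholder for the single expensive call to the external API."""
    return [
        {"id": "a1", "kind": "report"},
        {"id": "b7", "kind": "alert"},
        {"id": "c3", "kind": "other"},
    ]


def dispatch(events: List[dict]) -> Iterator[RunRequestSketch]:
    """Yield one request per event, routed to the job registered for its kind."""
    for event in events:
        job_name = JOB_FOR_KIND.get(event["kind"])
        if job_name is None:
            continue  # no job is interested in this kind of event
        # Using the event id as the run key keeps repeated polls idempotent.
        yield RunRequestSketch(run_key=event["id"], job_name=job_name)


# The API is polled once; every interested job gets its requests from that poll.
requests = list(dispatch(fetch_events()))
```

The point of the sketch is only that the expensive read happens in one place, while the choice of target job is made per event.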

yuhan · 2021-08-25T22:05:02Z

say, we have two jobs from the same graph in one repo:

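from dagster import ResourceDefinition

# event_reports is the shared graph, defined elsewhere in the repo.
# hardcoded_resource pins the "mode" resource to a constant value per job.
event_reports_prod1 = event_reports.to_job(
    name="event_reports_prod1",
    resource_defs={"mode": ResourceDefinition.hardcoded_resource("prod1")},
)
event_reports_prod2 = event_reports.to_job(
    name="event_reports_prod2",
    resource_defs={"mode": ResourceDefinition.hardcoded_resource("prod2")},
)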

Option 1
then, we want to have a sensor that targets these two jobs at the same time:

@sensor(jobs=[event_reports_prod1, event_reports_prod2])
def multi_job_sensor(context):
    # some_condition, some_key, and some_other_key are placeholders
    if some_condition:
        yield RunRequest(run_key=some_key, job_name="event_reports_prod1")
    else:
        yield RunRequest(run_key=some_other_key, job_name="event_reports_prod2")

however, several params on @sensor/SensorDefinition are tied to a single job (pipeline_name, solid_selection, mode), so adding a jobs param to @sensor would conflict with them
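To illustrate the conflict, here is a minimal sketch of one way the ambiguity could be surfaced at definition time. This is not how Dagster handles it; `check_sensor_targets` is a hypothetical helper, and the rule it enforces (reject any mix of `jobs` with the single-job params) is an assumption made for the example.

```python
from typing import Optional, Sequence


def check_sensor_targets(
    pipeline_name: Optional[str] = None,
    solid_selection: Optional[Sequence[str]] = None,
    mode: Optional[str] = None,
    jobs: Optional[Sequence[str]] = None,
) -> str:
    """Return "single" or "multi", rejecting ambiguous combinations."""
    single_job_params = {
        "pipeline_name": pipeline_name,
        "solid_selection": solid_selection,
        "mode": mode,
    }
    # Collect the names of the single-job params the caller actually set.
    provided = [name for name, value in single_job_params.items() if value is not None]
    if jobs is not None and provided:
        raise ValueError(
            "jobs cannot be combined with single-job params: " + ", ".join(provided)
        )
    return "multi" if jobs is not None else "single"
```

For example, `check_sensor_targets(jobs=["event_reports_prod1", "event_reports_prod2"])` is accepted as a multi-job sensor, while passing `mode` together with `jobs` raises a `ValueError`.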

Option 2
it may make more sense to introduce a separate decorator for multi-job sensors, like:

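# @multi_target_sensor is the proposed decorator; it does not exist yet
@multi_target_sensor(jobs=[event_reports_prod1, event_reports_prod2])
def multi_job_sensor(context):
    # some_condition, some_key, and some_other_key are placeholders
    if some_condition:
        yield RunRequest(run_key=some_key, job_name="event_reports_prod1")
    else:
        yield RunRequest(run_key=some_other_key, job_name="event_reports_prod2")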

thoughts?

sryza · 2021-08-25T23:12:40Z

@yuhan - thoughts on whether @multi_target_sensor would return a SensorDefinition or some different type?

sryza added the "area: sensor" label Aug 19, 2021

yuhan added the practitioner label Aug 24, 2021

yuhan self-assigned this Aug 25, 2021

This was referenced Sep 2, 2021

multi_target_sensor - v0 #4714

Closed

support multiple job targets for sensors #4745

Merged

prha closed this as completed in #4745 Sep 16, 2021
