support multiple job targets for sensors #4745

yuhan · 2021-09-06T10:15:22Z

Summary

resolves #4590

this PR changes SensorDefinition to accept multiple targets. it doesn't handle solid_selection - my plan is to punt on it and consolidate it with the op selection effort later.

Test Plan

bk

vercel · 2021-09-06T10:15:25Z

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/elementl/dagster/4QnJ6bw3R1CmxURr1UDas41ymKfF
✅ Preview: https://dagster-git-yuhan-multi-job-sensor-elementl.vercel.app

sryza

I think this looks pretty great. I'd like @prha to take a look as well. We could consider having a subclass of RunRequest that has the job name? Though not sure that's better.

python_modules/dagster/dagster/core/definitions/decorators/sensor.py

python_modules/dagster/dagster_tests/daemon_tests/test_sensor_run.py

python_modules/dagster/dagster/core/definitions/sensor.py

python_modules/dagster/dagster/core/definitions/pipeline_sensor.py

alangenfeld

My gut reaction is that it is not clear to me that targeting 0, 1, or N things should be represented by distinct definition types.

Having separate decorators feels more ok, but I would lean towards just having @sensor without a deeper understanding of the exact trade-offs.

python_modules/dagster/dagster/core/host_representation/external_data.py

alangenfeld · 2021-09-09T19:29:50Z

python_modules/dagster/dagster_tests/daemon_tests/test_sensor_run.py

+@multi_job_sensor(jobs=[the_job, config_graph.to_job()])
+def two_job_sensor(context):
+    counter = int(context.cursor) if context.cursor else 0
+    if counter % 2 == 0:
+        yield RunRequest(run_key=str(counter), job_name="the_graph")
+    else:
+        yield RunRequest(
+            run_key=str(counter),
+            job_name="config_graph",
+            run_config={"solids": {"config_solid": {"config": {"foo": "blah"}}}},
+        )


There's something odd about pointing at the job objects in the decorator, and then aligning with their name string property from the sensor. I guess we could update the examples to do job_name=the_job.name or something.

When we discussed this in the past I pitched having targets be a dict and then aligning on the key. Awkward in its own way, but it has the upside of preventing a change to the graph/job name from breaking the sensor at run time if it's using string literals for job_name.

An extra layer of indirection adds a lot of pain / opportunities for error, so I think we should be pretty confident it's necessary if we're going to add it. If someone is worried about the job name changing, they could do RunRequest(job_name=my_job.name).

Maybe the_job.create_run_request(...) could be interesting; it might be able to give a better error experience.

The fundamental awkwardness is some amount of duping since we want the ahead of time declaration and then need to align amongst the multiple targets.

I found RunRequest(job_name=job.name) to be pretty good... not sure why, but I feel hesitant to create methods on JobDefinition that are thin wrappers around RunRequest
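
For illustration, here is a minimal stdlib-only sketch of the trade-off discussed in this thread. The `Job` and `RunRequest` classes and the `resolve_target` helper are hypothetical stand-ins, not Dagster's implementation; they only mirror the `run_key` / `job_name` fields used in the test above.

```python
from typing import NamedTuple, Optional, Sequence


class Job(NamedTuple):
    """Hypothetical stand-in for a job definition; only the name matters here."""
    name: str


class RunRequest(NamedTuple):
    """Hypothetical stand-in with the run_key / job_name fields from the test above."""
    run_key: str
    job_name: Optional[str] = None


def resolve_target(request: RunRequest, jobs: Sequence[Job]) -> Job:
    """Match a run request to one of the sensor's declared jobs by name."""
    by_name = {job.name: job for job in jobs}
    if request.job_name is None:
        # Assumption: a single-target sensor does not need an explicit job_name.
        if len(jobs) == 1:
            return jobs[0]
        raise ValueError("job_name is required when a sensor targets several jobs")
    if request.job_name not in by_name:
        raise ValueError(
            f"unknown job_name {request.job_name!r}; expected one of {sorted(by_name)}"
        )
    return by_name[request.job_name]


the_job = Job(name="the_graph")
config_job = Job(name="config_graph")
jobs = [the_job, config_job]

# Referencing the object's name keeps working if the job is renamed.
ok = resolve_target(RunRequest(run_key="0", job_name=the_job.name), jobs)

# A stale string literal is only caught when the request is resolved.
try:
    resolve_target(RunRequest(run_key="1", job_name="old_graph_name"), jobs)
    stale_error = None
except ValueError as exc:
    stale_error = str(exc)
```

The sketch keeps a single source of truth (the job's own name), which is the argument for `job_name=my_job.name` over a separate dict key.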

python_modules/dagster/dagster/core/definitions/sensor.py

alangenfeld

to your queue

alangenfeld

you can make a cross compat test by copy pasting out the existing external sensor stuff and using the manual serdes whitelist stuff to simulate old code loading something written by new code

alangenfeld · 2021-09-15T21:59:09Z

python_modules/dagster/dagster/core/definitions/decorators/sensor.py

 def sensor(
    pipeline_name: Optional[str] = None,
    name: Optional[str] = None,
    solid_selection: Optional[List[str]] = None,
    mode: Optional[str] = None,
    minimum_interval_seconds: Optional[int] = None,
    description: Optional[str] = None,
-    job: Optional[Union[PipelineDefinition, GraphDefinition]] = None,
+    job: Optional[Union[GraphDefinition, JobDefinition]] = None,


should we aspire to remove job if we have jobs? both isn't the worst i guess

yeah, I think it's just a convenience argument...
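
As a rough sketch of what "convenience argument" could mean here: a single `job` can be folded into the same list that `jobs` would provide. The `normalize_targets` helper is hypothetical, and rejecting both arguments at once is an assumption, not something this PR confirms.

```python
from typing import List, Optional, Sequence


def normalize_targets(
    job: Optional[str] = None, jobs: Optional[Sequence[str]] = None
) -> List[str]:
    """Fold the single-job convenience argument into one list of targets.

    Jobs are represented by plain strings here to keep the sketch self-contained.
    """
    if job is not None and jobs is not None:
        # Assumption: passing both is treated as ambiguous.
        raise ValueError("pass either job or jobs, not both")
    if job is not None:
        return [job]
    return list(jobs or [])


single = normalize_targets(job="the_graph")
multiple = normalize_targets(jobs=["the_graph", "config_graph"])
```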

python_modules/dagster/dagster/core/host_representation/external.py

python_modules/dagster/dagster/core/host_representation/external_data.py

yuhan · 2021-09-16T14:23:53Z

@prha this is looking great to me!

sryza

I like it

alangenfeld

i'm not certain we need to care about the reading new from old case, since I think that would mean old dagit and newer user code, which is less critical than new dagit and old user code. but the inline comments are around the fact that we don't handle that.

alangenfeld · 2021-09-16T17:46:04Z

python_modules/dagster/dagster_tests/core_tests/host_representation_tests/test_external_data.py

+def test_back_compat_external_sensor():
+    SERIALIZED_0_12_10_SENSOR = '{"__class__": "ExternalSensorData", "description": null, "min_interval": null, "mode": "default", "name": "my_sensor", "pipeline_name": "my_pipeline", "solid_selection": null}'
+    external_sensor_data = deserialize_json_to_dagster_namedtuple(SERIALIZED_0_12_10_SENSOR)
+    assert isinstance(external_sensor_data, ExternalSensorData)
+    assert len(external_sensor_data.target_dict) == 1
+    assert "my_pipeline" in external_sensor_data.target_dict
+    target = external_sensor_data.target_dict["my_pipeline"]
+    assert isinstance(target, ExternalTargetData)
+    assert target.pipeline_name == "my_pipeline"


this tests reading old in new, but should we test reading new from old?
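
To make the "reading old in new" direction concrete, here is a minimal stdlib-only sketch. `TargetData`, `SensorData`, and `load_sensor` are hypothetical stand-ins rather than the real serdes code, and falling back to the "default" mode when none is recorded is an assumption.

```python
import json
from typing import Dict, NamedTuple


class TargetData(NamedTuple):
    """Hypothetical stand-in for a per-target record."""
    pipeline_name: str
    mode: str


class SensorData(NamedTuple):
    """Hypothetical stand-in for the multi-target sensor record."""
    name: str
    target_dict: Dict[str, TargetData]


def load_sensor(serialized: str) -> SensorData:
    """Read either the new multi-target shape or the legacy single-target shape."""
    raw = json.loads(serialized)
    if raw.get("target_dict"):
        targets = {
            key: TargetData(value["pipeline_name"], value["mode"])
            for key, value in raw["target_dict"].items()
        }
    else:
        # Legacy payload: lift the top-level pipeline_name / mode into one target.
        pipeline_name = raw["pipeline_name"]
        targets = {
            pipeline_name: TargetData(pipeline_name, raw.get("mode") or "default")
        }
    return SensorData(name=raw["name"], target_dict=targets)


# Same legacy payload shape as the 0.12.10 string in the test above.
LEGACY = (
    '{"__class__": "ExternalSensorData", "description": null, '
    '"min_interval": null, "mode": "default", "name": "my_sensor", '
    '"pipeline_name": "my_pipeline", "solid_selection": null}'
)
legacy_sensor = load_sensor(LEGACY)
```

The opposite direction (old code reading a new payload) is not covered by this sketch.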

python_modules/dagster/dagster/core/host_representation/external_data.py

* MultiJobSensorDefinition and ISensorDefinition
* format, single target properties
* mypy
* base sensor def
* get rid of base
* make external repr backcompat
* fix tests
* fix test error message
* List -> Sequence for jobs mypy type
* alex comments
* ensure backcompat deserialization from 0.12.10
* fix mypy error
* make sure pipeline name, mode are non-null in sensor targets
* populate legacy fields for external sensor data

Co-authored-by: prha <prha@elementl.com>

hebo-yang · 2021-09-20T06:37:32Z

Any chance this PR would cause dagster.serdes.errors.DeserializationError: Attempted to deserialize class "ExternalTargetData" which is not in the whitelist. This error can occur due to version skew, verify processes are running expected versions.?

yuhan · 2021-09-20T07:13:27Z

Hi @hebo-yang yes, this is related to the error.
The most likely cause is that your dagster package is on an earlier version than 0.12.11 (the version where we introduced this change). Reinstalling Dagit and upgrading the Dagster package should fix it.
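
For readers hitting the same error, here is a toy model of how a whitelist-based deserializer fails under version skew. The `deserialize` function and the two whitelists are hypothetical and far simpler than the real serdes module; only the class names and the shape of the error message come from this thread.

```python
import json


class DeserializationError(Exception):
    """Raised when a payload names a class the reader does not know."""


def deserialize(serialized, whitelist):
    """Unpack JSON, rejecting any "__class__" tag missing from the whitelist."""

    def unpack(value):
        if isinstance(value, dict):
            fields = {k: unpack(v) for k, v in value.items() if k != "__class__"}
            cls = value.get("__class__")
            if cls is None:
                return fields
            if cls not in whitelist:
                raise DeserializationError(
                    f'Attempted to deserialize class "{cls}" which is not in the whitelist.'
                )
            return (cls, fields)
        if isinstance(value, list):
            return [unpack(v) for v in value]
        return value

    return unpack(json.loads(serialized))


# Assumed whitelists: the older reader predates ExternalTargetData.
OLD_WHITELIST = {"ExternalSensorData"}
NEW_WHITELIST = {"ExternalSensorData", "ExternalTargetData"}

NEW_PAYLOAD = json.dumps(
    {
        "__class__": "ExternalSensorData",
        "name": "my_sensor",
        "target_dict": {
            "my_pipeline": {
                "__class__": "ExternalTargetData",
                "pipeline_name": "my_pipeline",
                "mode": "default",
            }
        },
    }
)

new_result = deserialize(NEW_PAYLOAD, NEW_WHITELIST)

try:
    deserialize(NEW_PAYLOAD, OLD_WHITELIST)
    skew_error = None
except DeserializationError as exc:
    skew_error = str(exc)
```

In this model the newer reader accepts the payload, while the older reader rejects it as soon as it meets the unknown class tag.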

hebo-yang · 2021-09-21T04:37:42Z

Thanks! Yes. It's just that we weren't expecting breaking changes from patch releases. I will lock the dependency version to avoid such errors in the future.

vercel bot deployed to Preview September 7, 2021 08:25

yuhan requested review from prha, alangenfeld and sryza September 7, 2021 08:27

vercel bot deployed to Preview September 7, 2021 08:55

yuhan marked this pull request as ready for review September 7, 2021 08:55

sryza reviewed Sep 7, 2021

python_modules/dagster/dagster/core/definitions/decorators/sensor.py (outdated)

python_modules/dagster/dagster_tests/daemon_tests/test_sensor_run.py

prha reviewed Sep 7, 2021

python_modules/dagster/dagster/core/definitions/sensor.py (outdated)

alangenfeld reviewed Sep 9, 2021

python_modules/dagster/dagster/core/definitions/pipeline_sensor.py (outdated)

alangenfeld reviewed Sep 9, 2021

alangenfeld requested changes Sep 9, 2021

yuhan added 3 commits September 15, 2021 14:37

MultiJobSensorDefinition and ISensorDefinition

2aa05ef

format, single target properties

95a0940

mypy

0ce834f

prha force-pushed the yuhan/multi-job-sensor branch from cb85958 to 0ce834f Compare September 15, 2021 21:38

vercel bot deployed to Preview September 15, 2021 21:38

prha added 4 commits September 15, 2021 14:39

base sensor def

9450a3a

get rid of base

0dd5558

make external repr backcompat

a16aac4

fix tests

ffa62c7

prha changed the title from "RFC: MultiJobSensorDefinition and ISensorDefinition" to "support multiple job targets for sensors" Sep 15, 2021

prha added 2 commits September 15, 2021 14:44

fix test error message

bfa6d35

List -> Sequence for jobs mypy type

b6db86e

alangenfeld reviewed Sep 15, 2021

vercel bot deployed to Preview September 15, 2021 22:12

prha added 3 commits September 15, 2021 15:18

alex comments

e4f42ed

ensure backcompat deserialization from 0.12.10

9914735

fix mypy error

871aca8

vercel bot deployed to Preview September 15, 2021 23:37

make sure pipeline name, mode are non-null in sensor targets

97253c2

vercel bot deployed to Preview September 16, 2021 01:09

yuhan requested a review from alangenfeld September 16, 2021 12:40

sryza approved these changes Sep 16, 2021

alangenfeld approved these changes Sep 16, 2021

populate legacy fields for external sensor data

effd0c6

vercel bot deployed to Preview September 16, 2021 18:01

prha merged commit 72bd217 into master Sep 16, 2021

prha deleted the yuhan/multi-job-sensor branch September 16, 2021 18:25

dpeng817 mentioned this pull request Dec 9, 2021

Add OSS sensor backcompat tests #5889

Closed
