Expected Behavior
The lookback window should be based off the cursor date instead of the start date when performing a sync that uses stream slicing. If the lookback window is 3 days, then on the first sync, records starting from 3 days before the start date should be retrieved. For each subsequent incremental sync, we should attempt to get records starting from 3 days before the date specified in the incoming state json.
Implementation Approach:
Within the DatetimeStreamSlicer.stream_slices() method in datetime_stream_slicer.py, we derive what the starting time of the request should be. Currently, we apply the lookback to the specified start time and then choose the greater of start_datetime and cursor_datetime, so the lookback is lost whenever the cursor is later than the start date. We should instead compare the two datetimes first and then apply the lookback to the result.
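The ordering problem can be illustrated with a minimal sketch. This is not the actual DatetimeStreamSlicer code: both function names and their signatures are hypothetical, and the sketch assumes that a first sync has no cursor, so the start date is used as the reference.

```python
from datetime import datetime, timedelta
from typing import Optional


def slice_start_current(
    start_datetime: datetime,
    cursor_datetime: Optional[datetime],
    lookback: timedelta,
) -> datetime:
    """Hypothetical model of the current behavior: lookback is applied to the start date only."""
    lookback_start = start_datetime - lookback
    if cursor_datetime is None:
        return lookback_start
    # A cursor later than the start date wins outright, so the lookback is lost.
    return max(lookback_start, cursor_datetime)


def slice_start_proposed(
    start_datetime: datetime,
    cursor_datetime: Optional[datetime],
    lookback: timedelta,
) -> datetime:
    """Hypothetical model of the proposal: compare first, then apply the lookback."""
    if cursor_datetime is None:
        reference = start_datetime
    else:
        reference = max(start_datetime, cursor_datetime)
    return reference - lookback


# Illustrative values only.
start = datetime(2022, 1, 10)
cursor = datetime(2022, 2, 1)
lookback = timedelta(days=3)

print(slice_start_current(start, cursor, lookback))   # 2022-02-01 00:00:00
print(slice_start_proposed(start, cursor, lookback))  # 2022-01-29 00:00:00
print(slice_start_proposed(start, None, lookback))    # 2022-01-07 00:00:00
```

With a cursor of 2022-02-01, the first function starts exactly at the cursor, while the second starts 3 days earlier, which matches the expected behavior described above.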
Acceptance Criteria
When running an incremental sync with state, it should yield records starting from the cursor date minus the lookback window.
Good context for validation:
A good integration to test this on is source_wikipedia_pageviews because it doesn't require authentication and has datetime stream slicing implemented already. It doesn't have a lookback window specified, but you can add that in for testing purposes.
When running the first sync, you should see the lookback window reflected in the dates of the records received when you run a read command. You can then pass in a state json with a date later than the start_datetime using --state. On this second sync, you should see records with dates earlier than the date in state.json, according to the lookback window defined.
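The check on the second sync can be sketched as a small helper. The function name is hypothetical, the dates are illustrative, and how you extract a date from each record depends on the connector's record schema, which is not shown here.

```python
from datetime import date, timedelta
from typing import Iterable


def lookback_is_reflected(
    record_dates: Iterable[date],
    state_date: date,
    lookback: timedelta,
) -> bool:
    """Return True if the earliest record falls inside the lookback window.

    That is: records reach back before state_date, but not further back
    than state_date - lookback.
    """
    dates = list(record_dates)
    if not dates:
        return False
    earliest = min(dates)
    return state_date - lookback <= earliest < state_date


# Illustrative values only: a state date of 2022-02-01 and a 3 day lookback.
state_date = date(2022, 2, 1)
lookback = timedelta(days=3)

with_lookback = [date(2022, 1, 29), date(2022, 1, 30), date(2022, 2, 1)]
without_lookback = [date(2022, 2, 1), date(2022, 2, 2)]

print(lookback_is_reflected(with_lookback, state_date, lookback))     # True
print(lookback_is_reflected(without_lookback, state_date, lookback))  # False
```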
#20156)
* [ISSUE #15628] apply lookback window on earliest datetime between start and cursor
* [ISSUE #15628] update release information and clean return statement
Current Behavior
The DatetimeStreamSlicer's lookback window is only applied on the start date, which means it won't work for subsequent syncs.
context: #15027 (comment)