Server crashes on data replay with assertion lastNotificationStep < updateGraph.clock().currentStep() #5422

Closed
jjbrosnan opened this issue Apr 26, 2024 · 3 comments · Fixed by #5577
Assignees
Labels
bug (Something isn't working), core (Core development tasks), query engine, user-reported

Comments

@jjbrosnan
Contributor

jjbrosnan commented Apr 26, 2024

Description

Server crashes with the following when replaying table data:

Server shutdown: Exception while processing UpdateGraph notification

io.deephaven.base.verify.AssertionFailure: Assertion failed: asserted lastNotificationStep < updateGraph.clock().currentStep(), instead lastNotificationStep == 10551, updateGraph.clock().currentStep() == 10551.
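For readers unfamiliar with the assertion, the invariant it states is that a table may deliver at most one notification per update-graph step. The following is a minimal toy model of that invariant in plain Python, not Deephaven's implementation; `ToyClock` and `ToySource` are hypothetical names invented for illustration, and the real engine's bookkeeping is certainly more involved.

```python
class ToyClock:
    """Hypothetical stand-in for an update graph's logical clock."""

    def __init__(self):
        self.current_step = 0

    def advance(self):
        # Begin a new update cycle.
        self.current_step += 1
        return self.current_step


class ToySource:
    """Hypothetical source table that may notify at most once per step."""

    def __init__(self, clock):
        self.clock = clock
        self.last_notification_step = -1

    def notify(self):
        current = self.clock.current_step
        # Same shape as the failed assertion in the report:
        # the last notification must belong to an earlier step.
        if not self.last_notification_step < current:
            raise AssertionError(
                "asserted last_notification_step < current_step, instead "
                f"last_notification_step == {self.last_notification_step}, "
                f"current_step == {current}"
            )
        self.last_notification_step = current


clock = ToyClock()
source = ToySource(clock)

clock.advance()
source.notify()  # first notification in step 1: allowed

try:
    source.notify()  # second notification in the same step: rejected
    double_notify_error = None
except AssertionError as exc:
    double_notify_error = str(exc)
```

In this model, equal values on both sides of the comparison (as in the reported `10551 == 10551`) mean the same object notified twice within one step.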

Steps to reproduce

I don't have a reproducer on hand.

Expected results

Either the table should fail to replay with an exception, or it should just stop replaying.

Actual results

A server crash.

Additional details and attachments


Versions

  • Deephaven: 0.33.3
  • OS: OS X
  • Browser: Unsure
  • Docker: Unsure

This is an image of the exception the user provided:
image (21)

@jjbrosnan added the bug, triage, and user-reported labels Apr 26, 2024
@jjbrosnan
Contributor Author

jjbrosnan commented Apr 26, 2024

from deephaven.parquet import read

result_tick_data = read("/data/storage/nse_tick_data_26_04_2024.parquet")

#-----
from deephaven.replay import TableReplayer
from deephaven.time import to_j_instant


start_time = "2024-04-26T11:52:00 Asia/Kolkata"
end_time = "2024-04-26T16:55:00 Asia/Kolkata"

result_tick_data = result_tick_data.where([f"KafkaTimestamp > '{start_time}'", f"KafkaTimestamp < '{end_time}'"])
result_tick_data = result_tick_data.sort("KafkaTimestamp")

replayer = TableReplayer(start_time, end_time)
replayed_table = replayer.add_table(result_tick_data, "KafkaTimestamp").reverse()
replayer.start()
#-----

result = replayed_table.join(table=db_mv_zerodha_master_instruments, on=["instrument_token = instrument_token"])
result1 = result.last_by(by=["tradingsymbol"])

@rcaudy rcaudy self-assigned this Apr 26, 2024
@rcaudy added the query engine and core labels and removed the triage label Apr 26, 2024
@rcaudy rcaudy added this to the 2. April 2024 milestone Apr 26, 2024
@rcaudy
Member

rcaudy commented May 3, 2024

Added the user's exception screenshot to the description.

This looks like the ReplayTable somehow double-notified, where the second notification was an error. This doesn't present like a satisfaction bug causing a downstream listener to double notify.

I cannot find anything in the ReplayTable.run() code that would suggest that a double-notify is possible, unless the Replayer was started twice.

Without more to go on (e.g. the exception that triggered the notification), it's hard to know what happened here.
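On the user side, the "started twice" hypothesis could be ruled out by guarding the call to `start()`. The sketch below is a plain-Python wrapper, not part of the Deephaven API; `StartOnce` and `FakeReplayer` are hypothetical names, and the wrapper assumes only that the wrapped object exposes a `start()` method.

```python
import threading


class StartOnce:
    """Hypothetical wrapper that allows start() to be called only once."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.Lock()
        self._started = False

    def start(self):
        # Check and set the flag under a lock so concurrent callers
        # cannot both get through.
        with self._lock:
            if self._started:
                raise RuntimeError("start() was already called on this replayer")
            self._started = True
        self._target.start()


class FakeReplayer:
    """Stand-in for a replayer, used only to exercise the wrapper."""

    def __init__(self):
        self.start_calls = 0

    def start(self):
        self.start_calls += 1


fake = FakeReplayer()
guarded = StartOnce(fake)
guarded.start()  # first call is forwarded to the wrapped object

try:
    guarded.start()  # second call is rejected before reaching the replayer
    second_start_error = None
except RuntimeError as exc:
    second_start_error = str(exc)
```

In the reported script this would mean calling `StartOnce(replayer).start()` in place of `replayer.start()`. It only detects a second call made through the wrapper, so it narrows the hypothesis rather than proving anything about the engine.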

@rcaudy
Member

rcaudy commented May 3, 2024

I've merged a PR to try to get more error logging next time.
