[SPARK-49823][SS] Avoid flush during shutdown in rocksdb close path
### What changes were proposed in this pull request?
Set `avoid_flush_during_shutdown` to true in the RocksDB options so that memtables are not flushed in the RocksDB close path.

### Why are the changes needed?
Without this change, `cancelAllBackgroundWork` sometimes hangs if there are memtables that need to be flushed. A flush is also unnecessary in this path, because we only assume that a sync flush is required in the commit path.

```
	at app//org.rocksdb.RocksDB.cancelAllBackgroundWork(Native Method)
	at app//org.rocksdb.RocksDB.cancelAllBackgroundWork(RocksDB.java:4053)
	at app//org.apache.spark.sql.execution.streaming.state.RocksDB.closeDB(RocksDB.scala:1406)
	at app//org.apache.spark.sql.execution.streaming.state.RocksDB.load(RocksDB.scala:383)
```
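
For illustration, here is a minimal standalone sketch of the close path with the option enabled. It is not the Spark code: the object name, temp directory, and key/value are hypothetical, and passing `wait = true` to `cancelAllBackgroundWork` is an assumption about how the close path invokes it. It requires `rocksdbjni` on the classpath.

```scala
import java.nio.file.Files
import org.rocksdb.{Options, RocksDB}

// Hypothetical standalone sketch; names and paths are illustrative only.
object AvoidFlushOnCloseSketch {
  def main(args: Array[String]): Unit = {
    RocksDB.loadLibrary()

    // Throwaway directory for the example database.
    val dir = Files.createTempDirectory("rocksdb-sketch")

    val options = new Options()
      .setCreateIfMissing(true)
      // Same setter the patch adds: skip the memtable flush when the DB shuts down.
      .setAvoidFlushDuringShutdown(true)

    // The getter reflects the value that RocksDB logs as
    // "Options.avoid_flush_during_shutdown".
    assert(options.avoidFlushDuringShutdown())

    val db = RocksDB.open(options, dir.toString)
    try {
      // Leaves an unflushed entry in the memtable.
      db.put("key".getBytes("UTF-8"), "value".getBytes("UTF-8"))
    } finally {
      // Assumed shape of the close path: stop background work, then close.
      // With the option set, close does not trigger a memtable flush.
      db.cancelAllBackgroundWork(true)
      db.close()
      options.close()
    }
  }
}
```

Note that with this option RocksDB drops unflushed memtable data that is not covered by the WAL when the DB closes, which is why the change relies on the commit path doing its own sync flush.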

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually verified in the logs that the config is passed, and ran the existing unit tests.

Before:
```
sql/core/target/unit-tests.log:141:18:20:06.223 pool-1-thread-1-ScalaTest-running-RocksDBSuite INFO RocksDB [Thread-17]: [NativeRocksDB-1]             Options.avoid_flush_during_shutdown: 0
sql/core/target/unit-tests.log:776:18:20:06.871 pool-1-thread-1-ScalaTest-running-RocksDBSuite INFO RocksDB [Thread-17]: [NativeRocksDB-1]             Options.avoid_flush_during_shutdown: 0
sql/core/target/unit-tests.log:1096:18:20:07.129 pool-1-thread-1-ScalaTest-running-RocksDBSuite INFO RocksDB [Thread-17]: [NativeRocksDB-1]             Options.avoid_flush_during_shutdown: 0
```

After:
```
sql/core/target/unit-tests.log:6561:18:17:42.723 pool-1-thread-1-ScalaTest-running-RocksDBSuite INFO RocksDB [Thread-17]: [NativeRocksDB-1]             Options.avoid_flush_during_shutdown: 1
sql/core/target/unit-tests.log:6947:18:17:43.035 pool-1-thread-1-ScalaTest-running-RocksDBSuite INFO RocksDB [Thread-17]: [NativeRocksDB-1]             Options.avoid_flush_during_shutdown: 1
sql/core/target/unit-tests.log:7344:18:17:43.313 pool-1-thread-1-ScalaTest-running-RocksDBSuite INFO RocksDB [Thread-17]: [NativeRocksDB-1]             Options.avoid_flush_during_shutdown: 1
```

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#48292 from anishshri-db/task/SPARK-49823.

Authored-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
anishshri-db authored and HeartSaVioR committed Sep 30, 2024
1 parent 039fd13 commit 885c3fa
Showing 1 changed file with 1 addition and 0 deletions.
@@ -134,6 +134,7 @@ class RocksDB(
rocksDbOptions.setTableFormatConfig(tableFormatConfig)
rocksDbOptions.setMaxOpenFiles(conf.maxOpenFiles)
rocksDbOptions.setAllowFAllocate(conf.allowFAllocate)
rocksDbOptions.setAvoidFlushDuringShutdown(true)
rocksDbOptions.setMergeOperator(new StringAppendOperator())

if (conf.boundedMemoryUsage) {