Compaction thread crash on sync writes #1422

Closed
ghost opened this issue Oct 26, 2016 · 7 comments

ghost commented Oct 26, 2016

Hi,

We have a use case that does sync writes (each write is ~1 KB; the SSD page size is 4 KB) with the following settings to alleviate stalling:

rocksdb_options_set_write_buffer_size(options, 33554432);   /* 32 MB memtable (write buffer) */
env = rocksdb_create_default_env();
rocksdb_env_set_background_threads(env, 2);                  /* low-priority pool, used by compactions */
rocksdb_env_set_high_priority_background_threads(env, 1);    /* high-priority pool, used by flushes */
rocksdb_options_set_max_background_compactions(options, 2);
rocksdb_options_set_max_background_flushes(options, 1);
rocksdb_options_set_env(options, env);
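
(For context, a rough sketch of the sync-write loop being described, assuming the options/env setup above. The DB path, key format, and buffer sizes are illustrative placeholders, not taken from the actual application.)

#include <stdio.h>
#include <string.h>
#include "rocksdb/c.h"

char *err = NULL;
rocksdb_options_set_create_if_missing(options, 1);
rocksdb_t *db = rocksdb_open(options, "/path/to/db", &err);        /* path is a placeholder */

rocksdb_writeoptions_t *wopts = rocksdb_writeoptions_create();
rocksdb_writeoptions_set_sync(wopts, 1);                            /* every write is synced to disk */

char key[32], value[1024];                                          /* ~1 KB per write, as above */
memset(value, 'x', sizeof(value));
for (long i = 0; i < 45000000L && err == NULL; i++) {               /* ~45 million rows */
  int keylen = snprintf(key, sizeof(key), "key%012ld", i);
  rocksdb_put(db, wopts, key, (size_t)keylen, value, sizeof(value), &err);
}

rocksdb_writeoptions_destroy(wopts);
rocksdb_close(db);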

Everything else is left at the defaults (including the read cache settings). After ~45 million rows were loaded, we got a crash on one of the compaction threads (the build was made from the current repository last week):

(gdb) bt
#0 0x00007f7f574a29e4 in UnalignedCopy64 (decompressor=0x7f7f45d23790, writer=0x7f7f45d237c0, uncompressed_len=) at snappy-stubs-internal.h:195
#1 IncrementalCopyFastPath (decompressor=0x7f7f45d23790, writer=0x7f7f45d237c0, uncompressed_len=) at snappy.cc:147
#2 AppendFromSelf (decompressor=0x7f7f45d23790, writer=0x7f7f45d237c0, uncompressed_len=) at snappy.cc:1209
#3 DecompressAllTags<snappy::SnappyArrayWriter> (decompressor=0x7f7f45d23790, writer=0x7f7f45d237c0, uncompressed_len=) at snappy.cc:779
#4 snappy::InternalUncompressAllTags<snappy::SnappyArrayWriter> (decompressor=0x7f7f45d23790, writer=0x7f7f45d237c0, uncompressed_len=) at snappy.cc:865
#5 0x00007f7f574a35e0 in InternalUncompress<snappy::SnappyArrayWriter> (compressed=, uncompressed=) at snappy.cc:855
#6 snappy::RawUncompress (compressed=, uncompressed=) at snappy.cc:1234
#7 0x00007f7f574a3632 in snappy::RawUncompress (compressed=, n=, uncompressed=) at snappy.cc:1229
#8 0x00007f7f5795111d in Snappy_Uncompress (data=0x7f7efa0afea0 "\305\335-\220", n=215560, contents=0x7f7f45d24ee0, format_version=2, compression_dict=..., compression_type=, ioptions=...) at ./util/compression.h:179
#9 rocksdb::UncompressBlockContentsForCompressionType (data=0x7f7efa0afea0 "\305\335-\220", n=215560, contents=0x7f7f45d24ee0, format_version=2, compression_dict=..., compression_type=, ioptions=...) at table/format.cc:434
#10 0x00007f7f57951631 in rocksdb::UncompressBlockContents (data=Unhandled dwarf expression opcode 0xf3) at table/format.cc:540
#11 0x00007f7f5795206c in rocksdb::ReadBlockContents (file=Unhandled dwarf expression opcode 0xf3) at table/format.cc:388
#12 0x00007f7f5793a628 in rocksdb::(anonymous namespace)::ReadBlockFromFile (file=Unhandled dwarf expression opcode 0xf3) at table/block_based_table_reader.cc:77
#13 0x00007f7f5793d38d in Create (this=Unhandled dwarf expression opcode 0xf3) at table/block_based_table_reader.cc:196
#14 rocksdb::BlockBasedTable::CreateIndexReader (this=Unhandled dwarf expression opcode 0xf3) at table/block_based_table_reader.cc:1743
#15 0x00007f7f579439d5 in rocksdb::BlockBasedTable::Open (ioptions=Unhandled dwarf expression opcode 0xf3) at table/block_based_table_reader.cc:788
#16 0x00007f7f57938c5b in rocksdb::BlockBasedTableFactory::NewTableReader (this=Unhandled dwarf expression opcode 0xf3) at table/block_based_table_factory.cc:59
#17 0x00007f7f578f31c5 in rocksdb::TableCache::GetTableReader (this=Unhandled dwarf expression opcode 0xf3) at db/table_cache.cc:111
#18 0x00007f7f578f376d in rocksdb::TableCache::FindTable (this=0x1962cb0, env_options=..., internal_comparator=..., fd=..., handle=0x7f7f45d25678, no_io=false, record_read_stats=true, file_read_hist=0x19619e0, skip_filters=false, level=-1, prefetch_index_and_filter_in_cache=true) at db/table_cache.cc:148
#19 0x00007f7f578f3d1d in rocksdb::TableCache::NewIterator (this=0x1962cb0, options=..., env_options=..., icomparator=..., fd=..., table_reader_ptr=0x0, file_read_hist=0x19619e0, for_compaction=false, arena=0x0, skip_filters=false, level=-1, range_del_agg=0x0, is_range_del_only=false) at db/table_cache.cc:218
#20 0x00007f7f5786943c in rocksdb::CompactionJob::FinishCompactionOutputFile (this=0x7f7f45d26ca0, input_status=Unhandled dwarf expression opcode 0xf3) at db/compaction_job.cc:1012
#21 0x00007f7f5786be23 in rocksdb::CompactionJob::ProcessKeyValueCompaction (this=0x7f7f45d26ca0, sub_compact=0x7f7ef89ea9f0) at db/compaction_job.cc:864
#22 0x00007f7f5786ca2f in rocksdb::CompactionJob::Run (this=0x7f7f45d26ca0) at db/compaction_job.cc:535
#23 0x00007f7f578943c8 in rocksdb::DBImpl::BackgroundCompaction (this=0x19510e0, made_progress=0x7f7f45d270fe, job_context=0x7f7f45d27120, log_buffer=0x7f7f45d27320, arg=0x0) at db/db_impl.cc:3616
#24 0x00007f7f578a4d68 in rocksdb::DBImpl::BackgroundCallCompaction (this=0x19510e0, arg=0x0) at db/db_impl.cc:3314
#25 0x00007f7f579c0771 in rocksdb::ThreadPoolImpl::BGThread (this=0x194ab40, thread_id=0) at util/threadpool_imp.cc:229
#26 0x00007f7f579c0853 in rocksdb::BGThreadWrapper (arg=0x1950b90) at util/threadpool_imp.cc:253
#27 0x00000034c0007851 in start_thread () from /lib64/libpthread.so.0
#28 0x00000034bf8e890d in clone () from /lib64/libc.so.6

Thanks,

Ethan.

dhruba commented Oct 28, 2016

A couple of questions:

  1. Can you recreate this crash repeatedly on demand?
  2. The crash happened when a background thread was doing compaction and was trying to read data from one of the source files of the compaction. If you look at the LOG file, you might be able to find the names of the SST files that were input to this compaction run. Once you find the SST file name, could you please dump its contents via the ldb tool (https://github.com/facebook/rocksdb/wiki/Administration-and-Data-Access-Tool)? This will verify whether the file contents are sane or whether they are corrupted. (See the sketch after this list.)
  3. If the file contents are sane, then the problem is not with the data but rather with the code. It could be either RocksDB code or your application code. Maybe you can build the RocksDB library with "valgrind" enabled and then rerun the test program that causes this problem.
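
A rough sketch of those checks from the command line (the SST file name and program name are placeholders, and exact sst_dump/ldb flags may vary between RocksDB versions):

sst_dump --file=/path/to/db/000123.sst --command=check   # verify the file can be read end to end
sst_dump --file=/path/to/db/000123.sst --command=scan    # iterate and print keys/values
ldb --db=/path/to/db dump                                # dump live DB contents with the ldb tool

valgrind --tool=memcheck ./your_test_program             # rerun the workload under valgrind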

ghost commented Oct 28, 2016

@dhruba Thanks for the reply. Unfortunately it isn't reproducible on demand. I am running a series of stress tests and this popped up in one of them, but it took many hours to show up.

I do have the coredump available so I can execute commands on it if it could help determine whether this was caused by our application code.

Thanks,

siying commented Nov 4, 2016

The call stack is interesting. After finishing writing one SST file, its index block cannot be correctly decompressed: the Snappy library crashes while trying to decompress the block.

It is very strange to me. In your core dump, are memory address 0x7f7efa0afea0 and the following 215560 bytes valid (the parameters of frame #8)? Could Snappy crash just because the data is not right? If that is the case, is there a way the data could have been corrupted in the device or the file system after being generated?
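
If it helps, one way to check that from the core dump (addresses taken from frame #8 above; the output file name is just a placeholder):

(gdb) x/16xb 0x7f7efa0afea0
(gdb) x/16xb 0x7f7efa0afea0 + 215560 - 16
(gdb) dump binary memory compressed_block.bin 0x7f7efa0afea0 0x7f7efa0afea0 + 215560

If both ends of the range are readable, the whole compressed block can be dumped to a file and inspected offline.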

ghost commented Nov 5, 2016

@siying Thanks for the follow-up. Quick question:
If Snappy is not able to decompress this, could it be caused by memory corruption? In other words, could bugs in our application (like a buffer overflow) be corrupting that data?

siying commented Nov 5, 2016

@ehamilto It is theoretically possible, for sure. However, unless you tune RocksDB in an uncommon way, it is hard for me to believe, as this is a common code path that almost every RocksDB instance goes through.

ghost commented Nov 5, 2016

@siying We have repeated the stress test that caused this several times, and I haven't seen it again.
We have fixed several crashes in our app in between (although those crashes were caused by dereferencing bad pointers, not memory corruption such as a buffer overflow). The only difference between the settings above and what we currently run is that we now use a memtable size of 4 MB. I will run a test with a 32 MB memtable in the next few days to see if I can reproduce it again. (A sketch of the two configurations is below.)
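
For reference, the only setting that differs between the two runs (a sketch using the same C API call as in the original report; the 32 MB value matches the settings above):

rocksdb_options_set_write_buffer_size(options, 4 * 1024 * 1024);   /* current runs: 4 MB memtable */
rocksdb_options_set_write_buffer_size(options, 33554432);          /* original report: 32 MB memtable */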

gfosco commented Jan 10, 2018

Closing this via automation due to lack of activity. If discussion is still needed here, please re-open or create a new/updated issue.
