Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose and elaborate FilterBuildingContext #6088

Closed
wants to merge 9 commits into from
2 changes: 1 addition & 1 deletion HISTORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
* Changed the default value of periodic_compaction_seconds to `UINT64_MAX` which allows RocksDB to auto-tune periodic compaction scheduling. When using the default value, periodic compactions are now auto-enabled if a compaction filter is used. A value of `0` will turn off the feature completely.
* With FIFO compaction style, options.periodic_compaction_seconds will have the same meaning as options.ttl. Whichever stricter will be used. With the default options.periodic_compaction_seconds value with options.ttl's default of 0, RocksDB will give a default of 30 days.
* Added an API GetCreationTimeOfOldestFile(uint64_t* creation_time) to get the file_creation_time of the oldest SST file in the DB.
* An unlikely usage of FilterPolicy is no longer supported. Calling GetFilterBitsBuilder() on the FilterPolicy returned by NewBloomFilterPolicy will now cause an assertion violation in debug builds, because RocksDB has internally migrated to a more elaborate interface that is expected to evolve further. Custom implementations of FilterPolicy should work as before, except those wrapping the return of NewBloomFilterPolicy, which will require a new override of a protected function in FilterPolicy.
* FilterPolicy now exposes additional API to make it possible to choose filter configurations based on context, such as table level and compaction style. See `LevelAndStyleCustomFilterPolicy` in db_bloom_filter_test.cc. While most existing custom implementations of FilterPolicy should continue to work as before, those wrapping the return of NewBloomFilterPolicy will require overriding new function `GetBuilderWithContext()`, because calling `GetFilterBitsBuilder()` on the FilterPolicy returned by NewBloomFilterPolicy is no longer supported.

### New Features
* Universal compaction to support options.periodic_compaction_seconds. A full compaction will be triggered if any file is over the threshold.
Expand Down
179 changes: 172 additions & 7 deletions db/db_bloom_filter_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -648,15 +648,17 @@ TEST_F(DBBloomFilterTest, BloomFilterReverseCompatibility) {
}

namespace {
// A wrapped bloom over default FilterPolicy
class WrappedBloom : public FilterPolicy {
// A wrapped bloom over block-based FilterPolicy
class TestingWrappedBlockBasedFilterPolicy : public FilterPolicy {
public:
explicit WrappedBloom(int bits_per_key)
explicit TestingWrappedBlockBasedFilterPolicy(int bits_per_key)
: filter_(NewBloomFilterPolicy(bits_per_key, true)), counter_(0) {}

~WrappedBloom() override { delete filter_; }
~TestingWrappedBlockBasedFilterPolicy() override { delete filter_; }

const char* Name() const override { return "WrappedRocksDbFilterPolicy"; }
const char* Name() const override {
return "TestingWrappedBlockBasedFilterPolicy";
}

void CreateFilter(const rocksdb::Slice* keys, int n,
std::string* dst) const override {
Expand All @@ -683,12 +685,13 @@ class WrappedBloom : public FilterPolicy {
};
} // namespace

TEST_F(DBBloomFilterTest, BloomFilterWrapper) {
TEST_F(DBBloomFilterTest, WrappedBlockBasedFilterPolicy) {
Options options = CurrentOptions();
options.statistics = rocksdb::CreateDBStatistics();

BlockBasedTableOptions table_options;
WrappedBloom* policy = new WrappedBloom(10);
TestingWrappedBlockBasedFilterPolicy* policy =
new TestingWrappedBlockBasedFilterPolicy(10);
table_options.filter_policy.reset(policy);
options.table_factory.reset(NewBlockBasedTableFactory(table_options));

Expand Down Expand Up @@ -718,6 +721,167 @@ TEST_F(DBBloomFilterTest, BloomFilterWrapper) {
ASSERT_EQ(2U * maxKey, policy->GetCounter());
}

namespace {
// NOTE: This class is referenced by HISTORY.md as a model for a wrapper
// FilterPolicy selecting among configurations based on context.
class LevelAndStyleCustomFilterPolicy : public FilterPolicy {
public:
explicit LevelAndStyleCustomFilterPolicy(int bpk_fifo, int bpk_l0_other,
int bpk_otherwise)
: policy_fifo_(NewBloomFilterPolicy(bpk_fifo)),
policy_l0_other_(NewBloomFilterPolicy(bpk_l0_other)),
policy_otherwise_(NewBloomFilterPolicy(bpk_otherwise)) {}

// OK to use built-in policy name because we are deferring to a
// built-in builder. We aren't changing the serialized format.
const char* Name() const override { return policy_fifo_->Name(); }

FilterBitsBuilder* GetBuilderWithContext(
const FilterBuildingContext& context) const override {
if (context.compaction_style == kCompactionStyleFIFO) {
return policy_fifo_->GetBuilderWithContext(context);
} else if (context.level_at_creation == 0) {
return policy_l0_other_->GetBuilderWithContext(context);
} else {
return policy_otherwise_->GetBuilderWithContext(context);
}
}

FilterBitsReader* GetFilterBitsReader(const Slice& contents) const {
// OK to defer to any of them; they all can parse built-in filters
// from any settings.
return policy_fifo_->GetFilterBitsReader(contents);
}

// Defer just in case configuration uses block-based filter
void CreateFilter(const Slice* keys, int n, std::string* dst) const override {
policy_otherwise_->CreateFilter(keys, n, dst);
}
bool KeyMayMatch(const Slice& key, const Slice& filter) const override {
return policy_otherwise_->KeyMayMatch(key, filter);
}

private:
const std::unique_ptr<const FilterPolicy> policy_fifo_;
const std::unique_ptr<const FilterPolicy> policy_l0_other_;
const std::unique_ptr<const FilterPolicy> policy_otherwise_;
};

class TestingContextCustomFilterPolicy
: public LevelAndStyleCustomFilterPolicy {
public:
explicit TestingContextCustomFilterPolicy(int bpk_fifo, int bpk_l0_other,
int bpk_otherwise)
: LevelAndStyleCustomFilterPolicy(bpk_fifo, bpk_l0_other, bpk_otherwise) {
}

FilterBitsBuilder* GetBuilderWithContext(
const FilterBuildingContext& context) const override {
test_report_ += "cf=";
test_report_ += context.column_family_name;
test_report_ += ",cs=";
test_report_ +=
OptionsHelper::compaction_style_to_string[context.compaction_style];
test_report_ += ",lv=";
test_report_ += std::to_string(context.level_at_creation);
test_report_ += "\n";

return LevelAndStyleCustomFilterPolicy::GetBuilderWithContext(context);
}

std::string DumpTestReport() {
std::string rv;
std::swap(rv, test_report_);
return rv;
}

private:
mutable std::string test_report_;
};
} // namespace

TEST_F(DBBloomFilterTest, ContextCustomFilterPolicy) {
for (bool fifo : {true, false}) {
Options options = CurrentOptions();
options.statistics = rocksdb::CreateDBStatistics();
options.compaction_style =
fifo ? kCompactionStyleFIFO : kCompactionStyleLevel;

BlockBasedTableOptions table_options;
std::shared_ptr<TestingContextCustomFilterPolicy> policy(
new TestingContextCustomFilterPolicy(15, 8, 5));
table_options.filter_policy = policy;
table_options.format_version = 5;
options.table_factory.reset(NewBlockBasedTableFactory(table_options));

CreateAndReopenWithCF({fifo ? "abe" : "bob"}, options);

const int maxKey = 10000;
for (int i = 0; i < maxKey / 2; i++) {
ASSERT_OK(Put(1, Key(i), Key(i)));
}
// Add a large key to make the file contain wide range
ASSERT_OK(Put(1, Key(maxKey + 55555), Key(maxKey + 55555)));
Flush(1);
EXPECT_EQ(policy->DumpTestReport(),
fifo ? "cf=abe,cs=kCompactionStyleFIFO,lv=0\n"
: "cf=bob,cs=kCompactionStyleLevel,lv=0\n");

for (int i = maxKey / 2; i < maxKey; i++) {
ASSERT_OK(Put(1, Key(i), Key(i)));
}
Flush(1);
EXPECT_EQ(policy->DumpTestReport(),
fifo ? "cf=abe,cs=kCompactionStyleFIFO,lv=0\n"
: "cf=bob,cs=kCompactionStyleLevel,lv=0\n");

// Check that they can be found
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ(Key(i), Get(1, Key(i)));
}
// Since we have two tables / two filters, we might have Bloom checks on
// our queries, but no more than one "useful" per query on a found key.
EXPECT_LE(TestGetAndResetTickerCount(options, BLOOM_FILTER_USEFUL), maxKey);

// Check that we have two filters, each about
// fifo: 0.12% FP rate (15 bits per key)
// level: 2.3% FP rate (8 bits per key)
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ("NOT_FOUND", Get(1, Key(i + 33333)));
}
{
auto useful_count =
TestGetAndResetTickerCount(options, BLOOM_FILTER_USEFUL);
EXPECT_GE(useful_count, maxKey * 2 * (fifo ? 0.998 : 0.975));
EXPECT_LE(useful_count, maxKey * 2 * (fifo ? 0.999 : 0.98));
}

if (!fifo) { // FIFO only has L0
// Full compaction
ASSERT_OK(db_->CompactRange(CompactRangeOptions(), handles_[1], nullptr,
nullptr));
EXPECT_EQ(policy->DumpTestReport(),
"cf=bob,cs=kCompactionStyleLevel,lv=1\n");

// Check that we now have one filter, about 9.2% FP rate (5 bits per key)
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ("NOT_FOUND", Get(1, Key(i + 33333)));
}
{
auto useful_count =
TestGetAndResetTickerCount(options, BLOOM_FILTER_USEFUL);
EXPECT_GE(useful_count, maxKey * 0.90);
EXPECT_LE(useful_count, maxKey * 0.91);
}
}

// Destroy
ASSERT_OK(dbfull()->DropColumnFamily(handles_[1]));
dbfull()->DestroyColumnFamilyHandle(handles_[1]);
handles_[1] = nullptr;
}
}

class SliceTransformLimitedDomain : public SliceTransform {
const char* Name() const override { return "SliceTransformLimitedDomain"; }

Expand Down Expand Up @@ -1159,6 +1323,7 @@ TEST_F(DBBloomFilterTest, OptimizeFiltersForHits) {
options.table_factory.reset(NewBlockBasedTableFactory(bbto));
options.optimize_filters_for_hits = true;
options.statistics = rocksdb::CreateDBStatistics();
get_perf_context()->Reset();
get_perf_context()->EnablePerLevelPerfContext();
CreateAndReopenWithCF({"mypikachu"}, options);

Expand Down
5 changes: 5 additions & 0 deletions db/db_test_util.h
Original file line number Diff line number Diff line change
Expand Up @@ -988,6 +988,11 @@ class DBTestBase : public testing::Test {
uint64_t TestGetTickerCount(const Options& options, Tickers ticker_type) {
return options.statistics->getTickerCount(ticker_type);
}

uint64_t TestGetAndResetTickerCount(const Options& options,
Tickers ticker_type) {
return options.statistics->getAndResetTickerCount(ticker_type);
}
};

} // namespace rocksdb
54 changes: 40 additions & 14 deletions include/rocksdb/filter_policy.h
Original file line number Diff line number Diff line change
Expand Up @@ -25,9 +25,12 @@
#include <string>
#include <vector>

#include "rocksdb/advanced_options.h"

namespace rocksdb {

class Slice;
struct BlockBasedTableOptions;

// A class that takes a bunch of keys, then generates filter
class FilterBitsBuilder {
Expand Down Expand Up @@ -80,8 +83,30 @@ class FilterBitsReader {
}
};

// Internal type required for FilterPolicy
struct FilterBuildingContext;
// Contextual information passed to BloomFilterPolicy at filter building time.
// Used in overriding FilterPolicy::GetBuilderWithContext().
struct FilterBuildingContext {
// This constructor is for internal use only and subject to change.
FilterBuildingContext(const BlockBasedTableOptions& table_options);

// Returns a new FilterBitsBuilder from the filter_policy in
// table_options, or nullptr if not applicable.
// (An internal convenience function to save boilerplate.)
FilterBitsBuilder* GetBuilder() const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little bit odd that a "context" is also a factory class. If it is not too complicated, can we change it to a static function that takes FilterBuildingContext as an argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. How about a namespace-global function

FilterBitsBuilder* NewFilterBitsBuilder(const FilterBuildingContext &)

And we could still use temporaries for FilterBuildingContext, as in

builder = NewFilterBitsBuilder(FilterBuildingContext(table_options));

Except this function will return nullptr if there's no filter policy in effect. Do we have a naming convention for might return a new object or might return nullptr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think I'll go with

builder = BloomFilterPolicy::GetBuilderFromContext(FilterBuildingContext(table_options));

"from context" - The context has everything you need to get the builder.
"with context" - This BloomFilterPolicy creates the builder, consulting information in the context.

And this will be out of view from the public API.


// Options for the table being built
const BlockBasedTableOptions& table_options;

// Name of the column family for the table (or empty string if unknown)
std::string column_family_name;

// The compactions style in effect for the table
CompactionStyle compaction_style = kCompactionStyleLevel;

// The table level at time of constructing the SST file, or -1 if unknown.
// (The table file could later be used at a different level.)
int level_at_creation = -1;
};

// We add a new format of filter block called full filter block
// This new interface gives you more space of customization
Expand Down Expand Up @@ -125,27 +150,28 @@ class FilterPolicy {

// Return a new FilterBitsBuilder for full or partitioned filter blocks, or
// nullptr if using block-based filter.
// NOTE: This function is only called by GetBuilderWithContext() below for
// custom FilterPolicy implementations. Thus, it is not necessary to
// override this function if overriding GetBuilderWithContext().
virtual FilterBitsBuilder* GetFilterBitsBuilder() const { return nullptr; }

// A newer variant of GetFilterBitsBuilder that allows a FilterPolicy
// to customize the builder for contextual constraints and hints.
// (Name changed to avoid triggering -Werror=overloaded-virtual.)
// If overriding GetFilterBitsBuilder() suffices, it is not necessary to
// override this function.
virtual FilterBitsBuilder* GetBuilderWithContext(
const FilterBuildingContext&) const {
return GetFilterBitsBuilder();
}

// Return a new FilterBitsReader for full or partitioned filter blocks, or
// nullptr if using block-based filter.
// As here, the input slice should NOT be deleted by FilterPolicy.
virtual FilterBitsReader* GetFilterBitsReader(
const Slice& /*contents*/) const {
return nullptr;
}

protected:
// An internal-use-only variant of GetFilterBitsBuilder that allows
// a built-in FilterPolicy to customize the builder for contextual
// constraints and hints. (Name changed to avoid triggering
// -Werror=overloaded-virtual.)
virtual FilterBitsBuilder* GetFilterBitsBuilderInternal(
const FilterBuildingContext&) const {
return GetFilterBitsBuilder();
}

friend FilterBuildingContext;
};

// Return a new filter policy that uses a bloom filter with approximately
Expand Down
Loading