-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mas i350 selectivesync #1794
Mas i350 selectivesync #1794
Conversation
backend (use backend's config value) one (sync co-ordinating write only, or backend config value if set to sync there) all (sync all writes irrespective of backend config value)
Fixed error in spec on riak_kv_wm_object:malformed_custom_param
Normalised sync_on_write as atom
to xref/dialyzer/eunit pass
Default bucket types required
This PR adds selective sync support only with leveled and eleveldb backends (there is a backend_capability called flush_put and only backends with this capability receive non-standards PUT requests). Adding support for the bitcask backend is trivial, as the bitcask:sync/1 just needs to be called after a put when flush_put/4 is used. This would have no change to any existing bitcask user, as the default bucket configuration is to revert to @martincox - would you be OK if bitcask support was added to this PR? Normally I try and avoid bitcask changes, to avoid any risk for you guys. The only change required will be in riak_kv_bitcask_backend (not in the actual bitcask repo). Although you may not use the feature, it might be easier to implement across all backends rather than manage the ongoing caveat that it only works in some backends |
Sync straight after a PUT, if strategy is not already to o_sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some smaller comments, but did not find serious issues
%% Setup write options... | ||
WriteOpts2 = case Sync of | ||
true -> | ||
lists:keyreplace(sync,1,WriteOpts, {sync,Sync}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observe that keyreplace does not add sync in this parameter is missing in the WriteOpts. Is that really what you want?
This is not equal to keydelete(sync, 1, WriteOpts) ++ [{sync, Sync}]
, which intuitively one probably wants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This relies on the presence of sync in the eleveldb cuttlefish:
https://github.com/basho/eleveldb/blob/develop-3.0/priv/eleveldb.schema#L32-L39
which then becomes part of the write_opts at startup:
So it will always be there, base don the current logic. However, this does appear to be quite a brittle chain.
src/riak_kv_eleveldb_backend.erl
Outdated
{ok, state()} | | ||
{error, term(), state()}. | ||
flush_put(Bucket, PrimaryKey, IndexSpecs, Val, State) -> | ||
Sync = true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check comment in leveled_backend
, probably just write the boolean as argument
expand_sync_on_write(default, BucketProps) -> | ||
normalize_value(get_bucket_option(sync_on_write, BucketProps)); | ||
expand_sync_on_write(Value, _BucketProps) -> | ||
Value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should you not normalise here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think normalize_value may be unnecessary in both cases due to validation at the API/riak_kv_bucket
Add selective sync to Riak KV. Prior to this change either all vnodes sync'd a PUT (i.e. flushed to disk as part of accepting the PUT) or no vnodes. With no sync the flush decision is controlled by either the database (bitcask, eleveldb) or the operating system (leveled).
This changes allows for three options:
backend - rely on backend configuration as now
all - force a sync on all vnodes regardless of backend configuration
one - force a sync only on the coordinating vnode
The sync option can be set on a per-bucket basis, and over-written on a per-write basis. This provides much more flexibility.
Previously customer who wanted to make sure data was flushed to disk in a least one place for some buckets, had to set backend sync settings to sync, and sync to all vnodes on every write on every bucket. Throughput between sync and non-sync configurations have been discovered to be large even with hardware acceleration (e.g. flash-backed write caches) - see martinsumner/leveled#350