
Forward pointers implementation #651

Merged (28 commits) on Nov 27, 2018

Conversation

@dimosped (Contributor) commented Nov 7, 2018

Solves #650, #547 and #663

Implements three globally configured modes via an env variable (a parsing sketch follows the list):

  • ARCTIC_FORWARD_POINTERS=Disabled: VersionStore operates as it used to.
  • ARCTIC_FORWARD_POINTERS=Enabled: VersionStore only updates the version document with the forward pointers to segments.
  • ARCTIC_FORWARD_POINTERS=Hybrid: compatibility mode. VersionStore updates both the version document with forward pointers and the segments with backwards pointers to parents/versions. This makes it easy to experiment safely with forward pointers and to switch back to the original implementation (Disabled) without issues. In this mode, reads prefer the forward pointers (if they exist) and otherwise fall back to the original backwards-pointer-based reads.
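For illustration, the env variable could map onto an enum along these lines. This is a hypothetical sketch: the FwPointersCfg name and the ARCTIC_FORWARD_POINTERS variable appear in the PR, but the parsing code and the default value here are assumptions.

```python
import os
from enum import Enum

class FwPointersCfg(Enum):
    ENABLED = 0   # version documents carry forward pointers to segments
    DISABLED = 1  # legacy behaviour: segments carry parent back-pointers
    HYBRID = 2    # write both schemes; reads prefer forward pointers

# Hypothetical parsing; the real default in the PR may differ.
ARCTIC_FORWARD_POINTERS = FwPointersCfg[
    os.environ.get('ARCTIC_FORWARD_POINTERS', FwPointersCfg.DISABLED.name).upper()]
```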

@dimosped self-assigned this Nov 7, 2018
@dimosped force-pushed the forward_pointers_final branch from 5af7701 to 0917cc1 on November 7, 2018 11:48
@dimosped requested a review from pablojim on November 7, 2018 12:12
```python
                      {'$set': segment,
                       '$addToSet': {'parent': version['base_version_id']}},
                      upsert=True)
if ARCTIC_FORWARD_POINTERS is FwPointersCfg.DISABLED:
```
Contributor:

Is this block clearer if written as two optional write operations? The second collection.update_one is rewritten but is essentially the same.

Contributor (Author):

I decided to accept some code duplication in order to minimize regression risk, and to keep the existing code path (with disabled fw pointers) as similar as possible.

We can properly refactor this once we build confidence in our implementation.
I am adding a TODO with your suggestion though.
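For reference, the two-optional-writes shape suggested above might look roughly like this. A hypothetical sketch only: the filter spec is an assumption, and collection, symbol, segment, and version come from the surrounding write path.

```python
segment_spec = {'symbol': symbol, 'sha': segment['sha']}

# Unconditional write of the segment document itself.
collection.update_one(segment_spec, {'$set': segment}, upsert=True)

# Optional second write maintaining the legacy parent back-pointer.
if ARCTIC_FORWARD_POINTERS is not FwPointersCfg.ENABLED:
    collection.update_one(segment_spec,
                          {'$addToSet': {'parent': version['base_version_id']}})
```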

```diff
-segments = [x['segment'] for x in collection.find({'symbol': symbol, 'parent': parent_id},
-                                                  projection={'segment': 1},
-                                                  )]
+segments = [x['segment'] for x in collection.find(spec, projection={'segment': 1})]
```
Contributor:

worth stripping _id?

Contributor (Author):

handy debug info indeed, added now

```python
        'segment': {'$lt': to_index}
    }
    if from_index is not None:
        spec['segment']['$gte'] = from_index
else:
    segment_count = version.get('segment_count', None)
```

```python
# We want to use FW pointers to read data if:
# ARCTIC_FORWARD_POINTERS is currently enabled/hybrid AND last version was written with FW pointers,
```
Contributor:

Should we consider making Enabled use the reverse pointers to read, but then checking that the forward pointers reconcile, as a way of validating for now?

Just an idea; I'm not against this approach, just asking what we want to test during the hybrid-mode period.

Contributor (Author):

The written data are verified upon write, i.e. we count the segments matching the write spec which have actually been written:
https://github.com/dimosped/arctic/blob/forward_pointers_final/arctic/store/_ndarray_store.py#L505

TBH I am not against this idea, but maybe as a separate option/flag, added as part of the verification check:
https://github.com/dimosped/arctic/blob/forward_pointers_final/arctic/store/_ndarray_store.py#L505

It would also mean reads/writes would be slower, with the extra step of one query collecting N document IDs.
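For context, the count-based verification described above amounts to something like the following sketch (names and the exact query are assumptions; the real logic lives in check_written in _ndarray_store.py):

```python
from pymongo.errors import OperationFailure

def check_written(collection, symbol, version):
    # Count the segments actually persisted for this version and compare
    # against the segment count recorded in the version document.
    written = collection.count_documents(
        {'symbol': symbol, 'parent': version['base_version_id']})
    if written != version['segment_count']:
        raise OperationFailure('Failed to write all segments for %s: %d != %d'
                               % (symbol, written, version['segment_count']))
```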

Contributor:

Your call.

Contributor (Author), Nov 7, 2018:

I actually went ahead and implemented this, but during tests which force a failed check, a nasty bug was uncovered.

If a mongo_retry-wrapped operation fails once, the successfully written segments always remain (an Arctic design decision which won't change now). The upserted IDs in subsequent retries, though, don't include them, breaking the model.

Working on it and will add explicit tests.
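To illustrate the failure mode (a hypothetical, self-contained repro; the database and collection names are made up): segments persisted by a first, partially failed attempt match rather than upsert on the retry, so the retry's upserted_ids no longer covers every segment of the version.

```python
from pymongo import MongoClient, UpdateOne

coll = MongoClient()['scratch']['segments']
ops = [UpdateOne({'symbol': 'symX', 'segment': i},
                 {'$set': {'symbol': 'symX', 'segment': i}},
                 upsert=True)
       for i in range(3)]

coll.bulk_write(ops)           # imagine this attempt writes some segments, then errors
result = coll.bulk_write(ops)  # mongo_retry re-executes the same operations
# The already-written segments are matched, not upserted, so upserted_ids
# is an incomplete source for the version's full list of segment ObjectIds.
print(result.upserted_ids)     # {} on a full retry: nothing newly upserted
```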

@yschimke (Contributor) commented Nov 7, 2018

Looks great to me, but I don't consider myself qualified to approve

@bmoscon (Collaborator) commented Nov 7, 2018

I don't have an issue with this PR; this is more of a general critique. Rather than using env variables (we have quite a few of them now, all defined in different files), can we also support a YAML file? The default would be env vars, but any values also defined in the YAML file would take precedence.

@yschimke (Contributor) commented Nov 7, 2018

Building on @bmoscon's point, some of these flags could be configured differently per connection; e.g., in the case of a backwards-incompatible flag, a migration script might need to configure two instances with different flag values.

@dimosped (Contributor, Author) commented Nov 7, 2018

I was also thinking of refactoring to create a YAML file or an "arctic/config.py" module which holds all configuration.
The rest of the modules would simply import variables from it, and users would have all config in one common file.

@dimosped (Contributor, Author) commented Nov 7, 2018

Maybe we could even have, instead, a Python dict with default values in config.py.

I always prefer plain Python dicts over YAML when possible, and given that all configs are exposed via env variables, they don't require a code re-release to update values.

@yschimke @bmoscon @jamesblackburn @willdealtry any objections to the above?
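A minimal sketch of the config-module idea being floated here (module layout, key names, and the default are all hypothetical):

```python
# arctic/config.py (hypothetical): plain-dict defaults, overridable via
# env variables, so changing a value needs no code re-release.
import os

DEFAULTS = {
    'ARCTIC_FORWARD_POINTERS': 'DISABLED',
}

def get_config(name):
    """Return the env-var override if set, else the in-code default."""
    return os.environ.get(name, DEFAULTS[name])
```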

@yschimke (Contributor) commented Nov 7, 2018

@dimosped no objections; it might be best done as a separate PR, moving everything over for consistency.

@dimosped (Contributor, Author) commented Nov 7, 2018

@yschimke agreed. I will add even more config options in the next PR for the internal async work, so it would be best to refactor then.

@dimosped (Contributor, Author) commented Nov 8, 2018

Thanks to @yschimke's suggestion for the reconciliation option, @bmoscon @jamesblackburn, this was a nasty one with partial segment writes between mongo_retries:
dimosped@f2bb752

Need to also correct/update the versions/segments garbage collection, and make it compatible with versions which were created with forward pointers.

@jamesblackburn (Contributor) left a comment:

Thanks Dimos - tricky change! One thing is to try to be as forwards compatible as possible. IMO new versions of arctic should be able to read either pointer scheme without environment variables being set... Similarly new versions of arctic shouldn’t corrupt existing libraries / writes.

I wonder if we need a version scheme on the library or similar, so we can raise early if the arctic version is too old to read data correctly?

Will need some battle testing!

arctic/_util.py (review thread resolved)
```python
# If this is the first use of fw-pointers, query for the full list of segment IDs,
# or continue by extending the IDs from the previous version
if previous_version and FW_POINTERS_KEY not in previous_version:
    segment_ids = {_id for _id in collection.find({'symbol': symbol, 'parent': version_base_or_id(version)},
```
Contributor:

Worth asserting here the segment id count is right (as is done in the read path)?

Contributor (Author):

check_written has been updated to make this check both for fw pointers and legacy.

Contributor (Author):

Also, I slightly modified the query to make it targeted at the version.

Contributor (Author):

@yschimke @jamesblackburn @bmoscon I added a check here.
James has a point: check_written is only called for writes, not appends, so it is safer to double-check here.

I am now marking this conversation as resolved.
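The added check presumably compares the collected forward-pointer IDs against the version's expected segment count, along these lines (a sketch with assumed names; segment_ids, version, and symbol come from the surrounding append path):

```python
from pymongo.errors import OperationFailure

# Fail fast if the collected forward-pointer IDs disagree with the
# segment count recorded in the version document.
if len(segment_ids) != version['segment_count']:
    raise OperationFailure(
        'Forward pointers for %s inconsistent: expected %d segments, found %d'
        % (symbol, version['segment_count'], len(segment_ids)))
```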

arctic/store/_ndarray_store.py (outdated; review thread resolved)
```python
# The reason is that if only some of the segments get written and we re-execute
# write() via mongo_retry, the upserted_id(s) won't be a complete list of all
# segment ObjectIDs.
shas_to_ids = {x['sha']: x['_id']
               for x in collection.find({'symbol': symbol}, projection={'sha': 1, '_id': 1})}
```
Contributor:

This won’t be covered by an index, will it? Also looks quite expensive as you’re iterating over every chunk for a symbol - and mongo will thrash the WT cache as it loads each into memory...

Should we be using shas instead?

Contributor (Author):

I will check this tomorrow with an explain() and see what happens, because _id is quite special and we already have an index for sha.

If this proves indeed to be expensive, I will consider the switch to shas.

Contributor (Author), Nov 16, 2018:

Before sharing the explain() results, some benchmarks vs. this:
https://github.com/manahl/arctic/blob/master/arctic/store/_ndarray_store.py#L515-L519

[benchmark timings screenshot]

I assume the extra time is only for the transfer of the ObjectIDs, and execution is acceptably fast.
IMHO the extra milliseconds, within almost a second of total read-query time, can be considered negligible.

Will share the explain() findings shortly.

Contributor (Author), Nov 16, 2018:

@jamesblackburn I am not certain, but I would expect (no evidence here) that MongoDB is smart enough to have a light in-memory structure maintaining the relation of an index to _id values, i.e. not having to iterate and really fetch the documents just to fetch the IDs, given you are hitting only fields which are part of an index.

Do we have a way to evaluate the WT cache-pollution effects? @rob256 @bmoscon any ideas here?

Contributor (Author), Nov 16, 2018:

Interesting: @jamesblackburn, look at how the projection changes the index used between the first and the last queries.

This raises some suspicion that the WT cache might indeed get polluted.
When only sha is in the projection, it selects the symbol_sha index, while when _id is in the projection, it picks the symbol_hashed index.

[explain() output screenshot]


Contributor (Author):

Just noticed:

```python
eqlib._collection.find({'symbol': symbol}, projection={'sha': 1, '_id': 1}).explain()
```

has a FETCH stage, while this doesn't:

```python
eqlib._collection.find({'symbol': symbol}, projection={'sha': 1, '_id': 0}).explain()
```

It is quite obvious that @jamesblackburn's intuition is correct: we would be polluting the cache, and it is a good idea to convert to using SHAs instead, or add an index (symbol, sha, _id).

I'd rather use SHA though, for obvious reasons.
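For reference, whether a query is covered by an index can be checked programmatically from explain() output: a covered plan contains no FETCH stage. A minimal sketch (the database and collection names are made up):

```python
from pymongo import MongoClient

coll = MongoClient()['arctic_scratch']['symbol.data']  # hypothetical names

def has_fetch(stage):
    # Recursively search the winning plan tree for a FETCH stage.
    if stage.get('stage') == 'FETCH':
        return True
    children = [stage['inputStage']] if 'inputStage' in stage else []
    children += stage.get('inputStages', [])
    return any(has_fetch(child) for child in children)

plan = coll.find({'symbol': 'symX'},
                 projection={'sha': 1, '_id': 0}).explain()
print('covered by index:', not has_fetch(plan['queryPlanner']['winningPlan']))
```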

```python
                              projection={'sha': 1, '_id': 0},
                              )
    ])
if ARCTIC_FORWARD_POINTERS is FwPointersCfg.DISABLED:
```
Contributor:

Again, shouldn't this be based on previous_version, not on the environment variable?

I'm worried that people / code writing to different libraries with different env variables set will leave the versions in a mess.

Contributor (Author):

I have already updated this in my last commit: wherever it makes sense, it is now based on the previous version's configuration.

```diff
     )
     return [version["base_version_id"] for version in cursor]

-    def _prune_previous_versions(self, symbol, keep_mins=120, keep_version=None):
+    def _prune_previous_versions(self, symbol, keep_mins=0.01, keep_version=None):
```
Contributor:

Why are you turning keep_mins down?

This is / was needed to handle replication lags etc. Need to ensure this is larger than the maximum lag, or secondary reads can fail.

Contributor (Author):

Ignore this; I set it for testing the pruning. When the PR reaches its final form, I will revert it.

Contributor (Author):

I will resolve this conversation once I switch it back.

tests/integration/store/test_version_store.py (review thread resolved)
@dimosped (Contributor, Author) commented Nov 16, 2018

@jamesblackburn @yschimke my last commit made this change fully forward/backwards compatible between enabled fw pointers, legacy parent pointers, and hybrid.

Tomorrow I will finalize the pruning, with respect to the last change for compatibility.

I will also address the comments.

@dimosped (Contributor, Author) commented Nov 16, 2018

@jamesblackburn @yschimke forgot to mention that an extra aim was to make this also possible/supported:

  • user tries out FW pointers in Hybrid mode (e.g. with arctic v1.73.0) and writes symbolX
  • then another application which is still on e.g. v1.67.1 can read/write symbolX, as we are fully backwards compatible
  • arctic v1.73.0 can re-read/re-write symbolX touched by v1.67.1
  • user switches to FW Enabled (only forward pointers), and now v1.67.1 of course can no longer read/write
  • user switches back to FW Hybrid or FW Disabled and touches symbolX again
  • v1.67.1 can read/write symbolX again

@yschimke (Contributor):

LGTM. Since this is optional, I suggest landing this and adding the tests in a follow-up PR. It's a big PR already.

@dimosped force-pushed the forward_pointers_final branch from a43e2e8 to ec4adee on November 22, 2018 04:35
@dimosped merged commit 6735c04 into man-group:master on Nov 27, 2018
@yschimke (Contributor): 🥳

shashank88 added a commit to shashank88/arctic that referenced this pull request Jul 9, 2019