Discussion: Support for multiple blockstores. #3119
This is great! On my stack to review. I'm a bit behind (reviewing 0.4.3 to …
Each datastore is mounted under a different mount point, and a multi-blockstore is used to check each mount point for a block. The first mount checked by the multi-blockstore is considered the "cache"; all others are considered read-only. This implies that the garbage collector only removes blocks from the first mount. This change also factors out the pinlock from the blockstore into its own structure. Only the multi-datastore now implements the GCBlockstore interface; in the future this could be separated out from the blockstore completely. For now, caching is only done on the first mount; in the future this could be reworked. The bloom filter is the most problematic, as the read-only mounts are not necessarily immutable and can be changed by methods outside of the blockstore. Right now there is only one mount, but that will soon change once support for the filestore is added. License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
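For context, a minimal sketch of what factoring the pinlock into its own structure could look like. The GCLocker name and the unlock-function API here are assumptions for illustration, not the actual code:

```go
package blockstore

import "sync"

// GCLocker is a sketch of the pinlock factored out of the blockstore
// into its own structure.
type GCLocker struct {
	mu sync.RWMutex
}

// PinLock is held while pinning so that a GC run cannot start
// concurrently; the returned function releases the lock.
func (l *GCLocker) PinLock() func() {
	l.mu.RLock()
	return l.mu.RUnlock
}

// GCLock is held for the duration of a GC run and excludes all pinning.
func (l *GCLocker) GCLock() func() {
	l.mu.Lock()
	return l.mu.Unlock
}
```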
NOTE: This is not directly related to this issue, but rather meant as a private note for @kevina. Alright, so for the purpose of the filestore integration, here's what I would like to see: a separate blockstore implementation for the filestore. This will handle both filestore reads and writes and normal blockstore reads and writes, based on the blocks it is given. (if the block is a …
Can we decouple garbage collection and datastores? I want to make a datastore that can be used by many nodes (Swift Object Store), thus giving the impression of a cluster--the only reason I have multiple nodes is for redundancy and chunking performance. This could work well with Go plugins (https://tip.golang.org/pkg/plugin/), allowing others to offer BlockStores as plugins. @whyrusleeping @kevina Also, I want to handle garbage collection manually; the model I want to use mandates a separate index, which means IPFS's GC can't work properly for my use case.
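As an illustration of the plugin idea, here is a sketch using the package linked above; the shared-object name, the NewBlockstore symbol, and its signature are all assumptions:

```go
package plugins

import (
	"fmt"
	"plugin"
)

// Blockstore stands in for the real blockstore interface; an actual
// plugin mechanism would share this type via a common package.
type Blockstore interface{}

// loadBlockstorePlugin opens a shared object and looks up an assumed
// exported constructor named NewBlockstore.
func loadBlockstorePlugin(path string) (Blockstore, error) {
	p, err := plugin.Open(path) // e.g. "swift-blockstore.so"
	if err != nil {
		return nil, err
	}
	sym, err := p.Lookup("NewBlockstore")
	if err != nil {
		return nil, err
	}
	construct, ok := sym.(func() Blockstore)
	if !ok {
		return nil, fmt.Errorf("plugin %s: NewBlockstore has unexpected type", path)
	}
	return construct(), nil
}
```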
The filestore is a datastore, but it is only designed to handle a subset of the blocks that can be used in IPFS; therefore the main datastore is still needed, along with some sort of support for multiple blockstores or datastores, so that the filestore and the main datastore can coexist. This is a required infrastructure change in order to land #2634.
The following describes how it is currently implemented. Please let me know if you agree with and understand the changes. Once there is a general consensus, I can separate out the non-filestore bits that provide this infrastructure change so we can work through the implementation details.
Sorry if it is a bit long.
@whyrusleeping please CC anyone else who should be involved.
Overview
There are several ways to support the "filestore". What I believe makes the most sense, and will be the easiest to implement, is to support a "cache" and then any number of additional "aux" datastores with the following semantics:

- When looking up a block, each datastore is checked in turn, with the "cache" checked first.
- All Puts and Deletes go only to the "cache"; the "aux" datastores are treated as read-only.

These rules imply that the garbage collector should only attempt to remove data from the "cache" and leave the other datastores alone. A sketch of the two rules follows.
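In this sketch, multiBS, Key, Block, and the mount interface are illustrative placeholders, not the proposed implementation:

```go
package multiblockstore

import "errors"

var ErrNotFound = errors.New("block not found in any mount")

type Key string
type Block []byte

type mount interface {
	Get(Key) (Block, error)
	Put(Block) error
}

type multiBS struct {
	mounts []mount // mounts[0] is the "cache"
}

// Get checks each mount in turn, cache first.
func (m *multiBS) Get(k Key) (Block, error) {
	for _, bs := range m.mounts {
		if b, err := bs.Get(k); err == nil {
			return b, nil
		}
	}
	return nil, ErrNotFound
}

// Put writes only to the cache; the aux mounts are never modified.
func (m *multiBS) Put(b Block) error {
	return m.mounts[0].Put(b)
}
```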
High-level implementation details
The multiplexing can happen at either the datastore or the blockstore level. I originally implemented it at the datastore level, but changed it to the blockstore level to better interact with caching. The filestore is still implemented as a datastore (for now).
In the `fsrepo`, normal blocks are mounted under the `/blocks` prefix (this is unchanged) and the filestore is mounted under the `/filestore` prefix (this is new). The `fsrepo` has been enhanced to be able to retrieve the underlying datastore based on its prefix. (This is required by the filestore.)

The top-level blockstore is now a multi-blockstore that works by checking a pre-configured set of prefixes in turn in order to find a matching key. Each mount is wrapped in its own blockstore with its own caching semantics. The first mount, `/blocks`, is considered the cache, and all Puts and Deletes go only to the cache. The multi-blockstore interface is as follows:
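(A minimal sketch: only FirstMount() and Locate() are named elsewhere in this proposal; Mounts() and LocateInfo are assumptions filled in around them.)

```go
package multiblockstore

// Blockstore stands in for the existing go-ipfs blockstore interface
// (Get/Put/Has/DeleteBlock/AllKeysChan).
type Blockstore interface{}

type Key string

// LocateInfo reports one mount under which a key was found.
type LocateInfo struct {
	Prefix string // mount prefix, e.g. "/blocks" or "/filestore"
	Error  error  // non-nil if checking this mount failed
}

type MultiBlockstore interface {
	Blockstore // Puts and Deletes go only to the first mount

	// FirstMount returns the first mount checked: the cache.
	FirstMount() Blockstore

	// Mounts lists the mount prefixes in the order they are checked.
	Mounts() []string

	// Locate lists every mount under which the key is found, which
	// can help identify duplicate blocks.
	Locate(k Key) []LocateInfo
}
```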
The garbage collector uses `FirstMount().AllKeysChan(ctx)` to get the list of blocks to try to delete.
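A sketch of that loop, with the interfaces trimmed to the two methods the sketch needs; the KeySet pin-set check is an assumption:

```go
package gc

import "context"

type Key string

type Blockstore interface {
	AllKeysChan(ctx context.Context) (<-chan Key, error)
	DeleteBlock(Key) error
}

type KeySet interface{ Has(Key) bool }

// collectGarbage enumerates and deletes candidates from the first
// mount only; the aux mounts are never touched.
func collectGarbage(ctx context.Context, first Blockstore, pinned KeySet) error {
	keys, err := first.AllKeysChan(ctx) // first = FirstMount()
	if err != nil {
		return err
	}
	for k := range keys {
		if pinned.Has(k) {
			continue // pinned blocks are never collected
		}
		if err := first.DeleteBlock(k); err != nil {
			return err
		}
	}
	return nil
}
```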
Caching is currently only done on the first mount.
As an implementation detail it is worth noting that files are removed or added to the filestore directly using a specialized interface that bypasses the normal Blockstore and Filestore interface. This was discussed with @whyrusleeping (#2634 (comment)).
Duplicate blocks (that is, blocks found under more than one mount) are not forbidden, as forbidding them would be impractical. The Locate() method can be used to discover which mount a block is found under; it will list all mounts, which can help eliminate the duplicates.
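For example, building on the interface sketched above, a caller could report duplicates like this (the function itself is illustrative):

```go
package multiblockstore

import "fmt"

// reportDuplicates uses Locate to show every mount a block is stored
// under; more than one entry means the block is duplicated.
func reportDuplicates(mbs MultiBlockstore, k Key) {
	infos := mbs.Locate(k)
	if len(infos) > 1 {
		fmt.Printf("%s is stored under %d mounts:\n", k, len(infos))
	}
	for _, info := range infos {
		fmt.Println("  found under", info.Prefix) // e.g. "/blocks", "/filestore"
	}
}
```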
Other uses
The two mounts `/blocks` and `/filestore` are currently hard-coded; with some effort this could be made into a more general-purpose mechanism to support multiple blockstores.

One use case I can think of is to have a separate read-only datastore to store permanent content, as an alternative to maintaining a large pin-set, which currently has performance problems. The datastore could even be on a read-only filesystem to prevent any possibility of the data accidentally being deleted, either by user error or by a software bug. Some additional design decisions will need to be made for this, so I am not proposing it right now, but merely offering it as a possibility.
Another possibility is to support a local cache on a local filesystem and a larger datastore in the cloud.