-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose snapshot index through API for S3 #24288
Labels
:Distributed Coordination/Snapshot/Restore
Anything directly related to the `_snapshot/*` APIs
>feature
Comments
abeyad
pushed a commit
to abeyad/elasticsearch
that referenced
this issue
May 10, 2017
Currently, the get snapshots API (e.g. /_snapshot/{repositoryName}/_all) provides information about snapshots in the repository, including the snapshot state, number of shards snapshotted, failures, etc. In order to provide information about each snapshot in the repository, the call must read the snapshot metadata blob (`snap-{snapshot_uuid}.dat`) for every snapshot. In cloud-based repositories, this can be expensive, both from a cost and performance perspective. Sometimes, all the user wants is to retrieve all the names/uuids of each snapshot, and the indices that went into each snapshot, without any of the other status information about the snapshot. This minimal information can be retrieved from the repository index blob (`index-N`) without needing to read each snapshot metadata blob. This commit enhances the get snapshots API with an optional `verbose` parameter. If `verbose` is set to false on the request, then the get snapshots API will only retrieve the minimal information about each snapshot (the name, uuid, and indices in the snapshot), and only read this information from the repository index blob, thereby giving users the option to retrieve the snapshots in a repository in a more cost-effective and efficient manner. Closes elastic#24288
abeyad
pushed a commit
that referenced
this issue
May 10, 2017
…24477) Currently, the get snapshots API (e.g. /_snapshot/{repositoryName}/_all) provides information about snapshots in the repository, including the snapshot state, number of shards snapshotted, failures, etc. In order to provide information about each snapshot in the repository, the call must read the snapshot metadata blob (`snap-{snapshot_uuid}.dat`) for every snapshot. In cloud-based repositories, this can be expensive, both from a cost and performance perspective. Sometimes, all the user wants is to retrieve all the names/uuids of each snapshot, and the indices that went into each snapshot, without any of the other status information about the snapshot. This minimal information can be retrieved from the repository index blob (`index-N`) without needing to read each snapshot metadata blob. This commit enhances the get snapshots API with an optional `verbose` parameter. If `verbose` is set to false on the request, then the get snapshots API will only retrieve the minimal information about each snapshot (the name, uuid, and indices in the snapshot), and only read this information from the repository index blob, thereby giving users the option to retrieve the snapshots in a repository in a more cost-effective and efficient manner. Closes #24288
abeyad
pushed a commit
that referenced
this issue
May 10, 2017
…24477) Currently, the get snapshots API (e.g. /_snapshot/{repositoryName}/_all) provides information about snapshots in the repository, including the snapshot state, number of shards snapshotted, failures, etc. In order to provide information about each snapshot in the repository, the call must read the snapshot metadata blob (`snap-{snapshot_uuid}.dat`) for every snapshot. In cloud-based repositories, this can be expensive, both from a cost and performance perspective. Sometimes, all the user wants is to retrieve all the names/uuids of each snapshot, and the indices that went into each snapshot, without any of the other status information about the snapshot. This minimal information can be retrieved from the repository index blob (`index-N`) without needing to read each snapshot metadata blob. This commit enhances the get snapshots API with an optional `verbose` parameter. If `verbose` is set to false on the request, then the get snapshots API will only retrieve the minimal information about each snapshot (the name, uuid, and indices in the snapshot), and only read this information from the repository index blob, thereby giving users the option to retrieve the snapshots in a repository in a more cost-effective and efficient manner. Closes #24288
clintongormley
added
:Distributed Coordination/Snapshot/Restore
Anything directly related to the `_snapshot/*` APIs
and removed
:Plugin Repository S3
labels
Feb 14, 2018
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore
Anything directly related to the `_snapshot/*` APIs
>feature
Elasticsearch allows to fetch snapshot information in two different ways. Either a specific snapshot of a repository or a pattern can be specified, or info for all snapshots can be retrieved:
Having S3 as the snapshot location this means 1 request to get the repository index file from S3 and 1 request for each snapshot. This means each time
_all
is calledN+1
request are made to S3.It would be nice to have an API endpoint the exposes the data from the S3 repository index file with all the snapshot in it. This would allow to implement a logic on the client side for which snapshots the detailed stats should be fetched.
An example use case would be a snapshot that is created every 30 minutes with
snapshot-name-timestamp
and now on the client side I would like to fetch the stats for the most recent snapshot. If I don't know the exact timestamp without the above API I would have to doN+1
requests to S3 which can become expensive. Otherwise I could do 1 request for the repository index and 1, sort the snapshot list on the client side and fetch stats for the first entry.@abeyad FYI
The text was updated successfully, but these errors were encountered: