can't recover index from snapshot to multi-node cluster #11004

minde-eagleeye · 2015-05-06T09:29:32Z

I had a one-node "cluster" running 1.3.4, made a snapshot, and tried to restore it on a cluster with 3 data nodes (version 1.5.2).

I copied the snapshot to the master node of the new cluster and ran the restore. Most of the indices restored perfectly fine, but some had this issue:
The index has 2 primary shards. One of them is recovered on the master node. The second one tries to recover on any node but the master, and it keeps looping between the two other nodes trying to initialize, but it never goes to the master, where the snapshot files are.

What I noticed is that the recovery data for that index is transferred to one of the non-master nodes, but only the first shard's documents are there; the second shard's documents are not transferred.

The logs keep outputting the same message, which I think is related to #9433.

Error logs from a non-master node:

[2015-05-06 08:57:24,192][WARN ][indices.cluster          ] [esdn0001.dev.localdomain] [[phoenix_basket_20140910][0]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [phoenix_basket_20140910][0] failed recovery
        at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:162)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.snapshots.IndexShardRestoreFailedException: [phoenix_basket_20140910][0] restore failed
        at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.restore(IndexShardSnapshotAndRestoreService.java:135)
        at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:109)
        ... 3 more
Caused by: org.elasticsearch.index.snapshots.IndexShardRestoreFailedException: [phoenix_basket_20140910][0] failed to restore snapshot [snapshot_1]
        at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository.restore(BlobStoreIndexShardRepository.java:164)
        at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.restore(IndexShardSnapshotAndRestoreService.java:126)
        ... 4 more
Caused by: org.elasticsearch.index.snapshots.IndexShardRestoreFailedException: [phoenix_basket_20140910][0] failed to read shard snapshot file
        at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$Context.loadSnapshot(BlobStoreIndexShardRepository.java:318)
        at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$RestoreContext.restore(BlobStoreIndexShardRepository.java:710)
        at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository.restore(BlobStoreIndexShardRepository.java:162)
        ... 5 more
Caused by: java.io.FileNotFoundException: /var/log/elasticsearch/snapshots/indices/phoenix_basket_20140910/0/snapshot-snapshot_1 (No such file or directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:146)
        at org.elasticsearch.common.blobstore.fs.FsBlobContainer.openInput(FsBlobContainer.java:87)
        at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$Context.loadSnapshot(BlobStoreIndexShardRepository.java:315)
        ... 7 more

If you need any additional data let me know.

A workaround is to copy the snapshot to all nodes; the restore then completes correctly.
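
To confirm which shard files a given node can actually see, something like the following could be run on each node. This is only a minimal sketch: the path layout is taken from the FileNotFoundException in the trace above (repository location, then indices/<index>/<shard>/snapshot-<snapshot name>), and the function names are hypothetical helpers, not part of Elasticsearch.

```python
import os


def shard_snapshot_path(location, index, shard, snapshot):
    """Build the shard-level snapshot file path in the layout seen in the stack trace."""
    return os.path.join(location, "indices", index, str(shard), "snapshot-" + snapshot)


def missing_shard_snapshots(location, index, shard_count, snapshot):
    """Return the shard numbers whose snapshot file is not visible from this node."""
    return [
        shard
        for shard in range(shard_count)
        if not os.path.isfile(shard_snapshot_path(location, index, shard, snapshot))
    ]


if __name__ == "__main__":
    # Values copied from the error log; adjust for your own repository.
    missing = missing_shard_snapshots(
        "/var/log/elasticsearch/snapshots", "phoenix_basket_20140910", 2, "snapshot_1"
    )
    print("shard snapshot files missing on this node:", missing)
```

On a node that does not have the snapshot directory, every shard number is reported as missing, which matches the failure shown in the log.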

imotov · 2015-05-06T15:04:11Z

@minde-eagleeye "The path specified in the location parameter should point to the same location in the shared filesystem and be accessible on all data and master nodes". Please see http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html#_shared_file_system_repository for more information. Please use our mailing list or forums at http://discuss.elastic.co if you have any additional questions.
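
For illustration, here is a rough sketch of the two REST calls involved: registering a shared file system repository and restoring a snapshot from it. The endpoints (PUT /_snapshot/<repository> with type "fs", and POST /_snapshot/<repository>/<snapshot>/_restore) follow the documentation linked above. The host, the repository name my_backup, and the mount point /mount/backups/my_backup are assumptions made for the example, and the helper names are hypothetical. The location has to be a shared mount (NFS, for instance) that resolves to the same path on every master and data node.

```python
import json
import urllib.request


def build_register_repo_request(base_url, repo, location):
    """Build (but do not send) the PUT request that registers an fs repository."""
    body = {"type": "fs", "settings": {"location": location, "compress": True}}
    return urllib.request.Request(
        url="{}/_snapshot/{}".format(base_url.rstrip("/"), repo),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_restore_request(base_url, repo, snapshot):
    """Build (but do not send) the POST request that restores a snapshot."""
    return urllib.request.Request(
        url="{}/_snapshot/{}/{}/_restore".format(base_url.rstrip("/"), repo, snapshot),
        data=b"",
        method="POST",
    )


if __name__ == "__main__":
    # Assumed values: a local node and a shared mount visible on all nodes.
    register = build_register_repo_request(
        "http://localhost:9200", "my_backup", "/mount/backups/my_backup"
    )
    restore = build_restore_request("http://localhost:9200", "my_backup", "snapshot_1")
    for req in (register, restore):
        print(req.get_method(), req.full_url)
    # To actually send a request: urllib.request.urlopen(register)
```

The requests are only constructed here, not sent, so the sketch can be inspected without a running cluster.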

minde-eagleeye · 2015-05-06T15:19:47Z

@imotov Thank you, I missed the "shared filesystem" bit.

minde-eagleeye closed this as completed May 6, 2015
