Enabled the use of secondary Azure storage endpoint in addition to th… #93

Closed
wants to merge 6 commits into from

craigwi
@craigwi craigwi commented Jun 16, 2015

…e primary end point:

  • Added account2 and key2 to specify the alternate storage; this account is always accessed through the secondary endpoint
  • Added “use_secondary” to the Azure repository settings; if true, uses the account2/key2 and secondary endpoint; otherwise, use the original account/key and primary endpoint
  • Avoid writing to the secondary endpoint during snapshot repository registration
  • Fixed the encapsulation of AzureBlobStore class so that it controls whether to pass on the useSecondary flag; this avoids having a mode in AzureStorageServiceImpl as to which client is selected.

@craigwi
Author

craigwi commented Jun 16, 2015

I signed the CLA.

@pickypg
Member

pickypg commented Jun 16, 2015

Relates to #90

@srijan55

this will be immensely helpful in multi-tenant architectures where you move customers from one stamp in a region to another, when their vicinity changes

@dadoonet
Member

Hi there!

Many thanks for the pull request. I really appreciate it!

I'd like to understand the use case completely here. And to be honest, I think the code I wrote 2 years ago was not that good: the Azure client is created once in the running node, which does not allow creating a repository through the REST API and setting credential information at creation time.

That's basically what you tried to solve with this PR.
But what would happen then, if someone needs a 3rd Storage instance in another DC for example?

So, I'm going to revisit the code to fully support what we have today with the AWS plugin.

At the end, we should be able to do:

PUT _snapshot/my_azure_repository1
{
    "type": "azure",
    "settings": {
        "account": "AZURE_ACCOUNT1",
        "key": "AZURE_KEY1"
    }
}

PUT _snapshot/my_azure_repository2
{
    "type": "azure",
    "settings": {
        "account": "AZURE_ACCOUNT2",
        "key": "AZURE_KEY2"
    }
}

@craigwi Would that solve what you are looking for?

We might be missing BTW a readonly global option for repositories (like what we implicitly have with url repositories which are read only). @imotov WDYT?

@imotov

imotov commented Jun 18, 2015

@dadoonet sounds like a good idea to me. Could you open a feature request to support readonly flag on all repositories?

@craigwi
Author

craigwi commented Jun 18, 2015

I like the idea of specifying the account in the repository settings, but the key should be specified in the startup configuration. This follows the pattern and rationale set by the path.repo setting which ensures that ES only writes to pre-configured locations. Specifying the key in the repository settings would allow any azure storage to be written to.

The standard practice to access the secondary endpoint in Azure client libraries (including the Java one) is through the LocationMode setting in the BlobRequestOptions class. I added a bool repository setting "use_secondary", but you could generalize that and have a "location_mode" setting (primary being the default).

I like the addition of a global readonly option for repositories, but would not want such a change to block this specific work. Also, as may be clear, the secondary end point in Azure is, by definition, read only.

Thus the primary Azure repository settings available would be:
{
    "type": "azure",
    "settings": {
        "account": "",
        "location_mode": "<Primary or Secondary or ...>",
        "container": "",
        "compress": ""
    }
}
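
The step from a boolean "use_secondary" to a general "location_mode" could be sketched as follows. This is a minimal Python illustration rather than the plugin's Java code: `resolve_location_mode` is a hypothetical helper, and the lowercase string values are an assumption that mirrors the four values of the Azure client's LocationMode enum.

```python
from enum import Enum


class LocationMode(Enum):
    # Names mirror the LocationMode values of the Azure storage client;
    # the lowercase string forms are an assumption for this sketch.
    PRIMARY_ONLY = "primary_only"
    PRIMARY_THEN_SECONDARY = "primary_then_secondary"
    SECONDARY_ONLY = "secondary_only"
    SECONDARY_THEN_PRIMARY = "secondary_then_primary"


def resolve_location_mode(settings):
    """Hypothetical helper: derive a location mode from repository settings.

    Prefers the generalized "location_mode" string, falls back to the
    boolean "use_secondary", and defaults to the primary endpoint.
    """
    mode = settings.get("location_mode")
    if mode is not None:
        # Enum lookup by value raises ValueError for an unknown mode.
        return LocationMode(mode.lower())
    if settings.get("use_secondary", False):
        return LocationMode.SECONDARY_ONLY
    return LocationMode.PRIMARY_ONLY
```

With this shape, a repository that sets neither option keeps today's behavior (primary only), and the boolean becomes a special case of the more general setting.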

It would be valuable for us to have this in version 2.7.x of the cloud-azure plugin for use with ES 1.6. Is that possible? How can I help make that happen?

Craig.

…figuration allows an array of

storage accounts and keys; account2 and key2 removed.  Only one active location_mode per account allowed.
@craigwi
Author

craigwi commented Jun 21, 2015

I pushed a set of changes to match my proposal above.

Craig.

@craigwi
Author

craigwi commented Aug 14, 2015

I learned on a call today that the full description of the feature, which I put into the release notes, didn't show up here clearly. So, here is a copy of those notes from https://github.com/craigwi/elasticsearch-cloud-azure/releases/tag/v2.7.1-craigwi:

1. Supports multiple storage accounts; cloud.azure.storage.account and cloud.azure.storage.key may now be an array of accounts / keys; the arrays must be the same length; the account at an index must match the key at the same index.

2. An Azure repository specification ("type" : "azure") allows for two new settings. "account" specifies the name of the account to be used and must be one of the items in cloud.azure.storage.account. If "account" is not specified, the first item in the list of accounts is used. The other new setting "location_mode" may be used to specify the endpoint. This defaults to "primary_only" and may also be "primary_then_secondary", "secondary_only" or "secondary_then_primary".

3. When a repository is registered using "secondary_only" or "secondary_then_primary" as the "location_mode", the verification of the repository is limited to checking that the container specified exists; in particular the tests-* files are not created because the secondary endpoint is read only.

NOTE: for a given storage account, only one location_mode can be active at a time.

An example showing settings in elasticsearch.yml:

cloud.azure.storage.account: [ "azstorageaccount1", "mystorage2" ]
cloud.azure.storage.key: [ "", "" ]

A sample repository specification using the secondary endpoint:

{ "type": "azure", "settings": { "account" : "mystorage2", "container": "snapshots-20150701",
"location_mode": "secondary_only"}}
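
The three rules in these notes can be sketched in a few lines. This is an illustrative Python model under stated assumptions, not the plugin's Java implementation: `pair_accounts`, `resolve_repository`, and the `verify_by_writing` flag are hypothetical names, and the key strings in the usage note are placeholders.

```python
READ_ONLY_MODES = {"secondary_only", "secondary_then_primary"}
VALID_MODES = READ_ONLY_MODES | {"primary_only", "primary_then_secondary"}


def pair_accounts(accounts, keys):
    """Rule 1: pair each account with the key at the same index."""
    if len(accounts) != len(keys):
        raise ValueError("account and key arrays must be the same length")
    return dict(zip(accounts, keys))


def resolve_repository(credentials, repo_settings):
    """Rules 2 and 3: pick the account, the mode, and the verification style."""
    if not credentials:
        raise ValueError("no storage accounts configured")
    # Dicts keep insertion order, so the first configured account is the default.
    account = repo_settings.get("account", next(iter(credentials)))
    if account not in credentials:
        raise ValueError(f"unknown storage account: {account}")
    mode = repo_settings.get("location_mode", "primary_only")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown location_mode: {mode}")
    return {
        "account": account,
        "key": credentials[account],
        "location_mode": mode,
        # The secondary endpoint is read only, so verification must not
        # create test files there; it only checks that the container exists.
        "verify_by_writing": mode not in READ_ONLY_MODES,
    }
```

For example, `resolve_repository(pair_accounts(["azstorageaccount1", "mystorage2"], ["key1", "key2"]), {"account": "mystorage2", "location_mode": "secondary_only"})` selects the second account and turns off write-based verification.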

Let me know if you have any questions.

Craig.

@ppf2
Member

ppf2 commented Aug 14, 2015

Thanks @craigwi.

@dadoonet @imotov This was discussed at our call with @skearns64 and he asked @craigwi to post details above to help clear the misunderstanding.

@dadoonet
Member

I think I totally understood the use case. I just moved it to elasticsearch repo where the cloud-azure code now lives and also split it into 2 needs:

@craigwi: We really appreciate all the effort you put into this PR. We won't merge it here because the code now lives in elasticsearch (from elasticsearch 2.0.0) and we will only fix major issues for older versions in this repo. We won't merge the code as is but will split it into the two parts I mentioned. If you want to send a PR to the elasticsearch repo to support elastic/elasticsearch#12759, we will be happy to review and merge it when ready.

Note that you also raised a valid point, which is that we need to support multiple credentials in elasticsearch.yml.
We could imagine that as a generic feature, whatever repository type you want to use.

Let's say that we can now create something like:

cloud:
    azure:
        storage:
            azure1:
              account: your_azure_storage_account1
              key: your_azure_storage_key1
              default: true
            azure2:
              account: your_azure_storage_account2
              key: your_azure_storage_key2
            azure3:
              account: your_azure_storage_account3
              key: your_azure_storage_key3

Then when we create the repo, we can specify which credentials we want to use:

# use credentials 2
PUT _snapshot/my_backup2
{
  "type": "azure",
  "settings": {
      "credentials": "azure2",
      "container": "backup_container",
      "base_path": "backups"
  }
}

# This one will use the one marked as "default"
PUT _snapshot/my_backup3
{
  "type": "azure"
}
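
How a repository could pick one of these named entries might look like the sketch below. It is a hypothetical Python model of the proposal, not existing plugin code: `select_credentials` is an invented name, the setting is read as `credentials`, and requiring exactly one entry marked `default` is an assumption the proposal does not spell out.

```python
def select_credentials(storage, repo_settings):
    """Hypothetical resolver: return (account, key) for a repository.

    `storage` models the cloud.azure.storage section of elasticsearch.yml
    as a dict of named entries.
    """
    name = repo_settings.get("credentials")
    if name is None:
        # No explicit choice: fall back to the entry marked as default.
        defaults = [n for n, entry in storage.items() if entry.get("default")]
        if len(defaults) != 1:
            raise ValueError("exactly one storage entry must be marked default")
        name = defaults[0]
    if name not in storage:
        raise ValueError(f"unknown credentials: {name}")
    entry = storage[name]
    return entry["account"], entry["key"]
```

Because keys stay in the node configuration and a repository only names an entry, this keeps the property craigwi asked for: a REST call cannot point the cluster at an arbitrary storage account.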

I hope all of this makes sense. If I said anything stupid, please tell me and I'll reopen the PR.

6 participants