This repository has been archived by the owner on Mar 28, 2020. It is now read-only.

*: add ABS support for backup and restore #1842

Merged
merged 3 commits into from
Jan 25, 2018

Conversation

rjtsdl
Contributor

@rjtsdl rjtsdl commented Jan 9, 2018

Add ABS support for backup-operator and restore-operator
Fix #1784

@etcd-bot
Collaborator

etcd-bot commented Jan 9, 2018

Can one of the admins verify this patch?

@hongchaodeng
Member

Our backup test right now is a bit flaky: #1825

We are focusing on that and need to fix it first. Our rule is to prioritize stabilizing tests and fixing bugs over adding features. We will review this PR once it is fixed.

@rjtsdl
Contributor Author

rjtsdl commented Jan 10, 2018

@hongchaodeng SGTM !


blob := containerRef.GetBlobReference(key)
putBlobOpts := storage.PutBlobOptions{}
err = blob.CreateBlockBlobFromReader(r, &putBlobOpts)
Contributor

inline putBlobOpts to
err = blob.CreateBlockBlobFromReader(r, &storage.PutBlobOptions{})

}

getBlobOpts := &storage.GetBlobOptions{}
_, err = blob.Get(getBlobOpts)
Contributor

inline getBlobOpts to
blob.Get(&storage.GetBlobOptions{})

if err != nil {
return fmt.Errorf("failed to create ABS client: %v", err)
}
// Nothing to Close for absCli yet
Contributor

when should absCli be closed?

Contributor Author

It is simply an object, so we don't need to call Close explicitly. It will be garbage-collected once nothing references it. :)

Contributor

@fanminshi fanminshi Jan 23, 2018

@rjtsdl sometimes a client creates goroutines that must be stopped to avoid goroutine leaks. Not sure if that's the case for the ABS client.

Contributor Author

absCli doesn't do that. :)

Contributor

good to know.
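
To make the concern in this thread concrete, here is a minimal sketch of a client that does need an explicit Close because it starts a background goroutine. The `pollingClient` type and its methods are hypothetical and exist only for illustration; they are not part of the Azure SDK or of this PR.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pollingClient is a hypothetical client that owns a background goroutine.
type pollingClient struct {
	stop chan struct{}
	wg   sync.WaitGroup
}

// newPollingClient starts the background goroutine immediately.
func newPollingClient() *pollingClient {
	c := &pollingClient{stop: make(chan struct{})}
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		ticker := time.NewTicker(10 * time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				// Periodic background work would go here.
			case <-c.stop:
				return
			}
		}
	}()
	return c
}

// Close signals the goroutine to exit and waits until it has returned.
// Without this call the goroutine would outlive the client.
func (c *pollingClient) Close() {
	close(c.stop)
	c.wg.Wait()
}

func main() {
	c := newPollingClient()
	defer c.Close()
	fmt.Println("client in use")
}
```

A client that holds only plain data, as described for absCli above, has nothing to stop and can simply be left to the garbage collector.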


abs := bc.GetBlobService()
w.ABS = &abs
return w, nil
Contributor

inline w as
return &ABSClient{ABS: &bc.GetBlobService()}

Contributor Author

It throws the error "cannot take address of bc.GetBlobService()", so I can only do a small refactoring.
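
The compile error comes from Go's addressability rules: the result of a function call is not addressable, so it has to be stored in a variable before `&` can be applied. The sketch below reproduces the situation with hypothetical stand-in types (`basicClient`, `blobService`, `absClient`); it assumes only that the getter returns a value rather than a pointer, as the snippet under review suggests.

```go
package main

import "fmt"

// blobService and basicClient are hypothetical stand-ins for the SDK types.
type blobService struct{ account string }

type basicClient struct{ account string }

// GetBlobService returns a value, not a pointer.
func (c basicClient) GetBlobService() blobService {
	return blobService{account: c.account}
}

// absClient mirrors the wrapper shape discussed in the review.
type absClient struct{ ABS *blobService }

// newABSClient stores the call result in a local variable first.
// Writing &c.GetBlobService() directly does not compile, because a
// call result is not addressable.
func newABSClient(c basicClient) *absClient {
	abs := c.GetBlobService()
	return &absClient{ABS: &abs}
}

func main() {
	cli := newABSClient(basicClient{account: "demo"})
	fmt.Println(cli.ABS.account)
}
```

So the intermediate variable stays, but the wrapper can still be built in a single return statement.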

abs *storage.BlobStorageClient
}

func NewABSReader(abs *storage.BlobStorageClient) Reader {
Collaborator

doc string on public func.

Contributor

@rjtsdl ping? fix the above?

"io"

"github.com/Azure/azure-sdk-for-go/storage"
"github.com/coreos/etcd-operator/pkg/backup/util"
Collaborator

separate internal and external import

Contributor

change the imports according to @xiang90's suggestion?

e.g:

"bytes"
"encoding/base64"
"fmt"

"github.com/coreos/etcd-operator/pkg/backup/util"

"github.com/Azure/azure-sdk-for-go/storage"
"github.com/pborman/uuid"

@xiang90
Collaborator

xiang90 commented Jan 16, 2018

@rjtsdl what is the test plan for this?


blob := containerRef.GetBlobReference(key)
getBlobOpts := &storage.GetBlobOptions{}
return blob.Get(getBlobOpts)
Contributor

inline getBlobOpts? to
return blob.Get(&storage.GetBlobOptions{})

Contributor

@rjtsdl ping? fix the above?

@rjtsdl
Contributor Author

rjtsdl commented Jan 17, 2018

@xiang90 currently I don't have unit tests for this. So far I have tested it manually by building a custom image and trying it out with ABS specified.

What's the plan for different storage types? ABS should have no difference.

@xiang90
Collaborator

xiang90 commented Jan 17, 2018

> What's the plan for different storage types? ABS should have no difference.

There is an e2e test for S3 storage. I am fine if ABS starts with replicating that test.

@fanminshi
Contributor

@rjtsdl take a look at how TestBackupAndRestore in e2eslow works, and see if you can get it to work on ABS.

@rjtsdl
Contributor Author

rjtsdl commented Jan 17, 2018

@fanminshi @xiang90 thanks for the pointer. I will add tests in this PR. It may not be very soon, so please bear with me :)

@fanminshi
Contributor

@rjtsdl take your time. feel free to ask if you have any questions or concerns.

@rjtsdl
Contributor Author

rjtsdl commented Jan 22, 2018

@shrutir25 we need to rebase/merge with the latest master :)

@rjtsdl
Contributor Author

rjtsdl commented Jan 22, 2018

@shrutir25 added tests for ABS.
@fanminshi @xiang90 TAL. Let me know if you want to have an ABS secrets setup for your test pipeline. We may be able to help. :)

@xiang90
Collaborator

xiang90 commented Jan 22, 2018

@rjtsdl The API parts and the overall workflow look good to me. Need @hongchaodeng and @fanminshi to look through the implementation and tests. Thanks for the contribution!

@@ -20,7 +20,10 @@ echo "TEST_NAMESPACE: ${TEST_NAMESPACE}"
echo "OPERATOR_IMAGE: ${OPERATOR_IMAGE}"

export TEST_AWS_SECRET="aws"
export TEST_ABS_SECRET="abs"
Member

Please don't change this file test/container/run

@@ -81,6 +82,49 @@ func TestBackupAndRestore(t *testing.T) {
testEtcdRestoreOperatorForS3Source(t, clusterName, s3Path)
}

// TestBackupAndRestoreABS runs the backup test first, and only runs the restore test after if the backup test succeeds and sets the ABS path
func TestBackupAndRestoreABS(t *testing.T) {
Member

Most of the code duplicates TestBackupAndRestoreS3().
Can you refactor it to reduce duplication?

@@ -59,6 +59,30 @@ func NewS3Backup(endpoints []string, clusterName, path, secret, clientTLSSecret
}
}

// NewABSBackup creates an EtcdBackup object using clusterName.
func NewABSBackup(endpoints []string, clusterName, path, secret, clientTLSSecret string) *api.EtcdBackup {
Member

Most of the code duplicates NewS3Backup().
Can you refactor it?

}

var tlsConfig *tls.Config
if len(clientTLSSecret) != 0 {
Member

This code duplicates handleS3(). Can you refactor it?

Contributor Author

I think we can just extract line 39 - 49 to something like generateTLSConfig ?
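
As a rough illustration of the extraction proposed here, the sketch below pulls the TLS setup into one helper that both the S3 and ABS handlers could call. It is an assumption-laden outline, not the code that was merged: `secretFetcher` is a hypothetical stand-in for reading a Kubernetes secret, the signature differs from the one used in the PR, and the key names `client.crt`, `client.key` and `ca.crt` are assumed for the example.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// secretFetcher is a hypothetical stand-in for reading a secret's data.
type secretFetcher func(name string) (map[string][]byte, error)

// generateTLSConfig returns (nil, nil) when no secret name is given.
// Otherwise it builds a client TLS config from the PEM data in the secret.
// The key names used below are assumptions made for this sketch.
func generateTLSConfig(fetch secretFetcher, secretName string) (*tls.Config, error) {
	if len(secretName) == 0 {
		return nil, nil
	}
	data, err := fetch(secretName)
	if err != nil {
		return nil, fmt.Errorf("failed to get TLS secret %q: %v", secretName, err)
	}
	cert, err := tls.X509KeyPair(data["client.crt"], data["client.key"])
	if err != nil {
		return nil, fmt.Errorf("failed to parse client key pair: %v", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(data["ca.crt"]) {
		return nil, errors.New("failed to parse CA certificate")
	}
	return &tls.Config{Certificates: []tls.Certificate{cert}, RootCAs: pool}, nil
}

func main() {
	// With an empty secret name the helper skips TLS setup entirely.
	cfg, err := generateTLSConfig(nil, "")
	fmt.Println(cfg == nil, err == nil)
}
```

Keeping the empty-name check inside the helper means each caller no longer needs its own `if len(clientTLSSecret) != 0` branch.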

@hongchaodeng
Member

Can you split this PR into two:

  1. API changes and implementation. We will review and rely on you to manually test it.
  2. Testing. We will work together to setup test credentials to pass the test.


const (
// AzureBlobBlockChunkLimitInBytes 100MiB is the limit
AzureBlobBlockChunkLimitInBytes = 104857600
Contributor

@fanminshi fanminshi Jan 22, 2018

prefer AzureBlobBlockChunkLimitInBytes = 100 * 1024 * 1024 for readability; It is pretty clear that 100 * 1024 * 1024 is 100MiB

}

buf := new(bytes.Buffer)
buf.ReadFrom(r)
Contributor

@fanminshi fanminshi Jan 22, 2018

What if the reader holds 2 GB of data? Will ReadFrom return ErrTooLarge on a system that has less memory or disk space?
I think we need to figure out a way to read 100MB into a buffer, write that buffer to ABS, and repeat.

Contributor

@fanminshi fanminshi Jan 23, 2018

@rjtsdl the suggestion is just an optimization and does not affect correctness. We can optimize the upload logic in a future PR.

I will submit this optimization in the next PR since I don't want to block merging the current one.

Contributor Author

Thx, we can do it in another PR. :)

@fanminshi
Contributor

@rjtsdl most of the code looks good. I just have a few concerns about the ABS upload logic. The tests can be refactored a bit more, but let's do that in a separate PR as @hongchaodeng suggested.

@fanminshi fanminshi changed the title add ABS support for backup and restore *: add ABS support for backup and restore Jan 23, 2018
@shrutir25

shrutir25 commented Jan 23, 2018

@fanminshi @hongchaodeng - thanks for reviewing the PR! I will make all the changes as mentioned and submit a new PR for the tests.

@shrutir25

shrutir25 commented Jan 24, 2018

@rjtsdl @hongchaodeng @fanminshi @khenidak - I have made the changes as per the review comments. PTAL :)

}

var tlsConfig *tls.Config
if tlsConfig, err := generateTLSConfig(kubecli, clientTLSSecret, namespace); err != nil {
Contributor Author

replace := with =
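
The reason for this fix is variable shadowing: `:=` inside the `if` statement declares a new `tlsConfig` that is scoped to that statement, so the outer variable stays nil. The sketch below shows both forms side by side; `load` is a hypothetical helper standing in for generateTLSConfig, and `*int` stands in for `*tls.Config`.

```go
package main

import "fmt"

// load is a hypothetical helper standing in for generateTLSConfig.
func load() (*int, error) {
	v := 42
	return &v, nil
}

// shadowed uses := in the if statement, which declares a new cfg and err
// that exist only inside the if. The outer cfg is never assigned.
func shadowed() *int {
	var cfg *int
	if cfg, err := load(); err != nil {
		_ = cfg // inner cfg, unrelated to the outer one
		return nil
	}
	return cfg // still nil
}

// assigned uses = so the outer cfg and err receive the results.
func assigned() *int {
	var cfg *int
	var err error
	if cfg, err = load(); err != nil {
		return nil
	}
	return cfg
}

func main() {
	fmt.Println(shadowed() == nil, assigned() != nil) // prints "true true"
}
```

The compiler accepts the shadowed form, which is why this kind of bug usually shows up only in review or at run time.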

Member

@hongchaodeng hongchaodeng left a comment

Code looks good in general.

@@ -39,7 +39,7 @@ func init() {
rand.Seed(time.Now().UTC().UnixNano())
}

-// TestBackupAndRestore runs the backup test first, and only runs the restore test after if the backup test succeeds and sets the S3 path
+// TestBackupAndRestoreS3 runs the backup test first, and only runs the restore test after if the backup test succeeds and sets the S3 path
Member

revert this?

@@ -6,6 +6,10 @@ The scripts at `test/pod` can be used to package and run the e2e tests inside a

The e2e tests need access to an S3 bucket for testing. Create a secret containing the aws credentials and config files in the same namespace that the test-pod will run in. Consult the [backup-operator guide][setup-aws-secret] on how to do so.

## Create the ABS secret
Member

This is under test/pod/, so it is testing-related.
Can you revert this and put the changes into upcoming testing PR?

sure I will add it to the test PR.

@@ -6,6 +6,18 @@
packages = ["compute/metadata","internal"]
revision = "3b1ae45394a234c385be014e9a488f2bb6eef821"

[[projects]]
Member

Don't you need to update Gopkg.toml?
How could you have changes in Gopkg.lock while Gopkg.toml remains unchanged?

@rjtsdl - is there a change that needs to be made in Gopkg.toml ?

Contributor

@fanminshi fanminshi Jan 24, 2018

i saw "github.com/Azure/azure-sdk-for-go" being added. how come Gopkg.toml has not changed?

Contributor Author

Added it :)

@hongchaodeng
Member

@etcd-bot ok to test

@fanminshi
Contributor

please also squash the commits into just a few relevant ones:

- all the go dep changes can be in one commit.
- all API updates can be in one commit.
- abs_reader.go and abs_writer.go can be in one commit.
- all controller changes can be in one commit.

I hope you get the idea.

Gopkg.toml Outdated
name = "github.com/Azure/azure-sdk-for-go"
version = "v11.3.0-beta"

[[constraint]]
Member

These three deps aren't imported directly in the code. Can you get rid of them:

[[constraint]]
  name = "github.com/Azure/go-autorest"
  version = "v9.6.0"

[[constraint]]
  name = "github.com/dgrijalva/jwt-go"

[[constraint]]
  name = "github.com/satori/uuid"
  version = "v1.1.0"

"k8s.io/client-go/kubernetes"
)

// TODO: replace this with generic backend interface for other options (PV, Azure)
Member

remove this TODO.

@@ -21,6 +21,7 @@ The [test-pod-tmpl.yaml](./test-pod-tmpl.yaml) can be used to define the test-po
- `OPERATOR_IMAGE` is the etcd-operator image used for testing
- `TEST_S3_BUCKET` is the S3 bucket name used for testing
- `TEST_AWS_SECRET` is the secret name containing the aws credentials/config files.
- `TEST_ABS_SECRET` is the secret name containing the abs credentials
Member

Can you revert the changes under test/pod/? These will go into the testing PR later.

@rjtsdl rjtsdl force-pushed the jiren-addabssupport branch 3 times, most recently from ec07d43 to 01cd7ff Compare January 25, 2018 17:51
@hongchaodeng
Member

Please also update generated code:
https://github.com/coreos/etcd-operator/tree/master/hack/k8s/codegen

./hack/k8s/codegen/update-generated.sh

@hongchaodeng
Member

pkg/util/azureutil/absfactory/client.go:28:2: const tmpdir is unused

@fanminshi
Contributor

@rjtsdl please also rebase the commits. Probably merge the "remove unused const" change into the commit that introduces azureutil/absfactory/client.go.

@hongchaodeng
Member

hongchaodeng commented Jan 25, 2018

No need to worry about the commits.
When merging PR, maintainers should use "Squash and merge" and make sure the commit message is great.

@rjtsdl
Contributor Author

rjtsdl commented Jan 25, 2018

@hongchaodeng @fanminshi

I rebased it a bit.
Yeah, squash and merge should do the work :)

@hongchaodeng hongchaodeng self-assigned this Jan 25, 2018
@fanminshi
Contributor

@rjtsdl a small nit on the commit msg. Usually, we want the format to be <pkg>: commit msg.

In your case, pkg can be pkg/apis: ... for all the API changes, pkg/backup: ... for backup changes, and so on. If the commit touches multiple pkgs, then use *: ... instead. I hope you get the idea.

@hongchaodeng
Member

@fanminshi
I will help correct the corresponding commits when merging. No need to worry about that.

@fanminshi
Contributor

@hongchaodeng sure thing. @rjtsdl you can follow my advice on future pr then.

@hongchaodeng hongchaodeng merged commit 8339b61 into coreos:master Jan 25, 2018
@rjtsdl rjtsdl deleted the jiren-addabssupport branch January 25, 2018 18:50
@hongchaodeng hongchaodeng mentioned this pull request Jan 25, 2018

6 participants