Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds object storage client for Swift OpenStack #2440

Merged
merged 3 commits into from
Apr 24, 2020

Conversation

cyriltovena
Copy link
Contributor

This adds a new storage client to support swift OpenStack. This is a popular client for companies that runs their own internal cloud on their hardware.

Right now there is no tests, I'm looking at the e2e package and add some tests for it using this docker container https://github.com/jeantil/openstack-swift-keystone-docker WDYT ? I have no other way of testing this.

This seems to be a popular request for Loki.

I still need to update the changelog too, but I'm tired going to bed.

Checklist

  • Tests updated
  • [x ] Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet and clean! Good job @cyriltovena. I have no experience with Swift (I didn't even know it before this PR) but I can't see any bad small in the client implementation. I left few minor comments, also:

  • I agree on covering this with an integration test. A good starting point may be add this storage to the tests done in integration/chunks_storage_backends_test.go. You can also definitely add a client-specific integration test to test the single functions, if you want.
  • Please add a [FEATURE] CHANGELOG entry prefixing it with Experimental:
  • Doc: supported storages should be mentioned in docs/architecture.md (see "Chunks storage" section). Don't forget to mention it as (experimental)
  • Doc: may you also list this feature in docs/configuration/v1-guarantees.md (under "Experimental features"), please?

🙏

Comment on lines 59 to 60
f.StringVar(&cfg.UserName, prefix+"swift.username", "", "Openstack username for the api.")
f.StringVar(&cfg.UserId, prefix+"swift.userid", "", "Openstack userid for the api.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to keep it consistent with the YAML:

Suggested change
f.StringVar(&cfg.UserName, prefix+"swift.username", "", "Openstack username for the api.")
f.StringVar(&cfg.UserId, prefix+"swift.userid", "", "Openstack userid for the api.")
f.StringVar(&cfg.UserName, prefix+"swift.user-name", "", "Openstack username for the api.")
f.StringVar(&cfg.UserId, prefix+"swift.user-id", "", "Openstack userid for the api.")

ConnectTimeout time.Duration `yaml:"connect_timeout"` // Connect channel timeout (default 10s)
Timeout time.Duration `yaml:"timeout"` // Data channel timeout (default 60s)
Region string `yaml:"region"` // Region to use eg "LON", "ORD" - default is use first region (v2,v3 auth only)
AuthVersion int `yaml:"auth"` // Set to 1, 2 or 3 or leave at 0 for autodetect
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep it consistent with the CLI flag:

Suggested change
AuthVersion int `yaml:"auth"` // Set to 1, 2 or 3 or leave at 0 for autodetect
AuthVersion int `yaml:"auth_version"` // Set to 1, 2 or 3 or leave at 0 for autodetect

Comment on lines 84 to 91
if cfg.EndpointType != "" &&
cfg.EndpointType != string(swift.EndpointTypePublic) &&
cfg.EndpointType != string(swift.EndpointTypeAdmin) &&
cfg.EndpointType != string(swift.EndpointTypeInternal) {
return nil, fmt.Errorf(
"unexpected endpoint type %s must be one of %s, %s or %s",
cfg.EndpointType, swift.EndpointTypePublic, swift.EndpointTypeAdmin, swift.EndpointTypeInternal)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As pattern, we're currently transitioning towards validating the config in a cfg.Validate() error function called by the parent config. I navigate back the (2) places where SwiftConfig is referenced and should be easy to call. As a reference you can see this:

// Validate config and returns error on failure
func (cfg *Config) Validate() error {
if cfg.Engine != StorageEngineChunks && cfg.Engine != StorageEngineTSDB {
return errors.New("unsupported storage engine")
}
if err := cfg.CassandraStorageConfig.Validate(); err != nil {
return errors.Wrap(err, "invalid Cassandra Storage config")
}
return nil
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule store doesn't have a validate. I'll add it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly. Just add the missing Validate() going back the parent, but should be trivial.

@cyriltovena
Copy link
Contributor Author

Alright I'll tackle the E2E later this weekend.

@cyriltovena
Copy link
Contributor Author

The integration tests timeout on go get, I might need help to understand how to overcome this.

@gouthamve
Copy link
Contributor

Before we move forward with this, I think @jtlisi has some ideas for reusing the Thanos codebase here: #2262

If we move there we will automatically add support for Swift as well. But I am not too familiar with it, @jtlisi could you comment?

CHANGELOG.md Outdated
@@ -2,6 +2,7 @@

## master / unreleased

* [FEATURE] Experimental: Added a new object storage client for OpenStack Swift.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor things:

  • Please move the [FEATURE] to the group of [FEATURE] below
  • Remember to add the PR number at the end of the sentence (ie. #2440) cause we have a script trying to match PRs with CHANGELOG entries when we release a new Cortex version

@@ -53,11 +53,22 @@ func BuildArgs(flags map[string]string) []string {
return args
}

func GetRequest(url string) (*http.Response, error) {
func GetRequest(url string, h http.Header) (*http.Response, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the input header used? I can't find where we use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is a leftover, I needed this at some point to get credentials. But I 've found another way around. I'll revert that back.

"swift",
images.SwiftEmulator,
nil,
// readiness probe inspired from https://github.com/kubernetes/examples/blob/b86c9d50be45eaf5ce74dee7159ce38b0e149d38/cassandra/image/files/ready-probe.sh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this comment, since doesn't apply.

images.SwiftEmulator,
nil,
// readiness probe inspired from https://github.com/kubernetes/examples/blob/b86c9d50be45eaf5ce74dee7159ce38b0e149d38/cassandra/image/files/ready-probe.sh
e2e.NewHTTPReadinessProbe(8080, "/", 404, 404),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any better way to test for readiness? We also support command probes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is a good one for the use case, waiting for the HTTP api to be responsive. That's all we need for the e2e.

integration/chunks_storage_backends_test.go Show resolved Hide resolved
integration/chunks_storage_backends_test.go Show resolved Hide resolved
@jtlisi
Copy link
Contributor

jtlisi commented Apr 14, 2020

@gouthamve I think it would make sense to move in that direction but no one is currently working on that. @jaybatra26 started some refactoring work to make it possible but I would say we are at a standstill at this point.

@cyriltovena My recommendation at this point would be to reuse https://github.com/thanos-io/thanos/blob/91f5960a3becdbd13c296c1e57bdbf57c329eeed/pkg/objstore/swift/swift.go#L29-L44 to ensure the switchover will be seemless when it occurs.

@cyriltovena
Copy link
Contributor Author

Before we move forward with this, I think @jtlisi has some ideas for reusing the Thanos codebase here: #2262

If we move there we will automatically add support for Swift as well. But I am not too familiar with it, @jtlisi could you comment?

That would be great if we can have this, having Swift without having to wait for a big refactoring.

@jaybatra26
Copy link
Contributor

@jtlisi I am working on it. I will try to wind it up by the end of this week(hopefully).

@cyriltovena
Copy link
Contributor Author

@jtlisi I've tried to use Thanos config but it's not a one to one mapping. It seems that we use a different client. I didn't checked if Thanos had swift and choose a newer version with more identification options.

If you guys wants to pause this and wait for the Thanos change I'm fine with it. We can also get that into Loki itself /cc @slim-bean if we don't want to wait. Not sure 🤷

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Cyril for addressing my feedback! I've just pushed a rebase.

I think @jtlisi has some ideas for reusing the Thanos codebase here: #2262

Considering the work to migrate to Thanos objstore implementation hasn't even started, personally I'm OK merging this PR.

Copy link
Contributor

@jtlisi jtlisi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I'm fine with not blocking this over the object client migration.

@cyriltovena
Copy link
Contributor Author

cyriltovena commented Apr 16, 2020

I can try to retrofit Thanos object store, give me some time I'll try to give more effort so that we don't have to make a breaking change.

I have a e2e working I can rely on, don't merge yet I'll give at least 2 hour.

@cyriltovena
Copy link
Contributor Author

So I tried again and there's something missing with the Thanos Client it doesn't allow application credentials. I was able to find my way around the config by reading the long long long boring openstack documentation.

I still pushed the commit so we don't introduce API break.

We should add that to Thanos though. I think you most likely want to use application credentials.

@cyriltovena
Copy link
Contributor Author

Ready !

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
@pracucci
Copy link
Contributor

@cyriltovena I've fixed a couple of CLI flags, rebased master, fixed List() (signature has changed in the meanwhile in master) and squashed all commits because rebasing was a bit a pain due to the config-file-reference.md.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🎉

@pracucci
Copy link
Contributor

So I tried again and there's something missing with the Thanos Client it doesn't allow application credentials. I was able to find my way around the config by reading the long long long boring openstack documentation.

I still pushed the commit so we don't introduce API break.

We should add that to Thanos though. I think you most likely want to use application credentials.

I think this is not a blocker and can be addressed separately. @cyriltovena could you open an issue to Thanos, please?

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some eyebrow raising concerns about context, or lack of validation (why have it then?). Approving since it's experimental feature and there are no guarantees that it works.

@@ -93,6 +93,14 @@ type Config struct {
EnableAPI bool `yaml:"enable_api"`
}

// Validate config and returns error on failure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having to add extra methods to unrelated packages seems weird. I not a fan of this two-step validation pattern.

Comment on lines +34 to +37
// Validate config and returns error on failure
func (cfg *SwiftConfig) Validate() error {
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much fuss for validation to do ... nothing? Is this correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was me suggesting it before the offline conversation we had (between me and you Peter). We already use Validate() in many places so I don't see it a problem. However, I invite you to open an issue where we can take a final decision about it and then we can submit a PR to refactor it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My (snarky) comment here is about Validate method being empty, and not validating anything. That seems weird.

I will open issue for making decision on validation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I think it was supposed to contain the delimiter validation, which has been probably lost at some point while working on the PR.

Comment on lines +41 to +54
f.StringVar(&cfg.ContainerName, prefix+"swift.container-name", "cortex", "Name of the Swift container to put chunks in.")
f.StringVar(&cfg.DomainName, prefix+"swift.domain-name", "", "Openstack user's domain name.")
f.StringVar(&cfg.DomainId, prefix+"swift.domain-id", "", "Openstack user's domain id.")
f.StringVar(&cfg.UserDomainName, prefix+"swift.user-domain-name", "", "Openstack user's domain name.")
f.StringVar(&cfg.UserDomainID, prefix+"swift.user-domain-id", "", "Openstack user's domain id.")
f.StringVar(&cfg.Username, prefix+"swift.username", "", "Openstack username for the api.")
f.StringVar(&cfg.UserId, prefix+"swift.user-id", "", "Openstack userid for the api.")
f.StringVar(&cfg.Password, prefix+"swift.password", "", "Openstack api key.")
f.StringVar(&cfg.AuthUrl, prefix+"swift.auth-url", "", "Openstack authentication URL.")
f.StringVar(&cfg.RegionName, prefix+"swift.region-name", "", "Openstack Region to use eg LON, ORD - default is use first region (v2,v3 auth only)")
f.StringVar(&cfg.ProjectName, prefix+"swift.project-name", "", "Openstack project name (v2,v3 auth only).")
f.StringVar(&cfg.ProjectID, prefix+"swift.project-id", "", "Openstack project id (v2,v3 auth only).")
f.StringVar(&cfg.ProjectDomainName, prefix+"swift.project-domain-name", "", "Name of the project's domain (v3 auth only), only needed if it differs from the user domain.")
f.StringVar(&cfg.ProjectDomainID, prefix+"swift.project-domain-id", "", "Id of the project's domain (v3 auth only), only needed if it differs the from user domain.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to keep order consistent with order of fields in thanos.SwiftConfig struct. I would also prefer to use it in the same order in NewSwiftObjectClient method, but that's trickier.

Suggested change
f.StringVar(&cfg.ContainerName, prefix+"swift.container-name", "cortex", "Name of the Swift container to put chunks in.")
f.StringVar(&cfg.DomainName, prefix+"swift.domain-name", "", "Openstack user's domain name.")
f.StringVar(&cfg.DomainId, prefix+"swift.domain-id", "", "Openstack user's domain id.")
f.StringVar(&cfg.UserDomainName, prefix+"swift.user-domain-name", "", "Openstack user's domain name.")
f.StringVar(&cfg.UserDomainID, prefix+"swift.user-domain-id", "", "Openstack user's domain id.")
f.StringVar(&cfg.Username, prefix+"swift.username", "", "Openstack username for the api.")
f.StringVar(&cfg.UserId, prefix+"swift.user-id", "", "Openstack userid for the api.")
f.StringVar(&cfg.Password, prefix+"swift.password", "", "Openstack api key.")
f.StringVar(&cfg.AuthUrl, prefix+"swift.auth-url", "", "Openstack authentication URL.")
f.StringVar(&cfg.RegionName, prefix+"swift.region-name", "", "Openstack Region to use eg LON, ORD - default is use first region (v2,v3 auth only)")
f.StringVar(&cfg.ProjectName, prefix+"swift.project-name", "", "Openstack project name (v2,v3 auth only).")
f.StringVar(&cfg.ProjectID, prefix+"swift.project-id", "", "Openstack project id (v2,v3 auth only).")
f.StringVar(&cfg.ProjectDomainName, prefix+"swift.project-domain-name", "", "Name of the project's domain (v3 auth only), only needed if it differs from the user domain.")
f.StringVar(&cfg.ProjectDomainID, prefix+"swift.project-domain-id", "", "Id of the project's domain (v3 auth only), only needed if it differs the from user domain.")
f.StringVar(&cfg.AuthUrl, prefix+"swift.auth-url", "", "Openstack authentication URL.")
f.StringVar(&cfg.Username, prefix+"swift.username", "", "Openstack username for the api.")
f.StringVar(&cfg.UserDomainName, prefix+"swift.user-domain-name", "", "Openstack user's domain name.")
f.StringVar(&cfg.UserDomainID, prefix+"swift.user-domain-id", "", "Openstack user's domain id.")
f.StringVar(&cfg.UserId, prefix+"swift.user-id", "", "Openstack userid for the api.")
f.StringVar(&cfg.Password, prefix+"swift.password", "", "Openstack api key.")
f.StringVar(&cfg.DomainId, prefix+"swift.domain-id", "", "Openstack user's domain id.")
f.StringVar(&cfg.DomainName, prefix+"swift.domain-name", "", "Openstack user's domain name.")
f.StringVar(&cfg.ProjectID, prefix+"swift.project-id", "", "Openstack project id (v2,v3 auth only).")
f.StringVar(&cfg.ProjectName, prefix+"swift.project-name", "", "Openstack project name (v2,v3 auth only).")
f.StringVar(&cfg.ProjectDomainID, prefix+"swift.project-domain-id", "", "Id of the project's domain (v3 auth only), only needed if it differs the from user domain.")
f.StringVar(&cfg.ProjectDomainName, prefix+"swift.project-domain-name", "", "Name of the project's domain (v3 auth only), only needed if it differs from the user domain.")
f.StringVar(&cfg.RegionName, prefix+"swift.region-name", "", "Openstack Region to use eg LON, ORD - default is use first region (v2,v3 auth only)")
f.StringVar(&cfg.ContainerName, prefix+"swift.container-name", "cortex", "Name of the Swift container to put chunks in.")

Comment on lines +79 to +84
switch {
case cfg.UserDomainName != "":
c.Domain = cfg.UserDomainName
case cfg.UserDomainID != "":
c.DomainId = cfg.UserDomainID
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we reuse Thanos comment to explain what's happening here?

	// Support for cross-domain scoping (user in different domain than project).
	// If a userDomainName or userDomainID is given, the user is scoped to this domain.

Comment on lines +70 to +71
TenantDomain: cfg.ProjectDomainName,
TenantDomainId: cfg.ProjectDomainID,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanos seems to be putting ProjectDomainName and ProjectDomainID to Domain and DomainId fields.
Why the difference?


// GetObject returns a reader for the specified object key from the configured swift container. If the
// key does not exist a generic chunk.ErrStorageObjectNotFound error is returned.
func (s *SwiftObjectClient) GetObject(ctx context.Context, objectKey string) (io.ReadCloser, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that ctx parameter is not used in any of GetObject, PutObject, List or DeleteObject methods is concerning.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
@cyriltovena
Copy link
Contributor Author

There are some eyebrow raising concerns about context, or lack of validation (why have it then?). > Approving since it's experimental feature and there are no guarantees that it works.

The client I choose is having a timeout by default which Thanos doesn't have. Both client doesn't support context propagation.

Validation was there before because the original config was in my opinion better than Thanos one, more granular, and I did my best to retrofit this for the like of yours. Since you wanted to have Thanos config in case one day (I'm sure it will be soon) we want to move to Thanos storage.

Now regarding the no guarantees that it works I did write 2 integration tests ensuring this works for the ruler and the storage where this again doesn't exist on Thanos side.

😓

@cyriltovena
Copy link
Contributor Author

Also I suggest to not rebase PR until they get merge (e.g. merge commit to get update from master) so that you can have the full history of commits which might help understand how we get there.

@pstibrany
Copy link
Contributor

Now regarding the no guarantees that it works I did write 2 integration tests ensuring this works for the ruler and the storage where this again doesn't exist on Thanos side.

Don't get me wrong, I'm not saying it doesn't work, I'm just worried about not using context properly in those methods, so it can have different behaviour than other stores.

I think you did a great job here!

Let's merge this, we can improve it when we get feedback about it.

@pracucci pracucci merged commit 64fb9ad into cortexproject:master Apr 24, 2020
simonswine pushed a commit to grafana/e2e that referenced this pull request Jan 13, 2022
…2440)

* Implement swift storage client.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Pass nil registrerer.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* Make lint happy.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants