Skip to content
This repository has been archived by the owner on Dec 26, 2022. It is now read-only.

Fix s3 uri format so that its explicitly a s3 URI #2060

Merged
4 commits merged into from
Oct 18, 2022

Conversation

ghost
Copy link

@ghost ghost commented Oct 17, 2022

Which issue does this PR correspond to?

Fixes https://github.com/grapl-security/issue-tracker/issues/1045

What changes does this PR make to Grapl? Why?

This updates the s3 uri used with the Nomad artifact stanza in order to fix the 403 error we were seeing in AWS. This appears to be an issue with go-getter and http://, and can be seen in this thread: hashicorp/nomad#1113
An alternate fix would be to remove the prefix entirely (ie get rid of s3:: or the previous http://)

I've removed the comments about the slack thread, since that thread has been deleted (past the 90 day threshold or whatever it is)

How were these changes tested?

  1. On a nomad agent instance I downloaded the internally used go-getter binary and tested the former url and the new url:
sh-4.2$ ./go-getter http://plugin-registry-bucket-b1abfd2.s3.amazonaws.com/plugins/tenant_id_a7b7e3a8-fdcf-4032-a0e8-d6e3aa1973d4/plugin_type-generator/3ee133b7-8147-41bb-8908-283e0d946197.bin .
2022/10/17 23:27:21 Error downloading: bad response code: 403
sh-4.2$ ./go-getter s3::http://plugin-registry-bucket-b1abfd2.s3.amazonaws.com/plugins/tenant_id_a7b7e3a8-fdcf-4032-a0e8-d6e3aa1973d4/plugin_type-generator/3ee133b7-8147-41bb-8908-283e0d946197.bin .
2022/10/17 23:27:48 Error downloading: URL is not a valid S3 URL
sh-4.2$ ./go-getter s3::plugin-registry-bucket-b1abfd2.s3.amazonaws.com/plugins/tenant_id_a7b7e3a8-fdcf-4032-a0e8-d6e3aa1973d4/plugin_type-generator/3ee133b7-8147-41bb-8908-283e0d946197.bin .2022/10/17 23:28:52 success!
  1. I've manually updated the Nomad job with the new uri and seen that the plugin tasks starts running successfully

The hashicorp go-getter lib that Nomad uses internally doesn't
recognize the http:// uri as a s3 uri for some reason. As such, I'm
updating the format to be explicitly a s3 uri. This fixes the plugin
task when run in AWS
I've removed the comments about the slack thread, since that thread
has been deleted (past the 90 day threshold or whatever it is)
@ghost ghost added the bug Something isn't working label Oct 17, 2022
@ghost ghost self-assigned this Oct 17, 2022
@ghost ghost requested a review from colin-grapl as a code owner October 17, 2022 23:54
@codecov
Copy link

codecov bot commented Oct 18, 2022

Codecov Report

Base: 39.37% // Head: 39.39% // Increases project coverage by +0.01% 🎉

Coverage data is based on head (c611380) compared to base (dc0bc38).
Patch coverage: 60.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2060      +/-   ##
==========================================
+ Coverage   39.37%   39.39%   +0.01%     
==========================================
  Files         436      436              
  Lines       10154    10154              
==========================================
+ Hits         3998     4000       +2     
+ Misses       6156     6154       -2     
Impacted Files Coverage Δ
...c/rust/plugin-registry/src/server/deploy_plugin.rs 59.25% <ø> (ø)
src/rust/plugin-registry/src/server/s3_uri.rs 60.00% <60.00%> (ø)
...rust-proto/src/graplinc/grapl/api/graph/v1beta1.rs 54.57% <0.00%> (+0.60%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@@ -1,13 +1,11 @@
use grapl_config::env_helpers::ENV_ENDPOINT;

/// Some discussion about this function here:
/// https://grapl-internal.slack.com/archives/C02J5JYS92S/p1648667074625629
pub fn get_s3_url(bucket: &str, key: &str) -> String {
let local_endpoint = std::env::var(ENV_ENDPOINT).ok();
// If the above is specified, we're running locally against Localhost
// Otherwise, it's against prod S3; so we can use virtual-hosted-style URLs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks like it needs updating now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Unfortunately, since I can't access the slack message, the comment I've added may be incomplete but 🤷 .
Let me know if there's something you think I should add.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mainly referring to the fact that it's not an S3 virtual-host-style URL anymore (and not even a URL, or even a URI, I think 🤷)

If it's all about how go-getter handles things, then a mention of that and any relevant links would be useful.

Without that kind of background information, I could easily see someone in the future "helpfully" refactoring this into something that doesn't work anymore.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated docs again. This should be better

pub fn get_s3_url(bucket: &str, key: &str) -> String {
let local_endpoint = std::env::var(ENV_ENDPOINT).ok();
// If the above is specified, we're running locally against Localhost
// Otherwise, it's against prod S3; so we can use virtual-hosted-style URLs
local_endpoint.map_or_else(
|| format!("http://{bucket}.s3.amazonaws.com/{key}"),
|| format!("s3::{bucket}.s3.amazonaws.com/{key}"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this a URL? I would buy s3://... but this just looks weird.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah let me change the function name to URI. And agreed that this is a very weird schema. It feels like someone was like, well our arns already use :: as separators, why not just reuse that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, would it have worked with https:// instead of http://?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly no. I tried that early on 😿

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So apparently s3:// does work and is officially supported by AWS. I've switched to that, especially since this is the format the s3 URI link in the console is (https://s3.console.aws.amazon.com/s3/object/plugin-registry-bucket-b1abfd2?region=us-east-1&prefix=plugins/tenant_id_024885b9-dcf0-469d-b12a-81876cc045c8/plugin_type-analyzer/51f8b573-6b7c-411c-9379-2b1070205c1f.bin and search s3 URI).
Apparently, the s3:: was something the go-getter lib created and call Forced Protocols. Basically its a way of saying that this is a s3 url.

Added a comment explaining what this function is and what its used by
to replace the comment that linked to a deleted slack comversation.
I've also renamed everything from url to uri to be more explicit
about what to expect.
@ghost ghost requested a review from christophermaier October 18, 2022 00:56
pub fn get_s3_url(bucket: &str, key: &str) -> String {
/// This gets a s3 URI to be used by Nomad's artifact stanza (in particular the grapl-plugin).
/// The underlying go-getter lib, doesn't work with the `http://` schema prefix, but does work with
/// an explicit s3 prefix `s::` or without a prefix. We've opted for the s3 prefix to be explicit.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s:: -> s3::

Apparently the `s3::` protocol isn't a real thing, its something the
go-getter lib created and call [Forced Protocols](https://github.com/hashicorp/go-getter#forced-protocol)
However, AWS does support a real s3 protocol `s3://`, which I've
tested with the go-getter lib and it does work. Even better, this is
the same format as the s3 URI in the console, so its easy to copy paste
a link
@ghost
Copy link
Author

ghost commented Oct 18, 2022

I'm tempted to spend some time actually adding http protocol support to go-getter, but that will still take some time to get merged into Nomad. In the meantime I've opened a ticket to clarify what s3 formats are actually supported: hashicorp/go-getter#387

@ghost ghost requested a review from christophermaier October 18, 2022 16:19
Copy link
Contributor

@wimax-grapl wimax-grapl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i'd prever a grapl_config::env_helpers::is_local_aws() but I know people have Very Strong Opinions on resurrecting IS_LOCAL. In this case, it makes sense to me.

Copy link
Contributor

@christophermaier christophermaier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny documentation tweak, but otherwise 👍

Thanks for digging into the guts of this issue 🎉


/// This gets a s3 URI to be used by Nomad's artifact stanza (in particular the grapl-plugin).
/// The underlying library, Hashicorp's [go-getter](https://github.com/hashicorp/go-getter) lib,
/// doesn't work with the `http://` schema prefix, but does work with an explicit s3 prefix `s://`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s:// -> s3://

@ghost ghost merged commit 7228b83 into main Oct 18, 2022
@ghost ghost deleted the twunderlich/fix-plugin-s3-url branch October 18, 2022 19:14
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants