Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: Break go get compabibility #498

Open
mattfarina opened this issue Jul 1, 2016 · 21 comments
Open

Discussion: Break go get compabibility #498

mattfarina opened this issue Jul 1, 2016 · 21 comments

Comments

@mattfarina
Copy link
Member

In glide.yaml files the package needs to be compatible with names that go get can fetch. This is true even for cases where a repo is specified to get the package from. This helps us to maintain compatibility with go tools including the names referenced in the source.

But, on multiple occasions people have wanted to specify something like:

- package: example.com/foo/bar
  repo: https://example.com/foo/bar.git
  subpackages:
  - baz

And then reference example.com/foo/bar/baz in their import statements.

This will break compatibility with go get. This is the case you can't go get example.com/foo/bar/baz.

Instead you'd need to go get example.com/foo/bar.git/baz but that also means your import statements need to reflect that path.

So, should Glide break with go or stay compatible? Why? What implications concern you?

@albrow
Copy link
Contributor

albrow commented Jul 1, 2016

@mattfarina thank you for opening this issue and being open to thoughts from the community.

#298 is at least part of what prompted this issue. I, and many others on that thread, would like to be able to use Glide in conjunction with GitHub Enterprise and other code hosting solutions without resorting to adding the ".git" suffix all over the place. Note that currently the only workaround is to not only add ".git" to glide.yaml, but to add it to every place in your source code where the package name appears, as well as to rename the relevant directories in GOPATH to include the ".git" suffix.

go get does not work with nested packages on GitHub Enterprise (without specifying the ".git" suffix). Any solution that we come up with for addressing this issue is going to be a departure from go get.

For my public projects, I will continue to maintain go get compatibility so that users will not be forced to use Glide if they don't want to. At work, we have a clearly defined process for hosting code on GitHub Enterprise and using Glide to manage dependencies. We don't use go get directly, so there won't be any adverse affects if we break compatibility.

I would be curious to hear if anyone would be adversely affected by breaking go get compatibility in this way.

@akutz
Copy link

akutz commented Jul 2, 2016

No. I don't want Glide to be orthogonal to Go's build process; instead it should work with & enhance it. One reason I use Makefiles to manage my projects' build processes instead of something like gb is because it replaces Go's toolchain and the surrounding ecosystem instead of enhanching it. Sure, the go command makes some odd choices. For one, using a Makefile would soo much easier if the Go packages weren't compiled with the pack option by default. An object file per source file would make things far simpler to manage via Make. However, I worked around that instead of invoking the Go compiler directly. It's just too much of a moving target to try and replace these fundamental tools.

To that end, if Glide were to break compatibility, I'd just have to stop using new versions of Glide. As crappy of a boyfriend it makes me, I think this is the classic Elaine issue from Seinfeld. I won't date someone who is a bad breaker upper. In this case it means I don't want to buy into a tool that I cannot easily replace if something happens to its maintainer or its direction. I love Glide today because even with all the tremendous value it adds, if @mattfarina turned out to be an alien who is returning to his home planet and taking his code with him, ultimately it won't be the end of the world. I may have to replace Glide, but I won't have to shift my understanding of my projects' entire dependency workflows.

@akutz
Copy link

akutz commented Jul 4, 2016

I never include the extension and it works fine for me.

@mattfarina
Copy link
Member Author

An issue brought up in other tickets... just to add context... is the Great Firewall of China that blocks some requests to sites that offer the ?go-get=1 redirect functionality. This breaks Glides ability to put something in the right place even when they can set the repo to the right one and access it.

Without suggesting how to fix this it is one that would be nice to fix because there are so many users there.

@albrow GitHub Enterprise is an easy-ish one to make better without breaking compatibility (technically). go get github.example.com/foo/bar will work because that will invoke the ?go-get=1 behavior. It's when you get to sub-packages that a problem comes up.

In a glide.yaml file the top level package for an import is the repo level. glide get could figure out this separation for you a little more intelligently.

This isn't a break in compatibility for the import statement in code.

@sdboyer
Copy link
Member

sdboyer commented Jul 7, 2016

IMO, the case @mattfarina presented in the OP actually has very little to do with glide. This is the kind of thing that's easy to go in circles on, because the question we're asking is kinda circular:

  1. What is the network URL from which a given rooted import path should be sourced?
  2. What import path does a given network URL correspond to?

The go toolchain has variable answers for the first (owing to the regex-cum-?go-get=1 behavior that admits multiple schemes), and no capacity to give anything other than one, strict answer to the second.

glide, even today, has different, more variable answers for the first, and a looser, though still mostly single answer to the second. That repo field is what blows it all wide open:

  • It means that both the client and the server (via a ?go-get=1 redirect) can service a rooted import path from an arbitrary URL - and that one can supercede the other, or combine with each other.
  • It also means that a single repo could, in theory anyway, serve multiple rooted import paths.

So, as @albrow pointed out over in #298, adding the repo field technically means glide has already broken go get compatibility. It's already possible to express import path->network location relationships that go get would miss. This is not because of differences in parsing, but because go get doesn't know about these client-side transforms. This might seem like a different kind of compatibility, but it's not - that's just the circular nature of the problem playing tricks.

Take, for example, the particular case presented in the OP:

- package: example.com/foo/bar
  repo: https://example.com/foo/bar.git
  subpackages:
  - baz

And then reference example.com/foo/bar/baz in their import statements.

The only reason we're talking about this case is because the obvious transformation from repo to package is obvious. And yet, this:

- package: github.com/Masterminds/semver
  repo: github.com/sdboyer/semver

is TOTALLY go get incompatible, but glide supported it just fine in #384 for several weeks. In the game of go get compatibility, these are not different cases. Trying to separate these cases is a mistake, based on a superficial red herring - the obviousness of the transform for this particular case.

Moreover, the red herring is harmful - it takes us wandering into the territory of the second question I initially gave - "what import path does a given network URL correspond to?" Here's why:

  • First, they're not logically equivalent. If example.com/foo/bar.git == example.com/foo/bar, then example.com/foo/bar.hg == example.com/foo/bar...so, example.com/foo/bar.git == example.com/foo/bar.hg?? Clearly not, so, how do we know when these equivalencies apply?
  • Second, though related, is that it's impossible to know without inspecting the contents of https://example.com/foo/bar.git whether it internally expects to be referenced with or without the .git. All answers to that involve code scanning, and some involve import path rewriting. Also, lots of new combinatorial failure modes.

Bottom line: we can't answer any of these classes of questions, or even know how to apply configuration that a user has explicitly given, until we've inspected the source. And THAT is itself version-specific...because what if the author of example.com/foo/bar reads this discussion, and changes its internal root import path from including .git, to not including it, or vice-versa? Now, no matter how you declare it, it's a mismatch for osme part of the history.

So, my thought: glide has already broken go compatibility. Embrace it, but do so by adding regex-based matchers. Something like this:

- repo-patterns
 - my-ghe
  - pattern: '^(?P<importroot>(?P<root>github\.example\.com)/([A-Za-z0-9][-A-Za-z0-9]*[A-Za-z0-9]/[A-Za-z0-9_.\-]+))(?P<packages>(?:/[A-Za-z0-9_.\-]+)*)$'

Total pseudocode there, but the point is, named capture groups with specific, magic names, which we could then plug in to a generic matching algorithm. People can write support for their own hosting platforms into their glide.yaml, or into a global config.

@akutz
Copy link

akutz commented Jul 7, 2016

I disagree that enabling a package to point to a different repo location is breaking compatibility. I guess I need go get compatibility defined. To me it still adheres to it because no code is being rewritten. I don't mind that @mattfarina and Glide are simply being cheeky about what they clone and where. To me that's far more preferable than rewriting import statements.

@sdboyer
Copy link
Member

sdboyer commented Jul 7, 2016

Maybe it is just silly semantics - we do need a definition. But import rewriting, while IMO worth contemplating, should not be conflated into this property. It's a wholly different kind of operation relying on a different class of information.

Some kind of declarative solution, whether it's the regex or something else, should be sufficient for this need, and render glide no more "broken" than it already is.

@akutz
Copy link

akutz commented Jul 7, 2016

Hey, I'm all against import rewriting. I thought that's what you were suggesting. My point is if Glide didn't enable importing forks into primary repo locations, import rewriting would be required. Glide avoids this by virtue of what I mentioned. To me that doesn't break go get compatibility specifically because it avoids import rewriting. To me Glide enhances a process without breaking it. If it rewrote imports, then, IMO, it would be breaking it.

@sdboyer
Copy link
Member

sdboyer commented Jul 7, 2016

sorry, i edited that too much, i meant:

But import rewriting, while IMO worth contemplating, should not be conflated into this property. It's a wholly different kind of operation relying on a different class of information.

to be in an agree-y tone, not a disagree-y tone :)

@albrow
Copy link
Contributor

albrow commented Jul 7, 2016

glide has already broken go get compatibility

I was drafting a response to point this out, but it seems @sdboyer beat me to it :)

In my opinion, there is not much merit in continuing to discuss whether Glide should break compatibility with go get in a general sense. By any useful definition I can think of, Glide has already broken it (unless you commit the vendor directory). It might be more prudent to look at these issues on a case-by-case basis.

This may be obvious, but no one has mentioned it yet. Even without specifying custom remote locations, it is trivial to set up a project which works with Glide and does not work with go get. Imagine you have a project that depends on version 1.0 of github.com/lib/foo. The latest version (on master) is 2.0 and is not compatible with version 1.0. Sure, go get knows how to download github.com/lib/foo and put it in the correct location in your GOPATH, but since it always pulls from master, the code could be broken and it may not even compile. On the other hand, if you install dependencies with glide install it will work. Practically speaking, I don't think it matters whether or not go get can download the dependencies from a remote location. What matters is whether your code actually works.

If you care about strict compatibility with go get, there's always the option of committing the vendor directory to version control. As I understand it, that will work with go get regardless of the remote locations or versions that Glide uses (that's not to say it doesn't come with its own issues). With the --strip-vcs option and other ongoing work, it appears that Glide is making this easier.

@mattfarina
Copy link
Member Author

I think we need to define compatibility so let me take a shot.

  1. Import paths in code. go get has a certain (and evolving) structure it supports reading and finding.
  2. go get pulls outside packages into your GOPATH. Once there you can manually manage the checked out version, alternate remotes and the commits they have. In essence, go get is a project retrieval tool that does not deal with versions. Because of this we can't do an apples to apples comparison between go get and Glide.

Glide package paths and those used in imports code statements need to be the same format that go get requires.

The way to look at solving for this is from a user experience point of view. What experience will give the users the fewest WTF moments and why?

@albrow
Copy link
Contributor

albrow commented Jul 7, 2016

The way to look at solving for this is from a user experience point of view. What experience will give the users the fewest WTF moments and why?

I wholeheartedly agree.

Glide package paths and those used in imports code statements need to be the same format that go get requires.

I want to ask for some clarification.

It seems to me that the only advantage of adhering to this definition of compatibility is that you can run go get on the project. go get will successfully download the dependencies and put them in your GOPATH. It doesn't mean that go get will retrieve the same version or use the same remote location that glide install does (because it has no concept of versions or custom remote locations) and consequently it doesn't guarantee that your code will work or even compile. As you mentioned, you could take some additional steps after running go get in order to get the code to compile/work.

Would you agree with these statements? Are there any other advantages of adhering to this definition that I did not mention?

@dwlnetnl
Copy link

dwlnetnl commented Aug 9, 2016

I did have a head scratcher because I assumed the example in the opening post should work but didn't. However in some cases it looks like it works, for example github.com/bradfitz/gomemcache/memcache, this is because github.com is recognised as a special case.

Because of that I think the example in the opening post should work in the same way.

@dwlnetnl
Copy link

Maybe it can be implemented as a command flag.

@albrow
Copy link
Contributor

albrow commented Aug 18, 2016

@mattfarina it's been more than a month since we've heard from you on this thread.

I don't see how the definition you have proposed is helpful for developers. I would love to hear some examples or learn more about your reasoning.

I already mentioned that by design Glide will install different versions of a repository compared to go get. In practice this means a project that uses Glide may already break if you get the dependencies with go get, because the code will expect the versions specified in glide.yaml and go get will always install from master. Sure, you can go in and manually check out the correct version for each dependency, but that does not seem practical. In addition, doing so is likely to break other packages which may depend on different versions of dependencies in your $GOPATH. Given the impracticality, and in some cases impossibility (if you have projects depending on different versions) of manually checking out the expected version, I would already consider this "broken" for all intents and purposes.

Furthermore, since Glide currently allows you to specify a different remote location via the vcs and repo fields, it is already possible to have go get and Glide disagree on what remote repository should be installed. The project may technically be go-gettable, but go get will install from a remote location that is completely different from what the project expects. Installing a dependency from the wrong remote location is strictly worse than not being able to install it at all. The former takes more steps to correct (you have to first remove the incorrectly installed dependency) and is potentially very confusing and hard to track down.

I have provided two reasons why your definition of compatibility with go get does not prevent developer headaches or wtf moments. This is why I'm claiming that definition is not useful.

#298, which partly sparked this discussion, describes a real-world annoyance that I and others have run into. Unless you can provide counterexamples, I maintain that fixing that issue will definitively not break compatibility with go get any more than we already have.

As I have already mentioned, if you care about go get compatibility, we do have an option for that. You can check in the vendor directory to version control. I'm not saying it's a perfect solution, but it will maintain complete compatibility with go get, and I would argue it is actually the only reliable way to do so. As soon as you create a tool (such as Glide) which installs something other than what go get installs, you have already broken compatibility with go get by any useful definition I can think of. This is sometimes okay :) As engineers we need to understand that and weigh the pros and cons.

I would appreciate hearing your thoughts.

@mattfarina
Copy link
Member Author

The definition I was referring to, in terms of scope, is with the paths used in import statements and the ability to figure out where to get the source for the remote path. This relates to the code scanning.

For example, when github.com/example/foo/bar/baz is encountered there are special rules used to figure out where to fetch from. Or, using the go-get redirect we can see the golang.org/x/net and figure out where the source is due to an intermediate redirection server implementing an API. Or, git.example.com/foo/bar.git/bax can be discovered by reading the URL against rules.

Each of these works with go get and Glide today for package paths.

The question is, should Glide instead allow you to map:

- package: foo/bar
   repo: https://git.example.com/baz/foo.git
   subpackages:
   - bar

go get will not be able to figure out where foo/bar resides remotely when found in an import statement. This breaks the mapping of package path to remote location mapping with go get.

When it comes to the version of the source to use there is an intentional break.

@sdboyer
Copy link
Member

sdboyer commented Aug 19, 2016

No, it should not be allowed.

@ghost
Copy link

ghost commented Aug 23, 2016

I would like to add a use case here. I have to use a VSTS git repo and I would like to use glide. The issue is that those repo don't use the .git extension (and in addition the url has a space in it...): https://mycompany.visualstudio.com/a%20name%20with%20space/_git/mypackage

Today I use glide like that:

import:
- package: mypackage
  subpackages:
  - mysubpackage
  repo: https://mycompany.visualstudio.com/a%20name%20with%20space/_git/mypackage
  vcs: git
ignore:
  - mypackage/mysubpackage

it does not spit out any error and does seems to install & update the dependency but its far from ideal. I don't think I should have to manually ignore all the subpackage in my package.

@sdboyer
Copy link
Member

sdboyer commented Aug 23, 2016

@Cedric-Venet is foo.visualstudio.com that a central hosting service? If so, IMO that's actually something that we should be able to resolve with a separate import pattern, sorta orthogonal to the discussion in this issue.

@ghost
Copy link

ghost commented Aug 24, 2016

@sdboyer yes the pattern seems to be [acount].visualstudio.com/[project]/_git/[repo] so adding a pattern would be great, but I am not sure it would solve my use case as go did not seems to access import with space in it and using %20 neither so even if glide where to get the dependency correctly, I could not use them without having some way to alias it.

@albrow
Copy link
Contributor

albrow commented Aug 29, 2016

@mattfarina I understand pretty well what your stance is. I'm moreso asking why? Given the points I have brought up, how does adhering to this definition of compatibility help developers?

Among other things, I'm asking why should we allow this:

- package: github.com/someuser/differentrepo
   repo: https://git.example.com/baz/foo.git
   subpackages:
   - bar

And not this?

- package: foo/bar
   repo: https://git.example.com/baz/foo.git
   subpackages:
   - bar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants