RFD 106 Discussion #45

tgross · 2017-08-01T18:38:55Z

For discussion of RFD 106: Engineering Guide: Golang Best Practices

tgross · 2017-08-01T19:43:00Z

Version of Go

The Go team and project are consistently advancing the state of the Go compiler and standard library. Released versions and release candidates are considered fit for production and should therefore be used in production. Upgrade early and often.

Joyent customers are often large enterprises with much less aggressive upgrade cycles. In service upgrade scenarios, there may be services built on multiple toolchains in play at a given time. We should consider whether constant toolchain upgrades are going to have impact on downstream compliance teams.

Project Structure

$GOPATH SHOULD be set to $HOME/go and SHOULD be incorporated into your shell’s environment.

While this is a common practice in the community it's terrible advice precisely because it requires dumping pkg with each toolchain update, and because it requires that all projects have the same version of toolchain and dependencies. This makes building unrelated (perhaps third party) projects more complicated than it needs to be.

More importantly, this is an unnecessary directive as the entire GOPATH should be relocatable anyways. If Alice is working in ~/work/project1 and Bob is working in ~/go doesn't matter so long as they set their GOPATH to the root of the workspace wherever it is.

Where appropriate, it is RECOMMENDED to make use of monolithic repositories (mono-repo).

Can we define what a monorepo even is in the context of a distributed system of open source projects? I can't think of any large open source projects of similar scope to Triton/Manta that are organized this way. (Compare to our favorite container-related golang projects such as kubernetes or docker, which started monolithic and were broken up.)

There are also projects with very different purposes, release cycles, and development methodologies at Joyent. To borrow some examples of existing Node.js projects, the https://github.com/joyent/node-verror is an open source library that can be used by end-users separately from any other work we have in Node.js, including those who have no particular interest in Triton services such as https://github.com/joyent/sdc-cn-agent.

One might then say that some categories of projects belong in their own monorepos, so that perhaps Triton and Manta and SDKs are all separate monorepos. This is more-or-less what Google has done, but the primary requirement motivating their multi-monorepo model is that their "main" monorepo is closed source but they have large open source projects like Android and Chromium.

Workflow

It's not clear to me why we need a language-specific workflow section. Maybe the rationale for including this could be elaborated on?

Vendor management

Forked and cached libraries in vendor/ MUST be managed via the dep(1) tool:

The golang team says that dep is "ok to use" but its commands are not stable and there is a big security epic yet-undone. Is it perhaps too early to standardize on this as a MUST directive?

I also think this section needs to include whether or not we're suggesting commiting vendored dependencies to the source code repository, as many golang projects do. I'm opposed to this except in the case of submodules, if only because it make commit history difficult to understand.

Naming Conventions

Recommendations
Go software SHOULD conform to the recommendations outlined in the following resources:

Calling this SHOULD rather than MUST is perhaps a little too weak in this case from the Package Names doc:

Good package names are short and clear. They are lower case

Non-lowercase package names have awful interactions when case-preserving but not-case-sensitive file systems meet fully-case-sensitive+preserving file systems. The sirupsen/logrus project found this out the hard way and broke a chunk of the ecosystem in the process.

Explicit Types

Everything here looks good, but maybe we should be explicit about the examples for folks who might not be 100% familiar with the type system already? Something like:

GOOD

type ID uint64
type User struct {
    id ID
}

BAD

type User struct {
    id uint64
}

... or something along those lines.

gRPC

Transport and serialization layers are orthogonal to language best practices. I'm not sure it belongs in this document.

richardkiene · 2017-08-01T22:32:05Z

"Go is efficient, scalable, and productive. Some programmers find it fun to work in; others find it unimaginative, even boring. In this article we will explain why those are not contradictory positions. Go was designed to address the problems faced in software development at Google, which led to a language that is not a breakthrough research language but is nonetheless an excellent tool for engineering large software projects."
— Rob Pike, 2012

Is this really necessary in the RFD? It seems like an appeal to authority fallacy when I read it.

Without providing modern treatment to past issues, it is undeniable that many, if not most (as of 2017) distributed systems are being written in Go.

Could you either reference the source(s) of this claim or drop it from the RFD? Without referencing any source material that backs this up I'm hard pressed to believe this is a quantifiable, let alone undeniable, which makes it seem like an invisible army/bandwagon fallacy.

jwreagor · 2017-08-02T04:14:43Z

It's not clear to me why we need a language-specific workflow section. Maybe the rationale for including this could be elaborated on?

I don't entirely understand why the Git workflow had to be there but I'd imagine we'd want something on using commands like go get, go install, go build, and go test. Even if much of this ends up wrapped in Makefiles, I think this is the document to introduce them. Go has a fair bit more OTB tooling than dynamic languages and there's bound to be useful practices that can be shared with them. Or maybe it's easy enough to RTFM.

Also, there's nothing on building CLIs. Doesn't much of the Go community favor every application defining useful CLIs for most repos/projects? Building command-line tools is also mentioned in Node.js guide.

Transport and serialization layers are orthogonal to language best practices. I'm not sure it belongs in this document.

Mentioning gRPC doesn't appear to be any different than the Node.js best practices mentioning REST, HTTP and node-restify. Except maybe in that case the practices came out of actual projects already using them, whereas nothing I know of is evaluating gRPC. If Node.js best practices have made good use out of standardizing HTTP clients, errors, logging, etc. than I'd eventually hope to see that here as well.

Other things I'd like to see in this document or discussed...

+1 @tgross, cont. expanding on types, real examples, benefits/gotchas
When and how to wrap concrete types/collections
Anonymous functions, clear distinctions or similarities with Node
Function types (with methods), Function collections, Functions as fields, etc.
Goroutines and common asynchronous control flow patterns
Stacktraces, error types, error handling, and/vs. logging

jclulow · 2017-08-02T16:33:42Z

Mentioning gRPC doesn't appear to be any different than the Node.js best practices mentioning REST, HTTP and node-restify.

It's not quite the same. The mention of Restify in the Node guide is for when you are building a HTTP server. It does not prescribe that you should use a HTTP-based transport for everything.

If we're going to elevate a particular inter-process communication protocol above the rest it frankly ought to be a HTTP API with JSON-formatted payloads, if only so that it fits in with much of the rest of the software estate we already have, and the tools (like curl) that we already use when debugging.

jwreagor · 2017-08-02T17:36:56Z

It's not quite the same. The mention of Restify in the Node guide is for when you are building a HTTP server. It does not prescribe that you should use a HTTP-based transport for everything.

If gRPC is going to be mentioned at all than I believe the protobuf language behind gRPC should be defined alongside language best practices. I also think gRPC (or HTTP) should be well defined for specific cases not unlike how the Node.js document ~~prescribes~~ supports "HTTP Servers".

tgross · 2017-08-02T18:01:24Z

@jclulow wrote:

It's not quite the same. The mention of Restify in the Node guide is for when you are building a HTTP server. It does not prescribe that you should use a HTTP-based transport for everything.

Exactly. Last time I checked LDAP and Postgres (for example) don't speak gRPC.

@cheapRoc wrote:

Also, there's nothing on building CLIs. Doesn't much of the Go community favor every application defining useful CLIs for most repos/projects? Building command-line tools is also mentioned in Node.js guide.

Fair. But there are currently zero (0) Joyent services written in golang so anointing particular third-party packages seems awfully premature.

KodyKantor · 2017-08-03T18:17:01Z

Thanks for taking the time to write this up! Unfortunately, there are many different opinions on many of these topics even within the Go community. Joyent doesn't have much experience with Go, so we haven't been able to form our own opinions. For those reasons I think this document will be a very difficult one to write, and it will probably be relatively comprehensive. With Node.js we have been able to pass along tribal knowledge through existing code, code reviews, and chatroom discussions, but that isn't the case for Go. So thanks for doing this! I don't think I would enjoy writing this RFD. I also think that our understanding of what what we think is right/wrong in writing Go will change as we learn more about using the language. I anticipate this RFD will see a lot of initial churn.

To provide feedback, bug fixes, request clarification, or initiate a discussion, please open an RFD issue and reference RFD 106 along with the necessary specifics. For issues where there is a debate or discussion, this document will include a link back to the individual issues where a particular item was addressed or debated.

This was mentioned in chat, but we should keep one issue per RFD. I'm also not a huge fan of an RFD filled with links to comment chains. It may be useful for history and background, but if I'm reading the RFD prose, I would rather not hop back and forth between an issue and the RFD document. I would rather keep the GitHub issues in an O(N) rather than O(N*M) function, where N is the number of RFDs and M is the number of GitHub issues per RFD.

I would like to see various Go-isms addressed, like:

Proper use of Go channels (if it's even possible to properly use them) and misuse of Go channels
The Go inheritance mess (interface{}, generics/go-generate, interfaces of interfaces (e.g. ReadWriter)), which are good patterns, which are bad patterns, and why
Linking Go to C code, when is it acceptable?
CGo vs "Pure" Go, which one we'll use and why
Goroutines: when to use them, when not to use them, avoiding zombie goroutines, communication with goroutines, inline goroutines vs calling go on an existing function, short-running goroutines vs long-running goroutines - are both equally appropriate?
Locking: what is the recommended lock (channels, Cond, Mutex, something else)? Really, explain the sync package, and this comment:

Package sync provides basic synchronization primitives such as mutual exclusion locks. Other than the Once and WaitGroup types, most are intended for use by low-level library routines. Higher-level synchronization is better done via channels and communication.

If I have a client and a server both written in Go, does one 'import' the other's struct definitions, or are they defined in two places? Monorepo-like applications (Docker) have the client directly import the server's structure definitions. If we don't do monorepo, what do we do?

I think there are a few cultural or somewhat unnecessary topics in this document. I don't know if that was intended, but they do not fit in to my expectation of this document. It could be that some of these are resulting in some flame and heated discussions. A list of these:

Monorepo
Workflow section
Editor integration
gRPC

I would like to see something about error handling. For Node.js, we luckily have a great document that we created that describes synchronous/async errors, and all that jazz. Go is known for having bad error handling. The standard library isn't consistent with error handling (some calls return negative values, some calls return error types, some calls immediately end the program). Separately, it would be useful to have a Go Error Handling document. Maybe that could explain this document: Defer, Panic, and Recover, as well as offer practical recommendations.

I'll probably think of some more things that I would like more information about. I'll have to look at Go code that I've written previously to try to recall frustrations that I've had :D .

jclulow · 2017-08-03T21:29:50Z

Kody wrote:

I think there are a few cultural or somewhat unnecessary topics in this document. I don't know if that was intended, but they do not fit in to my expectation of this document. It could be that some of these are resulting in some flame and heated discussions.

I definitely agree with this. A short list of other things that this document should probably not be:

a place for Rob Pike quotes
a document which justifies the use of Go, rather than a document which explains how, if a project team so chooses, they should use Go
making broad claims about relative productivity or proliferation which are at best difficult to conclusively verify, and which do not really affect the outcome; e.g.
- "Use of one-workspace per project is counter-productive"
- "it is undeniable that many, if not most ... distributed systems are being written in Go"
- "fascism ... increases the overall productivity of the entire Go community"
- "Embrace the pass-by-value nature of Go, be productive"

In general, this document should focus on guidelines and rules specific to the use of Go within projects that are structured similarly to the large, preexisting body of Joyent software. If you would like to fundamentally restructure the software estate, that's obviously a much more expensive (and essentially orthogonal) proposition and really needs to be done as an adjustment to our General Principles (outlined in RFD 104).

For example, a hard requirement on the use of godoc(1) is probably not going to fly. We have not historically been that prescriptive in the way people produce documentation for their libraries, and I don't think we want to start now. In many cases it seems more appropriate to produce a README.md file in the root of the specific project repository for idiomatic Github presentation.

As mentioned by others, the Workflow section should probably be trimmed (or even removed). This is covered already in the general principles guide: all code is reviewed before integration, and this review is to be managed through Gerrit. This is mentioned in Change management within the general principles guide.

To the extent that you're making rules, such as those in the Deadline Timers and Timeouts section, it would be good to provide a paragraph on the motivation for those rules.

I'm not really sure what you mean by "Code SHOULD be fungible". Can you expand on that? Also, what does it mean for code written in Go to be "side-effect"-free? Go seems like it has all of the regular side effect peril of other imperative languages.

With respect to vendor/ Management, I think it's a bit odd to use the term "cached libraries" to describe the process of checking a copy of everything on which you depend into your project repository along with your own code. If that's not what you mean by "cached libraries", it'd be good to expand on how exactly the dependencies are collected at build time. Presumably dep(1) also has a mode where it is able to resolve and download dependencies as a build step without checking in a copy of everything; if not, I think we'll seriously want to investigate using something more like NPM for dependency resolution.

Naming things is certainly one of the less tractable problems in software engineering. I think we should consider a repository naming convention that reflects what we're already doing in the joyent organisation on Github. For instance, I would say that a Go library which is expected to be consumed by other software (or even organisations) should probably be called go-libraryname.git. This means we could have a Go Moray client, go-moray.git, alongside the existing node-moray.git client without any confusion. If a project which includes some Go code produces a zone image shipped as part of Manta or Triton, it should probably be called triton-component.git or manta-component.git; e.g., triton-cns.git or manta-moray.git.

What does it mean to say "APIs within a single project SHOULD use tightly-coupled function signatures"? Is this something specific to the Go language or linker?

You've listed a veritable cavalcade of Static Analysis tools! Is there some expectation that we'd use all of these? If so, are we expecting to add support for these to the eng.git Makefile system, so that we can incorporate them in the make check target and be relatively uniform across projects?

jclulow · 2017-08-04T06:17:06Z

Oh, one more really important thing: support for interactive and post mortem debugging.

This document should demonstrate the best way to have Go processes abort on programmer errors (whether through a non-optional assertion or an explicit panic). It should also cover the use of mdb with Go, or link to a document which does. One imagines that it would be useful at a minimum to be able to enumerate the Go routines in a core file and get stacks for them, as well as print out objects from the heap.

It would also be good to cover any plans for support for DTrace; both via a potential ustack helper, as well as any planned or current support for creating USDT providers in Go libraries and programs.

The Debugging section of the Node Practices RFD includes analogous materials.

cburroughs · 2017-08-04T15:04:26Z

Thanks for continuing to move this forward Sean. I think you have (inadvertently!) managed to fall on several process growing pains around sync-vs-asyn communication, remote timezones, where does the D in RFD happen, where documentation lives, and which current engineering practices are actually universal. This is all ultimately necessary to bring forward even if it has made for choppy waters for the HMS Golang.

Do you know what the first projects using golang are likely to be? I think it would be helpful to have some idea of what things we will learn first For example:

A project that includes people who no previous go experience would teach us the most about how big a problem crazy looking things like GOPATH actually are.
A project that is a core service would need to carefully conform to the existing ecosystem.
SKDs, or other code for external consummation would help us understand how we need to manage runtime upgrade expectations and managing which interfaces are committed for external consumption.

We will get to all of them eventually, but I'd like to know which points of contention will continue to be abstract, and which might soon have some concrete experience to report.

vendor/ Management

I am still unsure if this doc is advocating checking 3rdparty dependencies in as source.

"Thread Worker Pools"

Could you briefly elaborate on what this topic is? One concern I have about Go is choosing yet another language (after node) where "use normal OS threads" is not an available solution to problems.

Tools similar to bazel could

As someone not coming from a Go background, in my experiments I found that a pants/bazel-like like tool alleviated most of the GOPATH/vendor/deps crazyness. I felt like I was using a normal civilized language.

My understanding is that in the Go ecosystem, source based deps are typically (exclusively?) used. How do we identity to 3rd parties which code of ours represents a committed stable interface?
Is there a path forward to accommodate closed source Go code? A particular interest I have is in internal tooling and "glue" code (stuff I am otherwise doing in Python), for which neither bash nor node are ideal.
As discussed in the last call, elaboration on how gRPC and provising http+json endpoints are not exclusive would help clarify that section.

tgross · 2017-08-04T17:29:45Z

@sean- wrote in private chat:

I don't understand the comment re $GOPATH/pkg and based on what I read seems incorrect but I wasn't sure if I misunderstood something or not and wanted to clarify w/ you first.

Which appears to be referring to my comment here:

While this is a common practice in the community it's terrible advice precisely because it requires dumping pkg with each toolchain update

My comment was because of this from the original RFD:

Every time the version of go(1) changes the contents of $GOPATH/pkg MUST be cleaned out:

But instead of simply asserting it I decided to verify it (main.go):

$ mkdir -p ~/go ~/go/pkg ~/go/src ~/go/bin
$ go version
go version go1.8.3 darwin/amd64
$ export GOPATH=~/go
$ cd ~/go/src
$ go get github.com/hashicorp/errwrap
$ <edit main.go, see linked gist above>

# build with go1.8 toolchain
$ go build main.go
$ ./main
Doesn't exist: open /i/dont/exist: no such file or directory
$ md5sum ~/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a
568fcbe1fd49a06960387090817e0c26  /Users/tim.gross/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a

# build with go1.7 toolchain
$ GOROOT=~/bin/go1.7.6 ~/bin/go1.7.6/bin/go version
go version go1.7.6 darwin/amd64
$ GOROOT=~/bin/go1.7 ${GOROOT}/go build main.go
$ ./main
Doesn't exist: open /i/dont/exist: no such file or directory
$ md5sum ~/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a
568fcbe1fd49a06960387090817e0c26  /Users/tim.gross/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a

As far as I understand here either the toolchain is linking the object file created by a later version of the toolchain or it's ignoring that anyways and recreating it from scratch (and not caching it in the pkg/ dir?). So either the toolchain just let us do something very dangerous and we should dump the pkg/ dir after upgrading the toolchain as I said in my comments, or we don't need to worry about it and the comment in the original RFD is wrong.

sean- · 2017-08-07T19:23:11Z

On Aug 1, 2017, 12:43 PM -0700, Tim Gross ***@***.***>, wrote: Project Structure > $GOPATH SHOULD be set to $HOME/go and SHOULD be incorporated into your shell’s environment. While this is a common practice in the community it's terrible advice precisely because it requires dumping pkg with each toolchain update, and because it requires that all projects have the same version of toolchain and dependencies. This makes building unrelated (perhaps third party) projects more complicated than it needs to be.

This isn't required, but it is considered best practice to do because of drift within pkg. This was intended to be a hint to folks in order to save time and frustration. There are no complications from doing this to the best of my knowledge. If there are, please let me know.

More importantly, this is an unnecessary directive as the entire GOPATH should be relocatable anyways. If Alice is working in ~/work/project1 and Bob is working in ~/go doesn't matter so long as they set their GOPATH to the root of the workspace wherever it is.

GOPATH is relocatable. The recommendation is to use $HOME/go because it reduces friction, not that it's required. The important point is that the environment variable is set. Again, not required, but it's best practice to do this because it reduces headache from older Go programs that assume GOPATH is set (it was only recently made optional, but very few tools actually use `go env GOPATH`).

> Where appropriate, it is RECOMMENDED to make use of monolithic repositories (mono-repo). Can we define what a monorepo even is in the context of a distributed system of open source projects? I can't think of any large open source projects of similar scope to Triton/Manta that are organized this way. (Compare to our favorite container-related golang projects such as kubernetes or docker, which started monolithic and were broken up.)

These projects grew into something that cased to be inward facing and a self-contained, insular ecosystems.

There are also projects with very different purposes, release cycles, and development methodologies at Joyent. To borrow some examples of existing Node.js projects, the https://github.com/joyent/node-verror is an open source library that can be used by end-users separately from any other work we have in Node.js, including those who have no particular interest in Triton services such as https://github.com/joyent/sdc-cn-agent.

This particular case was explicitly called out in the RFD: `5. Publicly consumable libraries or programs SHOULD be pushed to a distinct canonical public location and automatically synchronized to the internal codebase.`

One might then say that some categories of projects belong in their own monorepos, so that perhaps Triton and Manta and SDKs are all separate monorepos. This is more-or-less what Google has done, but the primary requirement motivating their multi-monorepo model is that their "main" monorepo is closed source but they have large open source projects like Android and Chromium.

The argument is inward facing code with code reuse, not closed source.

Workflow It's not clear to me why we need a language-specific workflow section. Maybe the rationale for including this could be elaborated on?

I have a takeaway to add a reason why: the TL;DR: being Go is the most incestuous developer community that I have experienced and qualitatively feels like it has one of the highest levels of code-reuse that I have experienced. Working in other community code bases is a very common operation and explaining how to do this in order to hit the ground running and aid in reducing the time required to have a positive contribution.

Vendor management > Forked and cached libraries in vendor/ MUST be managed via the dep(1) tool: The golang team says that dep is "ok to use" but its commands are not stable and there is a big security epic yet-undone. Is it perhaps too early to standardize on this as a MUST directive?

Possibly oddly enough, I don't think so. There should be one way to manage vendor/ dependencies in order to have a unified set of CI/CD steps. `dep` wants to perform a security assessment, which is good and noble, but I think that's largely a reaction to npm(1) and trying to get ahead of possible security concerns.

I also think this section needs to include whether or not we're suggesting commiting vendored dependencies to the source code repository, as many golang projects do. I'm opposed to this except in the case of submodules, if only because it make commit history difficult to understand.

Good point, and yes, definitely advocating for the inclusion of dependencies in vendor/

Naming Conventions > Recommendations > Go software SHOULD conform to the recommendations outlined in the following resources: Calling this SHOULD rather than MUST is perhaps a little too weak in this case from the Package Names doc: > Good package names are short and clear. They are lower case

I deliberately kept the number of MUSTs to a minimum. There have been exceptions in the past, but in general this SHOULD becomes a MUST after the slightest amount of code review.

Non-lowercase package names have awful interactions when case-preserving but not-case-sensitive file systems meet fully-case-preserving file systems. The sirupsen/logrus project found this out the hard way and broke a chunk of the ecosystem in the process.

That was amusing and sad to see on some level. It also reinforced the position that libraries should be maintained in vendor/ .

Explicit Types Everything here looks good, but maybe we should be explicit about the examples for folks who might not be 100% familiar with the type system already? Something like: GOOD type ID uint64 type User struct { id ID } BAD type User struct { id uint64 }

Different than the examples included in the Explicit Types section? https://cdn.rawgit.com/joyent/rfd/59ff23f/rfd/0106/README.html#explicit-types

gRPC Transport and serialization layers are orthogonal to language best practices. I'm not sure it belongs in this document.

This came up in the Friday Office Hours: the reason this was included was to recommend, not mandate, that Go-to-Go inter-process communication SHOULD use gRPC. The same recommendation included a note saying that if you know you need or are likely to need interoperability, use JSON. This recommendation is there to standardize on gRPC and protobuf for Go apps (vs Thrift or flat buffers or Cap'n Proto, etc).

sean- · 2017-08-07T19:32:52Z

On Aug 1, 2017, 3:32 PM -0700, Richard Kiene ***@***.***>, wrote: > "Go is efficient, scalable, and productive. Some programmers find it fun to work in; others find it unimaginative, even boring. In this article we will explain why those are not contradictory positions. Go was designed to address the problems faced in software development at Google, which led to a language that is not a breakthrough research language but is nonetheless an excellent tool for engineering large software projects." > — Rob Pike, 2012 Is this really necessary in the RFD? It seems like an appeal to authority fallacy when I read it.

In a net-new context like this, I think it's an interesting quote that summarizes Go and sets expectations.

> Without providing modern treatment to past issues, it is undeniable that many, if not most (as of 2017) distributed systems are being written in Go. Could you either reference the source(s) of this claim or drop it from the RFD? Without referencing any source material that backs up this up, I'm hard pressed to believe this is a quantifiable, let alone undeniable, which makes it seem like an invisible army/bandwagon fallacy.

Supporting evidence: * dgraph * cockroachdb * etcd * consul * openebs ... https://github.com/avelino/awesome-go#database I'm not aware of another language that comes close to this level of activity in terms of net-new research and software *and* adoption. In the last few years RethinkDB, Mongo, Scylla, and Cassandra are the notable non-Go databases that I'm aware of. I'm sure others exist. I can remove the statement and didn't think it was controversial to include why it was important for us to incorporate Go into our ecosystem of tools. I regret leaving out the keyword, "new" as in "new distributed systems," which was my intention.

richardkiene · 2017-08-07T19:57:23Z

Supporting evidence:

dgraph

cockroachdb

etcd

consul

openebs

...

https://github.com/avelino/awesome-go#database

I'm not aware of another language that comes close to this level of activity in terms of net-new research and software and adoption. In the last few years RethinkDB, Mongo, Scylla, and Cassandra are the notable non-Go databases that I'm aware of. I'm sure others exist. I can remove the statement and didn't think it was controversial to include why it was important for us to incorporate Go into our ecosystem of tools. I regret leaving out the keyword, "new" as in "new distributed systems," which was my intention.

A litany of Go software and your personal awareness is hardly proof of the following:

it is undeniable that many, if not most (as of 2017) distributed systems are being written in Go.

As some others have suggested, it may be best to focus the RFD on how to do Go well, and not bother justifying Golang with anecdotes.

sean- · 2017-08-07T20:07:26Z

On Aug 1, 2017, 9:14 PM -0700, Justin Reagor ***@***.***>, wrote: > It's not clear to me why we need a language-specific workflow section. Maybe the rationale for including this could be elaborated on? I don't entirely understand why the Git workflow had to be there but I'd imagine we'd want something on using commands like go get, go install, go build, and go test. Even if much of this ends up wrapped in Makefiles, I think this is the document to introduce them. Go has a fair bit more OTB tooling than dynamic languages and there's bound to be useful practices that can be shared with them. Or maybe it's easy enough to RTFM.

The reason the git workflow is included is two fold: 1) Because it's common to interact with non-Joyent Go repositories (especially to begin with as we build our own ecosystem of libraries and tooling) 2) One of my FAQ's for people new to Go is, "I forked a Go library, checked it out, and it's not being used or something broken." The reason for that is that the path of the checked out fork is the per-user fork and does't use remotes with the canonical package name.

Also, there's nothing on building CLIs. Doesn't much of the Go community favor every application defining useful CLIs for most repos/projects? Building command-line tools is also mentioned in Node.js guide.

This is one of the commented out items that hasn't received any treatment yet but will: https://github.com/joyent/rfd/blame/master/rfd/0106/README.adoc#L224-L235

> Transport and serialization layers are orthogonal to language best practices. I'm not sure it belongs in this document. Mentioning gRPC doesn't appear to be any different than the Node.js best practices mentioning REST, HTTP and node-restify. Except maybe in that case the practices came out of actual projects already using them, whereas nothing I know of is evaluating gRPC. If Node.js best practices have made good use out of standardizing HTTP clients, errors, logging, etc. than I'd eventually hope to see that here as well.

Aye.

Other things I'd like to see in this document or discussed... • +1 @tgross, cont. expanding on types, real examples, benefits/gotchas • When and how to wrap concrete types/collections • Anonymous functions, clear distinctions or similarities with Node • Function types (with methods), Function collections, Functions as fields, etc. • Goroutines and common asynchronous control flow patterns • Stacktraces, error types, error handling, and/vs. logging

Stacktraces have been covered. Errors and logging are coming soon and two of the initial objectives that I wanted to hit on before publishing this RFD. Drawing comparisons with Node.js isn't something I want to tackle. Some of the other abstract idioms such as functions and methods are worth calling out but I'd like a bit more specificity in terms of the ask there (otherwise the attempt to address this will be excessively large and the aim is not to write a book so much as lay out best practices and conventions above and beyond what's expounded upon in our general engineering guidelines).

sean- · 2017-08-07T20:26:35Z

On Aug 2, 2017, 9:33 AM -0700, Joshua M. Clulow ***@***.***>, wrote: > Mentioning gRPC doesn't appear to be any different than the Node.js best practices mentioning REST, HTTP and node-restify. It's not quite the same. The mention of Restify in the Node guide is for when you are building a HTTP server. It does not prescribe that you must use a HTTP-based transport for everything.

The section on gRPC didn't say MUST, it said SHOULD. If you have to grab a library to swing to talk between two Go processes, you SHOULD reach for the gRPC hammer first. https://cdn.rawgit.com/joyent/rfd/59ff23f/rfd/0106/README.html#grpc

If we're going to elevate a particular inter-process communication protocol above the rest it frankly ought to be a HTTP API with JSON-formatted payloads, if only so that it fits in with much of the rest of the software estate we already have, and the tools (like curl) that we already use when debugging.

Because this recommendation is only applicable to net-new development, the recommendation to use gRPC is a forward-looking recommendation and not something that's exclusively Go, such as the use of gobj by the `gops(1)` https://github.com/google/gops/blob/master/internal/obj/go.go And again, these are SHOULDs, not MUSTs. If in review it is deemed that something is crossing boundaries, then use of gRPC would be inappropriate and the handler and protocol SHOULD be JSON, especially if something like gRPC Gateway can't be used to provide bilingual interoperability: https://github.com/grpc-ecosystem/grpc-gateway

sean- · 2017-08-07T20:55:21Z

On Aug 3, 2017, 11:17 AM -0700, Kody A Kantor ***@***.***>, wrote: Thanks for taking the time to write this up! Unfortunately, there are many different opinions on many of these topics even within the Go community. Joyent doesn't have much experience with Go, so we haven't been able to form our own opinions. For those reasons I think this document will be a very difficult one to write, and it will probably be relatively comprehensive.

Agreed, and thank you for your understanding. :)

With Node.js we have been able to pass along tribal knowledge through existing code, code reviews, and chatroom discussions, but that isn't the case for Go. So thanks for doing this!

You're very welcome. I'm not sure writing this has or will appear on my laundry list of "favorite things I've done in the last N years" because of the decidedly large and wide ranging opinions held. I'm glad this is happening, but it's near-term not gratifying because it feels like I'm on trial for explaining best practices.

I don't think I would enjoy writing this RFD.

This. 💯 This. 😉

I also think that our understanding of what what we think is right/wrong in writing Go will change as we learn more about using the language. I anticipate this RFD will see a lot of initial churn.

Yup, please keep the feedback coming. I'm trying my best to incorporate it and filter it through both my experience and that of other people who have written a large amount of Go.

> To provide feedback, bug fixes, request clarification, or initiate a discussion, please open an RFD issue and reference RFD 106 along with the necessary specifics. For issues where there is a debate or discussion, this document will include a link back to the individual issues where a particular item was addressed or debated. This was mentioned in chat, but we should keep one issue per RFD. I'm also not a huge fan of an RFD filled with links to comment chains. It may be useful for history and background, but if I'm reading the RFD prose, I would rather not hop back and forth between an issue and the RFD document. I would rather keep the GitHub issues in an O(N) rather than O(N*M) function, where N is the number of RFDs and M is the number of GitHub issues per RFD.

Of all the things I'm _not_ going to litigate, this is certainly one of them. Given the number of independent threads, I think it would be helpful, _but_ my care-cup for changing our mode of operation mid-RFD is somewhere between tiny and non-existant. My only point was to highlight that the lack of threaded conversation on a large topic such as this feels like it could become unwieldy.

I would like to see various Go-isms addressed, like: • Proper use of Go channels (if it's even possible to properly use them) and misuse of Go channels • The Go inheritance mess (interface{}, generics/go-generate, interfaces of interfaces (e.g. ReadWriter)), which are good patterns, which are bad patterns, and why • Linking Go to C code, when is it acceptable? • CGo vs "Pure" Go, which one we'll use and why • Goroutines: when to use them, when not to use them, avoiding zombie goroutines, communication with goroutines, inline goroutines vs calling go on an existing function, short goroutines vs long goroutines • Locking: what is the recommended lock (channels, Cond, Mutex, something else)? Really, explain the sync package, and this comment:

Many of these are unwritten but have already been called out in the commented out ToC as TODO items: * Appropriate use of Channels: https://github.com/joyent/rfd/blame/master/rfd/0106/README.adoc#L265 * Interfaces: https://github.com/joyent/rfd/blame/master/rfd/0106/README.adoc#L246 * go generate: didn't have this one outlined but will include a note in the commented out ToC * Linking to C code, added to ToC (TL;DR: SHOULD use pure Go where possible and SHOULD NOT, but not a MUST. cgo != go - sometimes it makes sense, however. See section on cgo here for a quick primmer: https://open.dgraph.io/post/badger/). * go routines: loosely identified in the worker pool and channel item but needs its own section. Added to the ToC. * Locking and Mutexes, already there

> Package sync provides basic synchronization primitives such as mutual exclusion locks. Other than the Once and WaitGroup types, most are intended for use by low-level library routines. Higher-level synchronization is better done via channels and communication. • If I have a client and a server both written in Go, does one 'import' the other's struct definitions, or are they defined in two places? Monorepo-like applications (Docker) have the client directly import the server's structure definitions. If we don't do monorepo, what do we do?

It depends. Is the struct being reused to communicate within the same binary or is the struct being serialized to message with a different component over the network? I don't like what Docker did her and actually think that what we did at HashiCorp was correct, namely there was a `structs` package that defined all of the reusable components. Then the serialization and deserialization code was decoupled from the struct definitions themselves. This provided a very good forward compatibility story. I think it's still true that Consul 0.9 can talk to Consul 0.1 even though there are three different network protocol versions spoken across the different releases. For example: https://www.consul.io/docs/upgrade-specific.html#raft-protocol-version-compatibility Preserving this has been a hugely important attribute for HashiCorp and their consumers. I believe this level of network API stability is what we, Joyent, are collectively are hoping to achieve, too.

I think there are a few cultural or somewhat unnecessary topics in this document. I don't know if that was intended, but they do not fit in to my expectation of this document. It could be that some of these are resulting in some flame and heated discussions. A list of these: • Monorepo • Workflow section • Editor integration • gRPC

I think I've addressed why those are included for the time being but I'm not opposed to removing them either. I do think they're important to address but if everyone would sleep better at night I can remove them. I do want to understand better, and we were getting there on Friday, why we think the treatment of those subjects was either necessary or unnecessary. Stay tuned in this regard.

I would like to see something about error handling. For Node.js, we luckily have a great document that we created that describes synchronous/async errors, and all that jazz. Go is known for having bad error handling.

I actually don't share that assessment of Go's error handling: I think it's banal, to say the least, but extremely effective and because you can lint for poor error handling, very effective and predictable in large code bases. Error wrapping is very important and part of the standard library, but providing accurate and robust treatment of this subject is why it hasn't been written: it takes a decent amount of effort to get right.

The standard library isn't consistent with error handling (some calls return negative values, some calls return error types, some calls immediately end the program). Separately, it would be useful to have a Go Error Handling document. Maybe that could explain this document: Defer, Panic, and Recover, as well as offer practical recommendations.

I don't think it's inconsistent, but I think there's a lack of prescriptive, opinionated treatment of the subject and that needs to be addressed.

I'll probably think of some more things that I would like more information about. I'll have to look at Go code that I've written previously to try to recall frustrations that I've had :D .

+1 I very much appreciate this constructive feedback.

sean- · 2017-08-07T22:14:47Z

On Aug 3, 2017, 2:29 PM -0700, Joshua M. Clulow ***@***.***>, wrote: For example, a hard requirement on the use of godoc(1) is probably not going to fly. We have not historically been that prescriptive in the way people produce documentation for their libraries, and I don't think we want to start now. In many cases it seems more appropriate to produce a README.md file in the root of the specific project repository for idiomatic Github presentation.

As mentioned by others, the Workflow section should probably be trimmed (or even removed). This is covered already in the general principles guide: all code is reviewed before integration, and this review is to be managed through Gerrit. This is mentioned in Change management within the general principles guide.

This was discussed last Friday and I have an outstanding TODO item to provide a paragraph providing more context in the rationale.

To the extent that you're making rules, such as those in the Deadline Timers and Timeouts section, it would be good to provide a paragraph on the motivation for those rules.

(I think I'm in the role of a scribe and librarian right now) Added a TODO to address this. The TL:DR: Duration uses the monotonic clock, time.Time is wall-clock time. Communicating across processes and over the network is best done using wall clock time. If you don't need to communicate it or don't need precision, use time.Duration for timeouts.

I'm not really sure what you mean by "Code SHOULD be fungible". Can you expand on that? Also, what does it mean for code written in Go to be "side-effect"-free? Go seems like it has all of the regular side effect peril of other imperative languages.

There are a few things packed into that sentence: * Go code for Joyent should be compatible with the idioms of the Go community writ-large. * Code that is consumed externally or written to be consumed should interact along package's exported interface boundary. I'm not saying Go code should be functional, but I am saying if you call something within one package, don't fugger around with the state that is outside of the scope of your package. The implied bit here being that exported interfaces are tested, documented, and stable for their callers.

With respect to vendor/ Management, I think it's a bit odd to use the term "cached libraries" to describe the process of checking a copy of everything on which you depend into your project repository along with your own code. If that's not what you mean by "cached libraries", it'd be good to expand on how exactly the dependencies are collected at build time. Presumably dep(1) also has a mode where it is able to resolve and download dependencies as a build step without checking in a copy of everything; if not, I think we'll seriously want to investigate using something more like NPM for dependency resolution.

As discussed on Friday, I was desperately trying to avoid the term "vendoring" but I'm going to capitulate since that's the jargon of this language community. I still say, "use a search engine to look for XYZ" instead of "google XYZ" but I digress. Yes, we MUST vendor libraries for the time being. I don't imagine this to be a long-term position, but until that day, it's as good as we've got (also not I'm not condoning this practice's shortcomings, but I'm also not blind to its benefits). And to be fair and transparent, it works pretty well in practice. I'm resistant to handwaving off best practices because we don't like the aesthetics because having worked in this system (and out of it), my experience has been that this is a net-win for both quality, reducing time-to-ship, and overall productivity.

Naming things is certainly one of the less tractable problems in software engineering. I think we should consider a repository naming convention that reflects what we're already doing in the joyent organisation on Github. For instance, I would say that a Go library which is expected to be consumed by other software (or even organisations) should probably be called go-libraryname.git. This means we could have a Go Moray client, go-moray.git, alongside the existing node-moray.git client without any confusion. If a project which includes some Go code produces a zone image shipped as part of Manta or Triton, it should probably be called triton-component.git or manta-component.git; e.g., triton-cns.git or manta-moray.git.

+1 Good suggestion. Added an item to the ToC. In general I've found that libraries should include the language (e.g. `go-` and binaries do not need to include the language. If the library includes a standalone binary, the binary should be nested, like `go-foolib/cmd/foo` where the basename is the command (e.g. `foo(1)`). For instance: https://github.com/hashicorp/go-sockaddr/tree/master/cmd/sockaddr

What does it mean to say "APIs within a single project SHOULD use tightly-coupled function signatures"? Is this something specific to the Go language or linker?

Inward facing projects should use normal function signatures as we've historically grown to love. This specific item is a call out to a common problem that we see when publishing client libraries that are consumed by external customers and where loose coupling is required. This was my way of strongly suggesting that we SHOULD NOT use the pattern described in the common case, how's that?

You've listed a veritable cavalcade of Static Analysis tools! Is there some expectation that we'd use all of these? If so, are we expecting to add support for these to the eng.git Makefile system, so that we can incorporate them in the make check target and be relatively uniform across projects?

Yes. The `make check` target should call some or all of these static analysis tools. Where `make check` has some options disabled by default, the CI pipeline should hit all of the tests, even the expensive ones that aren't enabled locally in a project (for whatever reason).

sean- · 2017-08-07T22:17:41Z

On Aug 3, 2017, 11:17 PM -0700, Joshua M. Clulow ***@***.***>, wrote: Oh, one more really important thing: support for interactive and post mortem debugging.

Yup. It's in the list already. There is included support for doing live debugging in the 5min crash course: https://cdn.rawgit.com/joyent/rfd/59ff23f/rfd/0106/README.html#five-minute-jump-start But that's no substitute and I'm not trying to hand wave past that element of things.

This document should demonstrate the best way to have Go processes abort on programmer errors (whether through a non-optional assertion or an explicit panic). It should also cover the use of mdb with Go, or link to a document which does. One imagines that it would be useful at a minimum to be able to enumerate the Go routines in a core file and get stacks for them, as well as print out objects from the heap. It would also be good to cover any plans for support for DTrace; both via a potential ustack helper, as well as any planned or current support for creating USDT providers in Go libraries and programs. The Debugging section of the Node Practices RFD includes analogous materials.

https://github.com/joyent/rfd/blame/master/rfd/0106/README.adoc#L184

watters · 2017-08-07T22:58:36Z

Per RFD 104, a Go section of the Engineering Guide is a necessary precondition to using go to build "[n]ew server-side projects" using Go:

New server-side projects should use one of the languages for which we have developed a Best Practices section within this Guide. That's primarily Node.js at the moment, but it's expected to expand in the near future.

Language-specific best practices guides must address the sections of this guide that refer to language-specific guides, including repository naming, coding style, lint checks, automated testing, and debuggability.

However, I'm having trouble sorting out when/how the discussion happens to decide whether or not to adopt a new language. In the present case, where is the "Why Joyent Should Start Using Go for Server-Side Projects" part of the discussion? I ask because, without that part, I have a really hard time even evaluating arguments for or against a given approach to the topics actually covered.

It seems quite reasonable to bundle the "Why" portion with the "How" portion when proposing a new language, but it seems like skipping the "Why" and just diving into the "How" adds friction. I'll appeal to the documentation for what an RFD should contain and ask: What problem does using Go solve?

Will it make our product more reliable? Will it make our product easier to build and maintain? Will it make our open source projects more accessible to a broader set of community contributions? Will it help us grow revenue? Will it help us better deliver on our promises to Samsung with SPC? Is it easier or more efficient to troubleshoot workloads in production?

It's very difficult (for me anyway) to make sense of this discussion, so far, because it seems like a number of important aspects of the discussion—the underlying reasons and goals for introducing Go in the first place—are presumed. But I would expect clarity about those those reasons and goals to be fundamental to sorting out some of the stickier questions here—code repo structure, how to handle dependencies, IPC preferences, etc.

At the moment, I'd guess that everyone investing in this process is bringing their own understanding of the goals and—in good faith—projecting them onto the discussion. I believe it would be a lot easier to actually settle some of these questions if the objectives were well-defined.

Also, if you could point to some projects that exemplify the approach to Go you're proposing, I'd be grateful. I often find it easier to understand topics like this when I can read the code for working software.

sean- · 2017-08-07T23:09:33Z

On Aug 4, 2017, 8:04 AM -0700, cburroughs ***@***.***>, wrote: Thanks for continuing to move this forward Sean. I think you have (inadvertently!) managed to fall on several process growing pains around sync-vs-asyn communication, remote timezones, where does the D in RFD happen, where documentation lives, and which current engineering practices are actually universal. This is all ultimately necessary to bring forward even if it has made for choppy waters for the HMS Golang.

As I said last week: some people want to do interactive chat (video), other people want to do offline. There is no way to make 100% of people happy in this regard.

Do you know what the first projects using golang are likely to be?

There are a number of candidates and items we're considering incorporating into Triton.

I think it would be helpful to have some idea of what things we will learn first For example: • A project that includes people who no previous go experience would teach us the most about how big a problem crazy looking things like GOPATH actually are. • A project that is a core service would need to carefully conform to the existing ecosystem. • SKDs, or other code for external consummation would help us understand how we need to manage runtime upgrade expectations and managing which interfaces are committed for external consumption. We will get to all of them eventually, but I'd like to know which points of contention will continue to be abstract, and which might soon have some concrete experience to report.

Yup. The genesis of the recommendation was that we would start in a monorepo and decompose that repository as necessary and fan out where required. My estimation is that it's easier to break things out than to fan repositories in.

> vendor/ Management I am still unsure if this doc is advocating checking 3rdparty dependencies in as source.

I had a takeaway from Friday to explicitly note that it is explicitly advocating for this.

> "Thread Worker Pools" Could you briefly elaborate on what this topic is? One concern I have about Go is choosing yet another language (after node) where "use normal OS threads" is not an available solution to problems.

https://rakyll.org/scheduler/ TL;DR: go routines are cheap, use 'em, and set an upper bound on the number of system threads by explicitly setting GOMAXPROCS. I remember FreeBSD and even Sun experimented with M:N threading models and abandoned them. Go uses that model and it appears to work reasonably well in practice. Anecdotal evidence (I forget where I picked the sup, but it was from someone at Google) suggests that having hundreds of thousands of goroutines is reasonable. One place where things get complicated is the number of OS-level threads, setting GOMAXPROCS, and CPU resource constraints. In particular this is problematic in smaller zones because smaller Zones detect all available cores. The effect of this is that from a resource utilization perspective, a process doesn't actually have access to all of that compute time. To effectively deal with that, smaller packages need smaller values of GOMAXPROCS and defaulting to the number of detected cores can be a poor default for Infrastructure VMs. I don't have a hard and fast rule for this yet, but I'm using GOMAXPROCS=4 as my default value in my SMF startup scripts and scaling it up if necessary. https://blog.golang.org/pipelines https://blog.golang.org/advanced-go-concurrency-patterns https://blog.golang.org/concurrency-is-not-parallelism https://blog.golang.org/go-concurrency-patterns-timing-out-and

> Tools similar to bazel could As someone not coming from a Go background, in my experiments I found that a pants/bazel-like like tool alleviated most of the GOPATH/vendor/deps crazyness. I felt like I was using a normal civilized language.

My boost.build, cmake, autoconf, scons, GNU Make, pmake, tup, and go build self was rather pleased with bazel and am secretly looking forward to a world where that tool is rewritten in Go so that I can sidestep the use of a JVM as a build tool (a build-time tool that I'm extremely reluctant to grow a hard dependency on). And to be clear, inclusion of bazel in the foot note of this RFD was to make a point, not to hint that its even on the radar for something we're considering. These are not the tealeaves you are looking for.

• > My understanding is that in the Go ecosystem, source based deps are typically (exclusively?) used. How do we identity to 3rd parties which code of ours represents a committed stable interface?

Correct. There ... hrm. I'm not finding it right now. There is a linter that warns when the signature of exported functions changes. Right now, and the conventional way of testing this, is to test the surface area of the exported interface via a `_test` package. For example: ```go package foo func Bar() {```} ``` Would have a public interface test: ```go package foo_test func TestBar(t *testing.T) { ... } ``` And use that to test that the exported API is meeting the testable contract. Where necessary, `rdeps` or other reverse dependency tools can be used to figure out who else implements a given package. For instance: https://godoc.org/github.com/pkg/errors?importers

• Is there a path forward to accommodate closed source Go code? A particular interest I have is in internal tooling and "glue" code (stuff I am otherwise doing in Python), for which neither bash nor node are ideal.

I am rather fond of Godo: https://github.com/go-godo/godo for the one-off odd-job that doesn't quite warrant becoming a full-fledged CLI utility.

• > As discussed in the last call, elaboration on how gRPC and provising http+json endpoints are not exclusive would help clarify that section.

Yup, it'll show up at some point.

sean- · 2017-08-07T23:44:15Z

On Aug 4, 2017, 10:29 AM -0700, Tim Gross ***@***.***>, wrote: > Every time the version of go(1) changes the contents of $GOPATH/pkg MUST be cleaned out: But instead of simply asserting it I decided to verify it (main.go): $ mkdir -p ~/go ~/go/pkg ~/go/src ~/go/bin $ go version go version go1.8.3 darwin/amd64 $ export GOPATH=~/go $ cd ~/go/src $ go get github.com/hashicorp/errwrap $ <edit main.go, see linked gist above> # build with go1.8 toolchain $ go build main.go $ ./main Doesn't exist: open /i/dont/exist: no such file or directory $ md5sum ~/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a 568fcbe1fd49a06960387090817e0c26 /Users/tim.gross/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a # build with go1.7 toolchain $ GOROOT=~/bin/go1.7 ${GOROOT}/go build main.go $ ./main Doesn't exist: open /i/dont/exist: no such file or directory $ md5sum ~/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a 568fcbe1fd49a06960387090817e0c26 /Users/tim.gross/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a As far as I understand here either the toolchain is linking the object file created by a later version of the toolchain or it's ignoring that anyways and recreating it from scratch (and not caching it in the pkg/ dir?). So either the toolchain just let us do something very dangerous and we should dump the pkg/ dir after upgrading the toolchain as I said in my comments, or we don't need to worry about it and the comment in the original RFD is wrong.

`go build` rebuilds the entire .a. If a Makefile uses `go generate` and `go generate` calls a binary build with an old version, or a Makefile wraps `go install` instead of `go build`, you can find yourself in a bad place. The recommendation is there to alleviate . This has largely been fixed: golang/go@7b87631 But something like ... gah, okay, found it, something like `gobin` is required: https://github.com/rjeczalik/bin This has been a periodic thorn that I was attempting to address with this item because the side-effects used to be horrific. Since the above commit to Go, the side-effects are mostly gone until you touch release-specific versions of binaries that are coupled to a particular version of Go. This item should be revised to include the use and recommendation to use `gobin`.

sean- · 2017-08-08T02:11:56Z

On Aug 7, 2017, 3:58 PM -0700, Cameron Watters ***@***.***>, wrote: Per RFD 104, a Go section of the Engineering Guide is a necessary precondition to using go to build "[n]ew server-side projects" using Go: > New server-side projects should use one of the languages for which we have developed a Best Practices section within this Guide. That's primarily Node.js at the moment, but it's expected to expand in the near future. > Language-specific best practices guides must address the sections of this guide that refer to language-specific guides, including repository naming, coding style, lint checks, automated testing, and debuggability. However, I'm having trouble sorting out when/how the discussion happens to decide whether or not to adopt a new language. In the present case, where is the "Why Joyent Should Start Using Go for Server-Side Projects" part of the discussion? I ask because, without that part, I have a really hard time even evaluating arguments for or against a given approach to the topics actually covered.

The scantest of treatment as to "Why" was basically to quote our engineering call when this was initially discussed and provide a loose treatment of Go's vision.

It seems quite reasonable to bundle the "Why" portion with the "How" portion when proposing a new language, but it seems like skipping the "Why" and just diving into the "How" adds friction. I'll appeal to the documentation for what an RFD should contain and ask: What problem does using Go solve?

That isn't something I was attempting to actually cover in this RFD. This RFD is centered around how to use Go effectively. "If your'e new to Go, here's how you get started using Go at Joyent. If you need logging, do this. If you need to handle interrupts, do XYZ. If you need to handle CLI options, do ABC. If you need to do thread or request cancelation, use context and here's why." etc. Said differently, "when you swing the Go hammer and you're trying to build a widget XYZ, a recommended way to build your widget is through the prescribed libraries and practices."

At the moment, I'd guess that everyone investing in this process is bringing their own understanding of the goals and—in good faith—projecting them onto the discussion. I believe it would be a lot easier to actually settle some of these questions if the objectives were well-defined. Also, if you could point to some projects that exemplify the approach to Go you're proposing, I'd be grateful. I often find it easier to understand topics like this when I can read the code for working software.

I can certainly try. What kind of software are you interested in seeing? There is not a single canonical, "this is the way to Go" piece of model Go software. There are, however, some pieces of software that do some things well and are worth considering as a model Go code.

watters · 2017-08-08T03:23:52Z

That isn't something I was attempting to actually cover in this RFD. This RFD is centered around how to use Go effectively.

Effectiveness really only makes sense when measured against a clear definition of purpose or objective. You have not shared the purposes and objectives you're using as a barometer for effectiveness in this document. Without them, it's difficult to have a productive discussion, since everyone else has to guess about yours or bring their own. My request was meant to help pull the discussion out of the morass it currently appears mired in.

What kind of software are you interested in seeing?

I'd love to see a server-side project that is analogous to a project someone at Joyent would build using these guidelines which is written in a manner you consider exemplar and substantially consistent with the recommendations presented here.

There is not a single canonical, "this is the way to Go" piece of model Go software. There are, however, some pieces of software that do some things well and are worth considering as a model Go code.

I doubt there is a single, canonical piece of model software in any language, anywhere. I think this discussion would be best served by projects that are worth considering as models.

tgross · 2017-08-08T12:49:56Z

@sean- wrote:

it feels like I'm on trial for explaining best practices.

Conflating "standard practices"/"ideomatic practices" with "best practices" is I think the core of the debate. If you ask the wider Node.js community about their "best practices" you'll find many many things that don't fit well with the Joyent software development methodology. Indeed, these differences are a lot of why golang is on the table for discussion today. Total acceptance of the "golang way" (i.e. the Google way) seems unlikely.

re monorepo @sean- wrote:

These projects grew into something that c[e]ased to be inward facing and [became] self-contained, insular ecosystems.

(Edits for typos and clarity, please let me know if that's wrong)

Much like Triton then, which was once closed-source. You're arguing in multiple places for making choices which make it easier for libraries and components to be consumed by third-party developers, or to invite new contributors. I'm all for that. But using these projects as counter-examples when you're saying they changed away from monorepos because they became more open projects runs counter to the idea that this is precisely what we want.

You've posted other examples of things like Hashicorp's use of "structs" packages, which implies you have a package that can be reused by third-parties. Having all our golang code in a monorepo makes use by third-parties that much more complicated.

re GOMAXPROC, @sean- wrote:

One place where things get complicated is the number of OS-level threads, setting GOMAXPROCS, and CPU resource constraints. In particular this is problematic in smaller zones because smaller Zones detect all available cores. The effect of this is that from a resource utilization perspective, a process doesn't actually have access to all of that compute time. To effectively deal with that, smaller packages need smaller values of GOMAXPROCS and defaulting to the number of detected cores can be a poor default for Infrastructure VMs. I don't have a hard and fast rule for this yet, but I'm using GOMAXPROCS=4 as my default value in my SMF startup scripts and scaling it up if necessary.

We've also seen this sort of problem on Triton in any language runtime that uses those new-fangled things called threads and try to automatically create a number of threads relative to the number of cores (ex. Erlang, lots of Java applications). I would really love to see a Joyent library that could be used by third-parties to calculate an appropriate number based on the cpu caps we give the zone. But we've said to end-users "it depends on the workload" without giving a ton of guidance here.

re vendoring, @sean- wrote:

Good point, and yes, definitely advocating for the inclusion of dependencies in vendor/

We should not stray from using explicit terminology here. You're advocating not for the collection of dependencies in vendor but the committing of this directory into the source code repository of each project. By handwaving this as "vendoring" or "inclusion" without defining the workflow, developers can all walk away with different versions of what this means. There are at least 3 different practices of using the "vendor" folder that I've seen in the wild: dumping in unversioned dependencies (!), submodules, and subtrees.

I'm fine with "vendoring" in practice, for example, so long as it involves git submodules (or subtrees, meh) and not dumping a pile of un-versioned software as a single commit into a repo.

re git workflow, @sean- wrote:

The reason the git workflow is included is two fold:
...
2) One of my FAQ's for people new to Go is, "I forked a Go library, checked it out, and it's not being used or something broken." The reason for that is that the path of the checked out fork is the per-user fork and does't use remotes with the canonical package name.

That's a good point. We should probably update that section to take into account anything specific to the gerrit workflow though.

re documentation via godoc, @sean- wrote:

Use of a README is largely insufficient.

A README is insufficient if and only if it's the only documentation. Joyent services and applications have man pages, and READMEs generally cover only the specific service/application and include pointers back to the top-level projects. While repository documentation is convenient and particularly useful for library code or standalone applications, the bulk of our documentation is collected in Triton user and operator guides and served at https://docs.joyent.com.

Admittedly, the specific tool used to create the docs is probably orthogonal to the publishing of those docs but saying "services written in go are documented differently" doesn't make a lot of sense with respect to the primary consumers of that documentation, who are end-users and operators, not developers of Triton.

re: pkg/ directory rebuilding, @sean- wrote:

go build rebuilds the entire .a.

Unless I'm missing something (always a possibility!) that's not what the demonstration I did shows. The .a file was not changed when I changed the build toolchain.

KodyKantor · 2017-08-08T15:15:38Z

Great, thanks for the reply Sean.

I would like to see various Go-isms addressed, like:
• Proper use of Go channels (if it's even possible to properly use them) and misuse of Go channels
• The Go inheritance mess (interface{}, generics/go-generate, interfaces of interfaces (e.g. ReadWriter)), which are good patterns, which are bad patterns, and why
• Linking Go to C code, when is it acceptable?
• CGo vs "Pure" Go, which one we'll use and why
• Goroutines: when to use them, when not to use them, avoiding zombie goroutines, communication with goroutines, inline goroutines vs calling go on an existing function, short goroutines vs long goroutines
• Locking: what is the recommended lock (channels, Cond, Mutex, something else)? Really, explain the sync package, and this comment:

Many of these are unwritten but have already been called out in the commented out ToC as TODO items:

Super! That's what I would really like this RFD discussion to become. A discussion of the relative merits of various Go language features and patterns. We should be having discussions about things like how we're going to do intra-process communication and locking taking into account debugging, transfer of state, sync vs async error handling, etc.

Having said that,

I think there are a few cultural or somewhat unnecessary topics in this document. I don't know if that was intended, but they do not fit in to my expectation of this document. It could be that some of these are resulting in some flame and heated discussions. A list of these:
• Monorepo
• Workflow section
• Editor integration
• gRPC

I think I've addressed why those are included for the time being but I'm not opposed to removing them either. I do think they're important to address but if everyone would sleep better at night I can remove them. I do want to understand better, and we were getting there on Friday, why we think the treatment of those subjects was either necessary or unnecessary. Stay tuned in this regard.

I do think these topics should be removed, and we should strongly consider removing topics that others have suggested removing. I hear your call to keep them included, but I don't think this is the right place to discuss these topics. I agree with @tgross that the trouble here is that we're conflating recommendations for best practices (in the code), and recommendations for process and culture. In my mind this document should be focused on discussing the things that are stubbed out in the .adoc comments (that is, the code best practices). Either that, or we should rename this RFD to reflect that it proposes changing our existing processes to reflect those used in the Go community and creating another RFD to address Go (the language) Best Practices.

If we don't make a change to the scope of this document we'll be arguing about this until trumpets sound, which doesn't seem like a great path forward.

I also think that @watters brings up a good point. I can't even remember why we chose to start pursuing Go. If I could suggest a change, I would say that we replace:

"Go is efficient, scalable, and productive. Some programmers find it fun to work in; others find it unimaginative, even boring. In this article we will explain why those are not contradictory positions. Go was designed to address the problems faced in software development at Google, which led to a language that is not a breakthrough research language but is nonetheless an excellent tool for engineering large software projects."
— Rob Pike, 2012

and

Joyent has a storied history with regards to Go. Without providing modern treatment to past issues, it is undeniable that many, if not most (as of 2017) distributed systems are being written in Go. To that extent, Joyent is embracing of Go’s contribution to distributed, enterprise, production-grade computing and celebrates the software engineering ethos held by many in the Go community because their beliefs are aligned with our principles.

with some reasoning why we may be deciding to adopt Go. This should be both technical and non-technical in nature. The non-technical part could discuss if and how Joyent and the Go community's values intersect (which you may be hinting at in the last sentence there). It would also be interesting to hear what problems Go may solve for us, like Cameron pointed out.

sean- · 2017-08-08T16:57:37Z

On Aug 8, 2017, 5:26 AM -0700, Tim Gross ***@***.***>, wrote: re documentation via godoc, @sean- wrote: > Use of a README is largely insufficient. A README is insufficient if and only if it's the only documentation. Joyent services and applications have man pages, and READMEs generally cover only the specific service/application and include pointers back to the top-level projects. While repository documentation is convenient and particularly useful for library code or standalone applications, the bulk of our documentation is collected in Triton user and operator guides and served at https://docs.joyent.com.

Generating documentation for docs.joyent.com from godoc is possible. I anticipated getting to this in the release engineering/CI/CD portion at some point and introducing a tool like: https://github.com/robertkrimen/godocdown so that the canonical source of truth is still within scope of the language community, but wouldn't result in duplication of effort. I haven't endorsed this yet, but since the documentation is an integral part of the code, it makes sense to make the record of truth godoc for Go libraries (not necessarily commands!). More work and research needs to be done here. Getting markdown to land in docs.joyent.com is largely, I believe, a function of plumbing. Getting community-driven docs for free is important for being a citizen of the Go ecosystem: * https://godoc.org/github.com/joyent/containerpilot/discovery * https://godoc.org/github.com/pkg/errors * https://godoc.org/golang.org/x/tools/cmd/stringer

Admittedly, the specific tool used to create the docs is probably orthogonal to the publishing of those docs but saying "services written in go are documented differently" doesn't make a lot of sense with respect to the primary consumers of that documentation, who are end-users and operators, not developers of Triton.

Yup, and I'm _not_ advocating docs don't end up at docs.joyent.com

re: pkg/ directory rebuilding, @sean- wrote: > go build rebuilds the entire .a. Unless I'm missing something (always a possibility!) that's not what the demonstration I did shows. The .a file was not changed.

Ah! Yes. I forgot the specifics and went to go look: go install saves its .a files, which can result in staleness. `go build` does not. `go build` rebuilds everything everytime. It was `go install` in the Build or install phase that causes problems: https://research.swtch.com/go2017

tgross · 2017-08-08T17:07:03Z

go build rebuilds everything everytime.

Please review the demonstration I provided. Try it yourself.

sean- · 2017-08-08T17:52:19Z

@tgross It doesn't save the result. The above output was expected. I'm actually not sure where .a files are used outside of go install.

When compiling a single main package, build writes
the resulting executable to an output file named after
the first source file ('go build ed.go rx.go' writes 'ed' or 'ed.exe')
or the source code directory ('go build unix/sam' writes 'sam' or 'sam.exe').
The '.exe' suffix is added when writing a Windows executable.

When compiling multiple packages or a single non-main package,
build compiles the packages but discards the resulting object,
serving only as a check that the packages can be built.

https://github.com/golang/go/blob/9e859d5e9c7c093937c79f87c4b3328383320843/src/cmd/go/internal/work/build.go#L44-L55

tgross · 2017-08-08T17:56:39Z

@tgross It doesn't save the result. The above output was expected. I'm actually not sure where .a files are used outside of go install.

Ok, but that suggests that you really do need to toss pkg/ after a toolchain upgrade, which was the thing you were disputing from my original comment. Because otherwise you've got .a files on disk that are not the ones you're using to build your binary.

jclulow · 2017-08-11T02:59:55Z

Sean wrote:

Yes, we MUST vendor libraries for the time being. I don't imagine this to be a long-term position, but until that day, it's as good as we've got (also not I'm not condoning this practice's shortcomings, but I'm also not blind to its benefits). And to be fair and transparent, it works pretty well in practice. I'm resistant to handwaving off best practices because we don't like the aesthetics because having worked in this system (and out of it), my experience has been that this is a net-win for both quality, reducing time-to-ship, and overall productivity.

This is a point on which I, and others here, strongly disagree with the RFD as written today. We are emphatically not "handwaving off" anything for "aesthetics" or any other reasons. In fact, we have repeatedly raised legitimate technical concerns with the idea of checking in copies of all of the dependencies for a project. Indeed, there exists today a solution to this problem which comes from the Go community!

We should, as a build step, resolve and fetch dependencies using dep(1). This would prevent pollution of the source history with unrelated churn, and it means we can specify dependencies which are unlocked across a reasonable semver range (e.g., accept available patch releases without needing new commits). It also means there will only be one copy of the dependency, identified by a version number or a git SHA or branch, not a divorced copy checked in to each project where there really is no guarantee that it has not been purposefully or accidentally modified on the way in or subsequently.

It would be good if we could constrain the discussion here to concrete properties of mechanisms for the management of dependencies. Cases could easily be made for just about anything under the comparatively nebulous banners of quality, velocity, and productivity.

Yes. The make check target should call some or all of these static analysis tools. Where make check has some options disabled by default, the CI pipeline should hit all of the tests, even the expensive ones that aren't enabled locally in a project (for whatever reason).

I don't think make check should mean different things in different contexts. The set of checks run on a developer's workstation should be the exact same checks as those that are run in a CI environment, or really anywhere else. To do otherwise would be confusing. Note that in our repositories we have a make prepush target, which is generally the combination of make check and make test. The check target is for things like style and lint; the test target is generally unit tests.

sean- self-assigned this Aug 2, 2017

jclulow closed this as completed Apr 22, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFD 106 Discussion #45

RFD 106 Discussion #45

tgross commented Aug 1, 2017 •

edited

Loading

tgross commented Aug 1, 2017 •

edited

Loading

richardkiene commented Aug 1, 2017 •

edited

Loading

jwreagor commented Aug 2, 2017

jclulow commented Aug 2, 2017 •

edited

Loading

jwreagor commented Aug 2, 2017 •

edited

Loading

tgross commented Aug 2, 2017

KodyKantor commented Aug 3, 2017 •

edited

Loading

jclulow commented Aug 3, 2017 •

edited

Loading

jclulow commented Aug 4, 2017

cburroughs commented Aug 4, 2017

tgross commented Aug 4, 2017 •

edited

Loading

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

richardkiene commented Aug 7, 2017

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

watters commented Aug 7, 2017

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

sean- commented Aug 8, 2017 via email

watters commented Aug 8, 2017

tgross commented Aug 8, 2017

KodyKantor commented Aug 8, 2017

sean- commented Aug 8, 2017 via email

tgross commented Aug 8, 2017

sean- commented Aug 8, 2017

tgross commented Aug 8, 2017

jclulow commented Aug 11, 2017

RFD 106 Discussion #45

RFD 106 Discussion #45

Comments

tgross commented Aug 1, 2017 • edited Loading

tgross commented Aug 1, 2017 • edited Loading

Version of Go

Project Structure

Workflow

Vendor management

Naming Conventions

Explicit Types

gRPC

richardkiene commented Aug 1, 2017 • edited Loading

jwreagor commented Aug 2, 2017

jclulow commented Aug 2, 2017 • edited Loading

jwreagor commented Aug 2, 2017 • edited Loading

tgross commented Aug 2, 2017

KodyKantor commented Aug 3, 2017 • edited Loading

jclulow commented Aug 3, 2017 • edited Loading

jclulow commented Aug 4, 2017

cburroughs commented Aug 4, 2017

tgross commented Aug 4, 2017 • edited Loading

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

richardkiene commented Aug 7, 2017

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

watters commented Aug 7, 2017

sean- commented Aug 7, 2017 via email

sean- commented Aug 7, 2017 via email

sean- commented Aug 8, 2017 via email

watters commented Aug 8, 2017

tgross commented Aug 8, 2017

KodyKantor commented Aug 8, 2017

sean- commented Aug 8, 2017 via email

tgross commented Aug 8, 2017

sean- commented Aug 8, 2017

tgross commented Aug 8, 2017

jclulow commented Aug 11, 2017

tgross commented Aug 1, 2017 •

edited

Loading

tgross commented Aug 1, 2017 •

edited

Loading

richardkiene commented Aug 1, 2017 •

edited

Loading

jclulow commented Aug 2, 2017 •

edited

Loading

jwreagor commented Aug 2, 2017 •

edited

Loading

KodyKantor commented Aug 3, 2017 •

edited

Loading

jclulow commented Aug 3, 2017 •

edited

Loading

tgross commented Aug 4, 2017 •

edited

Loading