
RFD 106 Discussion #45

Closed
tgross opened this issue Aug 1, 2017 · 31 comments

@tgross
Contributor

tgross commented Aug 1, 2017

For discussion of RFD 106: Engineering Guide: Golang Best Practices

@tgross
Contributor Author

tgross commented Aug 1, 2017

Version of Go

The Go team and project are consistently advancing the state of the Go compiler and standard library. Released versions and release candidates are considered fit for production and should therefore be used in production. Upgrade early and often.

Joyent customers are often large enterprises with much less aggressive upgrade cycles. In service-upgrade scenarios, there may be services built with multiple toolchains in play at any given time. We should consider whether constant toolchain upgrades will have an impact on downstream compliance teams.


Project Structure

$GOPATH SHOULD be set to $HOME/go and SHOULD be incorporated into your shell’s environment.

While this is a common practice in the community, it's terrible advice: it requires dumping pkg with each toolchain update, and it requires that all projects share the same toolchain version and dependencies. This makes building unrelated (perhaps third-party) projects more complicated than it needs to be.

More importantly, this is an unnecessary directive, as the entire GOPATH should be relocatable anyway. Whether Alice is working in ~/work/project1 and Bob is working in ~/go doesn't matter, so long as each sets GOPATH to the root of the workspace, wherever it is.
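
A minimal sketch of that relocatability (the mktemp directory is a stand-in for ~/work/project1 or ~/go; any root works):

```shell
# Hedged sketch: the workspace root is relocatable, so any directory
# works as long as GOPATH points at it.
ws=$(mktemp -d)               # stand-in for ~/work/project1 or ~/go
mkdir -p "$ws/src" "$ws/bin" "$ws/pkg"
export GOPATH="$ws"
echo "workspace at $GOPATH"
```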

Where appropriate, it is RECOMMENDED to make use of monolithic repositories (mono-repo).

Can we define what a monorepo even is in the context of a distributed system of open source projects? I can't think of any large open source projects of similar scope to Triton/Manta that are organized this way. (Compare to our favorite container-related golang projects such as kubernetes or docker, which started monolithic and were broken up.)

There are also projects with very different purposes, release cycles, and development methodologies at Joyent. To borrow some examples of existing Node.js projects, the https://github.com/joyent/node-verror is an open source library that can be used by end-users separately from any other work we have in Node.js, including those who have no particular interest in Triton services such as https://github.com/joyent/sdc-cn-agent.

One might then say that some categories of projects belong in their own monorepos, so that perhaps Triton and Manta and SDKs are all separate monorepos. This is more-or-less what Google has done, but the primary requirement motivating their multi-monorepo model is that their "main" monorepo is closed source but they have large open source projects like Android and Chromium.


Workflow

It's not clear to me why we need a language-specific workflow section. Maybe the rationale for including this could be elaborated on?


Vendor management

Forked and cached libraries in vendor/ MUST be managed via the dep(1) tool:

The golang team says that dep is "ok to use", but its commands are not stable and a big security epic remains undone. Is it perhaps too early to standardize on this as a MUST directive?

I also think this section needs to state whether or not we're suggesting committing vendored dependencies to the source code repository, as many golang projects do. I'm opposed to this except in the case of submodules, if only because it makes commit history difficult to understand.


Naming Conventions

Recommendations
Go software SHOULD conform to the recommendations outlined in the following resources:

Calling this SHOULD rather than MUST is perhaps a little too weak, given this from the Package Names doc:

Good package names are short and clear. They are lower case

Non-lowercase package names interact awfully when case-preserving-but-case-insensitive file systems meet fully case-sensitive ones. The sirupsen/logrus project found this out the hard way and broke a chunk of the ecosystem in the process.


Explicit Types

Everything here looks good, but maybe we should be explicit about the examples for folks who might not be 100% familiar with the type system already? Something like:

GOOD

type ID uint64          // a distinct, named type for identifiers

type User struct {
    id ID               // only an ID (not a bare uint64) fits here
}

BAD

type User struct {
    id uint64           // any uint64 value can be mistaken for an id
}

... or something along those lines.


gRPC

Transport and serialization layers are orthogonal to language best practices. I'm not sure it belongs in this document.

@richardkiene
Contributor

richardkiene commented Aug 1, 2017

"Go is efficient, scalable, and productive. Some programmers find it fun to work in; others find it unimaginative, even boring. In this article we will explain why those are not contradictory positions. Go was designed to address the problems faced in software development at Google, which led to a language that is not a breakthrough research language but is nonetheless an excellent tool for engineering large software projects."
— Rob Pike, 2012

Is this really necessary in the RFD? It seems like an appeal to authority fallacy when I read it.

Without providing modern treatment to past issues, it is undeniable that many, if not most (as of 2017) distributed systems are being written in Go.

Could you either reference the source(s) of this claim or drop it from the RFD? Without source material to back it up, I'm hard-pressed to believe this is quantifiable, let alone undeniable, which makes it seem like an invisible-army/bandwagon fallacy.

@jwreagor

jwreagor commented Aug 2, 2017

It's not clear to me why we need a language-specific workflow section. Maybe the rationale for including this could be elaborated on?

I don't entirely understand why the Git workflow had to be there, but I'd imagine we'd want something on using commands like go get, go install, go build, and go test. Even if much of this ends up wrapped in Makefiles, I think this is the document to introduce them. Go has a fair bit more out-of-the-box tooling than dynamic languages, and there are bound to be useful practices worth sharing. Or maybe it's easy enough to RTFM.

Also, there's nothing on building CLIs. Doesn't much of the Go community favor every application defining useful CLIs for most repos/projects? Building command-line tools is also mentioned in the Node.js guide.

Transport and serialization layers are orthogonal to language best practices. I'm not sure it belongs in this document.

Mentioning gRPC doesn't appear to be any different than the Node.js best practices mentioning REST, HTTP, and node-restify. Except that in that case the practices came out of actual projects already using them, whereas no project I know of is evaluating gRPC. If the Node.js best practices have made good use of standardizing HTTP clients, errors, logging, etc., then I'd eventually hope to see that here as well.

Other things I'd like to see in this document or discussed...

  • +1 @tgross; continue expanding on types, with real examples and benefits/gotchas
  • When and how to wrap concrete types/collections
  • Anonymous functions, clear distinctions or similarities with Node
  • Function types (with methods), Function collections, Functions as fields, etc.
  • Goroutines and common asynchronous control flow patterns
  • Stacktraces, error types, error handling, and/vs. logging

@sean- self-assigned this Aug 2, 2017
@jclulow
Contributor

jclulow commented Aug 2, 2017

Mentioning gRPC doesn't appear to be any different than the Node.js best practices mentioning REST, HTTP and node-restify.

It's not quite the same. The mention of Restify in the Node guide is for when you are building a HTTP server. It does not prescribe that you should use a HTTP-based transport for everything.

If we're going to elevate a particular inter-process communication protocol above the rest, it frankly ought to be an HTTP API with JSON-formatted payloads, if only so that it fits in with much of the rest of the software estate we already have and the tools (like curl) that we already use when debugging.

@jwreagor

jwreagor commented Aug 2, 2017

It's not quite the same. The mention of Restify in the Node guide is for when you are building a HTTP server. It does not prescribe that you should use a HTTP-based transport for everything.

If gRPC is going to be mentioned at all, then I believe the protobuf language behind gRPC should be defined alongside language best practices. I also think gRPC (or HTTP) should be well defined for specific cases, not unlike how the Node.js document prescribes support for "HTTP Servers".

@tgross
Contributor Author

tgross commented Aug 2, 2017

@jclulow wrote:

It's not quite the same. The mention of Restify in the Node guide is for when you are building a HTTP server. It does not prescribe that you should use a HTTP-based transport for everything.

Exactly. Last time I checked LDAP and Postgres (for example) don't speak gRPC.

@cheapRoc wrote:

Also, there's nothing on building CLIs. Doesn't much of the Go community favor every application defining useful CLIs for most repos/projects? Building command-line tools is also mentioned in Node.js guide.

Fair. But there are currently zero (0) Joyent services written in golang so anointing particular third-party packages seems awfully premature.

@KodyKantor
Contributor

KodyKantor commented Aug 3, 2017

Thanks for taking the time to write this up! Unfortunately, there are many different opinions on many of these topics even within the Go community. Joyent doesn't have much experience with Go, so we haven't been able to form our own opinions. For those reasons I think this document will be a very difficult one to write, and it will probably be relatively comprehensive. With Node.js we have been able to pass along tribal knowledge through existing code, code reviews, and chatroom discussions, but that isn't the case for Go. So thanks for doing this! I don't think I would enjoy writing this RFD. I also think that our understanding of what we think is right/wrong in writing Go will change as we learn more about using the language. I anticipate this RFD will see a lot of initial churn.

To provide feedback, bug fixes, request clarification, or initiate a discussion, please open an RFD issue and reference RFD 106 along with the necessary specifics. For issues where there is a debate or discussion, this document will include a link back to the individual issues where a particular item was addressed or debated.

This was mentioned in chat, but we should keep one issue per RFD. I'm also not a huge fan of an RFD filled with links to comment chains. It may be useful for history and background, but if I'm reading the RFD prose, I would rather not hop back and forth between an issue and the RFD document. I would rather keep the GitHub issues at O(N) rather than O(N*M), where N is the number of RFDs and M is the number of GitHub issues per RFD.

I would like to see various Go-isms addressed, like:

  • Proper use of Go channels (if it's even possible to properly use them) and misuse of Go channels
  • The Go inheritance mess (interface{}, generics/go-generate, interfaces of interfaces (e.g. ReadWriter)), which are good patterns, which are bad patterns, and why
  • Linking Go to C code, when is it acceptable?
  • CGo vs "Pure" Go, which one we'll use and why
  • Goroutines: when to use them, when not to use them, avoiding zombie goroutines, communication with goroutines, inline goroutines vs calling go on an existing function, short-running goroutines vs long-running goroutines - are both equally appropriate?
  • Locking: what is the recommended lock (channels, Cond, Mutex, something else)? Really, explain the sync package, and this comment:

Package sync provides basic synchronization primitives such as mutual exclusion locks. Other than the Once and WaitGroup types, most are intended for use by low-level library routines. Higher-level synchronization is better done via channels and communication.

  • If I have a client and a server both written in Go, does one 'import' the other's struct definitions, or are they defined in two places? Monorepo-like applications (Docker) have the client directly import the server's structure definitions. If we don't do monorepo, what do we do?

I think there are a few cultural or somewhat unnecessary topics in this document. I don't know if that was intended, but they do not fit in to my expectation of this document. It could be that some of these are resulting in some flame and heated discussions. A list of these:

  • Monorepo
  • Workflow section
  • Editor integration
  • gRPC

I would like to see something about error handling. For Node.js, we luckily have a great document that we created that describes synchronous/async errors, and all that jazz. Go is known for having bad error handling. The standard library isn't consistent in its error handling (some calls return negative values, some calls return error types, some calls immediately end the program). Separately, it would be useful to have a Go Error Handling document. Maybe that could build on the Defer, Panic, and Recover blog post, as well as offer practical recommendations.

I'll probably think of some more things that I would like more information about. I'll have to look at Go code that I've written previously to try to recall frustrations that I've had :D .

@jclulow
Contributor

jclulow commented Aug 3, 2017

Kody wrote:

I think there are a few cultural or somewhat unnecessary topics in this document. I don't know if that was intended, but they do not fit in to my expectation of this document. It could be that some of these are resulting in some flame and heated discussions.

I definitely agree with this. A short list of other things that this document should probably not be:

  • a place for Rob Pike quotes
  • a document which justifies the use of Go, rather than a document which explains how, if a project team so chooses, they should use Go
  • making broad claims about relative productivity or proliferation which are at best difficult to conclusively verify, and which do not really affect the outcome; e.g.
    • "Use of one-workspace per project is counter-productive"
    • "it is undeniable that many, if not most ... distributed systems are being written in Go"
    • "fascism ... increases the overall productivity of the entire Go community"
    • "Embrace the pass-by-value nature of Go, be productive"

In general, this document should focus on guidelines and rules specific to the use of Go within projects that are structured similarly to the large, preexisting body of Joyent software. If you would like to fundamentally restructure the software estate, that's obviously a much more expensive (and essentially orthogonal) proposition and really needs to be done as an adjustment to our General Principles (outlined in RFD 104).

For example, a hard requirement on the use of godoc(1) is probably not going to fly. We have not historically been that prescriptive in the way people produce documentation for their libraries, and I don't think we want to start now. In many cases it seems more appropriate to produce a README.md file in the root of the specific project repository for idiomatic Github presentation.

As mentioned by others, the Workflow section should probably be trimmed (or even removed). This is covered already in the general principles guide: all code is reviewed before integration, and this review is to be managed through Gerrit. This is mentioned in Change management within the general principles guide.

To the extent that you're making rules, such as those in the Deadline Timers and Timeouts section, it would be good to provide a paragraph on the motivation for those rules.

I'm not really sure what you mean by "Code SHOULD be fungible". Can you expand on that? Also, what does it mean for code written in Go to be "side-effect"-free? Go seems like it has all of the regular side effect peril of other imperative languages.

With respect to vendor/ Management, I think it's a bit odd to use the term "cached libraries" to describe the process of checking a copy of everything on which you depend into your project repository along with your own code. If that's not what you mean by "cached libraries", it'd be good to expand on how exactly the dependencies are collected at build time. Presumably dep(1) also has a mode where it is able to resolve and download dependencies as a build step without checking in a copy of everything; if not, I think we'll seriously want to investigate using something more like NPM for dependency resolution.

Naming things is certainly one of the less tractable problems in software engineering. I think we should consider a repository naming convention that reflects what we're already doing in the joyent organisation on Github. For instance, I would say that a Go library which is expected to be consumed by other software (or even organisations) should probably be called go-libraryname.git. This means we could have a Go Moray client, go-moray.git, alongside the existing node-moray.git client without any confusion. If a project which includes some Go code produces a zone image shipped as part of Manta or Triton, it should probably be called triton-component.git or manta-component.git; e.g., triton-cns.git or manta-moray.git.

What does it mean to say "APIs within a single project SHOULD use tightly-coupled function signatures"? Is this something specific to the Go language or linker?

You've listed a veritable cavalcade of Static Analysis tools! Is there some expectation that we'd use all of these? If so, are we expecting to add support for these to the eng.git Makefile system, so that we can incorporate them in the make check target and be relatively uniform across projects?

@jclulow
Contributor

jclulow commented Aug 4, 2017

Oh, one more really important thing: support for interactive and post mortem debugging.

This document should demonstrate the best way to have Go processes abort on programmer errors (whether through a non-optional assertion or an explicit panic). It should also cover the use of mdb with Go, or link to a document which does. One imagines that it would be useful at a minimum to be able to enumerate the Go routines in a core file and get stacks for them, as well as print out objects from the heap.

It would also be good to cover any plans for support for DTrace; both via a potential ustack helper, as well as any planned or current support for creating USDT providers in Go libraries and programs.

The Debugging section of the Node Practices RFD includes analogous materials.

@cburroughs
Contributor

Thanks for continuing to move this forward, Sean. I think you have (inadvertently!) managed to fall on several process growing pains around sync-vs-async communication, remote timezones, where the D in RFD happens, where documentation lives, and which current engineering practices are actually universal. This is all ultimately necessary to bring forward, even if it has made for choppy waters for the HMS Golang.

Do you know what the first projects using golang are likely to be? I think it would be helpful to have some idea of what things we will learn first. For example:

  • A project that includes people who have no previous Go experience would teach us the most about how big a problem crazy-looking things like GOPATH actually are.
  • A project that is a core service would need to carefully conform to the existing ecosystem.
  • SDKs, or other code for external consumption, would help us understand how we need to manage runtime upgrade expectations and which of our interfaces are committed for external consumers.

We will get to all of them eventually, but I'd like to know which points of contention will continue to be abstract, and which might soon have some concrete experience to report.

vendor/ Management

I am still unsure whether this doc is advocating checking third-party dependencies in as source.

"Thread Worker Pools"

Could you briefly elaborate on what this topic is? One concern I have about Go is choosing yet another language (after node) where "use normal OS threads" is not an available solution to problems.

Tools similar to bazel could

As someone not coming from a Go background, in my experiments I found that a pants/bazel-like tool alleviated most of the GOPATH/vendor/deps craziness. I felt like I was using a normal, civilized language.

  • My understanding is that in the Go ecosystem, source-based deps are typically (exclusively?) used. How do we identify to third parties which of our code represents a committed, stable interface?

  • Is there a path forward to accommodate closed-source Go code? A particular interest I have is internal tooling and "glue" code (stuff I am otherwise doing in Python), for which neither bash nor node are ideal.

  • As discussed in the last call, elaborating on how gRPC and providing http+json endpoints are not mutually exclusive would help clarify that section.

@tgross
Contributor Author

tgross commented Aug 4, 2017

@sean- wrote in private chat:

I don't understand the comment re $GOPATH/pkg and based on what I read seems incorrect but I wasn't sure if I misunderstood something or not and wanted to clarify w/ you first.

Which appears to be referring to my comment here:

While this is a common practice in the community it's terrible advice precisely because it requires dumping pkg with each toolchain update

My comment was because of this from the original RFD:

Every time the version of go(1) changes the contents of $GOPATH/pkg MUST be cleaned out:

But instead of simply asserting it I decided to verify it (main.go):

$ mkdir -p ~/go ~/go/pkg ~/go/src ~/go/bin
$ go version
go version go1.8.3 darwin/amd64
$ export GOPATH=~/go
$ cd ~/go/src
$ go get github.com/hashicorp/errwrap
$ <edit main.go, see linked gist above>

# build with go1.8 toolchain
$ go build main.go
$ ./main
Doesn't exist: open /i/dont/exist: no such file or directory
$ md5sum ~/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a
568fcbe1fd49a06960387090817e0c26  /Users/tim.gross/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a

# build with go1.7 toolchain
$ GOROOT=~/bin/go1.7.6 ~/bin/go1.7.6/bin/go version
go version go1.7.6 darwin/amd64
$ GOROOT=~/bin/go1.7.6 ~/bin/go1.7.6/bin/go build main.go
$ ./main
Doesn't exist: open /i/dont/exist: no such file or directory
$ md5sum ~/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a
568fcbe1fd49a06960387090817e0c26  /Users/tim.gross/go/pkg/darwin_amd64/github.com/hashicorp/errwrap.a

As far as I understand, either the toolchain is linking the object file created by a different version of the toolchain, or it's ignoring it and recreating the package from scratch (and not caching it in the pkg/ dir?). So either the toolchain just let us do something very dangerous and we should dump the pkg/ dir after upgrading the toolchain, as I said in my comments, or we don't need to worry about it and the directive in the original RFD is wrong.

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@richardkiene
Contributor

Supporting evidence:

  • dgraph
  • cockroachdb
  • etcd
  • consul
  • openebs

...

https://github.com/avelino/awesome-go#database

I'm not aware of another language that comes close to this level of activity in terms of net-new research and software and adoption. In the last few years RethinkDB, Mongo, Scylla, and Cassandra are the notable non-Go databases that I'm aware of. I'm sure others exist. I can remove the statement and didn't think it was controversial to include why it was important for us to incorporate Go into our ecosystem of tools. I regret leaving out the keyword, "new" as in "new distributed systems," which was my intention.

A litany of Go software and your personal awareness is hardly proof of the following:

it is undeniable that many, if not most (as of 2017) distributed systems are being written in Go.

As some others have suggested, it may be best to focus the RFD on how to do Go well, and not bother justifying Golang with anecdotes.

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@watters

watters commented Aug 7, 2017

Per RFD 104, a Go section of the Engineering Guide is a necessary precondition to using Go for "[n]ew server-side projects":

New server-side projects should use one of the languages for which we have developed a Best Practices section within this Guide. That's primarily Node.js at the moment, but it's expected to expand in the near future.

Language-specific best practices guides must address the sections of this guide that refer to language-specific guides, including repository naming, coding style, lint checks, automated testing, and debuggability.

However, I'm having trouble sorting out when/how the discussion happens to decide whether or not to adopt a new language. In the present case, where is the "Why Joyent Should Start Using Go for Server-Side Projects" part of the discussion? I ask because, without that part, I have a really hard time even evaluating arguments for or against a given approach to the topics actually covered.

It seems quite reasonable to bundle the "Why" portion with the "How" portion when proposing a new language, but it seems like skipping the "Why" and just diving into the "How" adds friction. I'll appeal to the documentation for what an RFD should contain and ask: What problem does using Go solve?

Will it make our product more reliable? Will it make our product easier to build and maintain? Will it make our open source projects more accessible to a broader set of community contributions? Will it help us grow revenue? Will it help us better deliver on our promises to Samsung with SPC? Is it easier or more efficient to troubleshoot workloads in production?

It's very difficult (for me anyway) to make sense of this discussion, so far, because it seems like a number of important aspects of the discussion—the underlying reasons and goals for introducing Go in the first place—are presumed. But I would expect clarity about those reasons and goals to be fundamental to sorting out some of the stickier questions here—code repo structure, how to handle dependencies, IPC preferences, etc.

At the moment, I'd guess that everyone investing in this process is bringing their own understanding of the goals and—in good faith—projecting them onto the discussion. I believe it would be a lot easier to actually settle some of these questions if the objectives were well-defined.

Also, if you could point to some projects that exemplify the approach to Go you're proposing, I'd be grateful. I often find it easier to understand topics like this when I can read the code for working software.

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@sean-
Contributor

sean- commented Aug 7, 2017 via email

@sean-
Contributor

sean- commented Aug 8, 2017 via email

@watters

watters commented Aug 8, 2017

That isn't something I was attempting to actually cover in this RFD. This RFD is centered around how to use Go effectively.

Effectiveness really only makes sense when measured against a clear definition of purpose or objective. You have not shared the purposes and objectives you're using as a barometer for effectiveness in this document. Without them, it's difficult to have a productive discussion, since everyone else has to guess about yours or bring their own. My request was meant to help pull the discussion out of the morass it currently appears mired in.

What kind of software are you interested in seeing?

I'd love to see a server-side project, analogous to something someone at Joyent would build using these guidelines, that is written in a manner you consider exemplary and substantially consistent with the recommendations presented here.

There is not a single canonical, "this is the way to Go" piece of model Go software. There are, however, some pieces of software that do some things well and are worth considering as a model Go code.

I doubt there is a single, canonical piece of model software in any language, anywhere. I think this discussion would be best served by projects that are worth considering as models.

@tgross
Contributor Author

tgross commented Aug 8, 2017

@sean- wrote:

it feels like I'm on trial for explaining best practices.

Conflating "standard practices"/"idiomatic practices" with "best practices" is, I think, the core of the debate. If you ask the wider Node.js community about their "best practices" you'll find many, many things that don't fit well with the Joyent software development methodology. Indeed, these differences are a lot of why golang is on the table for discussion today. Total acceptance of the "golang way" (i.e. the Google way) seems unlikely.


re monorepo @sean- wrote:

These projects grew into something that c[e]ased to be inward facing and [became] self-contained, insular ecosystems.

(Edits for typos and clarity, please let me know if that's wrong)

Much like Triton then, which was once closed-source. You're arguing in multiple places for making choices which make it easier for libraries and components to be consumed by third-party developers, or to invite new contributors. I'm all for that. But using these projects as counter-examples when you're saying they changed away from monorepos because they became more open projects runs counter to the idea that this is precisely what we want.

You've posted other examples of things like Hashicorp's use of "structs" packages, which implies you have a package that can be reused by third-parties. Having all our golang code in a monorepo makes use by third-parties that much more complicated.


re GOMAXPROC, @sean- wrote:

One place where things get complicated is the number of OS-level threads, setting GOMAXPROCS, and CPU resource constraints. In particular this is problematic in smaller zones because smaller Zones detect all available cores. The effect of this is that from a resource utilization perspective, a process doesn't actually have access to all of that compute time. To effectively deal with that, smaller packages need smaller values of GOMAXPROCS and defaulting to the number of detected cores can be a poor default for Infrastructure VMs. I don't have a hard and fast rule for this yet, but I'm using GOMAXPROCS=4 as my default value in my SMF startup scripts and scaling it up if necessary.

We've also seen this sort of problem on Triton in any language runtime that uses those new-fangled things called threads and tries to automatically create a number of threads relative to the number of cores (e.g. Erlang, lots of Java applications). I would really love to see a Joyent library that third parties could use to calculate an appropriate thread count based on the CPU caps we give the zone. But we've told end-users "it depends on the workload" without giving a ton of guidance here.


re vendoring, @sean- wrote:

Good point, and yes, definitely advocating for the inclusion of dependencies in vendor/

We should use explicit terminology here. You're advocating not for the collection of dependencies in vendor/ but for committing that directory into the source repository of each project. By handwaving this as "vendoring" or "inclusion" without defining the workflow, developers can each walk away with a different version of what it means. There are at least three different uses of the vendor/ directory that I've seen in the wild: dumping in unversioned dependencies (!), submodules, and subtrees.

I'm fine with "vendoring" in practice, for example, so long as it involves git submodules (or subtrees, meh) and not dumping a pile of un-versioned software as a single commit into a repo.


re git workflow, @sean- wrote:

The reason the git workflow is included is two fold:
...
2) One of my FAQs for people new to Go is, "I forked a Go library, checked it out, and it's not being used, or something is broken." The reason for that is that the checked-out fork lives at the per-user fork's path and doesn't use remotes with the canonical package name.

That's a good point. We should probably update that section to take into account anything specific to the gerrit workflow though.


re documentation via godoc, @sean- wrote:

Use of a README is largely insufficient.

A README is insufficient if and only if it's the only documentation. Joyent services and applications have man pages, and READMEs generally cover only the specific service/application and include pointers back to the top-level projects. While repository documentation is convenient and particularly useful for library code or standalone applications, the bulk of our documentation is collected in Triton user and operator guides and served at https://docs.joyent.com.

Admittedly, the specific tool used to create the docs is probably orthogonal to the publishing of those docs but saying "services written in go are documented differently" doesn't make a lot of sense with respect to the primary consumers of that documentation, who are end-users and operators, not developers of Triton.


re: pkg/ directory rebuilding, @sean- wrote:

go build rebuilds the entire .a.

Unless I'm missing something (always a possibility!) that's not what the demonstration I did shows. The .a file was not changed when I changed the build toolchain.

@KodyKantor
Contributor

Great, thanks for the reply Sean.

I would like to see various Go-isms addressed, like:
• Proper use of Go channels (if it's even possible to properly use them) and misuse of Go channels
• The Go inheritance mess (interface{}, generics/go-generate, interfaces of interfaces (e.g. ReadWriter)), which are good patterns, which are bad patterns, and why
• Linking Go to C code, when is it acceptable?
• CGo vs "Pure" Go, which one we'll use and why
• Goroutines: when to use them, when not to use them, avoiding zombie goroutines, communication with goroutines, inline goroutines vs calling go on an existing function, short goroutines vs long goroutines
• Locking: what is the recommended lock (channels, Cond, Mutex, something else)? Really, explain the sync package, and this comment:

Many of these are unwritten but have already been called out in the commented out ToC as TODO items:

Super! That's what I would really like this RFD discussion to become. A discussion of the relative merits of various Go language features and patterns. We should be having discussions about things like how we're going to do intra-process communication and locking taking into account debugging, transfer of state, sync vs async error handling, etc.

Having said that,

I think there are a few cultural or somewhat unnecessary topics in this document. I don't know if that was intended, but they do not fit into my expectation of this document. It could be that some of these are what's generating the flame and heated discussion. A list of these:
• Monorepo
• Workflow section
• Editor integration
• gRPC

I think I've addressed why those are included for the time being but I'm not opposed to removing them either. I do think they're important to address but if everyone would sleep better at night I can remove them. I do want to understand better, and we were getting there on Friday, why we think the treatment of those subjects was either necessary or unnecessary. Stay tuned in this regard.

I do think these topics should be removed, and we should strongly consider removing topics that others have suggested removing. I hear your call to keep them included, but I don't think this is the right place to discuss these topics. I agree with @tgross that the trouble here is that we're conflating recommendations for best practices (in the code) with recommendations for process and culture. In my mind this document should focus on discussing the things that are stubbed out in the .adoc comments (that is, the code best practices). Either that, or we should rename this RFD to reflect that it proposes changing our existing processes to match those used in the Go community, and create another RFD to address Go (the language) best practices.

If we don't make a change to the scope of this document we'll be arguing about this until trumpets sound, which doesn't seem like a great path forward.

I also think that @watters brings up a good point. I can't even remember why we chose to start pursuing Go. If I could suggest a change, I would say that we replace:

"Go is efficient, scalable, and productive. Some programmers find it fun to work in; others find it unimaginative, even boring. In this article we will explain why those are not contradictory positions. Go was designed to address the problems faced in software development at Google, which led to a language that is not a breakthrough research language but is nonetheless an excellent tool for engineering large software projects."
— Rob Pike, 2012

and

Joyent has a storied history with regards to Go. Without providing modern treatment to past issues, it is undeniable that many, if not most (as of 2017) distributed systems are being written in Go. To that extent, Joyent is embracing of Go’s contribution to distributed, enterprise, production-grade computing and celebrates the software engineering ethos held by many in the Go community because their beliefs are aligned with our principles.

with some reasoning why we may be deciding to adopt Go. This should be both technical and non-technical in nature. The non-technical part could discuss if and how Joyent and the Go community's values intersect (which you may be hinting at in the last sentence there). It would also be interesting to hear what problems Go may solve for us, like Cameron pointed out.

@sean-
Contributor

sean- commented Aug 8, 2017 via email

@tgross
Contributor Author

tgross commented Aug 8, 2017

go build rebuilds everything every time.

Please review the demonstration I provided. Try it yourself.

@sean-
Contributor

sean- commented Aug 8, 2017

@tgross It doesn't save the result. The above output was expected. I'm actually not sure where .a files are used outside of go install.

When compiling a single main package, build writes
the resulting executable to an output file named after
the first source file ('go build ed.go rx.go' writes 'ed' or 'ed.exe')
or the source code directory ('go build unix/sam' writes 'sam' or 'sam.exe').
The '.exe' suffix is added when writing a Windows executable.

When compiling multiple packages or a single non-main package,
build compiles the packages but discards the resulting object,
serving only as a check that the packages can be built.

https://github.com/golang/go/blob/9e859d5e9c7c093937c79f87c4b3328383320843/src/cmd/go/internal/work/build.go#L44-L55

@tgross
Contributor Author

tgross commented Aug 8, 2017

@tgross It doesn't save the result. The above output was expected. I'm actually not sure where .a files are used outside of go install.

Ok, but that suggests that you really do need to toss pkg/ after a toolchain upgrade, which was the thing you were disputing from my original comment. Because otherwise you've got .a files on disk that are not the ones you're using to build your binary.

@jclulow
Contributor

jclulow commented Aug 11, 2017

Sean wrote:

Yes, we MUST vendor libraries for the time being. I don't imagine this to be a long-term position, but until that day, it's as good as we've got (also note I'm not condoning this practice's shortcomings, but I'm also not blind to its benefits). And to be fair and transparent, it works pretty well in practice. I'm resistant to handwaving off best practices because we don't like the aesthetics: having worked both in this system and out of it, my experience has been that this is a net win for quality, time-to-ship, and overall productivity.

This is a point on which I, and others here, strongly disagree with the RFD as written today. We are emphatically not "handwaving off" anything for "aesthetics" or any other reasons. In fact, we have repeatedly raised legitimate technical concerns with the idea of checking in copies of all of the dependencies for a project. Indeed, there exists today a solution to this problem which comes from the Go community!

We should, as a build step, resolve and fetch dependencies using dep(1). This would prevent pollution of the source history with unrelated churn, and it means we can specify dependencies which are unlocked across a reasonable semver range (e.g., accept available patch releases without needing new commits). It also means there will only be one copy of the dependency, identified by a version number or a git SHA or branch, not a divorced copy checked in to each project where there really is no guarantee that it has not been purposefully or accidentally modified on the way in or subsequently.
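For concreteness, a dep(1) manifest entry expressing that kind of unlocked-within-a-range dependency might look like the following (the package name and range are examples only):

```toml
# Gopkg.toml fragment: declare a semver range, not a vendored copy.
[[constraint]]
  name = "github.com/pkg/errors"   # example dependency
  version = "~0.8.0"               # any 0.8.x patch release is acceptable
```

Running dep ensure at build time then fetches exactly one copy into vendor/, with the resolved revision recorded in Gopkg.lock, rather than committing the dependency's tree into our history.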

It would be good if we could constrain the discussion here to concrete properties of mechanisms for the management of dependencies. Cases could easily be made for just about anything under the comparatively nebulous banners of quality, velocity, and productivity.

Yes. The make check target should call some or all of these static analysis tools. Where make check has some options disabled by default, the CI pipeline should hit all of the tests, even the expensive ones that aren't enabled locally in a project (for whatever reason).

I don't think make check should mean different things in different contexts. The set of checks run on a developer's workstation should be the exact same checks as those that are run in a CI environment, or really anywhere else. To do otherwise would be confusing. Note that in our repositories we have a make prepush target, which is generally the combination of make check and make test. The check target is for things like style and lint; the test target is generally unit tests.
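A sketch of that target split, with check behaving identically everywhere; the specific tool invocations are illustrative, not what our Makefiles necessarily contain:

```make
# check: style and lint only; must run the same set of checks on a
# workstation and in CI -- no environment-specific subsets.
check:
	gofmt -l . | (! grep .)
	go vet ./...

# test: unit tests.
test:
	go test ./...

# prepush: what a developer runs before pushing; CI runs the same thing.
prepush: check test
```

If a check is too expensive to run locally, the answer is to make it cheaper or drop it, not to let check silently mean different things in different places.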

@jclulow jclulow closed this as completed Apr 22, 2018