RFC: Software Engineering Language Policy at Posthog #71

Open
wants to merge 2 commits into
base: main
169 changes: 169 additions & 0 deletions requests-for-comments/2022-11-08-supported-languages.md
@@ -0,0 +1,169 @@
# Request for comments: Supported languages at PostHog

Since day 1 here at PostHog we've supported two and a half core languages: Python and JavaScript/TypeScript.

This set of tools has carried us a long way, and we owe both the CPython runtime and Node a huge hat tip for bringing us this far.

## Problem statement

We have an opportunity coming up to rebuild, or build greenfield, services that are critical parts of our data pipeline. It will be important for these services to be correct, fast, and efficient. Given that, now is a good time to ask: are we using the correct tools for the job?
Copy link
Member

@pauldambra pauldambra Nov 10, 2022

I'd suggest that some of the benefit of switching language comes from rewriting/extracting the service at all, and only part of it from rewriting/extracting the service in a different language.

At a previous job we spent a while rewriting from Python to C# because "C# is faster than Python". Then realised that we were CPU bound and moved a bunch of work onto the GPU. Python and C# were then faster than we needed them to be.

So: "are we using the correct tools for the job" is a great question. But being really clear on why we're asking that and how we'll know when we're done is super important. E.g. moving ingestion onto the GPU is probably more complicated and not actually faster.

So, there are two interesting questions here:

  1. what needs to be done to make ingestion safer/faster/misc
  2. how many programming languages can we support

These are really different questions.

I'm not working on ingestion so I can have opinions on 1 but "so what?"

On "how many programming languages"... The problem with a new (to you) programming language is almost never the language.

Aside... from https://neverworkintheory.org/2014/01/29/stefik-siebert-syntax.html Python and Ruby are consistently measured as easiest to learn. C-style languages are no easier to learn than a language made of random keywords.

The problem with a new (to you) language is generally the tooling. Most noticeably, in my experience: dependency management, and building and releasing things.

(go and update our android library from java 8 to java 19 if you want proof of this :))

So, we need a comparison of building, releasing, and running in k8s for the languages we consider. I think we can consider a smaller list (although C# marketing should be "Java but good", and I've a friend we could hire if we started using clojure which seems to inspire massive love from folk that use it)

The other thing is adoption within the org. Who needs to learn the language, and how and when do they do that? How do we know when to write it? Are we migrating to it or adding it alongside? Which services mustn't it be used for?

And finally: hiring. "Come work here because you're excited about language X. Incidentally the first few months you'll be working on these bugs in Python". So, we can hire from a wider pool if we support a wider set of languages, but are we in a place to hire someone who only wants to work on language X?


Bonus "after finally" point... who owns tooling for each language? Does the platform team commit to providing support for building, deploying, and running all the languages? Is that in our definition of platform? Or do we need some champions?


So why even bring this up?

Frankly:

- Python is slow.
Copy link

Python can also be blazingly fast

  1. Do we know where our python code is slow, and why? Perhaps there is a hot path we can optimize, some memory allocations we can clean up, or wasteful logic
  2. New Python asyncio runtimes can be super fast. I spent a previous job writing a lot of async Python with things like FastAPI and uvicorn, and that was in no way slow
  3. Have we considered PyPy, or similar?
  4. Worst case we could write hot paths in something like Rust, and use something like PyOxide/similar. But for our application, I highly doubt we are CPU bound + are unlikely to see any meaningful performance benefit.

Anyway, this is all a way of saying that you can write slow python and you can write fast python, and perhaps we should consider focusing our efforts on the latter before introducing new languages.
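
As a hedged illustration of the first point (finding out where the code is slow before changing languages), here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. `parse_event` and `process_batch` are hypothetical stand-ins for a hot path, not PostHog code.

```python
import cProfile
import io
import pstats


def parse_event(raw: str) -> dict:
    # Hypothetical stand-in for per-event work: split "k=v,k=v" into a dict.
    fields = {}
    for pair in raw.split(","):
        key, value = pair.split("=")
        fields[key] = value
    return fields


def process_batch(events: list[str]) -> int:
    # Hypothetical hot path: parse every event and count those with a "user" field.
    return sum(1 for raw in events if "user" in parse_event(raw))


def profile_batch(events: list[str], limit: int = 5) -> str:
    """Run process_batch under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    process_batch(events)
    profiler.disable()

    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_batch(["user=1,event=pageview"] * 10_000))
```

The report lists the functions that account for the most cumulative time, which is usually enough to tell whether a rewrite would target the real bottleneck.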

- Node is a memory hog.
- Node tooling is slow.
- Dependencies can be huge.
- No guarantees that code is correct.
Comment on lines +13 to +19
Copy link
Contributor

  • Python is slow.

True generally, but relevant only for specific use cases (like here).

  • Node is a memory hog.

Has this been a problem? There are alternative implementations of JS (e.g. bun, deno) that do this better, but are too fresh to rely on in production.

  • Node tooling is slow.

Depends on the tooling. The frontend builds quite fast (sub 3-10sec), the plugin server can be improved 10x if needed.

  • Dependencies can be huge.

We're switching to pnpm, which will fix this.

  • No guarantees that code is correct.

Can you elaborate?


Can we be successful continuing to use these languages? Of course.

- They are well understood.
- Our tooling is set up to support these languages
- We have internal expertise.
- Generally they are _good enough_.

Why should we be open to other languages?

- We have different people with experience working with different languages.
- 100% guaranteed static typing is quite beneficial
- Compiled languages provide more confidence shipped code is correct
- There are compiled languages that provide significant efficiency wins on CPU, Memory, and performance
- Using the right tool for the job is typically the right thing to do (if you can ship it)
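
To make the static typing point concrete, here is a minimal Python sketch of the kind of bug meant by "no guarantees that code is correct". The function names and the `size` payload field are hypothetical. The annotations are valid Python, but CPython does not enforce them, so the mismatch only surfaces when the line actually runs; a static checker such as mypy would report it without running anything.

```python
def bytes_to_megabytes(size_in_bytes: int) -> float:
    # The annotations document intent, but CPython does not enforce them.
    return size_in_bytes / 1_000_000


def read_size_from_payload(payload: dict) -> str:
    # Hypothetical payload field; values parsed from text often arrive as strings.
    return payload["size"]


def report(payload: dict) -> float:
    # A static checker flags passing a str where an int is expected.
    # At runtime this raises TypeError only when this line is executed.
    return bytes_to_megabytes(read_size_from_payload(payload))
```

In a compiled, statically typed language the equivalent of `report` would not build, which is the confidence the list above is pointing at.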

Candidate Languages:

- [Golang](https://go.dev/) (big surprise)
- [Rust](https://www.rust-lang.org/) 🦀
- [OCaml](https://ocaml.org/)
- [Elixir](https://elixir-lang.org/) (dynamically typed)
- [Scala](https://www.scala-lang.org/)
- [Java](<https://en.wikipedia.org/wiki/Java_(programming_language)>)

## Meet the eligible languages

### Golang
Copy link

Tbh I would not be against us introducing Go for a new service, so long as we're sure that Python does not and cannot suit our use case


The good:

- Extremely easy to learn. Most engineers can learn and be productive in about a day.
- Built to remove most contentious parts of development
- Designed to make engineering easier
- Built for network services and concurrency
- Great standard lib
- Super fast compile times
- Light on memory
- Used by plenty of organizations big and small
- We have plenty of Gophers here.
- A simple binary as a deliverable.

The bad:

- Most people would not say the language sparks joy (I disagree)
- Not terribly expressive
- Verbose, but easy to read

### Rust
Copy link

@ellie ellie Nov 10, 2022

I think I would not be wrong in saying I have the most Rust experience here. I LOVE Rust. and I'd love to write it at work. I also have several friends who are fantastic engineers + would consider applying here purely because of Rust.

But tbh I'd be super against us adopting it. The learning curve is huge (prepare to not be comparatively productive for around a quarter). Unless it turns out there's actually more people writing Rust for fun here than I thought 😄

While maturing, the ecosystem is nothing like Python. We'd be spending a lot more time in the weeds. There are fewer "obvious" choices than other languages. EG, which async runtime should we use (if any...)? Why? Do we know enough about how our application does/should work in production to even start making such a choice?

Generally organizations choose Rust because

  1. They have explicitly decided that they will be a "Rust company", from the beginning (eg see TrueLayer)
  2. They have eked out every last drop of performance from their current stack, and need to go yet further (eg see Discord)


The good:

- Very hot right now
- Very good for when you need to hyper optimize something but would like to avoid C or C++
- Extraordinarily expressive and fun to program
- Extremely performant and light on memory
- Used by plenty of organizations big and small
- We have a few Crustaceans here!
- A simple binary as a deliverable.

The bad:

- Slow compile times
- Harder to read because of how expressive it is
- Slower ramp-up time to become proficient as an engineer

### OCaml

> OCaml is a general-purpose, industrial-strength programming language with an emphasis on expressiveness and safety.

You may not be familiar with this one, but it is _very_ safe and performant. Companies that have huge amounts of money on the line and are risk averse to defects and slowdowns, like the HFT firm Jane Street, use OCaml because it is unforgiving in how type safe it is.

> OCaml’s powerful type system means more bugs are caught at compile time, and large, complex codebases are easier to maintain. This makes it a good language for running critical code. At the same time, sophisticated inference makes the type system unobtrusive, creating a smooth developer experience.

The good:

- Garbage collected
- Algebraic data types
- Pattern matching
- Type inference
- Immutable by default
- Static Type-checking
- First-class functions
- Parametric polymorphism
- Used by serious programming shops

The bad:

- Not terribly popular relative to the others
- Can be considered relatively academic
- No one experienced here

### Elixir

Oh, Erlang <3

The good:

- Built for streaming data
- The cockroach of runtimes.
- Automatic function level clustering
- Hot reloading of functions
- Compiled
- Relatively functional

The bad:

- Dynamically typed
- Runs in a VM

### Scala / Java

Grouping these together because they really are converging. Really this is any language on the JVM.

The JVM is truly a wonderful runtime. It's very fast. The hotspot detection + Just In Time Compiling is _super_ impressive.
Copy link
Member

I saw a talk by the folk at a German car sales company who were shipping an ML model in Java. They had to write tooling to hit all of the code paths they cared about during boot because JIT was a problem for them.

That said there are a bunch of alternative VMs that are focussed on speed because of serverless-style loads...


The good:

- Surprisingly performant
- JVM library interop means huge availability of libraries out there
- Tons of expertise and companies big and small are building software in this
- Kafka and Zookeeper are built on this

The bad:

- Considered somewhat crusty
Copy link
Member

Strong opinions weakly held time. Anyone writing java should be writing kotlin.

The dependency management and build tooling in java is horrible: arcane, manual, and hard to debug.

- Slow boot times
- JVM tuning is no fun

## Discussion

So with this overview laid out let's talk about our first scenario where we need to decide what language we should use:

[RFC: Inserter Service Requirements](https://github.com/PostHog/meta/pull/68)

The TL;DR requirements here are:

- Consume from Kafka
- Insert into ClickHouse
- Deserialize some portion of payload (serialization TBD) to determine where to insert
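
The three requirements can be sketched in a few lines of Python to show how small the core loop is. This is a sketch under stated assumptions, not the design from the linked RFC: the JSON payload, the `kind` routing field, the table names, and the `insert` callback are all hypothetical, and plain iterables stand in for a Kafka consumer and a ClickHouse client.

```python
import json
from collections import defaultdict
from typing import Callable, Iterable

# Hypothetical mapping from a routing field value to a ClickHouse table name.
TABLE_FOR_KIND = {"event": "events", "person": "persons"}
DEAD_LETTER_TABLE = "dead_letter"


def route(raw: bytes) -> str:
    """Inspect only the routing field of the payload and pick a destination table."""
    # Assumption: JSON payloads. JSON forces a full parse; a real format
    # (serialization is TBD) might allow reading just a header.
    try:
        payload = json.loads(raw)
    except ValueError:
        return DEAD_LETTER_TABLE
    kind = payload.get("kind") if isinstance(payload, dict) else None
    if not isinstance(kind, str):
        return DEAD_LETTER_TABLE
    return TABLE_FOR_KIND.get(kind, DEAD_LETTER_TABLE)


def run_inserter(
    messages: Iterable[bytes],
    insert: Callable[[str, list[bytes]], None],
    batch_size: int = 1000,
) -> None:
    """Group messages by destination table and flush each group in batches."""
    batches: dict[str, list[bytes]] = defaultdict(list)
    for raw in messages:
        table = route(raw)
        batches[table].append(raw)
        if len(batches[table]) >= batch_size:
            insert(table, batches.pop(table))
    # Flush whatever is left once the input is exhausted.
    for table, rows in batches.items():
        if rows:
            insert(table, rows)
```

Passing `insert` in as a callback keeps the routing and batching logic testable without a broker or database, whichever language the real service ends up in.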

It's a simple service and could be written in anything, really. It does give us an opportunity to branch off our well-traveled path of Python and TypeScript. In my opinion, Golang or Rust would be a great fit here, as would the other languages listed. I'm a Gopher in particular, so I would really like to see more written here.

## Success criteria

_How do we know if this is successful (e.g. metrics, customer feedback), what's out of scope, what makes this ambitious?_

The goal here is to spark debate about the languages we should use here at PostHog. What are we open to? Why should we not adopt new technologies?

Success here would be to make a decision, enact a policy on it, and have engineers aligned so we don't have to worry about this again (at least for some time).
Copy link
Member

As above... one decision only?

  1. what needs to be done to make ingestion safer/faster/misc
  2. how many programming languages can we support
  3. how do we support any one programming language