
Slogans, Fallacies, and Concepts

andychu edited this page Apr 10, 2024 · 79 revisions

UNDER CONSTRUCTION

Related: Patterns and Anti-Patterns

Slogans to Explain the Oils Project

  • Oils is our upgrade path from bash to a better language and runtime.
  • OSH is the most bash-compatible shell, by a mile.
  • YSH is for Python and JavaScript users who avoid shell. (the clean slate view)
    • Shell should be more like the dynamic languages that won (Python, JavaScript, Ruby, PHP, etc.)
  • Oils is a Small Tool That Unifies Shell, Python, Regex, JSON, and YAML
    • (the project expanded beyond shell -- data has to be fixed too)
  • Oils adds the missing declarative part to shell (via Hay, Ruby-like blocks)
  • The shitty language rehabilitation project: Oils lets you make your "devops sludge" nicer, incrementally. Old sludge: sh, make, awk, m4. New cloud sludge: YAML, JSON, Go templates, etc.
  • Concept: Tables, Records, and Documents: TSV8, JSON8, and HTML
  • If we do a good job, everyone will be equally unhappy

NOT done yet (we need escaping functions):

Interactive Shell

Regarding the headless shell (see Headless Mode):

  • The shell UI should have a terminal, it shouldn't be a terminal
  • The shell is a language-oriented interface, but it doesn't have to be a terminal-oriented one

Slogans to Explain Shell

  • Shell is about ad hoc, coarse-grained reuse. It's for gluing together things that weren't meant to be glued together. Oils lets you do this correctly.
  • Shell is the language of heterogeneity and diversity (a mix of languages, evaluation models, devices, operating systems, networks, performance characteristics, etc.)
    • but without cacophony! (awk / make / YAML are cacophony)
  • Shell is the language of process-based concurrency. It can use all your cores.

  • The Worst Amounts of Shell are 0% and 100%

    • Three Things That Are Worse Than Shell Scripts - Shell in wiki pages, Shell in package.json, shell that's only on other people's computers (YAML)
  • Unix Shell Should Be the Center of User-Centric Computing. Shell is a language-oriented interface that does what you say! It doesn't "nudge" you in certain directions with dark patterns.

  • It's for describing the architecture of distributed systems: processes and ports.

  • Shell is where software starts (because it's the language of user space; it bootstraps the rest of the system; all language package managers use it)

  • Unix Is Equally Inconvenient For Everyone, and That's a Feature -- https://old.reddit.com/r/ProgrammingLanguages/comments/s7shox/why_static_languages_suffer_from_complexity/htep2dm/
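
To make the "process-based concurrency" slogan above concrete, here is a minimal sketch in Python rather than shell. The function name `run_in_parallel` and the toy workload are assumptions made for illustration. Each command becomes its own OS process, and the kernel is free to schedule those processes on different cores.

```python
import subprocess
import sys

def run_in_parallel(commands):
    """Start every command as its own OS process, then collect the results."""
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for cmd in commands
    ]
    # By this point all children are running at the same time.
    # communicate() reads each child's stdout and waits for it to exit.
    return [(p.communicate()[0], p.returncode) for p in procs]

# Hypothetical workload: four independent Python processes, each doing a sum.
commands = [
    [sys.executable, "-c", f"print(sum(range({n})))"]
    for n in (10, 100, 1000, 10000)
]
for output, status in run_in_parallel(commands):
    print(status, output.strip())
```

This is roughly what `cmd1 & cmd2 & wait` does in a shell script: no threads, no shared memory, just processes.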

Concepts

  • Shell is a "meta-language" -- in a few senses. Generating source code in other languages. Building source code. Measuring the output of running programs.

Interior Fictions vs. Exterior Reality

or Text vs. Types

  • When Models and Reality Collide, Reality Wins https://lobste.rs/s/9rrxbh/on_types#c_qanywm
  • Protocols and Interchange Formats Over In-Memory Data!
    • JSON/TSV/HTML over the network / on disk are primary; in-memory representations (struct/classes) are secondary. Most programmers think the opposite, but that's a bias caused by the work they do.
    • HTML can be processed with DOM or SAX, etc.
    • ditto for JSON and jq -- it can cut things down to size first
    • or grep on QTT
    • Lazy deserialization, doctools/oil_doc.py, etc.
  • The OS process is the only software abstraction with hardware support (the MMU)
  • Re-serializing and Re-parsing are idiomatic in shell and Oil (e.g. $0 dispatch pattern and Hay SHELL nodes)
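
The "cut things down to size first" idea in the list above can be sketched as follows. This is an illustration only: the function name `matching_records` and the sample log lines are made up. The point is that the serialized text is the primary representation, so a cheap substring test can discard most lines before anything is deserialized.

```python
import json

def matching_records(lines, needle):
    """Yield parsed records, but only for lines whose text contains `needle`."""
    for line in lines:
        if needle not in line:
            # Cheap test on the serialized form; this line is never parsed.
            continue
        yield json.loads(line)

# Hypothetical line-oriented JSON log.
log = [
    '{"user": "alice", "status": 200}',
    '{"user": "bob", "status": 500}',
    '{"user": "carol", "status": 200}',
]
errors = list(matching_records(log, '"status": 500'))
print(errors)
```

The substring test is only a prefilter, like `grep` in front of `jq`. It can let through lines where the needle appears in some other field, so a careful caller still checks the parsed record.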

Data-Centric Software Architecture

  • Data over Code; Protocols over Services. The Kubernetes diagram talks too much about services and not enough about protocols!
    • this is the rule of representation applied to distributed systems.
    • data over code; stateless over stateful; explicit state over implicit state (Urbit)

More

  • Shell is now a good language
    • Unix without the 70's style macro processing
  • A New Jersey Design with an MIT Implementation Style (Oil is a mix of practical and principled)
  • Unix Programmers And Woodworkers Both Make Their Own Tools
  • New Unix Sludge Is Worse Than Old Sludge https://www.oilshell.org/blog/2021/07/blog-backlog-1.html#slogans
  • Kubernetes is Our Generation's Multics
  • The Stuff That Happens Between fork() and exec() is Very Important
    • It's where pipelines happen
    • It's where containers happen (and traditional dropping of privileges)
    • Contrast with: Windows, and Virtual Machines. They're TOO sandboxed. You want a mix of isolation and sharing.
    • TODO: Link fork() in the road paper.
  • Cloud Platforms like GitHub Actions are capable, but the languages are weak
  • Shell-Centric Shell Programming (see xargs post for examples)
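
As an illustration of the fork() and exec() slogan above, here is a minimal sketch of a two-stage pipeline built by hand in Python on a Unix system. The function name `pipeline_output` is an assumption for this example, and this is not how Oils itself is implemented. The interesting part is the code that runs in each child after fork() and before exec(): that is where the file descriptors are rewired.

```python
import os

def pipeline_output(producer, consumer):
    """Run `producer | consumer` and return the consumer's stdout as bytes."""
    link_r, link_w = os.pipe()  # connects producer stdout to consumer stdin
    out_r, out_w = os.pipe()    # carries consumer stdout back to the parent

    pid1 = os.fork()
    if pid1 == 0:
        # Child 1, between fork() and exec(): point stdout at the pipe.
        try:
            os.dup2(link_w, 1)
            for fd in (link_r, link_w, out_r, out_w):
                os.close(fd)
            os.execvp(producer[0], producer)
        finally:
            os._exit(127)  # only reached if the setup or exec failed

    pid2 = os.fork()
    if pid2 == 0:
        # Child 2, between fork() and exec(): rewire stdin and stdout.
        try:
            os.dup2(link_r, 0)
            os.dup2(out_w, 1)
            for fd in (link_r, link_w, out_r, out_w):
                os.close(fd)
            os.execvp(consumer[0], consumer)
        finally:
            os._exit(127)

    # Parent: close its copies of the write ends so the readers see EOF.
    for fd in (link_r, link_w, out_w):
        os.close(fd)
    with os.fdopen(out_r, "rb") as f:
        data = f.read()
    os.waitpid(pid1, 0)
    os.waitpid(pid2, 0)
    return data

if __name__ == "__main__":
    print(pipeline_output(["echo", "hello"], ["tr", "a-z", "A-Z"]))
```

Dropping privileges or entering a namespace would happen in the same window, in place of (or alongside) the `dup2` calls.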

Fallacies

  • Shell is a bad language. It used to be, but Oil made it good! :)
  • The "Python replaces shell" fallacy. It's not either-or. The Unix philosophy is for shell to invoke tools written in Python.
  • PHP straight-line code fallacy. However we often want this! We want people to have simple ways of reasoning.
  • Fine-grained types are always better (from a code perspective, but not from a systems/architecture perspective)
  • Processes are slow fallacy
  • Strings are slow fallacy (examples: grep, etc. I think QTT has advantages over graphs of objects in memory too)
  • You can't extend your type system across the network
    • distributed upgrade problem
    • language heterogeneity problem (and "old code" problem)
    • Google felt these scaling limits. protobufs worked well, to a point
  • Fallacy: Operating Systems Should Have Records (this can be layered on top in the shell layer!)
  • X as Narrow Waist Fallacy: https://lobste.rs/s/sdum3p/if_you_could_rewrite_anything_from#c_yijvsx
    • some people want ND-JSON, other people want space-separated columns, Elvish wants JSON-like values, nushell wants tables, etc.
    • the real answer is that byte streams are still the lowest common denominator; documents, objects, and tables are usefully represented as byte streams (HTML/XML, JSON, CSV/TSV/QTT)
    • It's very difficult to move narrow waists. It can happen, but wishing doesn't make it happen. Building does make it happen, but the builders of Elvish/nushell/Rash disagree.

Related to empirical software engineering: The Soviet-Harvard Delusion. In contrast, Unix is absorbed through apprenticeship. Neal Stephenson says that Unix is our Gilgamesh epic.

Concepts

  • String Safety aka String Hygiene: proper escaping. (sort of an analogy to memory safety)
  • Process Based Concurrency
  • Policy vs. Mechanism; Control Plane vs. Data Plane.
    • Shell is about policies and the control plane.
  • REPL as a debugger for distributed state (since conventional debuggers don't work)
  • Code- vs. Data- Centric. Unix is Data-Centric. Windows, iOS, Android, and the cloud are code-centric.
  • Data Over Code Principle or Rule of Representation
  • Situated Language (via Rich Hickey and Clojure). Many programs are entangled with "the world", and entangled with real data. The model has to be discovered, and changes over time. Shell scripts often have this flavor, and Oil is a good language for such programs.
  • distributed shell script -- a shell script that runs on multiple computers (see Distributed Shell)
  • parasitic shell script -- a shell script that reuses entire proprietary cloud platforms! i.e. Travis CI, or AWS Lambda, etc.
    • you keep the control plane, but the data plane is outsourced
  • Allopoiesis vs. Autopoiesis, from Richard Gabriel's Design Beyond Human Abilities (PDF)
  • Models vs. Reality: The Map is Not the Territory (regarding types and runtime behavior)
  • Serializations/Concretions vs. Data Structures/Abstractions (Parochial Types, as described by Hickey)

The structure of a system made this way would be of the allopoietic part of it -- the part that does what the designers intended, such as banking or web serving -- embedded in the autopoietic part, which would be responsible, in a sense, for keeping the overall system alive.

In a distributed system, "shell scripts" are the autopoietic part.
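
To illustrate the String Safety concept from the list above, here is a minimal sketch using Python's standard `shlex.quote`. The function name `shell_command` and the hostile filename are made up for the example. An untrusted string is escaped before it is spliced into shell code, so it stays a single argument instead of becoming more code.

```python
import shlex

def shell_command(filename):
    """Build a shell command string with the untrusted filename escaped."""
    return "ls -l -- " + shlex.quote(filename)

hostile = "notes.txt; rm -rf ~"
print(shell_command(hostile))
# prints: ls -l -- 'notes.txt; rm -rf ~'
```

Without the quoting step, the `;` would end the `ls` command and start a second one. This is the analogy to memory safety: data must not be able to escape into the code around it.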

Software Architecture Principles

  • Perlis-Thompson Scaling Principle
    • Shell, Unix kernel, Plan 9, REST, etc.
  • The Perlis-Thompson Prophecy -- the cloud will converge on a few single, unifying concepts for a distributed OS. (It won't be Kubernetes; it's not expressive enough or simple enough to be a narrow waist.)
  • Principle of Least Power (from Tim Berners-Lee)
    • this has to do with remote code. Division between HTML, JavaScript, and CSS.
    • use eggex where possible
  • Extensibility Principle (TODO: need a name for this)
    • Browser features should be explainable in JavaScript: Extensible Web Manifesto
    • SQL features should be explainable in SQL: Against SQL. A good example is Flink windowing vs. SQL windowing. SQL windowing is hard-coded and even requires custom syntax (somewhat like shell's string operators!)
  • End-to-End Principle
  • Framework vs Library: Who holds main()?
    • systemd and Kubernetes hold main(), and there is a strong rationale for that. But there's value to holding main() in shell (doing things that can't be done in a framework)
  • Typed-API Style vs. Unix Text Style. The former is what Multics did. It's arguably easier to use. But the latter scales and evolves better (over decades!)
  • Service Topology Abstraction (or maybe a less intimidating term). The idea that the topology is abstract and can be mapped to physical machines. We want to run services locally. We don't hard code ports in source code; we wire services together in shell scripts.
  • The Graceful Upgrade / Thin Layer Principle
    • The web is a graceful upgrade over Unix and SGML. It augments the file system hierarchy with URLs and hyperlinks.
    • Oil is a graceful upgrade of the Thompson -> Bourne -> Korn -> bash line of shells, as each of those was of its predecessor. (Whether you want to call it evolution or design is up for debate. Oil emphasizes design)
  • Processes and Files Are Better Than The Wrong Abstraction. Software complexity is related to superfluous abstractions that don't compose (e.g. Docker). Restoring the simplicity of "code and data" to distributed systems.
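
The Service Topology Abstraction item above can be sketched like this. All names here (`serve_once`, `fetch`, `wire_up`) are hypothetical, and in practice the wiring would live in a shell script rather than Python. The service code never mentions a port number; a separate wiring layer chooses the address and hands it to both sides.

```python
import socket
import threading

def serve_once(listener, reply):
    """Service code: answer one connection on whatever socket it was given."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(reply)

def fetch(host, port):
    """Client code: the address is a parameter, not a constant in the source."""
    with socket.create_connection((host, port)) as s:
        chunks = []
        while chunk := s.recv(4096):
            chunks.append(chunk)
        return b"".join(chunks)

def wire_up():
    """Wiring layer: decides the topology and connects the two pieces."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    listener.listen(1)
    host, port = listener.getsockname()
    server = threading.Thread(target=serve_once, args=(listener, b"hello\n"))
    server.start()
    try:
        return fetch(host, port)
    finally:
        server.join()
        listener.close()

print(wire_up())
```

Because only `wire_up` knows the address, the same service and client code can run locally on a loopback port or be mapped onto separate machines.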

Systems Thinking vs. Code Thinking

  • Fallacies
  • Things vs. The Relationship between Things.
    • code vs. data
    • protocols vs. services/processes

Implementation Concepts

  • Lossless Syntax Tree
  • Lexer Modes (for principled parsing)
  • Lexing as non-recursive and parsing as recursive
  • Coprocesses
    • stateful: headless mode
    • stateless: capers (not implemented yet)
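
The Lexer Modes item above can be illustrated with a toy lexer. This is a sketch under simplifying assumptions: the mode and token names are invented, and the lexer flips its own mode on a quote character, whereas in a real design the parser would typically tell the lexer which mode to use. Each mode has its own regular expression, and the lexer itself is a plain loop with no recursion.

```python
import re

# One regex per mode. Mode and token names are hypothetical.
MODES = {
    "outer": re.compile(r'(?P<SPACE>\s+)|(?P<DQUOTE>")|(?P<WORD>[^\s"]+)'),
    "dquoted": re.compile(
        r'(?P<VAR>\$[A-Za-z_]\w*)|(?P<DQUOTE>")|(?P<LIT>[^"$]+|\$)'
    ),
}

def lex(text):
    """Return (kind, text) tokens, switching modes at double quotes."""
    mode = "outer"
    pos = 0
    tokens = []
    while pos < len(text):
        m = MODES[mode].match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        kind = m.lastgroup
        if kind != "SPACE":  # whitespace only separates words in outer mode
            tokens.append((kind, m.group()))
        if kind == "DQUOTE":
            mode = "dquoted" if mode == "outer" else "outer"
        pos = m.end()
    return tokens

for token in lex('echo "hi $name" $x'):
    print(token)
```

The same characters mean different things depending on the mode: the space after `hi` is part of a literal inside the quotes, while spaces outside the quotes only separate words. A recursive parser can then consume this flat token stream.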

Concepts That Need a Name

  • Hacker News Quote: "My problem with k8s, is that you learn OS concepts, and then k8s / docker shits all over them." This is a special case of the Perlis-Thompson principle. The new abstractions should "reduce" in some way to the old ones. There should be a "principle of extension" or something.

Analogies

  • Oil vs. Metal (shell as essential lubricant)
  • Plants vs. Animals
  • Fascia vs big organs (heart, lungs)
  • Autopoietic vs Allopoietic (Richard Gabriel)