In progress tutorial series demonstrating the progression of computer science techniques, from raw binary to a custom assembler, to a custom compiler

cj-dimaggio/bits-to-compiler

Bits to Compiler

This is the rough code implementation for a hoped-for future series of articles documenting the progression of abstractions in software development and the basics of systems programming. The idea is to document how one would write a simple bootloader using nothing but raw 1s and 0s (describing x86 ModR/M instruction encoding), then progress to a simple two-pass assembler that makes our lives easier with mnemonics, labels, and simple arithmetic, and finally iterate on this design until we have a compiler for a toy C-like language.
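As a rough illustration of the ModR/M encoding mentioned above, here is a minimal sketch (not code from this repo) that packs the three ModR/M fields into a byte and uses it to encode a register-to-register `mov`. The function names `modrm` and `mov_reg_reg` and the register constants are hypothetical and chosen only for this example; the opcode 0x89 (MOV r/m16, r16) and the field layout come from the x86 instruction format.

```rust
// Register numbers as used in the ModR/M reg and r/m fields.
// (Constant names are assumptions for this sketch.)
const AX: u8 = 0;
const CX: u8 = 1;
const DX: u8 = 2;
const BX: u8 = 3;

/// Builds a ModR/M byte from its three fields:
/// mod (top 2 bits), reg (middle 3 bits), r/m (low 3 bits).
fn modrm(mode: u8, reg: u8, rm: u8) -> u8 {
    ((mode & 0b11) << 6) | ((reg & 0b111) << 3) | (rm & 0b111)
}

/// Encodes `mov dst, src` between two 16-bit registers.
/// Opcode 0x89 takes the source in `reg` and the destination in `r/m`;
/// mod = 0b11 means the r/m field names a register rather than memory.
fn mov_reg_reg(dst: u8, src: u8) -> [u8; 2] {
    [0x89, modrm(0b11, src, dst)]
}
```

Under these assumptions, `mov_reg_reg(AX, BX)` yields the bytes `89 D8`, which in the bit file would be written out as `10001001 11011000`.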

The repo is currently a hodgepodge of rough tests and examples strewn throughout the git history. Work still needs to be done to clean up the codebase and write the accompanying articles, but the code itself more or less works.

Bits

To simulate the error-prone tedium of setting bits manually with something like a front panel, this chapter focuses on writing a simple program in Rust that reduces a text file of ASCII 1s and 0s into a raw binary which, when run on an x86 machine, prints the words "Hello, World!" to the terminal via BIOS calls.
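The core of such a program can be sketched as follows. This is a minimal illustration rather than the repo's actual implementation: the function name `bits_to_bytes`, the most-significant-bit-first ordering, and the `#` comment syntax are all assumptions made for the example.

```rust
/// Packs ASCII '0'/'1' characters into bytes, most significant bit first.
/// Whitespace is ignored, and everything after a `#` on a line is treated
/// as a comment (a hypothetical syntax assumed for this sketch).
fn bits_to_bytes(source: &str) -> Result<Vec<u8>, String> {
    let mut bytes = Vec::new();
    let mut current: u8 = 0; // bits collected so far for the next byte
    let mut count = 0; // how many bits `current` holds

    for (line_no, line) in source.lines().enumerate() {
        // Drop the comment portion of the line, if any.
        let code = line.split('#').next().unwrap_or("");
        for ch in code.chars() {
            let bit = match ch {
                '0' => 0,
                '1' => 1,
                c if c.is_whitespace() => continue,
                other => {
                    return Err(format!(
                        "line {}: unexpected character {:?}",
                        line_no + 1,
                        other
                    ))
                }
            };
            current = (current << 1) | bit;
            count += 1;
            if count == 8 {
                bytes.push(current);
                current = 0;
                count = 0;
            }
        }
    }

    if count != 0 {
        return Err(format!("{} trailing bit(s) do not fill a whole byte", count));
    }
    Ok(bytes)
}
```

For instance, the input `1011 0100 0000 1110` would produce the bytes `B4 0E`, which an x86 machine decodes as `mov ah, 0x0e` (the function number for the BIOS teletype output call). The resulting `Vec<u8>` could then be written to disk with `std::fs::write`.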

This state of the project can be found at: https://github.com/cj-dimaggio/bits-to-compiler/tree/1d18cb065a0d56b6e0cf658155243b10c3a40c68

The compiler can be built and run by changing into the compiler subdirectory and running:

$ cargo run ../examples/example-loop.bit

The build artifact can then be tested in QEMU with:

$ qemu-system-x86_64 -fda ../examples/example-loop.bin

(The produced binary can also be debugged by examining its disassembly: objdump -b binary -mi386 -Maddr16,data16 -D examples/example-loop.bin)

Assembler

The finished assembler code can be found at: https://github.com/cj-dimaggio/bits-to-compiler/tree/28ef57c5f0b0252520be87fbd1ff300ab56ead88

Here we have implemented a simple assembler that supports a handful of instructions (really only the ones we need for our toy example), some directives, label references, and simple arithmetic via a shunting-yard implementation.
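To give a feel for how shunting-yard arithmetic works, here is a minimal, self-contained sketch. It is not the assembler's actual code: the `Token` type and the function names are hypothetical, and it only handles non-negative decimal integers, the four basic operators, and parentheses. Infix tokens are reordered into reverse Polish notation using an operator stack, and the result is then evaluated with a value stack.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token { Num(i64), Op(char), LParen, RParen }

/// Splits an expression into tokens (decimal numbers only in this sketch).
fn tokenize(expr: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = expr.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            let mut value: i64 = 0;
            while let Some(d) = chars.peek().and_then(|ch| ch.to_digit(10)) {
                value = value * 10 + d as i64;
                chars.next();
            }
            tokens.push(Token::Num(value));
        } else {
            tokens.push(match c {
                '+' | '-' | '*' | '/' => Token::Op(c),
                '(' => Token::LParen,
                ')' => Token::RParen,
                other => return Err(format!("unexpected character {:?}", other)),
            });
            chars.next();
        }
    }
    Ok(tokens)
}

fn precedence(op: char) -> u8 {
    if op == '+' || op == '-' { 1 } else { 2 }
}

/// The shunting-yard step: converts infix tokens to reverse Polish notation.
fn to_rpn(tokens: &[Token]) -> Result<Vec<Token>, String> {
    let mut output = Vec::new();
    let mut stack: Vec<Token> = Vec::new();
    for &token in tokens {
        match token {
            Token::Num(_) => output.push(token),
            Token::Op(op) => {
                // All operators are left-associative, so pop anything on the
                // stack with higher or equal precedence before pushing.
                while let Some(&Token::Op(top)) = stack.last() {
                    if precedence(top) < precedence(op) { break; }
                    output.push(Token::Op(top));
                    stack.pop();
                }
                stack.push(token);
            }
            Token::LParen => stack.push(token),
            Token::RParen => loop {
                match stack.pop() {
                    Some(Token::LParen) => break,
                    Some(t) => output.push(t),
                    None => return Err("unbalanced ')'".to_string()),
                }
            },
        }
    }
    while let Some(t) = stack.pop() {
        if t == Token::LParen { return Err("unbalanced '('".to_string()); }
        output.push(t);
    }
    Ok(output)
}

/// Evaluates an RPN token sequence with a value stack.
fn eval_rpn(rpn: &[Token]) -> Result<i64, String> {
    let mut stack: Vec<i64> = Vec::new();
    for token in rpn {
        match *token {
            Token::Num(n) => stack.push(n),
            Token::Op(op) => {
                // The first value popped is the right-hand operand.
                let (rhs, lhs) = match (stack.pop(), stack.pop()) {
                    (Some(r), Some(l)) => (r, l),
                    _ => return Err("operator is missing an operand".to_string()),
                };
                stack.push(match op {
                    '+' => lhs + rhs,
                    '-' => lhs - rhs,
                    '*' => lhs * rhs,
                    '/' if rhs != 0 => lhs / rhs,
                    _ => return Err("division by zero".to_string()),
                });
            }
            _ => return Err("parenthesis in RPN".to_string()),
        }
    }
    match stack.as_slice() {
        [value] => Ok(*value),
        _ => Err("malformed expression".to_string()),
    }
}

fn evaluate(expr: &str) -> Result<i64, String> {
    eval_rpn(&to_rpn(&tokenize(expr)?)?)
}
```

In an assembler, an expression like `510 - (2 + 8)` in an operand or directive could be passed through something like `evaluate` once any label references have been replaced by their addresses, which is one reason a second pass is useful.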

Our testing file is now: loop-2.bit

which can be compiled the same way from the compiler directory with:

$ cargo run ../examples/loop-2.bit

And then run in QEMU with:

$ qemu-system-x86_64 -fda ../examples/loop-2.bin

Compiler

We iterate on our assembler until we have a working compiler for a C-like language, which is first completed at commit: https://github.com/cj-dimaggio/bits-to-compiler/tree/d507c64990780ac840afaf645cb8b9ae93cb263c

For code simplicity, our compiler currently no longer emits actual binary; instead it transpiles to assembly (whose inner workings we should now be quite familiar with).

The testing file can be found at: https://github.com/cj-dimaggio/bits-to-compiler/blob/d507c64990780ac840afaf645cb8b9ae93cb263c/examples/c-like.bit

When compiled with cargo run ../examples/c-like.bit this outputs an assembly artifact: https://github.com/cj-dimaggio/bits-to-compiler/blob/d507c64990780ac840afaf645cb8b9ae93cb263c/examples/c-like.asm

Which, when assembled with NASM:

$ nasm -f bin examples/c-like.asm -o examples/c-like.bin

Can then be run with:

$ qemu-system-x86_64 -fda examples/c-like.bin
