An attempt at making a functional language heavily inspired by Elixir, but with monads.
Faeyne stands for: "Functions are everything you need EXCEPTION"
The core idea is that piecewise functions are all we need to achieve basically everything. Type exceptions are also present to make things more ergonomic.
The IO monad (and others) is implemented as follows:
def main(system) {
    input = system(:input)
    x = pure_func(input)
    io_func(x,system(:printer))
}
This allows you to pass a different function. For instance, if you want to suppress printing for "io_func" you can do
io_func(x,fn(str) -> {})
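The same idea can be sketched in Rust, the language the VM is written in. This is only an analogue, not Faeyne code: `io_func` and `pure_func` mirror the names above, while `run_captured` and `run_silent` are hypothetical helpers added for illustration. The point is that the effect is an ordinary argument, so the caller decides what "printing" means.

```rust
// Hypothetical Rust analogue of `io_func(x, system(:printer))`.
// The printer is passed in, so the function itself performs no hidden IO.
fn io_func(x: &str, printer: &mut dyn FnMut(&str)) {
    printer(&format!("result: {x}"));
}

// Stand-in for `pure_func`: no side effects, output depends only on input.
fn pure_func(input: &str) -> String {
    input.to_uppercase()
}

// Runs the pipeline with a printer that records lines instead of writing them.
fn run_captured(input: &str) -> Vec<String> {
    let mut log = Vec::new();
    let x = pure_func(input);
    io_func(&x, &mut |s: &str| log.push(s.to_string()));
    log
}

// Runs the same pipeline with printing suppressed, like `fn(str) -> {}`.
fn run_silent(input: &str) {
    let x = pure_func(input);
    io_func(&x, &mut |_s: &str| {});
}
```

Swapping the printer is also handy for tests, since the captured lines can be compared directly.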
Since we don't have arrays, pattern matches will have to do. You can use lambda functions as a placeholder, and there is special syntax for it:
arr = match fn {0=>a,1=>b,2=>c};
In the future we will add pattern matching that lets you know which patterns a "match fn" or "def" function accepts as inputs. This could potentially let you check the length of an array by inspecting the pattern match. You can also pass the length explicitly with an atom like so
arr = match fn {0=>a,:len => 1};
and then update like so
def append(arr,x){
    len = arr(:len);
    new_len = len+1;
    fn (k) -> {
        match k {
            :len => new_len,
            len => x,
            _ => arr(k)
        }
    }
}
in the future with new pattern matching this could look like so
def append(arr,x){
    len = arr(:len);
    new_len = len+1;
    match fn {
        :len => new_len,
        len => x,
        (k) | k<len => arr(k)
    }
}
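To make the "array as a function" idea concrete, here is a small Rust sketch of the same structure. Everything in it is an assumption made for illustration (the `Key` enum standing in for atoms and integer keys, the `Arr` alias, `empty`); it is not how the VM represents functions. The new element is stored at the old length, so indices stay 0-based.

```rust
use std::sync::Arc;

#[derive(Clone, Copy, PartialEq, Debug)]
enum Key {
    Len,        // stands in for the atom `:len`
    Index(i64), // stands in for an integer key
}

// An "array" is just a function from keys to values.
// None means that no pattern matched.
type Arr = Arc<dyn Fn(Key) -> Option<i64>>;

fn empty() -> Arr {
    Arc::new(|k: Key| -> Option<i64> {
        match k {
            Key::Len => Some(0),
            Key::Index(_) => None,
        }
    })
}

// Wraps the old array in a new function that answers the new keys itself
// and forwards everything else to the old array.
fn append(arr: Arr, x: i64) -> Arr {
    let len = (*arr)(Key::Len).expect("array must answer :len");
    Arc::new(move |k: Key| -> Option<i64> {
        match k {
            Key::Len => Some(len + 1),
            Key::Index(i) if i == len => Some(x),
            other => (*arr)(other),
        }
    })
}
```

Note that each append adds one layer of indirection, so a lookup of an old element walks through every newer layer first.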
I am hoping that the JIT can compile this as an in-place modification in some cases. This is one of the reasons we will opt for reference counting for our GC: because the language is pure and strict, making a reference cycle is impossible.
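As a rough illustration of why reference counting makes this possible, the standard library's `Arc::make_mut` already does the "mutate if unique, copy otherwise" check. The sketch below uses a plain `Vec` and a hypothetical `push_shared` helper; it is not the VM's or the JIT's actual mechanism.

```rust
use std::sync::Arc;

// Appends to a shared vector. The vector is copied only if some other
// handle still points at it; a unique handle is updated in place.
fn push_shared(mut values: Arc<Vec<i64>>, x: i64) -> Arc<Vec<i64>> {
    // Arc::make_mut clones the inner value first when the count is above 1.
    Arc::make_mut(&mut values).push(x);
    values
}
```

From the caller's side both cases look pure: a value that anyone else can still observe is never changed.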
NOTE: if you make an extension to :system (idk who will, but saying it anyway) and that extension allows mutating functions, then it is responsible for its own GC, which can be very tricky.
if at any point you want to crash doing something like
x = 1 + "error message";
will work and should give line information.
The language uses a stack VM. Popping an argument is achieved via stack.pop_value, which returns None when the stack has reached a terminator.
Every function must follow this calling convention:
- pop ALL its arguments AND the terminator out of the stack.
- return 1 value without a terminator.
- leave the values of the previous function unchanged (note you are allowed to pop them then push back)
- only return types of compatible contexts (see below)
- only call passed functions with types of permissible context (see below)
When calling other functions, push a terminator and then the arguments, starting with the leftmost and going right.
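The convention above can be sketched with a toy stack. This is not the real VM API: `Slot`, `Stack`, `pop_terminator`, `sub_all` and `call_sub` are hypothetical names, values are plain `i64`, and it is an assumption here that pop_value leaves the terminator in place for a separate call to consume.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Slot {
    Terminator,
    Value(i64),
}

struct Stack {
    slots: Vec<Slot>,
}

impl Stack {
    fn new() -> Self {
        Stack { slots: Vec::new() }
    }
    fn push_value(&mut self, v: i64) {
        self.slots.push(Slot::Value(v));
    }
    fn push_terminator(&mut self) {
        self.slots.push(Slot::Terminator);
    }
    // Pops one value; None means a terminator (or the stack bottom) is on top.
    fn pop_value(&mut self) -> Option<i64> {
        match self.slots.last().copied() {
            Some(Slot::Value(v)) => {
                self.slots.pop();
                Some(v)
            }
            _ => None,
        }
    }
    // Removes the terminator if it is on top; reports whether it was there.
    fn pop_terminator(&mut self) -> bool {
        if self.slots.last() == Some(&Slot::Terminator) {
            self.slots.pop();
            true
        } else {
            false
        }
    }
}

// Callee: drains all arguments and the terminator, then pushes one result.
// It computes first - second - third - ... over its arguments.
fn sub_all(stack: &mut Stack) -> Result<(), &'static str> {
    let mut args = Vec::new();
    while let Some(v) = stack.pop_value() {
        args.push(v);
    }
    if !stack.pop_terminator() {
        return Err("missing terminator");
    }
    args.reverse(); // arguments were popped right to left
    let mut it = args.into_iter();
    let first = it.next().ok_or("expected at least one argument")?;
    stack.push_value(it.fold(first, |acc, v| acc - v));
    Ok(())
}

// Caller: terminator first, then the arguments from left to right.
fn call_sub(stack: &mut Stack, args: &[i64]) -> Result<i64, &'static str> {
    stack.push_terminator();
    for &a in args {
        stack.push_value(a);
    }
    sub_all(stack)?;
    stack.pop_value().ok_or("callee returned nothing")
}
```

Because the callee stops at the terminator, whatever the caller had on the stack before the call is still there afterwards.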
The runtime will never cause UB as long as you never directly change the stack to hold an invalid state. HOWEVER, there is a way to get unsound behavior by passing values between different contexts.
the following types are valid in all contexts:
strings ints floats bools nil StaticFunc
Atoms are only valid in contexts that were generated by the same StringTable, or by a parent StringTable that is fully contained within a child StringTable with the same IDs.
Passing an atom with the wrong context will cause wrong comparisons and bad debug messages.
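A toy interner shows why this matters. The sketch below is an assumption about the general shape only (an atom as a numeric ID into a table); the field names and the `intern` and `name` methods are made up for illustration and are not the real StringTable API.

```rust
use std::collections::HashMap;

// An atom is nothing more than an index into the table that created it.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Atom(u32);

#[derive(Clone, Default)]
struct StringTable {
    ids: HashMap<String, u32>,
    names: Vec<String>,
}

impl StringTable {
    // Returns the existing ID for a name, or assigns the next free one.
    fn intern(&mut self, name: &str) -> Atom {
        if let Some(&id) = self.ids.get(name) {
            return Atom(id);
        }
        let id = self.names.len() as u32;
        self.names.push(name.to_string());
        self.ids.insert(name.to_string(), id);
        Atom(id)
    }

    // Looks an atom up; only meaningful for atoms this table knows about.
    fn name(&self, atom: Atom) -> Option<&str> {
        self.names.get(atom.0 as usize).map(|s| s.as_str())
    }
}
```

In this sketch a table cloned from a parent keeps every parent ID, so parent atoms stay valid in the child. An unrelated table can hand out the same ID for a different name, which is exactly the wrong-comparison case.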
there are 2 key components in the context:
- the StringTable
- the global variable table (only an issue for native functions)
All primitive and FFI types can be passed to a different context as long as the source code in both was constructed via the same StringTable. Because of this, the Code struct uses an Arc and RwLock so that it is easy to have multiple contexts that share primitives. Note that running any function holds the lock for its entire duration. If this is an issue, cloning the table and using the clone for the new code also preserves the invariant and is well-defined behavior.
When it comes to passing functions between contexts, it is generally recommended to avoid this when possible. SOME functions can be passed safely, as long as there is no PushGlobal command anywhere in the code. If this is unacceptable, it is possible to wrap the functions in a closure using DataFunc, since sharing a stack is allowed.
The FuncData type is used for passing data in and out of the VM. Depending on usage it CAN be held after the execution context is done. This only happens if it was passed to another FFI handle that holds state.
It is generally recommended to hold a weak pointer to FuncData when possible, or at the very least to give some clear route for freeing all FuncData held by your extension context.
Also note that function equality is determined by pointer. This includes data containers, which can be confusing. For this reason data containers should generally support
data(:eq)(other)
to allow users to check for equality.
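The difference between the two kinds of equality looks like this in Rust. `Container`, `same_object` and `same_contents` are hypothetical names used only for this sketch; `Arc::ptr_eq` is the standard library's pointer comparison.

```rust
use std::sync::Arc;

// Stand-in for a data container exposed to the language.
struct Container {
    items: Vec<i64>,
}

// Pointer equality: true only when both handles refer to the same allocation.
fn same_object(a: &Arc<Container>, b: &Arc<Container>) -> bool {
    Arc::ptr_eq(a, b)
}

// Structural equality: what a user would expect from `data(:eq)(other)`.
fn same_contents(a: &Container, b: &Container) -> bool {
    a.items == b.items
}
```

Two containers built separately with identical items fail the pointer check but pass the structural one.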
The lifetime of the global scope is giving me trouble. After a lot of fighting with it I got it to free everything while being ALMOST fully safe.
This required a 4-hour refactor to add a lifetime, followed by some fairly weird code to get lifetime annotations where they should be. It is likely I am missing a more elegant way to do it, but as of now I have made it so that the global scope is leaked and needs to be manually turned back into a box.
I managed to get it down to 2 static lifetimes in system that need to be fixed. They correspond to 2 closures that are being created. The issue is that the lifetime of the closure and the lifetime of the returned value are not directly linked... What we need is 1 struct containing all of the context that runs things using a &'ctx self
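A minimal sketch of that shape, assuming a hypothetical `Context` struct with a single `prefix` field (the real context would hold much more): the returned closure borrows the context for 'ctx, so it cannot outlive it and no 'static is needed.

```rust
// Hypothetical container for everything the closures need.
struct Context {
    prefix: String,
}

impl Context {
    // The closure captures `&'ctx self`, which ties its lifetime to the
    // context instead of requiring leaked or 'static data.
    fn printer<'ctx>(&'ctx self) -> impl Fn(&str) -> String + 'ctx {
        move |msg: &str| format!("{}{}", self.prefix, msg)
    }
}
```

With this layout the borrow checker rejects any use of the closure after the context is dropped.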
We also potentially want to impl Fn EXPLICITLY, which would mean that we need a similar trait of our own to use instead. That trait could potentially be very beneficial, as it can be used for debugging as well.
Starting to look at profiling and picking very low-hanging fruit, I found that global scope copying was a problem. After a short redesign of how I do blocks, things worked out.
When doing the string inversion run, most of the cost was in the allocation of new string parts. This is good because it means the overhead from the language itself is basically zero and everything is in the algorithm.
I then wanted to test closures. Making a very deeply nested closure and calling it seems to have a LOT of cost in the hash function (which is weird considering we are using ints).
GC is going to be a proper nightmare. The main issue is that if we are called from a subprocess we may accidentally remove values other processes are using. We need a way to clearly mark global values AND the arguments of main as no-remove.
The VM uses an Arc, which was chosen because it does not require a central location where the GC lives in order to work. This allows for a fully safe API, which makes the language easier to embed within Rust applications.
The stack is a custom implementation that is allocated on the heap in most cases and has a constant size. This was chosen because it is much better to error with a caught overflow exception than to allow arbitrary allocation sizes and then fail with an OOM. There is also a slight performance benefit.
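The trade-off can be sketched with a safe, simplified stand-in. `FixedStack` and `StackOverflow` are hypothetical names, and unlike the real stack this version uses no unsafe code: the buffer is allocated once on the heap, and a push past the end returns an error instead of growing.

```rust
#[derive(Debug, PartialEq)]
struct StackOverflow;

struct FixedStack {
    slots: Box<[i64]>, // allocated once; never resized
    len: usize,
}

impl FixedStack {
    fn with_capacity(cap: usize) -> Self {
        FixedStack {
            slots: vec![0; cap].into_boxed_slice(),
            len: 0,
        }
    }

    // Fails with a recoverable error once the fixed capacity is reached.
    fn push(&mut self, v: i64) -> Result<(), StackOverflow> {
        if self.len == self.slots.len() {
            return Err(StackOverflow);
        }
        self.slots[self.len] = v;
        self.len += 1;
        Ok(())
    }

    fn pop(&mut self) -> Option<i64> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        Some(self.slots[self.len])
    }
}
```

The host can turn the error into a language-level exception with line information, which is not possible once the process has run out of memory.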
Other than the stack, there are multiple variable tables that are just vectors holding values. Unlike the stack, these are implemented safely and thus take more room. These variable tables are used for storing captured variables and assignments. They are dropped on tail call optimization (which frees the BULK of the memory).
Tail call optimization does need to store some debug information, specifically 64 bits of span and a bool (which ends up occupying 64 bits in Rust, most likely due to alignment padding). That metadata is not stored for recursive calls to self. This allows loops to be implemented in O(1) memory. As a side benefit, debug messages from loops are now much nicer than they were earlier, since there is no more stacking.
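The O(1) property for self-recursion can be illustrated with a small trampoline. This is a generic technique shown for intuition only, not the VM's implementation; `Step`, `sum_step` and `sum_to` are hypothetical names. A self tail call just overwrites the current arguments and records nothing per iteration.

```rust
// What one step of the function body asks the runner to do next.
enum Step {
    Done(u64),
    TailCall { n: u64, acc: u64 }, // call self again with new arguments
}

// One iteration of a tail-recursive sum: sum(n, acc) = sum(n - 1, acc + n).
fn sum_step(n: u64, acc: u64) -> Step {
    if n == 0 {
        Step::Done(acc)
    } else {
        Step::TailCall { n: n - 1, acc: acc + n }
    }
}

// The runner reuses the same two variables for every self call,
// so memory use does not grow with the number of iterations.
fn sum_to(n: u64) -> u64 {
    let (mut n, mut acc) = (n, 0);
    loop {
        match sum_step(n, acc) {
            Step::Done(v) => return v,
            Step::TailCall { n: next_n, acc: next_acc } => {
                n = next_n;
                acc = next_acc;
            }
        }
    }
}
```

A million iterations run in constant space here, where a naive recursive version would need a million frames.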
For now, source code is stored in a different heap position for every function. However, the VM itself is made in a way that allows us to store source code together. If cache locality remains an issue we will move to a different implementation.