Skip to content
This repository has been archived by the owner on Sep 16, 2021. It is now read-only.

Conditional statement #148

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Conversation

remaininlight
Copy link

So this isn't actually a pull request yet, it's a rough sketch of what a conditional statement might look like (I thought Maxim could use one?). However, this is my first time writing Rust and I've only dabbled lightly in compilers before.. so apologies for the sharp edges and lack of it compiling atm.

I thought I'd ask your opinions @cpdt before I go too much further/ make any structural changes to what already exists. Any pointers are useful, I suspect some change might be necessary to codegen but I don't want to be too brutal - I imagine you probably have some reasons for the current architecture!

Also, how are you debugging the parser at the moment? I thought some tests (like https://github.com/TheDan64/inkwell/tree/master/tests/all) might be useful but haven't set any up yet.

Parsing

AST

  • Added ConditionalExpression and new_conditional. Should the fields on this be SubExpressions or something else?

Mir

  • ast::Blocks with Expressions are converted into CompileResultmir::Blocks with Statements
  • Added Conditional Statement to mir/block/statement.rs
  • Added lower_conditional_expression and matcher for it to pass/lower_ast.rs. What is the idea behind lowering the ast? Haven't come across it before.

Codegen

Copy link
Member

@cpdt cpdt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! I've gone through and written a few comments on some places, but I'll also answer your questions/mention my thoughts on the overall goal here. Be prepared for several walls of text :D

Yeah, there aren't any tests for the compiler yet. Probably would've been a good idea to write those while I did the rewrite several months ago, but my free time was relatively scarce at the time :D I'll get around to writing proper tests for everything there so far eventually, if you want to set some up for the parts you're touching feel free (but there's no obligation to).

So, there's actually a reason why there's no conditional support in Maxim at the moment. It's completely open to discussion (and I would welcome your thoughts!) but basically my thinking is I'd rather have the ability to conditionally execute things on the node level (i.e turning on and off groups of nodes) than in code. There are a couple reasons:

  • Having it on the node level is much more powerful, since it would allow you to disable/enable more than one node based on the same condition (whereas having it in code means you need to duplicate the condition) and would allow you to turn on and off things like group nodes (which don't have code, they contain another surface inside of them). Extractors already kinda do something similar. If you're not familiar, they basically clone a group of nodes 32 times and then individually enable/disable each group - they're useful for things like polyphony, and can even be used as a primitive form of conditional execution (by just enabling/disabling one of the instances) although that's quite wasteful.
  • I'm worried that having branching inside a node could lead to some odd, counter-intuitive behavior around mutable things. For example, the compiler uses the code to determine which direction a control's value goes - if it's read by the code it's treated as an input, if it's written to it's treated as an output. This is used to determine some things like node evaluation order, if a control shares it's value among all extracted instances, etc. But let's say a control is only written to in one of the branches - while that branch is active the control acts as an output (overwriting whatever's put into it), but when it's not active it can suddenly be changed. From the compiler's perspective I don't think this would be a problem, but it could be hard to notice in the editor leading to unexpected things happening. Conditionals on the node level would still have this behaviour but it would be more visible as the disabled nodes would be greyed out (I'd imagine).
  • There's also the question of controls that do some processes each sample (like graph controls), and functions which store some state (like the delay function). Controls are updated before running the code in a node, which would mean they'd always be updated even if they're only used in a branch that isn't active. Functions, however, are run at the point where they're called, which means they wouldn't update if the branch wasn't active. This might be counterintuitive, but I'm not sure...

Let me know if that makes sense. I'd like to hear your thoughts.

Regarding structural changes, you can see my individual comments for more details. There is one thing I think you missed though:

In compiler/src/mir/var_type.rs, you'll need to add a branch to of_statement to determine the type that the conditional expression returns. Assuming you've checked both the consequence and alternative expressions return the same type, I guess you could just recurse to get the type of one of those.

compiler/src/ast/expression.rs Outdated Show resolved Hide resolved
index: usize,
condition: &usize,
consequence: &usize,
alternative: &usize,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why you're passing references to usizes here? Just usize should be fine.

compiler/src/codegen/block/gen_conditional.rs Outdated Show resolved Hide resolved
compiler/src/codegen/block/gen_conditional.rs Outdated Show resolved Hide resolved
// TODO Why is float created here?
// TODO Find a way to invoke a compile_expr function
let condition = self.compile_expr(condition)?;
let condition = builder.build_float_compare(FloatPredicate::ONE, condition, zero_const, "ifcond");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Numbers in Maxim aren't actually just floats, rather they're a structure containing a u8 (the form) and an f32 2-vector (the left and right channels). There's a utility called NumValue that wraps a pointer to one of these structs and provides methods to get the components, e.g.

let condition_num = NumValue::new(condition);
let condition_vec = condition_num.get_vec(builder);

This does bring up the question of how the vector is handled for the conditional branch. Normally when a function needs only a single float it takes the left channel value, but since here we just need a condition instead of an actual value I'd be open to making it AND or OR between the channels.

For the first case (taking the left channel), you can use build_extract_value to get one of the values out of the vector. Here's an example.

@@ -74,6 +81,7 @@ impl Statement {
| Statement::Extract { .. }
| Statement::Combine { .. }
| Statement::LoadControl { .. }
| Statement::Conditional { .. }
| Statement::CallFunc { .. } => false,
Statement::StoreControl { .. } => true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine for the current structure (since the conditional doesn't 'contain' the consequence/alternative statements), but it would be incorrect if the statements where inside the conditional as described above since the statements inside could have side effects.

It's worth mentioning that this is used to remove dead code (if a statement doesn't have side effects and isn't used by another statement, it's removed), so it's pretty important to be correct!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I'll leave this comment open so I don't forget..

compiler/src/parser/mod.rs Outdated Show resolved Hide resolved

let end_pos = Parser::expect_token(TokenType::CloseBracket, stream.next())?
.pos
.1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason it needs to end with a close bracket? Unless I'm reading this incorrectly, it looks like the syntax right now would be:

if cond then something else something_else)

Which I'm guessing isn't what you're intending.

// TODO Make conditional matchers match conditionals
get_matcher(r"if", TokenType::If),
get_matcher(r"then", TokenType::Then),
get_matcher(r"else", TokenType::Else),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex here is fine, although I think this needs to be earlier in the array. The token goes from top to bottom, which means as it stands it'll tokenize "if", "then" or "else" as an identifier instead of one of these tokens. Putting it after the multi-char tokens should be fine, probably under the header "indentifiers".

consequence,
alternative,
})
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things here:

  • You can do &expr.condition instead of expr.condition.as_ref() to get a reference to the value.
  • As I've mentioned previously, this is actually going to emit code that means both the consequence and alternative expressions are executed before the conditional statement, which probably isn't what you want. But the solution is complicated :)
  • You'll need to typecheck the condition to make sure it's a num - self.check_statement_type will do that.
  • You'll need to make sure both expressions have the same type.

@remaininlight
Copy link
Author

Amazing, thanks for the detailed feedback! Interesting learning how it's all put together.

I completely agree with you on disabling/enabling functionality on the node level being a good way to go most of the time and look forward to checking out Extractors. My desire for a conditional is motivated by wanting to be able to create midi events for generative and interactive music ie.

if sensor_value == 1:
    // Send midi note

There shouldn't be anything too heavy on their branches and tbh I don't think I'd even use the else that much, so for now I think changing the MIR structure shouldn't be necessary atm - given that evaluating both then and else should be cheap.

I'd wondered how inputs and outputs were determined, that sounds neat - I'll check it out when I have time. I love the flexibility of being able to modulate all the controls with each other but I think it might be a bit confusing for your average musician/ casual user. If they form part of your intended audience I wonder if, at some point in the future, some way of hiding internal state from the control surface and overriding each control's status as an input or output could be a good idea - perhaps in the right-click menu for that control? Otherwise, if people are making nodes themselves imo they should know better than to plug signals into outputs etc.

Re: the difference in updates for controls and functions, I may be missing some subtlety here but it sounds like the functions would behave correctly and there would be some extra processing for a control which isn't active on that branch? I think that a constant cost for each control is not too unexpected, painting them on the screen probably gives you a comparable overhead. I admire your eye for detail in elegance/efficiency (I guess this is from demoscening?) but if Axiom is being used on a PC I don't think anyone would ever notice ;) + If they do then they could do their switching at the node level.

I look forward to progressing through your comments over the next few days.

@cpdt
Copy link
Member

cpdt commented Nov 14, 2018

@remaininlight Hey, sorry for taking a while to get back to you... had a busy weekend :)

That makes sense with the branching. However, I am concerned that using that kind of "traditional" if/else structure will mean people will assume only one branch is evaluated instead of both... That's fine in a lot of cases, but it becomes a bit of a problem when the branches have side-effects (e.g. setting a control's value or calling a function like delay that has some internal state). It would probably also make the result of something like this quite confusing:

if some_cond:
    my_var = 2
else:
    my_var = 3

Since both branches are executed, my_var would end up always being 3 which doesn't make much sense!

Something that would be a bit easier to implement (wouldn't require changes in the parser or MIR) is to define a new if function, that looks like this:

my_var = if(some_cond, 2, 3)

I think that makes the semantics a bit more clear. What do you think?

Regarding the MIDI use-case specifically: yeah, that's been something I've been wanting to implement for a while, but I'm not 100% sure on the best way to do it yet. I wonder if something like:

out:midi = midi_note(gate, note, velocity)

would work better than having an explicit if? The way I'm imagining that is that internally the function would emit a note on event when gate rises, then a note off when it falls. Might be a bit more error-proof than providing functions to just emit MIDI events :)

As for inputs/outputs, yeah I have some plans around that to make it easier to understand. I think what you're talking about with hiding internal state is something that group nodes already handle nicely (if I'm understanding your point correctly) - they basically allow you to have a whole node area inside of them, and you can 'expose' the controls you want from inside that to show on the group node's control surface. I do think we can do a lot more around showing which controls are inputs and outputs, and which ways data is flowing through connections. Currently input number controls default to knobs while outputs default to "plugs" (which look kinda like read-only knobs), although I have considered adding arrows to knobs and connections to make that more clear. I think that added clarity will also allow us to add more behaviour around how connections interact, too. For example, if you currently connect two outputs into one input, one of the outputs will override the other (and it's not defined which). But a more useful behaviour here would probably be to add together the values of those two inputs, which could get quite confusing without some way to easily see the direction of connections.

As for control/function updates, it's less about performance (which is still very important when you're running the functions 44100 times per second :D) and more about side-effects. For example, the graph control has a bar moving across that shows the current time in the graph. The position of this bar (and the state of the graph) is updated in the control's update function. I was imagining a case where someone was only using the control inside a branch, and was possibly expecting the graph to become 'disabled' (i.e stop moving) when that branch isn't active, which wouldn't happen because control updates are always run. But I'm not too sure if that actually would be counter-intuitive, and it doesn't really matter with the "branchless" design anyway.

(&else_val, &else_bb)
]);

phi.as_basic_value().into_pointer_value()
Copy link
Member

@cpdt cpdt Nov 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth mentioning, if we're going for a "branchless" approach (i.e both sides are evaluated), LLVM actually has an instruction called select for this that means we don't need to do any branching/basic blocks/phi nodes in the IR. You give it a boolean and two values of the same type, and it returns either the left or right of the values depending on the boolean.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks - I've attempted to implement this in an if function on this pull request but it's not quite working yet..

@remaininlight
Copy link
Author

No worries, I can relate! :) I think I'm slowly starting to get my head more into Maxim, now I see that in the case where both branches are executed it would not be possible to conditionally execute only one of them (by definition..).

I had actually started making an if function when I first came to this but abandoned it in favour of chasing the if statement. I agree the function is clearer for the case when both branches are executed, it looks like it might be used in the place some languages use a ternary statement:

my_var = some_cond ? 2 : 3

Though I guess that's still a little ambiguous as to what is executed. I'm not sure if Maxim can return different types from the same function (ie. can a function return a midi message or something which casts to no midi message?) - would it be possible to conditionally send a midi note with this? Perhaps something like:

out:midi = if(some_cond, midi_note(channel, note, velocity), 0)

A midi_note function could be a good way to go for many people's use cases, though if a note on/off is triggered each time the gate changes I guess you could end up with 44100 notes per second from a continuously changing signal? Maybe having gate non-zero being a note and zero being no output could work.

Note ons with zero velocity vs note offs are pragmatically the same thing in my experience (in fact I think it says something along those lines in the MIDI standard somewhere) but some people might also want to be able to send note offs with velocities. Most libraries I've seen have two functions, meaning the note_on function (which is used most of the time) doesn't have to have too many parameters and people can still use a note_off function when they want it.

Yes, now you mention it, groups do that well. Arrows sound good, as does adding the values of the connections together on one input. Re: controls and function updates I am guided by your knowledge of how these pieces fit together :)

For now I have implemented an if function which is on this pull request. It doesn't quite work.. it doesn't switch when the conditional input changes but I thought I'd send it over anyway - I have to be up early and can't quite finish it tonight.

Copy link
Member

@cpdt cpdt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, looks good! Yeah, it is roughly equivalent to a ternary expression, although those normally also conditionally execute only one side or the other.

MIDI values in Maxim aren't actually just a single MIDI event, instead they're an array of up to 16 events... so a single MIDI connection in the editor, or variable in Maxim, can carry up to 16 things happening at the same instant. This means that "no MIDI message" would just be an empty array. It also means you could theoretically combine two event values by concatenating the arrays if you wanted to, say, add a new event into a 'stream'.

Yeah, that was what I was meaning by rising/falling edge - when the gate goes from 0 to something that's not zero it generates note on, and vice versa with note off. The point about note off velocities is interesting, I don't think Axiom currently does anything with those since I wasn't too sure how it would actually be used. Maybe you can provide some insight to how that would be useful here?

I've left a few more comments on the new code, let me know your thoughts :)

compiler/src/codegen/functions/if_function.rs Outdated Show resolved Hide resolved
@@ -106,7 +106,8 @@ define_functions! {
Note = "note" func![(Midi) -> Tuple(vec![Num, Num, Num, Num])],
Voices = "voices" func![(Midi, VarType::new_array(Num)) -> VarType::new_array(Midi)],
Channel = "channel" func![(Midi, Num) -> Midi],
Indexed = "indexed" func![(Num) -> VarType::new_array(Num)]
Indexed = "indexed" func![(Num) -> VarType::new_array(Num)],
If = "if" func![(Num, Num, Num) -> Num]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is declaring that the function takes in numbers and output numbers, which means the compiler will only accept calling it with numeric values and not MIDI ones (or arrays, or tuples) whereas we probably want it to accept/return any type. There's no way to define generic functions at the moment, but I don't think a basic implementation of that should be hard. If you want to give that a shot you can, otherwise I'm happy to look at it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had a swing at an Any type, which so far has involved disabling type safety for it and it only allocates space for numbers atm from AnyValue::get_type.. which doesn't seem ideal. Let me know if I'm headed in the right direction!

@@ -769,7 +771,9 @@ impl<'a> AstLower<'a> {
expected_type: mir::VarType,
actual_type: mir::VarType,
) -> Option<CompileError> {
if expected_type != actual_type {
if (expected_type != actual_type)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type safety checking is disabled for Any type here. It would be good if there was some way of linking the output type of a function etc to its input types? ie. an if function has midi messages as its arguments, so the output takes the type of a midi message, haven't worked out the best way to do this yet.


impl AnyValue {
pub fn get_type(context: &Context) -> StructType {
// TODO Just returning Number structure for now. Can this infer what its correct type should be from context?
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what should be returned here, think some restructuring might be necessary in order to infer the actual type the AnyValue is representing and return space for that?

@@ -106,7 +106,8 @@ define_functions! {
Note = "note" func![(Midi) -> Tuple(vec![Num, Num, Num, Num])],
Voices = "voices" func![(Midi, VarType::new_array(Num)) -> VarType::new_array(Midi)],
Channel = "channel" func![(Midi, Num) -> Midi],
Indexed = "indexed" func![(Num) -> VarType::new_array(Num)]
Indexed = "indexed" func![(Num) -> VarType::new_array(Num)],
If = "if" func![(Num, Num, Num) -> Num]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had a swing at an Any type, which so far has involved disabling type safety for it and it only allocates space for numbers atm from AnyValue::get_type.. which doesn't seem ideal. Let me know if I'm headed in the right direction!

@cpdt
Copy link
Member

cpdt commented Nov 20, 2018

Hey! Just wanted to say, thanks for the continued interest/work on this :)

Unfortunately I don't think we'll be able to get away with just an Any type. Since we're compiling code to bare-metal it's pretty important that we do actually know what the types are - bad stuff would happen if a user tried to do this for example:

# 'something' is typed as Any but contains a Midi value
something = if(cond, note_on(), note_off())
# out:num expects a Num, since 'something' is Any it passes the type checker, but it'll crash when emitting the LLVM IR
out:num = something

What we need is a way for a function declaration to define generic parameters, and then the AST -> MIR lowering pass can check this and change the call to be to a concrete function (i.e one with actual types defined). Then we just need a way to instantiate the concrete function, which I don't think should be complicated. I can have a look at getting this infrastructure in place sometime soon if you want, since it'll probably get quite far down into the nitty-gritty of how all the pieces work together :)

This does raise a question though, of if we want to allow the function to have different behaviour for different types. For example, numeric values contain two floats for left and right... if you have an expression like:

if(cond, a, b)

and cond is equal to {0, 1}, should the result be {b.left, a.right}? Or should we just take the left channel value like we'd do for other types (since they don't have two channels) and return b? I kinda prefer the first one since it better mirrors how other numeric functions work (i.e they operate independently for each channel), but it might be counterintuitive that it doesn't work that way for other types?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants