Modular PlusCal [proposal] #75
Comments
Here is a more fleshed out example of the transformation. Before
After
Now, if you wanted to use this algorithm from another PlusCal module:
Or if you want to use it from custom Go code (code is approximate, precise model for "global" variables needs work):
Here is an approximation of counter.go:
|
Extra thoughts on codegen for globals in this context: what if we used unbuffered channels as locks? As in, at the beginning of the program you send the initial values of all globals on channels. You then pass those channels to the relevant processes, and when a process needs a variable it performs a receive. If the variable is available, its value will be received. If it has already been received, any other process trying to receive it will block. When the process is done with the variable, it sends the new (possibly updated) value back along the channel.
The key advantage here is that you can pass channels around easily, whereas passing pointers around requires separately tracking a lock mechanism. The pointer-plus-lock approach is likely to not compose as well in a modular context. |
A couple thoughts/questions:
|
@rmascarenhas good questions.
|
Interesting point from a discussion with @mrordinaire yesterday, clarifying that my idea for Modular PlusCal is directly replacing the PlusCal translator rather than compiling from Modular PlusCal to regular PlusCal. The important insight is that regular PlusCal is fundamentally broken when it comes to variable priming (that is, correctly generating |
P2TCommit.tla before
P2TCommit.tla after
|
In consideration of #77, here are my thoughts on how custom implementations might apply to Modular PlusCal.
Example 1: if a network between two processes is modeled by a global variable and another process that "interferes" with that variable, we model reads and writes to that global as calls to an opaque object (taken as a Modular PlusCal parameter) that passes these through to a real network.
Example 2: if a process needs a timer, it can communicate with a second process via a shared global variable that "pretends" to keep time (since PlusCal is not realtime). In Go that variable can be modeled as an opaque object (taken as a Modular PlusCal parameter) that implements real semantics.
In summary, we can implement almost any semantics with a combination of model-check-only processes and flexible compilation of variable reads and writes. For ease of use we should provide a Go library of network/OS primitives that can be plugged into this mechanism. If someone wants something we don't have, the mechanism should be defined as an interface with a few necessary methods (acquire, release, ... not sure what else) so they can implement extras as necessary.
Extra note: this change would also remove the requirement for locking on global variables, and I'm not sure what place lock groups would have here. You could still have custom Go modules that use locking internally, but if a point-to-point network is correct for some variables you could dispense with locking semantics entirely for those.
Extra note 2: to avoid locking/synchronisation bugs, we could have PGo automatically generate alphabetically-ordered "acquire" and "release" calls the same way the backends operate currently.
@bestchai any thoughts? This is the "purest" adaptation of PlusCal that I can think of, with no additional semantic changes to Modular PlusCal. |
Example: send/recv with retry
Below is a set of archetypes that implement a send/receive pair with ack. This will be the basis for trying to express some custom error functions in Modular PlusCal.
A lossless network (just "plug" the two instances into each other):
A lossy network with no reordering:
Now for a similar exercise with a lossy, reordering-capable network that does not duplicate packets (the "vanilla" example is omitted, as it would only get longer and harder to get right):
critique: while it is fairly clear what is going on here, this is getting longer again due to more duplicated code. Since we are effectively redefining a variable's read/write interface, we need a way to abstract away the "network-as-set" pattern.
Idea: factor out the duplicate code using a "mapping macro". This is a construct that encapsulates a complete remapping of a single variable. Assuming a library of standard constructs was available, this would make reuse of common communication models fairly straightforward.
critique: what if we need two variables to be consistently remapped together, or a mapping only makes sense with a parameter? |
2PC and fail-stop
Mapping is useful when communication channels are fallible, but does not address when processes are fallible. One of the big ideas behind 2PC is logging and retry on failure, which needs simulated failure and "real" logging. Below is a 2PC spec in Modular PlusCal that does not implement failure.
In order to show that this algorithm cannot withstand fail-stop, I propose an additional mapping construct: the ability to inject code in between steps of an algorithm. Here is how one would change the
Semantics of |
I see some interesting ideas here, @fhackett, thanks for writing these detailed examples.
|
I've added responses in-line - thanks for your comments.
On Wed, Jul 25, 2018 at 4:22 PM, Renato Costa ***@***.***> wrote:
I see some interesting ideas here, @fhackett <https://github.com/fhackett>,
thanks for writing these detailed examples.
-
Comment 1
<#75 (comment)>
- is the emptyMsg constant some leftover from a previous version of
your idea? If so, better remove it so it's less confusing.
I can't remember why that was there, and can't find any references to it
either. Removed.
-
- If I understand this right, toSend is a sequence of messages you
want each instance of SendRetry to send, is that right? If so, it
would make the example more realistic to actually pop from the list, or at
least iterate over its elements. I know these are just drafts, but making
them as close to "reality" as possible would help understand the ideas (at
least I think it would).
I have added a couple of lines in order to make the algorithm more correct - the corrected version is what was intended.
-
- Another question: is toSend supposed to be local to instances of
sendRetry? Does it matter?
toSend is how one would control an instance of SendRetry - you would pass
in a list from Go and the algorithm would send the contents of the list (in
theory you could update the list dynamically, so you're not limited to some
initial list value). That is the main reason why toSend is not local.
-
- I don't think the Maybe operator has the meaning you intended it
to. CHOOSE will not cause TLC to branch, so you wouldn't explore
both cases (i.e., message is sent or dropped). I'm sure you know this since
you used either in your following examples, so this is also in the
"making the examples realistic" category.
My mistake. I had forgotten that that was how TLC worked. Added a note to
the example.
-
- I'm not sure I understand the idea of "mapping macros". Is the
idea that these macros always define read and write "operators"? I
was confused by the fact that you have read chosen within the
definition of read itself, but my current understanding is that in read
chosen, read is some special keyword that only has meaning within a
mapping macro. Is that right?
Actually it's the other way around. Mapping macros mirror the way you
would be able to pass in custom Go implementations of variables, but
for model checking. Whenever a variable is read (i.e., any reference in a
critical section), the corresponding "read" code would run. Same for write,
but with assignment. The "read" and "write" statements are a way of
specifying the effective result of the mapping macro, like "return" for
functions.
-
-
Comment 2
<#75 (comment)>
- Syntax nitpick: I think you probably meant to say fair instance 1..3
of Cohort? The \in currently there looks out of place.
Sure, syntax is up for debate. The only issue is how to differentiate
between sets of processes and single processes, since syntactically your
change would be indistinguishable from a single instance named {1, 2, 3}.
-
- Not clear to me: is the BOOL you added to localSuccesses just a
placeholder for an actual boolean the spec writer would use, or is that
some special value in Modular PlusCal? You probably want something with the
semantics of "either true or false" there (i.e., have TLC branch on every
possible decision for every node in cohort).
That is a mistake. Corrected to { TRUE, FALSE } (I must have forgotten to
correct the code in the post - I found and fixed that bug in my test files
before).
-
- Is it right to say interleaving { S } can be desugared by
wrapping every step with either { step } or { S }? If so, we can
experiment with this in regular PlusCal to see if it works well as a technique
to express failure/recovery/restart.
That sounds like a valid desugaring. The actual process I imagined for
direct-to-TLA compilation was to add extra branches nearer to Next, but
your version should be equivalent if more verbose ... though one possible
hurdle is control flow like while and if, which may be harder to convert.
Worth a try at least.
-
-
General comments.
- This does seem to transform PlusCal into a different beast. It has
the advantage of being a lot more general than #77, but on the other hand
the changes to existing specs would be a lot more involved.
I agree. The original intention here was to convert our existing 2PC spec,
but the changes required in order to specify a proper API were non-trivial.
There is definitely a "you may have to rethink your code" disclaimer, but I
think that is a fundamental property of adding an API to code that did not
originally have one.
-
- I wonder how the compiler will optimize "mappings" as in the
examples you have here, since the mechanisms seem to be pretty general.
Mappings are for model-checking so they will not appear in the Go output,
since the idea is to compile only the archetypes. If we did compile the
mappings (say, in order to generate an example main function that runs the
archetypes as described in the spec), they would probably just be expanded
like macros and then be optimised (or not) like anything else.
-
- I guess a good exercise would be to try to write a complicated
distributed algorithm in Modular PlusCal. Maybe WPaxos (which has an
existing PlusCal spec), or Raft (a lot more work). Otherwise it's hard to
know whether we are missing something critical. I can help with that.
Any help would be appreciated there - I ran into a wall on the Raft front,
and was unaware of WPaxos. One spec I'd like to add is 3PC, even if we just
write it from scratch since it features timeouts and I want to make sure
the mappings support that properly.
|
Here is an example spec showing proposed pass by reference semantics for discussion later today. The motivation is being able to use procedures inside archetypes, where the globals usually operated on by procedures are parameterised and thus not accessible with the original procedure semantics.
|
Thanks for thinking this through, @fhackett! A couple of questions/comments follow:
Maybe examples would make it easier for me to understand what you propose.
In your example:
Would it be allowed? Were you planning to have My goal here is to make things more straightforward not only in terms of PGo's implementation, but also understanding what's going on under the hood when we are writing specs in modular PlusCal. The semantics seem to be pretty nuanced and I'm thinking how we could drop features without sacrificing expressiveness. One possibility would be to not support mapping macros on tuples and records (i.e., not at the granularity you are proposing here). But things could still work if we restricted their assignment to |
More compilation thoughts and discussion
I have tried some more experiments and it seems that any "smart" compilation where we parameterise references using TLA+ tricks will not work. The problem is generating correct Under a mapping macro that might assign to any global variable during its execution, it is impossible to statically compute what side-effects any part of a Modular PlusCal algorithm might have. Attempts at expressing dynamic computation of this have failed with TLC errors, so I'm leaning toward "impossible as far as I can tell" as the status of "smart" compilation of Modular PlusCal. So, this leaves us with complete static expansion of @rmascarenhas to address your comments in order:
it seems that was impossible anyway (see above). I agree, that was confusing.
I see. I tried this in the TLA+ toolbox to double check and I have no idea how I reached that idea. Thank you for making me check this. The correct semantics are that
You can't actually map This does bring up the issue of passing the same object in multiple positions as
See my response to read. Seems like this was never a thing. 🤦♂️ Write macros will be executed only once, not because of some weird logic but because you can only assign to something once in the same critical section.
Given our strategy to statically expand these things, assignments to parameter |
Progress update on PlusCal translator:
Aside: it seems that you can write |
Closing this in favor of the Wiki entry on Modular PlusCal, which should include the result of all the discussion that happened here and offline. |
This issue documents my proposal to extend the PlusCal language with features that enable composability.
Currently one PlusCal algorithm cannot interact with another in any way. They cannot refer to each other during model checking, and there is no consistent way to link PGo-compiled implementations together via shared variables: each algorithm assumes exclusive access to its variables, so sharing them is liable to break invariants unless the generated code is significantly modified.
The minimal set of features required for PlusCal to support modules is:
Example of what a PlusCal developer would do in order to use these things, assuming they are writing a system like Raft where all the processes have the same code:
Hopefully this gives a decent outline of what I'm talking about - let me know if anything is unclear.