Define identity elements for -< and ==* and name some flows. #70

Open
wants to merge 17 commits into
base: main

NoahStoryM
Collaborator

Summary of Changes

As a kind of Cartesian product, multiple values have an identity element, (values), which can be written as 1.

-< and ==* can be regarded as monoids with identity elements *->1 and 1->1 respectively.

Public Domain Dedication

  • In contributing, I relinquish any copyright claims on my contribution and freely release it into the public domain in the simple hope that it will provide value.

(Why: The freely released, copyright-free work in this repository represents an investment in a better way of doing things called attribution-based economics. Attribution-based economics is based on the simple idea that we gain more by giving more, not by holding on to things that, truly, we could only create because we, in our turn, received from others. As it turns out, an economic system based on attribution -- where those who give more are more empowered -- is significantly more efficient than capitalism while also being stable and fair (unlike capitalism, on both counts), giving it transformative power to elevate the human condition and address the problems that face us today along with a host of others that have been intractable since the beginning. You can help make this a reality by releasing your work in the same way -- freely into the public domain in the simple hope of providing value. Learn more about attribution-based economics at drym.org, tell your friends, do your part.)

@NoahStoryM NoahStoryM changed the title Define identity elements for tee and relay. Define identity elements for -< and ==*. Aug 29, 2022
@NoahStoryM NoahStoryM force-pushed the identity branch 2 times, most recently from add22cd to 845aa79 Compare August 30, 2022 11:23
@NoahStoryM
Collaborator Author

NoahStoryM commented Aug 31, 2022

I wonder whether some forms could be abstracted as functions, such as X and n> (it seems that forms written without parentheses could be handled this way).

@countvajhula
Collaborator

Yeah, @benknoble also pointed this out recently and it ended up making a big difference in performance in that case. It would be great to have implementations behind functions if there is no compile-time benefit to defining the macro in-place. For instance with the threading form ~>, it makes sense to use a syntactic expansion instead of a function as it gives us a performance benefit (otherwise we would need to reverse the order of composition at runtime using reverse, instead of doing it syntactically at compile time). I haven't done a review of cases where a function would work, and I think you're right that the pure-identifier forms would all be abstractable as functions.

I haven't had a chance to review this PR (and some of your other PRs! 😅 ) yet, but on a logistical note, abstracting forms behind functions sounds like it would be best to do in a separate PR (or PRs) to keep them narrowly scoped.

@NoahStoryM NoahStoryM changed the title Define identity elements for -< and ==*. Define identity elements for -< and ==* and name some flows. Sep 1, 2022
@@ -186,6 +189,9 @@
(append (values->list (apply op vs))
(apply zip-with op (map rest seqs))))))

(define 1->1 (thunk (values)))
(define *->1 (thunk* (values)))
Collaborator

Could you explain the notation here? It doesn't seem to indicate the number of values since the first is 0->0, and the second is N->0.

Collaborator

These appear in error messages shown to the user:

> (~> (1 2 3) (relay))
; 1->1: arity mismatch;
;  the expected number of arguments does not match the given number
;   expected: 0
;   given: 3

Any reason not to use (procedure-rename ...) within relay, relay* and other forms using 1->1 and *->1 so that the name of the form used (e.g. relay in the above example) is reported to the user?

Collaborator Author

@NoahStoryM NoahStoryM Oct 11, 2022

Quoting the reviewer: "Could you explain the notation here? It doesn't seem to indicate the number of values since the first is 0->0, and the second is N->0."

Values can be regarded as a kind of Cartesian product, so (values "a" "b" "c") can be written as "a" × "b" × "c". In category theory, 1 is the identity element of ×, which corresponds to (values) in Racket.

*->1 : accepts any arguments and returns (values).
1->1 : accepts no value and returns (values).

On the one hand, I haven't thought of a better notation for (values). On the other hand, Qi seems to have a deep connection with category theory, so I think it makes sense to use the notation in category theory directly (and we can use + and 0 to represent covalues and the identity element of it).

Quoting the reviewer: "Any reason not to use (procedure-rename ...) within relay, relay* and other forms using 1->1 and *->1 so that the name of the form used (e.g. relay in the above example) is reported to the user?"

For example, *->1 is not only the identity element of -<; in Qi it is also the very same procedure as ⏚:

> (eq? (☯ ⏚) (☯ (-<)))
#t

And if we rename these procedures, the equality will be lost.
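
To make the notation concrete, here is a minimal sketch in plain Racket (without Qi) that reuses the two definitions from the diff and checks that both return zero values. The `collect` helper is a hypothetical name introduced only for this illustration.

```racket
#lang racket/base
(require racket/function) ; provides thunk and thunk*

;; Same definitions as in the diff under review.
(define 1->1 (thunk (values)))   ; accepts no arguments, returns zero values
(define *->1 (thunk* (values)))  ; accepts any arguments, returns zero values

;; Hypothetical helper: gather the values an expression produces into a list.
(define-syntax-rule (collect expr)
  (call-with-values (lambda () expr) list))

(collect (1->1))        ; '()
(collect (*->1 1 2 3))  ; '()
```

Because each name is bound to a single procedure object, every form that expands to `*->1` shares that object, which is what makes the `eq?` result above possible.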

Collaborator

Thanks for clarifying! The correspondence to category theory and the duality between sum and product makes sense, but do you anticipate any particular advantage gained by having the eq? equivalence? In general since equality of functions is undecidable, I would be skeptical of code that employs logic based on checks for equivalence between functions based on their identities. Unless there are some specific benefits you have in mind, I would favor keeping the names recognizable in error messages to supporting an eq? equivalence. We can still have the actual functions named as 1->1 and *->1 as that would preserve the identity from the perspective of the codebase instead of having duplicate implementations in the different forms.

Collaborator Author

@NoahStoryM NoahStoryM Oct 11, 2022

Quoting the reviewer: "but do you anticipate any particular advantage gained by having the eq? equivalence? In general since equality of functions is undecidable, I would be skeptical of code that employs logic based on checks for equivalence between functions based on their identities."

Yes, in general equality of functions is undecidable. But *->1 is special: it is the identity element of -< (a monoid), so it should have this property:

Welcome to Racket v8.6 [cs].
> (require qi)
> (define (f) 123)
> (eq? f (☯ (-< (-<) f)))
#t
> (eq? f (☯ (-< f (-<))))
#t

I prefer to preserve the properties of mathematical structures as much as possible.

Collaborator

Yes, I agree that we should aim to preserve mathematical properties. Yet, eq? is not a mathematical relation but an implementation-dependent relation in the Scheme world. It doesn't assess equality based on properties of the objects being compared but based on arbitrary details of the implementation (e.g. memory location where the values may happen to be stored). As a result, I would say that we should avoid considering eq? behavior in our design process except in cases where it provides a compelling performance benefit in practice -- otherwise, better to think in terms of equal? (but even that is problematic -- the paper on egal? covers some of the issues but not all).

Aside from the specific choice of equality relation, in the present case, I feel the mathematical properties we'd like to preserve are:

((☯ (-< (-<) f)) arg ...) = (f arg ...) = ((☯ (-< f (-<))) arg ...)

That is, an operational equivalence in terms of the result of applying these functions to arguments. After all, we could have a totally different definition of *->1 which would also fulfill the monoid laws, but would not be eq? to the *->1 defined in the codebase. And in this case, there is nothing specific that we need to do in order to ensure the above relation holds, as (relay) and others would satisfy this relation even without being eq?.

Btw, I also meant that code in general should not do checks like (if (eq? f1 f2) ...) or even (if (equal? f1 f2) ...) or (if (member f1 (list f2 f3 f3)) ...) where the fs are functions, since in the general case this is undecidable, and in special cases, it is implementation-specific and akin to a "hack" for performance. In this line of thinking, we should not encourage users to write code like (if (member (☯ (-<)) (list f ...)) ...). Instead, we could simply apply the function to a relevant argument of interest, (if (= v (☯ (-<)) v) ...) without making a general statement that the function "is" the monoid identity.
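
The distinction between operational equivalence and eq? can be sketched in plain Racket. The `tee` procedure below is a hypothetical stand-in for -< written only for this illustration, not Qi's implementation; `results` and `f` are likewise made-up names.

```racket
#lang racket/base

;; Hypothetical stand-in for -<: pass the same arguments to every flow
;; and return all of their results together as multiple values.
(define (tee . fs)
  (lambda args
    (apply values
           (apply append
                  (for/list ([f (in-list fs)])
                    (call-with-values (lambda () (apply f args)) list))))))

;; An arbitrary two-argument, two-result function to test with.
(define (f x y) (values (+ x y) (* x y)))

;; Collect the values produced by applying g to args into a list.
(define (results g . args)
  (call-with-values (lambda () (apply g args)) list))

(results f 3 4)              ; '(7 12)
(results (tee (tee) f) 3 4)  ; '(7 12)
(results (tee f (tee)) 3 4)  ; '(7 12)
(eq? f (tee (tee) f))        ; #f, since tee returns a fresh closure
```

In this sketch `(tee)` acts as a left and right identity for every argument list tried, yet none of the composed procedures is eq? to `f`. The monoid law holds at the level of results even when object identity is not preserved.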

Collaborator Author

Quoting the reviewer: "As a result, I would say that we should avoid considering eq? behavior in our design process except in cases where it provides a compelling performance benefit in practice."

I think eq? does provide a performance benefit, because this way programmers can insert the identity functions anywhere without worrying about performance overhead.

For example, I thought about simulating the limit in category theory by inserting procedures between the arguments of compose.

(define do (make-parameter values))
(~> f g h ...) ; = (~> f (do) g (do) h (do) ...)

But it affects the performance of Qi's original code, because it inserts values between all the function arguments of compose. This is what motivated me to submit this PR.

On the other hand, I'm not sure it's a good idea to rename the returned procedures in every case. If we rename (-<), how should we deal with (-< add1)? If we decide to rename add1, there seem to be two ways:

> ((procedure-rename add1 'compiled-tee-flow) 1 2 3)
compiled-tee-flow: arity mismatch;
 the expected number of arguments does not match the given number
  expected: 1
  given: 3
 [,bt for context]
> ((let ([compiled-tee-flow (lambda args (apply add1 args))]) compiled-tee-flow) 1 2 3)
add1: arity mismatch;
 the expected number of arguments does not match the given number
  expected: 1
  given: 3
 [,bt for context]

The first way might be consistent with how you expect to rename *->1, and the second way is consistent with the case where -< accepts more arguments:

> (~> (1 2 3) (-< add1 sub1))
add1: arity mismatch;
 the expected number of arguments does not match the given number
  expected: 1
  given: 3
 [,bt for context]

I'm not sure whether renaming named functions (like add1, *->1) makes it easier for programmers to debug.

Collaborator

Related thoughts:

  • the idea that $\forall f g, (\forall x, f x = g x) \implies f = g$ is called extensionality (specifically functional extensionality, since the objects are functions). It derives from a general axiom of dependent functional extensionality (I believe in general extensional views are debated, but in this case it seems particularly useful).
  • In Racket, (eq? f f) when f is a lambda of some kind is guaranteed, and transitively any expression that evaluates to f is eq? to f. This means that you could (for example) make functions the keys of a hasheq. But I think this is probably broken under procedure-rename. OTOH, this is only useful when trying to dispatch on behavior from a set of procedures. I don't think the perf. benefit being discussed is the result of eq? but rather the result of not adding extraneous layers to the computation.
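
One way to check the point about procedure-rename is a small experiment like the sketch below. It uses only `procedure-rename`, `object-name`, and `make-hasheq` from `racket/base`; the names `renamed` and `table` are hypothetical, and the results of the last two expressions are left for the reader to evaluate rather than asserted here.

```racket
#lang racket/base

;; Rename add1 the way a compiled flow might be renamed.
(define renamed (procedure-rename add1 'compiled-tee-flow))

(object-name renamed)  ; 'compiled-tee-flow
(renamed 1)            ; 2

;; An eq?-keyed table that dispatches on the procedure itself.
(define table (make-hasheq (list (cons add1 'increment))))

(hash-ref table add1 #f)     ; 'increment
(hash-ref table renamed #f)  ; evaluate to see whether the renamed procedure is still found
(eq? add1 renamed)           ; evaluate to see whether identity survives the rename
```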

(make-list n args))))]))
#'(procedure-rename
(curry repeat-values n)
'compiled-fanout-flow)]
Collaborator

Did you mean to change the implementation here? The original version for this case does some computation at compile time, while the new version runs at runtime. i.e. the original would expand to something like:

(fanout 5)
->
(lambda args
  (apply values
    (append args args args args args)))

... which I believe was slightly faster on benchmarks.
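
The difference between the two strategies can be sketched as follows. This is an illustration under assumptions, not the code in the PR: `fanout-runtime` and `fanout-static` are hypothetical names, and the macro only handles a literal non-negative integer.

```racket
#lang racket/base
(require racket/list
         (for-syntax racket/base racket/list))

;; Runtime strategy: the list of copies is built on every call.
(define (fanout-runtime n . args)
  (apply values (apply append (make-list n args))))

;; Compile-time strategy: the literal n is consumed during expansion,
;; so (fanout-static 3) expands to
;;   (lambda args (apply values (append args args args)))
(define-syntax (fanout-static stx)
  (syntax-case stx ()
    [(_ n)
     (exact-nonnegative-integer? (syntax-e #'n))
     (with-syntax ([(arg ...) (make-list (syntax-e #'n) #'args)])
       #'(lambda args (apply values (append arg ...))))]))

(call-with-values (lambda () ((fanout-static 3) 1 2)) list)  ; '(1 2 1 2 1 2)
(call-with-values (lambda () (fanout-runtime 3 1 2)) list)   ; '(1 2 1 2 1 2)
```

Both produce the same values; the macro version simply moves the replication of `args` from call time to expansion time, which is the work the original implementation avoided at runtime.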

Collaborator Author

I forget why I did this at the time; maybe it was for debugging. I'll revert it.

Collaborator

For the record, as part of the compiler work, we are separating expansion from compilation, and eventually this will probably become a compiler optimization and wouldn't be part of the expansion step.

n
" arguments from "
args)))
(let-values ([(sargs rargs) (split-at args n)])
Collaborator

Nice 👌

Collaborator Author

I'm considering if it's necessary to optimize group:

(group 0 *->1 f)   ; f
(group 0 1->1 f)   ; f
(group +inf.0 f g) ; f
(group +inf.f f g) ; f

In addition, it seems that loom-compose does not need to use an optional argument?


(define (fanout-parser stx)
(syntax-parse stx
[_:id #'repeat-values]
[(_ 0) #'*->1]
[(_ 1) #'values]
Collaborator

I'm not able to verify the original behavior here at the moment, but does this PR modify the behavior of any edge cases, e.g. (fanout 0) or (gen), or (relay)? If it does, it would be great to add tests for these cases. For cases like (fanout 0) and (fanout 1) we would need tests even if the behavior hasn't changed since it would now hit different code that needs to be covered by tests (unfortunately the coverage check on PRs doesn't work at the moment, but you can run make cover to generate a coverage report locally).

Collaborator Author

This PR doesn't modify the original behavior; I will write more tests later.

3 participants