Please can you implement a layer that works like the "value" propagation of IPA, but without keys, queries, and attention. Instead, have an optional "gating" layer that sees both node and edge info and multiplies the entire message by a scalar (perhaps with a default softplus activation to keep it in (0, inf)). It might also be useful to try a version structured like the "multi-head" setup, with many small layers side by side, each with its own gate, instead of one large one.
We can call it the "Invariant Point Graph Perceptron" or something similar. The main idea is to keep as much of the IPA behaviour as possible while removing the strong inductive bias toward attending to close neighbours, since we can control that via the graph structure anyway.
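A minimal PyTorch sketch of what this could look like, assuming an edge-list graph representation. The class name `IPGP`, the gate MLP architecture, and the residual output are illustrative choices, not a fixed design; the frame/point machinery of full IPA is omitted here for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class IPGP(nn.Module):
    """Value-style message passing with optional per-head scalar gating.

    Each directed edge (src -> dst) produces a message from the sender's
    node features plus the edge features. An optional gate MLP sees both
    endpoints and the edge, and scales each head's message by a
    softplus-activated scalar in (0, inf). No keys, queries, or attention.
    """

    def __init__(self, node_dim, edge_dim, hidden_dim, num_heads=1, gated=True):
        super().__init__()
        assert hidden_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_dim // num_heads
        self.gated = gated
        # "Value" projection: sender node features + edge features -> message.
        self.value = nn.Linear(node_dim + edge_dim, hidden_dim)
        if gated:
            # One scalar gate per head, computed from receiver, sender, and edge.
            self.gate = nn.Sequential(
                nn.Linear(2 * node_dim + edge_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, num_heads),
            )
        self.out = nn.Linear(hidden_dim, node_dim)

    def forward(self, x, edge_index, edge_attr):
        # x: [N, node_dim]; edge_index: [2, E] (src, dst); edge_attr: [E, edge_dim]
        src, dst = edge_index
        msg = self.value(torch.cat([x[src], edge_attr], dim=-1))  # [E, hidden]
        msg = msg.view(-1, self.num_heads, self.head_dim)         # [E, H, D]
        if self.gated:
            g = self.gate(torch.cat([x[dst], x[src], edge_attr], dim=-1))  # [E, H]
            msg = msg * F.softplus(g).unsqueeze(-1)               # per-head scalar gate
        msg = msg.reshape(-1, self.num_heads * self.head_dim)
        # Sum messages at each receiver; no attention weights anywhere.
        agg = torch.zeros(x.size(0), msg.size(-1), device=x.device, dtype=x.dtype)
        agg.index_add_(0, dst, msg)
        return x + self.out(agg)


# Toy usage: 5 nodes, 8 directed edges, 4 heads.
x = torch.randn(5, 16)
edge_index = torch.randint(0, 5, (2, 8))
edge_attr = torch.randn(8, 32)
layer = IPGP(node_dim=16, edge_dim=32, hidden_dim=64, num_heads=4)
out = layer(x, edge_index, edge_attr)  # [5, 16]
```

With `num_heads=1` this reduces to a single large layer with one gate per edge, so the multi-head variant from the description is just the `num_heads > 1` setting of the same module.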