Home
This project explores a novel artificial neural architecture/application. Here, a simplified spiking neural network represents Turing-complete computation as a many-to-many cyclic causal graph with effector nodes that can alter graph structure and perform interop such as I/O. Logical operations are encoded in edges with coefficients that specify how presynaptic spikes contribute to postsynaptic integrators.
This architecture combines neural computation and language extension towards strong (general) natural language understanding. The hypothesis is that by implementing a fully runtime-extensible interactive interpreter within a simplified spiking neural network, we can leverage explicit programming, language extension, and reinforcement learning to extend the interpreted language towards natural language in a tractable way.
The choice of a spiking neural network over a more conventional ANN is a little atypical and warrants further discussion. The two primary features driving this choice are topological flexibility and discrete activation. In addition, targeted local tuning replaces broad optimization via gradient descent.
Nodes are the atoms of state and network structure. They are essentially neurons, with presynaptic and/or postsynaptic connections. Input nodes may be activated by the system, and action nodes can trigger built-in side effects.
For now, we shall refer to nodes that can be presynaptic as priors and nodes that can be postsynaptic as posteriors. These terms should not be construed to carry their Bayesian implications, and different terms may be chosen in the future.
Integrators represent the junctions between nodes, and are an analogue of neural synapses and dendritic/somatic summation. These integrators are a heavily simplified leaky integrate-and-fire model with linear combination and linear decay, available in long- and short-decay timing profiles.
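As a rough illustration of linear combination with linear decay, the following minimal sketch models a single integrator. The class name, the millisecond time base, the firing threshold, and the reset-to-zero behavior are all assumptions made for the example and are not taken from the project's code.

```java
/** Hypothetical sketch of a linear-decay integrate-and-fire unit (all names are illustrative). */
final class LinearIntegrator {
    private final double threshold;   // potential at which the unit fires (assumed)
    private final double decayPerMs;  // linear decay rate, in potential units per millisecond
    private double potential;         // current accumulated potential
    private long lastUpdateMs;        // time of the most recent update

    LinearIntegrator(double threshold, double decayPerMs) {
        this.threshold = threshold;
        this.decayPerMs = decayPerMs;
    }

    /** Moves the potential linearly toward zero for the time elapsed since the last update. */
    private void decayTo(long nowMs) {
        long elapsed = Math.max(0L, nowMs - lastUpdateMs);
        double magnitude = Math.max(0.0, Math.abs(potential) - decayPerMs * elapsed);
        potential = Math.signum(potential) * magnitude;
        lastUpdateMs = Math.max(lastUpdateMs, nowMs);
    }

    /** Adds one presynaptic spike scaled by its coefficient; returns true if the unit fires. */
    boolean onSpike(long nowMs, double coefficient) {
        decayTo(nowMs);
        potential += coefficient;
        if (potential >= threshold) {
            potential = 0.0; // assumed reset after firing
            return true;
        }
        return false;
    }

    /** Returns the decayed potential at the given time. */
    double potentialAt(long nowMs) {
        decayTo(nowMs);
        return potential;
    }
}
```

Under these assumptions, a long-decay and a short-decay timing profile would simply be two instances constructed with different `decayPerMs` values.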
At a high level, the SNN operates as a time-continuous mapping from states to actions. The line between states and actions is blurred, as nodes towards built-in actions are themselves atoms of state. The states thus considered can be described as abstract aggregates of recency-weighted node activations.
In biological neurons, several more complex mechanisms are at work; these are omitted from the prototype in the short term. In particular, the simplified model does not exhibit adaptation or vesicle depletion, and decay is linear.
The coefficients governing the linear combination of incoming nodes in synapses are modeled by distribution data structures.
The current implementation is unimodal, evidence-based, and deterministic. Each instance can be roughly characterized by an expected value (coefficient) and a weight (resilience), with additional state and logic to tune behavior when feedback is received that suggests the generated value should have been different.
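One way to picture an expected value paired with a resilience weight is an evidence-weighted running mean. The sketch below is an assumption about how such a structure could behave, not the project's implementation; the class and method names are hypothetical, and the "additional state and logic" mentioned above is left out.

```java
/** Hypothetical sketch of a unimodal, evidence-based, deterministic coefficient. */
final class EvidenceDistribution {
    private double mean;   // expected value, used as the synaptic coefficient
    private double weight; // accumulated evidence, standing in for "resilience"

    EvidenceDistribution(double mean, double weight) {
        if (weight < 0.0) {
            throw new IllegalArgumentException("weight must be non-negative");
        }
        this.mean = mean;
        this.weight = weight;
    }

    /** Deterministic: always yields the expected value, with no random perturbation. */
    double generate() {
        return mean;
    }

    double resilience() {
        return weight;
    }

    /**
     * Folds in feedback suggesting the value should have been {@code target}.
     * The more evidence already accumulated, the less a single update moves the mean.
     */
    void reinforce(double target, double evidence) {
        if (evidence <= 0.0) {
            return; // nothing to learn from
        }
        double total = weight + evidence;
        mean = (mean * weight + target * evidence) / total;
        weight = total;
    }
}
```

In this reading, a coefficient with high resilience shifts only slightly under a single piece of contrary feedback, while a fresh coefficient moves most of the way toward the target.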
Feedback to the distribution comes in the form of STDP, reward/penalty-based reinforcement, and capture operations.
An earlier implementation included random perturbation. However, system behavior at the time suggested that causal feedback in conjunction with race conditions was sufficient to guide convergence via reinforcement. The architecture has changed drastically since then, so it is unclear whether this will continue to be sufficient.
Clusters are collections of nodes that constrain overarching connection topologies and allow ensemble operations such as capture (association and disassociation), reinforcement, suppression, and state reset. Special cluster implementations may have additional behaviors.
Generalized capture is a combined association/disassociation operation, and is the primary mechanism for modifying graph structure. This operation captures a posterior activation state to be reproduced by activations in a set of prior clusters, using particular timing profiles. Posteriors are captured by coincidence and priors are captured by trace (activation recency in the context of the specified timing profiles).
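The following sketch shows one possible reading of capture on a small coefficient matrix. It assumes traces are recency values in [0, 1], that a posterior is "coincident" if it is flagged active at capture time, and that captured coefficients are set in proportion to prior trace. The normalization rule and all names are illustrative assumptions, not the project's algorithm.

```java
/** Hypothetical sketch of a combined association/disassociation ("capture") step. */
final class CaptureSketch {
    private CaptureSketch() {}

    /**
     * Returns updated coefficients for edges prior i -> posterior j.
     *
     * @param coefficients    current coefficients, indexed [prior][posterior]
     * @param priorTrace      activation recency of each prior, assumed to lie in [0, 1]
     * @param posteriorActive whether each posterior is active at capture time
     */
    static double[][] capture(double[][] coefficients, double[] priorTrace, boolean[] posteriorActive) {
        double traceSum = 0.0;
        for (double t : priorTrace) {
            traceSum += t;
        }
        double[][] result = new double[coefficients.length][];
        for (int i = 0; i < coefficients.length; i++) {
            result[i] = coefficients[i].clone();
            if (priorTrace[i] <= 0.0) {
                continue; // priors with no trace are not part of this capture
            }
            for (int j = 0; j < result[i].length; j++) {
                // Associate with active posteriors in proportion to trace;
                // disassociate from inactive posteriors.
                result[i][j] = posteriorActive[j] ? priorTrace[i] / traceSum : 0.0;
            }
        }
        return result;
    }
}
```

With this rule, the coefficients feeding an active posterior from the traced priors sum to one, and edges from those priors to inactive posteriors are cleared.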
Suppression is an inhibitory operation that momentarily inhibits edges between specified clusters. This is different from normal inhibition in that the degree of inhibition is effectively based on the excitation rather than learned inhibitory coefficients. This is useful for doing captures without side effects.
Integrators and traces in a cluster can be reset to prepare for a new computation without waiting for them to decay naturally.
This network needs to be bootstrapped with a parser for a minimal extensible language. To accomplish this, there are several mechanisms under development.
To facilitate interop with built-in logic, a special DataCluster contains posteriors that hold Java values. These values are published to an event stream when their node is activated.
Node values may be mutable or final. Mutable nodes additionally trigger a dedicated node when they are assigned to.
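The behavior described above can be sketched as a value-holding node with subscribers. This is an illustration only: `ValueNode`, its methods, the use of `Consumer` callbacks in place of a real event stream, and the `Runnable` standing in for the dedicated assignment node are all assumptions, not the project's DataCluster API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of a posterior that holds a Java value. */
final class ValueNode<T> {
    private final boolean mutable;
    private final Runnable onAssigned; // stand-in for the dedicated "assigned" node
    private final List<Consumer<T>> subscribers = new ArrayList<>();
    private T value;

    ValueNode(T value, boolean mutable, Runnable onAssigned) {
        this.value = value;
        this.mutable = mutable;
        this.onAssigned = onAssigned;
    }

    /** Registers a listener; a list of callbacks stands in for the event stream. */
    void subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
    }

    /** Activating the node publishes its current value to every subscriber. */
    void activate() {
        for (Consumer<T> subscriber : subscribers) {
            subscriber.accept(value);
        }
    }

    /** Assigns a new value; only allowed on mutable nodes, and triggers the assignment hook. */
    void set(T newValue) {
        if (!mutable) {
            throw new IllegalStateException("cannot assign to a final node");
        }
        value = newValue;
        onAssigned.run();
    }
}
```

In this sketch, assignment alone does not publish the value; the value reaches subscribers only when the node is next activated.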
The Binding Problem and its computational specialization of variable binding are cornerstones of general intelligence research. The current mechanism uses suppression and capture to conjunctively bind values to their context and naming priors. However, the naive way of doing this loses associativity, so work is in progress to make each property instance separately addressable.
Computationally, binding can be used for context, property bindings, dictionary storage, and stack frames.
Stack frames are a special application of binding that facilitates contextual computation. Stacks are implemented using frames linked by property bindings. Previously, we explored an edge scaling method, which worked well but was fragile and not as compatible with temporal summation probing as we had hoped.
Stack frames may additionally serve in other capacities as representations of a task, such as in parsing, planning, and episodic memory.
Short-term memory registers are special single-node clusters that can be easily bound to and unbound from posteriors. They play a role similar to registers in a classical architecture.
"Constructs" such as decoders are mechanisms that implement action node side effects.
Decoders are domain-specific built-in boosts that can be activated on data to decode it into more meaningful node ensembles. A simple example is a Boolean decoder, which activates a different node depending on whether a value is true. Other decoders in use are binary decoders and character class decoders. Decoders here play a similar role to encoders in ANNs, so the nomenclature may change.
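A Boolean decoder of the kind described can be sketched in a few lines. Here the two target nodes are represented by `Runnable` callbacks, which is an assumption made to keep the example self-contained; the class name is likewise hypothetical.

```java
/** Hypothetical sketch of a Boolean decoder: one of two nodes is activated per value. */
final class BooleanDecoder {
    private final Runnable trueNode;  // stand-in for the node activated on true
    private final Runnable falseNode; // stand-in for the node activated on false

    BooleanDecoder(Runnable trueNode, Runnable falseNode) {
        this.trueNode = trueNode;
        this.falseNode = falseNode;
    }

    /** Activates exactly one of the two nodes, depending on the decoded value. */
    void decode(boolean value) {
        if (value) {
            trueNode.run();
        } else {
            falseNode.run();
        }
    }
}
```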
Another particularly important construct is the CoincidentEffect, upon which several other side effects are built. This applies an action to any node within a target cluster that is active at the same time as the effector node.
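The idea of a coincidence-triggered effect can be sketched as follows. The example assumes that "active at the same time" means a node's last activation falls within a small window of the effector's activation, with non-negative millisecond timestamps; the window, the `Node` type, and the method names are illustrative and not drawn from the project's CoincidentEffect class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of an effect applied to nodes coincident with an effector. */
final class CoincidenceSketch {
    private CoincidenceSketch() {}

    /** Minimal node: a name and the time of its last activation (-1 if never activated). */
    static final class Node {
        final String name;
        long lastActivationMs = -1;

        Node(String name) {
            this.name = name;
        }
    }

    /**
     * Applies {@code action} to every node in the target cluster whose last activation
     * lies within {@code windowMs} of the effector's activation time.
     * Returns the nodes that were affected, in cluster order.
     */
    static List<Node> apply(long effectorTimeMs, List<Node> targetCluster,
                            long windowMs, Consumer<Node> action) {
        List<Node> affected = new ArrayList<>();
        for (Node node : targetCluster) {
            if (node.lastActivationMs < 0) {
                continue; // never activated, so it cannot be coincident
            }
            if (Math.abs(effectorTimeMs - node.lastActivationMs) <= windowMs) {
                action.accept(node);
                affected.add(node);
            }
        }
        return affected;
    }
}
```

Other side effects could then be expressed by passing a different `action`, which is one way to read the statement that several effects are built on this construct.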
The docs that follow are the unfiltered design dumps in use during development, from newest to oldest. The material in these documents is unlikely to be correct and should not be regarded as documentation.