TM Connections integration #537
Conversation
Use minPermanence as "zero"; also avoid the hard-coded crop for permanence, which is done in Connections.
Merge the common code in the if-else branches.
Was 0.0f, now minPermanence.
Make the code the same as in Connections.
Call connections.adaptSegment instead of reimplementing the logic.
This is functionality used in the TM.
This matches the signature in Connections::adaptSegment and avoids the need for conversions.
Segments with no synapses (these can occur when pruneZeroSynapses is ON) are pruned too, by calling `destroySegment`.
A limit on the max segments per cell can be set; if reached, the least recently used segments will be pruned to make space. Used by the TM.
This field is used by createSegment, which optionally (maxSegmentsPerCell > 0) removes excess segments (least used first). Only used by the TM; a simplified sketch follows.
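For illustration, a self-contained sketch of that LRU pruning (hypothetical simplified types, not the actual Connections code):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Seg { uint32_t lastUsed; };  // stand-in for SegmentData

// Destroy least recently used segments until there is room for one more.
void makeRoomOnCell(std::vector<Seg> &segmentsOnCell, const std::size_t maxSegmentsPerCell) {
  while (maxSegmentsPerCell > 0 /* 0 disables the limit */ &&
         segmentsOnCell.size() >= maxSegmentsPerCell) {
    const auto lru = std::min_element(
        segmentsOnCell.begin(), segmentsOnCell.end(),
        [](const Seg &a, const Seg &b) { return a.lastUsed < b.lastUsed; });
    segmentsOnCell.erase(lru);  // corresponds to destroySegment on the LRU segment
  }
}
```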
Please review, and help me discuss the TODO items. Thank you 🙏
```diff
@@ -461,28 +459,32 @@ void Connections::adaptSegment(const Segment segment,
     update = -decrement;
   }
 
+  //prune permanences that reached zero
+  if (pruneZeroSynapses && synapseData.permanence + update < htm::minPermanence + htm::Epsilon) {
+    destroySynapse(synapse);
```
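As an aside, a minimal standalone sketch (values assumed, not from the PR) of why the test is guarded with htm::Epsilon instead of comparing strictly against minPermanence:

```cpp
#include <cassert>

int main() {
  const float minPermanence = 0.0f;
  const float Epsilon = 0.0001f;  // stand-in for htm::Epsilon; actual value assumed
  const float permanence = 0.3f;
  const float update = -0.1f - 0.1f - 0.1f;  // three decrements; float rounding applies

  // A strict '<' against minPermanence misses a synapse that lands exactly on
  // (or a hair above) zero, so it would never be pruned:
  // (permanence + update < minPermanence) is false here.
  assert(permanence + update < minPermanence + Epsilon);  // the guarded test catches it
  return 0;
}
```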
prune "dead" synapses that reached 0.
- from TM
- is synapse really dead when reached 0? -> yes, because cannot get any activation.
- TODO this should be ON even for SP(?!)
- how does this relate to New method Connections::synapseCompetiton #466 synapse competition? Can I use that instead here?
- TODO if we remove synapses, ensure we do NOT go under
stimulusThreshold
See Smarter SP parameters #536 . If we do:- grow new synapse(s)?
- destroy segment, and create new?
@ctrl-z-9000-times how does #466 fare with respect to biological plausibility? I like how the competition helps keep synapses better balanced, but I don't see a natural mechanism for it. For this reason I started questioning raisePermanencesToThreshold: is it plausible, and if so, how? Or should we replace these "raise to threshold" steps with synaptic death & growth of new synapses?
It seems plausible to me that dendrites could have a desired number of synapses and that they could enforce this preference.
```cpp
CellIdx cell;            //mother cell that this segment originates from
SynapseIdx numConnected; //number of permanences from `synapses` that are >= synPermConnected, ie connected synapses
UInt32 lastUsed = 0;     //last used time (iteration). Used for segment pruning by "least recently used" (LRU) in `createSegment`
```
lastUsed moved here from the TM's lastUsedIterationForSegment:
- reduced size from 64 bits to 32 (the algorithm is unlikely to exceed 4*10^9 iterations)
- for TM, the same memory footprint; see "Memory footprint" #534
- for SP, currently unused, so a slight increase (but only 4MB RAM per 1M columns -> negligible)
- there are plans for SP to do pruning as well; then it'd be used
```diff
-  for (const auto &segment : activeSegments_) {
-    lastUsedIterationForSegment_[segment] = iteration_;
+  for (auto &segment : activeSegments_) {
+    connections.dataForSegment(segment).lastUsed = iteration_; //TODO the destroySegments based on LRU is expensive. Better random? or "energy" based on sum permanences?
```
I don't know a better way, but it feels like the LRU-based removal is expensive: the pruning happens only rarely, while lastUsed needs to be written on every activation. Is there a better way (random, ...)?
@ctrl-z-9000-times what about using the cumulative permanence of a segment, as its "energy", rather than LRU?
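For concreteness, a sketch of what that "energy" criterion could look like (not in the PR; plain containers instead of Connections internals):

```cpp
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

using Permanence = float;

// Return the index of the segment with the lowest cumulative permanence ("energy").
std::size_t weakestSegment(const std::vector<std::vector<Permanence>> &permsPerSegment) {
  std::size_t weakest = 0;
  Permanence minEnergy = std::numeric_limits<Permanence>::max();
  for (std::size_t i = 0; i < permsPerSegment.size(); ++i) {
    const Permanence energy = std::accumulate(permsPerSegment[i].cbegin(),
                                              permsPerSegment[i].cend(), 0.0f);
    if (energy < minEnergy) { minEnergy = energy; weakest = i; }
  }
  return weakest;
}
```

Unlike LRU, this needs no bookkeeping write on every activation, but it has to scan all permanences at pruning time.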
That's not a bad idea, but I'd want to see some kind of evidence that it works, and that it works at least as well as the current solution.
Another idea is to never destroy segments. Instead, use the method "removeMinPermSynapses" to free up room on an existing segment.
...and their reuse in the destroyed buffers. Maybe the buffers are not useful?
Added stats for pruned synapses and segments, and some pretty info in hotgym (same as in the MNIST example).
stream << " Synapses pruned (" << (Real) self.prunedSyns_ / self.numSynapses() | ||
<< "%) Segments pruned (" << (Real) self.prunedSegs_ / self.numSegments() << "%)" << std::endl; | ||
stream << " Buffer for destroyed synapses: " << self.destroyedSynapses_.size() << " \t buffer for destr. segments: " | ||
<< self.destroyedSegments_.size() << std::endl; |
Can you help me make a "not completely artificial" test where synapses & segments are heavily pruned? I'd like to:
- validate the code is actually (sufficiently) used
- verify the destroyed* buffers don't grow too much, and cap them if needed
This is what I get on Hotgym -> not really used at all:
```
TM Connections:
    Inputs (4653) ~> Outputs (16384) via Segments (32626)
    Segments on Cell Min/Mean/Max 0 / 1.99133 / 40
    Potential Synapses on Segment Min/Mean/Max 10 / 10.5655 / 49
    Connected Synapses on Segment Min/Mean/Max 0 / 0.356985 / 34
    Synapses Dead (0%) Saturated (0.0110586%)
    Synapses pruned (0.0708685%) Segments pruned (0%)
    Buffer for destroyed synapses: 0    buffer for destr. segments: 0
```
You should be able to make it prune synapses by changing the data set. For example, train it on sine waves and then move to sawtooth waves.
I think it's normal and good that it does not prune very many synapses / segments. Segments should really only be pruned when the HTM reaches capacity.
> You should be able to make it prune synapses by changing the data set. For example, train it on sine waves and then move to sawtooth waves.

It would be nice to have a test on "switching contexts", where we observe some more synapse pruning shortly after the change; see the toy sketch below.
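A library-free toy of that "switching contexts" idea (parameters assumed, not htm.core code): synapses reinforced by context A starve once the input switches, and are pruned when they hit zero, mirroring pruneZeroSynapses:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
  const float inc = 0.1f, dec = 0.05f, maxPerm = 1.0f, eps = 1e-6f;
  std::vector<float> perms(4, 0.5f);  // four synapses tuned to context A, pre-trained
  std::size_t pruned = 0;

  auto step = [&](const bool contextA) {
    for (float &p : perms) {
      if (p < 0.0f) continue;                    // already pruned (marked -1)
      p = contextA ? std::min(maxPerm, p + inc)  // reinforced while A is present
                   : p - dec;                    // starves once the context switches
      if (p < eps) { p = -1.0f; ++pruned; }      // prune at "zero", as pruneZeroSynapses does
    }
  };

  for (int t = 0; t < 20; ++t) step(true);            // train on context A
  std::printf("pruned while in A: %zu\n", pruned);    // 0
  for (int t = 0; t < 25; ++t) step(false);           // switch to context B
  std::printf("pruned after switch: %zu\n", pruned);  // all 4, shortly after the change
  return 0;
}
```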
Check even if maxSegmentsPerCell == 1; otherwise we could create > 1 segment on such a cell.
And hopefully more readable. Made deterministic by adding a rule for the case xx[a] == xx[b]: then return a < b (see the sketch below).
That is, if the number of synapses left < connectedThreshold_.
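A sketch of that tie-break rule on a plain vector (a hypothetical stand-in for the actual code):

```cpp
#include <cstddef>
#include <vector>

// Deterministic arg-min: on ties (xx[a] == xx[b]) the smaller index wins,
// because only strictly smaller values replace the current best.
std::size_t argMin(const std::vector<float> &xx) {
  std::size_t best = 0;
  for (std::size_t i = 1; i < xx.size(); ++i) {
    if (xx[i] < xx[best]) best = i;  // '==' keeps the earlier index, i.e. a < b
  }
  return best;
}
```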
Thank you @ctrl-z-9000-times for the explanation; I've added that to the block of code. Hope this can be ready now.
```diff
@@ -351,16 +291,16 @@ burstColumn(vector<CellIdx> &activeCells,
   const CellIdx winnerCell =
       (bestMatchingSegment != columnMatchingSegmentsEnd)
           ? connections.cellForSegment(*bestMatchingSegment)
-          : getLeastUsedCell(rng, column, connections, cellsPerColumn);
+          : getLeastUsedCell(rng, column, connections, cellsPerColumn); //TODO replace (with random?) this is extremely costly, removing makes TM 6x faster!
```
getLeastUsedCell is very important on Hotgym, but I guess that just means the model is still not tuned properly, as the TM should take the cellForSegment path more often once it has learned.
This revision adds mostly cosmetic code cleanups & lots of improved docs/comments.
@ctrl-z-9000-times please do one more review when you have time.
* SP & TM. | ||
*/ | ||
//! const UInt32& iteration = iteration_; //FIXME cannot construct iteration like this? | ||
UInt32 iteration() const { return iteration_; } |
Do you know how I could write this to use the const ref?
@ctrl-z-9000-times do you know why the commented-out line above does not work?
Not exactly, but C++ is really picky about the const & trick. It stops the compiler from auto-generating certain boilerplate methods. I think I've encountered stuff like this before, but I don't remember how I fixed it.
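Not from the thread, but a plausible minimal reproduction (a sketch): a reference data member compiles, yet it deletes the implicit copy assignment, and the implicit copy constructor makes the copy alias the source object:

```cpp
#include <cstdint>

struct A {
  uint32_t iteration_ = 0;
  const uint32_t &iteration = iteration_;  // compiles, but...
};

int main() {
  A a;
  // A b; b = a;  // error: copy assignment is implicitly deleted by the reference member
  A c = a;        // copies the reference itself: c.iteration aliases a.iteration_!
  a.iteration_ = 7;
  return c.iteration;  // 7, even though c.iteration_ is still 0
}
```

The accessor form `UInt32 iteration() const` sidesteps both problems, which may be why it works where the commented-out line does not.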
src/htm/algorithms/Connections.hpp
```diff
@@ -539,7 +596,7 @@ class Connections : public Serializable
  */
 size_t numCells() const { return cells_.size(); }
 
-Permanence getConnectedThreshold() const { return connectedThreshold_; }
+constexpr Permanence getConnectedThreshold() const { return connectedThreshold_; }
```
Nit: let's try to use constexpr more where possible.
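A small sketch of what constexpr buys (and doesn't) on such a getter, assuming C++14: it enables compile-time evaluation only when the object itself is a constant expression; on a runtime object it behaves like an ordinary call:

```cpp
struct S {
  int v = 3;
  constexpr int get() const { return v; }
};

int main() {
  constexpr S s{};                  // constant-expression object...
  static_assert(s.get() == 3, ""); // ...so get() can run at compile time
  S r; r.v = 5;                     // runtime object: get() is an ordinary call
  return r.get();
}
```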
```cpp
// because otherwise a strong input would be sampled many times and grow many synapses.
// That would give such input a stronger connection.
// Synapses are supposed to have binary effects (0 or 1) but duplicate synapses give
// them (synapses 0/1) varying levels of strength.
for (const Synapse& synapse : connections.synapsesForSegment(segment)) {
```
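To illustrate the comment above, a self-contained sketch (a hypothetical helper, not the PR's actual growth code) of skipping candidates that already synapse onto the segment, so every input keeps at most one binary synapse:

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

using CellIdx = uint32_t;

// Keep at most one synapse per presynaptic cell on a segment, so each input
// contributes a single binary (0/1) effect rather than a stacked "weight".
std::vector<CellIdx> filterGrowthCandidates(const std::vector<CellIdx> &presynapticOnSegment,
                                            const std::vector<CellIdx> &candidates) {
  const std::unordered_set<CellIdx> existing(presynapticOnSegment.begin(),
                                             presynapticOnSegment.end());
  std::vector<CellIdx> keep;
  for (const CellIdx c : candidates) {
    if (existing.count(c) == 0) {  // skip cells that already synapse onto this segment
      keep.push_back(c);
    }
  }
  return keep;
}
```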
Improved doc.
```diff
   // Extra bookkeeping for faster computing of segment activity.
   std::unordered_map<CellIdx, std::vector<Synapse>> potentialSynapsesForPresynapticCell_;
   std::unordered_map<CellIdx, std::vector<Synapse>> connectedSynapsesForPresynapticCell_;
   std::map<CellIdx, std::vector<Segment>> potentialSegmentsForPresynapticCell_;
   std::map<CellIdx, std::vector<Segment>> connectedSegmentsForPresynapticCell_;
 
-  std::vector<Segment> segmentOrdinals_;
-  std::vector<Synapse> synapseOrdinals_;
```
Removed the ordinals vectors; this is now stored in SegmentData/SynapseData with no significant overhead, so the code is just clearer.
This looks OK to me, assuming you can fix whatever is making the unit tests fail.
Off topic: I think that the HTM would work just as well if we never allowed destroying segments (untested). Instead of destroying segments, just remove all of their synapses. The use-case for destroying segments is when the TM is full and you want to add a new thing to it; in this situation, the destroyed segment will immediately be recreated with different synapses. This would allow us to get rid of the segmentOrdinals and segment.id.
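A sketch of that idea, reusing only calls that already appear in this PR (synapsesForSegment, destroySynapse); treat it as an untested illustration, not the actual implementation:

```cpp
#include <vector>
#include <htm/algorithms/Connections.hpp>

// Recycle a segment by clearing its synapses instead of destroying the segment,
// so the segment slot stays allocated and can be regrown in place later.
void recycleSegment(htm::Connections &connections, const htm::Segment segment) {
  // Copy the list first: destroySynapse mutates the container we'd be iterating.
  const std::vector<htm::Synapse> synapses = connections.synapsesForSegment(segment);
  for (const auto synapse : synapses) {
    connections.destroySynapse(synapse);
  }
}
```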
Use find to check that the segment exists on the cell. The issue was in comparing a const iterator vs segments.end().
@ctrl-z-9000-times this should fix the test failure caught by the Debug CI.
```diff
-  });
+  const auto segmentOnCell = std::find(cellData.segments.cbegin(), cellData.segments.cend(), segment);
 
   NTA_ASSERT(segmentOnCell != cellData.segments.end());
```
The error was dumb: comparing the const iterator above with .segments.end() here. Replaced with a simpler find that does not require extra sorting.
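The trade-off in one self-contained snippet: std::lower_bound is only valid on a sorted range, while std::find works on any range at linear cost:

```cpp
#include <algorithm>
#include <vector>

int main() {
  std::vector<int> segs = {7, 3, 5};  // unsorted, as segments on a cell may be
  // OK: std::find only needs equality and scans linearly.
  const auto it = std::find(segs.cbegin(), segs.cend(), 5);
  // NOT OK here: std::lower_bound requires a sorted range, so using it would
  // have demanded the extra sorting mentioned above.
  return it != segs.cend() ? 0 : 1;
}
```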
find is simpler and slower, but this method should not be called very often, and I hope it eventually goes away altogether, so I'll approve it.
Thank you for reviewing, David!
TM: adaptSegment from Connections. For #373.