TM Connections integration #537

Merged
breznak merged 44 commits into master from tm_conn on Jul 18, 2019
Conversation

@breznak (Member) commented Jul 3, 2019

TM:

  • use adaptSegment from Connections
  • remove lastUsedIterationForSegment (for LRU), add SegmentData.lastUsed
  • Connections cleanup

For #373

breznak added 15 commits July 1, 2019 22:17

  • use minPermanence as "zero"; also avoid a hard-coded crop for permanence, that is done in Connections
  • merge the common code in if-else branches
  • make the code the same as in Connections
  • call connections.adaptSegment instead of reimplementing the logic; this matches the signature in Connections::adaptSegment and avoids the need for conversions
  • segments with no synapses (can be removed when pruneZeroSynapses is set ON) are pruned too by calling `destroySegment`
  • can set a limit on max segments per cell; if reached, the least recently used segments will be pruned to make space. Used by TM.
  • this field is used by createSegment, which optionally (maxSegmentsPerCell > 0) removes excessive segments (least used first). Only used by TM.
@breznak added the ready, TM, and code labels Jul 3, 2019
@breznak self-assigned this Jul 3, 2019
@breznak (Member Author) left a comment:

Please review, help me discuss TODO items. Thank you 🙏

src/htm/algorithms/Connections.cpp (Outdated)
@@ -461,28 +459,32 @@ void Connections::adaptSegment(const Segment segment,
update = -decrement;
}

//prune permanences that reached zero
if (pruneZeroSynapses && synapseData.permanence + update < htm::minPermanence + htm::Epsilon) {
destroySynapse(synapse);
breznak (Member Author):

prune "dead" synapses that reached 0.

  • from TM
  • is synapse really dead when reached 0? -> yes, because cannot get any activation.
    • TODO this should be ON even for SP(?!)
  • how does this relate to New method Connections::synapseCompetiton #466 synapse competition? Can I use that instead here?
  • TODO if we remove synapses, ensure we do NOT go under stimulusThreshold See Smarter SP parameters #536 . If we do:
    • grow new synapse(s)?
    • destroy segment, and create new?
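To make the pruning rule concrete, here is a minimal self-contained sketch (simplified stand-in types, not the htm.core API; the Epsilon value is chosen only for illustration). It adapts each synapse, drops any synapse whose permanence would fall to zero, and reports when the segment ends up empty so the caller can run the destroySegment step one of the commits above describes.

#include <algorithm>
#include <vector>

using Permanence = float;
constexpr Permanence minPermanence = 0.0f;
constexpr Permanence maxPermanence = 1.0f;
constexpr Permanence Epsilon       = 0.00001f;  // illustration only

struct SynapseModel { Permanence permanence; bool presynapticWasActive; };
struct SegmentModel { std::vector<SynapseModel> synapses; };

// Returns true when the segment has no synapses left and should be destroyed
// by the caller.
bool adaptSegmentSketch(SegmentModel &seg, Permanence increment,
                        Permanence decrement, bool pruneZeroSynapses) {
  auto &syns = seg.synapses;
  for (auto it = syns.begin(); it != syns.end(); ) {
    const Permanence update = it->presynapticWasActive ? increment : -decrement;
    // prune "dead" synapses: at permanence 0 they can never contribute activation
    if (pruneZeroSynapses && it->permanence + update < minPermanence + Epsilon) {
      it = syns.erase(it);
    } else {
      it->permanence = std::min(maxPermanence,
                                std::max(minPermanence, it->permanence + update));
      ++it;
    }
  }
  return pruneZeroSynapses && syns.empty();
}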

breznak (Member Author):

@ctrl-z-9000-times how does #466 stand with respect to biological plausibility? I like how the competition helps keep synapses better balanced, but I don't see a natural mechanism for it. For the same reason I've started questioning raisePermanencesToThreshold: is it plausible, and how? Or should we replace these "raise to threshold" steps with synaptic death & growth of new synapses?

ctrl-z-9000-times (Collaborator):

It seems plausible to me that dendrites could have a desired number of synapses and that they could enforce this preference.

src/htm/algorithms/Connections.cpp
SynapseIdx numConnected;
CellIdx cell; //mother cell that this segment originates from
SynapseIdx numConnected; //number of permanences from `synapses` that are >= synPermConnected, ie connected synapses
UInt32 lastUsed = 0; //last used time (iteration). Used for segment pruning by "least recently used" (LRU) in `createSegment`
breznak (Member Author):

lastUsed moved here from TM's lastUsedIterationForSegment (see the sketch after this list):

  • reduced size from 64 bit to 32 bit (the algorithm is unlikely to exceed 4*10^9 iterations)
  • for TM the memory footprint is the same, see Memory footprint #534
  • for SP it is currently unused, so a slight increase (but only 4 MB RAM per 1M columns -> negligible)
    • there are plans for SP to do pruning as well, then it'd be used
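A self-contained sketch of how this field can drive the LRU pruning in createSegment when a cell is at maxSegmentsPerCell (simplified types, not the htm.core code). std::min_element keeps the first of equal lastUsed values, which gives the deterministic tie-break one of the later commits in this PR asks for.

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using UInt32 = std::uint32_t;

struct SegmentInfo {
  UInt32 lastUsed = 0;  // last iteration this segment was active (the LRU key)
};

struct CellInfo {
  std::vector<SegmentInfo> segments;
};

// Grow a segment on `cell` at `iteration`, evicting least recently used
// segments first when the optional limit (maxSegmentsPerCell > 0) is reached.
SegmentInfo &createSegmentSketch(CellInfo &cell, UInt32 iteration,
                                 std::size_t maxSegmentsPerCell) {
  // ">=" matters even for maxSegmentsPerCell == 1, otherwise such a cell
  // could end up holding 2 segments.
  while (maxSegmentsPerCell > 0 && cell.segments.size() >= maxSegmentsPerCell) {
    const auto leastUsed = std::min_element(
        cell.segments.begin(), cell.segments.end(),
        [](const SegmentInfo &a, const SegmentInfo &b) {
          return a.lastUsed < b.lastUsed;  // ties resolve to the earlier segment
        });
    cell.segments.erase(leastUsed);
  }
  SegmentInfo fresh;
  fresh.lastUsed = iteration;
  cell.segments.push_back(fresh);
  return cell.segments.back();
}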

src/htm/algorithms/TemporalMemory.cpp (Outdated)
for (const auto &segment : activeSegments_) {
lastUsedIterationForSegment_[segment] = iteration_;
for (auto &segment : activeSegments_) {
connections.dataForSegment(segment).lastUsed = iteration_; //TODO the destroySegments based on LRU is expensive. Better random? or "energy" based on sum permanences?
breznak (Member Author):

I don't know a better way, but the LRU-based removal feels expensive: the pruning happens only rarely, while lastUsed needs to be written on every activation. Is there a better approach (random, ...)?

breznak (Member Author):

@ctrl-z-9000-times what about cumulative permanence of a segment, as "energy", rather than LRU?

ctrl-z-9000-times (Collaborator):

That's not a bad idea, but I'd want to see some kind of evidence that it works, and that it works at least as well as the current solution.

Another idea is to never destroy segments. Instead use the method "removeMinPermSynapses" to free up room on an existing segment.
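To make the "cumulative permanence as energy" idea comparable against LRU, here is a self-contained sketch (nothing like this is in the PR; the types are simplified stand-ins): the pruning victim is the segment whose summed permanence is lowest. This needs no per-activation lastUsed writes, but each eviction costs a pass over the cell's synapses.

#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

using Permanence = float;

struct SegmentEnergyModel {
  std::vector<Permanence> permanences;  // one entry per synapse on the segment
};

// "Energy" = sum of synaptic permanences; reinforced segments accumulate it,
// stale segments decay toward zero.
Permanence segmentEnergy(const SegmentEnergyModel &seg) {
  return std::accumulate(seg.permanences.begin(), seg.permanences.end(),
                         Permanence{0});
}

// Pick the pruning victim: the lowest-energy segment instead of the least
// recently used one. Assumes the cell has at least one segment.
std::size_t pickSegmentToPrune(const std::vector<SegmentEnergyModel> &segments) {
  const auto weakest = std::min_element(
      segments.begin(), segments.end(),
      [](const SegmentEnergyModel &a, const SegmentEnergyModel &b) {
        return segmentEnergy(a) < segmentEnergy(b);
      });
  return static_cast<std::size_t>(weakest - segments.begin());
}

As noted above, this would need evidence that it performs at least as well as the current LRU solution before replacing it.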

@breznak (Member Author) left a comment:

Added stats for pruned synapses and segments, and some pretty-printed info in hotgym (same as in the MNIST example).

stream << " Synapses pruned (" << (Real) self.prunedSyns_ / self.numSynapses()
<< "%) Segments pruned (" << (Real) self.prunedSegs_ / self.numSegments() << "%)" << std::endl;
stream << " Buffer for destroyed synapses: " << self.destroyedSynapses_.size() << " \t buffer for destr. segments: "
<< self.destroyedSegments_.size() << std::endl;
breznak (Member Author):

can you help me make a "not completely artificial" test where synapses & segments are heavily pruned?
I'd like to:

  • validate the code is actually exercised (enough)
  • verify the destroyed* buffers don't grow too much, and cap them if needed

This is what I get on Hotgym -> not really used at all.

TM Connections:
Inputs (4653) ~> Outputs (16384) via Segments (32626)
Segments on Cell Min/Mean/Max 0 / 1.99133 / 40
Potential Synapses on Segment Min/Mean/Max 10 / 10.5655 / 49
Connected Synapses on Segment Min/Mean/Max 0 / 0.356985 / 34
Synapses Dead (0%) Saturated (0.0110586%)
Synapses pruned (0.0708685%) Segments pruned (0%)
Buffer for destroyed synapses: 0     buffer for destr. segments: 0

ctrl-z-9000-times (Collaborator):

You should be able to make it prune synapses by changing the data set. For example train it on sine waves and then move to saw tooth waves.

I think it's normal and good that it does not prune very many synapses / segments.

Segments should really only be pruned when the HTM reaches capacity.

breznak (Member Author):

> You should be able to make it prune synapses by changing the data set. For example train it on sine waves and then move to saw tooth waves.

It would be nice to have a test on "switching contexts", where we'd observe some more synapse pruning shortly after the change.

breznak added 3 commits July 5, 2019 08:15

  • check even if maxSegmentsPerCell == 1, otherwise we could create > 1 segments on such a cell
  • and hopefully more readable
  • deterministic by adding a rule for the case xx[a] == xx[b]: then return a < b
breznak added 2 commits July 10, 2019 19:38
@breznak (Member Author) left a comment:

Thank you @ctrl-z-9000-times for the explanation, I've added that to the block of code. I hope this can be ready now.

@@ -351,16 +291,16 @@ burstColumn(vector<CellIdx> &activeCells,
const CellIdx winnerCell =
(bestMatchingSegment != columnMatchingSegmentsEnd)
? connections.cellForSegment(*bestMatchingSegment)
: getLeastUsedCell(rng, column, connections, cellsPerColumn);
: getLeastUsedCell(rng, column, connections, cellsPerColumn); //TODO replace (with random?) this is extremely costly, removing makes TM 6x faster!
breznak (Member Author):

getLeastUsedCell is very important on Hotgym, but I guess that just means the model is still not tuned properly, as a trained TM should take the cellForSegment path more often.
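The TODO in the diff suggests a random choice; purely as an illustration of that alternative (not part of this PR), a self-contained sketch that trades getLeastUsedCell's scan over the column's segments for an O(1) draw:

#include <cstdint>
#include <random>

using CellIdx = std::uint32_t;

// Pick any cell of the bursting column uniformly at random instead of
// scanning for the least used one. Cheaper per burst, but it no longer
// spreads new segments evenly across the column's cells.
CellIdx randomCellForColumn(std::mt19937 &rng, CellIdx column,
                            CellIdx cellsPerColumn) {
  std::uniform_int_distribution<CellIdx> pick(0, cellsPerColumn - 1);
  return column * cellsPerColumn + pick(rng);
}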

@breznak (Member Author) left a comment:

This revision adds mostly cosmetic code cleanups & a lot of improved docs/comments.
@ctrl-z-9000-times please do one more review when you have time.

* SP & TM.
*/
//! const UInt32& iteration = iteration_; //FIXME cannot construct iteration like this?
UInt32 iteration() const { return iteration_; }
breznak (Member Author):

Do you know how I could write this to use the const ref?

breznak (Member Author):

@ctrl-z-9000-times do you know why the commented-out line above does not work?

ctrl-z-9000-times (Collaborator):

Not exactly, but C++ is really picky about the const & trick. It stops the compiler from auto-generating certain boilerplate methods. I think I've encountered stuff like this before, but I don't remember how I fixed it.
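For the record, a small self-contained example of what goes wrong with a reference data member (which is presumably why the iteration() getter won out): a copy silently keeps pointing into the object it was copied from, and the implicitly generated copy assignment is deleted.

#include <cassert>
#include <cstdint>

using UInt32 = std::uint32_t;

struct WithRefMember {
  UInt32 iteration_ = 0;
  const UInt32 &iteration = iteration_;  // the commented-out idea from the diff
};

struct WithGetter {
  UInt32 iteration_ = 0;
  UInt32 iteration() const { return iteration_; }  // what the PR uses instead
};

int main() {
  WithRefMember a;
  WithRefMember b = a;   // copy: b.iteration is bound to a.iteration_, not b.iteration_
  a.iteration_ = 42;
  assert(b.iteration == 42 && b.iteration_ == 0);  // surprising aliasing
  // b = a;              // error: copy assignment implicitly deleted (reference member)

  WithGetter g;
  WithGetter h = g;      // copies and assigns like a normal value type
  h = g;
  assert(h.iteration() == 0);
  return 0;
}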

@@ -539,7 +596,7 @@ class Connections : public Serializable
*/
size_t numCells() const { return cells_.size(); }

Permanence getConnectedThreshold() const { return connectedThreshold_; }
constexpr Permanence getConnectedThreshold() const { return connectedThreshold_; }
breznak (Member Author):

nit, let's try to use constexpr more where possible
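A tiny self-contained illustration (not htm.core) of what constexpr buys on a getter like this: it can be evaluated at compile time when the object itself is a constant expression, and otherwise it behaves like a plain inline function.

struct Thresholds {
  float connectedThreshold_;
  constexpr explicit Thresholds(float t) : connectedThreshold_(t) {}
  constexpr float getConnectedThreshold() const { return connectedThreshold_; }
};

int main() {
  constexpr Thresholds fixed(0.5f);      // compile-time object
  static_assert(fixed.getConnectedThreshold() == 0.5f,
                "evaluated at compile time");

  Thresholds runtime(0.2f);              // ordinary run-time object
  return runtime.getConnectedThreshold() > 0.0f ? 0 : 1;  // evaluated at run time
}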

// because otherwise a strong input would be sampled many times and grow many synapses.
// That would give such input a stronger connection.
// Synapses are supposed to have binary effects (0 or 1) but duplicate synapses give
// them (synapses 0/1) varying levels of strength.
for (const Synapse& synapse : connections.synapsesForSegment(segment)) {
breznak (Member Author):

improved doc
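The documented rule (at most one synapse per presynaptic cell, because a synapse's effect is meant to be binary) sketched as self-contained code; this is an illustration, not the library implementation:

#include <cstdint>
#include <unordered_set>
#include <vector>

using CellIdx = std::uint32_t;

struct SynapseEntry { CellIdx presynapticCell; float permanence; };
struct SegmentEntry { std::vector<SynapseEntry> synapses; };

// Grow synapses from `segment` to sampled `candidates`, but never two synapses
// to the same presynaptic cell: duplicates would give a strong input a
// disproportionately strong (non-binary) connection.
void growSynapsesSketch(SegmentEntry &segment,
                        const std::vector<CellIdx> &candidates,
                        float initialPermanence) {
  std::unordered_set<CellIdx> alreadyConnected;
  for (const SynapseEntry &syn : segment.synapses) {
    alreadyConnected.insert(syn.presynapticCell);
  }
  for (const CellIdx cell : candidates) {
    if (alreadyConnected.insert(cell).second) {  // .second is true only for new cells
      segment.synapses.push_back(SynapseEntry{cell, initialPermanence});
    }
  }
}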

@breznak mentioned this pull request Jul 12, 2019

// Extra bookkeeping for faster computing of segment activity.
std::unordered_map<CellIdx, std::vector<Synapse>> potentialSynapsesForPresynapticCell_;
std::unordered_map<CellIdx, std::vector<Synapse>> connectedSynapsesForPresynapticCell_;
std::map<CellIdx, std::vector<Segment>> potentialSegmentsForPresynapticCell_;
std::map<CellIdx, std::vector<Segment>> connectedSegmentsForPresynapticCell_;

std::vector<Segment> segmentOrdinals_;
std::vector<Synapse> synapseOrdinals_;
breznak (Member Author):

removed the ordinals vectors; this is now stored in Segment/SynapseData with no significant overhead, so the code is just clearer.
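A small sketch of the change this describes (the field name id is illustrative; the real member may differ): the creation ordinal lives inside the per-segment data instead of a parallel segmentOrdinals_ vector, so deterministic ordering needs no external bookkeeping.

#include <algorithm>
#include <cstdint>
#include <vector>

// Before: std::vector<Segment> segmentOrdinals_;  // ordinal kept beside the data
// After:  the ordinal is a field of the per-segment record itself.
struct SegmentRecord {
  std::uint32_t id;        // creation order, used for deterministic ordering
  std::uint32_t lastUsed;  // see the LRU discussion earlier in this PR
};

// Deterministic ordering by creation time, no parallel vector needed.
void sortByCreationOrder(std::vector<SegmentRecord> &segments) {
  std::sort(segments.begin(), segments.end(),
            [](const SegmentRecord &a, const SegmentRecord &b) {
              return a.id < b.id;
            });
}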

@ctrl-z-9000-times (Collaborator) left a comment:

This looks ok to me, assuming you can fix whatever is making the unit tests fail.

Off topic: I think that the HTM would work just as well if we never allowed destroying segments (untested). Instead of destroying a segment, just remove all of its synapses. The use-case for destroying segments is when the TM is full and you want to add a new thing to it; in that situation the destroyed segment will immediately be recreated with different synapses. This would allow us to get rid of the segmentOrdinals and segment.id.

breznak added 2 commits July 18, 2019 15:17

  • use find to check the segment exists on the cell
  • the issue was in comparing a const iterator vs segments.end()
@breznak (Member Author) left a comment:

@ctrl-z-9000-times this should fix the test caught by the Debug CI.

});

NTA_ASSERT(segmentOnCell != cellData.segments.end());
const auto segmentOnCell = std::find(cellData.segments.cbegin(), cellData.segments.cend(), segment);
breznak (Member Author):

the error was dumb: comparing the const iterator above with .segments.end() here.
Replaced it with a simpler find that does not require extra sorting.

@ctrl-z-9000-times (Collaborator) left a comment:

find is simpler and slower, but this method should not be called very often and I hope it eventually goes away altogether, so I'll approve it.

@breznak merged commit 6475fba into master Jul 18, 2019
@breznak deleted the tm_conn branch July 18, 2019 21:44
@breznak (Member Author) commented Jul 18, 2019

Thank you for reviewing, David!
Yes, I hope #574 will follow

Labels: code, ready, TM
Projects: None yet
2 participants