Fix Connections segment pruning #601

Merged 41 commits on Jun 4, 2020

@breznak breznak commented Aug 2, 2019

  • fixes segment pruning in Connections
    • properly used in SP, TM
  • SP with enabled synapse pruning is not platform-independent!
  • bugfixes for dataForSynapse(), dataForSegment() + tests
  • fix Connections de/serialization
  • fix TM maxSegmentsPerCell check

@breznak breznak left a comment

Please review when you find time.
Question to consider: make an equivalent of stimulusThreshold_ in Connections? (currently passed as arg to the methods)

src/htm/algorithms/Connections.cpp
@@ -433,7 +429,8 @@ void Connections::adaptSegment(const Segment segment,
                                const SDR &inputs,
                                const Permanence increment,
                                const Permanence decrement,
-                               const bool pruneZeroSynapses)
+                               const bool pruneZeroSynapses,
+                               const UInt segmentThreshold)

These two options could be merged, but I prefer to keep them separate: the first gates synapse pruning, the second gates segment pruning.

if(!pruneZeroSynapses) {
NTA_ASSERT(segmentThreshold == 0) << "Setting segmentThreshold only makes sense when pruneZeroSynapses is allowed.";
}
if(pruneZeroSynapses and synapses.size() < segmentThreshold) {

This is the bug I meant: connectedThreshold_ was used incorrectly here.
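
The two gates discussed above can be illustrated with a minimal, self-contained sketch. It is not the htm.core implementation: `ToySynapse`, `ToySegment` and `adaptSegmentSketch` are hypothetical stand-ins, and the sketch assumes a segment is dropped as soon as fewer than `segmentThreshold` synapses survive synapse pruning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the htm.core types.
struct ToySynapse {
  std::size_t presynapticCell;
  float permanence;
};

struct ToySegment {
  std::vector<ToySynapse> synapses;
  bool destroyed = false;
};

// Reinforce synapses with an active presynaptic cell, weaken the others,
// then apply the two pruning gates. `activeCells` must be sorted.
void adaptSegmentSketch(ToySegment &seg,
                        const std::vector<std::size_t> &activeCells,
                        float increment, float decrement,
                        bool pruneZeroSynapses,
                        std::size_t segmentThreshold = 0) {
  if (!pruneZeroSynapses) {
    // Mirrors the assertion in the diff: the threshold is meaningless
    // unless synapse pruning is enabled.
    assert(segmentThreshold == 0);
  }
  for (ToySynapse &syn : seg.synapses) {
    const bool active = std::binary_search(activeCells.begin(),
                                           activeCells.end(),
                                           syn.presynapticCell);
    const float updated = syn.permanence + (active ? increment : -decrement);
    syn.permanence = std::min(1.0f, std::max(0.0f, updated));
  }
  if (pruneZeroSynapses) {
    // Gate 1: remove synapses whose permanence has decayed to zero.
    seg.synapses.erase(
        std::remove_if(seg.synapses.begin(), seg.synapses.end(),
                       [](const ToySynapse &s) { return s.permanence <= 0.0f; }),
        seg.synapses.end());
    // Gate 2: drop the whole segment if too few synapses remain.
    if (seg.synapses.size() < segmentThreshold) {
      seg.synapses.clear();
      seg.destroyed = true;
    }
  }
}
```

With `segmentThreshold` left at 0 the second gate never fires, so callers that only want synapse pruning keep the previous behaviour.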

src/htm/algorithms/SpatialPooler.cpp
src/htm/algorithms/TemporalMemory.hpp

breznak commented Aug 2, 2019

A change I introduced, probably the find with relaxed ordering, produces non-deterministic results. Needs fixing.

@breznak breznak mentioned this pull request Aug 3, 2019
1 task
@breznak breznak removed the ready label Aug 9, 2019
breznak added 9 commits August 9, 2019 17:26
and only keep a count of the segments that were destroyed.
Pros:
 - simpler code
 - fixed determinism (for serialized connections with pruning enabled)
 - possibly better cache utilization, as vector<SegmentData> is now contiguous
Cons:
 - larger memory utilization, as the segments_ and segmentData_ vectors now
have holes in them
lower_bound failed with "does not partition";
find() is fine, and the code is simpler.
as a premature optimization, keep only a counter of the number destroyed.
- cleaner code
- faster
- but slightly wastes memory (though we only destroy 0.3% of synapses on
MNIST)
as pruning does not seem to be platform-independent.
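
The bookkeeping described in the commit notes above (destroyed entries stay in the vector as holes, and only a counter of destroyed entries is kept) might look roughly like the following sketch. `SlotSketch` and `TombstoneStore` are hypothetical names, and the assumption that indices of destroyed entries are never reused is mine, inferred from the "holes" remark.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical slot type; the real SegmentData holds synapses, cell, etc.
struct SlotSketch {
  bool destroyed = false;
  int payload = 0;
};

class TombstoneStore {
public:
  // Always appends; indices of destroyed slots are not recycled (assumption).
  std::size_t create(int payload) {
    SlotSketch slot;
    slot.payload = payload;
    slots_.push_back(slot);
    return slots_.size() - 1;
  }

  // Marks the slot as a hole and bumps the counter; nothing is erased,
  // so every other index stays valid.
  void destroy(std::size_t id) {
    assert(id < slots_.size() && !slots_[id].destroyed);
    slots_[id].destroyed = true;
    ++numDestroyed_;
  }

  // Live entries = everything ever created minus the holes.
  std::size_t numLive() const { return slots_.size() - numDestroyed_; }

  // Memory actually occupied, holes included (the "Cons" item above).
  std::size_t slotsUsed() const { return slots_.size(); }

private:
  std::vector<SlotSketch> slots_;
  std::size_t numDestroyed_ = 0;
};
```

The trade-off matches the pros and cons listed in the commit message: the storage stays contiguous and the code stays simple, at the cost of never reclaiming the holes.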
@breznak breznak added the bug Something isn't working label Jun 3, 2020
breznak added 2 commits June 3, 2020 13:10
the debug code was called after (!) the segment had already been removed
by setting the segmentThreshold param. This should be ON, but there
are some issues with platform-independent, reproducible results
@breznak breznak force-pushed the fix_conn_stimulusThreshold branch from 3e4ae13 to 46e4ca5 Compare June 3, 2020 11:55
breznak added 14 commits June 3, 2020 14:23
for platform-independent build?
after a synapse is destroyed by destroySynapse(syn), its data
was still accessible, which could lead to incorrect operations.

Add check to dataForSynapse() that fails if synapse has been removed.

Detail: synapseExists_() is available, improve its performance to be
widely usable (by dataForSynapse()).

Tests for create/destroySynapse
if the new permanence is higher than the permanence of the old (same)
synapse, update to the higher value.
that only test the functionality of "main" Connections,
which are already tested.
Done due to changes in the main tests, which would have to be rewritten
here as well.

The "advanced" tests should cover all the special functionality not part
of the main.

@breznak breznak left a comment

@dkeeney could you please have a look (even if only at the high-level changes)?
This is an old PR that accumulated many fixes that were quite hard to debug. All tests pass, so I assume it's safe to merge.

@@ -163,6 +165,66 @@ def testAdaptShouldDecrementSynapses(self):
self.assertEqual(presynamptic_cells, presynaptic_input_set, "Missing synapses")



def testCreateSynapse(self):

added new tests

@@ -38,207 +38,6 @@ def _getPresynapticCells(self, connections, segment, threshold):
return set([connections.presynapticCellForSynapse(synapse) for synapse in connections.synapsesForSegment(segment)
if connections.permanenceForSynapse(synapse) >= threshold])

def testAdaptShouldNotRemoveSegments(self):
"""
Test that connections are generated on predefined segments.

FYI @fcr I'm removing the duplicate tests in advanced_connections.py for methods that are not changed by this Connections implementation and are already tested in the "main" tests. I made changes to those tests and would otherwise have to port them here as well.

//1. just keep the older (former default)
//2. throw an error (ideally, user should not createSynapse() but rather updateSynapsePermanence())
//3. create a duplicit new synapse -- NO. This is the only choice that is incorrect! HTM works on binary synapses, duplicates would break that.
//4. update to the max of the permanences (default)

Documented and changed the default strategy for createSynapse() when such a synapse already exists. All four alternatives are discussed in the code comment above:

  • former default: leave the existing synapse as is;
  • now: update the permanence if the new permanence is larger.

...the impact is not really visible in our unit tests, as none of them trigger this "duplicate synapse created" branch.
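
A minimal sketch of the new default (option 4, "update to the max of the permanences") is given below. `SynSketch` and `createSynapseSketch` are hypothetical and model a segment as a plain vector; the real createSynapse() signature and data layout differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a synapse on one segment.
struct SynSketch {
  std::size_t presynapticCell;
  float permanence;
};

// Returns the index of the synapse to `cell` on this segment.
// If one already exists, no duplicate is created; the stored permanence
// is raised to the larger of the old and the requested value.
std::size_t createSynapseSketch(std::vector<SynSketch> &segment,
                                std::size_t cell, float permanence) {
  assert(permanence >= 0.0f && permanence <= 1.0f);
  for (std::size_t i = 0; i < segment.size(); ++i) {
    if (segment[i].presynapticCell == cell) {
      segment[i].permanence = std::max(segment[i].permanence, permanence);
      return i;
    }
  }
  segment.push_back(SynSketch{cell, permanence});
  return segment.size() - 1;
}
```

Calling the function twice for the same presynaptic cell therefore leaves exactly one synapse on the segment, which keeps the synapses binary as the code comment requires.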


} else {
//quick method. Relies on hack in destroySynapse() where we set synapseData.permanence == -1
return synapses_[synapse].permanence != -1;

The code above (the other branch) is quite slow, but the check is needed in the heavily used dataForSynapse(), so I created this "fast" variant/hack.
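
The difference between the slow check and the sentinel-based fast check can be sketched as follows. This is an illustration only: `ExistsSketch` and its members are hypothetical, and the meaning of the boolean flag (true selects the fast path) is an assumption based on the `synapseExists_(synapse, true)` call in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical, reduced synapse record.
struct SynData {
  std::size_t segment;
  float permanence;
};

class ExistsSketch {
public:
  std::size_t createSynapse(std::size_t segment, float permanence) {
    assert(permanence >= 0.0f);  // -1 is reserved as the "destroyed" marker
    if (segment >= onSegment_.size()) onSegment_.resize(segment + 1);
    synapses_.push_back(SynData{segment, permanence});
    onSegment_[segment].push_back(synapses_.size() - 1);
    return synapses_.size() - 1;
  }

  void destroySynapse(std::size_t id) {
    assert(synapseExists(id, true));
    std::vector<std::size_t> &ids = onSegment_[synapses_[id].segment];
    ids.erase(std::find(ids.begin(), ids.end(), id));
    synapses_[id].permanence = -1.0f;  // the sentinel the fast check relies on
  }

  bool synapseExists(std::size_t id, bool fast) const {
    if (id >= synapses_.size()) return false;
    if (fast) return synapses_[id].permanence != -1.0f;  // O(1)
    // Slow path: linear search through the owning segment's synapse list.
    const std::vector<std::size_t> &ids = onSegment_[synapses_[id].segment];
    return std::find(ids.begin(), ids.end(), id) != ids.end();
  }

  // Checked accessor: refuses to hand out data of a destroyed synapse.
  const SynData &dataForSynapse(std::size_t id) const {
    if (!synapseExists(id, true)) throw std::out_of_range("synapse removed");
    return synapses_[id];
  }

private:
  std::vector<SynData> synapses_;
  std::vector<std::vector<std::size_t>> onSegment_;
};
```

Both paths should agree as long as destroySynapse() is the only way a synapse is removed; the fast path simply trades the search for one extra write at destruction time.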

segmentData.synapses.cend(),
synapse,
[&](const Synapse a, const Synapse b) -> bool { return dataForSynapse(a).id < dataForSynapse(b).id;}
);

just whitespace formatting

@@ -374,7 +378,8 @@ class Connections : public Serializable
    *
    * @retval Synapse data.
    */
-  const SynapseData &dataForSynapse(const Synapse synapse) const {
+  inline const SynapseData& dataForSynapse(const Synapse synapse) const {
+    NTA_CHECK(synapseExists_(synapse, true));

needed check for dataForSynapse/Segment(). Otherwise, the data could be accessed even after removing the synapse/segment with destroy*().

//the following member must not be serialized (so is set to =0).
//That is because of we serialize only active segments & synapses,
//excluding the "destroyed", so those fields start empty.
//! ar(CEREAL_NVP(destroyedSegments_));

fixed bug in serialization of connections.
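
Why the destroyed counter must start at zero after loading can be shown with a small round-trip sketch. It does not use cereal and is not the actual Connections serialization code: `SegSketch`, `StoreSketch`, `saveSketch` and `loadSketch` are hypothetical, and the archive is modelled as a plain vector of payloads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical segment record and container.
struct SegSketch {
  bool destroyed;
  int payload;
};

struct StoreSketch {
  std::vector<SegSketch> segments;
  std::size_t destroyedSegments = 0;
};

// Only live segments are written; destroyed ones are skipped entirely.
std::vector<int> saveSketch(const StoreSketch &store) {
  std::vector<int> archive;
  std::size_t holes = 0;
  for (const SegSketch &seg : store.segments) {
    if (seg.destroyed) { ++holes; continue; }
    archive.push_back(seg.payload);
  }
  assert(holes == store.destroyedSegments);  // counter and holes must agree
  return archive;
}

// The loaded store contains no holes, so the counter is reset to 0
// rather than restored from the archive.
StoreSketch loadSketch(const std::vector<int> &archive) {
  StoreSketch store;
  for (int payload : archive) {
    store.segments.push_back(SegSketch{false, payload});
  }
  store.destroyedSegments = 0;
  return store;
}
```

If the old counter were restored as well, the number of live segments computed as size minus destroyed would come out too small after a load, which is the kind of inconsistency the code comment in the diff guards against.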

@@ -206,7 +206,7 @@ void TemporalMemory::activatePredictedColumn_(
   do {
     if (learn) {
       connections_.adaptSegment(*activeSegment, prevActiveCells,
-          permanenceIncrement_, permanenceDecrement_, true);
+          permanenceIncrement_, permanenceDecrement_, true, minThreshold_);

FIX: TM now honors the "max segments per cell" logic.

@breznak breznak added ready and removed in_progress labels Jun 3, 2020

@dkeeney dkeeney left a comment

I found no problems while doing a high level review.
I have not been following the changes that you are making to the Connections algorithm but I assume they are all good.

breznak commented Jun 4, 2020

Thank you, David!

> I have not been following the changes that you are making to the Connections algorithm but I assume they are all good.

I've just double-checked, and the changes to Connections are all fixes, so we're good.

@breznak breznak merged commit c50a669 into master Jun 4, 2020
@breznak breznak deleted the fix_conn_stimulusThreshold branch June 4, 2020 06:08
Labels
bug Something isn't working fix ready
3 participants