Cleanup exceptions #742

Merged
merged 12 commits into from
Nov 9, 2019

Conversation

breznak
Member

@breznak breznak commented Nov 1, 2019

  • removes LogItem and LoggingException
  • htm::Exception now allows << "msg" operator, used in NTA_THROW
  • cleanup NTA_* macros
  • use NTA_LOG_LEVEL = htm::LogLevel::LogLevel_{Verbose,...} to set logging sensitivity

For #175
This was meant to help with a strange crash in #736 related to the NTA_THROW macro; unfortunately, it did not.

use cout<< "DEBUG" instead
operator<< is needed for NTA_THROW macro,
and now we can remove LoggingException, which only added the <<
use NTA_LOG_LEVEL = htm::LogLevel::LogLevel_xxx
see Log.hpp
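
For illustration, here is a minimal sketch of how an exception class can accept streamed text, so that a throw macro can be followed by << "msg". The names StreamingError and SKETCH_THROW are hypothetical; this is not the actual htm::Exception or NTA_THROW code, only the general technique.

```cpp
#include <exception>
#include <sstream>
#include <string>

// Hypothetical stand-in for an exception that accepts streamed text.
class StreamingError : public std::exception {
public:
  StreamingError(const char *file, int line) {
    std::ostringstream s;
    s << file << ":" << line << ": ";
    msg_ = s.str();
  }

  // Append any streamable value to the message; returning *this lets calls chain.
  template <typename T>
  StreamingError &operator<<(const T &value) {
    std::ostringstream s;
    s << value;
    msg_ += s.str();
    return *this;
  }

  const char *what() const noexcept override { return msg_.c_str(); }

private:
  std::string msg_;
};

// Hypothetical macro in the spirit of NTA_THROW. Because `throw` binds more
// loosely than `<<`, the whole streamed chain is built before the throw happens.
#define SKETCH_THROW throw StreamingError(__FILE__, __LINE__)
```

With this sketch, SKETCH_THROW << "bad value " << 42; throws a StreamingError whose what() text ends in "bad value 42".
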
@breznak breznak added ready code code enhancement, optimization, cleanup..programmer stuff labels Nov 1, 2019
@breznak breznak self-assigned this Nov 1, 2019
dkeeney
dkeeney previously approved these changes Nov 2, 2019
@dkeeney dkeeney left a comment

Nice to have this code cleaned up. Thanks.

@@ -84,7 +84,7 @@ TEST(TMRegionTest, testSpecAndParameters) {
Network net;

// Turn on runtime Debug logging.
- //if (verbose) LogItem::setLogLevel(LogLevel::LogLevel_Verbose);
+ //if (verbose) NTA_LOG_LEVEL = LogLevel::LogLevel_Verbose;

nit: It would be better if this were a function or at least a macro so that if we need to change the logic we don't need to change it everywhere it is used.

Note that with the current logic, setting the log level only affects the current cpp file and is not global. That fact needs to be described someplace and there is no function to associate it with.

If the intent was to have the log level be set globally the variable NTA_LOG_LEVEL would have to be changed from static to extern and the actual variable would need to be defined in a .cpp someplace ... in Network.cpp perhaps.

In the NetworkAPI there are times that we would like to have this set globally so that debug messages on subordinate cpp's would be activated. Currently each Region displays something on each iteration if this is activated globally. Makes a nice trace.

@dkeeney
dkeeney commented Nov 2, 2019

After some more thought on this, it occurred to me that when we start to do multi-threading we will also need to be able to make the logging thread specific. So there are three things to consider:

  • debug logging set globally
  • debug logging set locally (only one cpp file)
  • debug logging set for one thread.

One way to do this is to create a Log class in Log.hpp and make that a static variable declared in the .hpp,

   class Log {
   private:
       LogLevel local_log_level;           // this is for logging relative to a single cpp.
       static LogLevel global_log_level;   // this is for process global logging

   public:
       void setGlobalLogLevel(LogLevel level) { global_log_level = level; }
       void setLocalLogLevel(LogLevel level) { local_log_level = level; }
       void setThreadLogLevel(LogLevel level) {
            // this gets the thread specific storage and sets it there.
       }
       // Need another function to determine if output should be generated.
       // It should OR the three log levels together.
   };

This may be overkill. but something to consider.
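
A minimal, header-only sketch of the idea above, under a few assumptions: the enum mirrors the LogLevel values in Log.hpp, "OR the three levels together" is read as "the most verbose of the three settings wins", and the helper names (shouldLog, globalLevel, threadLevel) are hypothetical. Function-local statics are used so that no Log.cpp is needed; the process-global level is not synchronized in this sketch.

```cpp
#include <algorithm>

enum class LogLevel { LogLevel_None = 0, LogLevel_Minimal = 1, LogLevel_Normal = 2, LogLevel_Verbose = 3 };

class Log {
public:
  static void setGlobalLogLevel(LogLevel level) { globalLevel() = level; }
  void setLocalLogLevel(LogLevel level) { local_ = level; }
  static void setThreadLogLevel(LogLevel level) { threadLevel() = level; }

  // A message is emitted when any of the three scopes asks for it,
  // i.e. the most verbose of the three settings decides.
  bool shouldLog(LogLevel messageLevel) const {
    const LogLevel effective = std::max({local_, globalLevel(), threadLevel()});
    return messageLevel != LogLevel::LogLevel_None && messageLevel <= effective;
  }

private:
  // One instance per process (not protected against concurrent writes here).
  static LogLevel &globalLevel() {
    static LogLevel level = LogLevel::LogLevel_None;
    return level;
  }
  // One instance per thread.
  static LogLevel &threadLevel() {
    static thread_local LogLevel level = LogLevel::LogLevel_None;
    return level;
  }
  // One instance per Log object, e.g. one per .cpp file.
  LogLevel local_ = LogLevel::LogLevel_None;
};
```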

@breznak
Member Author

breznak commented Nov 2, 2019

> This [Log class] may be overkill. but something to consider.

You're absolutely right, and the design you propose sounds good and clean. But this PR was just a side quest to get #736 working (and it didn't even work out).

One problem with what you suggest is that we have historically used the NTA_THROW << "die!" macros (and others), and they are used all over the codebase. A Log class would require changing all of those call sites.

I don't feel like looking at that now. So: should I rework this PR so that it passes as-is, meaning just the cleanup, or do we want to close it and rework it completely later?

which is superfluous and should not interfere with the logic of the region
Member Author

@breznak breznak left a comment

I think there's a bug in the handling of (TM)Region(s?); please have a look below.

tm_->getActiveCells(out->getData().getSDR());
NTA_DEBUG << "active " << *out << std::endl;
}
out = getOutput("predictedActiveCells");
- if (out && (out->hasOutgoingLinks() || LogItem::isDebug())) {
+ if (out && out->hasOutgoingLinks() ) {
Member Author

@dkeeney this is something I'd like to consult you on. The problem I'm hitting here actually seems to be a bug in TMRegion:
a) I'm removing the "is Debug" part of the check. It was meant as a helper for debugging, but it breaks the logic (tests that expect such behavior are wrong).
b) there's a problem with requiring hasOutgoingLinks(), as:

This is the test failing:

net = engine.Network()
#net.setLogLevel(htm.bindings.engine_internal.LogLevel.Verbose)     # Verbose shows data inputs and outputs while executing.

encoder = net.addRegion("encoder", "ScalarSensor", "{n: 6, w: 2}");
sp = net.addRegion("sp", "SPRegion", "{columnCount: 200}");
tm = net.addRegion("tm", "TMRegion", "");
net.link("encoder", "sp");
net.link("sp", "tm");
net.initialize();

encoder.setParameterReal64("sensedValue", 0.8);  #Note: default range setting is -1.0 to +1.0
net.run(1)

sp_input = sp.getInputArray("bottomUpIn")
sdr = sp_input.getSDR()
self.assertTrue(np.array_equal(sdr.sparse, EXPECTED_RESULT1))

sp_output = sp.getOutputArray("bottomUpOut")
sdr = sp_output.getSDR()
self.assertTrue(np.array_equal(sdr.sparse, EXPECTED_RESULT2))

tm_output = tm.getOutputArray("predictedActiveCells")
sdr = tm_output.getSDR()
self.assertTrue(np.array_equal(sdr.sparse, EXPECTED_RESULT3))

The network is Encoder->SP->TM.
I was surprised that only the TM check fails, but that is because the TM is a "leaf" (the last part of the Network) and we have the condition if (out && out->hasOutgoingLinks() ).
We still want a check similar to "has outputs" so that we only compute connected regions. Using getInput("name").hasIncomingLinks() would work for the TM in this example, but would fail for the root (sensor/encoder).

So do we want a combination and determine if XXX (what is the "predictedActiveCells" called, it's not a region per se...?) is connected by

out = getOutput("predictedActiveCells");
if ((out && out->hasOutgoingLinks()) || (getInput("predictedActiveCells") &&  getInput("predictedActiveCells")->getIncomingLinks() ) ) { //means  either XX,or YY exists and is connected in a chain XX->TM->YY
tm_->activateDendrites();
tm_->getWinnerCells(out->getData().getSDR());
NTA_DEBUG << "winners " << *out << std::endl;
}

I am having a problem understanding your post on my cell phone.
I will have to check it out later when I return.

@dkeeney dkeeney Nov 4, 2019

if (out && (out->hasOutgoingLinks() || LogItem::isDebug())) {

This logic is correct. Although we need someplace else to put isDebug() since you removed LogItem. This is part of the trace facility. When the debug mode is activated by an application by calling setLogLevel(LogLevel_Verbose), the trace should show each region that is executed and the outputs.

For the trace facility to work, we need at least the following:

  • LogLevel must be global.
  • We need a function like setLogLevel( ) for applications to call. (they should not have access to the global LogLevel variable).
  • We need a function like isDebug( ) to allow conditional processing related to the trace.
  • We need a function (or Macro) like NTA_DEBUG( ) to indicate what is to be displayed in the trace.

This trace facility is extremely useful when debugging a problem with region execution and/or linking. In particular, it shows which data is being passed at which times and the order in which the regions are executed. This is another case where we need more documentation and explanation of the 'features' that are available.
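
The four requirements listed above can be sketched roughly as follows. This is an illustration only: the sketch namespace, its functions, and the SKETCH_DEBUG macro are hypothetical names, not the library's NTA_DEBUG or Network::setLogLevel, and the level is a plain process-wide variable here.

```cpp
#include <iostream>
#include <sstream>
#include <string>

enum class LogLevel { LogLevel_None = 0, LogLevel_Minimal = 1, LogLevel_Normal = 2, LogLevel_Verbose = 3 };

namespace sketch {
// The level lives behind functions, so callers never touch the variable directly.
inline LogLevel &level() {
  static LogLevel current = LogLevel::LogLevel_Minimal;
  return current;
}
inline void setLogLevel(LogLevel l) { level() = l; }
inline bool isDebug() { return level() == LogLevel::LogLevel_Verbose; }
} // namespace sketch

// Writes a trace line only in verbose mode. The streamed expressions after the
// macro are not evaluated at all when tracing is off.
#define SKETCH_DEBUG                                                           \
  if (!sketch::isDebug()) {                                                    \
  } else                                                                       \
    std::cerr << "DEBUG:\t" << __FILE__ << ":" << __LINE__ << ": "
```

A call site then looks like SKETCH_DEBUG << "compute Output: " << value << std::endl; and stays silent until sketch::setLogLevel(LogLevel::LogLevel_Verbose) has been called.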

Oh, and the 'Verbose' facility that I used with some of the Unit Tests does not use the trace facility.

TMRegion has several 'optional' outputs. For the 'optional' outputs, if there is no outgoing connection, then the output is not generated. So, to get the trace in Verbose mode we should generate the output data even though it is not being sent to an output. That is what the above 'if' statement does.

However, recently we exposed the ability to access the output buffers directly with region.getOutput(name).getData(). This causes a problem because if there is no outgoing link the optional outputs cannot be accessed in this way. Perhaps to be on the safe side we should always produce all outputs even if they are not used. In that case we can remove isDebug( ) and remove the entire 'if' statement at the top but the cost is lower performance when the outputs are not needed.

> So do we want a combination and determine if XXX (what is the "predictedActiveCells" called, it's not a region per se...?) is connected by ....

I have no idea what you are trying to say here. Could you re-phrase it?

Member Author

> if (out && (out->hasOutgoingLinks() || LogItem::isDebug())) {
>
> This logic is correct. Although we need someplace else to put isDebug() since you removed LogItem.

This is what I wanted to disprove. Debug is useful, but we should not change a region's output based on it (the output is what the tests check).

> For the trace facility to work, we need at least the following:

We have these:

> LogLevel must be global.

NTA_LOG_LEVEL

> We need a function like setLogLevel( ) for applications to call. (they should not have access to the global LogLevel variable).

Network.setLogLevel() (but they do have access to the global variable)

> We need a function like isDebug( ) to allow conditional processing related to the trace.

NTA_LOG_LEVEL == htm::LogLevel::LogLevel_Verbose

> We need a function (or Macro) like NTA_DEBUG( ) to indicate what is to be displayed in the trace.

Can we use NTA_DEBUG?

> TMRegion has several 'optional' outputs. For the 'optional' outputs, if there is no outgoing connection, then the output is not generated. So, to get the trace in Verbose mode we should generate the output data even though it is not being sent to an output. That is what the above 'if' statement does.

This goes back to what I was trying to prove wrong.

> For the 'optional' outputs, if there is no outgoing connection, then the output is not generated

and

> However, recently we exposed the ability to access the output buffers directly with region.getOutput(name).getData(). This causes a problem because if there is no outgoing link the optional outputs cannot be accessed in this way. Perhaps to be on the safe side we should always produce all outputs even if they are not used.

I'm not sure what the "optional" output is in this context, but this is the bug. In the test mentioned, the Network looks like encoder -> SP -> TM, with TM's getOutput("predictedActiveCells") apparently being optional. But users (and our tests) can expect the output and query for it, and it's empty.

So we should either:

  • always produce all outputs, as you suggest.
  • or document this and require the user to mark the (optional) output as required. How would one do that? (e.g. in network_test.py, see Cleanup exceptions #742 (comment) )
  • or, as I tried in 2149e83, extend the condition from "has outgoing links" to "has incoming or outgoing links". Viewed as a graph problem, a network with (some) disconnected regions should not compute them (given E and A->B->C, do not compute E). Currently we compute all of B because it has an outgoing link to C, but we do not compute the optional outputs of C (as it has no outgoing links).

static LogLevel NTA_LOG_LEVEL = LogLevel::LogLevel_Minimal; // change this in your class to set log level

The above variable is not global. It creates a separate instance in every .cpp in which Log.hpp is included, so it works only if the log level is set and used in the same .cpp.

To make this global you need to declare it extern in the .hpp and define it, without the static, in exactly one .cpp. OR instantiate a class that has this as a class variable...a more C++ way of doing things.
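
A small sketch of the extern approach described above, shown in one listing for brevity. The comments mark which part would live in the header and which in a single .cpp. SKETCH_LOG_LEVEL, setLogLevel and isDebug are illustrative names here, not the code that was merged.

```cpp
enum class LogLevel { LogLevel_None = 0, LogLevel_Minimal = 1, LogLevel_Normal = 2, LogLevel_Verbose = 3 };

// --- would go in the header (e.g. Log.hpp) ---------------------------------
// Declaration only: every .cpp that includes the header refers to the same
// object. A `static` variable here would instead give each .cpp its own copy.
extern LogLevel SKETCH_LOG_LEVEL;
void setLogLevel(LogLevel level);
bool isDebug();

// --- would go in exactly one .cpp -------------------------------------------
// The single definition of the shared variable.
LogLevel SKETCH_LOG_LEVEL = LogLevel::LogLevel_Minimal;
void setLogLevel(LogLevel level) { SKETCH_LOG_LEVEL = level; }
bool isDebug() { return SKETCH_LOG_LEVEL == LogLevel::LogLevel_Verbose; }
```

Routing writes through a function such as setLogLevel keeps call sites unchanged if the storage later moves, for example to per-thread storage.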

Oh, and you called the outputs 'regions'. That confused me at first. A region is the wrapper around an algorithm. A region can have multiple outputs.

Keep in mind that when we start doing multi-threading, this global variable will not be very useful. A better way to handle trace when doing threads is to have the global in the thread-global space so that each thread's trace can be controlled independently. That is why I wanted to use a function or Macro to set the logging variable rather than assigning a value to it.

Member Author

> lets always generate ALL outputs, which means we don't need isDebug( ).[...] If they are declaring E then they must be using it for something.

You're right, let's not try to be smarter than the user; the implementation will also be simpler.

> A better way to handle trace when doing threads is to have the global in the thread-global space so that each thread's trace can be controlled independently. That is why I wanted to use a function or Macro to set the logging variable rather than assigning a value to it.

I agree. Are we able to do that without a significant change to how the NTA_WARN etc. macros are used? Or we make a big step and deprecate the macros and use the Log class you've sketched (like Java logging works)?

> Or we make a big step and deprecate the macros and use the Log class

Oh, I don't propose deprecating the macros, just change the macros to use the Log class.

If you declare the class as a static variable in the Log.hpp file it will create a separate instance of it in each .cpp in which Log.hpp is included. But that is ok since we only need state from the class variable, which will be global in scope. An alternative is to go ahead and use the thread global space to store the LogLevel state (preferred). Either way we would not need a Log.cpp. Everything could be hidden by macros so that they can be changed without changing the API.

Oh, there is a new feature in C++11 that I did not know about. It's called thread_local. So we don't have to get the thread id and allocate the thread-specific space for it. It's done for us.
See https://en.cppreference.com/w/cpp/language/storage_duration

thread storage duration. The storage for the object is allocated when the thread begins 
and deallocated when the thread ends. Each thread has its own instance of the object. 
Only objects declared thread_local have this storage duration. thread_local can appear 
together with static or extern to adjust linkage. See Non-local variables and Static local 
variables for details on initialization of objects with this storage duration.

With this you don't need the Log class.
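
A short sketch of what thread_local gives us, assuming a variable named sketchLogLevel and a helper levelSeenByWorker (both made up for this example): a worker thread raises its own log level, and the calling thread's copy is left untouched.

```cpp
#include <thread>

enum class LogLevel { LogLevel_None = 0, LogLevel_Minimal = 1, LogLevel_Normal = 2, LogLevel_Verbose = 3 };

// Each thread gets its own instance, initialized to Minimal when first used.
thread_local LogLevel sketchLogLevel = LogLevel::LogLevel_Minimal;

// Starts a worker that switches its own level to Verbose and reports
// the value that the worker observed.
LogLevel levelSeenByWorker() {
  LogLevel seen = LogLevel::LogLevel_None;
  std::thread worker([&seen] {
    sketchLogLevel = LogLevel::LogLevel_Verbose; // changes only the worker's copy
    seen = sketchLogLevel;
  });
  worker.join();
  return seen;
}
```

After levelSeenByWorker() returns, the caller's sketchLogLevel still holds its previous value, which is the per-thread behavior wanted for tracing. A consequence worth keeping in mind is that a level set on the main thread is not inherited by threads started later.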

@breznak breznak added bug Something isn't working and removed ready labels Nov 2, 2019
@dkeeney
dkeeney commented Nov 4, 2019

I like what you did for the macros, but the 'trace' facility will not work as you have it. If you would like, I am willing to re-implement a simplified trace for you in Log.hpp.

@dkeeney dkeeney left a comment

@breznak I took another look at this to see if we can agree on what it takes to finish it. I like what you have done and would just like to correct the logging facility.

This gives me everything I was looking for.

  • I will have Network::setLogLevel(LogLevel_xxx)
  • The LogLevel is global within a thread.

@@ -176,6 +176,16 @@ void TMRegion::initialize() {
args_.sequencePos = 0;
}


bool TMRegion::isConnected_(string name) const {

We talked about always generating all outputs, even if no connections are made. So this new function would not be needed.

Member Author

> about always generating all outputs, even if no connections are made

I'm having second thoughts about this. I've implemented that in TMRegion and it works fine! (The same should then be done for SPRegion.)
What we should consider is the tradeoff between ease of use for the user/speed. We're computing outputs that may not be needed (and I'm not sure if it's costly enough to care).

The problem is only with the leaf node (here the TM, and not the SP), which has no outgoing links. An alternative approach would be a dummy OutputRegion that the user would add after the TM to specify which outputs are linked (= used).

> What we should consider is the tradeoff between ease of use for the user/speed.

I agree with you. We should not be computing the outputs if they are not used if we can help it.

The problem is when an app gets the input or output buffers directly with region.getOutputData( ) or region.setInputData( ). These could be used on any buffer at any time without the region impl being involved.

Originally, these two functions were not exposed to apps. The apps had to call region.getArrayParameter(name) to access the 'optional' data and in those handlers the region impl had the chance to generate the buffer before returning it.

Perhaps we could add a hook in getOutputData( ) that allowed a region impl to do something before it returned the buffer. That adds yet another complication to the region impl's, but only for those that have optional buffers.
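
To make the hook idea concrete, here is a rough sketch in which an optional output is generated only when somebody actually reads it. LazyOutput, its Generator callback and invalidate() are hypothetical and much simpler than the real Output and Array classes; the point is only the "generate on first access" pattern.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical output buffer that is filled the first time it is read
// after the region has marked it as out of date.
class LazyOutput {
public:
  using Generator = std::function<std::vector<int>()>;

  explicit LazyOutput(Generator generate) : generate_(std::move(generate)) {}

  // Called by the region impl after each compute step: cached data is stale.
  void invalidate() { fresh_ = false; }

  // The hook: regenerate on demand, then hand back the cached buffer.
  const std::vector<int> &getData() {
    if (!fresh_) {
      data_ = generate_();
      fresh_ = true;
    }
    return data_;
  }

private:
  Generator generate_;
  std::vector<int> data_;
  bool fresh_ = false;
};
```

With this shape, an output that is never read costs nothing per iteration beyond invalidate(), while an app that calls getData() directly still gets valid data.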

Member Author

> That adds yet another complication to the region impl's, but only for those that have optional buffers.

The implementation already seems quite complex to me.

  • is there a way in the Spec to say "I will be using this optional output"?
  • or we just KISS it and compute it all for now, unless someone complains on Regions' performance.

> is there a way in the Spec to say "I will be using this optional output"?

No, the Spec is static and describes what the region will do, not how it is used.

> or we just KISS it and compute it all for now, unless someone complains on Regions' performance.

I guess so. We should keep thinking about how we might manage this however.

src/htm/utils/Log.hpp Outdated
@breznak breznak requested a review from dkeeney November 8, 2019 11:26
@breznak breznak added bug Something isn't working NetworkAPI and removed bug Something isn't working labels Nov 8, 2019
@dkeeney
dkeeney commented Nov 8, 2019

@breznak I hope you don't mind me messing with this.
I made the LogLevel variable thread local and tested it out on Windows.
I still need to see if it works on Linux and OSx.
While I was at it I took the liberty to make the trace display a little more readable.

@dkeeney
dkeeney commented Nov 8, 2019

To see the trace work, I added a test. See TMRegionTest.cpp. You will need to turn on verbose for it to show in that test. No sense displaying it all during a normal run of the unit tests.

@dkeeney
dkeeney commented Nov 8, 2019

This is what the trace looks like on Ubuntu 19.04

[          ] Turning on Trace =========
DEBUG:	VectorFileSensor.cpp:148: compute Output: region1.dataOut [ Real32 20 ( 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 )  ] 
DEBUG:	Link.cpp:190: compute Link: copying region1.dataOut-->region2.bottomUpIn; delay=0; size=20 type=Real32 --> SDR
DEBUG:	SPRegion.cpp:155: compute Input:  region2.bottomUpIn [ SDR( 20, 1 ) 0, 10
 ]
DEBUG:	SPRegion.cpp:162: compute Output: region2.bottomUpOut [ SDR( 2, 10 ) 7
 ]
DEBUG:	Link.cpp:190: compute Link: copying region2.bottomUpOut-->region3.bottomUpIn; delay=0; size=20 type=SDR --> SDR
DEBUG:	TMRegion.cpp:214: compute Input:  region3.bottomUpIn [ SDR( 2, 10 ) 7
 ]
DEBUG:	TMRegion.cpp:242: compute Output: region3.bottomUpOut [ SDR( 5, 2, 10 ) 35, 36, 37, 38, 39
 ]
DEBUG:	TMRegion.cpp:246: compute Output: region3.activeCells [ SDR( 5, 2, 10 ) 35, 36, 37, 38, 39
 ]
DEBUG:	TMRegion.cpp:251: compute Output: region3.predictedActiveCells [ SDR( 5, 2, 10 ) 37
 ]
DEBUG:	TMRegion.cpp:256: compute Output: region3.anomaly [ Real32 1 ( 1 )  ] 
DEBUG:	TMRegion.cpp:260: compute Output: region3.predictiveCells [ SDR( 2, 10, 5 ) 
 ]
DEBUG:	Link.cpp:190: compute Link: copying region3.bottomUpOut-->region4.dataIn; delay=0; size=100 type=SDR --> Real32
DEBUG:	VectorFileEffector.cpp:71: compute Input:  region4.dataIn [ Real32 100 ( 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 )  ] 
[          ] Turned off Trace =========

@dkeeney
dkeeney commented Nov 8, 2019

The trace would be a little more readable if I removed the std::endl from the end of the operator<< for the SDR class so that the trailing ']' would not appear on the next line. But I did not for fear that I might mess up some other display that outputs SDR.

Member Author

@breznak breznak left a comment

> I hope you don't mind me messing with this.

Not at all, thank you for improving the logging facility!!

> But I did not for fear that I might mess up some other display that outputs SDR.

It wouldn't be critical, but I think it's fine as is for now 👍

this is ACK for me, please approve and we can merge this!

@@ -28,32 +28,36 @@

namespace htm {
enum class LogLevel { LogLevel_None = 0, LogLevel_Minimal=1, LogLevel_Normal=2, LogLevel_Verbose=3 };
- static LogLevel NTA_LOG_LEVEL = LogLevel::LogLevel_Minimal; // change this in your class to set log level
+ // change this in your class to set log level using Network.setLogLevel(level);
+ extern thread_local LogLevel NTA_LOG_LEVEL;
Member Author

I guess non NetworkAPI classes can also use the static Network.setLogLevel(), but writing directly to this variable should work too, right?

> but writing directly to this variable should work too, right?

Yes, you can just assign to it: NTA_LOG_LEVEL = LogLevel::LogLevel_Verbose;

We could use some other class to put the setLogLevel( ) function on that is accessible by the algorithms but I did not find anything general enough. Anyway, the python binding interface has it on the Network class.

@breznak breznak added the ready label Nov 8, 2019
@dkeeney dkeeney left a comment

Let's see if it will allow me to approve this since the last push came from me.

@breznak breznak merged commit 32ab02c into master Nov 9, 2019
@breznak breznak deleted the cleanup_exceptions branch November 9, 2019 00:27
@breznak
Member Author

breznak commented Nov 9, 2019

Thanks for the review and your code, David 👍

> Let's see if it will allow me to approve this since the last push came from me.

Yeah, that's a funny thing. If I open a PR with someone else's code in this repo, the person who opens the PR owns it and cannot give reviews.
