This repository has been archived by the owner on Jun 10, 2020. It is now read-only.

[BUG] Destination not available. #48

Open
Spydrouge opened this issue Nov 19, 2014 · 2 comments

Comments

@Spydrouge
Contributor

WONT FIX

Original Problem: There was a very finicky error that only showed up sometimes and on some configurations, where the embodiment would hang at "Something edible was reported." This error had previously been 'fixed' by delaying a mapInfo message to make sure it only fired after the final 'terrainPerception' message had been sent. But when it cropped up again, it became clear that the OAC's initialization process was being interrupted in some strange way, and inconsistently at that.

One partial solution and one workaround were implemented.

Workaround: we are using delays in an attempt to pace messages to the OAC, which seems to be prone to seizures. I am attempting to highlight those areas with FIXMEs, which can be seen in Monodevelop. The biggest one is putting a small delay between when the embodiment initializes and when the button spawns, even if previous code changes ought to have 'fixed' this.

Partial Solution: If OCConnectorSingleton ever receives the message to mark the OAC as unavailable, it will throw an error, as it means embodiment did not (and will never) initialize successfully (typically owing to loading the AGI_ROBOT too soon, an issue covered in #49).

  • Update: This task may actually have been repaired in the course of attempting to solve [Bug] Block Destroy Hypothesis #65 because we removed some unnecessary calls to functions that might have tried to notify the OAC of the update

Current Task
I am also going to set an error message and boolean in OCConnectorSingleton's 'mark element as unavailable,' as right now the only reason this message would print is that everything went boom! And I want to catch that.
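
A minimal sketch of the "error message and boolean" idea, assuming nothing about the real OCConnectorSingleton beyond what is described above. The class name `AvailabilityTracker` and the members `InitializationFailed` and `LastError` are hypothetical, made up for illustration; only the name MarkAsUnavailable comes from the project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the availability bookkeeping; not the project's actual class.
public class AvailabilityTracker
{
    private readonly HashSet<string> _unavailable = new HashSet<string>();

    // Sticky flag: once any element has been marked unavailable we remember it,
    // even if the element later becomes available again.
    public bool InitializationFailed { get; private set; }

    // Human-readable description of the most recent failure, or null if none.
    public string LastError { get; private set; }

    public void MarkAsUnavailable(string id)
    {
        _unavailable.Add(id);
        InitializationFailed = true;
        LastError = "Element '" + id + "' was marked unavailable; "
                  + "embodiment initialization probably failed.";
    }

    public void MarkAsAvailable(string id)
    {
        _unavailable.Remove(id);
    }

    public bool IsAvailable(string id)
    {
        return !_unavailable.Contains(id);
    }
}
```

A unit test could then check `InitializationFailed` after a run instead of scanning the console for "Destination not available."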

Recently got some Unit Test 'message sent sensing' implemented after I figured out an optimal place to locate such things. Now I have very easily been able to sense that the battery information is at least passed to OpenCog, and I SHOULD be able to very easily sense when plans are received and executed.

However, there is now some crazy error that repeats over and over again. Unity says the Destination is not Available, and the embodiment hangs at "Something edible was reported."

Latest Update
Okay... I added a breakpoint someplace, and it ran :| I was trying to test when the element was marked as unavailable in the first place but, of course, it ran, making it impossible to check. Upon removing the breakpoint, everything broke again.

Right now I have really no idea how to debug this thing
So I'm just going to try a couple random things. First, I'm going to delay like a split second in between the embodiment and adding the battery voxel (even though it should be okay)

  • I delayed the battery spawn by 0.1 seconds. Nothing happened.
  • I delayed the battery spawn by 20 seconds. The OAC was never marked unavailable and the plan was successful. (Clearly the finish terrain perception message is NOT the odd one).
  • I placed the delay after the battery spawn, assuming nothing different would happen. The OAC was marked as unavailable even before the terrain perception was done. I am 90% sure this wasn't happening before, because the mapInfo message was delayed to after the terrain perception finished.
[░]   Destination not available. Discarding message to 'OAC_Hazuki93' of type 'STRING': <?xml version="1.0" encoding="UTF-8"?>

<oc:embodiment-msg xsi:schemaLocation="http://www.opencog.org/brain BrainProxyAxon.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:oc="http://www.opencog.org/brain">

  <finished-first-time-percept-terrian-signal timestamp="2014-11-20T12:22:49.408" />

</oc:embodiment-msg>
  • Running it with the same code again but a slightly different messageLock had it hanging at the original location. Perhaps this failure to send the finished first time percept terrain signal was caused by something else?

Questions and Theories

  • I know that when the battery was being spawned RIGHT after the embodiment, things broke (A MapInfo got sent midway through perceive terrain, and if I remember the error state looked something like this)
  • I used a flag inside OCConnector to check whether the perceiveterrain message had been sent, and I only allowed things like PerceiveWorld to be called THEN.
  • With this adjustment made, the battery was allowed to be spawned right after the embodiment, and everything worked fine.
  • However, there is a question of WHERE the flag has been set.
  • Are the messages a stack or queue? Do they pop in the order they were sent, or in the reverse? <- Add seems to place a message at the end. They are iterated through in a foreach.
  • Dispatch times still suggest that the perceive terrain message is getting sent EARLIER than the mapinfo messages. <--- so this shouldn't be a problem.
  • Could it be that OAC is multithreaded and that it doesn't get through perceiving the terrain data before it starts trying to perceive the map info? <- This seems to be an unlikely source of our problem, given that we managed to get it to error BEFORE the final perception message was sent.
  • I have noticed that some locations write lock(_messageSendingLock) and others lock(_messagesToSend). I altered all of them to the latter, with no success in fixing things.
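
The flag and locking points above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the class `GatedMessageQueue`, its methods, and the `requiresTerrain` parameter are hypothetical. It holds back messages that depend on terrain perception until a flag is set, keeps first-in-first-out order, and guards everything with one lock object. In C#, lock(a) and lock(b) on two different objects do not exclude each other, which is why mixing _messageSendingLock and _messagesToSend would leave the list unprotected between the two groups of call sites.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a send queue gated on terrain perception.
public class GatedMessageQueue
{
    // One lock object for every access to the lists and the flag.
    private readonly object _lock = new object();
    private readonly List<string> _messagesToSend = new List<string>();
    private readonly List<string> _heldBack = new List<string>();
    private bool _terrainPerceived;

    // Messages that need the terrain to be known are parked until the flag is set.
    public void Enqueue(string message, bool requiresTerrain)
    {
        lock (_lock)
        {
            if (requiresTerrain && !_terrainPerceived)
                _heldBack.Add(message);
            else
                _messagesToSend.Add(message);
        }
    }

    // Sets the flag and releases parked messages in the order they arrived.
    public void MarkTerrainPerceived()
    {
        lock (_lock)
        {
            _terrainPerceived = true;
            _messagesToSend.AddRange(_heldBack);
            _heldBack.Clear();
        }
    }

    // Returns the pending messages in FIFO order and empties the queue.
    public List<string> Drain()
    {
        lock (_lock)
        {
            List<string> batch = new List<string>(_messagesToSend);
            _messagesToSend.Clear();
            return batch;
        }
    }
}
```

With this shape, a mapInfo message enqueued midway through terrain perception cannot be dispatched before the final terrain message, regardless of when the battery spawns. It does nothing about ordering on the receiving side.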

So how do I test this...
For all I know, this error was buried in older code, even... Let's see. I might be able to Debug.Log the context it's breaking at and, through that, see the stack trace. Then I can guess at what hasn't run yet. I can run with and without the breakpoint to try to see what normally marks it as available again but isn't running.

Okay, this is interesting. When the breakpoint is hit, the OAC is never set to unavailable. It is only when it is running without the breakpoint that it ever gets set to unavailable.

I did a stack trace on where it's getting set to unavailable:

OpenCog.Network.OCNetworkElement:MarkAsUnavailable(String) (at Assets\OpenCog Assets\Scripts\Connections\Network\OCNetworkElement.cs:822)
OpenCog.Network.OldMessageHandler:parse(String) (at Assets\OpenCog Assets\Scripts\Connections\Network\OldMessageHandler.cs:195)
OpenCog.Network.OldMessageHandler:run() (at Assets\OpenCog Assets\Scripts\Connections\Network\OldMessageHandler.cs:104)

104 is in the public void run() function in this while loop:

        // TODO Make some tests to judge the read time.
        string line = reader.ReadLine();

        if(line != null)
        {
            //string answer = this.parse(line);
            this.parse(line);

            //UnityEngine.Debug.Log ("Just parsed '" + line + "'");
        }

195 is obviously under parse() and is in this big control statement:

else if(command.Equals("UNAVAILABLE_ELEMENT"))
{
    if(token.MoveNext()) // Has more elements
    {   
        // Get unavailable element id.
        string id = token.Current.ToString();

        System.Console.WriteLine(OCLogSymbol.DETAILEDINFO + "onLine: Unavailable element message received for [" + 
                  id + "].");
        this.ne.MarkAsUnavailable(id);
        answer = OCNetworkElement.OK_MESSAGE;
    }
    else
    {
        answer = OCNetworkElement.FAILED_MESSAGE;   
    }
}

So it looks to me like we are actually getting a message from the embodiment side that the OAC is unavailable. And it's never getting marked available again. My question is, if I left it running indefinitely, would it fix itself?

The answer: No. It never corrects itself. In fact it takes quite a delay to mark itself as unavailable. Is it possible we've somehow done something to it?

Note on Connections
The embodiment-gameworld connection is really, really fritzy. I can get it to connect like 1 out of 3 tries. Part of this, I believe, can be witnessed if one turns off the play button but leaves the embodiment running. OldMessageHandler.run() still exists and will keep printing stuff to console... leading me to believe it doesn't shut down properly if embodiment stays open after the game world does.

Here is the Error
(There are actually two that repeat)

[░]   Destination not available. Discarding message to 'OAC_Bender97' of type 'STRING': <?xml version="1.0" encoding="UTF-8"?>
<oc:embodiment-msg xsi:schemaLocation="http://www.opencog.org/brain BrainProxyAxon.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:oc="http://www.opencog.org/brain">
  <avatar-signal id="OAC_Bender97" timestamp="2014-11-20T10:50:12.640">
    <physiology-level name="hunger" value="0" />
    <physiology-level name="thirst" value="0" />
    <physiology-level name="pee_urgency" value="0" />
    <physiology-level name="poo_urgency" value="0" />
    <physiology-level name="energy" value="0.947222222222222" />
    <physiology-level name="fitness" value="0.694436013232917" />
  </avatar-signal>
</oc:embodiment-msg>
UnityEngine.Debug:Log(Object)
OCConnectorSingleton:SendMessages() (at Assets/OpenCog Assets/Scripts/Connections/Embodiment/OCConnectorSingleton.cs:372)
OCConnectorSingleton:FixedUpdate() (at Assets/OpenCog Assets/Scripts/Connections/Embodiment/OCConnectorSingleton.cs:311)
[░]   Destination not available. Discarding message to 'OAC_Bender97' of type 'TICK': TICK_MESSAGE
UnityEngine.Debug:Log(Object)
OCConnectorSingleton:SendMessages() (at Assets/OpenCog Assets/Scripts/Connections/Embodiment/OCConnectorSingleton.cs:372)
OCConnectorSingleton:FixedUpdate() (at Assets/OpenCog Assets/Scripts/Connections/Embodiment/OCConnectorSingleton.cs:311)

-My gut tells me that accessing the OCConnectorSingleton's Instance() at a strange & new location, prior to when the OCConnectorSingleton would normally be built by the Embodiment creation, might have caused this. Obviously this SHOULD not be the case, but it should be reasonable to test. All I have to do is backtrack edits made to this class.-

I'll try and work it out, but if anyone else has any ideas why this might be lemme know.

It is not the dispatch flags
From what I can tell, the introduction of the dispatch flags is not what damaged this. I'm trying to think of what the best approach is to finding the bug... should I look at changes over time? Go back to the last working version? Insert debug statements?

@Spydrouge
Contributor Author

@Nemquae Hey Lake, is there anything I should know about that was tinkered with embodiment side? I am having OAC_RobotNameAndNumber sending an unavailable element message to the unity game and then never sending an 'available' element message... so unless I slow down execution with breakpoints or something, we end up in an infinite loop of discarded messages for unavailable elements.

If not, can you advise what would make the OAC say it's unavailable (and repeat it ad nauseam)?

@Spydrouge Spydrouge added wontfix and removed major labels Nov 20, 2014
@Spydrouge
Contributor Author

We have come to the conclusion that we are just watching race conditions happen. And since we send off the messages in the correct order (to the best of our knowledge), it would appear this is happening on the OAC side. Therefore, we are implementing a workaround right now.

Added a FIXME [RACE] in the code to highlight it
