Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue 196: acceptor stop then restart doesnt work (re-push of #351) #586

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

mgatny
Copy link
Member

@mgatny mgatny commented Mar 12, 2020

Refs #196
Replaces #351

…ns are disabled during the call to logout during acceptor.stop()
…ns are disabled during the call to logout during acceptor.stop()
…com:huwmjenkins/quickfixn into Issue-196-AcceptorStopThenRestartDoesntWork
@gbirchmeier gbirchmeier changed the title huwmjenkins: issue 196: acceptor stop then restart doesnt work issue 196: acceptor stop then restart doesnt work (re-push of #351) Mar 12, 2020
@gbirchmeier
Copy link
Member

gbirchmeier commented Mar 20, 2020

One test failing:
RestartingTheThreadedSocketAcceptorTests.cs:389 - TestCanForceStopAcceptorAndLogOffCounterpartyIfLoggedOn

Looking into it.

cc @huwmjenkins

@gbirchmeier
Copy link
Member

gbirchmeier commented Mar 20, 2020

Seems there's a little bit of flakiness in here. After deleting logs (relevant?) and running again, I can make the test pass repeatedly. However, when I run the whole suite, this time it crashed out on TestStartedAcceptorAndReceiveMultipleConnnections) (this time on the netcoreapp2.0 side, but not sure if that matters).

It's not consistent. Seems to happen only when I run certain tests together.

Might be a test fixture problem:

System.ObjectDisposedException
  HResult=0x80131622
  Message=Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.
  Source=System.Net.Sockets
  StackTrace:
   at System.Net.Sockets.Socket.BeginReceive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, SocketError& errorCode, AsyncCallback callback, Object state)
   at System.Net.Sockets.Socket.BeginReceive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)
   at UnitTests.RestartingTheThreadedSocketAcceptorTests.ReceiveAsync(SocketState socketState) in C:\fork\quickfixn\UnitTests\RestartingTheThreadedSocketAcceptorTests.cs:line 155
   at UnitTests.RestartingTheThreadedSocketAcceptorTests.ProcessRXData(IAsyncResult ar) in C:\fork\quickfixn\UnitTests\RestartingTheThreadedSocketAcceptorTests.cs:line 210
   at System.Net.LazyAsyncResult.Complete(IntPtr userToken)
   at System.Net.ContextAwareResult.CompleteCallback()
   at System.Net.ContextAwareResult.<>c.<Complete>b__15_0(Object s)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)

Man, I hate this part of the code.

@huwmjenkins
Copy link
Contributor

I've just had a look at the code and I don't know if I'm getting the same issue as you but what I found was when debugging was that when the ClientHandlerThread was being created a FileLog is created to log debug messages and in as the session settings in these tests do not set a DEBUG_FILE_LOG_PATH it uses the default

string debugLogFile = "log";

now when the FileLog is created it checks that the directory exists and if it doesn't it tries to create it:

        if (!System.IO.Directory.Exists(fileLogPath))
            System.IO.Directory.CreateDirectory(fileLogPath);

when I was debugging these tests it looks like it was trying to create the "log" directory in c:\windows\system32 now I'm fairly certain it didn't used to do this as I'm sure my tests used to run (but then I quite often run visual studio as administrator so it might have done and been quite happy to create this directory - for record I am currently running visual studio 16.5.0 ). Anyway, I changed the initialisation of the default debug log file locally to:

string debugLogFilePath = Path.Combine( Path.GetDirectoryName(new Uri(Assembly.GetExecutingAssembly().CodeBase).LocalPath), "log" );

and the tests have started working again. But, I'm not sure if this is the same cause of the issue as you are getting. I have been quite busy the past few weeks but I should be a little quieter next week and I shall try and have a more detailed look.

Attempt to eliminate flakiness (connamara#586)
@gbirchmeier
Copy link
Member

gbirchmeier commented Mar 20, 2020

@huwmjenkins: I think I did as you suggested; see my recent commit.

The problem hasn't gone away. I hope I can find time to dig a little deeper.

No problem if you can't find time. I'm pretty busy too, but I'm trying to stay with this so I don't start to forget about it again and let another 2 years go by. :)

@gbirchmeier
Copy link
Member

Running tests on some other branches, and I'm starting to suspect that my observed issues are not totally from this branch. Possibly something in SessionDynamicTest.cs. Needs more investigation.

@huwmjenkins
Copy link
Contributor

Ah, I think I may know the issue though I thought i'd pushed this change as part of the pull request. Anyway, in the TearDown of SessionDynamicTest.cs there is this bit of code:

if (_listenSocket != null)
_listenSocket.Close();
if (_initiator != null)
_initiator.Stop(true);
if (_acceptor != null)
_acceptor.Stop(true);

The issue is that the _initiator and _acceptor are not being disposed. It is during the dispose of these that the FileLog will be disposed of correctly. At some point, and I don't know if this was how it was before I started my changes or occured in another merge and when I upmerged this pull request I then moved code around but the dispose of the underlying session (and subsequent closure of all log handles) happens in the Dispose method of the ThreadedSocketAcceptor in my PR. I believe at some point it was in the Stop method but this then caused issues when trying to Restart the acceptor as all the sessions would need reinitialising. So, the easiest solution was to move the dispose of the sessions into the dispose of the ThreadSocketAcceptor . This was one of the things I wanted to look at a bit more detail whether it made sense to have the Stop method just stop the tcp connection and leave the sessions as they are and only get ride of the sessions once the Acceptr had been disposed of or should is it better to dispose of the session and recreate them each time the acceptor is stopped and started. I think I was favouring the former as it seemed less messy but it was something I wanted to revisit.

Anyway, after all that rambling and assuming the issue your having is the same as the issue I think you're having then changing the tear down of SessionDynamicTest.cs so that the initiator and acceptor are disposed rather than stopped should fix the issue (Dispose calls Stop(true)):

if (_listenSocket != null)
_listenSocket.Close();
if (_initiator != null)
_initiator.Dispose();
if (_acceptor != null)
_acceptor.Dispose();

@chrisbarker
Copy link
Contributor

Hi @gbirchmeier does this PR still have life? I work with @huwmjenkins and would love to see it finally get merged. Anything I can do to help?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants