Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does websocket fail properly when not being able to connect? #455

Closed
JustArchi opened this issue Sep 26, 2017 · 10 comments
Closed

Does websocket fail properly when not being able to connect? #455

JustArchi opened this issue Sep 26, 2017 · 10 comments
Assignees
Labels
Milestone

Comments

@JustArchi
Copy link
Contributor

JustArchi commented Sep 26, 2017

One of my users can't establish a connection when using websocket protocol, while totally being able to do so through TCP protocol.

By taking a look at ASF debug log, I noticed the following:

2017-09-26 01:53:15|ArchiSteamFarm-5028|INFO|bot|Start() Starting...
2017-09-26 01:53:15|ArchiSteamFarm-5028|INFO|bot|Connect() Connecting...
2017-09-26 01:53:15|ArchiSteamFarm-5028|DEBUG|ASF|WriteLine() ServerList | Resolving server list
2017-09-26 01:53:15|ArchiSteamFarm-5028|DEBUG|ASF|WriteLine() ServerList | Resolved 160 servers
2017-09-26 01:53:15|ArchiSteamFarm-5028|DEBUG|ASF|WriteLine() ServerList | Next server candidiate: Unspecified/CM01-LUX.cm.steampowered.com:27021 (WebSocket)
2017-09-26 01:55:15|ArchiSteamFarm-5028|WARN|bot|InitPermanentConnectionFailure() Failed to disconnect the client. Abandoning this bot instance!
2017-09-26 01:55:15|ArchiSteamFarm-5028|INFO|bot|Stop() Stopping...
2017-09-26 01:55:15|ArchiSteamFarm-5028|INFO|bot|Start() Starting...
2017-09-26 01:55:15|ArchiSteamFarm-5028|INFO|bot|Connect() Connecting...
2017-09-26 01:55:15|ArchiSteamFarm-5028|DEBUG|ASF|WriteLine() ServerList | Next server candidiate: Unspecified/CM01-LUX.cm.steampowered.com:27021 (WebSocket)

It seems that websocket does not fail in at least 2 minutes of timeout. You can notice my workaround code that tries to detect fuckups like these (stalled connections) and destroys/recreates the bot from the ground up, but then it tries to connect to exactly the same server and loop goes forever.

Is this normal? Is websocket timeout configurable? Maybe you'd want to dig deeper into this? I don't know if this should be classified as a bug, it looks like a bug to me since websocket doesn't react in any way for 2 minutes since attempting to connect. This does not happen when SK2 fails to connect via TCP or UDP - OnDisconnected() always arrives.

@yaakov-h yaakov-h added this to the 2.0.0 milestone Sep 26, 2017
@yaakov-h
Copy link
Member

It should, but I'll have to test to see what it actually does.

What platform is that log from? Windows, macOS, or Linux?

@JustArchi
Copy link
Contributor Author

JustArchi commented Sep 26, 2017

That's from Windows 7 SP1 specifically. I have a theory that Windows 7 might have something to do with this, since I didn't receive similar reports before, neither was able to reproduce this particular behaviour myself on Windows 10.

What is weird is lack of any exception, noise in SK2 debug log or anything. It just stays like that in frozen state. That state is being kept for full 2 minutes, afterwards my code is forcefully killing that bot together with everything since it's stalled forever. I doubt anything else would happen if it didn't for straight 2 minutes, but of course I could check if needed.

@yaakov-h
Copy link
Member

I think Websocket support was only natively introduced with Windows 8 / Server 2012.

As such, I expect the ClientWebSocket constructor would blow up, which would bubble out to here.

This task is not observed until you call SteamClient.Disconnect, but since we still have a reference to it, the unobserved task exception handled (from your other websocket issue) won't be triggered.

That would explain the whole business of nothing happening.

@yaakov-h yaakov-h self-assigned this Sep 26, 2017
@yaakov-h yaakov-h added the bug label Sep 26, 2017
@JustArchi
Copy link
Contributor Author

This makes sense. As part of my own investigation I checked if it can be due to self-contained app model targetting win-x64, but even in generic published variant with .NET Core 2.0 SDK installed on top of this setup, we reproduced exactly the same behaviour, so it's not lack of prereqs, some runtime mismatch or other self-contained app problem.

Is there anything we can do to address this? I can always check against Windows 7 and disable websocket from available protocols myself, but this is probably not the best solution considering other people might stumble upon this too.

@yaakov-h
Copy link
Member

I thought perhaps we should do the same within SteamKit2, but it may just be worth escalating this one to the .NET Core team. There is a managed implementation, but they only use it on Unix targets, not Windows/UWP.

It looks like it's already possible to run the managed implementation on Windows if you compile it all yourself.

@JustArchi
Copy link
Contributor Author

JustArchi commented Sep 26, 2017

I'll open issue in corefx, see what they say. In worst case we have that check against windows 7 possibility, but perhaps there is something better. Let's keep this issue open until we decide upon something, since we probably should address it in some way in SK2.

@JustArchi
Copy link
Contributor Author

Seems like this can take a while 🙃. Gonna address it myself until then, not sure if SK2 should follow this way or not, since technically corefx is the right place for fixing this.

JustArchi added a commit to JustArchiNET/ArchiSteamFarm that referenced this issue Sep 26, 2017
@JustArchi
Copy link
Contributor Author

JustArchi commented Sep 26, 2017

I verified that the below works fine for detection of this issue, while it's probably the most "clean" solution without hardcoding any platforms (since WebSocketHandle.CheckPlatformSupport() is internal):

internal static bool SupportsWebSockets() {
	try {
		using (new ClientWebSocket()) {
			return true;
		}
	} catch (PlatformNotSupportedException) {
		return false;
	}
}

I used it for disabling websocket protocol out of my steam configuration before attempting to connect and "fixed" the issue.

However SK2 probably shouldn't make use of this, instead, I think throwing PlatformNotSupportedException from Connect() back to the consumer would be the best idea for now, unless you have some other idea in mind. We definitely can't leave this connection stalled like that - either stumble upon exception and throw it to the consumer the moment we get it, or do a pre-check similar to above and throw earlier, maybe right in SteamConfiguration.ProtocolTypes setter. At least that's what seems appropriate to me, both solutions sound good.

@yaakov-h
Copy link
Member

That works, but I'd rather avoid first-chance exceptions. I was thinking of using Win32's VerifyVersionInfo. The problem there is that if CoreFX fixes the Windows 7 behaviour, we'd still throw an exception.

We should definitely fix the stalling one way or another, but need to somehow inform a consumer why it's failing... I guess DebugLog might suffice.

@JustArchi
Copy link
Contributor Author

Yep, I thought about fetching windows version at first too, but like you said this will fail (for no reason) if Windows 7 gets websocket implementation, and it also won't cover all possible cases, for example new OSes that will come without websocket support for one reason or another. This is why I decided to go with try catch approach, it's the most reliable way to check that, since it calls WebSocketHandle.CheckPlatformSupport() under the hood in the constructor while not being that expensive to create itself.

Like always I'm just posting possible solutions that come to my mind, feel free to go with whatever you find appropriate (and I actually appreciate a lot when people solve the things better than I do, you never stop learning 👍).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants