Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: broken pipe errors starting from Go 1.12 when using a deadline #35373

Closed
AmirSimon opened this issue Nov 5, 2019 · 11 comments
Closed

net: broken pipe errors starting from Go 1.12 when using a deadline #35373

AmirSimon opened this issue Nov 5, 2019 · 11 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@AmirSimon
Copy link

What version of Go are you using (go version)?

1.12

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

linux amd64

What did you do?

My code has been running well until I have decided to upgrade the go version in my project from 1.11.9 to 1.12.
updating to Go version 1.12, resulted in multiple integration tests in my project failing by the same error:
write tcp 10.131.1.55:47494->34.251.62.18:443: write: broken pipe

Its important to add I am using Gorilla package, v1.2.0 https://github.com/gorilla/websocket

webSocket connection is made by a test client connecting to a remote machine :

How I establish a connection:

Dialer = &Dialer{
	Proxy:            http.ProxyFromEnvironment,
	HandshakeTimeout: 45 * time.Second,
}

requestHeader := http.Header{}
requestHeader.Add("Cookie", cookie)
requestHeader.Add(XSRF_TOKEN_KEY, token)
websocket.Dialer.TLSClientConfig = &tls.Config{
	InsecureSkipVerify: true,
	MinVersion:         tls.VersionTLS12,
	MaxVersion:         tls.VersionTLS12,
}
ws, _, err := websocket.Dialer.Dial(url, requestHeader)

how the server listens:

ws.websocketConn.SetReadLimit(Max)
ws.websocketConn.SetReadDeadline(time.Now().Add(60 * time.Second))
ws.websocketConn.SetPongHandler(
		func(string) error { ws.websocketConn.SetReadDeadline(time.Now().Add(60 * time.Second)); return nil })
	for {
		_, msg, err := ws.websocketConn.ReadMessage()
		if err != nil {
			CLOSE CONNECTION
			break
		}

after playing with the tests I found out that when setting the read deadline via SetReadDeadline to 0 - no deadline at all, tests seems to work again. (net.go)
(20 minutes deadline still resulted in failed tests)

which led me to think that something after the connection established takes way to long.
went over all of the release notes of 1.12 in an attempt to see what changed that could affect my code, but with no luck. (specifically net/http and net changes)

Tried with multiple Go versions, and it looks like it's failing from 1.12 up to 1.13.

I'm not able to reproduce this issue with a simple connection from my machine to a remote one.
It happens mostly in my test environment where two remote machines establish a connection and passing information with a websocket

This issue might have something to do with gorilla package, however, I have been using v1.2 for years. (also tried upgrading gorilla package to their latest 1.4)

If you think of any more information I could provide, please tell me.

What did you expect to see?

Expected connection to stay stable.

What did you see instead?

Connection disconnected

@ianlancetaylor ianlancetaylor changed the title write: broken pipe errors starting from Go 1.12 net: broken pipe errors starting from Go 1.12 when using a deadline Nov 5, 2019
@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 5, 2019
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Nov 5, 2019
@ianlancetaylor
Copy link
Member

Is there a way that we can recreate the problem ourselves?

You say that with no deadline, the tests pass, but with a deadline, they fail. When you set the deadline to 20 minutes, how long does it take the test to fail?

@AmirSimon
Copy link
Author

@ianlancetaylor , Sorry for the late reply.

Did some more research - its indeed happening when trying to close a web socket from the client side which was already closed by the server because of timeout.

I stand correct on what I said before - setting the deadline to 20 minutes does make the tests pass, its all a matter of if the test runs before the deadline is met.

however, after setting the deadline to about a minute, I added timestamps to a specific test and notice something strange - on version 1.11 and version 1.12 the test takes about 5 minutes, but only the web-socket connection of version 1.12 is being closed beforehand for some reason - making the test failed when both runs call ws.Close() at the end.

am I missing something in terms of why is my connection at 1.11 staying alive until the end (ws.Close() called) compared to 1.12?

@networkimprov
Copy link

1.12 implements TCP keepalive timeouts on client side by default. When they occur, they appear to be deadline events.

See also #31490 & #23459

@AmirSimon
Copy link
Author

Tried to set the KeepAlive of the Dialer to a negative value, but saw no change.
will investigate further on this matter and update here.
Thanks @networkimprov.

@AmirSimon
Copy link
Author

First of all, its important to add that this does not happen on MAC os, and it does fails on CentOS

@networkimprov I tried disabling the default enabled KeepAlive at the client (as stated in the documentation - setting KeepAlive to a negative value.
Also tried sending every pingPeriod a ping control message from the client to the server (similar to the gorilla package example: https://play.golang.org/p/V2OvXyAQzh)
but it still fails. (where pingPeriod varied from millisecond to 54 seconds)

What did work is setting the client's Dialer timeout to
so Basically the Server close the connection with i/o timeout( timeout which was received by the client) and then when the client comes to close the web socket connection, it fails with broken pipe.

what causes my client to throw timeout?

@networkimprov
Copy link

First, update to latest Gorilla-websocket; 1.2 is 2y old. https://github.com/gorilla/websocket/releases

Then if you can provide two complete, simple programs that reproduce the problem with Go 1.13 (but are OK on 1.11), the Go team can investigate.

@AmirSimon
Copy link
Author

As stated above - "also tried upgrading gorilla package to their latest 1.4"

I just want to add that if the client OR the server is running on go 1.11 (one of them is enough), everything works.

I'll try to come up with two simple programs that reproduces the problem, thanks for your help

@ianlancetaylor ianlancetaylor added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Nov 15, 2019
@gopherbot
Copy link
Contributor

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@dzahariev
Copy link

The issue is still present. Please, reopen!

@ianlancetaylor
Copy link
Member

Reopening, but we can't fix this without a test case that reproduces the problem.

@ianlancetaylor ianlancetaylor removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 20, 2019
buger pushed a commit to TykTechnologies/tyk that referenced this issue Apr 8, 2021
<!-- Provide a general summary of your changes in the Title above -->
Add hacky retry to deal with tls errors

Upon investigation it seems to be some language bug

[somewhat similar issue](golang/go#35373)

tuning tls client doesn't help to resolve this issue

by running tests multiple times

```
go test -v -run TestAPIMutualTLS -count 50 ./gateway
```
Tests fails 2-3 times and uses 1 retry to pass

## Description
<!-- Describe your changes in detail -->

## Related Issue
<!-- This project only accepts pull requests related to open issues -->
<!-- If suggesting a new feature or change, please discuss it in an issue first -->
<!-- If fixing a bug, there should be an issue describing it with steps to reproduce -->
<!-- Please link to the issue here -->

## Motivation and Context
<!-- Why is this change required? What problem does it solve? -->

## How This Has Been Tested
<!-- Please describe in detail how you tested your changes -->
<!-- Include details of your testing environment, and the tests you ran to see how your change affects other areas of
the code, etc. -->

## Screenshots (if appropriate)

## Types of changes
<!-- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Refactoring or add test (improvements in base code or adds test coverage to functionality)

## Checklist
<!-- Go over all the following points, and put an `x` in all the boxes that apply -->
<!-- If you're unsure about any of these, don't hesitate to ask; we're here to help! -->
- [ ] Make sure you are requesting to **pull a topic/feature/bugfix branch** (right side). If pulling from your own
      fork, don't request your `master`!
- [ ] Make sure you are making a pull request against the **`master` branch** (left side). Also, you should start
      *your branch* off *our latest `master`*.
- [ ] My change requires a change to the documentation.
  - [ ] If you've changed APIs, describe what needs to be updated in the documentation.
  - [ ] If new config option added, ensure that it can be set via ENV variable
- [ ] I have updated the documentation accordingly.
- [ ] Modules and vendor dependencies have been updated; run `go mod tidy && go mod vendor`
- [ ] When updating library version must provide reason/explanation for this update.
- [ ] I have added tests to cover my changes.
- [ ] All new and existing tests passed.
- [ ] Check your code additions will not fail linting checks:
  - [ ] `go fmt -s`
  - [ ] `go vet`
@seankhliao
Copy link
Member

doesn't look like there are further reports or a clear reproducer

@seankhliao seankhliao closed this as not planned Won't fix, can't repro, duplicate, stale Jan 28, 2023
@golang golang locked and limited conversation to collaborators Jan 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants