Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reusing SmtpClient hangs in net50 #49340

Closed
chrisaut opened this issue Mar 8, 2021 · 15 comments · Fixed by #70046
Closed

Reusing SmtpClient hangs in net50 #49340

chrisaut opened this issue Mar 8, 2021 · 15 comments · Fixed by #70046

Comments

@chrisaut
Copy link
Contributor

chrisaut commented Mar 8, 2021

Calling SendMailAsync multiple times on the same SmtpClient instance fails under some circumstances in net50.

Description

The following test succeeds when running under netcoreapp3.1, but fails by timing out in net50

[Fact(Timeout = 10000)]
public async Task Net50Repo()
{
    var config = Configuration.Configure();
    using var server = SimpleSmtpServer.Start(config.WithRandomPort().Port);
    var inst = new SmtpClient
    {
        Host = "localhost",
        Port = server.Configuration.Port,
        EnableSsl = false
    };

    for (int messageNo = 0; messageNo < 2; messageNo++)
    {
        var mailMessage = new MailMessage("one@example.com", "two@example.com");

        await inst.SendMailAsync(mailMessage);

        Assert.Equal(messageNo + 1, server.ReceivedEmailCount);
    }
}

The issue seems to be reusing the SmtpClient, the first SendMailAsync succeeds, but the second call always just hangs indefinitely.

Works in 3.1. Newing up a new client for each call also works. Testing against a different Smtp server (eg. Papercut) also works.

netDumpster just ends up waiting inside SmtpContext on this line:

count = this.socket.Receive(byteBuffer);
It's like if the socket remains open it cannot be reused anymore for some reason. Any idea what changed?

Configuration

We are able to reproduce with netdumpster in unit tests, production is not yet running in net50 for us (partly because of this), so I'm unsure if other smtp server are effected, we are nervous about this.

Regression?

Yes, regression from 3.1 to 5

Other information

I originally raised the issue on netdumpster, there is some more discussion there, although not much.

I am aware that other issues mention SmtpClient is not under active development, and it seems MS recommends migrating to MailKit, but it would still be nice if existing functionality would not break.

@cmendible @scalablecory

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Net untriaged New issue has not been triaged by the area owner labels Mar 8, 2021
@ghost
Copy link

ghost commented Mar 8, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Calling SendMailAsync on the same SmtpClient instance fails under some circumstances in net50.

Description

The following test succeeds when running under netcoreapp3.1, but fails by timing out in net50

[Fact(Timeout = 10000)]
public async Task Net50Repo()
{
    var config = Configuration.Configure();
    using var server = SimpleSmtpServer.Start(config.WithRandomPort().Port);
    var inst = new SmtpClient
    {
        Host = "localhost",
        Port = server.Configuration.Port,
        EnableSsl = false
    };

    for (int messageNo = 0; messageNo < 2; messageNo++)
    {
        var mailMessage = new MailMessage("one@example.com", "two@example.com");

        await inst.SendMailAsync(mailMessage);

        Assert.Equal(messageNo + 1, server.ReceivedEmailCount);
    }
}

The issue seems to be reusing the SmtpClient, the first SendMailAsync succeeds, but the second call always just hangs indefinitely.

Works in 3.1. Newing up a new client for each call also works. Testing against a different Smtp server (eg. Papercut) also works.

netDumpster just ends up waiting inside SmtpContext on this line:

count = this.socket.Receive(byteBuffer);
It's like if the socket remains open it cannot be reused anymore for some reason. Any idea what changed?

Configuration

We are able to reproduce with netdumpster in unit tests, production is not yet running in net50 for us (partly because of this), so I'm unsure if other smtp server are effected, we are nervous about this.

Regression?

Yes, regression from 3.1 to 5

Other information

I originally raised the issue on netdumpster, there is some more discussion there, although not much.

I am aware that other issues mention SmtpClient is not under active development, and it seems MS recommends migrating to MailKit, but it would still be nice if existing functionality would not break.

@cmendible @scalablecory

Author: chrisaut
Assignees: -
Labels:

area-System.Net, untriaged

Milestone: -

@ManickaP
Copy link
Member

Triage: we should investigate here. Potentially improve our tests to cover this.

@ManickaP ManickaP added this to the 6.0.0 milestone Mar 11, 2021
@wfurt
Copy link
Member

wfurt commented Mar 11, 2021

I did quick check and I cannot reproduce it @chrisaut. Tested on Ubuntu18 agains Postfix.

What OS do you use? Can you do packet captures with Wireshark? Perhaps we can see more there.

@chrisaut
Copy link
Contributor Author

chrisaut commented Mar 12, 2021

I did quick check and I cannot reproduce it @chrisaut. Tested on Ubuntu18 agains Postfix.

I/we are on windows testing against netdumpster. Easiest to repro with this: https://github.com/cmendible/netDumbster/tree/dotnet5
The test called "Reusing_Smtp_Client_Should_Not_Fail" succeeds in 3.1 and fails in 5.0

What OS do you use? Can you do packet captures with Wireshark? Perhaps we can see more there.

Windows. I can probably get some captures if needed, but you should be able to repro with the branch above.

Just to be clear again, we originally thought this was a netdumpster (a fake smtp server for tests) issue, however it functions perfectly in older frameworks, and against other smtp clients. Upgrading to dotnet 50 breaks when using SmtpClient.

Like mentioned before, there was a bit of discussion earlier here

@cmendible
Copy link

@wfurt if needed I can modify the code in the branch removing everything that is not needed to repro the issue. Just let me know

@wfurt
Copy link
Member

wfurt commented Mar 12, 2021

I'll try it next week on Windows. If I can reproduce triage should not be too hard IMHO. Part of it is also impact and that drives priority.

@karelz karelz added bug and removed untriaged New issue has not been triaged by the area owner labels Mar 18, 2021
@karelz
Copy link
Member

karelz commented Mar 18, 2021

Triage: If we have simple repro and idea what to fix, we should invest. If it proves difficult, I would punt it to Future.

@karelz karelz modified the milestones: 6.0.0, Future May 6, 2021
@ghost ghost added the no-recent-activity label Oct 9, 2021
@ghost
Copy link

ghost commented Oct 9, 2021

This issue has been automatically marked no recent activity because it has been marked as needs more info but has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no recent activity.

Please refer to our contribution guidelines for tips on what information might be required.

@ghost
Copy link

ghost commented Nov 5, 2021

This issue will now be closed since it had been marked no recent activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the issue, but please note that the issue will be locked if it remains inactive for another 30 days.

@ghost ghost closed this as completed Nov 5, 2021
@karelz
Copy link
Member

karelz commented Nov 5, 2021

The needs more info was bogus. Reopening.

@karelz karelz reopened this Nov 5, 2021
@ghost ghost removed the no-recent-activity label Nov 5, 2021
@wfurt wfurt removed their assignment Mar 24, 2022
@BrunoJuchli
Copy link
Contributor

BrunoJuchli commented May 17, 2022

FYI because possibly connected:

I'm reusing the same SmtpClient instance to send multiple e-mails via synchronous Send(MailMessage) method.

  • on .net Framework 4.7.2 and 4.8 this works without issue (Windows Server 2019)
  • on .net 6 same code results in the server closing the socket and reporting a timeout:

    SmtpException: Service not available, closing transmission channel. The server response was: Timeout. Try talking faster next time
    at System.Net.Mail.MailCommand.CheckResponse(SmtpStatusCode statusCode, String response)
    at System.Net.Mail.MailCommand.Send(SmtpConnection conn, Byte[] command, MailAddress from, Boolean allowUnicode)
    at System.Net.Mail.SmtpTransport.SendMail(MailAddress sender, MailAddressCollection recipients, String deliveryNotify, Boolean allowUnicode, SmtpFailedRecipientException& exception)
    at System.Net.Mail.SmtpClient.Send(MailMessage message)

    • run on Windows Server 2019 as well as Linux, same timeouts happen on both

EDIT: Changing the implementation to use a new instance for every e-mail fixed the timeouts. See comment below.

@BrunoJuchli
Copy link
Contributor

BrunoJuchli commented May 30, 2022

I finally got around deploying this to production. And yes, once I've moved to disposing the SmtpClient every time I send, my timeout exceptions vanished magically.
Also note, that the timeout exceptions happened on .net 6 on Windows as well as on Linux nodes (we moved to Linux in the meantime but this didn't change anything. Then adding the Dispose() helped).
You can clearly see the time we've deployed the change:

image

@vonzshik
Copy link
Contributor

I've done a small investigation, and the problem was introduced with #683. Or, to be precise, it was uncovered by it. The main problem is the fact that SmptClient attempts to open another connection every time SendAsync is called, even if the connection is open.

if (NetEventSource.Log.IsEnabled()) NetEventSource.Info(this, $"Calling BeginConnect. Transport: {_transport}");
_transport.BeginGetConnection(_operationCompletedResult, ConnectCallback, _operationCompletedResult, Host!, Port);
_operationCompletedResult.FinishPostingAsyncOp();

internal IAsyncResult BeginGetConnection(ContextAwareResult outerResult, AsyncCallback? callback, object? state, string host, int port)
{
IAsyncResult? result = null;
try
{
_connection = new SmtpConnection(this, _client, _credentials, _authenticationModules);
if (NetEventSource.Log.IsEnabled()) NetEventSource.Associate(this, _connection);
if (EnableSsl)
{
_connection.EnableSsl = true;
_connection.ClientCertificates = ClientCertificates;
}
result = _connection.BeginGetConnection(outerResult, callback, state, host, port);

And compare this to Send.

case SmtpDeliveryMethod.Network:
default:
GetConnection();
// Detected during GetConnection(), restrictable using the DeliveryFormat parameter
allowUnicode = IsUnicodeSupported();
ValidateUnicodeRequirement(message, recipients, allowUnicode);
writer = _transport.SendMail(message.Sender ?? message.From, recipients,
message.BuildDeliveryStatusNotificationString(), allowUnicode, out recipientException);
break;

private void GetConnection()
{
if (!_transport.IsConnected)
{
_transport.GetConnection(_host!, _port);
}
}

Unlike sync implementation, async doesn't check whether _transport.IsConnected is true or not.

@wfurt
Copy link
Member

wfurt commented May 31, 2022

do you want to contribute PR @vonzshik ?

@vonzshik
Copy link
Contributor

@wfurt sure, though it might take a bit of time (hopefully, building the runtime shouldn't be too hard 😄)

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label May 31, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 29, 2022
@karelz karelz modified the milestones: Future, 7.0.0 Jul 19, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Aug 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants