Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

A more helpful Error message when we can't handshake because of too many open files #8737

Conversation

gnunicorn
Copy link
Contributor

Attemps to fix #6791 by providing a better error message when the handshake fails because
of limited resources given by the system around.

@parity-cla-bot
Copy link

It looks like @gnunicorn signed our Contributor License Agreement. 👍

Many thanks,

Parity Technologies CLA Bot

Err(err) => {
if let Error(ErrorKind::Io(inner), state) = err {
Err(Error(
if inner.raw_os_error() == Some(24) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super happy with this: upacking the error to call that function on it with a specific hard-coded CODE-Number is one thing, but then replacing that with a different error to give our message seems crude -- especially here.

However, I couldn't find any way in which error_chains supports linking different local errors upon the same type but discriminating by a function call or closure -- which I think, how it is should be done instead... Also because this error message is now only matched in this particular case. Which fixes the issue reported, but should we fail with a TOO_MANY_OPEN_FILES on any part of the code (which any call to ethkey::generate might does), we still see the rather cryptic previous error message :( .

Tips/ideas appreciated!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how to do this better (or if that is possible). A question though: is 24 platform agnostic? Looks like it's the Linux EMFILE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I guess you are right - though both Linux and Mac. However, there this must be understood as plattform specific code... making the entire part even more complex and ugly :( ...

not happy about this, not happy about it at all ...

Copy link
Collaborator

@dvdplm dvdplm May 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking closer it seems like EMFILE is actually standard and defined in errno.h in the C stdlib. So that is probably fine. Reading the issue again I noticed they ran into error 23 and not 24 and then I found this.
I think we should try to respond sensibly to both cases (i.e. ENFILE/23 is too many files in the system and EMFILE/24 is too many files in the process).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also should use an enum with the error numbers, I'd say the libc crate is appropriate here: ENFILE, EMFILE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dvdplm oh, looks like they were running into both:

The Error Message is only appropriate for 24. In 23 I suppose you have to kill some processes or restart the system or something? I'll ask in the ticket, what message they'd like for that case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill some processes or restart the system or something

well, you'll need to release some file handle descriptors! ;) Or increase the system-level ulimits. A system restart should not be necessary though.

Err(Error(
if inner.raw_os_error() == Some(24) {
// handle sys error CODE 24; "Too many open files" more gracefully
ErrorKind::TooManyOpenFiles("Can't create handshake.".to_string())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive me the silly question but what does this look like to the user in the console?

TooManyOpenFiles: Can't create handshake

Does the process exit at that point? Does it carry with it the error code when exiting? I'm thinking about people running parity on a server with monitoring attached looking for specific process exit codes to trigger alarms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly wasn't able to trigger this particular warning myself, but took the information provided in the corresponding ticket - #6791 .

From my understanding this happens in a Worker-Thread, which - as a result - then panics. The process itself keeps running and from the weirdness of the backtrace message, I doubt anyone tried to have any script do things when that error appears. So for the case you are thinking about this doesn't change anything.

Copy link
Collaborator

@dvdplm dvdplm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle both ENFILE and EMFILE. Use enums.

@gnunicorn
Copy link
Contributor Author

Closing as I've found a better approach, submitted as #8744

@gnunicorn gnunicorn closed this May 31, 2018
@5chdn 5chdn added this to the 1.12 milestone May 31, 2018
@5chdn 5chdn added A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. M4-core ⛓ Core client code / Rust. labels May 31, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve error message: "Too many open files in system"
4 participants