-
Notifications
You must be signed in to change notification settings - Fork 1.7k
A more helpful Error message when we can't handshake because of too many open files #8737
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -104,29 +104,45 @@ impl Session { | |
nonce: &H256, host: &HostInfo) -> Result<Session, Error> | ||
where Message: Send + Clone + Sync + 'static { | ||
let originated = id.is_some(); | ||
let mut handshake = Handshake::new(token, id, socket, nonce).expect("Can't create handshake"); | ||
let local_addr = handshake.connection.local_addr_str(); | ||
handshake.start(io, host, originated)?; | ||
Ok(Session { | ||
state: State::Handshake(handshake), | ||
had_hello: false, | ||
info: SessionInfo { | ||
id: id.cloned(), | ||
client_version: String::new(), | ||
protocol_version: 0, | ||
capabilities: Vec::new(), | ||
peer_capabilities: Vec::new(), | ||
ping: None, | ||
originated: originated, | ||
remote_address: "Handshake".to_owned(), | ||
local_address: local_addr, | ||
match Handshake::new(token, id, socket, nonce) { | ||
Ok(mut handshake) => { | ||
let local_addr = handshake.connection.local_addr_str(); | ||
handshake.start(io, host, originated)?; | ||
Ok(Session { | ||
state: State::Handshake(handshake), | ||
had_hello: false, | ||
info: SessionInfo { | ||
id: id.cloned(), | ||
client_version: String::new(), | ||
protocol_version: 0, | ||
capabilities: Vec::new(), | ||
peer_capabilities: Vec::new(), | ||
ping: None, | ||
originated: originated, | ||
remote_address: "Handshake".to_owned(), | ||
local_address: local_addr, | ||
}, | ||
ping_time: Instant::now(), | ||
pong_time: None, | ||
expired: false, | ||
protocol_states: HashMap::new(), | ||
compression: false, | ||
}) | ||
}, | ||
ping_time: Instant::now(), | ||
pong_time: None, | ||
expired: false, | ||
protocol_states: HashMap::new(), | ||
compression: false, | ||
}) | ||
Err(err) => { | ||
if let Error(ErrorKind::Io(inner), state) = err { | ||
Err(Error( | ||
if inner.raw_os_error() == Some(24) { | ||
// handle sys error CODE 24; "Too many open files" more gracefully | ||
ErrorKind::TooManyOpenFiles("Can't create handshake.".to_string()) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Forgive me the silly question but what does this look like to the user in the console?
Does the process exit at that point? Does it carry with it the error code when exiting? I'm thinking about people running parity on a server with monitoring attached looking for specific process exit codes to trigger alarms. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Honestly wasn't able to trigger this particular warning myself, but took the information provided in the corresponding ticket - #6791 . From my understanding this happens in a Worker-Thread, which - as a result - then panics. The process itself keeps running and from the weirdness of the backtrace message, I doubt anyone tried to have any script do things when that error appears. So for the case you are thinking about this doesn't change anything. |
||
} else { | ||
ErrorKind::Io(inner) | ||
}, state)) | ||
} else { | ||
Err(err) | ||
} | ||
} | ||
} | ||
} | ||
|
||
fn complete_handshake<Message>(&mut self, io: &IoContext<Message>, host: &HostInfo) -> Result<(), Error> where Message: Send + Sync + Clone { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not super happy with this: upacking the error to call that function on it with a specific hard-coded CODE-Number is one thing, but then replacing that with a different error to give our message seems crude -- especially here.
However, I couldn't find any way in which
error_chains
supports linking different local errors upon the same type but discriminating by a function call or closure -- which I think, how it is should be done instead... Also because this error message is now only matched in this particular case. Which fixes the issue reported, but should we fail with a TOO_MANY_OPEN_FILES on any part of the code (which any call toethkey::generate
might does), we still see the rather cryptic previous error message :( .Tips/ideas appreciated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how to do this better (or if that is possible). A question though: is
24
platform agnostic? Looks like it's the LinuxEMFILE
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. I guess you are right - though both Linux and Mac. However, there this must be understood as plattform specific code... making the entire part even more complex and ugly :( ...
not happy about this, not happy about it at all ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking closer it seems like
EMFILE
is actually standard and defined inerrno.h
in the C stdlib. So that is probably fine. Reading the issue again I noticed they ran into error23
and not24
and then I found this.I think we should try to respond sensibly to both cases (i.e.
ENFILE
/23
is too many files in the system andEMFILE
/24
is too many files in the process).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also should use an enum with the error numbers, I'd say the
libc
crate is appropriate here:ENFILE
,EMFILE
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dvdplm oh, looks like they were running into both:
The Error Message is only appropriate for 24. In 23 I suppose you have to kill some processes or restart the system or something? I'll ask in the ticket, what message they'd like for that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, you'll need to release some file handle descriptors! ;) Or increase the system-level ulimits. A system restart should not be necessary though.