Expose getaddrinfo errors. Was: Return empty list on EAI_ errors for getaddrinfo@luv (#351) #352
lib_eio_luv/eio_luv.ml
Outdated
| Error `EAI_ADDRFAMILY | Error `EAI_AGAIN | Error `EAI_BADFLAGS | Error `EAI_BADHINTS
| Error `EAI_CANCELED | Error `EAI_FAIL | Error `EAI_FAMILY | Error `EAI_MEMORY
| Error `EAI_NODATA | Error `EAI_NONAME | Error `EAI_OVERFLOW | Error `EAI_PROTOCOL
| Error `EAI_SERVICE | Error `EAI_SOCKTYPE -> []
This list seems excessive (even if it does match the stdlib version). In particular, EAI_AGAIN, EAI_BADFLAGS, EAI_MEMORY and EAI_SOCKTYPE seem like they should be reported as errors, rather than claiming there are no addresses. Maybe EAI_FAIL too.
I thought about it too, just some notes:
EAI_FAIL and EAI_AGAIN (unlike EAGAIN) can be the result of an actual DNS reply (I would have to dig deeper, but those should return an empty list). EAI_SOCKTYPE, EAI_BADFLAGS and EAI_FAMILY we could report, since they are mostly programming errors on our side.
I don't have a strong opinion, but I'd err on the side of just emulating the behaviour of Unix.getaddrinfo, which ignores everything.
I did a quick grep in the OpenBSD tree, just to see the most common idiom.
Basically it's just log gai_errno -> return NULL; some cases are a bit more careful, checking for EAI_NONAME and EAI_NODATA and not logging those as errors.
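The idiom described above can be sketched in OCaml as follows. This is only an illustration: the `gai_error` type is a hypothetical four-code subset, and `error_name`, `is_not_found` and `addresses_or_empty` are names invented here, not functions from Eio or the OCaml standard library.

```ocaml
(* Hypothetical subset of getaddrinfo error codes, for illustration only. *)
type gai_error = [ `EAI_AGAIN | `EAI_FAIL | `EAI_NODATA | `EAI_NONAME ]

let error_name : gai_error -> string = function
  | `EAI_AGAIN -> "EAI_AGAIN"
  | `EAI_FAIL -> "EAI_FAIL"
  | `EAI_NODATA -> "EAI_NODATA"
  | `EAI_NONAME -> "EAI_NONAME"

(* "The name has no addresses" is an expected outcome, not a fault. *)
let is_not_found : gai_error -> bool = function
  | `EAI_NONAME | `EAI_NODATA -> true
  | `EAI_AGAIN | `EAI_FAIL -> false

(* Mirror the "log, then return nothing" idiom: every error yields [],
   but only the unexpected ones are passed to [log]. *)
let addresses_or_empty ~log (r : ('a list, gai_error) result) : 'a list =
  match r with
  | Ok addrs -> addrs
  | Error e ->
    if not (is_not_found e) then log (error_name e);
    []
```

Note that the caller still only sees an empty list, so the distinction between the two kinds of failure survives in the log alone.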
I wonder if we should just copy the getaddrinfo API and not allow an empty result. The wrapper function could enforce this:
let getaddrinfo ?(service="") (t:#t) hostname =
  match t#getaddrinfo ~service hostname with
  | [] -> raise (Dns_error (Failure "No addresses for ..."))
  | xs -> xs
I guess most people getting an empty list will want to report an error, and it's a bit silly that you have to guess what the error was when we just threw the real error away. E.g. in with_tcp_connect we currently do:
  | [] -> raise (Connection_failure (Failure (Fmt.str "No TCP addresses for %S" host)))
Then we could say whether that's because the DNS server is busy, the entry doesn't exist, it doesn't have an IP address, etc.
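To make the point about keeping the real error concrete, here is a minimal sketch of a wrapper that raises with the original code instead of returning an empty list. Everything in it is an assumption made for illustration: the `gai_error` subset, the record carried by `Dns_error`, the `resolve` parameter standing in for a backend lookup, and the wording in `describe` (which is not the output of gai_strerror). It is not Eio's API.

```ocaml
(* Hypothetical error codes and exception; these are not Eio's definitions. *)
type gai_error = [ `EAI_AGAIN | `EAI_FAIL | `EAI_NODATA | `EAI_NONAME ]

exception Dns_error of { host : string; code : gai_error }

(* Illustrative messages only. *)
let describe : gai_error -> string = function
  | `EAI_AGAIN -> "temporary failure in name resolution"
  | `EAI_FAIL -> "non-recoverable failure in name resolution"
  | `EAI_NODATA -> "no address associated with name"
  | `EAI_NONAME -> "name or service not known"

(* [resolve] stands in for a backend lookup that reports errors as values. *)
let getaddrinfo ~resolve host =
  match resolve host with
  | Ok addrs -> addrs
  | Error code -> raise (Dns_error { host; code })

let () =
  (* A stub resolver that always fails, to show what the caller sees. *)
  let resolve _ = Error `EAI_AGAIN in
  match getaddrinfo ~resolve "example.org" with
  | (_ : string list) -> ()
  | exception Dns_error { host; code } ->
    Printf.printf "cannot resolve %s: %s\n" host (describe code)
```

With the code and hostname attached to the exception, a caller can tell a busy DNS server apart from a missing entry without guessing.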
(edited since I misread the first time)
I like this better than the empty list, but if we're going in this direction we could expose the gai_strerror message in Dns_error, so the user has something more meaningful than "not found". That would require providing a C stub for u-ring that exposes the error.
Thinking a bit more about this, I think connect should take a Sockaddr.stream list, and loop over the candidates until it succeeds connecting to one. This is virtually what everyone does with getaddrinfo + connect and bind.
In this case connect itself could raise something, getaddrinfo could still return just an empty list, and the user would still have to handle an exception.
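The candidate loop proposed here might look roughly like the sketch below. `connect_any` and its `connect` parameter are hypothetical names used for illustration (the real connect call is abstracted away as a function that raises on failure), so this shows the shape of the idea rather than any existing Eio function.

```ocaml
(* [connect] is a hypothetical stand-in for a real connect call; it is
   expected to raise on failure and return a connection on success.
   For brevity this sketch catches every exception, which real code
   would probably want to narrow. *)
let connect_any ~connect addrs =
  let rec loop last_exn = function
    | [] ->
      (* Either the list was empty or every candidate failed. *)
      (match last_exn with
       | Some ex -> raise ex
       | None -> failwith "no candidate addresses")
    | addr :: rest ->
      (match connect addr with
       | conn -> conn
       | exception ex -> loop (Some ex) rest)
  in
  loop None addrs
```

In the empty-list case the function can only report a generic failure, because by then it has no information about why resolution produced nothing.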
Yes, the error code was just an example. We can use a better exception name and contents.
> I think connect should take a Sockaddr.stream list, and loop over the candidates until it succeeds connecting to one.
with_tcp_connect already handles looping over the candidates; I'm not sure there's much benefit to changing connect too. In fact, that would just mean having to do something for the extra case of an empty list.
> In this case connect itself could raise something, getaddrinfo could still return just an empty list, and the user would still have to handle an exception.
I don't see how that helps. We'd still be losing the detailed error from getaddrinfo and replacing it with a guess.
> Yes, the error code was just an example. We can use a better exception name and contents.
> > I think connect should take a Sockaddr.stream list, and loop over the candidates until it succeeds connecting to one.
> with_tcp_connect already handles looping over the candidates; I'm not sure there's much benefit to changing connect too. In fact, that would just mean having to do something for the extra case of an empty list.
An empty list would return some Dns_error 'no candidates found'.
> > In this case connect itself could raise something, getaddrinfo could still return just an empty list, and the user would still have to handle an exception.
> I don't see how that helps. We'd still be losing the detailed error from getaddrinfo and replacing it with a guess.
Yes, I was just assuming you wouldn't want a getaddrinfo stub that exports the getaddrinfo errno 😄.
I'd prefer to export the error from getaddrinfo and also make connect loop, so users don't write connect (List.first (getaddrinfo...)).
I don't feel too strongly about it, I just want to move forward so both backends have the same behaviour.
Should we just raise all errors from getaddrinfo in both cases? That involves the stub and creating some Dns_error values.
Can I get a ruling here?
> Should we just raise all errors from getaddrinfo in both cases?
Yes, that makes sense to me.
(Having connect take a Sockaddr.stream list doesn't seem useful to me. You can't give a good error if there are no addresses - you don't know what the DNS error was, what the failing hostname was, or even whether DNS was used at all!)
Thanks @haesbaert! This works for me, but I'll defer to @talex5's knowledge of which …
I have no special knowledge - just reading the …
…#351) Unix.getaddrinfo used in u-ring ignores gai_errno and returns an empty list; this makes the luv backend follow a similar behaviour.
cb602ac to b419150
Some corrections to my previous statement: getaddrinfo(3) cannot return an empty list with no error; it's either a result or an error. I checked libc. This PR should be …
Tests are fixed, marking this as ready for review.
Another note: I've added the stubs to a new file for two reasons …
Looks sensible.
Note that I'm changing the error handling quite a lot in #378, which will impact this PR a bit.
/* All rights reserved. This file is distributed under the terms of */
/* the GNU Lesser General Public License version 2.1, with the */
/* special exception on linking described in the file LICENSE. */
I think we want to avoid LGPL code in Eio (it's currently BSD and ISC only). Though they might relicense this bit if asked.
@gasche @xavierleroy (sorry for the ping).
I've copied a significant part of otherlibs/unix/getaddrinfo.c into Eio (https://github.com/ocaml-multicore/eio/blob/61b737d5be683b9be611844b16c7e73b5bece09a/lib_eio_linux/getaddrinfo_stubs.c). The original file is LGPLed and we would like to relicense our new file to ISC.
Do you permit us to relicense the bits we copied from LGPL to ISC in Eio?
No. I only use copyleft licenses for my free software. Please consider using the LGPL with OCaml linking exception for your library, or do not reuse any of my code.
> No. I only use copyleft licenses for my free software. Please consider using the LGPL with OCaml linking exception for your library, or do not reuse any of my code.
Thank you, will do one of the two as suggested.
Drive-by comments on the borrowed C code!
lib_eio_linux/getaddrinfo_stubs.c
Outdated
#include <caml/unixsupport.h>
#include <caml/socketaddr.h>

static value caml_unix_cst_to_constr(int n, int *tbl, int size, int deflt)
You could avoid this duplication by borrowing the extern declaration in cst2constr.h
Fixed; somehow my brain remembered the function being static in the first place.
lib_eio_linux/getaddrinfo_stubs.c
Outdated
if (retcode == 0) {
  for (r = res; r != NULL; r = r->ai_next) {
    v = caml_alloc_small(2, Tag_cons);
    Field(v, 0) = convert_addrinfo(r);
This violates GC rule 5 - the convert_addrinfo call needs to be done before the caml_alloc_small and stored in a registered local (convert_addrinfo allocates).
thanks for spotting this, fixed it now
Many thanks to dra27! Somehow this function was static in the first place in my head.
When I removed the `e`, I violated GC rule 5. Basically, a small block must be fully initialized before we trigger the next allocation, TIL.
Unix.getaddrinfo used in u-ring ignores gai_errno and returns an empty list; this makes the luv backend follow a similar behaviour.
Should fix #351, where we would get EAI_NONAME (or others) if resolution failed. This is a bit defensive, as we only handle the expected EAI errors.