getaddrinfo: -9 when connecting to localhost #43

Closed
armanbilge opened this issue Sep 11, 2022 · 34 comments · Fixed by #50
Labels
bug Something isn't working

Comments

@armanbilge
Owner

Replacing it with 127.0.0.1 works however.

@armanbilge armanbilge added the bug Something isn't working label Sep 11, 2022
@armanbilge armanbilge mentioned this issue Sep 11, 2022
@LeeTibbert
Collaborator

-9 may be the gai_error "EAI_SOCKTYPE The SOCTYPE was not recognized."
There is a PR wending its way to completion which will automatically convert the
error code to human, well, network geek text.
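
For reference, a minimal sketch of the kind of code-to-text mapping such a PR would provide. The numeric values are the glibc (Linux) ones from `<netdb.h>`, which is an assumption about the platform; other C libraries number these differently. `GaiErrors` and `describe` are hypothetical names for illustration, not epollcat or Scala Native API.

```scala
// Hypothetical helper: maps a few glibc getaddrinfo() error codes to text.
// The values are glibc-specific; other libcs use different numbers.
object GaiErrors {
  private val messages: Map[Int, String] = Map(
    -2 -> "EAI_NONAME: Name or service not known",
    -6 -> "EAI_FAMILY: ai_family not supported",
    -7 -> "EAI_SOCKTYPE: ai_socktype not supported",
    -9 -> "EAI_ADDRFAMILY: Address family for hostname not supported"
  )

  // Falls back to a generic message for codes not in the table.
  def describe(code: Int): String =
    messages.getOrElse(code, s"getaddrinfo: unknown error code $code")
}
```

Under these glibc values, -9 reads as EAI_ADDRFAMILY rather than EAI_SOCKTYPE, which would still point at an address-family mismatch.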

What this most probably means is that "localhost" on that system is
defined with only an IPv6 address, or the nameserver is configured
to return IPv6 addresses first.

I can check the epollcat connect code to see what it would do
in this case.

@LeeTibbert
Collaborator

At a higher level. I have been pecking my way towards IPv6 support, leaving a trail of
PRs and Issues, but little visible progress.

From my studies so far, there are some things in epollcat which need to change.
I think most of them are easy, but seem to remember two were hard. One was
hidden, the other is configuration, see below.

The harder area is that epollcat appears to use SN 0.4.n (7?) j.n.InetAddress
& friends. I changed some things in those for SN 0.5.0 IPv6 support and have
to see if they can safely be backported to 0.4.n. Again, the issue is
configuration.

The other alternative is to implement just the parts of j.n.InetAddress
& friends needed by epollcat (yes, tomorrow's needs will inevitably
creep to include all as defined by Java). The Harmony code
used by SN was the best that could be done at the time.
I have an inkling that a massive reduction (and correction, recent SN Issue)
could be done using posix C routines.

Configuration

To refresh prior rapid discussions.

Java defines two System Properties (I may get the names wrong here, but
concept is correct) java.net.preferIPv4 and java.net.preferIPv6Addresses.

JVM will use IPv6 as the underlying socket if any network interface
enables IPv6 (getaddrinfo AI_ADDRCONFIG) and java.net.preferIPv4 is
entirely undefined or (I think) not "true".
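
A minimal sketch of that decision, assuming the JDK property names `java.net.preferIPv4Stack` and `java.net.preferIPv6Addresses`. `IpConfig`, `useIPv6Sockets`, and the `ipv6InterfaceUp` parameter are hypothetical; detecting a configured IPv6 interface is left to the caller.

```scala
object IpConfig {
  // Reads a boolean flag from a supplied map so the logic can be tested
  // without mutating the real System properties.
  private def flag(props: Map[String, String], key: String): Boolean =
    props.get(key).exists(_.equalsIgnoreCase("true"))

  // IPv6 sockets are used only if an IPv6 interface is configured and
  // the application has not opted out via preferIPv4Stack.
  def useIPv6Sockets(props: Map[String, String], ipv6InterfaceUp: Boolean): Boolean =
    ipv6InterfaceUp && !flag(props, "java.net.preferIPv4Stack")

  // Whether resolved addresses should be ordered IPv6-first.
  def preferIPv6Addresses(props: Map[String, String]): Boolean =
    flag(props, "java.net.preferIPv6Addresses")
}
```

A caller would pass the live properties, for example `IpConfig.useIPv6Sockets(sys.props.toMap, ipv6InterfaceUp = true)`.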

Thunder storms, system may crash. More later.

@LeeTibbert
Collaborator

Back from thunderstorms.

TL;DR

Based on discussions outside this context, I think that epollcat
should avoid the IPv6 configuration issue but announce
quite clearly that it is following the Java model of "check the
two System Properties" and use IPv6 as underlying transport
if available and those properties allow.

This gives a clear "opt-out". I think that is essential.
It is up to application creators to decide if they want to
allow IPv6 in their apps. Perhaps an example
"Hello World" app paragraph which sets the System Property
preferIPv4 to true.
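
Such a "Hello World" paragraph might look like the sketch below. It assumes the JDK property name `java.net.preferIPv4Stack`; `HelloIPv4Only` and `optOutOfIPv6` are hypothetical names, and the property should be set before any socket or InetAddress method is first used.

```scala
// Hypothetical "Hello World" showing the opt-out.
object HelloIPv4Only {
  // Must run before the first socket or InetAddress call, because the
  // property is expected to be read only once.
  def optOutOfIPv6(): Unit = {
    System.setProperty("java.net.preferIPv4Stack", "true")
    ()
  }

  def main(args: Array[String]): Unit = {
    optOutOfIPv6()
    println(s"preferIPv4Stack = ${System.getProperty("java.net.preferIPv4Stack")}")
  }
}
```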

This does get into the game of telling people to set System Properties
where Java clearly says not to do that. Setting these properties
is something that the other hand of Java says to do if necessary.
So we are not introducing anything new.

Adopting this convention, at least for scaffolding or until
proven unwise, removes one blocker for epollcat IPv6 support.

I think that puts my examining the SN InetAddress & kin change,
my previous "top of the parabola" as the work-item at the
head of the queue.

Your thoughts?

@armanbilge
Owner Author

Thank you for sharing all your thoughts and summarizing the key points. I have been mulling on this issue as well.


This gives a clear "opt-out". I think that is essential.
It is up to application creators to decide if they want to
allow IPv6 in their apps.

I followed the discussion on Scala Native, and I absolutely agree with you on this.

following the Java model of "check the
two System Properties" and use IPv6 as underlying transport
if available and those properties allow.

Yes, this makes sense to me.

We do possibly have one other option at our disposal: a Cats Effect IOApp exposes an IORuntimeConfig that users can override with custom configurations.

https://github.com/typelevel/cats-effect/blob/3da61a59438da30f9bb01192fec9e690aacc3ce4/core/jvm/src/main/scala/cats/effect/IOApp.scala#L161-L165

https://github.com/typelevel/cats-effect/blob/3da61a59438da30f9bb01192fec9e690aacc3ce4/core/shared/src/main/scala/cats/effect/unsafe/IORuntimeConfig.scala#L22-L27

Indeed, I have been planning to introduce an EpollRuntimeConfig to expose the maxEvents parameter for user-configuration.

def apply(maxEvents: Int): (EventPollingExecutorScheduler, () => Unit) =

So in theory, we could also add a flag for enabling/disabling IPv6 to such a configuration object.

See also how the default IORuntimeConfig is actually sourced from System properties.
https://github.com/typelevel/cats-effect/blob/3da61a59438da30f9bb01192fec9e690aacc3ce4/core/jvm/src/main/scala/cats/effect/unsafe/IORuntimeConfigCompanionPlatform.scala#L26-L53
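
To make the idea concrete, here is a rough sketch of what such a configuration object could look like. Everything in it is hypothetical: the `enableIPv6` field, the `epollcat.maxEvents` property name, and the default of 64 are illustrative assumptions, not existing epollcat API. It needs Scala 2.13 or later for `toIntOption`.

```scala
// Hypothetical configuration object; neither the fields nor the property
// names below exist in epollcat today.
final case class EpollRuntimeConfig(maxEvents: Int, enableIPv6: Boolean)

object EpollRuntimeConfig {
  // Builds a config from a property map, similar in spirit to how the
  // default IORuntimeConfig is sourced from System properties.
  def fromProps(props: Map[String, String]): EpollRuntimeConfig = {
    val maxEvents = props
      .get("epollcat.maxEvents") // hypothetical property name
      .flatMap(_.toIntOption)
      .filter(_ > 0)
      .getOrElse(64) // illustrative default
    val ipv4Only =
      props.get("java.net.preferIPv4Stack").exists(_.equalsIgnoreCase("true"))
    EpollRuntimeConfig(maxEvents, enableIPv6 = !ipv4Only)
  }

  // Default sourced from the live System properties.
  def default: EpollRuntimeConfig = fromProps(sys.props.toMap)
}
```

Users who want to override would then `copy` the default, for example `EpollRuntimeConfig.default.copy(enableIPv6 = false)`.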

What are your thoughts about this? I could go either way: on the one hand, exposing the configuration like this is taking a step away from the JVM. On the other hand, it may be a better UX and is idiomatic for Cats Effect applications.


I do have one concern: we do not control the implementation of InetAddress.getByName and similar methods which live in Scala Native core.

  1. Are these methods also supposed to be following this configuration option?
  2. If yes, then we should determine a strategy so that users cannot get hosed when they disable IPv6 support in epollcat but getByName returns an IPv6 anyway.

val address = new InetSocketAddress(InetAddress.getByName("postman-echo.com"), 80)

@LeeTibbert
Collaborator

test("connect localhost".only) {
  val address = new InetSocketAddress(InetAddress.getByName("localhost"), 0)
  IOSocketChannel.open.use { ch =>
    for {
      _ <- ch.connect(address)
    } yield ()
  }
}

Throws the expected ConnectException:

epollcat.TcpDebugSuite:
==> X epollcat.TcpDebugSuite.connect localhost 0.01s java.net.ConnectException: Connection refused

Can you try that test or its equivalent on your system?

Separately, if that test fails, could you send me privately the results from:

# The $ indicates the shell command prompt.
# The first line shows the 'normal' translation.
# The second shows the 'normal' translation & precedence if more than one
# Third gives IPv4 records, which may not have highest precedence
# Fourth gives IPv6 records, ibid.
$ host localhost
$ dig localhost
$ dig localhost A
$ dig localhost AAAA

@armanbilge
Owner Author

armanbilge commented Sep 12, 2022

@LeeTibbert here is a test that demonstrates the issue (and shows off your brand new error messages! :)

test("connect localhost".only) {
  val localhost = new InetSocketAddress(InetAddress.getByName("localhost"), 8888)
  val `127.0.0.1` = new InetSocketAddress("127.0.0.1", 8888)

  IOServerSocketChannel
    .open
    .evalTap(_.bind(new InetSocketAddress("0.0.0.0", 8888)))
    .surround {
      for {
        result1 <- IOSocketChannel.open.use(_.connect(localhost)).attempt
        _ <- IO.println(result1)
        result2 <- IOSocketChannel.open.use(_.connect(`127.0.0.1`)).attempt
        _ <- IO.println(result2)
      } yield ()
    }
}

Which gets me this result:

sbt:root> testsNative/testOnly *.TcpSuite
[info] Starting process '/workspace/epollcat/tests/native/target/scala-2.13/tests-test-out' on port '38029'.
epollcat.TcpSuite:
Left(java.io.IOException: getaddrinfo: Address family for hostname not supported)
Right(())
  + connect localhost 0.01s
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1

You can see the two printlns: the first is a failure Left(IOException...). The second is a success Right(()).

If I run the same test on the JVM I get this:

sbt:root> testsJVM/testOnly *.TcpSuite
Right(())
Right(())
epollcat.TcpSuite:
  + connect localhost 0.07s
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1

@LeeTibbert
Collaborator

A rich discussion, perhaps a chance to move the stone a few millimeters uphill.
Let me answer a few "easy", where easy is a relative term, questions and
address the harder ones next spring.

Are these methods also supposed to be following this configuration option?

Yes, if they do not that is a blunder on my part and a bug. The properties
are checked once, at first use of a "Socket" method, which includes, IIRC,
InetAddress and possibly Inet4Address & Inet6Address.

Oops, context switch back to the original "localhost" concern.

One quick thought: your concern about "who does what when"
gave me the thought of possible ways to query SN for what
it is doing. epoll and SN javalib need to be in lockstep.
Only real way I know of doing that is to have one point
of Truth and have others reference it.

Something for me to think about.

More later (tomorrow-ish).

@armanbilge
Owner Author

armanbilge commented Sep 12, 2022

Yes, if they do not that is a blunder on my part and a bug. The properties
are checked once, at first use of a "Socket" method, which includes, IIRC,
InetAddress and possibly Inet4Address & Inet6Address.

Indeed this seems to be the case in the forthcoming SN 0.5.x. However, that is not the case in 0.4.x, and it seems that getByName can return an IPv6 address. So if we are stuck with that behavior, then we should be prepared to accommodate it in epollcat.
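
One way to accommodate it, sketched here only at the java.net level: derive the socket family from the concrete type of the address the user hands in, instead of assuming IPv4. `AddressFamily` and its members are hypothetical names, not epollcat code.

```scala
import java.net.{Inet4Address, Inet6Address, InetAddress}

object AddressFamily {
  sealed trait Family
  case object IPv4 extends Family
  case object IPv6 extends Family

  // Picks the family from the concrete InetAddress subclass, so an IPv6
  // result from getByName is never handed to an IPv4 socket.
  def of(addr: InetAddress): Family = addr match {
    case _: Inet4Address => IPv4
    case _: Inet6Address => IPv6
    case other =>
      throw new IllegalArgumentException(s"Unsupported address type: $other")
  }
}
```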

@LeeTibbert
Collaborator

I ran the tests both as me and as root. In both cases I got the Connect Timeout.
I rebased earlier today; let me rebase again, or start from a fresh directory
and see if I can at least replicate locally.

What does CI do with this test?

"localhost" is almost always IPv4. The equivalent name for IPv6 is "ip6-localhost",
which is not always defined; some people tailor it off.

Some people also define IPv6 AAAA records for localhost and then
set their bind server up to deliver AAAA records first.
getaddrinfo hints are just that: hints.

Adding the test, at least for debugging, to CI will
give us other data points, on "standard" systems.

@LeeTibbert
Collaborator

LeeTibbert commented Sep 12, 2022

Indeed this seems to be the case in the forthcoming SN 0.5.x. However, that is not the case in 0.4.x, and it seems that getByName can return an IPv6 address. So if we are stuck with that behavior, then we should be prepared to accommodate it in epollcat.

I will look at the 0.4.7 implementation tomorrow. If it is returning an IPv6 address on "standard" systems,
for a blatantly IPv4 name, that is a bug, and should be fixed, in 0.4.next and double checked in 0.5.0.
There is not much that epollcat can do, other than implement InetAddress (which is not out of the question)


I'll hang tight for a few minutes to see what CI says. You have me more curious than hungry.

@LeeTibbert
Collaborator

LeeTibbert commented Sep 12, 2022

CI seems All Green.

I can Discourse PM you my email (or you may have it already), or you
could post on Discourse PM the results of the "host" and "dig" command line. If you
do not have those on your system, there is "$nslookup localhost"

@armanbilge
Owner Author

Yes, CI is green but the println output corroborates my results:

localhost/0:0:0:0:0:0:0:1:8888
Left(java.io.IOException: getaddrinfo: Address family for hostname not supported)
Right(())
  + connect localhost 0.00s

https://github.com/armanbilge/epollcat/runs/8296491638?check_suite_focus=true#step:9:44

@LeeTibbert
Collaborator

LeeTibbert commented Sep 12, 2022

Rats! If I remove the ".attempt" from (a copy) of the test, will the Exception bubble through?
I may be having problems with interleaved I/O on my system.

At least now we see where the weight of the evidence lies.

@armanbilge
Owner Author

Yes, the .attempt captures the error in an Either without failing evaluation.

@LeeTibbert
Collaborator

When I add a single IO(printf) I now get what I would call expected behavior.

First : Right(()) which I think means the connect (localhost) succeeded (since it is a Right),
Second, I get a test 20 second timeout (sure glad we have these)

I would say that the second timeout is expected because the server socket never does an
accept of the first connection. The second is backed up in the listen queue behind it.
As we discovered, the SYN handshake is 127 seconds or so (other things being equal).
So the test timeout fires first.

Tomorrow I will start with a freshly re-based clone in an empty directory. I will also
check that none of the IPv6 stuff got into the 0.4.7 soup.

@LeeTibbert
Collaborator

The one CI log file I looked at gives two Rights. So both succeeded.
Let me track down which configuration I was looking at.

 Ubuntu
2022-09-12T00:30:16.1630425Z 20.04.5
2022-09-12T00:30:16.1630687Z LTS
localhost/127.0.0.1:8888
2022-09-12T00:31:13.9712554Z Right(())
2022-09-12T00:31:13.9732755Z Right(())
2022-09-12T00:31:13.9821330Z epollcat.TcpSuite:
2022-09-12T00:31:13.9822128Z   + HTTP echo 0.171s
2022-09-12T00:31:13.9823365Z   + server-client ping-pong 0.044s
2022-09-12T00:31:13.9823863Z   + local and remote addresses 0.006s
2022-09-12T00:31:13.9824291Z   + read after shutdownInput 0.004s
2022-09-12T00:31:13.9824669Z   + options 0.002s
2022-09-12T00:31:13.9825045Z   + ConnectException 0.054s
2022-09-12T00:31:13.9825474Z   + BindException - EADDRINUSE 0.003s
2022-09-12T00:31:13.9825915Z   + BindException - EADDRNOTAVAIL 0.002s
2022-09-12T00:31:13.9826345Z   + ClosedChannelException 0.003s
2022-09-12T00:31:13.9826788Z   + server socket read does not block 0.004s
2022-09-12T00:31:13.9827271Z   + IOServerSocketChannel.accept is cancelable 0.102s
2022-09-12T00:31:13.9828089Z   + connect localhost 0.012s
2022-09-12T00:31:14.0316816Z [info] Passed: Total 12, Failed 0, Errors 0, Passed 12
2022-09-12T00:31:14.0659799Z [success] Total time: 9 s, completed Sep 12, 2022, 12:31:14 AM

I check the log file that you posted that had the obvious failure.

@armanbilge
Owner Author

Yes, from the CI logs I looked at, I understand it to be working on the JVM and failing on Native.

@LeeTibbert
Collaborator

OK, I was distracted by JVM. I will focus on the Native. Too many shells moving around on the table.

@LeeTibbert
Collaborator

LeeTibbert commented Sep 12, 2022

Well, at least we got rid of the -9 the title mentions. A small victory for the day.

re: SN 0.4.n InetAddress returning IPv6 addresses for getByName()

Having studied the 0.4.n InetAddress code, I concur with your discovery.
IPv6 support looks like it was interrupted part way through. This looks
like a gift in time of that partial implementation.

I can create an Issue in Scala Native. The obvious fix is to change
AF_UNmumble to AF_INET and nail the door shut.

Off the top of my head, epollcat can do one of two things:

  1. If the head of the addrinfo list that gai returns is AF_INET6,
    chase the list looking for an AF_INET. If none found,
    then fail the lookup (IOException, something like "Host not found").
    Yes, this happens late in the chain, in connect() (and probably bind())
    but it means not having to privately implement InetAddress.

  2. Implement a private InetAddress, probably starting with
    a hard coded AF_INET. This would buy time until 0.4.x
    releases (6 months?) See license discussion in 2A below.

2a) Explore implementing an improved & simplified InetAddress.
The getByName() looks do-able (famous last words).
Doing this means copying SN code and making epollcat
subject to the SN license.
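
Option 1 above can be sketched at the java.net level. This is only an analogue: it walks a list of candidate addresses, such as `InetAddress.getAllByName(host)` returns, rather than a raw addrinfo chain. `PreferIPv4` and its methods are hypothetical names.

```scala
import java.io.IOException
import java.net.{Inet4Address, InetAddress}

object PreferIPv4 {
  // Walks the candidates in order and returns the first IPv4 one, if any.
  def firstIPv4(candidates: Seq[InetAddress]): Option[Inet4Address] =
    candidates.collectFirst { case a: Inet4Address => a }

  // Fails the lookup, as in option 1, when no IPv4 candidate exists.
  def firstIPv4OrFail(host: String, candidates: Seq[InetAddress]): Inet4Address =
    firstIPv4(candidates).getOrElse(
      throw new IOException(s"Host not found: $host")
    )
}
```

A caller would use it as `PreferIPv4.firstIPv4(InetAddress.getAllByName("localhost").toSeq)`.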

re: localhost

From the error messages, we had surmised that on SN, connect() was
getting an IPv6 address and giving it to an IPv4 socket:

Trying to guess what was going on in CI, I wrote a WIP (work in progress)
PR. JVM shows the expected 127.0.0.1. An IO(printf mumble) shows on SN:
InetAddress.getByName(localhost) is: |localhost/0:0:0:0:0:0:0:1|

Now this is astonishing! It is quite unusual, but not unheard of for
'localhost' to have an IPv6 address. Looks like somebody, or
more probably an OS distribution, configured the system that way.

However it came to be, we have two existence proofs that it does happen and that
epoll needs to deal with it.

Arman, if you concur, I think the next step is for me to modify
my WIP PR to see how many addrinfos gai is returning and
if any of them are IPv4. If yes, use it. JVM obviously sees
an IPv4 address.

In the general case, say connecting to www.ipv6-only.mumble, which
has no IPv4 address, there is not much we can do beyond looking
at the type of Exception thrown to ensure it makes sense vis-a-vis JVM.

The same situation probably shows up in bind(). I will also
check to see if there are any other uses of getaddrinfo().

What a thicket! Good you found this.

Have I earned my supper? Good night. Interesting ride, good conversation.

@armanbilge
Owner Author

Ha, suppertime indeed! Thanks for your persistence on this one 😅

Regarding your proposals:

  1. This sounds like an interesting solution, but would this actually work? I did not realize that if you give getaddrinfo an IPv6 address it will return a list that may (or may not) include an IPv4 address.

    By the time epollcat is receiving the socket address, it has already been resolved to an IP address by the user (most likely by getByName). So we cannot count on doing this resolution ourselves.

  2. If I understand you correctly, are you proposing that we shadow the "broken" InetSocketAddress from Scala Native with our own fixed implementation? This sounds fragile to me, if it would work at all. My working assumption is that anything implemented in Scala Native core we cannot override. So I don't think this is a good option.

Forgive me if I'm confused about the real problem here, but it seems to me there's an elephant in the room: can we not solve this by:

  1. Support IPv6 in epollcat.

So far the lack of IPv6 support has caused problems in fs2-io, http4s, and skunk. So I think it's time to bite the bullet and do what I should have done in the first place, had I not misunderstood the state of IPv6 in Scala Native 0.4.x.

@LeeTibbert
Collaborator

LeeTibbert commented Sep 12, 2022

This is a complex discussion and will take some back and forth.

These two items directly conflict.

Support IPv6 in epollcat.
(with the background assumption of 0.4.n)
&
. My working assumption is that anything implemented in Scala Native core we cannot override

epollcat is using a blend of SN implementation (InetAddress, etc) and its own, AsyncSocket, etc.
I am not sure that InetAddress can be overwritten and epollcat's variant used; I would have to try.
I think it can. Normally, I am strongly against duplication. Dante had a special trench in
the 8th circle of hell for those who promoted ecosystem fragmentation. I think it worth a few
days to see if InetAddress can be overwritten:

  1. first baselevel to provide only IPv4 addresses. This should fix this presenting case in
    a way that is not 'astonishing'. i.e. localhost is connected to via IPv4, just
    as on Java (with no system properties override).

    It makes epollcat more robust.

    The proper fix should/will be in SN 0.4.n. If this were the only
    concern I would say wait for a fix in 0.4.n.

  2. Second, or n-th, baselevel would provide SN 0.5.0-SNAPSHOT
    'defined IPv6 operation' to epollcat. That is, The name-to-address
    translation would be done same-as-Java, controlled by the
    two system properties. Right now the behavior of InetAddress
    is ruled by implementation details, a.k.a quirks and/or bugs.

    When epollcat no longer needed to support SN 0.4.n, this
    hypothetical private implementation can and should be removed.

    As mentioned, I would have to take a few days to see if I
    can successfully override InetAddress. I am actually
    thinking of a few simplifications/improvements which,
    if successful, I would ask to contribute to SN.

After the InetAddress IPv6 parts are reliable, I think the
changes to epollcat are probably a week or ten days.

A good part of that is writing tests. Trying to do the
epollcat parts without reliable SN (real or faked) IPv6
support is building on sand.

A second part of epollcat IPv6 support would be
ensuring lack of conflict with SN 0.5.0 "use-IPv4" and
"use IPv6-addresses" configuration (actually, looking at
the code last night, I think the latter is broken in SN, oops!)

So far the lack of IPv6 support has caused problems in fs2-io, http4s, and skunk

This is useful to know. I think it falls into the category of "bad news which is good news",
meaning, sorry for the pain, but it is motivating to know that people are actually trying
to use this stuff.


To synchronize, my plan had been to:

  1. look at the epollcat connect & bind code to see if
    there is anything epollcat can do. I suspect that by the time execution gets there we are
    well hosed by SN InetAddress.getByName.

  2. Submit change to my WIP exploratory PR to print out the contents of /etc/hosts to
    see if I can understand how "localhost" is getting an IPv6 address from the environment
    (and to confirm that it is getting the IPv4 address seen by JVM).

    I do not want to spend hours on this.

  3. Try private epoll implementation of InetAddress to see if I can override.

    Before anything could be published, we would have to ensure a lack of
    license entanglement. I should probably re-read (for the 20th time)
    the epollcat/typelevel license.
    Later: I re-read epollcat license. Straight Apache. I see no conflict
    but then I am not project "owner".

    Does epollcat presently inherit a SN or Scala License? I know
    that SN has recognized a license of some Arman Bilge code.

    I do not want to get too tangled up here but also do not want to blunder.

@LeeTibbert
Collaborator

I did not realize that if you give getaddrinfo an IPv6 address it will return a list that may (or may not) include an IPv4 address.

It all depends on what the nameserver returns and if the AI_V4MAPPED flag is set on the getaddrinfo call.

If the flag is set, all IPv4 addresses (127.0.0.1) should be mapped to IPv4MappedIPv6 addresses (::ffff:127.0.0.1 or its
binary equivalent).
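
The binary form of an IPv4-mapped IPv6 address is 10 zero bytes, then two 0xff bytes, then the four IPv4 bytes (written ::ffff:a.b.c.d). A small sketch of that layout, with `V4Mapped` as a hypothetical helper name:

```scala
object V4Mapped {
  // Builds the 16-byte IPv4-mapped IPv6 form ::ffff:a.b.c.d:
  // 10 zero bytes, then 0xff 0xff, then the 4 IPv4 bytes.
  def toMapped(ipv4: Array[Byte]): Array[Byte] = {
    require(ipv4.length == 4, "expected a 4-byte IPv4 address")
    val out = new Array[Byte](16)
    out(10) = 0xff.toByte
    out(11) = 0xff.toByte
    System.arraycopy(ipv4, 0, out, 12, 4)
    out
  }

  // Recognizes the mapped layout produced above.
  def isMapped(addr: Array[Byte]): Boolean =
    addr.length == 16 &&
      addr.take(10).forall(_ == 0) &&
      addr(10) == 0xff.toByte &&
      addr(11) == 0xff.toByte
}
```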

Forgive me if I get too concrete, geekish, or simple:
The nameservers are queried by the host, dig, nslookup commands I mentioned the other day. So called A records
indicate IPv4 addresses. AAAA records indicate IPv6 records. The name servers can be queried to return either or both.
The library code determines what it returns first, second, and later.

The Java System Property "java.net.preferIPv6Addresses" gives a hint to getaddrinfo to return AAAA records first
(since hints are hints, I am not sure if that means "only". Common practice is to assume yes, but that may be a common bug).
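
The "first, second, and later" ordering can be illustrated with a small sketch that reorders an already resolved list by family. It treats the preference as an ordering only, not as a filter, which is one reading of "hints are hints". `AddressOrder` is a hypothetical name.

```scala
import java.net.{Inet6Address, InetAddress}

object AddressOrder {
  // Stable reorder: the preferred family comes first, and the original
  // order is kept within each family. Nothing is dropped.
  def order(addrs: Seq[InetAddress], preferIPv6: Boolean): Seq[InetAddress] = {
    val (v6, v4) = addrs.partition(_.isInstanceOf[Inet6Address])
    if (preferIPv6) v6 ++ v4 else v4 ++ v6
  }
}
```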

The SN 0.4.7 (and prior) code specifies AF_UNSPEC (unspecified) in the hints it gives to gai. This lets gai
sort in whatever order it pleases.

Currently, on your system & the CI system(s) but not on my system, the environment is configured for
localhost to have an IPv6 address and for IPv6 addresses to be returned first. Both are not "the usual"
configuration but both are valid and, as the current issue shows, are used in the wild.

@LeeTibbert
Collaborator

We are trying to cover material that covers several semesters of
network courses, so there are lots of rabbit holes I am dancing over.
Speak out if I do not explain something well, or gloss over important details.

To ground the discussion, if for no one other than me:

  1. I think the current SN 0.4.n InetAddress is broken & the fix in SN 0.4.n is
    to ensure that only AF_INET (IPv4) addresses are returned.

  2. We want to study if anything can be done in epollcat to enable controlled
    IPv6 lookups.

  3. If Item 2 bears out, we want to adjust the rest of the epollcat socket implementation
    to use Java IPv6 semantics. As mentioned, I hacked this support in a long session,
    so it is do-able. The software engineering takes longer.

@armanbilge
Owner Author

epollcat is using a blend of SN implementation (InetAddress, etc) and its own

To be honest, I don't understand this at all. epollcat is barely using InetAddress except as a vessel for IP addresses. For example epollcat never calls the getByName method (except in the test suite). So what is the problem?

I apologize if I was unclear, but I do not think we should attempt to override the Scala Native implementation of InetAddress at all. At best it's playing with fire; at worst it does not work at all. Let's not spend time pursuing this strategy.


The nameservers are queried by the host, dig, nslookup commands I mentioned the other day. So called A records
indicate IPv4 addresses. AAAA records indicate IPv6 records. The name servers can be queried to return either or both.
The library code determines what it returns first, second, and later.

Another point I am confused about: where in its implementation is epollcat querying name servers? epollcat only works directly with IP addresses; never with hostnames. Therefore it should never be querying any name server, to the best of my knowledge.


In epollcat the two methods called with an address are connect and bind

def connect[A](
    remote: SocketAddress,
    attachment: A,
    handler: CompletionHandler[Void, _ >: A]
): Unit = {
  val addrinfo = stackalloc[Ptr[posix.netdb.addrinfo]]()
  val continue = Zone { implicit z =>
    val addr = remote.asInstanceOf[InetSocketAddress]

def bind(local: SocketAddress, backlog: Int): AsynchronousServerSocketChannel = {
  val addrinfo = stackalloc[Ptr[posix.netdb.addrinfo]]()
  Zone { implicit z =>
    val addr = local.asInstanceOf[InetSocketAddress]

In both cases the address is assumed to be already resolved to an IP address. Thus to me it seems the only way forward is to be able to handle both IPv4 and IPv6 inputs with help from AI_V4MAPPED.

@LeeTibbert
Collaborator

I apologize if I was unclear, but I do not think we should attempt to override the Scala Native implementation of InetAddress at
all. At best it's playing with fire; at worst it does not work at all. Let's not spend time pursuing this strategy.

Understood and agreed.

To be honest, I don't understand this at all. epollcat is barely using InetAddress except as a vessel for IP addresses.
For example epollcat never calls the getByName method (except in the test suite). So what is the problem?

The broken getByName method. If we can skip/ignore/avoid that a lot of things fall into place and become
possible.

@LeeTibbert
Collaborator

In both cases the address is assumed to be already resolved to an IP address.
Thus to me it seems the only way forward is to be able to handle both IPv4 and IPv6 inputs with help from AI_V4MAPPED.

Eventually (and a lot sooner if we punt InetAddress.getByName() concerns), that is true. AI_V4MAPPED is almost
always used/or'd with AI_ADDRCONFIG. That says, only map IPv4 addresses if there is an active, non-loopback, IPv6 interface configured.

The recent (2 decade) practice in C land is to call getaddrinfo with AF_UNSPEC, TCP protocol, and AI_V4MAPPED | AI_ADDRCONFIG (with perhaps more flags). Then allocate the socket for connection as determined by the ai_family
field in the addrinfo returned. The two match. I will skip over the topic of machines which have multiple interfaces
(multi-homed).

Java, and, I believe "epollcat" do "early allocation" of the os socket, depending on the "java.net.IPv4prefer"
and if any IPv6 non-loopback addresses are configured. That assumption is pretty baked-in for SN javalib.
It also worked in my rapid-prototyping epoll ipv6 work.

There may be a clever way to do C-style "allocate socket just-in-time" in epollcat. That could also be
a follow-on or a "week after never" concern.

With InetAddress.getByName concerns dispatched. The head-of-the-work-queue becomes
configuring the 'java.net.prefer*' options. For early work, that can be done by straight
fetch of the relevant System Properties on first use.

There would be a "due bill" to figure out if this could get out of sync with SN 0.5.0 practice.
It would be nice if SN provided a, probably non-javalib, way of querying its thoughts about
the two "prefer" settings. That means negotiating some SN API.

I think there is a relatively expensive javalib way to find the info.

I think the "due bill" could be a second pass, as long as it is not forgotten.

@LeeTibbert
Collaborator

Another point I am confused about: where in its implementation is epollcat querying name servers? epollcat only works directly with IP addresses; never with hostnames. Therefore it should never be querying any name server, to the best of my knowledge.

There are layers to "name server". getaddrinfo with AI_NUMERICHOST and/or AI_NUMERICSERV is used
to avoid trips to the "name server" proper. The C library layer just below getaddrinfo() is sometimes loosely called
the name server, since it converts numeric "names" to addrinfos. You are entirely right, it is not the "name server" proper,
sometimes known, for historical reasons, as the bind server.
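
The same distinction is visible at the java.net level. On the JVM, `InetAddress.getByName` given a literal IP address only checks the format and does not go out to a name server; whether Scala Native 0.4.x behaves identically is an assumption not verified here. `NumericLiterals` is a hypothetical name.

```scala
import java.net.{Inet4Address, Inet6Address, InetAddress}

object NumericLiterals {
  // With a literal IP address, the JVM only validates the format;
  // no lookup is performed.
  def parse(literal: String): InetAddress = InetAddress.getByName(literal)

  // True when the literal parsed to an IPv6 address.
  def isIPv6Literal(literal: String): Boolean =
    parse(literal).isInstanceOf[Inet6Address]
}
```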

@LeeTibbert
Collaborator

As a way to ground some of this discussion, I can create a WIP PR based on my earlier rapid-prototyping, our discussions
since then, and my learnings since then.

That gives something concrete.

@LeeTibbert
Collaborator

Before we pause this discussion and I switch to an IPv6 WIP, could you
send me any lines dealing with "localhost" in your /etc/hosts?
Thank you.

I think I will rest somewhat easier if I at least understand
the environment that provoked this failure. Perhaps
distros are doing something different these days.
(Note to self: check macOS M1).

I tried getting CI to print out its copy, but, rightly,
got my wrist tenderly slapped for editing ci.yml
directly.

@armanbilge
Owner Author

$ cat /etc/hosts 
# Kubernetes-managed hosts file.
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
127.0.0.1 armanbilge-epollcat-jqs9f0kwv9t

If you want to manually edit ci.yml you should make sure to remove these lines:

- name: Check that workflows are up to date
  run: sbt 'project ${{ matrix.project }}' '++${{ matrix.scala }}' 'project /' githubWorkflowCheck

@LeeTibbert
Collaborator

Thank you for hosts file. That explains how ::1 showed up for localhost.
I just checked Apple 12.5 and it appears to also define ::1 as an address for localhost.
Now to gnaw away in the dark hours why IPv6 is being returned first. Not my
biggest problem.
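
For checking other environments, a small sketch that lists every address a hosts file assigns to a name, in file order. `HostsFile` is a hypothetical helper and takes the file content as a string so it can be tried without touching /etc/hosts.

```scala
object HostsFile {
  // Returns every address listed for `name`, in file order.
  // Comments (anything after '#') and blank lines are ignored.
  def addressesFor(name: String, hostsContent: String): List[String] =
    hostsContent.linesIterator
      .map(_.takeWhile(_ != '#').trim)
      .filter(_.nonEmpty)
      .map(_.split("\\s+").toList)
      .collect { case addr :: names if names.contains(name) => addr }
      .toList
}
```

Applied to the hosts file posted above, this yields 127.0.0.1 followed by ::1 for localhost, so file order alone does not explain why IPv6 comes back first.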

Knowing the environment change brings me some relief.

re: remove "Check workflow".

Thank you. The inside of my left arm is filled with tattoos of Cats Effect IO notes, so I will
have to tattoo that on the back of my right hand. You have told me that before. Perhaps this
time it will stick.

Cheers

What do you think about me switching to buff up my IPv6 rapid prototyping into a WIP PR?

@armanbilge
Owner Author

armanbilge commented Sep 12, 2022

What do you think about me switching to buff up my IPv6 rapid prototyping into a WIP PR?

This would be fantastic! At this point I would say IPv6 is the last major issue. Once it is solved, and I confirm that CI is passing for the affected projects, I would like to publish official Scala Native releases for all the projects I've been working on (Cats Effect, FS2, http4s, Skunk, etc.).

@LeeTibbert
Collaborator

Nothing like a little pressure motivation, eh?

@armanbilge armanbilge mentioned this issue Sep 12, 2022
@armanbilge armanbilge linked a pull request Sep 13, 2022 that will close this issue