IPv6 -- getaddrinfo() and bind() ordering with V6ONLY
Recently I ran into an issue that took me a while to sort out: inconsistent behaviour across operating systems with IPv6 sockets (AF_INET6 [1]) when calling bind(2) after getting the results back from getaddrinfo(3).
A call to getaddrinfo() with the hints set to AF_UNSPEC in ai_family and AI_PASSIVE in ai_flags will return one or more results that we can bind() to. Sample code for that looks like this:

    // Fragment from inside main(); needs <stdio.h>, <string.h>,
    // <sys/types.h>, <sys/socket.h> and <netdb.h>
    struct addrinfo hints, *addrlist, *addr;
    memset(&hints, 0, sizeof(hints));

    // Ask for TCP
    hints.ai_socktype = SOCK_STREAM;

    // Any family works for us ...
    hints.ai_family = AF_UNSPEC;

    // Set some hints
    hints.ai_flags =
        AI_PASSIVE |    // We want to use this with bind
        AI_ADDRCONFIG;  // Only return IPv4 or IPv6 if they are configured

    int rv;
    if ((rv = getaddrinfo(0, "7020", &hints, &addrlist)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
        return 1;
    }

    // Use the list in *addrlist
    for (addr = addrlist; addr != 0; addr = addr->ai_next) {
        // use *addr as appropriate
    }

    // Clean up the memory from getaddrinfo()
    freeaddrinfo(addrlist);

On Linux, two entries are returned when the host it is run on has both IPv4 and IPv6 enabled: an AF_INET entry followed by an AF_INET6 entry. Nothing says that you are required to use all of the results that are returned, but if you want to listen on all address families it is of course suggested.
Following the steps below for each of the returned results should result in having 1 or more different sockets that are bound to a single port.
- Create the socket()
- Set any socket options you want (SO_REUSEADDR for example)
- Then bind() the socket
- After that call listen() (followed of course by accept() on the socket)
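
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the code from my test program: `listen_on` and `listen_all` are hypothetical helper names, the backlog of `SOMAXCONN` is an assumption, and the sketch follows the steps exactly as listed, so it does not yet set IPV6_V6ONLY.

```c
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: walks the four steps for one getaddrinfo() result.
 * Returns a listening descriptor, or -1 on failure. */
int listen_on(const struct addrinfo *ai)
{
    int yes = 1;

    /* Step 1: create the socket */
    int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    if (fd == -1) {
        return -1;
    }

    /* Step 2: socket options, step 3: bind(), step 4: listen() */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) == -1 ||
        bind(fd, ai->ai_addr, ai->ai_addrlen) == -1 ||
        listen(fd, SOMAXCONN) == -1) {
        int saved = errno; /* keep errno intact across close() */
        fprintf(stderr, "family %d: %s\n", ai->ai_family, strerror(saved));
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: tries every result for `port` and stores the
 * descriptors that ended up listening in fds. Returns how many there are. */
int listen_all(const char *port, int *fds, int max_fds)
{
    struct addrinfo hints, *list, *ai;
    int count = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_family = AF_UNSPEC;
    hints.ai_flags = AI_PASSIVE | AI_ADDRCONFIG;

    if (getaddrinfo(NULL, port, &hints, &list) != 0) {
        return 0;
    }
    for (ai = list; ai != NULL && count < max_fds; ai = ai->ai_next) {
        int fd = listen_on(ai);
        if (fd != -1) {
            fds[count++] = fd;
        }
    }
    freeaddrinfo(list);
    return count;
}
```

Calling `listen_all("7020", fds, 8)` should leave one listening socket per usable address family in `fds`; each one can then be passed to accept().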
Only, for some unknown reason (and errno is no help), bind() fails when you get to the AF_INET6 entry, which was returned second. Searching online for why the bind would fail doesn't give you any good results, and what is even worse is that if you run the same code on another platform such as FreeBSD, OpenIndiana or Mac OS X, no such failure occurs. However, I started suspecting something was up when I looked at the output from netstat -lan | grep 7020 on Mac OS X, where 7020 is the port I passed into getaddrinfo():

    tcp46      0      0  *.7020      *.*      LISTEN
    tcp4       0      0  *.7020      *.*      LISTEN

Wait a minute ... one of the sockets is on both IPv4 and IPv6. After some more time spent searching the internet I came across RFC 3493 section 5.3, which is titled "IPV6_V6ONLY option for AF_INET6 Sockets".
> As stated in section <3.7 Compatibility with IPv4 Nodes>, AF_INET6 sockets may be used for both IPv4 and IPv6 communications. Some applications may want to restrict their use of an AF_INET6 socket to IPv6 communications only.
This was going down the right route, so I changed my code: in the second of the steps listed above (setting socket options) I added the following if the socket's address family is AF_INET6:

    int yes = 1;
    if (setsockopt(sockfd, IPPROTO_IPV6, IPV6_V6ONLY, &yes, sizeof(yes)) == -1) {
        // Report before close(), which may overwrite errno
        fprintf(stderr, "setsockopt IPV6_V6ONLY: %s\n", strerror(errno));
        close(sockfd);
        continue;
    }

RFC 3493 section 5.3 also states that this option should be turned off by default, which means that by default all IPv6 sockets can also communicate over IPv4. Thus, technically, setting the option manually in code is the best way to fix the issue. FreeBSD has had this feature turned on (as in IPv6 sockets can only communicate with IPv6 and NOT IPv4) since 5.x.
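
If you are curious what a given system defaults to, one way to check is to read the option back from a freshly created socket with getsockopt(). The following is a minimal sketch under that assumption; `v6only_default` is a hypothetical helper name, not something from my test program.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if new AF_INET6 sockets default to
 * IPv6-only, 0 if they default to accepting IPv4 as well, and -1 if this
 * cannot be determined (for example when the host has no IPv6 support). */
int v6only_default(void)
{
    int value = 0;
    socklen_t len = sizeof(value);

    int fd = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd == -1) {
        return -1;
    }
    if (getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &value, &len) == -1) {
        close(fd);
        return -1;
    }
    close(fd);
    return value != 0;
}
```

Whatever this reports on one machine, the code should still set the option explicitly, since the value differs between platforms and can be changed by the administrator.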
The biggest issue is that the remaining operating systems (OS X and OpenIndiana) don't have the same behaviour as Linux, which makes troubleshooting this issue more difficult than it should be. The underlying problem is that the RFC doesn't specify what exactly the operating system should do when it encounters a request to bind to the same port on IPv4 and IPv6. The only place where I have found this documented is in "IPv6 Network Programming", under "Tips in IPv6 Programming", chapter 4, section 4, appropriately titled "bind(2) Ordering and Conflicts".
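
Since the behaviour is not specified, the quickest way to find out what a particular operating system does is to probe it. The sketch below is one way to do that, under a few assumptions: `probe_bind_conflict` is a hypothetical helper name, it uses an ephemeral port instead of 7020, and it deliberately leaves out SO_REUSEADDR to keep the experiment simple.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical probe: binds an AF_INET wildcard socket to an ephemeral port
 * and listens on it, then binds an AF_INET6 wildcard socket to the same port
 * with IPV6_V6ONLY set to `v6only`. Returns 0 if the second bind() works,
 * the errno of the failing bind() otherwise, or -1 if the probe itself
 * could not be set up (for example no IPv6 support). */
int probe_bind_conflict(int v6only)
{
    struct sockaddr_in a4;
    struct sockaddr_in6 a6;
    socklen_t len = sizeof(a4);
    int result = -1;

    int fd4 = socket(AF_INET, SOCK_STREAM, 0);
    int fd6 = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd4 == -1 || fd6 == -1) {
        goto out;
    }

    /* IPv4 first, as getaddrinfo() ordered the results on Linux */
    memset(&a4, 0, sizeof(a4));
    a4.sin_family = AF_INET;
    a4.sin_addr.s_addr = htonl(INADDR_ANY);
    a4.sin_port = 0; /* let the kernel pick a free port */
    if (bind(fd4, (struct sockaddr *)&a4, sizeof(a4)) == -1 ||
        listen(fd4, 1) == -1 ||
        getsockname(fd4, (struct sockaddr *)&a4, &len) == -1) {
        goto out;
    }

    if (setsockopt(fd6, IPPROTO_IPV6, IPV6_V6ONLY,
                   &v6only, sizeof(v6only)) == -1) {
        goto out;
    }

    /* Now IPv6 on the port the kernel picked for IPv4 */
    memset(&a6, 0, sizeof(a6));
    a6.sin6_family = AF_INET6;
    a6.sin6_addr = in6addr_any;
    a6.sin6_port = a4.sin_port; /* already in network byte order */
    result = bind(fd6, (struct sockaddr *)&a6, sizeof(a6)) == -1 ? errno : 0;

out:
    if (fd4 != -1) close(fd4);
    if (fd6 != -1) close(fd6);
    return result;
}
```

Comparing `probe_bind_conflict(0)` with `probe_bind_conflict(1)` on each platform shows whether the second bind() is rejected there when IPV6_V6ONLY is off, which is the failure I ran into on Linux.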
If you get a bind() error when attempting to bind an AF_INET6 socket, please make sure that you set the socket option IPV6_V6ONLY on the AF_INET6 socket. The default as required by RFC 3493 is to have that option be off. The default is wrong, and the RFC should have been more specific regarding what the right behaviour is when attempting to bind an AF_INET6 socket to a port that an AF_INET socket is already bound to while IPV6_V6ONLY is set to false.
The full code that I used for testing, along with a little bit more information, is available as a gist on GitHub.
[1] The old BSD-style socket() called for defines starting with PF_, such as PF_INET and PF_INET6, with the PF standing for protocol family. POSIX starts them with AF_ and calls them an address family. On almost every operating system PF_INET is the same as AF_INET. If the define doesn't exist you can always create it.