[tor-relays] inet_csk_bind_conflict

David Fifield david at bamsoftware.com
Sat Dec 10 15:23:04 UTC 2022


On Sat, Dec 10, 2022 at 09:59:14AM +0100, Anders Trier Olesen wrote:
> IP_BIND_ADDRESS_NO_PORT did not fix your somewhat similar problem in your
> Haproxy setup, because all the connections are to the same dst tuple <ip, port>
> (i.e 127.0.0.1:ExtORPort).
> The connect() system call is looking for a unique 5-tuple <protocol, srcip,
> srcport, dstip, dstport>. In the Haproxy setup, the only free variable is
> srcport <tcp, 127.0.0.1, srcport, 127.0.0.1, ExtORPort>, so toggling
> IP_BIND_ADDRESS_NO_PORT makes no difference.

No—that is what I thought too, at first, but experimentally it is not
the case. Removing the IP_BIND_ADDRESS_NO_PORT option from Haproxy and
*doing nothing else* is sufficient to resolve the problem. Haproxy ends
up binding to the same address it would have bound to with
IP_BIND_ADDRESS_NO_PORT, and there are the same number of 5-tuples to
the same endpoints, but the EADDRNOTAVAIL errors stop. It is
counterintuitive and unexpected, which is why I took the trouble to
write it up.

As I wrote at #40201, there are divergent code paths for connect in the
kernel when the port is already bound versus when it is not bound. It's
not as simple as filling in blanks in a 5-tuple in otherwise identical
code paths.
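
The two user-space sequences being compared can be sketched as follows.
This is a minimal illustration, not Haproxy's code and not a
reproduction of the EADDRNOTAVAIL failure: it assumes Linux, the helper
name connect_from is hypothetical, and the fallback value 24 for
IP_BIND_ADDRESS_NO_PORT is the Linux definition, used when the Python
socket module does not export the constant. It only shows that bind()
picks the source port in one case and connect() picks it in the other,
which is where the kernel code paths diverge.

```python
import socket
import sys

# Linux value from <linux/in.h>; used only if this Python build does not
# export the constant itself (assumption: we are running on Linux).
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(src_ip, dst, no_port):
    """Bind to (src_ip, 0), then connect to dst.

    Returns (socket, port reported after bind, port reported after connect).
    With no_port=True the source port is left for connect() to choose;
    with no_port=False bind() chooses it before the destination is known.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if no_port:
        s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
    s.bind((src_ip, 0))
    port_after_bind = s.getsockname()[1]
    s.connect(dst)
    port_after_connect = s.getsockname()[1]
    return s, port_after_bind, port_after_connect


if __name__ == "__main__":
    # Throwaway local listener standing in for an ExtORPort.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(8)
    modes = [False, True] if sys.platform.startswith("linux") else [False]
    for no_port in modes:
        s, after_bind, after_connect = connect_from(
            "127.0.0.1", listener.getsockname(), no_port)
        print("no_port=%s bind=%d connect=%d"
              % (no_port, after_bind, after_connect))
        s.close()
    listener.close()
```

Running many such connections in a loop against several local
listeners, once with no_port=True and once with no_port=False, is one
way to set up the comparison described above.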

Anyway, it is not true that all connections go to the same (IP, port).
(There would be no need to use a load balancer if that were the case.)
At the time, we were running 12 tor processes with 12 different
ExtORPorts (each ExtORPort on a different IP address, even: 127.0.3.1,
127.0.3.2, etc.). We started to have EADDRNOTAVAIL problems at around
3000 connections per ExtORPort, which is far too few to have exhausted
the 5-tuple space. Please check the discussion at #40201 again, because
I documented this detail there.
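
For a rough sense of scale, the arithmetic can be sketched as below.
The numbers are assumptions, not measurements from the affected server:
the sketch reads the local ephemeral port range from
/proc/sys/net/ipv4/ip_local_port_range and falls back to the usual
Linux default of 32768 to 60999 if that file is unavailable. With one
source address, each distinct (dstip, dstport) can in principle use
every port in that range.

```python
# Usual Linux default for net.ipv4.ip_local_port_range (assumption: the
# server in question may have been tuned differently).
DEFAULT_RANGE = (32768, 60999)


def local_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return (low, high) from the sysctl file, or the default range."""
    try:
        with open(path) as f:
            low, high = map(int, f.read().split())
            return low, high
    except (OSError, ValueError):
        return DEFAULT_RANGE


def ports_in_range(low, high):
    """Number of source ports available per (srcip, dstip, dstport)."""
    return high - low + 1


def fraction_used(connections, low, high):
    """Share of the per-destination source port space in use."""
    return connections / ports_in_range(low, high)


if __name__ == "__main__":
    low, high = local_port_range()
    print("ports per destination:", ports_in_range(low, high))
    print("share used at 3000 connections: %.1f%%"
          % (100 * fraction_used(3000, low, high)))
```

Under the default range, 3000 connections to one ExtORPort occupy
roughly a tenth of the source ports available for that destination.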

I urge you to run an experiment yourself, if these observations are not
what you expect. I was surprised, as well.

