tailscale.com Open in urlscan Pro
2600:9000:a51d:27c1:6748:d035:a989:fb3c  Public Scan

URL: https://tailscale.com/blog/how-nat-traversal-works
Submission: On October 18 via api from US — Scanned from DE

Form analysis 1 forms found in the DOM

POST https://forms.hsforms.com/submissions/v3/public/submit/formsnext/multipart/40004831/a3039c84-9a30-4b3d-9f68-5fff930107a7

<form id="hsForm_a3039c84-9a30-4b3d-9f68-5fff930107a7" method="POST" accept-charset="UTF-8" enctype="multipart/form-data" novalidate=""
  action="https://forms.hsforms.com/submissions/v3/public/submit/formsnext/multipart/40004831/a3039c84-9a30-4b3d-9f68-5fff930107a7"
  class="hs-form-private hsForm_a3039c84-9a30-4b3d-9f68-5fff930107a7 hs-form-a3039c84-9a30-4b3d-9f68-5fff930107a7 hs-form-a3039c84-9a30-4b3d-9f68-5fff930107a7_5806aea2-4c22-4e15-96bd-18b194df4464 theme-transparent hide-labels newsletter-new hs-form"
  target="target_iframe_a3039c84-9a30-4b3d-9f68-5fff930107a7" data-instance-id="5806aea2-4c22-4e15-96bd-18b194df4464" data-form-id="a3039c84-9a30-4b3d-9f68-5fff930107a7" data-portal-id="40004831"
  data-test-id="hsForm_a3039c84-9a30-4b3d-9f68-5fff930107a7">
  <div class="hs_email hs-email hs-fieldtype-text field hs-form-field"><label id="label-email-a3039c84-9a30-4b3d-9f68-5fff930107a7" class="" placeholder="Enter your Email" for="email-a3039c84-9a30-4b3d-9f68-5fff930107a7"><span>Email</span><span
        class="hs-form-required">*</span></label>
    <legend class="hs-field-desc" style="display: none;"></legend>
    <div class="input"><input id="email-a3039c84-9a30-4b3d-9f68-5fff930107a7" name="email" required="" placeholder="" type="email" class="hs-input" inputmode="email" autocomplete="email" value=""></div>
  </div>
  <div class="hs_submit hs-submit">
    <div class="hs-field-desc" style="display: none;"></div>
    <div class="actions"><input type="submit" class="hs-button primary large" value="Submit"></div>
  </div><input name="hs_context" type="hidden"
    value="{&quot;embedAtTimestamp&quot;:&quot;1729254072855&quot;,&quot;formDefinitionUpdatedAt&quot;:&quot;1720724745594&quot;,&quot;lang&quot;:&quot;en&quot;,&quot;embedType&quot;:&quot;REGULAR&quot;,&quot;disableCookieSubmission&quot;:&quot;true&quot;,&quot;renderRawHtml&quot;:&quot;true&quot;,&quot;userAgent&quot;:&quot;Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36&quot;,&quot;pageTitle&quot;:&quot;How NAT traversal works&quot;,&quot;pageUrl&quot;:&quot;https://tailscale.com/blog/how-nat-traversal-works&quot;,&quot;isHubSpotCmsGeneratedPage&quot;:false,&quot;path&quot;:&quot;/blog/how-nat-traversal-works&quot;,&quot;hutk&quot;:&quot;7673fbf6174dd1419b2a2f875c8e0438&quot;,&quot;__hsfp&quot;:90950173,&quot;__hssc&quot;:&quot;76893962.1.1729254072015&quot;,&quot;__hstc&quot;:&quot;76893962.7673fbf6174dd1419b2a2f875c8e0438.1729254072014.1729254072014.1729254072014.1&quot;,&quot;formTarget&quot;:&quot;#hubspot-form&quot;,&quot;rumScriptExecuteTime&quot;:1592.1000003814697,&quot;rumTotalRequestTime&quot;:1802.3000001907349,&quot;rumTotalRenderTime&quot;:1841.3000001907349,&quot;rumServiceResponseTime&quot;:210.19999980926514,&quot;rumFormRenderTime&quot;:39,&quot;connectionType&quot;:&quot;4g&quot;,&quot;firstContentfulPaint&quot;:0,&quot;largestContentfulPaint&quot;:0,&quot;locale&quot;:&quot;en&quot;,&quot;timestamp&quot;:1729254072970,&quot;originalEmbedContext&quot;:{&quot;portalId&quot;:&quot;40004831&quot;,&quot;formId&quot;:&quot;a3039c84-9a30-4b3d-9f68-5fff930107a7&quot;,&quot;region&quot;:&quot;na1&quot;,&quot;target&quot;:&quot;#hubspot-form&quot;,&quot;isBuilder&quot;:false,&quot;isTestPage&quot;:false,&quot;isPreview&quot;:false,&quot;cssClass&quot;:&quot;theme-transparent hide-labels newsletter-new&quot;,&quot;isMobileResponsive&quot;:true},&quot;correlationId&quot;:&quot;5806aea2-4c22-4e15-96bd-18b194df4464&quot;,&quot;renderedFieldsIds&quot;:[&quot;email&quot;],&quot;captchaStatus&quot;:&quot;NOT_APPLICABLE&quot;,&quot;emailResubscribeStatus&quot;:&quot;NOT_APPLICABLE&quot;,&quot;isInsideCrossOriginFrame&quot;:false,&quot;source&quot;:&quot;forms-embed-1.6227&quot;,&quot;sourceName&quot;:&quot;forms-embed&quot;,&quot;sourceVersion&quot;:&quot;1.6227&quot;,&quot;sourceVersionMajor&quot;:&quot;1&quot;,&quot;sourceVersionMinor&quot;:&quot;6227&quot;,&quot;allPageIds&quot;:{},&quot;_debug_embedLogLines&quot;:[{&quot;clientTimestamp&quot;:1729254072926,&quot;level&quot;:&quot;INFO&quot;,&quot;message&quot;:&quot;Retrieved customer callbacks used on embed context: [\&quot;onFormSubmit\&quot;]&quot;},{&quot;clientTimestamp&quot;:1729254072926,&quot;level&quot;:&quot;INFO&quot;,&quot;message&quot;:&quot;Retrieved pageContext values which may be overriden by the embed context: {\&quot;pageTitle\&quot;:\&quot;How NAT traversal works\&quot;,\&quot;pageUrl\&quot;:\&quot;https://tailscale.com/blog/how-nat-traversal-works\&quot;,\&quot;userAgent\&quot;:\&quot;Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36\&quot;,\&quot;isHubSpotCmsGeneratedPage\&quot;:false}&quot;},{&quot;clientTimestamp&quot;:1729254072928,&quot;level&quot;:&quot;INFO&quot;,&quot;message&quot;:&quot;Retrieved countryCode property from normalized embed definition response: \&quot;DE\&quot;&quot;},{&quot;clientTimestamp&quot;:1729254072932,&quot;level&quot;:&quot;INFO&quot;,&quot;message&quot;:&quot;Retrieved analytics values from API response which may be overriden by the embed context: {\&quot;hutk\&quot;:\&quot;7673fbf6174dd1419b2a2f875c8e0438\&quot;,\&quot;path\&quot;:\&quot;/blog/how-nat-traversal-works\&quot;}&quot;}]}"><iframe
    name="target_iframe_a3039c84-9a30-4b3d-9f68-5fff930107a7" style="display: none;"></iframe>
</form>

Text Content

Life on the edge: Networking challenges of AI deploymentsRead more
Product

Solutions

Enterprise

Customers

Docs

Blog

Pricing

DownloadLog in
Get started
Product
Meet Tailscale
 * How it works
 * Why Tailscale
 * WireGuard® for Enterprises
 * Bring Tailscale to Work

Explore
 * Integrations
 * Features
 * Compare Tailscale

Solutions
By use-case
 * Remote Access
 * Multi-Cloud Networking
 * Kubernetes Networking
 * Edge & IoT Deployments
 * Zero Trust Networking
 * AI Workloads
 * Secure SaaS
 * Business VPN
 * Homelab

By role
 * DevOps
 * IT
 * Security

Enterprise

Customers

Docs

Blog

Pricing

Download

Get started
Login
WireGuard is a registered trademark of Jason A. Donenfeld.
Terms of ServicePrivacy Policy

© 2024 Tailscale Inc. All rights reserved. Tailscale is a registered trademark
of Tailscale Inc.
Blog|August 21, 2020


HOW NAT TRAVERSAL WORKS

We covered a lot of ground in our post about How Tailscale Works. However, we
glossed over how we can get through NATs (Network Address Translators) and
connect your devices directly to each other, no matter what’s standing between
them. Let’s talk about that now!

Let’s start with a simple problem: establishing a peer-to-peer connection
between two machines. In Tailscale’s case, we want to set up a WireGuard®
tunnel, but that doesn’t really matter. The techniques we use are widely
applicable and the work of many people over decades. For example, WebRTC uses
this bag of tricks to send peer-to-peer audio, video and data between web
browsers. VoIP phones and some video games use similar techniques, though not
always successfully.

We’ll be discussing these techniques generically, using Tailscale and others for
examples where appropriate. Let’s say you’re making your own protocol and that
you want NAT traversal. You need two things.

First, the protocol should be based on UDP. You can do NAT traversal with TCP,
but it adds another layer of complexity to an already quite complex problem, and
may even require kernel customizations depending on how deep you want to go.
We’re going to focus on UDP for the rest of this article.

If you’re reaching for TCP because you want a stream-oriented connection when
the NAT traversal is done, consider using QUIC instead. It builds on top of UDP,
so we can focus on UDP for NAT traversal and still have a nice stream protocol
at the end.

Second, you need direct control over the network socket that’s sending and
receiving network packets. As a rule, you can’t take an existing network library
and make it traverse NATs, because you have to send and receive extra packets
that aren’t part of the “main” protocol you’re trying to speak. Some protocols
tightly integrate the NAT traversal with the rest (e.g. WebRTC). But if you’re
building your own, it’s helpful to think of NAT traversal as a separate entity
that shares a socket with your main protocol. Both run in parallel, one enabling
the other.

Direct socket access may be tough depending on your situation. One workaround is
to run a local proxy. Your protocol speaks to this proxy, and the proxy does
both NAT traversal and relaying of your packets to the peer. This layer of
indirection lets you benefit from NAT traversal without altering your original
program.

With prerequisites out of the way, let’s go through NAT traversal from first
principles. Our goal is to get UDP packets flowing bidirectionally between two
devices, so that our other protocol (WireGuard, QUIC, WebRTC, …) can do
something cool. There are two obstacles to having this Just Work: stateful
firewalls and NAT devices.


FIGURING OUT FIREWALLS

Stateful firewalls are the simpler of our two problems. In fact, most NAT
devices include a stateful firewall, so we need to solve this subset before we
can tackle NATs.

There are many incarnations to consider. Some you might recognize are the
Windows Defender firewall, Ubuntu’s ufw (using iptables/nftables), BSD’s pf
(also used by macOS) and AWS’s Security Groups. They’re all very configurable,
but the most common configuration allows all “outbound” connections and blocks
all “inbound” connections. There might be a few handpicked exceptions, such as
allowing inbound SSH.

But connections and “direction” are a figment of the protocol designer’s
imagination. On the wire, every connection ends up being bidirectional; it’s all
individual packets flying back and forth. How does the firewall know what’s
inbound and what’s outbound?

That’s where the stateful part comes in. Stateful firewalls remember what
packets they’ve seen in the past and can use that knowledge when deciding what
to do with new packets that show up.

For UDP, the rule is very simple: the firewall allows an inbound UDP packet if
it previously saw a matching outbound packet. For example, if our laptop
firewall sees a UDP packet leaving the laptop from 2.2.2.2:1234 to 7.7.7.7:5678,
it’ll make a note that incoming packets from 7.7.7.7:5678 to 2.2.2.2:1234 are
also fine. The trusted side of the world clearly intended to communicate with
7.7.7.7:5678, so we should let them talk back.

(As an aside, some very relaxed firewalls might allow traffic from anywhere back
to 2.2.2.2:1234 once 2.2.2.2:1234 has communicated with anyone. Such firewalls
make our traversal job easier, but are increasingly rare.)

FIREWALL FACE-OFF

This rule for UDP traffic is only a minor problem for us, as long as all the
firewalls on the path are “facing” the same way. That’s usually the case when
you’re communicating with a server on the internet. Our only constraint is that
the machine that’s behind the firewall must be the one initiating all
connections. Nothing can talk to it, unless it talks first.

This is fine, but not very interesting: we’ve reinvented client/server
communication, where the server makes itself easily reachable to clients. In the
VPN world, this leads to a hub-and-spoke topology: the hub has no firewalls
blocking access to it and the firewalled spokes connect to the hub.

The problems start when two of our “clients” want to talk directly. Now the
firewalls are facing each other. According to the rule we established above,
this means both sides must go first, but also that neither can go first, because
the other side has to go first!

How do we get around this? One way would be to require users to reconfigure one
or both of the firewalls to “open a port” and allow the other machine’s traffic.
This is not very user friendly. It also doesn’t scale to mesh networks like
Tailscale, in which we expect the peers to be moving around the internet with
some regularity. And, of course, in many cases you don’t have control over the
firewalls: you can’t reconfigure the router in your favorite coffee shop, or at
the airport. (At least, hopefully not!)

We need another option. One that doesn’t involve reconfiguring firewalls.

FINESSING FINICKY FIREWALLS

The trick is to carefully read the rule we established for our stateful
firewalls. For UDP, the rule is: packets must flow out before packets can flow
back in.

However, nothing says the packets must be related to each other beyond the IPs
and ports lining up correctly. As long as some packet flowed outwards with the
right source and destination, any packet that looks like a response will be
allowed back in, even if the other side never received your packet!

So, to traverse these multiple stateful firewalls, we need to share some
information to get underway: the peers have to know in advance the ip:port their
counterpart is using. One approach is to statically configure each peer by hand,
but this approach doesn’t scale very far. To move beyond that, we built a
coordination server to keep the ip:port information synchronized in a flexible,
secure manner.

Then, the peers start sending UDP packets to each other. They must expect some
of these packets to get lost, so they can’t carry any precious information
unless you’re prepared to retransmit them. This is generally true of UDP, but
especially true here. We’re going to lose some packets in this process.

Our laptop and workstation are now listening on fixed ports, so that they both
know exactly what ip:port to talk to. Let’s take a look at what happens.

The laptop’s first packet, from 2.2.2.2:1234 to 7.7.7.7:5678, goes through the
Windows Defender firewall and out to the internet. The corporate firewall on the
other end blocks the packet, since it has no record of 7.7.7.7:5678 ever talking
to 2.2.2.2:1234. However, Windows Defender now remembers that it should expect
and allow responses from 7.7.7.7:5678 to 2.2.2.2:1234.

Next, the workstation’s first packet from 7.7.7.7:5678 to 2.2.2.2:1234 goes
through the corporate firewall and across the internet. When it arrives at the
laptop, Windows Defender thinks “ah, a response to that outbound request I saw”,
and lets the packet through! Additionally, the corporate firewall now remembers
that it should expect responses from 2.2.2.2:1234 to 7.7.7.7:5678, and that
those packets are also okay.

Encouraged by the receipt of a packet from the workstation, the laptop sends
another packet back. It goes through the Windows Defender firewall, through the
corporate firewall (because it’s a “response” to a previously sent packet), and
arrives at the workstation.

Success! We’ve established two-way communication through a pair of firewalls
that, at first glance, would have prevented it.

CREATIVE CONNECTIVITY CAVEATS

It’s not always so easy. We’re relying on some indirect influence over
third-party systems, which requires careful handling. What do we need to keep in
mind when managing firewall-traversing connections?

Both endpoints must attempt communication at roughly the same time, so that all
the intermediate firewalls open up while both peers are still around. One
approach is to have the peers retry continuously, but this is wasteful. Wouldn’t
it be better if both peers knew to start establishing a connection at the same
time?

This may sound a little recursive: to communicate, first you need to be able to
communicate. However, this preexisting “side channel” doesn’t need to be very
fancy: it can have a few seconds of latency, and only needs to deliver a few
thousand bytes in total, so a tiny VM can easily be a matchmaker for thousands
of machines.

In the distant past, I used XMPP chat messages as the side channel, with great
results. As another example, WebRTC requires you to come up with your own
“signalling channel” (a name that reveals WebRTC’s IP telephony ancestry), and
plug it into the WebRTC APIs. In Tailscale, our coordination server and fleet of
DERP (Detour Encrypted Routing Protocol) servers act as our side channel.

Stateful firewalls have limited memory, meaning that we need periodic
communication to keep connections alive. If no packets are seen for a while (a
common value for UDP is 30 seconds), the firewall forgets about the session, and
we have to start over. To avoid this, we use a timer and must either send
packets regularly to reset the timers, or have some out-of-band way of
restarting the connection on demand.

On the plus side, one thing we don’t need to worry about is exactly how many
firewalls exist between our two peers. As long as they are stateful and allow
outbound connections, the simultaneous transmission technique will get through
any number of layers. That’s really nice, because it means we get to implement
the logic once, and it’ll work everywhere.

…Right?

Well, not quite. For this to work, our peers need to know in advance what
ip:port to use for their counterparts. This is where NATs come into play, and
ruin our fun.


THE NATURE OF NATS

We can think of NAT (Network Address Translator) devices as stateful firewalls
with one more really annoying feature: in addition to all the stateful
firewalling stuff, they also alter packets as they go through.

A NAT device is anything that does any kind of Network Address Translation, i.e.
altering the source or destination IP address or port. However, when talking
about connectivity problems and NAT traversal, all the problems come from Source
NAT, or SNAT for short. As you might expect, there is also DNAT (Destination
NAT), and it’s very useful but not relevant to NAT traversal.

The most common use of SNAT is to connect many devices to the internet, using
fewer IP addresses than the number of devices. In the case of consumer-grade
routers, we map all devices onto a single public-facing IP address. This is
desirable because it turns out that there are way more devices in the world that
want internet access, than IP addresses to give them (at least in IPv4 — we’ll
come to IPv6 in a little bit). NATs let us have many devices sharing a single IP
address, so despite the global shortage of IPv4 addresses, we can scale the
internet further with the addresses at hand.

NAVIGATING A NATTY NETWORK

Let’s look at what happens when your laptop is connected to your home Wi-Fi and
talks to a server on the internet.

Your laptop sends UDP packets from 192.168.0.20:1234 to 7.7.7.7:5678. This is
exactly the same as if the laptop had a public IP. But that won’t work on the
internet: 192.168.0.20 is a private IP address, which appears on many different
peoples’ private networks. The internet won’t know how to get responses back to
us.

Enter the home router. The laptop’s packets flow through the home router on
their way to the internet, and the router sees that this is a new session that
it’s never seen before.

It knows that 192.168.0.20 won’t fly on the internet, but it can work around
that: it picks some unused UDP port on its own public IP address — we’ll use
2.2.2.2:4242 — and creates a NAT mapping that establishes an equivalence:
192.168.0.20:1234 on the LAN side is the same as 2.2.2.2:4242 on the internet
side.

From now on, whenever it sees packets that match that mapping, it will rewrite
the IPs and ports in the packet appropriately.

Resuming our packet’s journey: the home router applies the NAT mapping it just
created, and sends the packet onwards to the internet. Only now, the packet is
from 2.2.2.2:4242, not 192.168.0.20:1234. It goes on to the server, which is
none the wiser. It’s communicating with 2.2.2.2:4242, like in our previous
examples sans NAT.

Responses from the server flow back the other way as you’d expect, with the home
router rewriting 2.2.2.2:4242 back to 192.168.0.20:1234. The laptop is also none
the wiser, from its perspective the internet magically figured out what to do
with its private IP address.

Our example here was with a home router, but the same principle applies on
corporate networks. The usual difference there is that the NAT layer consists of
multiple machines (for high availability or capacity reasons), and they can have
more than one public IP address, so that they have more public ip:port
combinations to choose from and can sustain more active clients at once.

Multiple NATs on a single layer allow for higher availability or capacity, but
function the same as a single NAT.

A STUDY IN STUN

We now have a problem that looks like our earlier scenario with the stateful
firewalls, but with NAT devices:

Our problem is that our two peers don’t know what the ip:port of their peer is.
Worse, strictly speaking there is no ip:port until the other peer sends packets,
since NAT mappings only get created when outbound traffic towards the internet
requires it. We’re back to our stateful firewall problem, only worse: both sides
have to speak first, but neither side knows to whom to speak, and can’t know
until the other side speaks first.

How do we break the deadlock? That’s where STUN comes in. STUN is both a set of
studies of the detailed behavior of NAT devices, and a protocol that aids in NAT
traversal. The main thing we care about for now is the network protocol.

STUN relies on a simple observation: when you talk to a server on the internet
from a NATed client, the server sees the public ip:port that your NAT device
created for you, not your LAN ip:port. So, the server can tell you what ip:port
it saw. That way, you know what traffic from your LAN ip:port looks like on the
internet, you can tell your peers about that mapping, and now they know where to
send packets! We’re back to our “simple” case of firewall traversal.

That’s fundamentally all that the STUN protocol is: your machine sends a “what’s
my endpoint from your point of view?” request to a STUN server, and the server
replies with “here’s the ip:port that I saw your UDP packet coming from.”

(The STUN protocol has a bunch more stuff in it — there’s a way of obfuscating
the ip:port in the response to stop really broken NATs from mangling the
packet’s payload, and a whole authentication mechanism that only really gets
used by TURN and ICE, sibling protocols to STUN that we’ll talk about in a bit.
We can ignore all of that stuff for address discovery.)

Incidentally, this is why we said in the introduction that, if you want to
implement this yourself, the NAT traversal logic and your main protocol have to
share a network socket. Each socket gets a different mapping on the NAT device,
so in order to discover your public ip:port, you have to send and receive STUN
packets from the socket that you intend to use for communication, otherwise
you’ll get a useless answer.

HOW THIS HELPS

Given STUN as a tool, it seems like we’re close to done. Each machine can do
STUN to discover the public-facing ip:port for its local socket, tell its peers
what that is, everyone does the firewall traversal stuff, and we’re all set…
Right?

Well, it’s a mixed bag. This’ll work in some cases, but not others. Generally
speaking, this’ll work with most home routers, and will fail with some corporate
NAT gateways. The probability of failure increases the more the NAT device’s
brochure mentions that it’s a security device. (NATs do not enhance security in
any meaningful way, but that’s a rant for another time.)

The problem is an assumption we made earlier: when the STUN server told us that
we’re 2.2.2.2:4242 from its perspective, we assumed that meant that we’re
2.2.2.2:4242 from the entire internet’s perspective, and that therefore anyone
can reach us by talking to 2.2.2.2:4242.

As it turns out, that’s not always true. Some NAT devices behave exactly in line
with our assumptions. Their stateful firewall component still wants to see
packets flowing in the right order, but we can reliably figure out the correct
ip:port to give to our peer and do our simultaneous transmission trick to get
through. Those NATs are great, and our combination of STUN and the simultaneous
packet sending will work fine with those.

(in theory, there are also NAT devices that are super relaxed, and don’t ship
with stateful firewall stuff at all. In those, you don’t even need simultaneous
transmission, the STUN request gives you an internet ip:port that anyone can
connect to with no further ceremony. If such devices do still exist, they’re
increasingly rare.)

Other NAT devices are more difficult, and create a completely different NAT
mapping for every different destination that you talk to. On such a device, if
we use the same socket to send to 5.5.5.5:1234 and 7.7.7.7:2345, we’ll end up
with two different ports on 2.2.2.2, one for each destination. If you use the
wrong port to talk back, you don’t get through.

NAMING OUR NATS

Now that we’ve discovered that not all NAT devices behave in the same way, we
should talk terminology. If you’ve done anything related to NAT traversal
before, you might have heard of “Full Cone”, “Restricted Cone”, “Port-Restricted
Cone” and “Symmetric” NATs. These are terms that come from early research into
NAT traversal.

That terminology is honestly quite confusing. I always look up what a Restricted
Cone NAT is supposed to be. Empirically, I’m not alone in this, because most of
the internet calls “easy” NATs Full Cone, when these days they’re much more
likely to be Port-Restricted Cone.

More recent research and RFCs have come up with a much better taxonomy. First of
all, they recognize that there are many more varying dimensions of behavior than
the single “cone” dimension of earlier research, so focusing on the cone-ness of
your NAT isn’t necessarily helpful. Second, they came up with words that more
plainly convey what the NAT is doing.

The “easy” and “hard” NATs above differ in a single dimension: whether or not
their NAT mappings depend on what the destination is. RFC 4787 calls the easy
variant “Endpoint-Independent Mapping” (EIM for short), and the hard variant
“Endpoint-Dependent Mapping” (EDM for short). There’s a subcategory of EDM that
specifies whether the mapping varies only on the destination IP, or on both the
destination IP and port. For NAT traversal, the distinction doesn’t matter. Both
kinds of EDM NATs are equally bad news for us.

In the grand tradition of naming things being hard, endpoint-independent NATs
still depend on an endpoint: each source ip:port gets a different mapping,
because otherwise your packets would get mixed up with someone else’s packets,
and that would be chaos. Strictly speaking, we should say “Destination Endpoint
Independent Mapping” (DEIM?), but that’s a mouthful, and since “Source Endpoint
Independent Mapping” would be another way to say “broken”, we don’t specify.
Endpoint always means “Destination Endpoint.”

You might be wondering how 2 kinds of endpoint dependence maps into 4 kinds of
cone-ness. The answer is that cone-ness encompasses two orthogonal dimensions of
NAT behavior. One is NAT mapping behavior, which we looked at above, and the
other is stateful firewall behavior. Like NAT mapping behavior, the firewalls
can be Endpoint-Independent or a couple of variants of Endpoint-Dependent. If
you throw all of these into a matrix, you can reconstruct the cone-ness of a NAT
from its more fundamental properties:

NAT CONE TYPES

Endpoint-Independent NAT mapping Endpoint-Dependent NAT mapping (all types)
Endpoint-Independent firewall Full Cone NAT N/A* Endpoint-Dependent firewall
(dest. IP only) Restricted Cone NAT N/A* Endpoint-Dependent firewall (dest.
IP+port) Port-Restricted Cone NAT Symmetric NAT

* can theoretically exist, but don't show up in the wild

Once broken down like this, we can see that cone-ness isn’t terribly useful to
us. The major distinction we care about is Symmetric versus anything else — in
other words, we care about whether a NAT device is EIM or EDM.

While it’s neat to know exactly how your firewall behaves, we don’t care from
the point of view of writing NAT traversal code. Our simultaneous transmission
trick will get through all three variants of firewalls. In the wild we’re
overwhelmingly dealing only with IP-and-port endpoint-dependent firewalls. So,
for practical code, we can simplify the table down to:

Endpoint-Independent NAT mapping Endpoint-Dependent NAT mapping (dest. IP only)
Firewall is yes Easy NAT Hard NAT

If you’d like to read more about the newer taxonomies of NATs, you can get the
full details in RFCs 4787 (NAT Behavioral Requirements for UDP), 5382 (for TCP)
and 5508 (for ICMP). And if you’re implementing a NAT device, these RFCs are
also your guide to what behaviors you should implement, to make them
well-behaved devices that play well with others and don’t generate complaints
about Halo multiplayer not working.

Back to our NAT traversal. We were doing well with STUN and firewall traversal,
but these hard NATs are a big problem. It only takes one of them in the whole
path to break our current traversal plans.

But wait, this post is titled “how NAT traversal works”, not “how NAT traversal
doesn’t work.” So presumably, I have a trick up my sleeve to get out of this,
right?

HAVE YOU CONSIDERED GIVING UP?

This is a good time to have the awkward part of our chat: what happens when we
empty our entire bag of tricks, and we still can’t get through? A lot of NAT
traversal code out there gives up and declares connectivity impossible. That’s
obviously not acceptable for us; Tailscale is nothing without the connectivity.

We could use a relay that both sides can talk to unimpeded, and have it shuffle
packets back and forth. But wait, isn’t that terrible?

Sort of. It’s certainly not as good as a direct connection, but if the relay is
“near enough” to the network path your direct connection would have taken, and
has enough bandwidth, the impact on your connection quality isn’t huge. There
will be a bit more latency, maybe less bandwidth. That’s still much better than
no connection at all, which is where we were heading.

And keep in mind that we only resort to this in cases where direct connections
fail. We can still establish direct connections through a lot of different
networks. Having relays to handle the long tail isn’t that bad.

Additionally, some networks can break our connectivity much more directly than
by having a difficult NAT. For example, we’ve observed that the UC Berkeley
guest Wi-Fi blocks all outbound UDP except for DNS traffic. No amount of clever
NAT tricks is going to get around the firewall eating your packets. So, we need
some kind of reliable fallback no matter what.

You could implement relays in a variety of ways. The classic way is a protocol
called TURN (Traversal Using Relays around NAT). We’ll skip the protocol
details, but the idea is that you authenticate yourself to a TURN server on the
internet, and it tells you “okay, I’ve allocated ip:port, and will relay packets
for you.” You tell your peer the TURN ip:port, and we’re back to a completely
trivial client/server communication scenario.

For Tailscale, we didn’t use TURN for our relays. It’s not a particularly
pleasant protocol to work with, and unlike STUN there’s no real interoperability
benefit since there are no open TURN servers on the internet.

Instead, we created DERP (Detoured Encrypted Routing Protocol), which is a
general purpose packet relaying protocol. It runs over HTTP, which is handy on
networks with strict outbound rules, and relays encrypted payloads based on the
destination’s public key.

As we briefly touched on earlier, we use this communication path both as a data
relay when NAT traversal fails (in the same role as TURN in other systems) and
as the side channel to help with NAT traversal. DERP is both our fallback of
last resort to get connectivity, and our helper to upgrade to a peer-to-peer
connection, when that’s possible.

Now that we have a relay, in addition to the traversal tricks we’ve discussed so
far, we’re in pretty good shape. We can’t get through everything but we can get
through quite a lot, and we have a backup for when we fail. If you stopped
reading now and implemented just the above, I’d estimate you could get a direct
connection over 90% of the time, and your relays guarantee some connectivity all
the time.


NAT NOTES FOR NERDS

But… If you’re not satisfied with “good enough”, there’s still a lot more we can
do! What follows is a somewhat miscellaneous set of tricks, which can help us
out in specific situations. None of them will solve NAT traversal by itself, but
by combining them judiciously, we can get incrementally closer to a 100% success
rate.

THE BENEFITS OF BIRTHDAYS

Let’s revisit our problem with hard NATs. The key issue is that the side with
the easy NAT doesn’t know what ip:port to send to on the hard side. But must
send to the right ip:port in order to open up its firewall to return traffic.
What can we do about that?

Well, we know some ip:port for the hard side, because we ran STUN. Let’s assume
that the IP address we got is correct. That’s not necessarily true, but let’s
run with the assumption for now. As it turns out, it’s mostly safe to assume
this. (If you’re curious why, see REQ-2 in RFC 4787.)

If the IP address is correct, our only unknown is the port. There’s 65,535
possibilities… Could we try all of them? At 100 packets/sec, that’s a worst case
of 10 minutes to find the right one. It’s better than nothing, but not great.
And it really looks like a port scan (because in fairness, it is), which may
anger network intrusion detection software.

We can do much better than that, with the help of the birthday paradox. Rather
than open 1 port on the hard side and have the easy side try 65,535
possibilities, let’s open, say, 256 ports on the hard side (by having 256
sockets sending to the easy side’s ip:port), and have the easy side probe target
ports at random.

I’ll spare you the detailed math, but you can check out the dinky python
calculator I made while working it out. The calculation is a very slight variant
on the “classic” birthday paradox, because it’s looking at collisions between
two sets containing distinct elements, rather than collisions within a single
set. Fortunately, the difference works out slightly in our favor! Here’s the
chances of a collision of open ports (i.e. successful communication), as the
number of random probes from the easy side increases, and assuming 256 ports on
the hard side:

Number of random probes Chance of success 174 50% 256 64% 1024 98% 2048 99.9%

If we stick with a fairly modest probing rate of 100 ports/sec, half the time
we’ll get through in under 2 seconds. And even if we get unlucky, 20 seconds in
we’re virtually guaranteed to have found a way in, after probing less than 4% of
the total search space.

That’s great! With this additional trick, one hard NAT in the path is an
annoying speedbump, but we can manage. What about two?

We can try to apply the same trick, but now the search is much harder: each
random destination port we probe through a hard NAT also results in a random
source port. That means we’re now looking for a collision on a {source port,
destination port} pair, rather than just the destination port.

Again I’ll spare you the calculations, but after 20 seconds in the same regime
as the previous setup (256 probes from one side, 2048 from the other), our
chance of success is… 0.01%.

This shouldn’t be surprising if you’ve studied the birthday paradox before. The
birthday paradox lets us convert N “effort” into something on the order of
sqrt(N). But we squared the size of the search space, so even the reduced amount
of effort is still a lot more effort. To hit a 99.9% chance of success, we need
each side to send 170,000 probes. At 100 packets/sec, that’s 28 minutes of
trying before we can communicate. 50% of the time we’ll succeed after “only”
54,000 packets, but that’s still 9 minutes of waiting around with no connection.
Still, that’s better than the 1.2 years it would take without the birthday
paradox.

In some applications, 28 minutes might still be worth it. Spend half an hour
brute-forcing your way through, then you can keep pinging to keep the open path
alive indefinitely — or at least until one of the NATs reboots and dumps all its
state, then you’re back to brute forcing. But it’s not looking good for any kind
of interactive connectivity.

Worse, if you look at common office routers, you’ll find that they have a
surprisingly low limit on active sessions. For example, a Juniper SRX 300 maxes
out at 64,000 active sessions. We’d consume its entire session table with our
one attempt to get through! And that’s assuming the router behaves gracefully
when overloaded. And this is all to get a single connection! What if we have 20
machines doing this behind the same router? Disaster.

Still, with this trick we can make it through a slightly harder network topology
than before. That’s a big deal, because home routers tend to be easy NATs, and
hard NATs tend to be office routers or cloud NAT gateways. That means this trick
buys us improved connectivity for the home-to-office and home-to-cloud
scenarios, as well as a few office-to-cloud and cloud-to-cloud scenarios.

PARTIALLY MANIPULATING PORT MAPS

Our hard NATs would be so much easier if we could ask the NATs to stop being
such jerks, and let more stuff through. Turns out, there’s a protocol for that!
Three of them, to be precise. Let’s talk about port mapping protocols.

The oldest is the UPnP IGD (Universal Plug’n’Play Internet Gateway Device)
protocol. It was born in the late 1990’s, and as such uses a lot of very 90’s
technology (XML, SOAP, multicast HTTP over UDP — yes, really) and is quite hard
to implement correctly and securely — but a lot of routers shipped with UPnP,
and a lot still do. If we strip away all the fluff, we find a very simple
request-response that all three of our port mapping protocols implement: “Hi,
please forward a WAN port to lan-ip:port,” and “okay, I’ve allocated wan-ip:port
for you.”

Speaking of stripping away the fluff: some years after UPnP IGD came out, Apple
launched a competing protocol called NAT-PMP (NAT Port Mapping Protocol). Unlike
UPnP, it only does port forwarding, and is extremely simple to implement, both
on clients and on NAT devices. A little bit after that, NAT-PMP v2 was reborn as
PCP (Port Control Protocol).

So, to help our connectivity further, we can look for UPnP IGD, NAT-PMP and PCP
on our local default gateway. If one of the protocols elicits a response, we
request a public port mapping. You can think of this as a sort of supercharged
STUN: in addition to discovering our public ip:port, we can instruct the NAT to
be friendlier to our peers, by not enforcing firewall rules for that port. Any
packet from anywhere that lands on our mapped port will make it back to us.

You can’t rely on these protocols being present. They might not be implemented
on your devices. They might be disabled by default and nobody knew to turn them
on. They might have been disabled by policy.

Disabling by policy is fairly common because UPnP suffered from a number of
high-profile vulnerabilities (since fixed, so newer devices can safely offer
UPnP, if implemented properly). Unfortunately, many devices come with a single
“UPnP” checkbox that actually toggles UPnP, NAT-PMP and PCP all at once, so
folks concerned about UPnP’s security end up disabling the perfectly safe
alternatives as well.

Still, when it’s available, it effectively makes one NAT vanish from the data
path, which usually makes connectivity trivial… But let’s look at the unusual
cases.

NEGOTIATING NUMEROUS NATS

So far, the topologies we’ve looked at have each client behind one NAT device,
with the two NATs facing each other. What happens if we build a “double NAT”, by
chaining two NATs in front of one of our machines?

What happens if we build a “double NAT”, by chaining two NATs in front of one of
our machines?

In this example, not much of interest happens. Packets from client A go through
two different layers of NAT on their way to the internet. But the outcome is the
same as it was with multiple layers of stateful firewalls: the extra layer is
invisible to everyone, and our other techniques will work fine regardless of how
many layers there are. All that matters is the behavior of the “last” layer
before the internet, because that’s the one that our peer has to find a way
through.

The big thing that breaks is our port mapping protocols. They act upon the layer
of NAT closest to the client, whereas the one we need to influence is the one
furthest away. You can still use the port mapping protocols, but you’ll get an
ip:port in the “middle” network, which your remote peer cannot reach.
Unfortunately, none of the protocols give you enough information to find the
“next NAT up” to repeat the process there, although you could try your luck with
a traceroute and some blind requests to the next few hops.

Breaking port mapping protocols is the reason why the internet is so full of
warnings about the evils of double-NAT, and how you should bend over backwards
to avoid them. But in fact, double-NAT is entirely invisible to most
internet-using applications, because most applications don’t try to do this kind
of explicit NAT traversal.

I’m definitely not saying that you should set up a double-NAT in your network.
Breaking the port mapping protocols will degrade multiplayer on many video
games, and will likely strip IPv6 from your network, which robs you of some very
good options for NAT-free connectivity. But, if circumstances beyond your
control force you into a double-NAT, and you can live with the downsides, most
things will still work fine.

Which is a good thing, because you know what circumstances beyond your control
force you to double-NAT? Let’s talk carrier-grade NAT.

CONCERNING CGNATS

Even with NATs to stretch the supply of IPv4 addresses, we’re still running out,
and ISPs can no longer afford to give one entire public IP address to every home
on their network. To work around this, ISPs apply SNAT recursively: your home
router SNATs your devices to an “intermediate” IP address, and further out in
the ISP’s network a second layer of NAT devices map those intermediate IPs onto
a smaller number of public IPs. This is “carrier-grade NAT”, or CGNAT for short.

How do we connect two peers who are behind the same CGNAT, but different home
NATs within?

Carrier-grade NAT is an important development for NAT traversal. Prior to CGNAT,
enterprising users could work around NAT traversal difficulties by manually
configuring port forwarding on their home routers. But you can’t reconfigure the
ISP’s CGNAT! Now even power users have to wrestle with the problems NATs pose.

The good news: this is a run of the mill double-NAT, and so as we covered above
it’s mostly okay. Some stuff won’t work as well as it could, but things work
well enough that ISPs can charge money for it. Aside from the port mapping
protocols, everything from our current bag of tricks works fine in a CGNAT
world.

We do have to overcome a new challenge, however: how do we connect two peers who
are behind the same CGNAT, but different home NATs within? That’s how we set up
peers A and B in the diagram above.

The problem here is that STUN doesn’t work the way we’d like. We’d like to find
out our ip:port on the “middle network”, because it’s effectively playing the
role of a miniature internet to our two peers. But STUN tells us what our
ip:port is from the STUN server’s point of view, and the STUN server is out on
the internet, beyond the CGNAT.

If you’re thinking that port mapping protocols can help us here, you’re right!
If either peer’s home NAT supports one of the port mapping protocols, we’re
happy, because we have an ip:port that behaves like an un-NATed server, and
connecting is trivial. Ironically, the fact that double NAT “breaks” the port
mapping protocols helps us! Of course, we still can’t count on these protocols
helping us out, doubly so because CGNAT ISPs tend to turn them off in the
equipment they put in homes in order to avoid software getting confused by the
“wrong” results they would get.

But what if we don’t get lucky, and can’t map ports on our NATs? Let’s go back
to our STUN-based technique and see what happens. Both peers are behind the same
CGNAT, so let’s say that STUN tells us that peer A is 2.2.2.2:1234, and peer B
is 2.2.2.2:5678.

The question is: what happens when peer A sends a packet to 2.2.2.2:5678? We
might hope that the following takes place in the CGNAT box:

 * Apply peer A’s NAT mapping, rewrite the packet to be from 2.2.2.2:1234 and to
   2.2.2.2:5678.

 * Notice that 2.2.2.2:5678 matches peer B’s incoming NAT mapping, rewrite the
   packet to be from 2.2.2.2:1234 and to peer B’s private IP.

 * Send the packet on to peer B, on the “internal” interface rather than off
   towards the internet.

This behavior of NATs is called hairpinning, and with all this dramatic buildup
you won’t be surprised to learn that hairpinning works on some NATs and not
others.

In fact, a great many otherwise well-behaved NAT devices don’t support
hairpinning, because they make assumptions like “a packet from my internal
network to a non-internal IP address will always flow outwards to the internet”,
and so end up dropping packets as they try to turn around within the router.
These assumptions might even be baked into routing silicon, where it’s
impossible to fix without new hardware.

Hairpinning, or lack thereof, is a trait of all NATs, not just CGNATs. In most
cases, it doesn’t matter, because you’d expect two LAN devices to talk directly
to each other rather than hairpin through their default gateway. And it’s a pity
that it usually doesn’t matter, because that’s probably why hairpinning is
commonly broken.

But once CGNAT is involved, hairpinning becomes vital to connectivity.
Hairpinning lets you apply the same tricks that you use for internet
connectivity, without worrying about whether you’re behind a CGNAT. If both
hairpinning and port mapping protocols fail, you’re stuck with relaying.

IDEALLY IPV6, NAT64 NOTWITHSTANDING

By this point I expect some of you are shouting at your screens that the
solution to all this nonsense is IPv6. All this is happening because we’re
running out of IPv4 addresses, and we keep piling on NATs to work around that. A
much simpler fix would be to not have an IP address shortage, and make every
device in the world reachable without NATs. Which is exactly what IPv6 gets us.

And you’re right! Sort of. It’s true that in an IPv6-only world, all of this
becomes much simpler. Not trivial, mind you, because we’re still stuck with
stateful firewalls. Your office workstation may have a globally reachable IPv6
address, but I’ll bet there’s still a corporate firewall enforcing “outbound
connections only” between you and the greater internet. And on-device firewalls
are still there, enforcing the same thing.

So, we still need the firewall traversal stuff from the start of the article,
and a side channel so that peers can know what ip:port to talk to. We’ll
probably also still want fallback relays that use a well-like protocol like
HTTP, to get out of networks that block outbound UDP. But we can get rid of
STUN, the birthday paradox trick, port mapping protocols, and all the
hairpinning bumf. That’s much nicer!

The big catch is that we currently don’t have an all-IPv6 world. We have a world
that’s mostly IPv4, and about 33% IPv6. Those 34% are very unevenly distributed,
so a particular set of peers could be 100% IPv6, 0% IPv6, or anywhere in
between.

What this means, unfortunately, is that IPv6 isn’t yet the solution to our
problems. For now, it’s just an extra tool in our connectivity toolbox. It’ll
work fantastically well with some pairs of peers, and not at all for others. If
we’re aiming for “connectivity no matter what”, we have to also do IPv4+NAT
stuff.

Meanwhile, the coexistence of IPv6 and IPv4 introduces yet another new scenario
we have to account for: NAT64 devices.

So far, the NATs we’ve looked at have been NAT44: they translate IPv4 addresses
on one side to different IPv4 addresses on the other side. NAT64, as you might
guess, translates between protocols. IPv6 on the internal side of the NAT
becomes IPv4 on the external side. Combined with DNS64 to translate IPv4 DNS
answers into IPv6, you can present an IPv6-only network to the end device, while
still giving access to the IPv4 internet.

(Incidentally, you can extend this naming scheme indefinitely. There have been
some experiments with NAT46; you could deploy NAT66 if you enjoy chaos; and some
RFCs use NAT444 for carrier-grade NAT.)

This works fine if you only deal in DNS names. If you connect to google.com,
turning that into an IP address involves the DNS64 apparatus, which lets the
NAT64 get involved without you being any the wiser.

But we care deeply about specific IPs and ports for our NAT and firewall
traversal. What about us? If we’re lucky, our device supports CLAT
(Customer-side translator — from Customer XLAT). CLAT makes the OS pretend that
it has direct IPv4 connectivity, using NAT64 behind the scenes to make it work
out. On CLAT devices, we don’t need to do anything special.

CLAT is very common on mobile devices, but very uncommon on desktops, laptops
and servers. On those, we have to explicitly do the work CLAT would have done:
detect the existence of a NAT64+DNS64 setup, and use it appropriately.

Detecting NAT64+DNS64 is easy: send a DNS request to ipv4only.arpa. That name
resolves to known, constant IPv4 addresses, and only IPv4 addresses. If you get
IPv6 addresses back, you know that a DNS64 did some translation to steer you to
a NAT64. That lets you figure out what the NAT64 prefix is.

From there, to talk to IPv4 addresses, send IPv6 packets to {NAT64 prefix + IPv4
address}. Similarly, if you receive traffic from {NAT64 prefix + IPv4 address},
that’s IPv4 traffic. Now speak STUN through the NAT64 to discover your public
ip:port on the NAT64, and you’re back to the classic NAT traversal problem —
albeit with a bit more work.

Fortunately for us, this is a fairly esoteric corner case. Most v6-only networks
today are mobile operators, and almost all phones support CLAT. ISPs running
v6-only networks deploy CLAT on the router they give you, and again you end up
none the wiser. But if you want to get those last few opportunities for
connectivity, you’ll have to explicitly support talking to v4-only peers from a
v6-only network as well.


INTEGRATING IT ALL WITH ICE

We’re in the home stretch. We’ve covered stateful firewalls, simple and advanced
NAT tricks, IPv4 and IPv6. So, implement all the above, and we’re done!

Except, how do you figure out which tricks to use for a particular peer? How do
you figure out if this is a simple stateful firewall problem, or if it’s time to
bust out the birthday paradox, or if you need to fiddle with NAT64 by hand? Or
maybe the two of you are on the same Wi-Fi network, with no firewalls and no
effort required.

Early research into NAT traversal had you precisely characterize the path
between you and your peer, and deploy a specific set of workarounds to defeat
that exact path. But as it turned out, network engineers and NAT box programmers
have many inventive ideas, and that stops scaling very quickly. We need
something that involves a bit less thinking on our part.

Enter the Interactive Connectivity Establishment (ICE) protocol. Like STUN and
TURN, ICE has its roots in the telephony world, and so the RFC is full of SIP
and SDP and signalling sessions and dialing and so forth. However, if you push
past that, it also specifies a stunningly elegant algorithm for figuring out the
best way to get a connection.

Ready? The algorithm is: try everything at once, and pick the best thing that
works. That’s it. Isn’t that amazing?

Let’s look at this algorithm in a bit more detail. We’re going to deviate from
the ICE spec here and there, so if you’re trying to implement an interoperable
ICE client, you should go read RFC 8445 and implement that. We’ll skip all the
telephony-oriented stuff to focus on the core logic, and suggest a few places
where you have more degrees of freedom than the ICE spec suggests.

To communicate with a peer, we start by gathering a list of candidate endpoints
for our local socket. A candidate is any ip:port that our peer might, perhaps,
be able to use in order to speak to us. We don’t need to be picky at this stage,
the list should include at least:

 * IPv6 ip:ports

 * IPv4 LAN ip:ports

 * IPv4 WAN ip:ports discovered by STUN (possibly via a NAT64 translator)

 * IPv4 WAN ip:port allocated by a port mapping protocol

 * Operator-provided endpoints (e.g. for statically configured port forwards)

Then, we swap candidate lists with our peer through the side channel, and start
sending probe packets at each others’ endpoints. Again, at this point you don’t
discriminate: if the peer provided you with 15 endpoints, you send “are you
there?” probes to all 15 of them.

These packets are pulling double duty. Their first function is to act as the
packets that open up the firewalls and pierce the NATs, like we’ve been doing
for this entire article. But the other is to act as a health check. We’re
exchanging (hopefully authenticated) “ping” and “pong” packets, to check if a
particular path works end to end.

Finally, after some time has passed, we pick the “best” (according to some
heuristic) candidate path that was observed to work, and we’re done.

The beauty of this algorithm is that if your heuristic is right, you’ll always
get an optimal answer. ICE has you score your candidates ahead of time (usually:
LAN > WAN > WAN+NAT), but it doesn’t have to be that way. Starting with
v0.100.0, Tailscale switched from a hardcoded preference order to round-trip
latency, which tends to result in the same LAN > WAN > WAN+NAT ordering. But
unlike static ordering, we discover which “category” a path falls into
organically, rather than having to guess ahead of time.

The ICE spec structures the protocol as a “probe phase” followed by an “okay
let’s communicate” phase, but there’s no reason you need to strictly order them.
In Tailscale, we upgrade connections on the fly as we discover better paths, and
all connections start out with DERP preselected. That means you can use the
connection immediately through the fallback path, while path discovery runs in
parallel. Usually, after a few seconds, we’ll have found a better path, and your
connection transparently upgrades to it.

One thing to be wary of is asymmetric paths. ICE goes to some effort to ensure
that both peers have picked the same network path, so that there’s definite
bidirectional packet flow to keep all the NATs and firewalls open. You don’t
have to go to the same effort, but you do have to ensure that there’s
bidirectional traffic along all paths you’re using. That can be as simple as
continuing to send ping/pong probes periodically.

To be really robust, you also need to detect that your currently selected path
has failed (say, because maintenance caused your NAT’s state to get dumped on
the floor), and downgrade to another path. You can do this by continuing to
probe all possible paths and keep a set of “warm” fallbacks ready to go, but
downgrades are rare enough that it’s probably more efficient to fall all the way
back to your relay of last resort, then restart path discovery.

Finally, we should mention security. Throughout this article, I’ve assumed that
the “upper layer” protocol you’ll be running over this connection brings its own
security (QUIC has TLS certs, WireGuard has its own public keys…). If that’s not
the case, you absolutely need to bring your own. Once you’re dynamically
switching paths at runtime, IP-based security becomes meaningless (not that it
was worth much in the first place), and you must have at least end-to-end
authentication.

If you have security for your upper layer, strictly speaking it’s okay if your
ping/pong probes are spoofable. The worst that can happen is that an attacker
can persuade you to relay your traffic through them. In the presence of e2e
security, that’s not a huge deal (although obviously it depends on your threat
model). But for good measure, you might as well authenticate and encrypt the
path discovery packets as well. Consult your local application security engineer
for how to do that safely.


CONCLUDING OUR CONNECTIVITY CHAT

At last, we’re done. If you implement all of the above, you’ll have state of the
art NAT traversal software that can get direct connections established in the
widest possible array of situations. And you’ll have your relay network to pick
up the slack when traversal fails, as it likely will for a long tail of cases.

This is all quite complicated! It’s one of those problems that’s fun to explore,
but quite fiddly to get right, especially if you start chasing the long tail of
opportunities for just that little bit more connectivity.

The good news is that, once you’ve done it, you have something of a superpower:
you get to explore the exciting and relatively under-explored world of
peer-to-peer applications. So many interesting ideas for decentralized software
fall at the first hurdle, when it turns out that talking to each other on the
internet is harder than expected. But now you know how to get past that, so go
build cool stuff!

Here’s a parting “TL;DR” recap: For robust NAT traversal, you need the following
ingredients:

 * A UDP-based protocol to augment

 * Direct access to a socket in your program

 * A communication side channel with your peers

 * A couple of STUN servers

 * A network of fallback relays (optional, but highly recommended)

Then, you need to:

 * Enumerate all the ip:ports for your socket on your directly connected
   interfaces

 * Query STUN servers to discover WAN ip:ports and the “difficulty” of your NAT,
   if any

 * Try using the port mapping protocols to find more WAN ip:ports

 * Check for NAT64 and discover a WAN ip:port through that as well, if
   applicable

 * Exchange all those ip:ports with your peer through your side channel, along
   with some cryptographic keys to secure everything.

 * Begin communicating with your peer through fallback relays (optional, for
   quick connection establishment)

 * Probe all of your peer’s ip:ports for connectivity and if necessary/desired,
   also execute birthday attacks to get through harder NATs

 * As you discover connectivity paths that are better than the one you’re
   currently using, transparently upgrade away from the previous paths.

 * If the active path stops working, downgrade as needed to maintain
   connectivity.

 * Make sure everything is encrypted and authenticated end-to-end.

Share





Author

David Anderson

Author

David Anderson
Share





MORE ARTICLES

productOct 16, 2024

GOOGLE WORKSPACE ACCESS PROVISIONING IS NOW GENERALLY AVAILABLE

With Google Workspace integration, Tailscale takes another big step towards
simplifying network administration by working seamlessly with one of the most
widely used productivity suites in the world.

Tinku Thomas
insightsOct 15, 2024

KUBERNETES, DIRECT CONNECTIONS, AND YOU

When Tailscale usage starts to ramp up within an organization, bandwidth and
latency issues from relayed connections can start to be noticeable to users.
Solutions engineer Lee Briggs outlines how to minimize latency with direct
connections in Kubernetes environments.

Lee Briggs
insightsSep 30, 2024

GETTING STARTED WITH TAILSCALE ACCESS CONTROL LISTS

When you want to fine-tune the rules of who has access to what inside your
Tailscale network, the solution is our Access Control Lists, or ACLs. ACLs are a
powerful way to manage access to the resources on your tailnet. This is a quick
getting started guide for those of you who are ACL curious.

Alex Kretzschmar


SUBSCRIBE TO TAILSCALE’S BLOG

We have a deep commitment to keeping your data safe.

Email*


Too much email?RSSTwitter


TRY TAILSCALE FOR FREE

Get started

Schedule a demo
Contact sales



Product

How it worksPricingIntegrationsFeaturesCompare Tailscale

Use Cases

Business VPNRemote AccessSite-to-Site NetworkingHomelabEnterprise

Resources

BlogEvents & Webinars

Company

CompanyCareersPress

Help & Support

SupportSalesSecurityLegalOpen SourceChangelog

Learn

SSH keysDocker SSHDevSecOpsMulticloudNAT TraversalMagicDNSPAMPoLPAll articles
Terms of ServicePrivacy Policy
WireGuard is a registered trademark of Jason A. Donenfeld.

© 2024 Tailscale Inc. All rights reserved. Tailscale is a registered trademark
of Tailscale Inc.