Tuesday, November 6, 2012

Multi-site private IPv6 networking using ULA and IPSEC

Wait, what?! Why?

So here is the situation: I have a home network behind a Netgear WNDR3800 router running OpenWRT, and I rent a remote server on which I run Xen with several VMs on a virtual backend network. Both sites have full IPv6 connectivity; all backend systems have a global IPv6 address, and although they are free to communicate with the entire (IPv6) world, I do have basic firewalling in place to allow new connections to some internal IPv6 hosts running OpenSSH only.

There also is the usual IPv4 NAT (NAT44) story on both backend networks, but this post is not about IPv4.

What I want is this: I want systems on both backend networks to be able to openly talk to each other over IPv6, yet in a secure way. In other words: to internal systems, I want a completely open and private IPv6 network.

Is that even possible?

Well, yes, it is! Here is how.

IPv6 Unique Local Addresses (ULA)

Even though IPv6 prefixes are usually stable, I do not want my internal addressing to depend on that if I ever switch providers. Fortunately, IPv6 was built from the ground up on the concept that an interface can have multiple IPv6 addresses; two of them are the normal Link Local address (in fe80::/64) and the Global address (in 2000::/3), but it is possible to add more.

One of the possibilities that IPv6 offers is Unique Local Addresses; these are addresses in fd00::/8 (and fc00::/8, though that should not be used until there is a global registry) that one can use in the same way as one would have used the IPv4 private address space (10.0.0.0/8, 192.168.0.0/16, and friends). You can randomly generate a /48 in fd00::/8 by choosing 40 more bits, e.g. by running noise through sha256sum or something similar. Within this /48, you can create as many subdivisions as you want, though it is customary to create /64s, so that IPv6 autoconfiguration works on your clients.

The networks you create in fd00::/8 should not be routed outside your internal network, and nobody will willingly route these prefixes from the external world to you. However, internally, and between sites, you can route them any way you want.

In the remainder of this post, I will describe how to set up ULA on both sites, how to connect both sites, and how to then make the inter-site connection secure.

Choosing a ULA and setting it up on both sites

The relevant RFC (RFC 4193) suggests that you generate 40 random bits using any sufficiently random method, e.g. by running some data from /dev/urandom through sha256sum, and copying the first 10 hex digits. Let us, for the sake of simplicity, assume the following ULA:

fd12:3456:789a::/48

This is of course very non-random, and you should not use it yourself, but it makes this post a bit easier to read.
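
For a real prefix, the recipe above could look roughly like the sketch below. The function name generate_ula_prefix is made up for this example, and it assumes head, sha256sum and cut are available (they are on a typical Linux box):

```shell
#!/bin/sh
# Sketch: build a random ULA /48 from 40 bits of hashed noise.
# generate_ula_prefix is a hypothetical helper name, not an existing tool.
generate_ula_prefix() {
    # 10 hex digits = 40 random bits, taken from a SHA-256 over kernel randomness
    hex=$(head -c 16 /dev/urandom | sha256sum | cut -c1-10)
    # Lay the digits out as fdXX:XXXX:XXXX::/48
    printf 'fd%s:%s:%s::/48\n' \
        "$(printf '%s' "$hex" | cut -c1-2)" \
        "$(printf '%s' "$hex" | cut -c3-6)" \
        "$(printf '%s' "$hex" | cut -c7-10)"
}

generate_ula_prefix
```

Run it once, write the result down, and use that prefix on every site from then on.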

Now that we have our ULA, we could pause for a second, and appreciate how unimaginably large the network that we just created is. This is a /48 prefix, which means that we have 128 - 48 = 80 bits of address space all to ourselves! It is customary to divide the space into 65,536 /64 networks, each of which can hold ~2^64 unique addresses. Now that is a large number: 2^64 is about 2 * 10^19. That means that if we buy 20 million 1 TByte hard drives, we could assign a unique IPv6 address to each byte on each drive! And within our ULA, we could have 65,536 stacks of 20 million 1 TByte drives :-).

Anyway, in practice we will have fewer devices. Let's say that we choose obvious yet simple networks for both sites:

  • Site 1: Network fd12:3456:789a:1::/64 .
  • Site 2: Network fd12:3456:789a:2::/64 .
We will also need a network to connect both sites, but I will get to that later.

Setting up ULA on both sites consists of setting a (preferably simple) address for the router, and announcing the network prefix to other machines on the site network.

Setting up ULA on site 1

In my case, site 1 has a router running OpenWRT 10.03. Since the internal interface already has a static address for the Globally routable network (also a /64), and the LuCI web interface on the router does not allow me to add multiple IPv6 addresses on the lan interface, I define an alias in /etc/config/network:

config 'alias' 'lanula'
        option 'interface' 'lan'
        option 'proto' 'static'
        option 'ip6addr' 'fd12:3456:789a:1::1/64'

This will give the router the first available (::1) address within site 1's network. I then need to tell radvd to start announcing the prefix. That is done by editing /etc/config/radvd:

config 'prefix'
        option 'interface' 'lan'
        option 'AdvOnLink' '1'
        option 'AdvAutonomous' '1'
        list 'prefix' '2***:****:****::/64 fd12:3456:789a:1::/64'
        option 'ignore' '0'

Here, the starred-out 2***:****:****::/64 is my actual global prefix. Just add the ULA prefix on that same line. After rebooting the router, your clients will automatically obtain an address in both the global prefix and the ULA prefix. In fact, if you use IPv6 privacy extensions (Linux does, usually), you will even get a temporary IPv6 address in both networks.

At this point, it is a good idea to ensure that you can ping6 fd12:3456:789a:1::1 from a client.
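
A quick way to see whether a client picked up a ULA address is to filter the output of ip -6 addr show. This is only a sketch: ula_addrs is a hypothetical helper, and the interface name eth0 in the usage comment is an assumption about your client.

```shell
# Sketch: list the ULA addresses (fc00::/7) in `ip -6 addr show` output.
# ula_addrs is a hypothetical helper; it only filters text read from stdin.
ula_addrs() {
    # The first group of a ULA address always starts with fc or fd
    awk '$1 == "inet6" && $2 ~ /^f[cd][0-9a-f][0-9a-f]:/ { print $2 }'
}

# Usage on a client (the interface name eth0 is an assumption):
#   ip -6 addr show dev eth0 | ula_addrs
#   ping6 -c 3 fd12:3456:789a:1::1
```

With privacy extensions enabled you would expect two lines of output: the stable address and the temporary one.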

Setting up ULA on Site 2

In my case, the "router" on Site 2 is the dom0 domain of a Xen box that runs the other backend machines as domU domains. It too, already has full IPv6 connectivity; my server hoster routes a /64 to my dom0, which I then distribute to my domUs using radvd.

The dom0 in question is a vanilla Ubuntu Server release, so I can configure the interfaces in /etc/network/interfaces. However, since I can add only one IPv6 address (in addition to the link-local address) in there, I have to use the up/down logic to assign the ULA address.

iface ibr0 inet6 static
  address 2###:####:####:####::1
  netmask 64
  up /sbin/ifconfig ibr0 inet6 add fd12:3456:789a:2::1/64
  down /sbin/ifconfig ibr0 inet6 del fd12:3456:789a:2::1/64

Here, ibr0 is the internal backend bridge to which all my domUs are connected, and the hashed-out 2###:####:####:####::/64 is my Global address on the interface.

As on site 1, I have to configure radvd to advertise the prefix. To this end, I edit /etc/radvd.conf to include:

interface ibr0 { 
        AdvSendAdvert on;
        MinRtrAdvInterval 3; 
        MaxRtrAdvInterval 10;
        prefix 2###:####:####:####::/64 { 
                AdvOnLink on; 
                AdvAutonomous on; 
                AdvRouterAddr on; 
        };
        prefix fd12:3456:789a:2::/64 { 
                AdvOnLink on; 
                AdvAutonomous on; 
                AdvRouterAddr off; 
        };
};

At this point, both sites have a functioning network in the ULA range. Please do check that you can ping the router from client machines, as this is essential to getting the rest to work.

What does not yet work is the connection between both sites; more on that later, but there is something else that needs to be taken care of: on both sites, firewall rules should be set up to neither send nor receive any ULA addresses on their external IPv6 interface; block the full fc00::/7 both coming in and going out. If we do not do this, a machine on site 1 trying to ping a machine on site 2 realizes that site 2 is outside its /64, and the router will try to route the message onto the public IPv6 net.
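
On a plain ip6tables setup, such a block could be sketched as follows. block_ula_on_wan is a hypothetical helper name, the interface you pass in is whatever your external IPv6 interface is called, and on OpenWRT the equivalent rules would normally live in the router's own firewall configuration rather than in a loose script.

```shell
# Sketch: keep ULA traffic (fc00::/7) off the external interface.
# block_ula_on_wan is a hypothetical helper; $1 is the external interface.
block_ula_on_wan() {
    wan="$1"
    for chain in INPUT FORWARD; do
        # Nothing with a ULA source or destination may come in from outside
        ip6tables -I "$chain" -i "$wan" -s fc00::/7 -j DROP
        ip6tables -I "$chain" -i "$wan" -d fc00::/7 -j DROP
    done
    for chain in OUTPUT FORWARD; do
        # Nothing with a ULA source or destination may leak out
        ip6tables -I "$chain" -o "$wan" -s fc00::/7 -j DROP
        ip6tables -I "$chain" -o "$wan" -d fc00::/7 -j DROP
    done
}

# Usage, as root (eth0 is an assumption):
#   block_ula_on_wan eth0
```

The rules are inserted (-I) rather than appended so that they are evaluated before any existing accept rules.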

Connecting both sites

In order to connect both sites, really any mechanism that allows for sending IPv6 will do: one could set up an OpenVPN tunnel (with tap devices, as you need to be able to set IPv6 addresses on the interfaces), an ipv6-in-ipv4 tunnel, etc. In this case, though, I will try to not touch IPv4 at all, and I will use what is already there: I will use an ipv6-in-ipv6 tunnel between the sites' external IPv6 addresses, where the traffic inside the tunnel runs in the ULA space.

Fortunately, Linux supports such a setup out-of-the-box using its ip6ip6 mechanism on a tun device.

In the remainder of this example, I will use 2111:1111:1111:1111::1 as site 1's external address, and 2222:2222:2222:2222::2 as site 2's external address.

Inside the tunnel, we will use the new fd12:3456:789a:3::/64 network inside our ULA space.

Setting up the tunnel portal on site 1

Site 1 runs the OpenWRT router, which is a bit tricky to configure. I did not find a good way to set up an ip6ip6 tunnel in the LuCI web interface, so I will include the command to do that in the additional startup script under System -> Startup. Before I do so, though, I will add a new interface called mytun, configure it as static, and set address fd12:3456:789a:3::1/64 on it. My (self-chosen) logic here is that the final part of the address is "1" since this is site 1's end of the tunnel.

Now go to Network -> Interfaces, and add a new "zone" called (e.g.) "tunnel", which includes the mytun interface. Set up firewall rules to allow all traffic inside our ULA in both directions, and also allow open routing between "tunnel" and your "lan" zone. Also go to Network -> Static Routes, and route both fd12:3456:789a:2::/64 and fd12:3456:789a:3::/64 onto the mytun device; we want to be able to reach both the other end of the tunnel and the network on the other side of the tunnel.

Finally, go to System -> Startup, and add the command that will set up the tunnel:

ip -6 tunnel add mytun mode ip6ip6 remote 2222:2222:2222:2222::2 local 2111:1111:1111:1111::1 dev eth1
ifconfig mytun mtu 1400

That final MTU setting requires some explanation: I do not really have native IPv6 on site 1; I have native (and dynamic) IPv4, and my IPv6 comes through an AICCU tunnel with SixXS. Now, by default, SixXS will set an MTU of 1280 bytes for you. This is a safe bet, but it is also the very minimum that IPv6 will accept (in IPv4, the corresponding guaranteed datagram size was 576 bytes). Now, if the SixXS tunnel has a 1280-byte MTU, our ip6ip6 tunnel cannot reach even the minimum 1280-byte MTU, as some bytes are needed for the encapsulation header.

In my case (and after reading the SixXS documentation), it seems that their IPv6-in-IPv4 scheme has 20 bytes of encapsulation, so that I can use an MTU of 1480 bytes inside the Ethernet IP MTU of 1500 bytes. In the case of SixXS, I had to log into my account on their site, and I had to change the tunnel MTU from 1280 to 1480. AICCU then required a restart to pick up the new value. NOTE: The fact that I get IPv6 through a tunnel also means that I needed to substitute eth1 with sixxs.0 in the above command.

The ip6ip6 tunnel inside the SixXS IPv6-in-IPv4 tunnel must have a smaller MTU than the SixXS tunnel. I do not know exactly how much smaller, but 80 bytes of encapsulation is a safe bet; I thus went for 1400 bytes.
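
The MTU budget from the last few paragraphs, spelled out as arithmetic. The variable names are only for illustration; the numbers are the ones used in the text. For reference, a bare IPv6 header is 40 bytes, so the 80-byte budget leaves some slack.

```shell
# Worked MTU budget for site 1, using the numbers from the text.
ETH_MTU=1500        # Ethernet IP MTU
SIXXS_OVERHEAD=20   # IPv6-in-IPv4 encapsulation (one IPv4 header)
IP6IP6_BUDGET=80    # headroom reserved for the ip6ip6 encapsulation

SIXXS_MTU=$((ETH_MTU - SIXXS_OVERHEAD))
TUNNEL_MTU=$((SIXXS_MTU - IP6IP6_BUDGET))

echo "SixXS tunnel MTU:  $SIXXS_MTU"    # 1480
echo "ip6ip6 tunnel MTU: $TUNNEL_MTU"   # 1400
```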


Setting up the tunnel portal on site 2

Site 2 is a vanilla Ubuntu Server running in dom0. IPv6 is offered natively, on the external eth0 interface. As such, configuring the interface can be done in /etc/network/interfaces:

# ULA tunnel to Site 1.
auto mytun
iface mytun inet6 static
  address fd12:3456:789a:3::2
  netmask 64
  mtu 1400
  pre-up ip -6 tunnel add mytun mode ip6ip6 remote 2111:1111:1111:1111::1 local 2222:2222:2222:2222::2 dev eth0
  post-up ip -6 route add fd12:3456:789a:1::/64 dev mytun mtu 1300
  pre-down ip -6 route del fd12:3456:789a:1::/64 dev mytun mtu 1300
  post-down ip -6 tunnel del mytun mode ip6ip6 remote 2111:1111:1111:1111::1 local 2222:2222:2222:2222::2 dev eth0

This configures the tunnel.

Testing the tunnel

At this point, one should ensure that the tunnel itself works, by logging onto site 1's router and issuing ping6 fd12:3456:789a:3::2, and by logging onto site 2's router and issuing ping6 fd12:3456:789a:3::1.

If that works, try pinging across the tunnel: first ping a machine on site 1's backend network from the router on site 2, and a machine on site 2's backend network from the router on site 1. Finally, pinging from a machine on site 1's network directly to a machine on site 2's network should work, and vice versa!
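
If you get tired of typing pings, the checks above can be wrapped in a small loop. This is a sketch: check_reachable is a hypothetical helper, and the backend address in the usage comment is a made-up example, so substitute addresses of your own machines.

```shell
# Sketch: ping each given address three times and report which ones answer.
# check_reachable is a hypothetical helper name.
check_reachable() {
    failed=0
    for addr in "$@"; do
        if ping6 -c 3 "$addr" >/dev/null 2>&1; then
            echo "ok   $addr"
        else
            echo "FAIL $addr"
            failed=1
        fi
    done
    # Non-zero exit status if at least one address did not answer
    return "$failed"
}

# Usage from site 1's router (the last address is a made-up backend host):
#   check_reachable fd12:3456:789a:3::2 fd12:3456:789a:2::1 fd12:3456:789a:2::10
```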

Securing the tunnel

In this example, I will use IPSec to secure the tunnel. Whereas in IPv4, IPSec requires opening up some UDP ports on both routers, in IPv6 it is built right into the protocol itself. Roughly speaking, IPSec offers three options:
  1. AH, which authenticates packets but does not encrypt them. This is hardly used in practice.
  2. ESP in transport mode, for direct host-to-host communication; to cover two networks this way, all hosts on both networks need to cooperate in the IPSec setup.
  3. ESP in tunnel mode ("ESP Tunnel"), where network-to-network communication is encrypted on the tunnel only, without the machines on either network needing to know about it.
The easiest setup for my situation is ESP Tunnel: this way, I need to configure IPSec on the routers only, and the whole secured tunnel is transparent to all backend machines.

To this end, I install the ipsec-tools package on both routers; this package is available for both Ubuntu Server and OpenWRT. I will use a simple pre-shared key infrastructure to keep the whole setup as simple as possible.

For a unidirectional ruleset, IPSec needs two keys: an encryption key, and an authentication key. As communication in both directions is treated separately in IPSec, we also need two keys for the other direction. We thus need four keys. In this example, I will use the keys from this howto; do not use these, but generate your own random keys, just as I did!
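
One possible way to generate such keys is sketched below. random_hex_key is a hypothetical helper; the sizes match the configuration that follows (24 bytes for each aes-cbc key, 16 bytes for each hmac-md5 key), and the output already carries the 0x prefix that setkey expects.

```shell
# Sketch: print N random bytes as one 0x-prefixed hex string.
# random_hex_key is a hypothetical helper name.
random_hex_key() {
    printf '0x'
    # od prints the bytes as space-separated hex pairs; tr glues them together
    head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
    printf '\n'
}

# Two directions, each with an encryption and an authentication key
random_hex_key 24   # aes-cbc key, direction 1
random_hex_key 16   # hmac-md5 key, direction 1
random_hex_key 24   # aes-cbc key, direction 2
random_hex_key 16   # hmac-md5 key, direction 2
```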

On site 1's router, create a file /etc/ipsec-tools.conf with the following content (replacing the keys with your own), and permissions 700.

#!/usr/sbin/setkey -f

## Flush the SAD and SPD
flush;
spdflush;

# Just simple static keys.
# ESP SAs using 192-bit AES keys, authenticated with 128-bit HMAC-MD5 keys
add fd12:3456:789a:3::2 fd12:3456:789a:3::1 esp 0x201 -m tunnel -E aes-cbc 0x7aeaca3f87d060a12f4a4487d5a5c3355920fae69a96c831 -A hmac-md5 0xc0291ff014dccdd03874d9e8e4cdf3e6;
add fd12:3456:789a:3::1 fd12:3456:789a:3::2 esp 0x301 -m tunnel -E aes-cbc 0xf6ddb555acfd9d77b03ea3843f2653255afe8eb5573965df -A hmac-md5 0x96358c90783bbfa3d7b196ceabe0536b;

# Require encryption in between the networks over this tunnel.
spdadd fd12:3456:789a:2::/64 fd12:3456:789a:1::/64 any -P in ipsec
   esp/tunnel/fd12:3456:789a:3::2-fd12:3456:789a:3::1/require;
spdadd fd12:3456:789a:1::/64 fd12:3456:789a:2::/64 any -P out ipsec
  esp/tunnel/fd12:3456:789a:3::1-fd12:3456:789a:3::2/require;

On site 2's router, create the same file, with only one tiny difference: swap in and out in the two spdadd statements at the end:

#!/usr/sbin/setkey -f

## Flush the SAD and SPD
flush;
spdflush;

# Just simple static keys.
# ESP SAs using 192-bit AES keys, authenticated with 128-bit HMAC-MD5 keys
add fd12:3456:789a:3::2 fd12:3456:789a:3::1 esp 0x201 -m tunnel -E aes-cbc 0x7aeaca3f87d060a12f4a4487d5a5c3355920fae69a96c831 -A hmac-md5 0xc0291ff014dccdd03874d9e8e4cdf3e6;
add fd12:3456:789a:3::1 fd12:3456:789a:3::2 esp 0x301 -m tunnel -E aes-cbc 0xf6ddb555acfd9d77b03ea3843f2653255afe8eb5573965df -A hmac-md5 0x96358c90783bbfa3d7b196ceabe0536b;

# Require encryption in between the networks over this tunnel.
spdadd fd12:3456:789a:2::/64 fd12:3456:789a:1::/64 any -P out ipsec
   esp/tunnel/fd12:3456:789a:3::2-fd12:3456:789a:3::1/require;
spdadd fd12:3456:789a:1::/64 fd12:3456:789a:2::/64 any -P in ipsec
  esp/tunnel/fd12:3456:789a:3::1-fd12:3456:789a:3::2/require;

The add statements set up the keys and the IPSec type (ESP Tunnel), whereas the spdadd statements require the use of these encryption methods in both directions. The only difference between the two sites is which direction is "in", and which direction is "out".

On Ubuntu server, the init/upstart scripts will automatically use the information from the above file on startup. If you want to enable it now, without rebooting, simply run /etc/ipsec-tools.conf as root.

On OpenWRT, we need to add this script on the System -> Startup page. Simply add the command:

/etc/ipsec-tools.conf

And that is it; run it as root if you want to activate it now without rebooting.
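
To confirm that the kernel actually loaded everything, you can dump the security associations and policies with setkey. A small sketch, where show_ipsec_state is a hypothetical wrapper name:

```shell
# Sketch: dump what the kernel currently holds for IPSec.
# show_ipsec_state is a hypothetical wrapper; run it as root.
show_ipsec_state() {
    echo "== Security associations (SAD) =="
    setkey -D     # should list the two ESP SAs (spi 0x201 and 0x301)
    echo "== Security policies (SPD) =="
    setkey -DP    # should list the in/out policies between the two /64s
}

# Usage, as root on either router:
#   show_ipsec_state
```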

Testing the secured tunnel

Initial tests can be done using the same methodology as before: ping router<->router, router<->backend, backend<->router, and backend<->backend. If that all works, we should ensure that the communication is indeed encrypted. 

To this end, I log on to router 2, and start listening for what the external interface sees when I communicate:

tcpdump -n -i eth0 src 2111:1111:1111:1111::1 or dst 2111:1111:1111:1111::1

Then, from a machine on site 1's backend network, I ping a machine on site 2's backend network. Ensure that you use the ULA address, since otherwise the traffic goes over the public net rather than through the tunnel! If all is well, not only do the pings work, but the tcpdump command will show something like:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
20:06:30.469386 IP6 2111:1111:1111:1111::1 > 2222:2222:2222:2222::2: IP6 fd12:3456:789a:3::1 > fd12:3456:789a:3::2: ESP(spi=0x00000301,seq=0xa7), length 88
20:06:30.469852 IP6 2222:2222:2222:2222::2 > 2111:1111:1111:1111::1: IP6 fd12:3456:789a:3::2 > fd12:3456:789a:3::1: ESP(spi=0x00000301,seq=0x917a), length 88

The first message is the ping request, and the second is the reply. Let's take a closer look at what we see here:
  • We can see the external addresses (logically, since otherwise no communication would be possible) of the routers only.
  • We can see the ULA addresses of the routers only.
  • We can see that we are transporting an 88-byte ESP-encrypted payload.
Let's also mention what we do not see:
  • We do not see what addresses on both backend networks are communicating with each other.
  • We do not see what is being communicated.
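
If you want to watch for leaks over a longer period, one rough approach is to filter the capture for lines that mention a ULA address but are not ESP. This is only a sketch: find_cleartext_ula is a hypothetical helper, and it relies on the tcpdump line format shown above.

```shell
# Sketch: print captured lines that show a ULA address outside ESP.
# find_cleartext_ula is a hypothetical helper; it filters tcpdump text on stdin.
find_cleartext_ula() {
    # Keep lines with an address starting with fc.. or fd.., drop the ESP ones
    grep -E '(^| )f[cd][0-9a-f]{2}:' | grep -v 'ESP(spi='
}

# Usage on router 2, as root (stop with Ctrl-C; no output is the good outcome):
#   tcpdump -n -l -i eth0 ip6 | find_cleartext_ula
```
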
And there you have it! Enjoy your secure networking.

