oss-security - Quick Blind TCP Connection Spoofing with SYN Cookies

Message-ID: <520A3B4A.1050704@jakoblell.com>
Date: Tue, 13 Aug 2013 15:57:30 +0200
From: Jakob Lell <jakob@...oblell.com>
To: full-disclosure@...ts.grok.org.uk
CC: oss-security@...ts.openwall.com, netdev@...r.kernel.org
Subject: Quick Blind TCP Connection Spoofing with SYN Cookies

Advisory location:
http://www.jakoblell.com/blog/2013/08/13/quick-blind-tcp-connection-spoofing-with-syn-cookies/

Quick Blind TCP Connection Spoofing with SYN Cookies

Abstract:

TCP uses 32 bit Seq/Ack numbers in order to make sure that both sides of 
a connection can actually receive packets from each other. Additionally, 
these numbers make it relatively hard to spoof the source address 
because successful spoofing requires guessing the correct initial 
sequence number (ISN) which is generated by the server in a 
non-guessable way. It is commonly known that a 32 bit number can be 
brute forced in a couple of hours given a fast (gigabit) network 
connection. This article shows that the effort required for guessing a 
valid ISN can be reduced from hours to minutes if the server uses TCP 
SYN Cookies (a widely used defense mechanism against SYN-Flooding DOS 
Attacks), which are enabled by default for various Linux distributions 
including Ubuntu and Debian.

I. Repetition of TCP Basics

A TCP Connection is initiated with a three-way handshake:

SYN: The Client sends a SYN packet to the server in order to initiate a 
connection. The SYN packet contains an initial sequence number (ISN) 
generated by the client.
SYN-ACK: The server acknowledges the connection request by the client. 
The SYN-ACK Packet contains an ISN generated by the server. It also 
confirms the ISN from the client in the ack field of the TCP header so 
that the client can verify that the SYN-ACK packet actually comes from 
the server and isn't spoofed.
ACK: In the final ACK packet of the three-way handshake the client 
confirms that it has received the ISN generated by the server. That way 
the server knows that the client has actually received the SYN-ACK 
packet from the server and thus the connection request isn't spoofed.

After this three-way handshake, the TCP connection is established and 
both sides can send data to each other. The initial sequence numbers 
make sure that the other side can actually receive the packets and thus 
prevent IP spoofing given that the attacker can't receive packets sent 
to the spoofed IP address.

Since the initial sequence numbers are only 32-bit values, it is not 
impossible to blindly spoof a connection by brute-forcing the ISN. If we 
need to send 3 packets to the server (one SYN packet to initiate the 
connection, one ACK packet to finish the three-way handshake and one 
payload packet), we will have to send 3*2^32 packets per successfully 
spoofed connection on average. Given a packet rate of 300,000 packets 
per second (which can easily be achieved with a gigabit connection), 
sending these packets takes some 12 hours.
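
The 12 hour figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the numbers from the paragraph above; the function name is made up for illustration.

```python
# Back-of-the-envelope estimate for blind spoofing without SYN cookies.
ISN_SPACE = 2 ** 32          # possible 32-bit initial sequence numbers
PACKETS_PER_ATTEMPT = 3      # SYN + ACK + payload packet
PACKET_RATE = 300_000        # packets per second (figure from the text)

def seconds_to_cover(isn_space, packets_per_attempt, rate):
    """Time needed to send one full attempt for every value in the ISN space."""
    return isn_space * packets_per_attempt / rate

hours = seconds_to_cover(ISN_SPACE, PACKETS_PER_ATTEMPT, PACKET_RATE) / 3600
print(f"{hours:.1f} hours")
```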

One long-known weakness of the original TCP protocol design is that an 
attacker can spoof a high number of SYN packets to a server. The server 
then has to send (and maybe even retransmit) a SYN-ACK packet to each of 
the spoofed IP addresses and keep track of the half-open connection so 
it can handle an ACK packet. Remembering a high number of bogus 
half-open connections can lead to resource exhaustion and make the 
server unresponsive to legitimate clients. This attack is called SYN 
Flooding and it can lead to DOS even if the attacker only uses a 
fraction of the network bandwidth available to the server.

II. Description of the SYN Cookie approach

In order to protect servers against SYN-Flooding attacks, Daniel J. 
Bernstein suggested the technique of TCP SYN Cookies in 1996. The main 
idea of the approach is not to keep track of incoming SYN packets and 
instead encode the required information in the ISN generated by the 
server. Once the server receives an ACK packet, it can check whether the 
Ack number from the client actually matches the server-generated ISN, 
which can easily be recalculated when receiving the ACK-packet. This 
allows processing the ACK packet without remembering anything about the 
initial SYN request issued by the client.

Since the server doesn't keep track of half-open connections, it can't 
remember any detail of the SYN packet sent by the client. Since the 
initial SYN packet contains the maximum segment size (MSS) of the 
client, the server encodes the MSS using 3 bits (via a table with 8 
hard-coded MSS values). In order to make sure that half-open connections 
expire after a certain time, the server also encodes a 
slowly-incrementing (typically about once a minute) counter to the ISN. 
Other options of the initial SYN packet are typically ignored (although 
recent Linux kernels do support some options by encoding them via TCP 
Timestamps [1]). When receiving an ACK packet, the kernel extracts the 
counter value from the SYN Cookie and checks whether it is one of the 
last 4 valid values.
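
As a rough illustration of the two values packed into the cookie, the following Python sketch maps a client MSS to a 3-bit table index and checks whether a counter value is still inside the accepted window. The table contents are an assumption made for this example (the real values live in the kernel's msstab array), and the function names are hypothetical.

```python
# Illustrative 8-entry MSS table; the concrete values are an assumption here,
# the real ones are defined by the kernel's msstab array.
MSSTAB = [64, 512, 536, 1024, 1440, 1460, 4312, 8960]
COUNTER_TRIES = 4            # number of counter values accepted on the ACK

def mss_index(client_mss):
    """Index of the largest table entry not exceeding the client's MSS."""
    idx = 0
    for i, value in enumerate(MSSTAB):
        if value <= client_mss:
            idx = i
    return idx

def counter(now_seconds):
    """Minute-granularity counter (the kernel uses jiffies / (HZ * 60))."""
    return int(now_seconds) // 60

def counter_accepted(cookie_count, current_count):
    """True if the cookie's counter is one of the last COUNTER_TRIES values."""
    return 0 <= current_count - cookie_count < COUNTER_TRIES
```

With four accepted counter values and a counter that ticks once a minute, a cookie stays valid for somewhere between three and four minutes.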

The original approach of Bernstein [2] encodes only the counter and the 
MSS value in the first 8 bits of the ISN, leaving only 24 bits for 
the cryptographically generated (non-guessable) value which needs to be 
guessed for spoofing a connection. This can easily be brute forced 
within a relatively short time given the speed of modern network hardware. 
In order to mitigate this attack, Bernstein suggests [3]:

# Add another number to the cookie: a 32-bit server-selected secret 
function of the client address and server address (but not the current 
time). This forces the attacker to guess 32 bits instead of 24.

This is implemented in recent Linux kernels and it does indeed make 
guessing the ISN more costly than a simple implementation without this 
additional secret function. However, as we will see in the next section, 
it does not require the attacker to guess the full 32 bit ISN.

The following function shows the generation of the SYN Cookies in the 
Linux Kernel 3.10.1 (file net/ipv4/syncookies.c):

#define COOKIEBITS 24    /* Upper bits store count */
#define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)

static __u32 secure_tcp_syn_cookie(__be32 saddr, __be32 daddr, __be16 sport,
                    __be16 dport, __u32 sseq, __u32 count,
                    __u32 data)
{
     /*
      * Compute the secure sequence number.
      * The output should be:
      *   HASH(sec1,saddr,sport,daddr,dport,sec1) + sseq + (count * 2^24)
      *      + (HASH(sec2,saddr,sport,daddr,dport,count,sec2) % 2^24).
      * Where sseq is their sequence number and count increases every
      * minute by 1.
      * As an extra hack, we add a small "data" value that encodes the
      * MSS into the second hash value.
      */

     return (cookie_hash(saddr, daddr, sport, dport, 0, 0) +
         sseq + (count << COOKIEBITS) +
         ((cookie_hash(saddr, daddr, sport, dport, count, 1) + data)
          & COOKIEMASK));
}

The value sseq is the sequence number generated by the client and is 
therefore directly known to the attacker. The data is an integer between 
0 and 7, which encodes one of 8 possible MSS values. The count value is 
just a timestamp which is increased once a minute and it is encoded in 
the upper 8 bits of the generated cookie. However, since the first hash 
value is not known to the attacker, the timestamp value must be guessed 
by the attacker as well.

The following two functions show how the SYN Cookies are verified when 
receiving an ACK packet:

#define COUNTER_TRIES 4


/*
  * This retrieves the small "data" value from the syncookie.
  * If the syncookie is bad, the data returned will be out of
  * range.  This must be checked by the caller.
  *
  * The count value used to generate the cookie must be within
  * "maxdiff" if the current (passed-in) "count".  The return value
  * is (__u32)-1 if this test fails.
  */
static __u32 check_tcp_syn_cookie(__u32 cookie, __be32 saddr, __be32 daddr,
                   __be16 sport, __be16 dport, __u32 sseq,
                   __u32 count, __u32 maxdiff)
{
     __u32 diff;

     /* Strip away the layers from the cookie */
     cookie -= cookie_hash(saddr, daddr, sport, dport, 0, 0) + sseq;

     /* Cookie is now reduced to (count * 2^24) ^ (hash % 2^24) */
     diff = (count - (cookie >> COOKIEBITS)) & ((__u32) - 1 >> COOKIEBITS);
     if (diff >= maxdiff)
         return (__u32)-1;

     return (cookie -
         cookie_hash(saddr, daddr, sport, dport, count - diff, 1))
         & COOKIEMASK;    /* Leaving the data behind */
}

/*
  * Check if a ack sequence number is a valid syncookie.
  * Return the decoded mss if it is, or 0 if not.
  */
static inline int cookie_check(struct sk_buff *skb, __u32 cookie)
{
     const struct iphdr *iph = ip_hdr(skb);
     const struct tcphdr *th = tcp_hdr(skb);
     __u32 seq = ntohl(th->seq) - 1;
     __u32 mssind = check_tcp_syn_cookie(cookie, iph->saddr, iph->daddr,
                         th->source, th->dest, seq,
                         jiffies / (HZ * 60),
                         COUNTER_TRIES);

     return mssind < ARRAY_SIZE(msstab) ? msstab[mssind] : 0;
}

First of all, the server removes the first hash value and the ISN chosen 
by the client. This is easily possible because the hash only depends on 
a server secret and the source/destination address/port and doesn't 
change over time. Then the upper 8 bits contain the count value and if 
this counter is one of the last four valid counter values, it is 
accepted. At that point the counter used for generating the SYN Cookie 
is known and the server can therefore calculate the second hash and 
subtract it from the cookie. The remaining value is the encoded MSS 
value. The Cookie is only accepted if this encoded MSS value is actually 
a number between 0 and 7.
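
The generation and verification logic can be modelled in a few lines of Python. This is a simplified sketch and not the kernel implementation: the keyed hash is replaced by a truncated HMAC-SHA256 (an assumption, the kernel uses its own cookie_hash), the secrets are hypothetical constants, and the names make_cookie and check_cookie are invented for this example.

```python
import hashlib
import hmac
import struct

COOKIEBITS = 24
COOKIEMASK = (1 << COOKIEBITS) - 1
COUNTER_TRIES = 4
MSS_VALUES = 8               # size of the MSS table
U32 = 0xFFFFFFFF

# Hypothetical per-server secrets; the kernel generates its own.
SECRETS = (b"secret-0", b"secret-1")

def cookie_hash(saddr, daddr, sport, dport, count, c):
    """Stand-in keyed hash: HMAC-SHA256 truncated to 32 bits (an assumption)."""
    msg = struct.pack("!IIHHI", saddr, daddr, sport, dport, count & U32)
    digest = hmac.new(SECRETS[c], msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]

def make_cookie(saddr, daddr, sport, dport, sseq, count, data):
    """Mirror of secure_tcp_syn_cookie() using 32-bit wrap-around arithmetic."""
    h1 = cookie_hash(saddr, daddr, sport, dport, 0, 0)
    h2 = cookie_hash(saddr, daddr, sport, dport, count, 1)
    return (h1 + sseq + (count << COOKIEBITS) + ((h2 + data) & COOKIEMASK)) & U32

def check_cookie(cookie, saddr, daddr, sport, dport, sseq, count):
    """Return the decoded MSS index, or None if the cookie is rejected."""
    cookie = (cookie - cookie_hash(saddr, daddr, sport, dport, 0, 0) - sseq) & U32
    diff = (count - (cookie >> COOKIEBITS)) & (U32 >> COOKIEBITS)
    if diff >= COUNTER_TRIES:
        return None
    h2 = cookie_hash(saddr, daddr, sport, dport, (count - diff) & U32, 1)
    data = (cookie - h2) & COOKIEMASK
    return data if data < MSS_VALUES else None

# 192.168.1.217:40000 -> 192.168.1.11:1234 (source port chosen arbitrarily)
flow = (0xC0A801D9, 0xC0A8010B, 40000, 1234)
cookie = make_cookie(*flow, sseq=1000, count=500, data=5)
print(check_cookie(cookie, *flow, sseq=1000, count=502))   # 5
print(check_cookie(cookie, *flow, sseq=1000, count=504))   # None (expired)
```

Note that check_cookie() needs nothing but the packet fields, the secrets and the current counter, which is exactly what allows the server to forget the SYN.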


III. Reduced cost of guessing due to multiple valid ISNs

Since the kernel encodes a counter and the MSS value in the ISN, there 
must be one valid ISN for every combination of a valid counter value and 
a valid MSS value. In current implementations there are 4 valid counter 
values and 8 possible MSS values. This gives a total of 32 valid 
combinations which will be accepted by the server at any given time. 
Each of these 32 combinations results in one valid ISN and if the attacker 
guesses any one of them, the kernel will accept the ACK packet. This 
reduces the effort needed to successfully guess a valid ISN by a 
factor of 32.

Since the server doesn't remember that it has received a SYN packet when 
using SYN Cookies, there is no need to actually send the initial SYN 
packet. If we start the connection by sending an ACK packet and guess 
one of the 32 valid ISNs, the kernel will process the ACK packet without 
noticing that it has never received a SYN packet from the client or 
responded with a SYN-ACK packet.

IV. Combination of ACK-Packet and Payload

Although the TCP standard assumes that the three-way handshake is 
completed before any data is sent, it is also possible to add data to 
the final ACK packet of the handshake [4]. This means that guessing an 
ISN and spoofing a full TCP connection with some payload data (such as 
an HTTP request) can be reduced to sending out only one single packet. 
So the average number of packets required per successfully spoofed 
connection can be reduced to 2^32 / 32 (because the server accepts 32 
different ISNs at a time). At a packet rate of 300,000 pps (which can 
easily be achieved with gigabit ethernet) this number of packets can be 
sent out in no more than 8 minutes (compared to the 12 hours calculated 
in section I).
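
The following Python sketch puts numbers on this estimate. It treats every packet as an independent guess with a hit probability of 32 / 2^32, which is a modelling assumption rather than a description of the actual search order; the function names are hypothetical.

```python
import math

ISN_SPACE = 2 ** 32
VALID_ISNS = 32              # 4 counter values * 8 MSS values
PACKET_RATE = 300_000        # packets per second (figure from the text)

def mean_seconds_per_hit(rate=PACKET_RATE):
    """Average time between successful guesses at the given packet rate."""
    return ISN_SPACE / VALID_ISNS / rate

def success_probability(seconds, rate=PACKET_RATE):
    """Chance of at least one hit, assuming independent guesses per packet."""
    p_hit = VALID_ISNS / ISN_SPACE
    packets = seconds * rate
    return 1.0 - math.exp(packets * math.log1p(-p_hit))

print(f"{mean_seconds_per_hit() / 60:.2f} minutes per hit on average")
print(f"{success_probability(8 * 60):.2f} chance of a hit within 8 minutes")
```

Under this model the average is somewhat below 8 minutes, and roughly two out of three 8-minute runs would contain at least one hit.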

V. Possible real-life applications of TCP Connection spoofing

Many application developers assume that TCP makes sure that the client 
IP address is actually correct and can't easily be spoofed. Being able 
to spoof the source address obviously creates significant problems when 
using the IP address for authentication e.g. for legacy protocols like 
RSH. Even if RSH has widely been replaced by more secure alternatives 
like SSH by now, there are still some applications where the IP address 
is used for authentication. For instance it is still common to have 
administrative interfaces which can only be accessed from certain IP 
addresses. Another widespread usage of IP addresses for authentication 
is that many web applications bind the session ID to a specific IP 
address. If the session ID can be stolen by other means, an attacker can 
use the method described here to bypass this IP address verification.

Aside from actually using IP addresses for authenticating requests, it 
is also quite common to log IP addresses, which may then be used to 
track down initiators of objectionable requests such as exploits, 
abusive blog comments or illegal file sharing traffic. Using the 
technique described here may allow planting false evidence in the logged 
IP addresses.

Being able to spoof IP addresses also allows bypassing SPF e.g. when 
sending spear phishing mails in order to give the phishing mails the 
additional credibility of a valid SPF sender address, which may help to 
bypass email filtering software.

An obvious limitation of the technique described here is that when 
spoofing the IP address, you can only send a request (which may result 
in persistent changes on the server) but not receive any responses sent 
by the server. For many protocols it is however possible to guess the 
size of the server responses, send matching ACK packets and transmit 
multiple payload packets in order to spoof a more complex protocol 
interaction with the server.


VI. POC Exploit and real-life performance measures

This section describes the steps needed to actually carry out the attack 
and contains full POC code. For my experimental setup the server used 
the IP address 192.168.1.11 and port 1234. The attacker system was 
located in the same local subnet and the spoofed IP address was 
192.168.1.217.

First of all, even if SYN Cookies are enabled in 
/proc/sys/net/ipv4/tcp_syncookies (which is the default for various 
Linux distributions), the system will still use a traditional backlog 
queue for storing half-open connections and only fall back to using SYN 
Cookies if the backlog queue overflows. The main reason for this is that 
storing information about connection requests allows full support of TCP 
Options and arbitrary MSS values (which don't have to be reduced to one 
of 8 predefined values). The backlog queue size is 2048 by default and 
can be adjusted via /proc/sys/net/ipv4/tcp_max_syn_backlog. So in order 
to actually carry out the spoofing attack, we have to intentionally 
overflow the backlog queue by performing a SYN-Flooding attack. This can be 
done e.g. with the hping3 command:

hping3 -i u100 -p 1234 -S -a 192.168.1.216 -q 192.168.1.11

Experiments have shown that running hping3 in parallel to the actual ISN 
brute-forcing does significantly reduce the packet rate even if hping3 
is configured to use only a small fraction of the packet rate of the ISN 
brute-forcing tool. In order to achieve the maximum packet rate 
possible, it is therefore more efficient to run hping3 in regular short 
intervals. The following command sends out 3000 SYN packets in a short 
burst once a second:

while true;do time hping3 -i u1 -c 3000 -S -q -p 1234 -a 192.168.1.216 192.168.1.11;sleep 1;done

The source IP address used for this SYN-Flooding attack should not 
respond with a RST packet or return an ICMP Destination Host Unreachable 
message so that the queue entries aren't freed before they time out. On 
Linux you can easily add another IP address to a network interface and 
block all traffic coming to this IP address in order to prevent it from 
responding with RST packets:

ifconfig eth0:1 inet 192.168.1.216 netmask 255.255.255.0 up
iptables -I INPUT --dst 192.168.1.216 -j DROP

I've used the same commands to set up the IP address 192.168.1.217, 
which is the IP address I wanted to spoof. This makes sure that sending 
responses to the spoofed address won't lead to a RST packet or an ICMP 
Destination Host Unreachable packet, which may lead to a premature 
termination of the connection and the processing of the spoofed request 
in the server software.

ifconfig eth0:2 inet 192.168.1.217 netmask 255.255.255.0 up
iptables -I INPUT --dst 192.168.1.217 -j DROP

In a real world attack, the same goal can also be achieved by issuing a 
(D)DOS attack against the spoofed IP address.

Once the system is in SYN-Cookie mode, it is necessary to spoof a high 
number of ACK packets with a payload in order to guess one of the 32 
valid ISNs. I initially wanted to do this with scapy but this failed due 
to the very low performance of scapy (less than 10k packets per 
second). So I went on to create a pcap file in scapy, which can then be 
sent out with a patched version of tcpreplay in a loop. The patched 
tcpreplay just increases the ack field of the TCP header by 31337 for 
each repetition of the loop. Using an odd number makes sure that it 
reaches all 2^32 possible values without repetitions. In theory you 
could just linearly try all possible ISNs. However, the counter value in 
the 8 upper bits of the ISN only changes once a minute and is linearly 
incremented for a given combination of source and destination 
address/port. Therefore a linear search will likely be in an incorrect 
range and not create any hit within a long time. So it is advisable to 
increment the guessed ISN by a larger number so that it traverses the 
full ISN space relatively quickly.
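
The reason an odd increment covers the whole space is that any odd number is coprime to 2^32, so repeatedly adding it modulo 2^32 cycles through every value exactly once. The Python sketch below checks this on a reduced 16-bit space to keep the runtime short; the helper name is made up for illustration.

```python
from math import gcd

STRIDE = 31337               # increment used by the patched tcpreplay

def visits_every_value(stride, bits):
    """Walk 0, stride, 2*stride, ... mod 2**bits and check full coverage."""
    modulus = 1 << bits
    seen = set()
    value = 0
    for _ in range(modulus):
        seen.add(value)
        value = (value + stride) % modulus
    return len(seen) == modulus

print(gcd(STRIDE, 2 ** 32))            # 1, so the stride is coprime to 2^32
print(visits_every_value(STRIDE, 16))  # True
print(visits_every_value(31338, 16))   # False, an even stride skips values
```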

The attached script create_packet.py creates a single ACK packet with 
some payload data.

The next step is to patch and compile tcpreplay. Here are the commands 
needed on an Ubuntu 12.04 amd64 system:

apt-get install build-essential libpcap-dev
ln -s lib/x86_64-linux-gnu /usr/lib64 # Quick workaround for a bug in the build system of tcpreplay
wget -O tcpreplay-3.4.4.tar.gz http://prdownloads.sourceforge.net/tcpreplay/tcpreplay-3.4.4.tar.gz?download
tar xzvf tcpreplay-3.4.4.tar.gz
cd tcpreplay-3.4.4
cat ../tcpreplay_patch.txt | patch -p1
./configure
make
cp src/tcpreplay-edit ../


After compiling a patched version of tcpreplay, you can use the 
following commands to actually send out packets in an infinite loop:

python create_packet.py
while true;do time ./tcpreplay-edit -i eth0 -t -C -K -l 500000000 -q ack_with_payload.pcap;done


VII. Experimental results

I've tested this setup in a local network between a 3 year old notebook 
(HP 6440b, i5-430M CPU and Marvell 88E8072 gigabit NIC) as the client 
and a desktop computer as the server. With a small test payload, the 
achievable packet rate is some 280,000 packets per second, which leads 
to some 73% CPU usage of the tcpreplay process (18% user and 55% sys in 
the output of time). According to [5] it may be expected that the packet 
rate can at least be doubled given a fast system with a decent Intel 
gigabit network card. Obviously the actual packet rate also depends on 
the size of the payload data. During a 10.5 hour overnight run I 
successfully spoofed 64 connections, which is about one successful spoof 
every 10 minutes. This is a little bit less than the expected value of 
79 spoofed connections (once every 8 minutes). There are several 
possible explanations for this deviation:
* The tcpreplay process takes some time to print the statistics in the 
end. During that time no packets are sent. I've only used the statistics 
output of tcpreplay for measuring the packet rate and so the measured 
packet rate may be a little bit off.
* When going to the maximum packet rate achievable with your hardware, 
there may be packet loss (especially if you don't use any kind of 
congestion control).
* Last but not least the spoofing is a statistical process. The standard 
deviation is approximately the square root of the expected number of 
spoofed connections and it is not particularly unlikely to be off by one 
or two standard deviations from the expected value. For this experiment 
the standard deviation is sqrt(79) = 8.89 and the measured number of 
spoofed connections was off by about 1.69 standard deviations, which is well 
within the expected statistical variation.
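
The figures in the last bullet point can be reproduced with the short Python sketch below. It assumes that the number of hits is approximately Poisson distributed, so that the standard deviation equals the square root of the expected count; the variable names are chosen for this example only.

```python
import math

RUN_HOURS = 10.5
MINUTES_PER_HIT = 8          # expected interval from section IV
OBSERVED = 64                # spoofed connections seen in the overnight run

expected = round(RUN_HOURS * 60 / MINUTES_PER_HIT)
sigma = math.sqrt(expected)              # Poisson standard deviation
z = (expected - OBSERVED) / sigma        # deviation in units of sigma

print(expected, round(sigma, 2), round(z, 2))
```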

VIII. Possible mitigation options

The simplification of TCP Connection Spoofing described here is an 
inherent problem of TCP SYN Cookies and so there won't be a simple patch 
which just solves the issue and makes the Spoofing Attack as hard as it 
is without SYN Cookies. It is only possible to gradually increase the 
required effort for successfully spoofing a connection e.g. by only 
accepting the last two instead of four counter values (which will lead 
to a 60-120s timeout between the initial SYN and the final ACK packet of 
the three-way handshake during a SYN Flooding attack) or by disallowing 
the combination of the final ACK packet with payload data (which will 
double the number of packets the attacker has to send). However, even 
with these two mitigation options in place, the spoofing attack is still 
about an order of magnitude easier with SYN Cookies than it is without 
SYN Cookies and it would still be very inadvisable to assume that the 
source IP address of TCP connections can't be spoofed. It may also be 
possible to use the lower bits of the TCP timestamp option (which is 
currently used in order to support TCP Options with SYN Cookies) for 
encoding the MSS and counter values. However, this can only provide 
effective protection against a spoofing attack if the server refuses 
clients which don't support TCP timestamps during a SYN Flooding Attack, 
which will break compatibility with some standards-compliant TCP 
implementations.
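
To see what the first two options would buy, the following Python sketch compares the average number of packets per spoofed connection in the three scenarios discussed above. It only restates the arithmetic of this article; the function name is hypothetical.

```python
ISN_SPACE = 2 ** 32
MSS_VALUES = 8

def packets_per_spoof(counter_values, packets_per_guess):
    """Average packets per spoofed connection in SYN cookie mode."""
    valid_isns = counter_values * MSS_VALUES
    return ISN_SPACE // valid_isns * packets_per_guess

no_cookies = 3 * ISN_SPACE                 # SYN + ACK + payload, full 32-bit guess
current = packets_per_spoof(4, 1)          # 4 counters, ACK carries the payload
mitigated = packets_per_spoof(2, 2)        # 2 counters, separate payload packet

print(no_cookies // current)    # 96
print(no_cookies // mitigated)  # 24
```

Even with both changes in place the attack remains 24 times cheaper than without SYN Cookies, which matches the "order of magnitude" estimate above.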

It is obviously possible to disable SYN Cookies (and increase the 
backlog queue size in /proc/sys/net/ipv4/tcp_max_syn_backlog) in order 
to make the spoofing attack as hard as possible and force an attacker to 
brute force the full 32 bit ISN space. However, disabling SYN Cookies 
may require a significant amount of CPU Time and Memory during a SYN 
Flooding Attack. Moreover, the spoofing is still not impossible even 
without SYN Cookies and it will likely succeed within a couple of hours 
with a gigabit ethernet connection.
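
Before changing anything it can be useful to see what a given system is currently configured to do. The following Python sketch only reads the two settings mentioned in this article; the helper name and its root parameter (which exists so the function can be pointed at a different directory) are assumptions made for this example.

```python
from pathlib import Path

def read_sysctl(name, root="/proc/sys"):
    """Return a sysctl value as an int, or None if it cannot be read."""
    path = Path(root) / name.replace(".", "/")
    try:
        return int(path.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None

for key in ("net.ipv4.tcp_syncookies", "net.ipv4.tcp_max_syn_backlog"):
    print(key, "=", read_sysctl(key))
```

On systems without these files (for example non-Linux hosts) the function simply returns None.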

Given the limitations of the other mitigation options my suggestion is 
to solve the problem on a higher level and make sure that the security 
of applications doesn't rely on the impossibility of spoofing the source 
address of TCP connections. This obviously means that you should never 
rely on source IP addresses for authentication. For web applications it 
is also possible to mitigate the issue by using secure CSRF tokens for 
all actions which cause persistent changes on the server and not 
processing the request unless it uses a valid CSRF token. In that case 
the IP address of the request using the CSRF token may be spoofed but 
the IP address to which the token has been sent can't be spoofed 
since the attacker needs to receive the CSRF token in order to 
use it. When logging IP addresses used for certain actions such as blog 
comments or account registrations, the IP address to which the CSRF 
token has been sent should be logged in addition to (or instead of) 
the IP address using the token.
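
One possible way to keep track of the address a token was sent to is to embed it in the token itself and protect it with a MAC. The Python sketch below illustrates the idea under simplifying assumptions: the key handling, the token layout and the function names are all hypothetical, and expiry and session binding are left out.

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)   # hypothetical per-deployment secret

def issue_token(client_ip):
    """Create a token that records the IP address it was delivered to."""
    nonce = secrets.token_hex(16)
    mac = hmac.new(SERVER_KEY, f"{nonce}|{client_ip}".encode(), hashlib.sha256)
    return f"{nonce}|{client_ip}|{mac.hexdigest()}"

def verify_token(token):
    """Return the IP the token was issued to, or None if it is invalid."""
    try:
        nonce, issued_ip, tag = token.split("|")
    except ValueError:
        return None
    mac = hmac.new(SERVER_KEY, f"{nonce}|{issued_ip}".encode(), hashlib.sha256)
    return issued_ip if hmac.compare_digest(mac.hexdigest(), tag) else None
```

The value returned by verify_token() is the address that actually received the token, so this is the one worth logging next to (or instead of) the source address of the request that uses it.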

References:
[1]: http://lwn.net/Articles/277146/
[2]: http://cr.yp.to/syncookies.html Section "What are SYN cookies?"
[3]: http://cr.yp.to/syncookies.html Section "Blind connection forgery"
[4]: http://www.thice.nl/creating-ack-get-packets-with-scapy/
[5]: 
http://wiki.networksecuritytoolkit.org/nstwiki/index.php/LAN_Ethernet_Maximum_Rates,_Generation,_Capturing_%26_Monitoring#pktgen:_UDP_60_Byte_Packets


View attachment "create_packet.py" of type "text/x-python" (794 bytes)

View attachment "tcpreplay_patch.txt" of type "text/plain" (1579 bytes)