Considerations in the design of stegtunnel - A method of passing hidden 
data in TCP/IP headers.

Overview

Stegtunnel is a tool written to hide data within TCP/IP header fields. 
It was designed to be undetectable, even by people familiar with the
tool. It can hide the data underneath real TCP connections, using
real, unmodified clients and servers to provide the TCP conversation. In
this way, detection of odd-looking sessions is avoided. 

Background

Steganography is the art of concealing messages within seemingly innocuous
data, sometimes called a cover. Steganography is often found in close
proximity to cryptography, and there is one maxim that, while intended for
cryptography, can affect steganography as well.

"The enemy knows the system being used" - Claude Shannon

"Security through Obscurity," that is, steganographic methods that rely
on Shannon's maxim proving false, have had a long and proud history. Methods
such as positioning window shades in order to convey a bit of information,
or hiding information in "dead drop" locations, are essentially dependent
on the system being unknown to the adversary. These systems
have digital analogs, such as the use of normally unused fields or adding
extra fields that will be ignored by most software that reads the data.
A slightly less obvious approach could be something along the lines
of hosting a weblog, and varying the capitalization of URLs embedded
in the links.

These methods work only as long as the system itself remains
unknown. They must be custom built to keep their effectiveness, and
thus are unsuitable for open source steganographic tools.

Strong Open-Source Steganography

A strong open-source steganographic tool is one in which it is infeasible
to determine whether the cover material contains additional embedded
information without knowledge of the key. An adversary does not need to
determine what the information is to "break" the stegosystem; determining
that additional information is present is enough.

The reason why recovery of the information is not necessary to consider
the stegosystem broken is that information may (and perhaps should) be
encrypted before being injected into the steganographic cover. The
purpose of the steganography is to conceal the existence of the data
flow, and if it fails in that it is broken. Stegtunnel was designed with
the goal of being undetectable unless the key is known.

How Stegtunnel Works

Stegtunnel hides data in the sequence number and IPID fields of packets used
for TCP connections. While improvements to the system have been suggested,
and will likely be implemented in future releases of stegtunnel, what
follows is a description of version 1 of the protocol, which was
demonstrated initially at Rubi-Con 5.

The software released at Rubi-Con 5 consisted of two parts, stegclient
and stegserver. Both stegclient and stegserver make use of an artifice
called a "silent IP address." This is a currently unoccupied IP address
on the local subnet. Stegtunnel will listen for packets destined to this
IP address, and reply to ARP requests for this IP address.
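
As an illustration of answering ARP for a silent IP, the sketch below
builds a reply from the 28-byte ARP payload of a request. It is not
taken from the stegtunnel source; the function name arp_reply_for and
its parameters are hypothetical, and capturing and sending the frames
(Ethernet header included) is left out.

```python
import struct

# ARP payload for Ethernet/IPv4:
# htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
ARP_FMT = "!HHBBH6s4s6s4s"
ARP_REQUEST, ARP_REPLY = 1, 2


def arp_reply_for(request: bytes, silent_ip: bytes, our_mac: bytes):
    """Return an ARP reply payload if `request` asks for silent_ip, else None.

    Hypothetical helper: silent_ip is 4 raw bytes, our_mac is 6 raw bytes.
    """
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FMT, request[:28]
    )
    # Only answer Ethernet/IPv4 requests that target the silent IP.
    if (htype, ptype, oper) != (1, 0x0800, ARP_REQUEST) or tpa != silent_ip:
        return None
    # Claim the silent IP with our MAC, addressed back to the requester.
    return struct.pack(
        ARP_FMT, 1, 0x0800, 6, 4, ARP_REPLY, our_mac, silent_ip, sha, spa
    )
```

Because the kernel does not own the silent IP, it never sees a
connection it would want to reset; the userspace program is the only
party answering for that address.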

This enables our userspace program to completely handle the IP conversation,
as no kernel state about packets sent to these IP addresses will be kept.
Each stegtunnel connection is actually composed of two separate connections,
one between the local host and the silent IP, and another between the
silent IP and the remote host. 

This prevents the kernel from injecting spurious RST packets into our
modified connections, and prevents ACK storms as well.

There are other ways of injecting changes into TCP connections. One 
way would be to make changes to the kernel directly, through a patch
or loadable kernel module. This would be the cleanest method, but
it would not be portable. Another would be to use a combination of
routing table updates and firewall rule changes. Libdnet provides this
functionality, and it is a likely future direction for stegtunnel.

Since stegtunnel is as much a protocol as it is a software package,  
any future implementations, such as a kernel module implementation, 
should be able to interoperate with the current silent IP implementation.

The first phase of the connection determines whether both sides hold
the same shared secret. This phase takes place during the SYN and
SYN-ACK packets of the TCP connection. The receiving host will
use the packet nonce and the passphrase hash to generate four
pseudorandom bytes. If these bytes match the sequence number, it
is assumed that the remote host that generated the packet knows the
passphrase as well, and the session is considered "keyed."
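
The keying check can be sketched as follows. This is only a sketch under
stated assumptions: this document does not specify the hash or how the
nonce is formed, so SHA-256, HMAC, and the address/port nonce used here
are stand-ins, and all function names are hypothetical.

```python
import hashlib
import hmac
import struct


def keystream(passphrase: bytes, nonce: bytes, n: int) -> bytes:
    """Derive n pseudorandom bytes from the passphrase hash and a nonce.

    Assumed construction (HMAC-SHA256 keyed with SHA-256 of the passphrase).
    """
    key = hashlib.sha256(passphrase).digest()  # stands in for the passphrase hash
    return hmac.new(key, nonce, hashlib.sha256).digest()[:n]


def syn_nonce(src_ip: bytes, dst_ip: bytes, src_port: int, dst_port: int) -> bytes:
    """Hypothetical packet nonce: header fields both ends can observe."""
    return src_ip + dst_ip + struct.pack("!HH", src_port, dst_port)


def make_isn(passphrase: bytes, nonce: bytes) -> int:
    """Sender side: the four pseudorandom bytes become the sequence number."""
    return struct.unpack("!I", keystream(passphrase, nonce, 4))[0]


def is_keyed(passphrase: bytes, nonce: bytes, observed_isn: int) -> bool:
    """Receiver side: regenerate the four bytes and compare with the ISN."""
    expected = keystream(passphrase, nonce, 4)
    return hmac.compare_digest(expected, struct.pack("!I", observed_isn))
```

A peer without the passphrase would match the four bytes only by chance,
roughly once in 2^32 connection attempts.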

A host taking part in a "keyed" session will hash each outbound packet
for use as a nonce, along with the passphrase, and use that to generate
2 pseudorandom bytes. These bytes are then XORed with the cleartext
to be sent, and the resultant bytes are used as the IPID of the
outbound packet. Inbound packets from the session will have two pseudorandom
bytes generated in the same way, and the XOR with the IPID will extract
the plaintext.
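
A minimal sketch of the per-packet embedding follows. It assumes the
nonce is a hash of packet bytes that do not include the IPID itself
(otherwise the receiver could not reproduce it); the HMAC-SHA256
construction and the names pad2, embed, and extract are illustrative
assumptions, not the stegtunnel implementation.

```python
import hashlib
import hmac
import struct


def pad2(passphrase: bytes, packet_nonce: bytes) -> int:
    """Two pseudorandom bytes, as a 16-bit integer (assumed construction)."""
    key = hashlib.sha256(passphrase).digest()
    two = hmac.new(key, packet_nonce, hashlib.sha256).digest()[:2]
    return struct.unpack("!H", two)[0]


def embed(passphrase: bytes, packet_nonce: bytes, cleartext: bytes) -> int:
    """Sender: XOR two cleartext bytes with the pad to get the IPID."""
    if len(cleartext) != 2:
        raise ValueError("the IPID carries exactly two bytes per packet")
    return pad2(passphrase, packet_nonce) ^ struct.unpack("!H", cleartext)[0]


def extract(passphrase: bytes, packet_nonce: bytes, ipid: int) -> bytes:
    """Receiver: XOR the observed IPID with the same pad."""
    return struct.pack("!H", pad2(passphrase, packet_nonce) ^ ipid)
```

Since XOR is its own inverse, embed and extract differ only in which
side of the operation is treated as the input.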

Future Direction of Stegtunnel

Stegtunnel will likely add several modes in the future. One mode would
allow it to pass data only in the initial sequence number, at the
cost of severely limiting the bandwidth. This would expand the number of
operating systems it could be used on undetectably.

In addition, several weaknesses exist in stegtunnel that prevent it
from fully meeting the design objectives. These weaknesses should
be addressed. Some weaknesses will require a second version of
the stegtunnel protocol to be implemented.

Weaknesses in the Protocol

Currently, the protocol does not handle dropped packets gracefully. In fact,
a dropped packet will likely result in the leakage of information from
the session, as a nonce collision is probable in the retransmit.
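
To see why a nonce collision leaks information, consider two packets
whose hashed bytes are identical but whose IPIDs carry different hidden
chunks. The sketch below assumes an HMAC-SHA256 pad and made-up values;
only the XOR relationship matters.

```python
import hashlib
import hmac


def pad2(passphrase: bytes, nonce: bytes) -> bytes:
    """Two pseudorandom bytes from passphrase and nonce (assumed construction)."""
    key = hashlib.sha256(passphrase).digest()
    return hmac.new(key, nonce, hashlib.sha256).digest()[:2]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


passphrase = b"correct horse"
nonce = b"bytes of the retransmitted packet"  # identical for both packets
first, second = b"at", b"6p"                   # two different hidden chunks

ipid_1 = xor(pad2(passphrase, nonce), first)
ipid_2 = xor(pad2(passphrase, nonce), second)

# The pad cancels out: an observer learns first XOR second without the key.
leaked = xor(ipid_1, ipid_2)
```

An observer who can guess one chunk recovers the other directly, and
the mere presence of differing IPIDs on otherwise identical packets may
itself stand out.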

Out-of-order packets currently result in out-of-order decrypted plaintext.
Future implementations may be able to use the TCP sequence numbers
to order the packets properly. Since the data is in the IPID, however,
it may be difficult to deal with overlapping packets or retransmits.
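
One possible shape for sequence-number ordering is sketched below. It is
a simplification, not a proposed protocol change: it assumes each packet
in the batch has a distinct TCP sequence number, keeps the first copy
seen for any repeated sequence number, and does nothing about
overlapping segments. The function name reorder is hypothetical.

```python
def reorder(isn: int, packets):
    """Return hidden bytes ordered by TCP sequence number.

    packets: iterable of (tcp_seq, hidden_chunk) in arrival order.
    isn: the connection's initial sequence number.
    """
    by_offset = {}
    for seq, chunk in packets:
        # Distance from the ISN, so ordering survives 32-bit wraparound.
        offset = (seq - isn) % 2**32
        # Keep the first copy seen; later packets with the same sequence
        # number (for example retransmits) are ignored in this sketch.
        by_offset.setdefault(offset, chunk)
    return b"".join(by_offset[k] for k in sorted(by_offset))
```

Packets that carry no TCP payload share a sequence number with their
neighbours, which is one reason this approach is harder in practice
than the sketch suggests.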

Due to the use of strongly pseudorandom-appearing output in the IPID
and initial sequence number fields, the use of stegtunnel on systems
that do not have random ISNs and IPIDs will be quite noticeable. Currently,
OpenBSD and grsecurity Linux provide random IPIDs and ISNs. Syn Ack Labs
would welcome knowledge about other OSes providing this kind of cover.

Currently, if plaintext is known, or has predictable characteristics,
dictionary attacks may be mounted against packets from a stegtunnel session.
The nonce is effectively in the clear, so care should be taken to pick only
strong passphrases.
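
The dictionary attack can be sketched as follows, assuming the attacker
knows the plaintext behind a few packets and can recompute each nonce
from the captured packet. The HMAC-SHA256 pad and the name survivors are
assumptions for illustration.

```python
import hashlib
import hmac


def pad2(passphrase: bytes, nonce: bytes) -> int:
    """Two pseudorandom bytes as an integer (assumed construction)."""
    key = hashlib.sha256(passphrase).digest()
    return int.from_bytes(hmac.new(key, nonce, hashlib.sha256).digest()[:2], "big")


def survivors(wordlist, observations):
    """Return candidate passphrases consistent with every observation.

    observations: list of (nonce, observed_ipid, known_plaintext), where
    known_plaintext is the two bytes believed to be hidden in that packet.
    """
    return [
        word
        for word in wordlist
        if all(
            pad2(word, nonce) ^ ipid == int.from_bytes(plain, "big")
            for nonce, ipid, plain in observations
        )
    ]
```

A single packet only carries 16 bits, so a wrong guess passes one
observation about once in 65,536 tries; a handful of packets is enough
to make false positives negligible. A strong passphrase defeats the
attack by keeping the real key out of any feasible wordlist.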

Many (all current?) pseudorandom number generators have internal cycles,
and may be identified using jitter analysis. When analyzing operating
system randomness pools, jitter analysis works best against a large number 
of samples in a short time frame, so that entropy mixing does not 
overwhelm the cycles present. Future work should study whether it is
possible to confuse jitter analysis without introducing new detectable
weaknesses within the code. Without this, the steganography may depend upon
using short data streams, or data streams spread over enough time to
allow internal entropy gathering to confuse any cycles present in the
PRNG.

Weaknesses in the Implementation

Because all connections are currently routed through the silent IP, only
64,000 connections may be active at any given time, due to port
exhaustion. This may not be much of a problem for the client, but it
leaves the server vulnerable to resource exhaustion attacks. Future
implementations should either aggressively time out inactive connections,
or (probably better) move to the firewall/route-table adjustment
style of tunneling.
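
The time-out option could look something like the sketch below. The
class name ConnTable, the 300 second default, and the injectable clock
are all hypothetical choices; nothing here reflects how stegserver
currently tracks connections.

```python
import time


class ConnTable:
    """Tracks active connections and evicts ones that have gone idle."""

    def __init__(self, idle_timeout=300.0, max_conns=64000, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self.max_conns = max_conns
        self.clock = clock
        self.last_seen = {}  # connection key -> time of last packet

    def expire(self) -> int:
        """Drop connections idle longer than idle_timeout; return the count."""
        now = self.clock()
        stale = [k for k, t in self.last_seen.items()
                 if now - t > self.idle_timeout]
        for k in stale:
            del self.last_seen[k]
        return len(stale)

    def touch(self, key) -> bool:
        """Record activity; return False if the table is full even after expiry."""
        if key not in self.last_seen and len(self.last_seen) >= self.max_conns:
            self.expire()
            if len(self.last_seen) >= self.max_conns:
                return False
        self.last_seen[key] = self.clock()
        return True
```

Expiry only limits how long an attacker can hold a port; a flood of
fresh connections can still fill the table, which is part of why the
firewall/route-table approach is probably the better fix.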
