Self Adaptive High Interaction
Honeypots Driven by Game Theory
By: Gerard Wagener et al.
Presented by:
Mohamed Sharaf
Agenda
• Introduction
– Honeypot.
– Honeypot types.
• Problem
• Play with the Enemy
• Questions
Honeypot… A pot that contains
honey!!! 
Image courtesy: http://successfromthenest.com/content/discoveryour-idea-honeypot/
Historical Background
• 1998 – Development began on “CyberCop Sting”, one of
the first commercial honeypots.
• 1998 – BackOfficer Friendly is released, a free, simple-to-use,
Windows-based honeypot.
• 1998 – Marty Roesch, at GTE Internetworking, began
development on a honeypot solution that eventually
became NetFacade. This work also started the concept of his
open source IDS, “Snort”.
• 1999 – Formation of the Honeynet Project and publication
of the “Know Your Enemy” (KYE) series of papers.
• 2000/2001 – Honeypots are used to capture and study worm
activity, and are adopted for detection and research.
• 2002 – A honeypot is used to capture a new, unknown
attack in the wild.
What is a Honeypot?
• A honeypot is a computer resource whose
only purpose is to be exploited, that is, to let itself be
compromised.
• So it is a trap, but only for computer criminals.
• The first commercial honeypot, CyberCop Sting,
appeared in 1998.
• Since 2002, honeypots have been shared, and some
honeypots are available to the research community.
The basic concept…
• The argument: suppose we have a machine that is not
dedicated to any user and offers no legitimate
communication or public services, and we capture
incoming or outgoing traffic on it (barring that of
OS updates).
• This leads, with very high confidence, to one
conclusion: we are under attack, and there is
malicious activity against our network.
Honeypot Types:
1. LIH: Low Interaction Honeypot.
2. HIH: High Interaction Honeypot.
3. Hybrid Honeypot (consultation between LIH
and HIH).
4. Adaptive HIH (Adaptive High Interaction Honeypot).
This is the model proposed by the paper under
study.
1. Low Interaction Honeypot (LIH)
• A virtual honeypot: all “offered” services of a
low interaction honeypot are emulated.
• It takes its name from the limited
interaction (activities) the attacker can
achieve.
• It is typically used to collect malware, in
which case the end goal is simply to obtain a
downloaded malware sample.
Examples of LIH
• Google Hack Honeypot
• HoneyBOT
• honeytrap
• KFSensor
• Multipot
• Nepenthes
• PHP Honeypot Project
Disadvantages of LIH
• It cannot be used to discover new types of attack.
• After discovering that this is a virtual honeypot,
attackers can mislead the administrators
with false attack patterns, or simply do
nothing.
Disadvantages of LIH (cont.)
• Attackers can easily discover that they are dealing with a
honeypot. How?
– The suspicious machine has many open TCP ports, or an uncommon
combination of open network ports, e.g. TCP port 17300, used by the
backdoor left by the Kuang2 virus.
– A clear sign that a given host is running nepenthes can be found by
simply connecting to TCP port 21:
$ nc xxx.xxx.xxx.xxx 21
220 ---freeFTPd 1.0---warFTPd 1.65--
We expect the banner of a single FTP server, but nepenthes replies with two
different banners: one for freeFTPd and the other for warFTPd. A
human can clearly identify this uncommon response and conclude
that this is indeed a honeypot.
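The banner check described above can be sketched in a few lines of Python. This is only an illustration, not a tool from the paper: the regular expression and the function names advertised_products and looks_like_multi_banner are assumptions, and the heuristic simply counts how many distinct "name version" pairs appear in one greeting.

```python
import re

# Assumed pattern: a product name followed by a dotted version number,
# e.g. "freeFTPd 1.0". Real banners vary; this is a rough heuristic.
PRODUCT_VERSION = re.compile(r"([A-Za-z]+)\s+(\d+(?:\.\d+)+)")


def advertised_products(banner):
    """Return the distinct product names that appear with a version number."""
    return {name.lower() for name, _version in PRODUCT_VERSION.findall(banner)}


def looks_like_multi_banner(banner):
    """Flag a greeting that advertises more than one server product."""
    return len(advertised_products(banner)) > 1


if __name__ == "__main__":
    greeting = "220 ---freeFTPd 1.0---warFTPd 1.65--"
    print(sorted(advertised_products(greeting)))
    print(looks_like_multi_banner(greeting))
```

A greeting that names a single product is not flagged, so the check only catches the specific double-banner giveaway shown on this slide.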
HIH: High Interaction Honeypot
• Adversaries hold valuable attacks that may be
new, so-called “zero-day attacks”.
• Their goal is to keep these attacks undisclosed
in order to profit from them as much as possible.
• HIH goals:
– To discover zero-day attacks.
– To provide remedies/fixes for the vulnerabilities
that these attacks exploit.
HIH Tools
• High Interaction Honeypot Software
– HIHAT: the High Interaction Honeypot Analysis Toolkit
(HIHAT) transforms arbitrary PHP
applications into web-based high-interaction
honeypots.
– Sebek: a tool for collecting forensic data from
compromised high-interaction honeypots.
Disadvantages of HIH:
• An HIH is a real machine, whereas in an LIH the services are
emulated. The attacker can therefore carry out malicious actions.
– For example, he could try to attack other hosts on the
Internet starting from your honeypot, or he could send
spam from one of the compromised machines. However,
there are ways to safeguard high-interaction
honeypots and mitigate this risk, such as the Honeywall by the
Honeynet Project.
• We may incur liability for whatever actions the
attackers take.
• We have to make sure that attackers will not be able to
compromise our production network.
How does a honeynet work?
• A highly controlled network where every packet
entering or leaving is monitored, captured and
analyzed
Honeynet components
3 key components
• Data control
• Data capture
• Data analysis
Data control
• Mitigate risk of honeynet being used to harm
production system
– Count outbound connections
– IPS (Snort-Inline)
– Bandwidth throttling
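The slide does not say how outbound connections are counted, so the following is only a minimal sketch of the counting idea: allow a fixed number of outbound connections per time window and refuse the rest. The class name OutboundLimiter and its parameters are hypothetical, and a real deployment would enforce the limit at the gateway rather than in Python.

```python
from collections import deque


class OutboundLimiter:
    """Allow at most max_connections outbound connections per window seconds.

    Hypothetical helper: the limit and window values are assumptions,
    not figures from the slides.
    """

    def __init__(self, max_connections, window):
        self.max_connections = max_connections
        self.window = window
        self._timestamps = deque()

    def allow(self, now):
        """Return True if a connection attempted at time now may leave."""
        # Forget connections that have fallen out of the sliding window.
        while self._timestamps and now - self._timestamps[0] >= self.window:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_connections:
            return False
        self._timestamps.append(now)
        return True


if __name__ == "__main__":
    limiter = OutboundLimiter(max_connections=2, window=60.0)
    for t in (0.0, 1.0, 2.0, 61.0):
        print(t, limiter.allow(t))
```

Passing the timestamp in explicitly keeps the sketch deterministic; a live system would read the clock instead.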
Data capture
• Capture activities at various levels
– Application
– Network
– OS level
Data analysis
• Manage and analyze data captured from
honeypots
– Investigate malware
– Forensic purposes
Adaptive High Interaction Honeypot
(AHIH)
• So far, we have seen that HIH has its pros and cons.
• AHIH tries to trim some of the disadvantages, such as
liability for attackers' actions, by providing a
mechanism of adaptability.
• An adaptive HIH may sometimes provoke the
attacker into revealing more of his/her new attacks,
by either blocking an attack or letting it
through.
• This enhances the value of the
knowledge captured from the attackers.
Honeypot Hierarchical Probabilistic
Automaton (HPA)
• Defines the states of the automaton as the
programs that can be executed on the
honeypot.
• The set Qa contains the programs installed on
the honeynet.
Honeypot Hierarchical Probabilistic
Automaton
Example of a honeypot hierarchical probabilistic automaton
(programs executed on the honeypot)
Scenario of Attack
1. The attacker penetrates the honeypot through the
SSH server with probability 1.
2. The attacker remains in the sshd state.
3. The attacker executes the program bash or
uname with equal probability, 0.5 each.
4. Executing bash moves the automaton to the bash
state. From there, the programs wget, rm, ls, and uname
are equally likely, with probability 0.25 each.
That means, for example:
Pr(wget | bash) = 0.25
(a conditional probability)
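The scenario can be written down as a small transition table. The sketch below is an illustration only: it flattens the automaton into a plain dictionary and ignores its hierarchical structure, and the state name "start" and the function path_probability are assumptions introduced here. Multiplying the conditional probabilities along a path gives the probability of that path.

```python
# Transition probabilities taken from the scenario; "start" is a
# hypothetical name for the point before the attacker logs in.
TRANSITIONS = {
    "start": {"sshd": 1.0},
    "sshd": {"bash": 0.5, "uname": 0.5},
    "bash": {"wget": 0.25, "rm": 0.25, "ls": 0.25, "uname": 0.25},
}


def path_probability(path, transitions=TRANSITIONS):
    """Multiply the conditional probabilities of each step along path."""
    probability = 1.0
    for current, following in zip(path, path[1:]):
        # A transition that is not in the table has probability 0.
        probability *= transitions.get(current, {}).get(following, 0.0)
    return probability


if __name__ == "__main__":
    # 1.0 * 0.5 * 0.25
    print(path_probability(["start", "sshd", "bash", "wget"]))
```

For the path sshd, bash, wget this gives 1 × 0.5 × 0.25 = 0.125.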
Attacker Process Tree
• In Linux, each process has a
process identifier (PID) and a parent process ID (PPID).
• The attacker usually starts with a privilege-separated
process of the SSH server, P0.
• The process P0 then forks, resulting in two
clone processes, P1 and P2.
Process Tree Example
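A process tree can be rebuilt from logged (PID, PPID) pairs. The sketch below assumes the log has already been parsed into such pairs; the PID values and the helper names build_tree and count_nodes are made up for illustration.

```python
from collections import defaultdict


def build_tree(records):
    """Map each parent PID to the list of its child PIDs."""
    children = defaultdict(list)
    for pid, ppid in records:
        children[ppid].append(pid)
    return children


def count_nodes(children, root):
    """Count the root and all its descendants (iterative, no deep recursion)."""
    total, stack = 0, [root]
    while stack:
        node = stack.pop()
        total += 1
        stack.extend(children.get(node, []))
    return total


if __name__ == "__main__":
    # Hypothetical PIDs: 100 plays the role of P0, which forks 101 and 102;
    # 101 then starts one more process, 103.
    records = [(101, 100), (102, 100), (103, 101)]
    tree = build_tree(records)
    print(count_nodes(tree, 100))
```

Counting nodes this way is one simple way to compare the size of recovered trees across sessions.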
Modeling Attackers and Honeypot
actions
• The current AHIH can accept or block the execution of
a program. This is implemented by allowing or
blocking the do_execve() call in the Linux kernel.
• Let Pr(Block) be the probability of blocking do_execve().
When blocked, the attacker may
conclude that the machine (the honeypot) is not ready yet.
S/he will then invest time in getting the machine
ready, for example by downloading source code and recompiling
it, or will launch other types of attacks.
• The probability of allowing the call is then 1 − Pr(Block).
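The block/allow choice can be simulated as a single biased coin flip. This is a sketch in user space, not the kernel-level mechanism the slides describe; the function name decide and the returned strings are assumptions.

```python
import random


def decide(pr_block, rng=random):
    """Return "block" with probability pr_block, otherwise "allow"."""
    if not 0.0 <= pr_block <= 1.0:
        raise ValueError("pr_block must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so this branch is taken
    # with probability pr_block.
    return "block" if rng.random() < pr_block else "allow"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    decisions = [decide(0.3, rng) for _ in range(10000)]
    print(decisions.count("block") / len(decisions))
```

Over many draws the fraction of blocked calls approaches Pr(Block), and the fraction of allowed calls approaches 1 − Pr(Block).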
execve() and do_execve() – Concept:
• Linux programs are launched using the
execve() system call.
• The function prototype for C programmers
looks like this:
– int execve(const char *filename, char *const
argv[], char *const envp[]);
Detailed View of Program’s Stack Layout
• Here, filename is the name of the executable file to run and the
pointer arrays argv and envp contain the command-line
arguments and environment variable strings respectively for the
new program.
• The execve() function is responsible for determining the format of
the named file and for taking appropriate actions to load and
execute that file. In the case of shell scripts that have been marked
as executable, execve() must instantiate a new shell, which in turn
is used to execute the named script.
• In the case of compiled binaries, which are predominantly ELF these
days, execve() invokes the appropriate loader functions to move the
binary image from disk into memory, to perform the initial stack
setup, and ultimately to transfer control to the new program.
• The execve() function is implemented within the Linux kernel by the
do_execve() function, which can be found in a file named fs/exec.c.
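To see the three arguments in action from user space, Python's os.execve takes the same filename, argv and environment as the C prototype. The sketch below is an illustration for POSIX systems only; the wrapper name run_with_execve and the exit code 127 for a failed exec are choices made here, not something taken from the slides.

```python
import os


def run_with_execve(filename, argv, env):
    """Fork, replace the child with filename via execve, return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: execve only comes back (by raising OSError) if it fails.
        try:
            os.execve(filename, argv, env)
        except OSError:
            pass
        os._exit(127)
    # Parent: wait for the child and decode its exit status.
    # (The decoded value is only meaningful if the child exited normally.)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    # argv[0] is the program name by convention; env is the new environment.
    print(run_with_execve("/bin/sh", ["sh", "-c", "exit 3"], {"PATH": "/bin:/usr/bin"}))
```

The fork is there because a successful execve never returns: it replaces the calling process image, so the parent must stay behind to observe the result.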
Attackers and Honeypot Interactions
• In the game between the attacker and the honeypot,
there are three possible scenarios once
an attack is blocked:
– Retry the command:
• The attacker may think that the download repository s/he is
using to store the malware is suffering a temporary
failure. This pushes him/her to use a backup
repository, which exposes another
repository. -- Pr(Retry)
– Select an alternative solution:
• The attacker may decide to debug the program on the
honeypot. -- Pr(Alternative)
– Quit:
• If the attacker suspects that the machine may
be a honeypot, s/he may decide to quit. -- Pr(Quit)
Attacker / Honeypot possible
Interaction
• Any possible action of the attacker after being
blocked is governed by the relation:
Pr(Retry)+Pr(Alternative)+Pr(Quit)=1
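The constraint above means the three reactions form a probability distribution, so an attacker's reaction to a block can be sampled from it. This sketch is illustrative only; the function name sample_reaction and the reaction labels are assumptions, and the probabilities passed in are made-up values.

```python
import random

REACTIONS = ["retry", "alternative", "quit"]


def sample_reaction(pr_retry, pr_alternative, pr_quit, rng=random):
    """Draw one reaction after a block, checking that the probabilities sum to 1."""
    probabilities = (pr_retry, pr_alternative, pr_quit)
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("Pr(Retry) + Pr(Alternative) + Pr(Quit) must equal 1")
    return rng.choices(REACTIONS, weights=probabilities, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    print([sample_reaction(0.5, 0.3, 0.2, rng) for _ in range(5)])
```

Rejecting inputs that do not sum to 1 makes the sketch enforce the relation stated on the slide instead of silently renormalizing.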
Honeypot game Example
Computing Payoffs
• Two honeypot games are proposed. The games
differ in how the payoff is computed.
1. Number of transitions: the attacker's goal is to
minimize the number of transitions in the HPA, while the honeypot
tries to maximize the number of transitions.
2. Path probability payoff:
$$R_a^{p} = \frac{P_r(\mathit{path})}{P_r^{*}(\mathit{path})}, \qquad R_h^{p} = 1 - \frac{P_r(\mathit{path})}{P_r^{*}(\mathit{path})}$$
Experimental Evaluation
• A honeypot is set up that is capable of
detecting calls to do_execve() and clone().
• The honeypot runs on Qemu, an
x86 emulator. The kernel inside
Qemu is modified to log process IDs.
• System logs are transmitted to a syslog-ng server.
• The default running service is SSH.
• The SSH server is configured so that no password
is asked.
Experimental Evaluation (cont.)
• The honeypot was operated on IPv4 with Ubuntu 7.1 as OS.
The Linux OS itself ran in a virtual machine
operated by Qemu ver. 0.9.1.
• The honeypot was operated for almost 3 months,
during which 637 successful SSH logins and
12140 failures were observed.
• Attackers tested 1763 non-existent accounts with
different passwords (which explains the high number of failures).
• The successful logins came from 183 unique IP addresses.
• The honeypot was periodically mounted and
checksums were computed. Why?
– To detect any malicious change to the OS
kernel. Also, reboot was set to power the machine off.
• 637 process trees were recovered. The
smallest tree has only one node and the
largest has 1954 nodes (mostly related to bot
interactions). Why?
– Because a botmaster holds long sessions with the bots
s/he controls.
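The periodic checksum comparison mentioned above can be sketched as follows. The slides do not say which hash or which files were used, so SHA-256 and the helper names sha256_of and changed_files are assumptions; the baseline is a dictionary of digests recorded while the image was known to be clean.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline, paths):
    """Return the paths whose current checksum differs from the baseline."""
    return [path for path in paths if sha256_of(path) != baseline.get(path)]
```

A path with no baseline entry is reported as changed, which errs on the side of flagging unknown files.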
Simulation result
Questions