Data Hiding in Digital Imagery


Steganography
in digital media
Word origin
From Greek Steganos (covered) and graphia (writing)
Steganography is sometimes called
- Secret writing
- Concealed writing
- Covert communication
- Stealth communication
- Data hiding
- Electronic invisible ink
- The prisoners’ problem
Steganography
• ~470 B.C.: first written evidence, by the Greek historian Herodotus.
• Term coined by Johannes Trithemius in 1499.
• Digital media provide an ideal hideout.
• Steganography in its modern form is only ~15 years old.
Stego software
Research
Number of IEEE publications containing the keywords
steganography or steganalysis.
Steganography
The prisoners’ problem, Simmons (1983)
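[Figure: block diagram. Sender (Alice): message 00101…1, compression, encryption (encryption key), embedding with the stego key into a cover object taken from an image source, producing the stego-object. Receiver (Bob): decryption, decompression, message 00101…1.]
Communication is monitored by a warden looking for suspicious artifacts.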
Main requirement: undetectability (no algorithm can distinguish stego objects from cover objects with success better than random guessing)
Warden: passive, active, or malicious
Example of a steganographic channel
Alice pretends that she wants to sell her sofa on the auction site eBay.
[Figure: Alice's listing, carrying the secret message, is viewed by Buyer No. 1, Buyer No. 2, …, Buyer No. 10,000; the secret message is read on the buyers' side.]
Difference between steganography and cryptography
Both are privacy tools involving keys that enable two or
more parties to communicate privately.
Crypto makes the message unintelligible to those not
possessing the correct keys, but the existence of a secret
message is obvious (overt).
Stego conceals the very presence of a message (covert);
the communicated object is just a decoy.
Three fundamental types of steganography
1. Steganography by cover selection
Sender selects a cover from a large set of available covers so
that the required message is communicated.
2. Steganography by cover synthesis
Sender creates the cover that communicates the desired
message.
3. Steganography by cover modification
Sender modifies an existing cover in order to convey the
required message.
1. Steganography by cover selection
Secret shared codebook
A picture in landscape format means “yes”, portrait format means “no”.
If the picture contains an animal, attack tomorrow.
Red T-shirt means 0, yellow means 1, …
Steganography by hashing
Apply a pre-agreed message digest function (could depend on
a key) to the cover and search for a cover till the digest
matches the message.
The recipient hashes the image to extract the message.
Advantage: the cover is “100% natural”.
Disadvantage: low payload; is it really secure?
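The hashing variant can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a vetted scheme: HMAC-SHA256 stands in for the "pre-agreed message digest function", byte strings stand in for image files, and the names `digest_bits`, `select_cover`, and `library` are hypothetical.

```python
import hashlib
import hmac

def digest_bits(cover: bytes, key: bytes, n_bits: int) -> int:
    """Return the first n_bits of HMAC-SHA256(key, cover) as an integer."""
    mac = hmac.new(key, cover, hashlib.sha256).digest()
    return int.from_bytes(mac, "big") >> (256 - n_bits)

def select_cover(candidates, key: bytes, message: int, n_bits: int):
    """Sender: return the first cover whose keyed digest equals the message."""
    for cover in candidates:
        if digest_bits(cover, key, n_bits) == message:
            return cover
    return None  # no suitable cover in the library

# Hypothetical cover library: byte strings stand in for unmodified images.
library = [f"photo-{i}".encode() for i in range(5000)]
key = b"shared secret"
message = 0b10110010  # an 8-bit message

chosen = select_cover(library, key, message, n_bits=8)
# Recipient: hashing the received cover yields the message again.
if chosen is not None:
    recovered = digest_bits(chosen, key, n_bits=8)
```

Matching an n-bit message takes about 2^n candidate covers on average, which is one way to see why the payload of this approach is low.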
2. Steganography by cover synthesis
Alice creates the cover
Mimic functions
SpamMimic (www.spammimic.com) encodes a message so that it resembles spam.
Acrostics (linguistic steganography)
“Apparently neutral's protest is thoroughly discounted and ignored.
Isman hard hit. Blockade issue affects pretext for embargo on byproducts,
ejecting suets and vegetable oils.”
“Amorosa Visione” by Giovanni Boccaccio (1313–1375) contains
three sonnets and a poem such that the initial letters of its successive
tercets spell out the sonnets exactly.
Cardan’s grille (1501–1576)
3. Steganography by cover modification
Assume there are three sets
C … set of all cover objects
K … set of all keys
M … set of all messages that can be communicated
A steganographic embedding scheme is a pair of embedding and extraction
functions Emb and Ext
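$\mathrm{Emb} : C \times K \times M \to C$
$\mathrm{Ext} : C \times K \to M$
such that
$\mathrm{Ext}(\mathrm{Emb}(c, k, m), k) = m$ for all $c \in C$, $k \in K$, $m \in M$
[Figure: the message m, the cover c, and the key k enter Emb, which outputs the stego object s; Ext, given s and k, returns m.]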
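As a concrete instance of an Emb/Ext pair, the sketch below hides bits in the least significant bits of cover values along a pseudo-random path derived from the key. It is a toy under stated assumptions: the cover is a plain list of integers in 0..255, Python's `random.Random` (not a cryptographic generator) derives the path, and `ext` is given the message length as an extra argument, which the formal definition does not include.

```python
import random

def emb(c, k, m):
    """Embed the bit list m into cover c along a path selected by key k."""
    s = list(c)  # the stego object starts as a copy of the cover
    path = random.Random(k).sample(range(len(c)), len(m))
    for idx, bit in zip(path, m):
        s[idx] = (s[idx] & ~1) | bit  # replace the least significant bit
    return s

def ext(s, k, n_bits):
    """Read n_bits back from stego object s along the same key-dependent path."""
    path = random.Random(k).sample(range(len(s)), n_bits)
    return [s[idx] & 1 for idx in path]

cover = list(range(100, 164))      # 64 made-up "pixel" values
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = emb(cover, 42, message)
recovered = ext(stego, 42, len(message))  # equals message
```

Extraction with the same key reproduces the path and therefore the message; a different key selects a different path and, in general, reads unrelated bits.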
The problem of steganography
• We wish to embed as many bits as possible in the cover
object without introducing any statistically
detectable artifacts.
• Statistical undetectability (no one should be
able to tell whether an image contains a secret
message) can be formalized using information
theory.
A tip of an iceberg?
Dhiren Barot, an Al Qaeda operative, filmed reconnaissance video between
Broadway and South Street and concealed it by splicing it into a copy of the
Bruce Willis movie "Die Hard: With a Vengeance." Barot was sentenced to
40 years to life in Great Britain.
NY Times article available from
http://www.nytimes.com/2006/11/08/world/europe/08britain.html?th&emc=th
(Requires registration)
The steganography program S-Tools was used to distribute child porn. This case
occurred between 1998 and 2000. A person working at a government facility
was using S-Tools to hide child porn in images and then distributing them
through e-mail and postings from his work computer.
Steganography was detected by identifying color patterns in the GIF palette.
The suspect was confronted and the embedded images were retrieved.
Source: Neil Johnson, 2006, and N.F. Johnson and S. Jajodia, “Steganalysis of Images
Created Using Current Steganography Software,” in D. Aucsmith (ed.): Information Hiding,
2nd International Workshop, LNCS vol. 1525, Springer-Verlag Berlin Heidelberg, pp. 273-289,
1998.
June 19, 2001
Schwarzenegger’s letter
Considerable interest from Government
and law enforcement
Major US agencies funding research in steganography
– US Air Force and AFOSR
– National Institute of Justice (NIJ)
– Office of Naval Research (ONR)
– National Science Foundation (NSF)
– Defense Advanced Research Projects Agency (DARPA)
Steganalysis is considered part of Computer Forensics
Important for protection against malware
Tools developed for steganalysis find applications in Digital
Forensics in general (e.g., for detection of digital forgeries
and integrity and origin verification)
Steganalysis in the wide sense
Traditional steganalysis: a steganography system is
considered broken when the mere presence of a hidden
message is detected
Forensic analysis: detection of the message may not be
sufficient; often, other information would be useful
• identification of the embedding algorithm (LSB, ±1, …)
• the stego software used (F5, OutGuess, Steganos, …)
• the stego key (StegoSuite © by Wetstones, Inc.)
• the hidden bit-stream
• the decrypted message
LSB embedding
and its analysis
LSB embedding
Embedding function Emb (Matlab syntax)
% 'b' is a vector of m bits (secret message), assumed to be in the workspace
c = imread('my_decoy_image.bmp');      % Grayscale cover image in BMP format (512 x 512)
s = c;                                 % Stego image starts as a copy of the cover
k = 1;                                 % Counter
for i = 1 : 512
    for j = 1 : 512
        LSB = mod(c(i, j), 2);
        if k > m || LSB == b(k)        % Message exhausted, or LSB already matches
            s(i, j) = c(i, j);
        else
            s(i, j) = c(i, j) + b(k) - LSB;
        end
        k = k + 1;
    end
end
imwrite(s, 'stego_image.bmp', 'bmp');  % Stego image 's' saved to disk
LSB embedding
Extraction function Ext (Matlab syntax)
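% 'm' is the number of message bits, assumed to be known to the recipient
s = imread('stego_image.bmp');   % Grayscale stego image in BMP format (512 x 512)
b = zeros(1, m);                 % Preallocate the message vector
k = 1;                           % Counter
for i = 1 : 512
    for j = 1 : 512
        if k <= m
            b(k) = mod(s(i, j), 2);
            k = k + 1;
        end
    end
end
% b is the extracted secret message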
Why is LSB embedding so popular?
General (can be applied to any digital file consisting of numerical data)
Extremely simple
Fast
High capacity (1 bit per pixel, embedding efficiency 2)
Does not require any software present on the computer
One-line Perl script on UNIX (source: A. Ker, Oxford University):
perl -n0777e '$_=unpack"b*",$_;split/(\s+)/,<STDIN>,5;
@_[8]=~s{.}{$&&v254|chop()&v1}ge;print@_'
<input.pgm >output.pgm secrettextfile
The LSB plane of images resembles random noise, so this method was
believed to be undetectable.
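The figure "embedding efficiency 2" can be read as two message bits embedded per changed pixel: a random message bit already agrees with a pixel's LSB about half the time, so only about half of the used pixels are modified. The simulation below illustrates this under assumptions that are not from the slides: uniformly random pixel values and message bits, a fixed seed, and the helper name `lsb_embed`.

```python
import random

def lsb_embed(cover, bits):
    """Sequential LSB replacement; returns (stego, number of changed pixels)."""
    stego = list(cover)
    changes = 0
    for k, bit in enumerate(bits):
        if stego[k] % 2 != bit:
            stego[k] ^= 1  # flip the least significant bit
            changes += 1
    return stego, changes

rng = random.Random(0)
cover = [rng.randrange(256) for _ in range(20000)]  # synthetic pixel values
bits = [rng.randrange(2) for _ in range(20000)]     # random message, 1 bit per pixel

stego, changes = lsb_embed(cover, bits)
efficiency = len(bits) / changes  # bits embedded per changed pixel, close to 2
```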
LSB plane of Lenna
• LSB bit plane of a never-compressed Lenna image.
Properties of LSB flipping
$\mathrm{LSBflip}(x) = x + 1 - 2(x \bmod 2)$
LSBflip is an involution, i.e., $\mathrm{LSBflip}(\mathrm{LSBflip}(x)) = x$ for all x
LSB flipping induces a permutation on {0, …, 255}:
0 ↔ 1, 2 ↔ 3, 4 ↔ 5, …, 254 ↔ 255
LSB flipping is “asymmetrical” (e.g., 3 may change to 2 but never to 4)
$|\mathrm{LSBflip}(x) - x| = 1$ for all x (embedding distortion is 1 per flipped pixel)
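These properties are easy to check exhaustively over the 8-bit range. The following Python sketch does so; the function name `lsb_flip` is chosen here for illustration.

```python
def lsb_flip(x: int) -> int:
    """Flip the least significant bit of x: even values go up by 1, odd values go down by 1."""
    return x + 1 - 2 * (x % 2)

values = range(256)

# Applying the flip twice returns the original value.
assert all(lsb_flip(lsb_flip(x)) == x for x in values)
# The flip is a permutation of {0, ..., 255}.
assert sorted(lsb_flip(x) for x in values) == list(values)
# It swaps the members of each pair (2i, 2i+1) and never crosses pairs.
assert all(lsb_flip(2 * i) == 2 * i + 1 for i in range(128))
assert lsb_flip(3) == 2
# Every flip changes the value by exactly 1.
assert all(abs(lsb_flip(x) - x) == 1 for x in values)
```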
Effect of LSB embedding on histogram
LSB flipping pair 2i, 2i+1
$h_c[2i]$ = number of occurrences of the value 2i in the cover image
$h_c[2i+1]$ = number of occurrences of the value 2i + 1 in the cover image
[Figure: histogram bins 2i and 2i+1 before embedding ($h_c$) and after embedding ($h_s$), with the parts untouched by embedding marked.]
For a fully embedded image (in expectation):
$h_s[2i] = (h_c[2i] + h_c[2i+1])/2$
$h_s[2i+1] = (h_c[2i] + h_c[2i+1])/2$
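A small simulation makes the equalization of a flipping pair visible. The numbers below are invented for illustration: a cover with 3000 pixels of value 100 and 1000 pixels of value 101, fully embedded with random bits from a seeded generator.

```python
import random
from collections import Counter

rng = random.Random(1)

# Made-up cover: the pair (100, 101) is strongly unbalanced.
cover = [100] * 3000 + [101] * 1000
rng.shuffle(cover)

# Full embedding: every pixel's LSB is replaced by a random message bit.
stego = [(x & ~1) | rng.randrange(2) for x in cover]

h_c = Counter(cover)  # cover histogram: 3000 and 1000
h_s = Counter(stego)  # stego histogram: both bins close to 2000

print(h_c[100], h_c[101])
print(h_s[100], h_s[101])
```

The pair total is preserved because embedding never moves a value out of its pair; only the split between the two bins changes.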
“Twin peaks” in the histogram
• The peaks can be tested for using a chi-square test.
• By looking at the histogram of pixel pairs, an even more accurate attack
can be built (Sample Pairs Analysis, SPA).
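One way to turn the histogram observation into a test statistic is to compare each even bin with the mean of its flipping pair. The sketch below computes such a chi-square statistic in plain Python. It is an illustration under assumptions rather than a complete detector: the function name `pair_chi_square` is hypothetical, and converting the statistic into a p-value (which needs the chi-square distribution with one fewer degree of freedom than the number of pairs used) is left out.

```python
def pair_chi_square(hist):
    """Chi-square statistic of even bins against the pair means (hist[2i] + hist[2i+1]) / 2.

    Returns (statistic, number of pairs that contributed).
    """
    stat = 0.0
    pairs_used = 0
    for i in range(len(hist) // 2):
        expected = (hist[2 * i] + hist[2 * i + 1]) / 2
        if expected > 0:  # skip empty pairs to avoid dividing by zero
            stat += (hist[2 * i] - expected) ** 2 / expected
            pairs_used += 1
    return stat, pairs_used

# Made-up histograms with a single populated pair (100, 101).
cover_like = [0] * 256
cover_like[100], cover_like[101] = 30, 10
stego_like = [0] * 256
stego_like[100], stego_like[101] = 20, 20

print(pair_chi_square(cover_like))  # prints (5.0, 1)
print(pair_chi_square(stego_like))  # prints (0.0, 1)
```

A statistic near zero means the pairs are already equalized, as expected after full LSB embedding; a large value indicates the unbalanced pairs typical of an unmodified cover.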
EECE 562 Fundamentals of Steganography
Cambridge University Press, November 2009,
460 pages, $68 on Amazon.