Data Hiding Watermarking and Steganography

Data Hiding
Watermarking and
Steganography
Outline
• Introduction to Data Hiding
• Watermarking
– Definition and History
– Applications
– Basic Principles
– Requirements
– Attacks
– Evaluation and Benchmarking
– Examples
• Steganography
– Definition and History
– Applications
– Basic Principles
– Examples of Techniques
– Demos
Data Hiding
[Figure: a secret message and a carrier document enter a keyed embedding algorithm; the result is transmitted via the network to a keyed detector, which recovers the secret message.]
• Information Hiding is a general term encompassing many subdisciplines
• Two important sub-disciplines are:
Steganography and Watermarking
– Steganography:
• Hiding: keeping the existence of the information secret
– Watermarking
• Hiding: making the information imperceptible
• Information hiding differs from cryptography (cryptography is
about protecting the content of messages)
The Need for Data Hiding
• Covert communication using images (secret message is
hidden in a carrier image)
• Ownership of digital images, authentication, copyright
• Data integrity, fraud detection, self-correcting images
• Traitor-tracing (fingerprinting video-tapes)
• Adding captions to images, additional information,
such as subtitles, to video, embedding subtitles or
audio tracks to video (video-in-video)
• Intelligent browsers, automatic copyright information,
viewing a movie in a given rated version
• Copy control (secondary protection for DVD)
Issues in Data Hiding
• Perceptibility: does embedding information “distort” cover
medium to a visually unacceptable level (subjective)
• Capacity: how much information can be hidden relative
to its perceptibility (information theory)
• Robustness to attacks: can embedded data survive
manipulation of the stego medium in an effort to destroy,
remove, or change the embedded data
• Trade-offs between the three:
– More robust => lower capacity
– Lower perceptibility => lower capacity, etc.
Requirements
Each application is rated from low to high on six requirements: capacity, robustness, invisibility, security, embedding complexity, and detection complexity.
Applications:
• Covert communication
• Copyright protection of images (authentication)
• Fingerprinting (traitor-tracing)
• Adding captions to images, additional information, such as subtitles, to videos
• Image integrity protection (fraud detection)
• Copy control in DVD
• Intelligent browsers, automatic copyright information, viewing movies in given rated version
The “Magic” Triangle
[Figure: triangle with corners Capacity, Security, and Robustness. Naïve steganography sits near the capacity corner, secure steganographic techniques near the security corner, and digital watermarking near the robustness corner.]
There is a trade-off between capacity, invisibility, and robustness.
Additional factors:
• Complexity of embedding / extraction
• Undetectability
Two kinds of redundancy make data hiding possible:
• Information-theoretic: removed by lossless compression
• Perceptual: removed by lossy compression
[Figure: the original image with perturbations of ±2, ±5, and ±31 gray levels added.]
Watermarking
• Intent: data embedding conveys some
information about the cover medium such
as owner, copyright, or other information
• Watermark can be considered to be an
extended attribute of the data
• Robustness of watermark is a main issue
• Observers know a watermark may be present
• Can be visible or invisible
Steganography
• Intent: transmit secret message hidden in
innocuous-looking cover medium so that
its existence is undetectable
• Robustness not typically an issue
• Capacity desired for message is large
• Always invisible
• Typically dependent on file format
Watermarking
Watermarking: Definition
• Watermarking is the practice of imperceptibly altering a cover to embed a
message about that cover
• Watermarking is closely related to steganography, but there are differences
between the two
– In watermarking, the message is related to the cover
– Steganography typically relates to covert point-to-point communication
between two parties. Therefore, steganography requires only limited
robustness
– Watermarking is often used when the cover is available to parties
who know of the hidden data and may have an interest in
removing it
– Therefore, watermarking carries the additional notion of resilience against
attempts to remove the hidden data
• Watermarks are inseparable from the cover in which they are embedded.
Unlike cryptography, watermarks can protect content even after it is
decoded.
Watermarking: History
• More than 700 years ago, watermarks were used in Italy to indicate
the paper brand and the mill that produced it
• By the 18th century, watermarks began to be used as anti-counterfeiting measures on money and other documents
• The term watermark was introduced near the end of that century. It
was probably chosen because the marks resemble the effects of
water on paper
• The first example of a technology similar to digital watermarking
is a patent filed in 1954 by Emil Hembrooke for identifying works
• In 1988, Komatsu and Tominaga appear to have been the first to use the
term "digital watermarking"
• Around 1995, interest in digital watermarking began to mushroom
Motivation
• The rapid revolution in digital multimedia and the ease of
generating identical and unauthorized digital data.
• USA Today, Jan. 2000: estimated lost revenue from digital
audio piracy is US $8,500,000,000.00
• The need to establish reliable methods for
copyright protection and authentication.
• The need to establish secure invisible channels
for covert communications.
• Adding captions and other additional information.
Watermarking:Applications
• Copyright protection
– Most prominent application
– Embed information about the owner to prevent others from claiming
copyright
– Require very high level of robustness
• Copy protection
– Embed watermark to disallow unauthorized copying of the cover
– For example, a compliant DVD player will not play back or copy data that
carry a "copy never" watermark
• Content Authentication
– Embed a watermark to detect modifications to the cover
– The watermark in this case has low robustness, "fragile"
Watermarking: Basic principles
Watermarking: Requirements
• Imperceptibility
– The modifications caused by watermark embedding should be below
the perceptible threshold
• Robustness
– The ability of the watermark to resist distortion introduced by
standard or malicious data processing
• Security
– A watermark is secure if knowing the algorithms for embedding and
extracting does not help an unauthorized party to detect or remove the
watermark
Digital Watermarking - Examples
• Text – varying spaces after punctuation, spaces
in between lines of text, spaces at the end of
sentences, etc.
• Audio – low bit coding, random imperceptible
noise, fragile & robust, etc.
• Images – least-significant bit,
random noise, masking and filtering, etc.
Digital Watermarking – Qualities/Types
• Effect on quality of original content – how much degradation does
the watermarking technique introduce, and what level of
degradation is acceptable
• Visible vs. invisible – visible, such as a company logo
stamped on an image or movie, or invisible and
imperceptible
• Fragile vs. robust – fragile watermarks break down
easily, whereas robust watermarks survive manipulations of
the content (in some watermarking of audio files, both are
used)
Digital Watermarking – Qualities/Types (continued)
• Public vs. private – private watermarking techniques
require that the original be used as a basis of
encryption, whereas public techniques do not
• Public-key vs. secret-key – secret-key watermarking
uses the same watermarking key to read the content
as the key that was inserted into the image; public-key
watermarking uses different keys for watermarking the image and
reading the image
Digital watermark categories
Robust watermark: used for copyright protection.
Requirements: the watermark should be permanently attached to the host
signal; removing the watermark should destroy the perceptual
quality of the signal.
Fragile watermark: used for tamper detection or as a digital signature.
Requirements: breaks very easily under any modification of the host
signal.
Semi-fragile watermark: used for data authentication.
Requirements: robust to some benign modifications, but breaks very
easily under other attacks.
Provides information about the location and nature of the attack.
Copyright protection of digital images (authentication)
[Figure: original image + watermark = watermarked image]
• Robustness against all kinds of image distortion
• Robustness to intentional removal even when all details
about the watermarking scheme are known (Kerckhoffs's
principle)
• Watermark pattern must be perceptually transparent
• Watermark depends on a secret key
• Robustness to over-watermarking, collusion, and other attacks
Proving ownership using a digital watermark
• Ownership is proved by showing that an image in question
contains a watermark that depends on owner’s secret key
• If pirate embeds his own watermark, the ownership can be
resolved by producing the original image or the
watermarked image (neither contains pirate’s watermark)
Detectable watermark:
Pseudo-random sequence
is either present or not
present (1 bit embedded)
Readable watermark:
One can recover a short
message, e.g. info about
the owner (100 bits)
Fingerprinting or traitor tracing
Marking copies of one document with a customer signature.
[Figure: the original is combined with watermarks W1, W2, …, WN to produce a differently marked copy for each of N customers.]
Robust, secure, invisible watermark, resistant with respect
to the collusion attack (averaging copies of documents with
different marks).
Adding captions to images, additional
information to videos
Typical application:
• Adding subtitles in multiple languages
• Additional audio tracks to video
• Tracking the use of the data (history file)
• Adding comments, captions to images
Watermark requirements:
• Moderately robust scheme
• Robustness with respect to lossy compression, noise adding,
and A/D D/A conversion
• Original images (frames) not available for message extraction
• Security requirement not so strong
• Fast detection, watermark embedding can be more time
consuming
Watermarking principles
In the spatial domain, the watermark is embedded by directly modifying
the pixel values.
[Figure: original image + watermark pattern = watermarked image]
Watermarking for color images
• One or more selected color channels
• Luminance
In the transform domain, the watermark is embedded in the
transform space by modifying coefficients.
[Figure: image → DCT → modify DCT coefficients → inverse DCT]
Oblivious vs. non-oblivious watermarking
• Non-oblivious: the original image is needed for extraction
• Oblivious: the original image is not necessary
NEC Scheme
Watermark embedding:
The 1000 highest-energy DCT coefficients $v_k$ are modulated with
a Gaussian random sequence $w_k \sim N(0,1)$:
$v_k' = v_k (1 + a w_k)$,
where $v_k'$ are the modified DCT coefficients and $a$ is the
watermark strength, which also directly influences watermark
visibility.
NEC Scheme
Watermark detection:
• Subtract the original image from the watermarked (attacked)
image and extract the watermark sequence $w'$ (which may be
corrupted due to image distortion)
• Correlate $w$ with $w'$, where $w$ is the original watermark sequence:
$\mathrm{sim}(w, w') = \dfrac{w \cdot w'}{\sqrt{w' \cdot w'}}$
$\mathrm{sim}(w, w')$ is called the similarity.
$\mathrm{sim}(w, w') > Th$ => watermark is present
$\mathrm{sim}(w, w') < Th$ => watermark is not present
Watermark detection
Direct Spread Spectrum in Spatial Domain
Patchwork, (Bender, Gruhl, and Morimoto)
• Initialize a PRNG with a secret key
• Randomly select $n$ pixel pairs with grayscales $a_i$ and $b_i$
• Set $a_i \leftarrow a_i + 1$ and $b_i \leftarrow b_i - 1$
• Use $S$ to verify watermark presence:
$S = \sum_{i=1}^{n} (a_i - b_i)$
Hypothesis testing is used to confirm the presence of the watermark
at a given confidence level.
$S = 0$ (in expectation) with $\sigma = 104.5 \times \sqrt{n}$ if no watermark is present
$S \approx 2n$ if the watermark is present
Set the threshold Th to adjust the probability of false alarms and missed detections
Using patches of pixels rather than single pixels improves robustness
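The Patchwork steps can be sketched as below. This is an illustrative sketch under simple assumptions: the image is a flat list of 8-bit gray values, the pixel pairs are disjoint, and the helper names (`select_pairs`, `embed`, `statistic`) are hypothetical.

```python
import random

def select_pairs(num_pixels, n, key):
    """Key-dependent choice of n disjoint pixel pairs (a_i, b_i)."""
    rng = random.Random(key)
    idx = rng.sample(range(num_pixels), 2 * n)   # 2n distinct positions
    return list(zip(idx[:n], idx[n:]))

def embed(pixels, n, key):
    """Raise each a_i by one gray level and lower each b_i by one."""
    out = list(pixels)
    for a, b in select_pairs(len(out), n, key):
        out[a] = min(255, out[a] + 1)
        out[b] = max(0, out[b] - 1)
    return out

def statistic(pixels, n, key):
    """S = sum over the key-selected pairs of (a_i - b_i)."""
    return sum(pixels[a] - pixels[b] for a, b in select_pairs(len(pixels), n, key))

rng = random.Random(0)
image = [rng.randint(1, 254) for _ in range(200 * 200)]  # avoids clipping in this demo
n, key = 10000, 1234
marked = embed(image, n, key)

shift = statistic(marked, n, key) - statistic(image, n, key)
print("shift of S caused by embedding:", shift)
```

Embedding shifts S by 2n, while the unmarked statistic has a standard deviation of about 104.5√n, so a single-level change needs a large n (or patches of pixels) before the shift stands out from the noise.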
Frequency Based Spread Spectrum Watermarking
Watermark embedding:
• Transform image using DCT, DFT, Hadamard, wavelet,
key-dependent random transformations
• Select n coefficients to be modified
- the most perceptually important coefficients
- fixed band depending on image size
- key-dependent selection (frequency hopping)
• Generate pseudo-random watermark sequence w1, …, wn
• Modulate selected coefficients vk, k = 1, …, n
vk’ = vk + awk,
(Ruanaidh et al.)
vk’ = vk + avk wk,
(Cox et al.)
vk’ = vk + a|vk|wk
(Piva et al.)
• Use inverse transform to get the watermarked image
Watermark detection using correlation
Transform coefficients:
• Original image: $v_k$
• Watermarked image: $v_k'$
• Attacked watermarked image: $v_k''$
Non-oblivious schemes
Watermark approximation for each embedding rule:
$v_k' = v_k + a w_k$  =>  $u_k = (v_k'' - v_k)/a$
$v_k' = v_k + a v_k w_k$  =>  $u_k = (v_k'' - v_k)/(a v_k)$
$v_k' = v_k + a |v_k| w_k$  =>  $u_k = (v_k'' - v_k)/(a |v_k|)$
• Correlate $u_k$ with $w_k$
• Threshold the result
• Make a decision about watermark presence
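Watermark detection using correlation
Oblivious schemes
• Correlate $v_k''$ with $w_k$ for the embedding rules
$v_k' = v_k + a w_k$,
$v_k' = v_k + a |v_k| w_k$
• If no distortion is present
$\mathrm{corr} = \sum v_k'' w_k = \sum (v_k + a w_k) w_k \approx a n \sigma^2$
$\mathrm{corr} = \sum v_k'' w_k = \sum (v_k + a |v_k| w_k) w_k \approx a n \overline{|v|} \sigma^2$
where $\sigma^2$ is the variance of the watermark sequence $w_k$
• If an incorrect noise sequence is used
$\mathrm{corr} = 0$ with $\sigma_{\mathrm{corr}}^2 \approx n$,
which enables us to set a decision threshold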
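A small numerical sketch of the oblivious correlation detector for the additive rule $v_k' = v_k + a w_k$ is given below. The coefficient statistics, the ±1 watermark (so that $\sigma^2 = 1$), the strength `a`, and the choice of threshold at half the expected correlation are all assumptions made for illustration.

```python
import random

def embed(coeffs, w, a):
    """Additive rule v'_k = v_k + a w_k."""
    return [v + a * wk for v, wk in zip(coeffs, w)]

def corr(coeffs, w):
    """Correlation sum over k of v''_k w_k, computed without the original image."""
    return sum(v * wk for v, wk in zip(coeffs, w))

rng = random.Random(7)
n, a = 10000, 2.0
v = [rng.gauss(0, 10) for _ in range(n)]             # stand-in transform coefficients
w = [rng.choice((-1.0, 1.0)) for _ in range(n)]      # correct sequence, variance 1
wrong = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # incorrect sequence

marked = embed(v, w, a)
threshold = a * n / 2    # halfway between 0 and the expected value a * n * sigma^2

print("correct key  :", corr(marked, w) > threshold)
print("incorrect key:", corr(marked, wrong) > threshold)
```

With the correct sequence the correlation clusters around $a n \sigma^2$, while with an incorrect one it fluctuates around zero with a spread that grows only like $\sqrt{n}$, which is what makes a fixed threshold workable.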
Frequency masking
The presence of a signal of one frequency can raise the
perceptual threshold of signals with frequencies close to
the masking frequency.
[Figure: masking threshold versus frequency around a masking signal; a masked signal lies below the threshold.]
Spatial masking
Image discontinuities also have the ability to mask small
image distortions.
[Figure: luminance profile across an edge with the masking threshold raised near the edge.]
Perceptual Watermarking (Tewfik et al.)
(1) Image divided into 8x8 blocks
(2) Each block is DCT transformed
(3) Frequency masking*) determines the JND (just noticeable difference) for each frequency bin
(4) $v_k' = v_k + w_k \, \mathrm{JND}(b, k)$
(5) Block is inverse DCT transformed
(6) Spatial masking**) model verifies invisibility
- If the changes are visible, the JND is rescaled; go to (4)
• Invisibility of the watermark guaranteed
• Increased watermark energy leads to higher robustness
*) Foley, Legge frequency masking model
**) Girod’s spatial masking model
Data Embedding in Video (Tewfik et al)
• Very high capacity with medium robustness
• Useful for embedding video-in-video or audio-in-video
without increasing the bandwidth or requiring two separate
information streams.
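[Figure: an 8 x 8 block B and an 8 x 8 signature S are both DCT-transformed and multiplied to give a projection p; the perceptual mask M gives the quantization step T = min M. The axis is marked at (k-1)T, kT, and (k+1)T.]
• p is moved to p’ = kT – T/4 to embed a 0, or to p’ = kT + T/4 to embed a 1
• Watermarked block B’ = B + (p’– p) DCT(S)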
Block Diagram of Video
Watermarking
Robustness to geometric transformations
Easy if the original image is available (non-oblivious schemes)
Very challenging for oblivious schemes especially for a
combination of cropping, scaling, rotation, and shift
Approaches:
• Watermarking by small blocks (good for cropping)
• Embedding patterns with known geometry
• Watermarking using Fourier-Mellin transform (scaling and
rotation converted to shift)
• Embedding watermarks into image features or salient points
Weak points:
• Computational complexity
• More powerful geometric attacks - StirMark
Forensic analysis
Analysis of lighting and shadows
Localized analysis of
- noise
- histogram
- colors
Looking for discontinuities
Fragile watermarks
Properties:
Break easily
Computationally cheap
Good localization properties
Too sensitive for redundant data
Examples:
Embedding check-sums in the LSBs
Adding m-sequences to image blocks
Steve Walton, “Information authentication for a slippery new age”, Dr.
Dobb's Journal, vol. 20, no. 4, pp. 18–26, April 1995.
Fragile Watermarks for Tamper Detection
• A set of key-dependent random walks covering the image
• Choose a large integer N
• For each walk, add the gray values determined by the 7 most
significant bits; denote the sum by S
• Embed the remainder S mod N into the LSBs of the walk
• Probability of making a compliant change is 1/N
• S could be made walk-dependent to prevent exchanging
groups of pixels with the same check-sum
[Figure: a random walk visiting pixels in key-dependent order; the 7 most significant bits of each pixel (for example p1: 1 0 1 0 0 0 1 1, p2: 1 1 0 0 0 1 0 0, p3: 1 1 0 0 1 0 0 1) are summed into S, and the embedded check-sum S mod N is written into the LSBs.]
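The check-sum scheme can be sketched as follows. This is only an illustration of the idea: the walks are taken as consecutive chunks of a key-seeded shuffle, the walk length and modulus are assumed values (the modulus must fit in as many bits as a walk has pixels), and all function names are hypothetical.

```python
import random

WALK_LEN = 16     # pixels per walk (assumed)
MODULUS = 65521   # the large integer N, below 2**16 so it fits in 16 LSBs (assumed)

def make_walks(num_pixels, key, walk_len=WALK_LEN):
    """Key-dependent walks: chunks of a shuffled list of pixel positions."""
    idx = list(range(num_pixels))
    random.Random(key).shuffle(idx)
    return [idx[i:i + walk_len]
            for i in range(0, num_pixels - walk_len + 1, walk_len)]

def checksum(pixels, walk):
    """S mod N, where S adds the gray values given by the 7 most significant bits."""
    return sum(pixels[i] & 0xFE for i in walk) % MODULUS

def embed(pixels, key):
    out = list(pixels)
    for walk in make_walks(len(out), key):
        s = checksum(out, walk)
        for bit_pos, i in enumerate(walk):       # one check-sum bit per LSB
            out[i] = (out[i] & 0xFE) | ((s >> bit_pos) & 1)
    return out

def tampered_walks(pixels, key):
    """Return the indices of walks whose stored check-sum no longer matches."""
    bad = []
    for n, walk in enumerate(make_walks(len(pixels), key)):
        stored = sum((pixels[i] & 1) << b for b, i in enumerate(walk))
        if stored != checksum(pixels, walk):
            bad.append(n)
    return bad

rng = random.Random(5)
image = [rng.randrange(256) for _ in range(64)]
marked = embed(image, key=42)
tampered = list(marked)
tampered[5] ^= 0x04                              # alter one pixel
print(tampered_walks(marked, key=42), tampered_walks(tampered, key=42))
```

Only the walk containing the altered pixel fails verification, which is the localization property mentioned for fragile watermarks.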
1. Overlay the fragile watermark
Three key-dependent binary-valued functions $f_R$, $f_G$, $f_B$,
$f_{R,G,B} : \{0, 1, \ldots, 255\} \to \{0, 1\}$,
are used to encode a binary logo $B$. The gray scales are
perturbed so that
$B(i,j) = f_R(R(i,j)) \oplus f_G(G(i,j)) \oplus f_B(B(i,j))$
for all $(i,j)$.
The image authenticity is verified by checking the same
relationship for each pixel $(i,j)$.
[Figure: each pixel of the original image is perturbed until f( ) matches the corresponding pixel of the binary logo, giving the authenticated image.]
Robust watermarks on small blocks
Properties:
Medium robustness
Insensitive to small changes
Not as good localization properties
Can distinguish malicious and
non-malicious modifications
Examples:
Spread spectrum watermarks on
medium size blocks
Wavelet domain watermarks
J. Fridrich, “Image Watermarking for Tamper Detection”,
Proc. ICIP ’98, Chicago, Oct 1998.
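2. Insert robust watermark into every block
[Figure: a robust bit extractor reduces each block B (labeled 64 pixels) to 50 bits; these bits, the secret key K, and the block number drive the synthesis of a Gaussian sequence W(K, B), which is added to B to give the watermarked block.]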
Hybrid watermark
Properties:
Fragile, sensitive, and robust
Good localization properties
Can distinguish malicious and
non-malicious modifications
Examples:
Robust watermarks on medium blocks
combined with a fragile watermark
The watermarked image "Lena" with
outlined blocks and block numbers.
(After brightness adjustment and JPG compression)
Presence of the robust watermark (above); Fragile
watermark indicated tampered areas with black dots (below).
(Retouched eyes) Presence of the robust watermark
(above); Fragile watermark indicated tampered areas
with black dots (below).
(Replaced face and softened) Presence of
the robust watermark (above); Fragile
watermark indicated tampered areas with
black dots (below).
Self-embedding
Properties:
Fragile
Security problems
Good localization properties
Tampered areas can be fixed
Easy to remove
Examples:
Coding quantized DCT transformed
blocks in distant blocks
J. Fridrich and M. Goljan “Protection of Digital Images Using Self Embedding”,
Symposium on Content Security and Data Hiding in Digital Media, New Jersey
Institute of Technology, May 14, 1999.
Images with Self-correcting Capabilities
• Content of block B1 is compressed and encoded in the
LSBs of B2
• B1 and B2 are separated by a random vector p
Self-embedding algorithm #1
QUANTIZATION
Binary encoding of 11 coefficients
CODE1: 64 bits per block
Self-embedding algorithm #2
QUANTIZATION
Binary encoding of 21 coefficients
+ up to 2 next nonzero coefficients
CODE2: 128 bits per block
For Binary Encoding
L = [7 7 6 5 4 3 2 1
     7 6 5 5 4 2 1 0
     7 5 5 4 3 1 0 0
     5 5 4 3 1 0 0 0
     4 4 3 1 0 0 0 0
     3 2 1 0 0 0 0 0
     2 1 0 0 0 0 0 0
     1 0 0 0 0 0 0 0].
The bit lengths provided for encoding
the 64 coefficients (with sign).
The first 11 coefficients take 64 bits.
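The 64-bit budget can be checked with a short calculation. The sketch below assumes that the coefficients are taken in the usual JPEG zigzag order, which the slides do not state explicitly; under that assumption the first 11 bit lengths in L add up to 64.

```python
# Bit lengths (including sign) allotted to each of the 64 DCT coefficients.
BIT_LENGTHS = [
    [7, 7, 6, 5, 4, 3, 2, 1],
    [7, 6, 5, 5, 4, 2, 1, 0],
    [7, 5, 5, 4, 3, 1, 0, 0],
    [5, 5, 4, 3, 1, 0, 0, 0],
    [4, 4, 3, 1, 0, 0, 0, 0],
    [3, 2, 1, 0, 0, 0, 0, 0],
    [2, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
]

def zigzag_order(size=8):
    """Cells of a size x size block in JPEG zigzag order (assumed scan order)."""
    cells = [(i, j) for i in range(size) for j in range(size)]
    # Walk the anti-diagonals; odd diagonals run top to bottom, even ones bottom to top.
    return sorted(cells,
                  key=lambda c: (c[0] + c[1],
                                 c[0] if (c[0] + c[1]) % 2 else -c[0]))

def code_length(num_coeffs):
    """Total bits needed for the first num_coeffs coefficients in zigzag order."""
    return sum(BIT_LENGTHS[i][j] for i, j in zigzag_order()[:num_coeffs])

print(code_length(11))
```

A 64-bit code per 8 x 8 block is exactly one bit per pixel, which matches embedding the code in a single LSB plane of another block.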
Original image
Embedded image (1 LSB encoding)
Original image embedded in itself
Embedded image (2 LSB encoding)
Reconstruction of a license plate
Tampered image - The license plate
has been replaced with a different one
• 2 LSBs have been used for selfembedding
The original license plate after
reconstruction
Reconstruction after mosaic filtering
Manipulated image
Secret key
Reconstructed image
Attacks
• Attacks are carried out with the intention of destroying the
watermark so that the content can be used without paying
royalties to its originator.
• Must withstand various signal processing attacks:
– Compression
– Cropping, editing, composing.
– Printing.
– Adding small amounts of noise.
Attack: Example
• Alice puts an image on her web page.
• Eve and Mallet copy the image and claim it as their own.
• All three appear before the Judge.
• Alice, using her image and Eve’s, extracts the
watermark.
• Alice, using her image and Mallet’s altered one, extracts
a noisy version of her watermark.
• Alice must convince the Judge that the noisy watermark is
indeed hers and not a false alarm.
Watermark attacks
• Robustness attacks: intended to remove the watermark.
JPEG compression, filtering, cropping, histogram
equalization, additive noise, etc.
• Presentation attacks: cause watermark detection failure.
Geometric transformation, rotation, scaling, translation,
change of aspect ratio, line/frame dropping, affine
transformation, etc.
• Counterfeiting attacks: render the original image useless,
generate a fake original, deadlock problem.
• Court of law attacks: take advantage of legal issues.
Typical Attacks and Distortions
used on Watermarks
• Enhancement: sharpening, contrast, color
correction
• Additive and multiplicative noise:
Gaussian, uniform, speckle
• Linear filtering: lowpass, highpass,
bandpass
• Nonlinear filtering: median filters, rank
filters, morphological filters
Typical Attacks and Distortions
used to design Watermarks
• Lossy compression: JPEG, MPEG2,
MPEG4, audio as well
• Geometric transformations: shifts,
rotations, scaling, shearing (affine)
• Data reduction: cropping, clipping,
histogram modification
• D/A and A/D conversion: print-scan,
analog TV transmission
The IBM Attack (ownership deadlock)
[Figure: Alice's original X + Alice's watermark W1 = watermarked image Y, the distributed image.]
Bob generates a random watermark W2,
subtracts it to get X’ = Y – W2, and presents X’ as a false original:
X’ + W2 = Y = X + W1
X’ = X + W1 – W2 ⇒ X’ contains W1
X = X’ + W2 – W1 ⇒ X contains W2
[Figure: false original X’, claimed by Bob, + Bob's watermark W2 = the identical distributed image Y.]
The IBM Attack - solution
• Make the watermark dependent on the original image
in a non-invertible way
X + W1(X) = Y
For example, W1(X) is a watermark generated from a
PRNG seeded with a hash of X.
Creating a forgery amounts to solving the equation
Y – W1(Z) = Z
for the unknown Z.
• Another possibility is timestamping.
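The image-dependent watermark can be sketched as below. This is a toy illustration, not a secure design: the watermark is a ±2 sequence drawn from Python's `random` module seeded with a SHA-256 hash of the image, and the names `watermark_from` and `claim_is_consistent` are hypothetical. It shows why Bob's subtraction trick no longer yields a consistent false original.

```python
import hashlib
import random

def watermark_from(image, strength=2):
    """W(X): a +/-strength sequence from a PRNG seeded with a hash of X."""
    digest = hashlib.sha256(",".join(map(str, image)).encode()).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [rng.choice((-strength, strength)) for _ in image]

def embed(image):
    """Y = X + W(X)."""
    return [x + w for x, w in zip(image, watermark_from(image))]

def claim_is_consistent(claimed_original, distributed):
    """A claimed original Z is accepted only if Z + W(Z) equals Y."""
    return embed(claimed_original) == distributed

rng = random.Random(3)
x = [rng.randint(10, 245) for _ in range(64)]   # Alice's original X
y = embed(x)                                    # distributed image Y

# Bob's attempt: pick an arbitrary W2 and subtract it from Y.
w2 = [rng.choice((-2, 2)) for _ in x]
x_fake = [p - w for p, w in zip(y, w2)]

print("Alice:", claim_is_consistent(x, y))
print("Bob  :", claim_is_consistent(x_fake, y))
```

For Bob's claim to pass, the watermark generated from his false original would have to equal the W2 he chose beforehand, which is the equation Y – W1(Z) = Z that he cannot solve by simple subtraction.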
Secure public watermark detector
Detector is implemented as a tamper-proof black box
that takes integer matrices on its input and outputs one
bit (watermark present or not).
Application: Copy control in DVD players.
Assumptions: The attacker knows the watermarking
algorithm and the detection algorithm, has one
watermarked image available, but does not have the
secret built-in key.
Task: To obtain some knowledge about the secret key
or to remove the watermark
Secure public watermark detector
Many watermark detectors D correlate some quantities $x_k$
derived from the watermarked image I with a secret
sequence $w_k$:
$D(I) = H\left( \sum_{k=1}^{N} x_k w_k - Th \right)$
Th … threshold
H … Heaviside step function, H(x) = 1 for x > 0, H = 0 otherwise
Attack: (Cox, Linnartz, Kalker, Dijk, ...)
(1) Find a critical image by progressively deteriorating the
image (for example, by replacing the pixel values
one-by-one by the average gray level)
(2) Feed the detector with special images to reconstruct wk or
to learn the sensitivity of the detection function to various
pixels.
Secure public watermark detector
Statistical attacks (Kalker)
The culprit: Linearity of the watermark detector, and
the ability to purposely modify the derived quantities
through pixel modifications.
Sensitivity attacks (Cox, Linnartz et al.)
Determine the set of pixels with the largest influence
on the watermark detector; attempt to remove the
watermark by subtracting a small multiple of the set of sensitive pixels;
iterate.
The culprit: the sensitivity of the watermark detector at the
critical image is similar to, or at least positively
correlated with, that for the watermarked image.
Secure public watermark detector
Observation:
In order to design a watermarking method with a detector
that would not be vulnerable to those attacks, we need to
mask the quantities that are being correlated so that we
cannot purposely change them through pixel values and
we must introduce nonlinearity into the scheme to prevent
the sensitivity attack.
Key-dependent basis functions and a special nonlinear
detection function may solve the problem.
Limitations of digital watermarking
• Digital watermarking does not prevent
copying or distribution.
• Digital watermarking alone is not a
complete solution for access/copy control
or copyright protection.
• Digital watermarks cannot survive every
possible attack.
Challenges in Watermarking research
• Lack of protocols, standards and benchmarking.
• Lack of comprehensive mathematical theory.
• Watermark survival for all attacks.
• Relating robustness, capacity, perceptual quality
and security.
• Will it be used, and how will the legal system adopt
it?
Trends in watermarking research
• Color image watermarking, and other multimedia
signals.
• 2nd generation watermarking.
• Watermarking of maps graphics and cartoons.
• Information theoretic issues.
• Applications beyond copyright protection.
• Protocols and standardization.
Steganography
Steganography
• Embed information in such a way that its very
existence is concealed.
• Goal
– Hide information in a way that is undetectable both perceptually
and statistically.
– Security: prevent extraction of the hidden information.
• A different concept from cryptography, but it uses some
of its basic principles.
HISTORY
• 440 B.C.
– Histiaeus shaved the head of his most trusted slave
and tattooed it with a message, which disappeared
after the hair had regrown. The purpose was to instigate a revolt
against the Persians.
• 1st and 2nd World Wars
– German spies used invisible ink to print very small
dots on letters.
– Microdots: blocks of text or images scaled down to
the size of a regular dot.
Early steganography
• Pictographs: e.g., Sherlock Holmes’s
Dancing Men.
“Come Here At Once”
An Example: Null-Cipher
• Message sent by a German spy during World War I:
PRESIDENT’S EMBARGO RULING
SHOULD HAVE IMMEDIATE NOTICE.
GRAVE SITUATION AFFECTING
INTERNATIONAL LAW. STATEMENT
FORESHADOWS RUIN OF MANY
NEUTRALS. YELLOW JOURNALS
UNIFYING NATIONAL EXCITEMENT
IMMENSELY.
Null Cipher-Solved!
• Message sent by a German spy during World War I:
PRESIDENT’S EMBARGO RULING SHOULD HAVE
IMMEDIATE NOTICE. GRAVE SITUATION AFFECTING
INTERNATIONAL LAW. STATEMENT FORESHADOWS
RUIN OF MANY NEUTRALS. YELLOW JOURNALS
UNIFYING NATIONAL EXCITEMENT IMMENSELY.
Pershing sails from NY June I.
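The decoding rule for this null cipher is simply to read the first letter of each word. A minimal sketch (the function name `first_letters` is hypothetical):

```python
def first_letters(text):
    """Read a null cipher by taking the first letter of every word."""
    return "".join(word[0] for word in text.split())

cover = (
    "PRESIDENT'S EMBARGO RULING SHOULD HAVE IMMEDIATE NOTICE. "
    "GRAVE SITUATION AFFECTING INTERNATIONAL LAW. STATEMENT "
    "FORESHADOWS RUIN OF MANY NEUTRALS. YELLOW JOURNALS "
    "UNIFYING NATIONAL EXCITEMENT IMMENSELY."
)

print(first_letters(cover))  # PERSHINGSAILSFROMNYJUNEI
```

Inserting word breaks by hand gives "Pershing sails from NY June I".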
Other Old Ideas
• Pinpricks in maps.
• Tattoos on scalp.
• Dotted I’s and crossed T’s.
• Hidden meanings: “Is father dead or deceased?”
• Deliberate misspellings or errors, e.g., errors in
trivia books, log tables, etc.
• Unusual languages, e.g., Navajo; peculiar sounds
used especially in guerrilla warfare (Genghis Khan)
The prisoners' problem
• Alice and Bob are in jail and wish to hatch an escape plan.
• All communication between Alice and Bob passes through Willy, the warden.
• Alice's and Bob's goal is to hide their ciphertext in an innocuous-looking
way so that Willy will not become suspicious.
• If Willy is a passive warden, he will not do anything to Alice's
and Bob's communication.
• If Willy is an active warden, he will alter the data being sent
between Alice and Bob.
Problem Formulation
[Figure: Alice sends “Hello” to Bob; the message passes through the warden (labeled Wendy).]
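Terminology
[Figure: Alice feeds the secret message, a cover message, and a secret key into the embedding algorithm to produce the stego message. Wendy asks "Is it a stego message?": if yes, she suppresses the message; if no, it passes to Bob, whose message retrieval algorithm uses the secret key to recover the secret message.]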
Steganography Techniques
• Substitution methods
– Bit plane methods
– Palette-based methods
• Signal Processing methods
– Transform methods
– Spread spectrum techniques
• Coding methods
– Quantizing, dithering
– Error correcting codes
• Statistical methods – use hypothesis testing
• Cover generation methods - fractals
Stego-system Criteria
• Cover data should not be significantly modified,
i.e., the changes should not be perceptible to the human perceptual system
• The embedded data should be directly encoded
in the cover and not in a wrapper or header
• Embedded data should be immune to
modifications to the cover
• Distortion cannot be eliminated, so error-correcting codes need to be included whenever
required
Places to Hide Information: Steganography
• Images
• Audio files
• Text
• Video
We focus on images as cover media,
though most ideas apply to video and
audio as well.
Steganography in Text
• Soft Copy Text
– Encode data by varying the number of spaces
after punctuation
– Slight modifications of formatted text will be
immediately apparent to anyone reading the
text
Steganography in Text
• Soft Copy Text
– Use of White Space (tabs & spaces) is much
more effective and less noticeable
– This is most common method for hiding data
in text
Steganography in Text
• Soft Copy Text
– Encode data in additional spaces placed at
the end of a line
[Figure: sample text with data encoded as extra spaces at the ends of lines.]
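End-of-line encoding can be sketched as below. The convention used here (one bit per line, a single trailing space for 1 and none for 0) is an assumption chosen for simplicity; the slides do not fix a particular encoding, and the function names are hypothetical.

```python
def embed(lines, bits):
    """Hide one bit per line: a trailing space encodes 1, no trailing space encodes 0."""
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry the message")
    marked = [line.rstrip() + (" " if bit else "")
              for line, bit in zip(lines, bits)]
    # Lines beyond the message carry no trailing whitespace.
    return marked + [line.rstrip() for line in lines[len(bits):]]

def extract(lines, num_bits):
    """Recover the bits by checking each line for a trailing space."""
    return [1 if line.endswith(" ") else 0 for line in lines[:num_bits]]

cover = ["Four score", "and seven", "years ago"]
bits = [1, 0, 1]
stego = embed(cover, bits)
print(extract(stego, len(bits)))
```

The text looks unchanged when displayed, but any tool that strips trailing whitespace destroys the message, which illustrates how fragile soft-copy text methods are.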
Steganography in Text
• Hard Copy Text
– Line Shift Coding
• Shifts every other line up or down slightly in order
to encode data
– Word Shift Coding
• Shifts some words slightly left or right in order to
encode data
Steganography in Text
• Some methods that can be used with
either hard or soft copy text
– Feature Coding
– Syntactic
– Semantic
Steganography in Audio
• Low Bit Coding
• Phase Coding
• Spread Spectrum
• Echo Data Hiding
Steganography in Audio
• Low Bit Coding
– Most digital audio is created by sampling the signal
and quantizing the sample with a 16-bit quantizer.
– The rightmost bit, or low order bit, of each sample can
be changed from 0 to 1 or 1 to 0
– This modification from one sample value to another is
not perceptible by most people and the audio signal
still sounds the same
Steganography in Audio
• Phase Coding
– Relies on the relative insensitivity of the human
auditory system to phase changes
– Substitutes the initial phase of an audio signal with a
reference phase that represents the data
– More complex than low bit encoding, but it is much
more robust and less likely to distort the signal that is
carrying the hidden data.
Steganography in Audio
• Direct Sequence Spread Spectrum
– Spreads the signal by multiplying it by a chip,
which is a maximal length pseudorandom
sequence
– DSSS introduces additive random noise to the
sound file
Steganography in Audio
• Echo Data Hiding
– Discrete copies of the original signal are
mixed in with the original signal creating
echoes of each sound.
– By using two different time values between an
echo and the original sound, a binary 1 or
binary 0 can be encoded.
Steganography in MP3
• A music company publishes albums as MP3 files over the
internet.
• Some people take these MP3 files and publish them under
their own name.
• The case goes to court.
• The music company needs to prove that the material
in evidence is indeed what it published.
• It needs a hidden copyright mark.
Steganography in MP3 (contd.)
• Principle: audio signals contain a significant portion of information
that can be discarded without the average listener noticing the change.
• MP3Stego – tool developed by Fabien A.P. Petitcolas
• The tool operates within the MP3 encoding process
• The data to be hidden is first compressed and encrypted, then hidden in the
MP3 bit stream.
• Quantization of the original audio signal takes place.
• At the same time, at some selected points, data is introduced into the
quantized output.
• The distortions this introduces are constantly checked against
the psychoacoustic model.
• A variable records the number of bits used for the actual
audio data, for Huffman coding, and for the hidden data.
• The values to be modified to hold the hidden data are selected using a
pseudo-random bit generator based on SHA-1.
Steganography in Images
How images are stored:
• Array of numbers representing RGB values for each
pixel
• Common images are in 8-bit/pixel and 24-bit/pixel
formats.
• 24-bit images have a lot of space for storage but are huge
and invite compression
• 8-bit images are a good option.
• Proper selection of the cover image is important.
• Best candidates: grayscale images
• The methods cash in on the limitations of human visual perception
Steganography: Bit plane Methods
• Image: replace least significant bit (LSB) of
image intensity with message bit
• Replace lowest 3 or 4 LSB with message bits or
image data (assume 8 bit values)
• Data is hidden in “noise” of image
• Can hide surprisingly large amounts of data this
way
• Very fragile to any image manipulation
Bit plane Methods
• Variations include:
– Using a permutation of pixel locations at
which to hide the bits.
– Put bits at only certain locations in image
where there is “significant” variation and
change in gray-value would not be visually
perceptible
Least Significant Bit method
• Consider a 24-bit picture
• Data to be inserted: character ‘A’: (10000011)
• Host pixels: 3 pixels will be used to store one character of 8 bits
• The pixels selected for holding the data are chosen
on the basis of the key, which can be a random number.
• Ex:
Host pixels:
00100111 11101001 11001000
00100111 11001000 11101001
11001000 00100111 11101001
Embedding ‘A’:
00100111 11101000 11001000
00100110 11001000 11101000
11001001 00100111 11101001
• According to researchers, on average only about 50% of the least
significant bits actually change from 0 to 1 or 1 to 0.
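The example above can be replayed with a few lines of Python. This is a sketch under assumptions: the nine host bytes are taken in reading order rather than selected by a key, the bit pattern 10000011 is used exactly as the slide gives it (standard ASCII for 'A' would be 01000001), and `embed_lsb` / `extract_lsb` are hypothetical helper names.

```python
def embed_lsb(cover_bytes, bits):
    """Replace the LSB of the first len(bits) bytes with the message bits."""
    if len(bits) > len(cover_bytes):
        raise ValueError("message does not fit in the cover")
    stego = list(cover_bytes)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit
    return stego

def extract_lsb(stego_bytes, num_bits):
    """Read the message back from the LSBs."""
    return [b & 1 for b in stego_bytes[:num_bits]]

# Nine host bytes (3 pixels x 3 color channels) from the slide.
cover = [0b00100111, 0b11101001, 0b11001000,
         0b00100111, 0b11001000, 0b11101001,
         0b11001000, 0b00100111, 0b11101001]
message_bits = [1, 0, 0, 0, 0, 0, 1, 1]  # bit pattern as given on the slide

stego = embed_lsb(cover, message_bits)
changed = sum(1 for c, s in zip(cover, stego) if c != s)
print([format(b, "08b") for b in stego])
print("bytes changed:", changed)  # 4 of the 8 bytes used
```

Four of the eight bytes that carry a bit are modified here, which matches the 50% figure quoted above.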
[Figure: “TOP SECRET” + cover image = stego image; 8-bit (256 grayscale) images.]
Example: Copyright Fabien A.P. Petitcolas,
Computer Laboratory, University of Cambridge
http://www.cl.cam.ac.uk/~fapp2/steganography/image_downgrading/
Sacrificing 2 bits of cover to carry 2 bits of
secret image
[Figure panels: Original Image, Extracted Image]
Sacrificing 5 bits of cover to carry 5 bits of
secret image
[Figure panels: Original Image, Extracted Image]
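The two captions above correspond to replacing the n low-order bits of each cover pixel with the n high-order bits of the secret pixel. A hedged sketch, assuming both images are flat lists of 8-bit gray values of equal length; `hide` and `recover` are hypothetical names, not taken from the original demo:

```python
def hide(cover, secret, n):
    """Put the n most significant bits of each secret pixel into the
    n least significant bits of the matching cover pixel."""
    if not 1 <= n <= 7:
        raise ValueError("n must be between 1 and 7")
    if len(cover) != len(secret):
        raise ValueError("images must have the same size")
    keep_mask = (0xFF << n) & 0xFF  # keeps the upper 8 - n cover bits
    return [(c & keep_mask) | (s >> (8 - n)) for c, s in zip(cover, secret)]

def recover(stego, n):
    """Rebuild an approximation of the secret from the n low bits."""
    low_mask = (1 << n) - 1
    return [(p & low_mask) << (8 - n) for p in stego]
```

With n = 2 the cover changes by at most 3 gray levels but the extracted image keeps only 4 gray levels; with n = 5 the extracted image is much better and the cover visibly worse, which is the trade-off the figures illustrate.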
Palette-based Methods
• Palette manipulation means changing the way
the color or grayscale palette represents the
image colors
• Bit methods are used in palette manipulation
schemes
• Data hidden in “noise” of image
• Often radical color shifts occur - can tip off that
data is hidden
• Use grayscale to overcome color shift problem
Sample palettes
[Figure: three palettes: drastic & subtle shade variations; red color shade variations; gray scale shade variations]
Palette-based Methods
• Pseudo-color 8-bit image: 256 different colors
that are indexed by the numbers 0,…,255
• To insert information, for example, S-Tools
reduces the number of colors from 256 to 32
and uses the low-order (LSB) bit positions to hide data
• In this case, each group of 8 palette colors is the same before
data embedding; after data embedding the 8 colors are
visually very close but differ in their bit
representation
Steganography for palette images
LSB encoding cannot be directly applied to palette-based images
because new colors, that are not present in the palette,
would be created.
Two sources of palette images:
1. Color truncation + dithering of photographs
2. Computer generated images (fractals, cartoons, animations)
A secure steganographic method will produce modified carriers
compatible with the source
Possibilities
Hiding in the palette
Hiding in the image data
Non-adaptive techniques
Adaptive techniques
Artifacts
Palette artifacts
Image data artifacts
Possible approaches
Message hiding in the image data - greedy techniques
Decrease color depth and expand (1 bpp)
1. Collapse 256 colors → 128 colors
2. Expand 128 colors → 256 colors by including a close color
(e.g., flip the LSB of the blue channel)
3. Embed a binary message into the LSB of the blue channel
of randomly selected pixels
4. Read the message from the LSB of the blue channel
Alternatively (3 bpp)
1. Decrease color depth to 32 colors and include all colors obtained
from LSB shuffling of all 32 colors (one color produces 2^3
colors)
2. Encode messages into the LSBs of pixel colors
Possible approaches
Message hiding in the image data
Parity embedding
1. Assign parity to palette colors
2. Embed message bits as the parity of colors
Message: 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 0 0 1 1 1 1
Embedding with a sorted palette:
1. Take a randomly chosen pixel with color C1
2. Find the color C1 in the sorted palette (e.g., index = 30 = 00011110)
3. Replace the LSB of the index to color C1 with the message bit
(00011110 → 00011111)
4. The new index now points to a neighboring color C2
5. Replace the index of the pixel in the original image to point to the
new color C2
Critical assumption: Colors close in the luminance-sorted palette
are also close in the color space.
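The sorted-palette procedure above can be sketched as follows. The assumptions are mine: the palette is a list of (R, G, B) tuples with an even number of entries, sorting uses the common luminance weights 0.299, 0.587, 0.114, message bits go into the first pixels rather than randomly chosen ones, and all function names are hypothetical.

```python
def luminance(rgb):
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def sorted_order(palette):
    """Palette indices ordered by luminance (ties broken by index)."""
    return sorted(range(len(palette)),
                  key=lambda i: (luminance(palette[i]), i))

def embed(pixels, palette, bits):
    """pixels: list of palette indices. Returns new indices carrying bits."""
    if len(palette) % 2:
        raise ValueError("sketch assumes an even palette size")
    if len(bits) > len(pixels):
        raise ValueError("message does not fit in the cover")
    order = sorted_order(palette)                        # sorted pos -> index
    rank = {idx: pos for pos, idx in enumerate(order)}   # index -> sorted pos
    stego = list(pixels)
    for k, bit in enumerate(bits):
        pos = (rank[stego[k]] & ~1) | bit  # replace LSB of sorted position
        stego[k] = order[pos]              # point pixel at neighboring color
    return stego

def extract(pixels, palette, num_bits):
    order = sorted_order(palette)
    rank = {idx: pos for pos, idx in enumerate(order)}
    return [rank[p] & 1 for p in pixels[:num_bits]]
```

The palette itself is left untouched; only the pixel indices change, so the result depends on the critical assumption stated above.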
White Noise Storm inserts data by using spread spectrum technology and
frequency hopping, but severely changes the palette.
[Figure: Airfield embedded using White Noise Storm]
Original 24-bit Renoir converted to 248-color GIF; Airfield inserted
using S-Tools; final stego image has 256 colors in GIF format.
[Figure: Airfield embedded using S-Tools with 8-bit Renoir]
Palette Methods
• Color c_i: 10110010
• Color c_i+1: 10110011
• When the palette is ordered by luminance, there are
groups of pixel colors that look identical to the
eye; L = 0.299R + 0.587G + 0.114B
• Airfield is a 3-bit image put in the last 3 bits of the Renoir
image
• Very fragile: destroyed by image manipulation
Example: Insertion of a Paragraph of Text
Original Image
Text Hidden using
White Noise Storm
Text Hidden using StegoDos
Text Hidden using S-Tools
Transform Domain Techniques
• Discrete Cosine Transform
• Discrete Wavelet Transform
• Discrete Fourier Transform
• Mellin-Fourier Transform
• Related:
– Singular Value Decomposition
– Minimax Eigenvalue Decomposition
Discrete Cosine Transform
The forward equation, for image A, is
$$b(u,v) = \frac{2}{N}\, C(u)\, C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} a(x,y) \cos\left[\frac{\pi u (2x+1)}{2N}\right] \cos\left[\frac{\pi v (2y+1)}{2N}\right]$$
The inverse equation, for image B, is
$$a(x,y) = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\, C(v)\, b(u,v) \cos\left[\frac{\pi u (2x+1)}{2N}\right] \cos\left[\frac{\pi v (2y+1)}{2N}\right]$$
Discrete Cosine Transform
• JPEG uses DCT to compress an image
• Many different approaches to use DCT to hide
information
• Message is embedded in signal, not noise
• Studies on visual distortions conducted by
source coding community can be used to predict
the visible impact of the hidden data in the cover
image
• Can be implemented in compressed domain,
saving time
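As a rough illustration of the forward and inverse DCT equations given earlier, the following direct (slow, O(N^4)) Python transcription may help. It assumes the usual normalization C(0) = 1/sqrt(2) and C(k) = 1 otherwise, which the slides do not state; `dct2` and `idct2` are hypothetical names.

```python
import math

def _c(k):
    # Assumed normalization: 1/sqrt(2) for the zero frequency, else 1.
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def _cos(k, t, n):
    return math.cos(math.pi * k * (2 * t + 1) / (2 * n))

def dct2(a):
    """Forward 2-D DCT of a square block a[x][y]; returns b[u][v]."""
    n = len(a)
    return [[(2.0 / n) * _c(u) * _c(v) * sum(
                 a[x][y] * _cos(u, x, n) * _cos(v, y, n)
                 for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]

def idct2(b):
    """Inverse 2-D DCT of coefficients b[u][v]; returns a[x][y]."""
    n = len(b)
    return [[(2.0 / n) * sum(
                 _c(u) * _c(v) * b[u][v] * _cos(u, x, n) * _cos(v, y, n)
                 for u in range(n) for v in range(n))
             for y in range(n)]
            for x in range(n)]
```

Production code would normally use a fast transform from a numerical library; the nested sums here are kept only because they mirror the equations term by term.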
Discrete Cosine Transform
Basic idea of JPEG:
1. Convert image to YCbCr color space (one luminance and two chrominance planes)
2. Each color plane is partitioned into 8x8 blocks
3. Apply DCT to each block
4. Values are quantized by dividing with preset
quantization values (in a table)
5. Values are then rounded to nearest integer
Steganography: One Approach using DCT
• The sender and receiver agree ahead of time
on the location of two DCT coefficients in the 8 x 8
block
• Middle frequencies with same quantization
value: Location 1 is (4,1) & Location 2 is (3,2)
Quantization table (standard JPEG luminance table):
 16  11  10  16  24  40  51  61
 12  12  14  19  26  58  60  55
 14  13  16  24  40  57  69  56
 14  17  22  29  51  87  80  62
 18  22  37  56  68 109 103  77
 24  35  55  64  81 104 113  92
 49  64  78  87 103 121 120 101
 72  92  95  98 112 100 103  99
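A small sketch of the quantization step using the 64 values listed above, read row by row as an 8 x 8 table. Reading the two agreed locations as 0-based (row, column) pairs is an assumption on my part; it is the reading under which both entries equal 22. The helper names are hypothetical, and Python's `round` breaks exact .5 ties toward the even integer.

```python
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(block):
    """Divide each DCT coefficient by its table entry and round."""
    return [[round(block[r][c] / Q[r][c]) for c in range(8)] for r in range(8)]

def dequantize(qblock):
    """Approximate reconstruction of the coefficients."""
    return [[qblock[r][c] * Q[r][c] for c in range(8)] for r in range(8)]

# Both agreed locations share the same quantization step (assumed indexing).
print(Q[4][1], Q[3][2])
```

Because the two coefficients are divided by the same step, their relative order is more likely to survive quantization, which is what the scheme relies on.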
Steganography: DCT
• The DCT is applied to each 8 x 8 block in
the image producing a block Bi
• Each block will encode a single bit, 0 or 1
• If the message bit is a 1 then the larger of
the two values Bi(4,1) and Bi(3,2) is put in
location (4,1), otherwise if the message bit
is a 0, the smaller of the two values is put
in location (4,1)
DCT Steganography
• If the difference |Bi(4,1) - Bi(3,2)| < µ, then the values
Bi(4,1) and Bi(3,2) are adjusted so that |Bi(4,1) - Bi(3,2)| > µ
• This assures that the relative difference will not be lost
when the compression is done
• This last step can introduce distortion into the image
• The JPEG compression is performed (if desired) and
then the resulting image is inverse transformed
• Other modifications to this algorithm have been
researched that overcome some of these limitations
DCT Steganography
• To extract the data, the DCT is performed
on each block, and the coefficient values
at locations (4,1) and (3,2) are compared
• If Bi(4,1) > Bi(3,2) then the message bit is
a 1, otherwise it is a 0
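The embedding and extraction rules for one block can be sketched as below. This works directly on an 8 x 8 list of DCT coefficients and leaves out the transform and JPEG steps. Treating the locations as 0-based (row, column) indices and widening the gap symmetrically to 2µ are my assumptions; `embed_bit` and `extract_bit` are hypothetical names.

```python
LOC1 = (4, 1)  # agreed coefficient locations, assumed (row, column), 0-based
LOC2 = (3, 2)

def embed_bit(block, bit, mu):
    """Return a copy of the coefficient block that encodes one bit."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    out = [list(row) for row in block]
    v1 = block[LOC1[0]][LOC1[1]]
    v2 = block[LOC2[0]][LOC2[1]]
    hi, lo = max(v1, v2), min(v1, v2)
    if hi - lo <= mu:
        # Push the two values apart so their difference exceeds mu.
        mid = (hi + lo) / 2.0
        hi, lo = mid + mu, mid - mu
    if bit == 1:
        out[LOC1[0]][LOC1[1]], out[LOC2[0]][LOC2[1]] = hi, lo
    else:
        out[LOC1[0]][LOC1[1]], out[LOC2[0]][LOC2[1]] = lo, hi
    return out

def extract_bit(block):
    """Bit is 1 when the coefficient at LOC1 is the larger one."""
    return 1 if block[LOC1[0]][LOC1[1]] > block[LOC2[0]][LOC2[1]] else 0
```

A larger µ makes the bit more likely to survive recompression but moves the coefficients further from their original values.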
Wavelet Steganography
• Many different schemes proposed
• Wang and Kao give a multithreshold wavelet
coding scheme where coefficients with high
values are used to store information
• These coefficients are assumed to keep relative
values the same even after multiple image
processing operations
• If the coefficients change value much, the visual
difference is noticeable in the image
• Can be used for textured and natural images
Example of Image and Its Wavelet
Transform (no hidden data)
Discrete Fourier Transform
The formulae for the DFT and its inverse are
$$F(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} a(x,y) \exp\left(-\frac{j 2\pi u x}{N}\right) \exp\left(-\frac{j 2\pi v y}{N}\right)$$
$$a(x,y) = \frac{1}{N^2} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} F(u,v) \exp\left(\frac{j 2\pi u x}{N}\right) \exp\left(\frac{j 2\pi v y}{N}\right)$$
Discrete Fourier Transform Steganography
• The DFT is successful when phase modulation is
used to hide data
• Phase components have less visual impact than
magnitude components
• Phase components are also more robust against
noise distortion
• A DFT coefficient is used if its energy is high
enough
Quantization Based Steganography
• The message is embedded through the choice of
quantizer.
• Consider a uniform quantizer of step size Δ: odd
reconstruction points represent message bit ‘1’ and
even reconstruction points represent message bit ‘0’.
• Example: if the value of the cover coefficient is 126,
Δ = 10 and the message bit is ‘1’,
then after embedding the message the
stego coefficient is 130 (13Δ, the nearest odd reconstruction point).
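The odd/even quantizer rule can be written out in a few lines. This is a minimal sketch assuming reconstruction points at integer multiples of Δ, with the parity of the multiple carrying the bit; `embed` and `extract` are hypothetical names.

```python
def embed(value, bit, delta):
    """Move value to the nearest multiple of delta whose index parity is bit."""
    k = round(value / delta)  # index of the nearest reconstruction point
    if k % 2 != bit:
        # Wrong parity: step to the neighboring index on the side of value.
        k += 1 if value / delta >= k else -1
    return k * delta

def extract(value, delta):
    """Recover the bit from the parity of the nearest reconstruction point."""
    return round(value / delta) % 2

print(embed(126, 1, 10))  # 130, as in the example above
print(embed(126, 0, 10))  # 120
```

Extraction still succeeds after a perturbation smaller than Δ/2, for example `extract(133, 10)` still yields 1.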
Selectively Embedding in Coefficients (SEC) Scheme
• The image is divided into 8 x 8 nonoverlapping blocks,
and an 8 x 8 discrete cosine transform (DCT) of the
blocks is taken. Let us denote the intensity values of
the 8 x 8 blocks by $a_{ij}$ and the corresponding DCT
coefficients by $c_{ij}$, where $i, j \in \{0,1,2,\ldots,7\}$. Thus,
$c = \mathrm{DCT2}(a)$
where DCT2 denotes a two-dimensional DCT.
• Let the quantization matrix entries for a particular QF
be $M^{QF}_{ij}$, where $i, j \in \{0,1,2,\ldots,7\}$. The coefficients
used for information embedding are computed as
$\tilde{c}_{ij} = c_{ij} / M^{QF}_{ij}, \quad i, j \in \{0,1,2,\ldots,7\}$
Ref: K. Solanki, N. Jacobsen, U. Madhow, B. S. Manjunath and S. Chandrasekaran, "Robust Image-Adaptive Data Hiding Based on Erasure and Error Correction," IEEE Transactions on Image Processing, vol. 13, no. 12, pp. 1627-1639, Dec. 2004.
Selectively Embedding in Coefficients (SEC) Scheme
• The coefficients are scanned in zig-zag fashion, as in
JPEG, to get a one-dimensional vector $\tilde{c}_k$, where 0 <= k <= 63,
and only a predefined low-frequency band, excluding the DC
coefficient (k = 0 term), is considered for hiding (i.e., 1 <= k <= n).
• Quantize these coefficient values $\tilde{c}_k$ to the nearest integers
and take their magnitude to get $r_k$.
• The coefficient after embedding is obtained by applying the quantizer
$Q_{b_l}$, where $b_l$ is the message bit and $Q_{b_l}$ is the quantizer Q0 or
Q1 depending upon the message bit.
Q0: Quantize to even number.
Q1: Quantize to odd number.
• If the magnitude after embedding equals the threshold t, then the same
message bit is embedded into the next qualified coefficient to keep
synchronization with the decoder.
Results : SEC Scheme
• The decoding is perfect for JPEG compression less severe than the QF used for embedding
[Figure: cover image ‘I13.jpg’; stego image with 10000 bits embedded]
[Plots: variation of capacity with QF; variation of capacity with threshold]
Steganalysis
Definition
Searching for the existence of hidden messages
or stego-content in a given medium.
• Stego-only: only the stego-medium is available for
analysis
• Known cover: both the original cover media and the stego-media are used
• Known message: the hidden message is revealed to
facilitate review of media in preparation for future
attacks
Goals
• Passive steganalysis
– Detect the presence or absence of a message
• Active steganalysis
– Estimate the message length and location
– Determine the algorithm/stego tool
– Estimate the secret key used in embedding
– Extract the message
Types of Steganalysis
– Embedding algorithm specific steganalysis
– Universal steganalysis
Universal Steganalysis Techniques
• Techniques which are independent of the embedding
technique
• One approach – identify certain image features that
reflect hidden message presence.
• Two steps
– Extract ‘good’ features
– Find strong classification algorithms
Steganalysis in Practice
• Techniques designed for a specific
steganography algorithm
– Good detection accuracy for the specific technique
• Universal steganalysis techniques
– Less accurate in detection
– Usable on new embedding techniques
Supervised learning based Steganalysis
• Supervised learning methods construct a classifier to
differentiate between stego and non-stego images using
training examples.
• Some features are first extracted and given as training inputs to
a learning machine. These examples include both stego as well
as non-stego examples.
• The learning classifier iteratively updates its classification rule
based on its prediction and the ground truth. Upon
convergence the final stego classifier is obtained.
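To make the training loop concrete, here is a toy example using a perceptron, chosen only because it is one of the simplest classifiers that updates its rule from prediction errors. The slides do not name a specific learner, and the two-dimensional feature vectors below are invented placeholders, not real image statistics.

```python
def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """+1 means 'stego', -1 means 'cover'."""
    return 1 if score(w, b, x) > 0 else -1

def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """Update the rule whenever the prediction disagrees with ground truth."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            if classify(w, b, x) != y:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:  # converged: every training example is correct
            break
    return w, b

# Hypothetical feature vectors: three cover images, then three stego images.
samples = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
           (0.8, 0.9), (0.9, 0.7), (0.75, 0.85)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_perceptron(samples, labels)
```

Real steganalyzers use far richer features and stronger classifiers, but the loop has the same shape: predict, compare with the label, update.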
Blind Identification based Steganalysis
This method can be understood from the following block diagram:
[Figure: block diagram]
Hence, by estimating the transformation A and its
inverse, the secret message can be obtained.
Statistical detection based Steganalysis
Here, 3 cases arise:
a) For the completely known statistics case, the parametric
models for the stego image and cover image are known.
b) For the partially known statistics case, the parametric
probability models are available, but not the exact
model parameters. These parameters are estimated.
c) For the completely unknown case, Bayesian prior models
are assumed and detectors are developed.
Universal Steganalysis Techniques
• Techniques which are independent of the
embedding technique
• Identify certain image features that reflect
hidden message presence.
• Two problems
– Calculate features which are sensitive to the
embedding process
– Finding strong classification algorithms which
are able to classify the images using the
calculated features
State-of-the-Art Steganalysis Techniques
• Wavelet based methods
- Farid and Lyu
- Deepak Hinge
• Using Markov Random Fields (Sullivan)
• Using Binary Similarity Measures
(N. Memon)
• Using Image Quality Metrics (Avcibas)
Fusing one or more of the above techniques to
improve the detection accuracy.
Wavelet-based Universal Steganalysis
• The wavelet transform is used to obtain the features.
• The mean, variance, skewness and kurtosis of the subband
coefficients at each orientation, scale and color channel
form the features,
i.e. 12(n-1) features per color (4 statistics x 3 orientations x (n-1) scales),
where n is the number of scales.
• Usually 4 scales are used,
therefore 36 features per color channel.
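The four statistics and the feature count can be sketched as follows. The wavelet decomposition itself is omitted; `moments` takes any list of subband coefficients. Using population (biased) estimators is an assumption, since the slides do not say which estimator is meant, and the function names are hypothetical.

```python
def moments(values):
    """Return (mean, variance, skewness, kurtosis) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        raise ValueError("constant input: skewness and kurtosis undefined")
    skew = sum((v - mean) ** 3 for v in values) / n / var ** 1.5
    kurt = sum((v - mean) ** 4 for v in values) / n / var ** 2
    return mean, var, skew, kurt

def feature_count(n_scales, n_orientations=3, n_statistics=4):
    """Features per color channel: statistics x orientations x (scales - 1)."""
    return n_statistics * n_orientations * (n_scales - 1)

print(feature_count(4))  # 36
```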
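Wavelet-based Universal Steganalysis
• In order to capture higher-order statistical correlations, a
second set of 36 features per color is found based on
the errors in a linear predictor of coefficient magnitude.
• For the green channel at scale i, the magnitude of each subband
coefficient is predicted as a weighted sum of the magnitudes of its
neighboring coefficients.
• This can be written in matrix form as
$$V = Q w$$
where V holds the coefficient magnitudes, the columns of Q hold the
neighboring magnitudes, and w holds the predictor weights.
• w is found by minimizing the quadratic error
$$E(w) = \| V - Q w \|^2$$
Wavelet-based Universal Steganalysis
• Therefore w is found by solving
$$\frac{dE}{dw} = -2 Q^T (V - Q w) = 0$$
which yields
$$w = (Q^T Q)^{-1} Q^T V$$
• The log error between the actual and predicted coefficients is
$$E = \log_2(V) - \log_2(|Q w|)$$
• Then the mean, variance, skewness
and kurtosis of this log error are used as another 36
features per color.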
Steganalysis: Binary Similarity Measures
Motivation:
• Embedding leaves statistical artifacts.
• Correlation between the low bit-planes for a cover image
differs from that of a stego image.
• A set of Binary Similarity Measures (BSMs) is used to detect the artifacts.
• A feature vector is generated using the BSMs.
Bit planes
Example 3 x 3 block of 8-bit pixels:
11010011 00011011 00011010
00000110 00011000 11010010
11011111 11010100 00011000
Each bit-plane is a binary image in itself.
For the first pixel:
Value:   1 1 0 1 0 0 1 1
Bit no.: 1 2 3 4 5 6 7 8
[Figure: Bitplane-1, Bitplane-2, . . . , Bitplane-7, Bitplane-8]
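Splitting an image into bit planes takes one shift and one mask per pixel. The sketch below uses the nine 8-bit values shown above, read as a 3 x 3 block (an assumption about the original layout), and follows the slide's numbering, where bit 1 is the most significant bit and bit 8 the least significant. `bit_plane` is a hypothetical name.

```python
def bit_plane(image, plane):
    """Return bit plane `plane` (1 = MSB ... 8 = LSB) as a binary image."""
    if not 1 <= plane <= 8:
        raise ValueError("plane must be between 1 and 8")
    shift = 8 - plane
    return [[(p >> shift) & 1 for p in row] for row in image]

image = [[0b11010011, 0b00011011, 0b00011010],
         [0b00000110, 0b00011000, 0b11010010],
         [0b11011111, 0b11010100, 0b00011000]]

# Bits of the first pixel, plane 1 through plane 8.
first_pixel_bits = [bit_plane(image, k)[0][0] for k in range(1, 9)]
print(first_pixel_bits)
print(bit_plane(image, 8))  # the LSB plane
```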
Binary texture Statistics
• Let $x_i = \{ x_{i,k},\ k = 1,2,\ldots,K \}$ be the sequence of bits
representing the K neighborhood pixels {N, E, W, S}
• i runs over all the image pixels
• M x N is the size of the image
Binary texture Statistics
[Figure: example bit pairs and their types: (1, 0) = 3, (1, 1) = 4]
We define an agreement variable for pixel $x_i$, with
j = 1,…,4, K = 4, i = 1,…,M x N, in terms of the Kronecker delta
function.
Now, we can calculate the one-step co-occurrence values:
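As an illustration of one-step co-occurrence counting on a single bit plane, the sketch below tallies, for every pixel and each of its N, E, W, S neighbors, which of four pair types occurs. The labeling (0,0) = 1, (0,1) = 2, (1,0) = 3, (1,1) = 4 is an assumption chosen to agree with the two examples on the slide; skipping neighbors outside the image and normalizing by the number of pairs are also my choices, and the exact formulas of the original method are not reproduced here.

```python
def cooccurrence(plane):
    """plane: list of rows of 0/1 values.
    Returns (a, b, c, d): fractions of pair types 1..4."""
    rows, cols = len(plane), len(plane[0])
    counts = [0, 0, 0, 0]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (0, 1), (0, -1), (1, 0)):  # N, E, W, S
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    # type index: 0 -> (0,0), 1 -> (0,1), 2 -> (1,0), 3 -> (1,1)
                    counts[2 * plane[r][c] + plane[rr][cc]] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("image too small to form neighbor pairs")
    return tuple(n / total for n in counts)
```

Comparing these four values between the 7th and 8th bit planes is the kind of input the first group of similarity measures is built from.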
Now, we define 3 types of binary similarity measures:
• The first group consists of the computed similarity differences
dm_i = m_i(7th) – m_i(8th), i = 1…10, across the 7th and 8th bit-planes.
These use { a, b, c, d }.
• The second group consists of histogram and entropic features.
We first normalize the histograms of the agreement scores for
the bit-planes (7th and 8th).
Then, based on these values, we define the
similarity measures.
• The third set of measures is somewhat different, as we use
a neighborhood-weighting mask.
For each binary image, we compute a 512-bin histogram based
on the weighted neighborhood, where the score is
given by weighting the eight directional neighbors with
a mask.
We get 18 such measures for grayscale images and 54 for
color images.
Commercial Watermarking / Steganography Tools
• Digimarc ImageBridge
– Inserts imperceptible digital watermarks into images
• Digimarc MarcSpider
– Tracks all images with Digimarc’s watermark on the
Internet
– Searches over 50 million images on the Internet
• Digimarc provides secure identification solutions to
over 200 government units in over 24 countries,
including the states of New Jersey, Vermont, and
Michigan
• Philips Digital Network WaterCast for videos
• Companies which mark their products:
Corbis, workbookstock.com, The British Library
(Digimarc); BBC, Reuters, Universal Studios
(Philips WaterCast)…
• Some success stories:
Corbis
– Identifies up to 50 cases of unauthorized commercial use of its
images per month
– Settled 28 cases in and out of court in 8 months
– Movie Market paid 1 million for the settlement
Playboy
– Webbworld paid $310,000 as well as reasonable attorney’s fees
for using 62 of Playboy’s images
Conclusion
• Steganography has its place in security. The field is very young. On
its own it does not offer much, but when used as an additional layer
alongside cryptography it leads to greater security.
• Far-reaching applications in privacy protection and intellectual
property rights protection.
• Research is going on in both directions
– One is how to incorporate hidden or visible copyright information
in various media that will be published.
– At the same time, in the opposite direction, researchers are working
on how to detect the trafficking of illicit material and covert
messages published by certain outlawed groups.
On-line Sources
• Stego-Tools:
<http://www.stegoarchive.com/>
Lots of freeware (and commercial) tools for hiding
information in text, audio, video, and image files
Famous stego tools for images:
Outguess+, F5+, S-Tools, etc.
• Helpful steganalysis programs
– WinHex (www.winhex.com)
– Hiderman
– Stegspy
– Etc.