Steganography, Watermarking, and Morphological Image Processing


§ Bit Planes

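․ The three primary colors RGB can be decomposed into an R plane, a G plane, and a B plane. A grey-level image with many levels can likewise be decomposed into eight bit planes. Suppose each of the 256 grey levels is written as $(g_8 g_7 g_6 g_5 g_4 g_3 g_2 g_1)_2$. Every pixel contributes its i-th bit $g_i$ to the i-th bit plane, which is therefore the i-th black-and-white image.

Ex4: Given the 4X4 sub-image below, compute the third bit plane.

\[
\begin{pmatrix} 8 & 7 & 6 & 5 \\ 32 & 31 & 30 & 29 \\ 10 & 11 & 12 & 13 \\ 0 & 1 & 2 & 3 \end{pmatrix}
=
\begin{pmatrix}
00001000 & 00000111 & 00000110 & 00000101 \\
00100000 & 00011111 & 00011110 & 00011101 \\
00001010 & 00001011 & 00001100 & 00001101 \\
00000000 & 00000001 & 00000010 & 00000011
\end{pmatrix}
\Rightarrow
\begin{pmatrix} 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]

․ Drawback of embedding an image through bit planes: the embedded image is easily damaged by compression, so the image obtained after decompression is often corrupted. Mathematically, the process behaves like a one-way function.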
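The bit-plane extraction in Ex4 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `bit_plane` is hypothetical, and planes are numbered from 1 (least significant bit $g_1$) to 8 as in the text above.

```python
def bit_plane(image, i):
    """Return the i-th bit plane of an 8-bit image given as a list of rows.

    i = 1 selects the least significant bit g1, i = 8 the most significant g8.
    """
    return [[(pixel >> (i - 1)) & 1 for pixel in row] for row in image]


# 4X4 sub-image from Ex4
sub_image = [
    [8, 7, 6, 5],
    [32, 31, 30, 29],
    [10, 11, 12, 13],
    [0, 1, 2, 3],
]

third_plane = bit_plane(sub_image, 3)
for row in third_plane:
    print(row)
```

Each printed row is one row of the third binary image.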

§ Steganography and Watermark

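․ Stacking only the four high-order bit planes (and discarding the four low-order ones) gives an image that the eye can hardly tell apart from the original, so discarding the four low-order bits changes the image features very little. The lower the bit, the lower its weight, and the smaller its effect on the image features. For example, suppose two pixels of an image have grey values $193_{10} = (11000001)_2$ and $192_{10} = (11000000)_2$. A third pixel with grey value $37_{10} = (00100101)_2$ can be hidden in them: its high nibble 0010 replaces the low four bits of the first pixel and its low nibble 0101 replaces the low four bits of the second, giving $194_{10} = (11000010)_2$ and $197_{10} = (11000101)_2$. The eye can hardly notice any change in the image features.

․ Suppose each byte can hide one bit, under the following steganography rules:
(1) If the bit read from the watermark is 0 and the last two bits of the corresponding byte of the original image are 01 or 10, change them to 00 or 11 respectively.
(2) If the bit read from the watermark is 1 and the last two bits of the corresponding byte of the original image are 00 or 11, change them to 01 or 10 respectively.
(3) In all other cases leave the byte unchanged.

For example, to hide the bit 1 in the byte 11000000, change it to 11000001; to hide the bit 0, the byte 11000000 stays unchanged.

Ex5: The original image has the pixel values 24 42 7 10 8 34 10 21 66 12, and the watermark to hide is 1 1 0 0 0 1 0 1 0. Find the watermarked image in decimal.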

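A: Convert the pixels to binary first, then apply the rules. For instance, $24 = (000110\,00)_2$ hiding 1 becomes $(000110\,01)_2 = 25$, $42 = (001010\,10)_2$ hiding 1 is unchanged, and $34 = (001000\,10)_2$ hiding 0 becomes $(001000\,11)_2 = 35$. The resulting values include 25 42 35 and 20 66 12.

Basic Principle

․ Let B' be the result of hiding A in B. PSNR is commonly used to evaluate how similar B' is to B:

\[
\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}, \qquad
\mathrm{MSE} = \frac{1}{N^2} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} \left[ B'(x,y) - B(x,y) \right]^2
\]

PSNR is a good measure of distortion, but it cannot fully reflect the distortion of texture. For a watermark, A can be regarded as a logo, and the logo usually also stands for a copyright; for example, NCKU stands for National Cheng Kung University.

Note: A must be smaller than B; if necessary, A can be compressed first.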
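The one-bit-per-byte rule and the PSNR measure can be tried out together. The sketch below is an illustration under stated assumptions: the helper names (`embed_bit`, `embed`, `extract_bit`, `psnr`) are hypothetical, pixels are given as a flat list, and the extraction rule (01/10 means 1, 00/11 means 0) is inferred from the embedding rules rather than stated in the slides.

```python
import math


def embed_bit(byte, bit):
    """Hide one watermark bit in a byte.

    After embedding, the last two bits are 00 or 11 for bit 0
    and 01 or 10 for bit 1; only the lowest bit is ever flipped.
    """
    encodes_one = (byte & 0b11) in (0b01, 0b10)
    if encodes_one != bool(bit):
        byte ^= 1  # 00<->01 and 10<->11
    return byte


def embed(pixels, bits):
    """Embed one watermark bit per pixel."""
    return [embed_bit(p, b) for p, b in zip(pixels, bits)]


def extract_bit(byte):
    """Read the hidden bit back (assumed inverse of the rule above)."""
    return 1 if (byte & 0b11) in (0b01, 0b10) else 0


def psnr(original, marked):
    """PSNR in dB between two equally sized lists of 8-bit pixel values."""
    mse = sum((m - o) ** 2 for o, m in zip(original, marked)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)


host = [24, 42, 34, 21, 66, 12]
watermark = [1, 1, 0, 0, 1, 0]
marked = embed(host, watermark)
print(marked, psnr(host, marked))
```

Because at most the lowest bit of each byte changes, the squared error per pixel is at most 1, which is why this kind of embedding keeps the PSNR high.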

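․ Let A be a grey-level image that can be compressed, stored as a rectangular matrix with Rank(A) = m. The Singular Value Decomposition of A can be written as

\[
A = U \Sigma V^t, \qquad \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n),
\]

where U and V are orthogonal and $\sigma_1, \sigma_2, \ldots, \sigma_n$ are the singular values, which satisfy

\[
\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_m > 0 \quad \text{and} \quad \sigma_{m+1} = \sigma_{m+2} = \cdots = \sigma_n = 0,
\]

where $\sigma_i = \sqrt{\lambda_i}$ and $\lambda_i$ is the i-th eigenvalue of $A^t A$.

Ex6: Prove $\lambda_i \ge 0$. Let X be an eigenvector of $A^t A$ with eigenvalue $\lambda$. Then

\[
\|AX\|^2 = (AX)^t AX = X^t A^t A X = X^t (\lambda X) = \lambda X^t X = \lambda \|X\|^2
\quad \Rightarrow \quad
\lambda = \frac{\|AX\|^2}{\|X\|^2} \ge 0 .
\]

Ex7: Prove

\[
A = U \Sigma V^t = (U_1 \; U_2) \begin{pmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} V_1^t \\ V_2^t \end{pmatrix} = U_1 \Sigma_1 V_1^t .
\]

First find the orthogonal matrix $V = (V_1, V_2)$. $V_1 = (v_1, v_2, \ldots, v_m)$ consists of the eigenvectors computed from $\lambda_1, \lambda_2, \ldots, \lambda_m$, and $V_2 = (v_{m+1}, v_{m+2}, \ldots, v_n)$ consists of the eigenvectors obtained for $\lambda_{m+1} = \lambda_{m+2} = \cdots = \lambda_n = 0$. Here $\Sigma_1 = \mathrm{diag}(\sigma_1, \ldots, \sigma_m)$.

For example, let

\[
A = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}, \qquad
A^t A = \begin{pmatrix} 8 & 8 \\ 8 & 8 \end{pmatrix},
\]

with eigenvalues $\lambda_1 = 16$, $\lambda_2 = 0$, so $\sigma_1 = 4$, $\sigma_2 = 0$. The eigenvectors are

\[
(v_1, v_2) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
v_1 = \frac{1}{\sqrt{2}} (1, 1)^t, \qquad
v_2 = \frac{1}{\sqrt{2}} (1, -1)^t .
\]

Using $AV = U\Sigma$, meaning $A v_j = \sigma_j u_j$, we get

\[
u_1 = \frac{1}{\sigma_1} A v_1 = \frac{1}{4} \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} .
\]

Since $A^t U = V \Sigma^t$, we have $A^t u_j = 0$ for $j > m$; hence $u_2 = (1/\sqrt{2}, -1/\sqrt{2})^t$.

Note: To obtain the SVD $A = U \Sigma V^t$, (1) first solve for $\Sigma$, (2) then solve for V, (3) finally solve for U:

\[
A = U \Sigma V^t =
\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}
\begin{pmatrix} 4 & 0 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}^t .
\]

․ Why SVD is used in steganography: the singular values of the image A to be embedded can be made very small. When the transformed image A' is then embedded into B, the singular values of the composite image B' are still dominated by those of B.

Note: The foreground takes the larger singular values; that is, the singular values of A' are placed after those of B, so A' is not easily noticed.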
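The worked 2X2 example can be checked numerically. The sketch below uses only the Python standard library, so the small `matmul` and `transpose` helpers are hypothetical stand-ins for what a linear-algebra package would provide; it verifies the factors found by hand rather than computing an SVD from scratch.

```python
import math


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(col) for col in zip(*P)]


s = 1 / math.sqrt(2)
A = [[2, 2], [2, 2]]
U = [[s, s], [s, -s]]        # columns u1, u2
Sigma = [[4, 0], [0, 0]]     # sigma1 = 4, sigma2 = 0
V = [[s, s], [s, -s]]        # columns v1, v2

# U * Sigma * V^t should reproduce A (up to rounding).
reconstructed = matmul(matmul(U, Sigma), transpose(V))

# v1 should be an eigenvector of A^t A with eigenvalue 16.
AtA = matmul(transpose(A), A)
AtA_v1 = matmul(AtA, [[s], [s]])

print(reconstructed)
print(AtA_v1, 16 * s)
```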

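Morphology

․ Suppose the hue H is used as the feature for detecting faces. The hue range of skin color, measured from a set of training images, may produce fragmented regions. The morphological opening and closing operators can delete noise that is small and isolated while connecting blocks that lie very close together. Taking hair into account then helps decide whether a region is a face.

․ The closing operator performs a dilation first and then an erosion. After the dilation, small regions next to a region are merged into it, while small noise far from the region remains isolated. After the subsequent erosion, the nearby noise stays inside the new region, whereas the distant noise is eroded back to its original, isolated state.

․ The opening operator applies the two steps in the opposite order. It removes small specks of noise and can break apart two nearby blocks joined by a thin line: once the thin connection has been eroded away, dilating the two regions no longer merges them.

․ Basic image-processing topics such as the DCT, the sampling theorem, and aliasing are outside the scope of this discussion.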

Digital Watermarking

․ A digital watermark is a signal permanently embedded into digital data (audio, images, and text) that can be detected or extracted later by means of computing operations in order to make assertions about the data. It has been developed to protect the copyright of media signals .

․ It is hidden in the host data in such a way that it is inseparable from the data and resistant to many operations that do not degrade the host document. Thus by means of watermarking, the work is still accessible but permanently marked.

․ It is derived from steganography, which means covered writing. Steganography is the science of communicating information while hiding the existence of the communication.

․ The goal of steganography is to hide an information message inside harmless messages in such a way that it is not possible even to detect that there is a secret message present. Watermarking is not like encryption in that the latter has the aim of making messages unintelligible to any unauthorized persons who might intercept them. Once encrypted data is decrypted, the media is no longer protected.

Morphology

․ Morphology means the form and structure of an object, or the arrangements and interrelationships between the parts of an object. Digital morphology is a way to describe or analyze the shape of a digital (most often raster) object. The math behind it is simply set theory.

․ We can assume the existence of three color components (red, green and blue) is an extension of a grey level, or each color can be thought of as a separate domain containing new information.

․ Closing the red and blue images should brighten the green images, and opening the green images should suppress the green ones.

․ Images consist of a set of picture elements ( pixels ) that collect into groups having two-dimensional structure (shape). Mathematical operations on the set of pixels can be used to enhance specific aspects of the shapes so that they might be (for example) counted or recognized .

․ Erosion : Pixels matching a given pattern are deleted from the image.

․ Dilation : A small area about a pixel is set to a given pattern .

Binary Dilation (simple): First mark all white pixels having at least one black neighbor, and then set all of the marked pixels to black. [Figure: dilation of the original by 1 pixel.]

․ In general the object is considered to be a mathematical set of black pixels, written as A={(3,3),(3,4),(4,3),(4,4)} if the upper left pixel has the index (0,0).

․ Translation: $A_x = \{ c \mid c = a + x,\ a \in A \}$. For example, if x were at (1,2) then the first (upper left) pixel in $A_x$ would be (3,3)+(1,2)=(4,5); all of the pixels in A shift down by one row and right by two columns.

․ Reflection: $\hat{A} = \{ c \mid c = -a,\ a \in A \}$. This is really a rotation of the object A by 180 degrees about the origin.

․ Complement: $A^c = \{ c \mid c \notin A \}$, namely the complement of the set A.

․ Intersection, union and difference (i.e. $A \cap B^c$) correspond to the language of set theory.

․ Dilation: $A \oplus B = \{ c \mid c = a + b,\ a \in A,\ b \in B \}$. The set B is called a structuring element, and its composition defines the nature of the specific dilation. For the set A above and B = {(0,0),(0,1)}:

\[
C = A \oplus B = (A + \{(0,0)\}) \cup (A + \{(0,1)\})
\]

(3,3)+(0,0)=(3,3), (3,3)+(0,1)=(3,4), … Some are duplicates.

[Figure: B; A after adding (0,0); A after adding (0,1); the result after the union.]

Note: If the set B has a set pixel to the right of the origin, then a dilation grows a layer of pixels on the right of the object.

To grow in all directions , we can use B having one pixel on every side of the origin; i.e. a 3X3 square with the origin at the center.

Ex2: Suppose $A_1$={(1,1),(1,2),(2,2),(3,2),(3,3),(4,4)} and $B_1$={(0,-1),(0,1)}. The translation of $A_1$ by (0,-1) yields $(A_1)_{(0,-1)}$={(1,0),(1,1),(2,1),(3,1),(3,2),(4,3)}, and the translation by (0,1) yields $(A_1)_{(0,1)}$={(1,2),(1,3),(2,3),(3,3),(3,4),(4,5)}; their union is the dilation. [Figure: $B_1$ (not including the origin); $A_1$ before and after the dilation.]

Note: (1) The original object pixels belonging to $A_1$ are not necessarily set in the result, (4,4) for example, due to the effect of the origin not being a part of $B_1$.

(2) $A \oplus B = \bigcup_{b \in B} (A)_b = \bigcup_{a \in A} (B)_a$, that is, dilation is commutative. This gives a clue concerning a possible implementation for the dilation operator. When the origin of B aligns with a black pixel in the image, all of the image pixels that correspond to black pixels in B are marked, and will later be changed to black. After the entire image has been swept by B, the dilation is complete. Normally the dilation is not computed in place, that is, the result is not copied over the original image. A third image, initially all white, is used to store the dilation while it is being computed.

[Figure: dilation and erosion computed step by step: 1st translation, 2nd, 3rd, and the final result.]
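The set definitions of translation and dilation map directly onto Python sets of (row, column) pairs. The following is a minimal sketch with hypothetical function names; it builds the result as a new set, in the spirit of the "third image" mentioned above, and is checked against the sets from the two examples.

```python
def translate(A, x):
    """Translation A_x: shift every pixel of the set A by the vector x."""
    return {(r + x[0], c + x[1]) for (r, c) in A}


def dilate(A, B):
    """Binary dilation: the union of A translated by each element of B."""
    result = set()
    for b in B:
        result |= translate(A, b)
    return result


A = {(3, 3), (3, 4), (4, 3), (4, 4)}
B = {(0, 0), (0, 1)}
print(sorted(dilate(A, B)))       # grows one column to the right

A1 = {(1, 1), (1, 2), (2, 2), (3, 2), (3, 3), (4, 4)}
B1 = {(0, -1), (0, 1)}            # origin not included
print(sorted(dilate(A1, B1)))     # (4, 4) is absent from the result
```

Representing the image as a set of black-pixel coordinates avoids any image-border handling; an array-based version would need to decide what happens at the edges.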

Binary Erosion

• If dilation can be said to add pixels to an object, or to make it bigger, then erosion will make an image smaller. Erosion can be implemented by marking all black pixels having at least one white neighbor, and then setting to white all of the marked pixels. It is defined as

\[
A \ominus B = \{ c \mid (B)_c \subseteq A \}
\]

When B contains the origin, only those translations that place the origin of B at one of the members of A need to be considered.

Ex3: B={(0,0),(1,0)}, A={(3,3),(3,4),(4,3),(4,4)}. Four such translations:
$B_{(3,3)}$={(3,3),(4,3)}
$B_{(3,4)}$={(3,4),(4,4)}
$B_{(4,3)}$={(4,3),(5,3)}
$B_{(4,4)}$={(4,4),(5,4)}
Only the first two are subsets of A, so $A \ominus B$={(3,3),(3,4)}.

If the origin is not part of the structuring element, for example $B_2$={(1,0)} with $(0,0) \notin B_2$, the translations contained in A are:
$B_{2,(2,3)}$={(3,3)}
$B_{2,(2,4)}$={(3,4)}
$B_{2,(3,3)}$={(4,3)}
$B_{2,(3,4)}$={(4,4)}
Note: The result {(2,3),(2,4),(3,3),(3,4)} is not a subset of A, meaning the eroded image is not always a subset of the original.
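The subset definition of erosion can be sketched as follows. The function name `erode` and the candidate-search strategy are illustrative assumptions: since any valid position c must satisfy c + b ∈ A for every b in B, it is enough to test the positions obtained by subtracting one fixed element of B from each pixel of A. B is assumed to be non-empty.

```python
def erode(A, B):
    """Binary erosion: all positions c such that B translated by c lies inside A."""
    B = list(B)
    b0 = B[0]
    # Any valid c satisfies c + b0 in A, so these are the only candidates.
    candidates = {(r - b0[0], c - b0[1]) for (r, c) in A}
    return {(r, c) for (r, c) in candidates
            if all((r + br, c + bc) in A for (br, bc) in B)}


A = {(3, 3), (3, 4), (4, 3), (4, 4)}
print(sorted(erode(A, {(0, 0), (1, 0)})))   # origin in B: result is a subset of A
print(sorted(erode(A, {(1, 0)})))           # origin not in B: result leaves A
```

This search also works when the origin is not in B, which is exactly the case where the result is no longer a subset of A.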

․ Erosion and dilation are not inverse operations. Yet, erosion and dilation are duals in the following sense:

\[
(A \ominus B)^c = A^c \oplus \hat{B}
\]

where $\hat{B}$ is the reflection of B.

․ Erosion raises the issue of a "don't care" state in B, which was not a concern for dilation. When using a strictly binary structuring element to perform an erosion, the member black pixels must correspond to black pixels in the image in order to set the pixel in the result, but the same is not true for a white pixel in B. We don't care what the corresponding pixel in the image might be when the structuring element pixel is white.

Opening and Closing

․ The application of an erosion immediately followed by a dilation using the same B is referred to as an opening operation, a name describing how the operation tends to "open" small gaps or spaces between touching objects in an image. After an opening using the simple structuring element, the objects are better isolated, and might now be counted or classified.

․ Another use of opening: the removal of noise. Thresholding a noisy grey-level image results in isolated pixels in random locations. The erosion step in an opening will remove isolated pixels as well as boundaries of objects, and the dilation step will restore most of the boundary pixels without restoring the noise.

This process seems to be successful at removing spurious black pixels, but does not remove the white ones.

․ A closing is similar to an opening except that the dilation is performed first, followed by an erosion using the same B, and will fill the gaps or "close" them. It can remove much of the white pixel noise, giving a fairly clean image. (A more complete method for fixing the gaps may use 4 or 5 structuring elements, and 2 or 3 other techniques outside of morphology.)

․ Closing can also be used for smoothing the outline of objects in an image, i.e. to fill the jagged appearances due to digitization in order to determine how rough the outline is. However, more than one B may be needed, since the simple structuring element is only useful for removing or smoothing single-pixel irregularities. N dilation/erosion applications (named depth N) should result in the smoothing of irregularities of N pixels in size.
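Opening and closing are just the two orderings of erosion and dilation, which a short sketch makes concrete. The names below are hypothetical, images are sets of black-pixel coordinates, and `SIMPLE` is assumed to be the 3X3 square with the origin at its center (the structuring element the text calls "simple").

```python
SIMPLE = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}  # 3X3 square


def dilate(A, B):
    return {(r + br, c + bc) for (r, c) in A for (br, bc) in B}


def erode(A, B):
    B = list(B)
    b0 = B[0]
    candidates = {(r - b0[0], c - b0[1]) for (r, c) in A}
    return {(r, c) for (r, c) in candidates
            if all((r + br, c + bc) in A for (br, bc) in B)}


def opening(A, B):
    """Erosion followed by dilation with the same B."""
    return dilate(erode(A, B), B)


def closing(A, B):
    """Dilation followed by erosion with the same B."""
    return erode(dilate(A, B), B)


block = {(r, c) for r in range(2, 6) for c in range(2, 6)}   # 4X4 object
noisy = block | {(10, 10)}                                   # isolated black pixel
holed = block - {(3, 3)}                                     # isolated white pixel

print(opening(noisy, SIMPLE) == block)   # black speck removed
print(closing(holed, SIMPLE) == block)   # white hole filled
```

The example mirrors the text: opening removes the spurious black pixel but would not fill the hole, while closing fills the hole but would leave the black speck in place.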

․ A fast erosion method is based on the distance map of each object, where the numerical value of each pixel is replaced by a new value representing the distance of that pixel from the nearest background pixel. Pixels on a boundary would have a value of 1, being one pixel width from a background pixel; a value of 2 means two widths from the background, and so on. The result has the appearance of a contour map, where the contours represent the distance from the boundary.

․ The distance map contains enough information to perform an erosion by any number of pixels in just one pass through the image, and a simple thresholding operation will give any desired erosion.
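A distance map and the thresholding step can be sketched as below. This is one possible construction, not the slides' own: it assumes the 8-neighbor (chessboard) distance, which corresponds to repeated erosion by a 3X3 square, computes it with a breadth-first sweep starting from all background pixels, and assumes the image contains at least one background pixel. Function names are hypothetical.

```python
from collections import deque


def distance_map(image):
    """8-neighbor distance of each object pixel (1) to the nearest background pixel (0)."""
    rows, cols = len(image), len(image[0])
    dist = [[0 if image[r][c] == 0 else None for c in range(cols)]
            for r in range(rows)]
    queue = deque((r, c) for r in range(rows) for c in range(cols)
                  if image[r][c] == 0)
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                    dist[nr][nc] = dist[r][c] + 1
                    queue.append((nr, nc))
    return dist


def erode_by(dist, n):
    """Erosion by n pixels: keep object pixels farther than n from the background."""
    return [[1 if d is not None and d > n else 0 for d in row] for row in dist]


# 5X5 object inside a 7X7 image
image = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)]
         for r in range(7)]
dist = distance_map(image)
for row in dist:
    print(row)
print(erode_by(dist, 2))
```

Once `dist` is available, every erosion depth is a single comparison per pixel, which is the one-pass property described above.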

․ There is another way to encode all possible openings as one grey-level image, and all possible closings can be computed at the same time. First, all pixels in the distance map that do NOT have at least one neighbor nearer to the background and one neighbor more distant are located and marked as nodal pixels.

If the distance map is thought of as a three-dimensional surface where the distance from the background is represented as height, then every pixel can be thought of as being peak of a pyramid having a standardized slope. Those peaks that are not included in any other pyramid are the nodal pixels.

․ One way to locate nodal pixels is to scan the distance map, looking at all object pixels; find the MIN and MAX value of all neighbors of the target pixel, and compute MAX-MIN. If the value is less than the MAX possible, then the pixel is nodal.

The “Hit and Miss” Transform

․ It is a morphological operator designed to locate simple shapes within an image. Though the erosion of A by S also includes places where the background pixels in that region do not match those of S, these locations would not normally be thought of as a match.

․ Matching the foreground pixels in S against those in A is a "hit," and is accomplished with an erosion $A \ominus S$. The background pixels in A are those found in $A^c$, and while we could use $S^c$ as the background for S, a more flexible approach is to specify the background pixels explicitly in a new structuring element T. A "hit" in the background is called a "miss," and is found by $A^c \ominus T$.

․ What we need is an operation that matches both the foreground and the background pixels of S in A, which are the pixels:

\[
A \otimes (S, T) = (A \ominus S) \cap (A^c \ominus T)
\]

Ex5: To detect upper right corners. Figure (a) shows an image interpreted as being two overlapping squares.

[Figure: (a) the image; (b) foreground structuring element; (c) erosion of (a) by (b), the 'hit'; (d) complement of (a); (e) background structuring element, showing 3 pixels of the corner; (f) erosion of (d) by (e), the 'miss'; (g) intersection of (c) and (f), the result.]
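The hit-and-miss formula can be sketched directly from the two erosions. The figures are not available, so the structuring elements below are assumptions chosen to describe an upper right corner: S requires the pixel itself, its left neighbor and the pixel below it to be foreground, and T requires the pixels above, to the right, and diagonally above-right to be background. The complement is taken inside a finite `domain`, another assumption needed to make $A^c$ a finite set.

```python
def erode(A, B):
    B = list(B)
    b0 = B[0]
    candidates = {(r - b0[0], c - b0[1]) for (r, c) in A}
    return {(r, c) for (r, c) in candidates
            if all((r + br, c + bc) in A for (br, bc) in B)}


def hit_and_miss(A, S, T, domain):
    """(A eroded by S) intersected with (complement of A eroded by T)."""
    complement = domain - A
    return erode(A, S) & erode(complement, T)


domain = {(r, c) for r in range(8) for c in range(8)}      # 8X8 image
A = {(r, c) for r in range(2, 5) for c in range(2, 5)}     # 3X3 square

S = {(0, 0), (0, -1), (1, 0)}      # foreground: pixel, left neighbor, pixel below
T = {(-1, 0), (0, 1), (-1, 1)}     # background: above, right, above-right

print(hit_and_miss(A, S, T, domain))   # only the upper right corner remains
```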

Identifying Region Boundaries

․ The pixels on the boundary of an object are those that have at least one neighbor that belongs to the background. It can't be known in advance which neighbor to look for, so a single structuring element can't be constructed to detect the boundary. This is in spite of the fact that an erosion removes exactly these pixels.

․ The boundary can be stripped away using an erosion and the eroded image can then be subtracted from the original, written $A - (A \ominus B)$ for a structuring element B.

Ex6: Figure (h) results from the previous figure (a) after an erosion, and (i) shows (a)-(h): the boundary. [Figures: (a) of Ex5, (h), (i).]

Conditional Dilation

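․ There are occasions when it is desirable to dilate an object in such a way that certain pixels remain immune. The forbidden area of the image is specified as a second image in which the forbidden pixels are black. The conditional dilation is written $A \oplus (S, A')$ and is computed in an iterative fashion, starting from $A_0 = A$ and repeating until $A_i$ no longer changes:

\[
A_i = (A_{i-1} \oplus S) \cap A'
\]

where A' is the second image that encodes the forbidden pixels and $A_i$ is the desired dilation. As the intersection shows, a pixel can be set only where A' allows it; in the two applications that follow, the second argument ($I_{low}$ and $A^c$) is exactly this permitted region.

․ One place where this is useful is in segmenting an image.

$I_{high}$: the result of applying a very high threshold to the image; a great many object pixels will be missed.

$I_{low}$: the result of applying a very low threshold to the original image; some background will be marked.

R: a segmented version of the original, in some cases a better result than any single threshold gives, obtained by:

\[
R = I_{high} \oplus (\mathrm{simple}, I_{low})
\]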

․ Another application of conditional dilation is that of filling a region with pixels, which is the inverse operation of boundary extraction. It is to dilate until the inside region is all black, and then combine with the boundary image to form the final result.

\[
\mathrm{Fill} = P \oplus (S_{cross}, A^c)
\]

where P is an image containing only the seed pixel, known to be inside the region to be filled, A is the boundary image and $S_{cross}$ is the cross-shaped structuring element, (j) for example.

Ex7: (i) boundary, (j) structuring element, (k) seed pixel, iteration 0 of the process, (l) iteration 1, (m) iteration 2, (n) iteration 3, (o) iteration 4, (p) iteration 5 and completed, (q) union of (i) with (p), the result. [Figures (i) to (q).]
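Region filling by conditional dilation can be sketched as an iteration that stops when nothing changes. The names below are hypothetical, images are coordinate sets inside a finite 7X7 `domain` (an assumption needed to form $A^c$), and the structuring element is assumed to contain the origin so that each step can only grow the set.

```python
CROSS = {(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)}   # cross-shaped element


def dilate(A, B):
    return {(r + br, c + bc) for (r, c) in A for (br, bc) in B}


def conditional_dilate(seed, S, allowed):
    """Repeat A_i = (A_{i-1} dilated by S) intersected with `allowed` until stable."""
    current = set(seed) & allowed
    while True:
        grown = dilate(current, S) & allowed
        if grown == current:
            return current
        current = grown


domain = {(r, c) for r in range(7) for c in range(7)}
square = {(r, c) for r in range(1, 6) for c in range(1, 6)}
interior = {(r, c) for r in range(2, 5) for c in range(2, 5)}
boundary = square - interior            # A: closed one-pixel-wide outline
seed = {(3, 3)}                         # P: a pixel known to be inside

filled = conditional_dilate(seed, CROSS, domain - boundary)
result = filled | boundary              # union with the boundary image
print(sorted(filled))
```

The same `conditional_dilate` helper would serve for the segmentation case by passing the high-threshold image as the seed and the low-threshold image as the permitted region.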

Counting Regions

․ It is possible to count the number of regions in a binary image using morphological operators, first discussed by Levialdi, using 6 different structuring elements: 4 for erosion, named $L_1$~$L_4$, and 2 for counting isolated "1" pixels (the # operator). The initial count of regions is the number of isolated pixels in the input image, and the image of iteration 0 is A:

\[
\mathrm{count}_0 = \#A, \qquad A_0 = A, \qquad \mathrm{count}_n = \#A_n
\]

The image of the next iteration is the union of the four erosions of the current image:

\[
A_{n+1} = (A_n \ominus L_1) \cup (A_n \ominus L_2) \cup (A_n \ominus L_3) \cup (A_n \ominus L_4)
\]

The iteration stops when $A_n$ becomes empty (all 0 pixels), and the overall number of regions is the sum of all of the values $\mathrm{count}_i$.

Ex8: Counting 8-connected regions. [Figure: (a)~(d) the structuring elements $L_1$~$L_4$; (e).]

Grey-Level Morphology

․ A pixel can now have any integer value, so the nice picture of an image being a set disappears! The figure shows how the grey-level line (a) might appear after a dilation (b), which was computed as follows, A being the grey-level image to be dilated:

\[
(A \oplus S)[i,j] = \max \{ A[i-r, j-c] + S[r,c] \mid [i-r, j-c] \in A,\ [r,c] \in S \}
\]

[Figure: (a) background is 0, and line pixels have the value 20; (b) grey line after a dilation.]

․ Process of the above computation:
(1) Position the origin of the structuring element over the first pixel of the image being dilated.
(2) Compute the sum of each corresponding pair of pixel values in the structuring element and the image.
(3) Find the maximum value of all of these sums, and set the corresponding pixel in the output image to this value.
(4) Repeat this process for each pixel in the image being dilated.
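The four steps translate into a nested loop. In the sketch below the function name, the explicit `origin` argument, and the all-ones 3X3 structuring element are assumptions made for illustration; the offsets [r, c] are measured from the origin of S, sums that would fall outside the image are skipped, and the origin is assumed to lie inside S.

```python
def grey_dilate(A, S, origin):
    """Grey-level dilation: out[i][j] = max over [r,c] in S of A[i-r][j-c] + S[r][c]."""
    rows, cols = len(A), len(A[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            best = None
            for sr, s_row in enumerate(S):
                for sc, s_val in enumerate(s_row):
                    r, c = sr - origin[0], sc - origin[1]   # offset from the origin
                    ai, aj = i - r, j - c
                    if 0 <= ai < rows and 0 <= aj < cols:
                        value = A[ai][aj] + s_val
                        if best is None or value > best:
                            best = value
            out[i][j] = best
    return out


# Background 0 with a horizontal line of value 20 in the middle row
A = [[20 if r == 2 else 0 for c in range(5)] for r in range(5)]
S = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

for row in grey_dilate(A, S, origin=(1, 1)):
    print(row)
```

With this S the line thickens by one pixel on each side and every value rises by 1, because the structuring element's own grey values are added before the maximum is taken.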

Chromaticity Diagram

Chromaticity is an objective specification of the quality of a color regardless of its luminance, that is, as determined by its colorfulness (or saturation, chroma, intensity, or excitation purity) and hue.

• In color science, the white point of an illuminant or of a display is a neutral reference characterized by a chromaticity; for example, the white point of an sRGB display is an x,y chromaticity of [0.3127,0.3290]. All other chromaticities may be defined in relation to this reference using polar coordinates. The hue is the angular component, and the purity is the radial component, normalized by the maximum radius for that hue.

• Purity is roughly equivalent to the term " saturation " in the HSV color model . The property " hue " is as used in general color theory and in specific color models such as HSV or HSL , though it is more perceptually uniform in color models such as Munsell , CIELAB or CIECAM02 .

․ Some color spaces separate the three dimensions of color into one luminance dimension and a pair of chromaticity dimensions. For example, the chromaticity coordinates are a and b in Lab color space, u and v in Luv color space, x and y in xyY space, etc. These pairs define chromaticity vectors in a rectangular 2-space, unlike the polar coordinates of hue angle and saturation that are used in HSV color space.

․ On the other hand, some color spaces such as RGB and XYZ do not separate out chromaticity; chromaticity coordinates such as r and g or x and y can be calculated by an operation that normalizes out intensity.

․ The xyY space is a cross between the CIE XYZ color space and its normalized chromaticity coordinates xyz, such that the luminance Y is preserved and augmented with just the required two chromaticity dimensions.

․ The CIE (1931) diagram is a projection of a 3D color space, called XYZ color space, to 2D. The light emitted by a device, or light reflected from a surface, consists of photons with different wavelengths. The amount of photons with a certain wavelength λ in a given light composition is represented by the function C(λ). The CIE diagram comprises three functions $\mu_x(\lambda)$, $\mu_y(\lambda)$, $\mu_z(\lambda)$, and is used for comparing colors produced by color-producing devices, e.g. PC monitors, printers, and cameras. The science of quantifying color is called colorimetry. The X, Y, Z coordinates are found as follows:

\[
X = \int C(\lambda)\, \mu_x(\lambda)\, d\lambda, \qquad
Y = \int C(\lambda)\, \mu_y(\lambda)\, d\lambda, \qquad
Z = \int C(\lambda)\, \mu_z(\lambda)\, d\lambda
\]

․ The projection to the CIE diagram is obtained via

\[
x = \frac{X}{X+Y+Z}, \qquad
y = \frac{Y}{X+Y+Z}, \qquad
z = \frac{Z}{X+Y+Z}
\]

where x+y+z=1, so that only two of x, y, z are independent, making the projection a planar surface.
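The projection from XYZ to the chromaticity plane is a one-line normalization. In the sketch below the function name is hypothetical, and the XYZ triple used in the example (0.95047, 1.0, 1.08883) is an assumed value for the D65 white point that is not given in the slides; it is chosen because it should land near the sRGB white point chromaticity quoted earlier.

```python
def xyz_to_xy(X, Y, Z):
    """Project XYZ tristimulus values onto the (x, y) chromaticity plane."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("X + Y + Z must be non-zero")
    return X / total, Y / total


# Assumed D65 white point in XYZ (Y normalized to 1)
x, y = xyz_to_xy(0.95047, 1.0, 1.08883)
z = 1 - x - y            # the third coordinate carries no extra information
print(round(x, 4), round(y, 4), round(z, 4))
```

Scaling X, Y and Z by the same factor leaves (x, y) unchanged, which is the sense in which the normalization removes intensity.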

Texture

․ A major characteristic is the repetition of a pattern or patterns over a region . The pattern may be repeated exactly, or as a set of small variations on the theme, possibly a function of position.

The goal of studying texture is to segment regions rather than characterizing textures, determining which regions have texture A and which have texture B . The result could be an image in which texture has been replaced by a unique grey level or color.

․ texton : the size, shape, color, and orientation of the elements of the pattern. Sometimes the difference between two textures is contained in the degree of variation alone, or in the statistical distribution found relating the textons. A region cannot display texture if it is small compared with the size of a texton .

․ The same texture at two different scales will be perceived as two different textures , provided that the scales are different enough. This leaves us with a problem of scale. As the scales become closer together the textures are harder to distinguish , and at some point they become the same.

Ex1: Regions characterized by their textures; both are artificial. [Figures (a) and (b).]