Image Processing on the GPU
Edge Detection (with implementation on a GPU) and Text Recognition (if time permits)
Jared Barnes, Chris Jackson
Edge Detection
Wikipedia: Identifying points in a digital image at which the image has discontinuities.
http://4.bp.blogspot.com/-p_9w91wC_Rc/TbBgF7dQYhI/AAAAAAAAACM/DQTrM_a7Apg/s1600/edge-example-c.png
John Canny “A Computational Approach to Edge Detection” 1986 http://ieeexplore.ieee.org.libproxy.mst.edu/stamp/stamp.jsp?tp=&arnumber=4767851
1. Noise Removal
2. Image Gradient Computation
3. Non-Maximum Suppression
4. Hysteresis Thresholding
Gaussian Smoothing or Blurring
A pixel is changed based on a weighted average of itself and its neighbors
The number of neighbors (3x3, 5x5) and the relative weights can vary
Figures: 3D Gaussian Distribution; Normalized 2D Gaussian Approximation
http://www.librow.com/content/common/images/articles/article-9/2d_distribution.gif
http://homepage.cs.uiowa.edu/~cwyman/classes/spring08-22C251/homework/canny.pdf
Figure labels: Too much, Spotty, About right, Smooth
http://media.tumblr.com/ccd6945141b46e5e2f5c36168f6a8037/tumblr_inline_mhcv1l0EZB1qz4rgp.png
http://www.eversparkinteractive.com/wp-content/uploads/2013/03/gaussian-blur-thumbnail.jpg
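The weighted average described above can be sketched roughly as follows. This is a minimal illustration, assuming the weights are sampled from a 2D Gaussian and normalized to sum to 1; the names `gaussian_kernel` and `blur_pixel`, the default `sigma`, and the clamped border handling are assumptions, not taken from the slides.

```python
import math

def gaussian_kernel(size=3, sigma=1.0):
    """Return a size x size grid of Gaussian weights that sum to 1."""
    half = size // 2
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in range(-half, half + 1)]
           for dy in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[w / total for w in row] for row in raw]

def blur_pixel(image, x, y, kernel):
    """Weighted average of pixel (x, y) and its neighbors.

    Neighbors that fall outside the image are clamped to the nearest
    border pixel (an assumed border policy).
    """
    half = len(kernel) // 2
    h, w = len(image), len(image[0])
    acc = 0.0
    for ky, row in enumerate(kernel):
        for kx, weight in enumerate(row):
            yy = min(max(y + ky - half, 0), h - 1)
            xx = min(max(x + kx - half, 0), w - 1)
            acc += weight * image[yy][xx]
    return acc
```

A larger `size` or `sigma` spreads weight onto more distant neighbors, which corresponds to a heavier blur.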
Calculus!
First derivative in the X and Y directions (separately)
Sobel Operator (2 kernels): G_x (Vertical Edges) and G_y (Horizontal Edges)
Magnitude = $\sqrt{D_x^2(x, y) + D_y^2(x, y)}$
Angle = $\arctan\left(\frac{D_y(x, y)}{D_x(x, y)}\right)$
Then round to: 0° = ←→, 90° = ↑↓, 45° = ↗↙, 135° = ↘↖
http://homepage.cs.uiowa.edu/~cwyman/classes/spring08-22C251/homework/canny.pdf
http://suraj.lums.edu.pk/~cs436a02/CannyImplementation.htm
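The magnitude and rounded angle for a single pixel can be sketched like this, given its two derivative values. The function names are illustrative. The sketch swaps in `atan2` for the arctangent of the quotient so that a zero X derivative does not divide by zero, and it assumes Y increases upward so that 45° means ↗↙; with image rows growing downward, negate `dy` first.

```python
import math

def gradient_magnitude(dx, dy):
    """Length of the gradient vector (dx, dy)."""
    return math.hypot(dx, dy)

def quantized_angle(dx, dy):
    """Round the gradient direction to 0, 45, 90, or 135 degrees."""
    # Opposite directions lie on the same line, so fold into [0, 180).
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    # Nearest multiple of 45; the % 4 maps 180 back onto 0.
    return (round(angle / 45.0) % 4) * 45
```

For example, `quantized_angle(-1, 1)` falls in the 135° bin, and `quantized_angle(-1, 0)` folds back to 0°.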
Make edges exactly one pixel thick
Look at the gradient magnitude of your 2 neighbors in the direction of your angle
Example 1: Angle = 135° ↘↖ Keep it!
Example 2: Angle = 0° ←→ Kill it!
Figures: Thick Edges (Gradient Magnitude); Thin Edges (Gradient Magnitude)
http://suraj.lums.edu.pk/~cs436a02/CannyImplementation.htm
Two thresholds are better than one!
If a pixel’s value is above T_high, it’s an edge.
If a pixel’s value is below T_low, it’s not an edge.
If a pixel’s value is between T_high and T_low, it might be an edge (provided it is connected to an actual edge).
Example: T_high = 45, T_low = 35
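The three-way rule can be written as a small function. This is only a sketch: the name `classify` and the string labels are made up for illustration, and the default thresholds are the example values from the slide.

```python
def classify(magnitude, t_high=45, t_low=35):
    """Sort one gradient magnitude into strong, weak, or not an edge."""
    if magnitude > t_high:
        return "strong"   # definitely an edge
    if magnitude < t_low:
        return "none"     # definitely not an edge
    return "weak"         # an edge only if connected to a strong edge

labels = [classify(m) for m in (50, 40, 30)]
```

With the slide's thresholds, the three sample magnitudes land in the three different classes.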
1. Smooth image to reduce noise
2. Calculate X & Y derivatives to get edges
3. Thin all edge widths to 1 pixel
4. Remove weak, unconnected edges (ta da!)
How do we parallelize the Canny Edge Detector?
Convolution – Independent of order
Image (3x3): 5 5 5 / 10 10 10 / 20 20 20
Element-wise Multiplication with the Kernel: 10 10 10 / 20 40 20 / 40 40 40
Sum All Values: 230
Divide by Kernel Sum
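The arithmetic on this slide can be replayed step by step. The image values and the products come from the slide; the kernel values are not in the text, so the kernel below is inferred by dividing each product by its image value and should be treated as an assumption.

```python
image = [[5, 5, 5],
         [10, 10, 10],
         [20, 20, 20]]

# Inferred from the slide's products (assumption, not stated in the text).
kernel = [[2, 2, 2],
          [2, 4, 2],
          [2, 2, 2]]

# Element-wise multiplication of image and kernel.
products = [[i * k for i, k in zip(irow, krow)]
            for irow, krow in zip(image, kernel)]

total = sum(sum(row) for row in products)      # sum all values
kernel_sum = sum(sum(row) for row in kernel)   # normalizing constant
result = total / kernel_sum                    # new value of the center pixel
```

Because the nine products are simply added, they can be computed in any order, which is what makes the operation easy to parallelize. Strictly speaking, convolution flips the kernel first, but a symmetric kernel like this one is unchanged by the flip.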
Convolve a Gaussian Kernel with the image
Each GPU core can convolve each pixel in the image individually with the Gaussian Kernel
One thread per pixel, each performing 9 multiplies, 9 adds, and 1 division
Embarrassingly Parallel with huge speedup
Convolve two Sobel Kernels with the image
Wait, convolution again?
Same as previous step – we can even reuse the convolution function!
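A single routine can serve both steps, as sketched below on the CPU. This is not the presenters' GPU code: the name `convolve`, the `divisor` parameter, and the clamped borders are assumptions. The 3x3 Gaussian approximation and the Sobel kernels are common textbook choices, and the kernel is applied without flipping.

```python
def convolve(image, kernel, divisor=1):
    """Apply a square kernel at every pixel; borders are clamped."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Each (x, y) depends only on the input image, so on a GPU
            # this loop body would be the work of one thread.
            acc = 0.0
            for ky, row in enumerate(kernel):
                for kx, weight in enumerate(row):
                    yy = min(max(y + ky - half, 0), h - 1)
                    xx = min(max(x + kx - half, 0), w - 1)
                    acc += weight * image[yy][xx]
            out[y][x] = acc / divisor
    return out

GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]      # weights sum to 16
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]     # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]     # responds to horizontal edges

img = [[0, 0, 10, 10]] * 4                          # one vertical edge
smooth = convolve(img, GAUSS_3X3, divisor=16)
d_x = convolve(img, SOBEL_X)
d_y = convolve(img, SOBEL_Y)
```

On this test image the X derivative is nonzero next to the edge while the Y derivative is zero everywhere, since every row is identical.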
Comparing 3 pixel gradient magnitudes and clearing the middle pixel or leaving it alone
Similar to convolution… but simpler!
Each GPU thread owns a pixel:
1. Check gradient angle of pixel
2. Compare this pixel’s magnitude with two neighbors in the direction of its angle
3. If I’m greater than those neighbors, leave me alone; otherwise, mark me as “not an edge”
Less speedup than steps 1 and 2
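The per-pixel procedure above might look like the sketch below, where `suppress_pixel` plays the role of one GPU thread. The names, the neighbor offset table, and the choice to ignore neighbors outside the image are assumptions. Rows are assumed to grow downward, so "up" is a Y offset of -1.

```python
# Offsets (dx, dy) of the two neighbors along each rounded gradient angle.
NEIGHBORS = {
    0:   ((-1, 0), (1, 0)),     # left, right
    45:  ((1, -1), (-1, 1)),    # up-right, down-left
    90:  ((0, -1), (0, 1)),     # up, down
    135: ((-1, -1), (1, 1)),    # up-left, down-right
}

def suppress_pixel(mag, angle, x, y):
    """Return the pixel's magnitude if it beats both neighbors, else 0."""
    h, w = len(mag), len(mag[0])
    here = mag[y][x]
    for dx, dy in NEIGHBORS[angle[y][x]]:
        nx, ny = x + dx, y + dy
        # A neighbor that is at least as large means "not an edge".
        if 0 <= nx < w and 0 <= ny < h and mag[ny][nx] >= here:
            return 0
    return here

def non_max_suppression(mag, angle):
    """Run the per-pixel check over the whole image."""
    return [[suppress_pixel(mag, angle, x, y) for x in range(len(mag[0]))]
            for y in range(len(mag))]

mag = [[1, 5, 2]] * 3
angle = [[0, 0, 0]] * 3
thin = non_max_suppression(mag, angle)
```

Each call reads only the input grids and writes one output value, so the calls are independent of each other. Because the comparison is strict, a plateau of equal magnitudes is cleared entirely under this sketch.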
Mark pixels > T_high as strong edges
Mark pixels < T_low as not edges
Mark remaining pixels as weak edges if they connect to a strong edge
Typically implemented with recursion
Each thread with a weak-edge pixel looks at nearest 2 neighbors to find a strong-edge pixel
With identical algorithms on CPU and GPU, speedup is marginal (memory accesses, not much processing)
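One way to sketch the connectivity step on the CPU is shown below. It is not the presenters' implementation: it replaces recursion with an explicit stack, and it assumes a weak pixel counts as connected if any of its 8 touching neighbors is already an edge, which may differ from the 2-neighbor check on the slide. The function name and default thresholds (the slide's example values) are illustrative.

```python
def hysteresis(mag, t_high=45, t_low=35):
    """Return a grid of booleans: True where the pixel is kept as an edge."""
    h, w = len(mag), len(mag[0])
    # Strong pixels are edges from the start.
    edge = [[mag[y][x] > t_high for x in range(w)] for y in range(h)]
    stack = [(x, y) for y in range(h) for x in range(w) if edge[y][x]]
    while stack:
        x, y = stack.pop()
        for ny in range(max(y - 1, 0), min(y + 2, h)):
            for nx in range(max(x - 1, 0), min(x + 2, w)):
                # Promote a weak pixel that touches a confirmed edge.
                if not edge[ny][nx] and mag[ny][nx] >= t_low:
                    edge[ny][nx] = True
                    stack.append((nx, ny))
    return edge

kept = hysteresis([[50, 40, 10, 40]])
```

In the sample row, the first weak pixel touches the strong pixel and is kept, while the last weak pixel is cut off by a sub-threshold pixel and is dropped. The work per pixel is a handful of reads and comparisons, which fits the slide's point that memory access dominates.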
Wikipedia: The mechanical or electronic conversion of images of printed text into computer-readable text.
http://hackadaycom.files.wordpress.com/2010/09/helloworldconsole.png
http://www.flacom.com/content/uploads/2013/09/hello-world.jpg
Label Connected Components
Look For Letters
Adjust for disconnected letters
Figure labels: HELLO WORLD; E F ?; ü j i; H E L L O W O R L D
Create a list of components in the image
A component is simply a set of connected edges
1. Label each edge pixel with a unique component ID
2. Examine each pixel’s 8 touching neighbors and set that pixel’s ID to the smallest neighbor ID
3. Repeat step 2 until no pixel IDs are changed
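The three labelling steps can be sketched as follows. The function name and the use of `None` for non-edge pixels are assumptions. This CPU version updates IDs in place during each sweep; a GPU version would more likely read from one buffer and write to another, with one thread per pixel.

```python
def label_components(edges):
    """edges: grid of booleans. Returns a grid of IDs (None = not an edge)."""
    h, w = len(edges), len(edges[0])
    # Step 1: every edge pixel starts with its own unique ID.
    ids = [[y * w + x if edges[y][x] else None for x in range(w)]
           for y in range(h)]
    changed = True
    while changed:                      # Step 3: repeat until nothing changes
        changed = False
        for y in range(h):
            for x in range(w):
                if ids[y][x] is None:
                    continue
                # Step 2: adopt the smallest ID among the 8 touching neighbors.
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        n = ids[ny][nx]
                        if n is not None and n < ids[y][x]:
                            ids[y][x] = n
                            changed = True
    return ids

grid = [[True, True, False, True],
        [False, False, False, True]]
labels = label_components(grid)
```

When the loop stops, all pixels in one connected component share the smallest starting ID in that component, so the distinct IDs enumerate the components.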
Uhh… what’s a letter?
How do we know it’s a letter?
How does the computer know it’s a letter?
Letters are represented by a vector of numbers indicating the ratio of black pixels to white pixels in each division of the letter-image.
Figure: the letter A and its vector values: 0 0 0 5 0 40 0 0 15 15 0 15 15
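A letter vector of this kind might be computed as sketched below. The grid size, the function name, and the 0/1 glyph format are assumptions. The sketch also stores the fraction of black pixels in each division instead of a strict black-to-white ratio, so that an all-black division does not divide by zero.

```python
def letter_vector(glyph, rows=2, cols=2):
    """glyph: grid of 0/1 values (1 = black), at least rows x cols pixels.

    Returns one black-pixel fraction per division, row by row.
    """
    h, w = len(glyph), len(glyph[0])
    vector = []
    for r in range(rows):
        for c in range(cols):
            # Pixel bounds of this division.
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [glyph[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            vector.append(sum(cell) / len(cell))
    return vector

glyph = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 1]]
vec = letter_vector(glyph)
```

A finer grid gives a longer vector that separates similar letters better, at the cost of being more sensitive to small shifts in the glyph.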
Compute how closely each labelled component matches each letter in your alphabet
The component is then marked with whichever letter it most closely matches
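The matching step can be sketched as a nearest-neighbor search over the alphabet's reference vectors. The slides do not say how closeness is measured; squared Euclidean distance is assumed here, and the function name and the sample reference vectors are made up for illustration.

```python
def closest_letter(vector, alphabet):
    """alphabet: dict mapping letter -> reference vector of the same length."""
    def distance(a, b):
        # Squared Euclidean distance (assumed similarity measure).
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(alphabet, key=lambda letter: distance(vector, alphabet[letter]))

alphabet = {"I": [0.0, 1.0, 0.0],
            "L": [1.0, 0.0, 0.0]}
match = closest_letter([0.9, 0.1, 0.0], alphabet)
```

Each component's comparison against the alphabet is independent of every other component's, so this step can also be spread across threads.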
Letters like ‘i’ and ‘j’ have floating parts
Sometimes edge detection may accidentally break up a letter
A letter vector should then get an additional property indicating vertical discontinuity
LETTER VECTOR: … … … 0/1