Object Position and Orientation Detection
For Mars Science Laboratory Helicopter Test Imagery
Michael Johnson
EE 368
Stanford University
June 5, 2012
Background
• The Mars Science Laboratory Terminal Descent Sensor, the rover’s landing
radar, underwent a field test campaign in the Mojave Desert.
• One purpose of the field test was to gauge how the rover’s position during
the sky crane descent phase might affect the radar return during this
crucial phase of the mission.
• To test this, the landing radar was affixed to a helicopter, and a mock-up of
the rover, with targets attached to the top deck, was attached to a winch
underneath the helicopter. The radar was operated while the position of
the rover mock-up was varied.
• A camera was positioned near the winch, facing down, to image the rover
as the test was underway. Time-tagged images were captured at a rate of
6 frames per second; these are used to estimate the position of the rover
during the test and are matched to specific times in the radar test data.
Problem Statement
• As stated in the background, the problem is to estimate an object’s
position and orientation in an image.
• In each image, the object (in this case the rover mock-up) will be:
– Translated along the x and y dimensions.
– Scaled in size depending on the distance from the camera.
– Rotated due to helicopter maneuvers, wind, and other factors.
• Although the mock-up may also experience roll and pitch effects, these are
assumed to be minimal due to the nature of the test, and will be ignored.
• Image quality can vary due to:
– Angle of the sun causing bright reflections on the rover surfaces.
– Shadows cast by the helicopter over the rover, obscuring the targets.
Proposed Solution
• It is proposed to detect the rover’s position and orientation in the images
using the following three steps:
– Perform Sobel edge detection to detect the edges of the circular targets.
– Calculate the Circular Hough Transform on the filtered images and find the circular targets
by detecting peaks in the transform. This will be performed over a range of possible
radii until ‘good’ peaks are seen, giving a set of possible target locations.
– After possible targets have been identified, a Procrustes method will be used to match
each possible target with the known actual targets.
Sobel Edge Detection
• The Sobel Operator will be used for edge detection in the images, along
both the horizontal and vertical directions (essential for detecting circles).
• The Sobel Operator uses 3x3 kernels convolved with the original image.
These kernels are:

    Gx = [ -1  0  +1 ]        Gy = [ -1  -2  -1 ]
         [ -2  0  +2 ]             [  0   0   0 ]
         [ -1  0  +1 ]             [ +1  +2  +1 ]
• The effect of these kernels is to approximate the horizontal and vertical
intensity derivatives at each point, which detects abrupt changes from light
to dark.
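For reference, below is a minimal sketch of this filtering in Python/NumPy; the implementation details (the use of SciPy, the symmetric boundary handling, and the threshold value) are assumptions, not taken from the original work.

    import numpy as np
    from scipy.signal import convolve2d

    # Sobel kernels: horizontal (Gx) and vertical (Gy) derivative approximations.
    GX = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)
    GY = GX.T

    def sobel_edges(image, threshold=0.25):
        """Return a binary edge map from a grayscale image scaled to [0, 1]."""
        gx = convolve2d(image, GX, mode="same", boundary="symm")
        gy = convolve2d(image, GY, mode="same", boundary="symm")
        magnitude = np.hypot(gx, gy)         # gradient magnitude per pixel
        magnitude /= magnitude.max() or 1.0  # normalize to [0, 1]
        return magnitude > threshold         # keep only strong edges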
Sobel Edge Detection Results
• An example of Sobel edge detection on the imagery is shown below. The detection does a
good job of capturing the edges of the targets while rejecting some of the other edges.
• It is noted that there is difficulty in detecting the targets obscured by shadows, which is
acceptable if enough of the other targets are detected.
[Figure: Sobel edge detection result on an example image, with the detected targets numbered.]
Circular Hough Transform
• The Circular Hough Transform is a series of operations performed on binary
(black-and-white) images for circle detection.
• The general method of the transform is as follows:
– Create an accumulator matrix the size of the image.
– For every bright pixel in the image, draw a circle of radius R centered at this pixel in the
accumulator matrix. In other words, increment the value of the accumulator matrix by
one at every pixel where a circle is drawn.
– Peaks in the resulting accumulator matrix indicate circles of radius R being present at
that pixel in the original image.
– An illustration: red circles are drawn around each point of a circle in the
source image, and the accumulator develops a peak where these circles
intersect, which is the center of the source circle.
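A minimal sketch of this voting procedure is shown below, assuming a binary edge image such as the Sobel output from the previous slides; the circle sampling density and the function names are illustrative choices, not details from the original work.

    import numpy as np

    def circular_hough(edges, radius):
        """Vote for circle centers of one fixed radius in a binary edge image."""
        h, w = edges.shape
        accumulator = np.zeros((h, w), dtype=np.int32)
        # Precompute the pixel offsets of a circle of this radius once.
        n = max(int(2 * np.pi * radius), 8)
        thetas = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
        dy = np.round(radius * np.sin(thetas)).astype(int)
        dx = np.round(radius * np.cos(thetas)).astype(int)
        ys, xs = np.nonzero(edges)               # every bright (edge) pixel
        for y, x in zip(ys, xs):
            cy, cx = y + dy, x + dx              # circle centered on this pixel
            ok = (cy >= 0) & (cy < h) & (cx >= 0) & (cx < w)
            np.add.at(accumulator, (cy[ok], cx[ok]), 1)  # count repeated hits
        return accumulator

Peaks in the returned accumulator mark likely circle centers; sweeping radius over a range of plausible values reproduces the search over radii described in the proposed solution.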
Circular Hough Transform Results
[Figure: left, BW level image of the Sobel-filtered image; right, results of the Circular Hough Transform. The centers of the targets have noticeable peaks in them.]
Circular Hough Transform Results
• Due to irregularities in the shapes and sizes of the target circles, the Circular
Hough Transform may not produce a single peak at the center of each target, but a small
cluster of peaks near the center. In these cases, close peaks (on the order of less
than a circle radius apart) will be averaged together.
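A sketch of this averaging step follows; the greedy seed-and-gather clustering here is an illustrative choice, and peaks is assumed to be a list of (row, column) coordinates already extracted from the accumulator.

    import numpy as np

    def merge_close_peaks(peaks, radius):
        """Replace clusters of nearby accumulator peaks with their average position."""
        remaining = [np.asarray(p, dtype=float) for p in peaks]
        merged = []
        while remaining:
            seed = remaining.pop(0)
            cluster, rest = [seed], []
            for p in remaining:                  # gather peaks near the seed
                (cluster if np.hypot(*(p - seed)) < radius else rest).append(p)
            remaining = rest
            merged.append(np.mean(cluster, axis=0))  # averaged cluster center
        return merged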
Circular Hough Transform Results
[Figure: original image with detected targets superimposed.]
• Generally, the method is capable of picking up a large fraction of the targets.
• Some are not detected due to shadowing.
• Also, clutter around a target (the tape around the top-most undetected target)
can clutter the detected edges and therefore produce a poor circle shape.
• However, further processing will allow for good object tracking even when some
targets are not identified.
• The method is very good at not erroneously identifying clutter as a target.
Procrustes Method
• Once potential targets are identified, the next step is to match them with
the actual known targets. A Procrustes Method was employed to do this.
• The Procrustes Method employed consists of several stages:
– Pre-process the set of actual target points by translating them in X and Y such that the
mean position of all the targets is zero, and then scale them such that the RMS distance
of all points to the origin is one.
– For the identified points:
• Translate the identified points in X and Y the same way as the actual points.
• Scale the identified points the same way as the actual points.
• Once the points are translated and scaled, find the orientation angle:
– Iterate over a set of angles and calculate the distance from each identified target to the
nearest actual target.
– For each identified target, choose as its pair the actual target that is closest.
– Calculate as a cost function the sum of the squares of these distances.
– The rotation angle to use is the one with the smallest value of the cost function.
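Below is a sketch of the normalization and rotation search just described, assuming detected and actual are N x 2 arrays of target coordinates; the 360-angle search grid is an assumption.

    import numpy as np

    def normalize(points):
        """Zero the mean, then scale so the RMS distance to the origin is one."""
        centered = points - points.mean(axis=0)
        rms = np.sqrt((centered ** 2).sum(axis=1).mean())
        return centered / rms

    def nearest_cost(points, actual):
        """Sum of squared distances from each point to its nearest actual target."""
        d2 = ((points[:, None, :] - actual[None, :, :]) ** 2).sum(axis=2)
        return d2.min(axis=1).sum()

    def best_rotation(detected, actual, n_angles=360):
        """Sweep candidate angles and keep the one with the smallest cost."""
        angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
        costs = []
        for a in angles:
            c, s = np.cos(a), np.sin(a)
            rotated = detected @ np.array([[c, -s], [s, c]])
            costs.append(nearest_cost(rotated, actual))
        return angles[int(np.argmin(costs))]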
Procrustes Method
• An orientation angle is now known. However, because some targets may be missing
from the identified set, the translation and scaling first performed may not have
been accurate.
• Therefore, another round of translation and scaling is performed:
– Translate the identified targets again in X and Y by sweeping over a small range of values,
picking the one that produces the minimum value of the cost function.
– Likewise, scale the points again over a small range of values, again picking the scale
value that minimizes the cost function (a sketch of this sweep appears at the end of
this slide).
• Performing another round of rotation could be done, but generally by this point
the identified and actual points will be close enough to not require it.
• At this point, each identified point can be matched with an actual point by finding
the actual point that is closest to each identified point.
• Now, we have the pixel location in the image for each actual target, along with
the orientation angle, the scale, and the translation.
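A sketch of the refinement sweep described on this slide, assuming detected has already been normalized and rotated by the best angle found earlier; the sweep ranges and grid sizes are assumptions.

    import numpy as np

    def nearest_cost(points, actual):
        """Sum of squared distances from each point to its nearest actual target."""
        d2 = ((points[:, None, :] - actual[None, :, :]) ** 2).sum(axis=2)
        return d2.min(axis=1).sum()

    def refine_translation_scale(detected, actual,
                                 shifts=np.linspace(-0.2, 0.2, 21),
                                 scales=np.linspace(0.9, 1.1, 21)):
        """Re-sweep translation, then scale, at the already-chosen rotation."""
        # Translation sweep over a small grid of x/y offsets.
        offsets = [np.array([dx, dy]) for dx in shifts for dy in shifts]
        costs = [nearest_cost(detected + o, actual) for o in offsets]
        shift = offsets[int(np.argmin(costs))]
        # Scale sweep, keeping the best translation fixed.
        costs = [nearest_cost((detected + shift) * s, actual) for s in scales]
        scale = float(scales[int(np.argmin(costs))])
        return shift, scale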
Results
[Figure: ‘Target Locations After Detection’. An example image with each detected
target assigned a reference number corresponding to an actual target.]
[Figure: ‘Actual Target Positions and Labels’ versus ‘Detected Targets Estimate’.
A plot of the normalized actual target locations against the locations of the
detected targets after translation, scaling, and rotation (axes: Normalized X
direction, Normalized Y direction).]
Results
• A subset of 50 images was selected and run through the described method.
[Figure: three plots versus Image Number for the 50 images: ‘Fraction of Targets
Detected’, ‘Mean Distance from Detected Targets to Actual Targets’, and ‘Number
of Detected Targets Rejected as No Good’.]
• An average of 74.4% of targets are identified, plenty to develop a good
estimate of the rover’s position.
Conclusion
• Performance is generally good, although some images are more difficult to track
than others due to the reasons previously mentioned: shadowing, sun angle, etc.
• The algorithm is efficient, taking a fraction of a second per image with
little optimization.
• Room for improvement includes performance optimizations and more thorough
pose estimation based on roll/pitch/yaw characteristics.