Transcript Slide 1
High-Precision Globally-Referenced Position and Attitude via a Fusion of Visual SLAM, Carrier-Phase-Based GPS, and Inertial Measurements
Daniel Shepard and Todd Humphreys
2014 IEEE/ION PLANS Conference, Monterey, CA | May 8, 2014

Overview
- Globally-Referenced Visual SLAM
- Motivating Application: Augmented Reality
- Estimation Architecture
- Bundle Adjustment (BA)
- Simulation Results for BA
2 of 21

Stand-Alone Visual SLAM
- Produces high-precision estimates of
  - Camera motion (with ambiguous scale for monocular SLAM)
  - A map of the environment
- Limited in application due to lack of a global reference
[1] G. Klein and D. Murray, “Parallel tracking and mapping for small AR workspaces,” in 6th IEEE and ACM International Symposium on Mixed and Augmented Reality. IEEE, 2007, pp. 225–234.
3 of 21

Visual SLAM with Fiducial Markers
- Globally-referenced solution if fiducial markers are globally-referenced
- Requires substantial infrastructure and/or mapping effort
- Microsoft’s augmented reality maps (TED2010 [2])
[2] B. A. y Arcas, “Blaise Aguera y Arcas demos augmented-reality maps,” TED, Feb. 2010, http://www.ted.com/talks/blaise aguera.html.
4 of 21

Can globally-referenced position and attitude (pose) be recovered from combining visual SLAM and GPS?
5 of 21

Observability of Visual SLAM + GPS
Observability of translation, rotation, and scale by number of GPS positions:
- No GPS positions
- 1 GPS position
- 2 GPS positions (rotation marked "~")
- 3 GPS positions
6 of 21

Combined Visual SLAM and CDGPS
- CDGPS anchors visual SLAM to a global reference frame
- Can add an IMU to improve dynamic performance (not required!)
- Can be made inexpensive
- Requires little infrastructure
- Very accurate!
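
The observability question raised on the "Observability of Visual SLAM + GPS" slide can be explored numerically. The sketch below is not from the talk: it assumes that stand-alone visual SLAM fixes the trajectory up to a 7-parameter similarity transform (3 translation, 3 rotation, 1 scale), neglects the camera-to-antenna lever arm, and counts how many of those parameters a set of GPS positions constrains by taking the rank of the linearized alignment Jacobian. The function names and the sample points are hypothetical.

```python
import numpy as np

def skew(v):
    """Cross-product matrix, so that skew(v) @ u == np.cross(v, u)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def similarity_jacobian(points):
    """Jacobian of t + s*R*x_i w.r.t. (t, small rotation theta, s) at R = I, s = 1.

    Each GPS position contributes a 3x7 block [I, -skew(x_i), x_i].
    """
    blocks = [np.hstack([np.eye(3), -skew(x), x.reshape(3, 1)]) for x in points]
    return np.vstack(blocks)

def observable_dof(points):
    """Number of similarity-transform parameters constrained by the GPS positions."""
    points = np.asarray(points, dtype=float)
    if points.size == 0:
        return 0  # no GPS positions: nothing ties the SLAM frame to the globe
    return int(np.linalg.matrix_rank(similarity_jacobian(points.reshape(-1, 3))))

# Hypothetical, non-collinear SLAM-frame positions at which GPS fixes exist.
pts = np.array([[0.3, -1.2, 2.0], [4.1, 0.5, 1.0], [2.0, 3.3, -0.7]])

for n in range(4):
    print(n, "GPS positions ->", observable_dof(pts[:n]), "of 7 parameters constrained")
```

Under these assumptions, two positions leave exactly one parameter unconstrained (rotation about the line joining them), which is consistent with the "~" beside rotation on the slide, and three non-collinear positions constrain all seven.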
7 of 21

Motivating Application: Augmented Reality
- Augmenting a live view of the world with computer-generated sensory input to enhance one’s current perception of reality [3]
- Current applications are limited by lack of accurate global pose
- Potential uses in
  - Construction
  - Real estate
  - Gaming
  - Social media
[3] M. Graham, M. Zook, and A. Boulton, “Augmented reality in urban places: contested content and the duplicity of code,” Transactions of the Institute of British Geographers.
8 of 21

Estimation Architecture Motivation
- Sensors:
  - Camera
  - Two GPS antennas (reference and mobile)
  - IMU
- How can the information from these sensors best be combined to estimate the camera pose and a map of the environment?
- Real-time operation
- Computational burden vs. precision
9 of 21

Sensor Fusion Approach
- Tighter coupling = higher precision, but increased computational burden
[Figure: four coupling configurations, each combining IMU, Visual SLAM, and CDGPS]
10 of 21

The Optimal Estimator
11 of 21

IMU only for Pose Propagation
12 of 21

Tightly-Coupled Architecture
13 of 21

Loosely-Coupled Architecture
14 of 21

Hybrid Batch/Sequential Estimator
- Only geographically diverse frames (keyframes) in batch estimator
15 of 21

Bundle Adjustment State and Measurements
State Vector:
$$\boldsymbol{X}_{BA} = \begin{bmatrix} \boldsymbol{c} \\ \boldsymbol{p} \end{bmatrix}, \quad \boldsymbol{c} = \begin{bmatrix} \cdots & (\boldsymbol{x}_G^{C_i})^T & (\boldsymbol{q}_G^{C_i})^T & \cdots \end{bmatrix}^T, \quad \boldsymbol{p} = \begin{bmatrix} \cdots & (\boldsymbol{x}_G^{p_j})^T & \cdots \end{bmatrix}^T$$
Measurement Models (a tilde denotes a measured quantity):
CDGPS Positions:
$$\tilde{\boldsymbol{x}}_G^{A_i} = \boldsymbol{h}_x(\boldsymbol{x}_G^{C_i}, \boldsymbol{q}_G^{C_i}) + \boldsymbol{w}_{x_i} = \boldsymbol{x}_G^{C_i} + R(\boldsymbol{q}_G^{C_i})\,\boldsymbol{x}_C^{A} + \boldsymbol{w}_{x_i}$$
Image Feature Measurements:
$$\tilde{\boldsymbol{s}}_{I_i}^{p_j} = \boldsymbol{h}_s(\boldsymbol{x}_{C_i}^{p_j}) + \boldsymbol{w}_{I_i}^{p_j} = \begin{bmatrix} x_{C_i}^{p_j} / z_{C_i}^{p_j} & y_{C_i}^{p_j} / z_{C_i}^{p_j} \end{bmatrix}^T + \boldsymbol{w}_{I_i}^{p_j}$$
$$\boldsymbol{x}_{C_i}^{p_j} = \begin{bmatrix} x_{C_i}^{p_j} & y_{C_i}^{p_j} & z_{C_i}^{p_j} \end{bmatrix}^T = R(\boldsymbol{q}_G^{C_i})^T \left( \boldsymbol{x}_G^{p_j} - \boldsymbol{x}_G^{C_i} \right)$$
16 of 21

Bundle Adjustment Cost Minimization
- Weighted least-squares cost function
- Employs robust weight functions to handle outliers
$$\underset{\boldsymbol{X}_{BA}}{\operatorname{argmin}} \; \frac{1}{2} \sum_{i=1}^{N} \left( \left\| \Delta\boldsymbol{x}_G^{A_i} \right\|^2 + \sum_{j=1}^{M} w_V\!\left( \Delta\boldsymbol{s}_{I_i}^{p_j} \right) \left\| \Delta\boldsymbol{s}_{I_i}^{p_j} \right\|^2 \right)$$
where the normalized residuals compare each measurement (tilde) with the value predicted from the current state estimate (hat):
$$\Delta\boldsymbol{x}_G^{A_i} = R_{\boldsymbol{x}_G^{A_i}}^{-1/2} \left( \tilde{\boldsymbol{x}}_G^{A_i} - \hat{\boldsymbol{x}}_G^{A_i} \right), \qquad \Delta\boldsymbol{s}_{I_i}^{p_j} = R_{\boldsymbol{s}_{I_i}^{p_j}}^{-1/2} \left( \tilde{\boldsymbol{s}}_{I_i}^{p_j} - \hat{\boldsymbol{s}}_{I_i}^{p_j} \right)$$
- Sparse Levenberg-Marquardt algorithm
- Computational complexity linear in number of point features, but cubic in number of keyframes
17 of 21

Bundle Adjustment Initialization
- Initialize BA based on stand-alone visual SLAM solution and CDGPS positions
- Determine similarity transform relating coordinate systems
$$\underset{\boldsymbol{x}_G^{V},\, \boldsymbol{q}_G^{V},\, s}{\operatorname{argmin}} \; \frac{1}{2} \sum_{i=1}^{N} \left\| \tilde{\boldsymbol{x}}_G^{A_i} - \boldsymbol{x}_G^{V} - R(\boldsymbol{q}_G^{V}) \left( s\,\boldsymbol{x}_V^{C_i} + R(\boldsymbol{q}_V^{C_i})\,\boldsymbol{x}_C^{A} \right) \right\|^2$$
- Generalized form of Horn’s transform [4]
  - Rotation: Rotation that best aligns deviations from mean camera position
  - Scale: A ratio of metrics describing spread of camera positions
  - Translation: Difference in mean antenna position
[4] B. K. Horn, “Closed-form solution of absolute orientation using unit quaternions,” JOSA A, vol. 4, no. 4, pp. 629–642, 1987.
18 of 21

Simulation Scenario for BA
- Simulations investigating estimability included in paper
- Hallway Simulation:
  - Measurement errors:
    - 2 cm std for CDGPS
    - 1 pixel std for vision
  - Keyframes every 0.25 m
  - 242 keyframes
  - 1310 point features
- Three scenarios:
  1. GPS available
  2. GPS lost when hallway entered
  3. GPS reacquired when hallway exited
[Figure: hallway layout with locations labeled A, B, C, and D]
19 of 21

Simulation Results for BA
20 of 21

Summary
- Hybrid batch/sequential estimator for loosely-coupled visual SLAM and CDGPS with IMU for state propagation
  - Compared to optimal estimator
- Outlined algorithm for BA (batch)
  - Presented a novel technique for initialization of BA
- BA simulations
  - Demonstrated positioning accuracy of ~1 cm and attitude accuracy of ~0.1° in areas of GPS availability
  - Attained slow drift during GPS unavailability (0.4% drift over 50 m)
21 of 21

Navigation Filter
State Vector:
$$\boldsymbol{X}_F = \begin{bmatrix} (\boldsymbol{x}_G^{C})^T & (\boldsymbol{v}_G^{C})^T & (\boldsymbol{b}_B^{f})^T & (\boldsymbol{q}_G^{C})^T & (\boldsymbol{b}_B^{\omega})^T \end{bmatrix}^T$$
Propagation Step:
- Standard EKF propagation step using accelerometer and gyro measurements
- Accelerometer and gyro biases modeled as first-order Gauss-Markov processes
- More information in paper …
22 of 21

Navigation Filter (cont.)
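Measurement Update Step:
- Image feature measurements from all non-keyframes
- Temporarily augment the state with point feature positions
  - Prior from map produced by BA
  - Must ignore cross-covariances ⇒ filter inconsistency
- Similar block diagonal structure in the normal equations as BA
$$\begin{bmatrix} U_F & W_F \\ W_F^T & V_F \end{bmatrix} \begin{bmatrix} \delta\boldsymbol{X}_F \\ \delta\boldsymbol{p} \end{bmatrix} = \begin{bmatrix} \boldsymbol{\epsilon}_F \\ \boldsymbol{\epsilon}_p \end{bmatrix}$$
$$\Rightarrow \quad \begin{bmatrix} U_F - W_F V_F^{-1} W_F^T & 0 \\ W_F^T & V_F \end{bmatrix} \begin{bmatrix} \delta\boldsymbol{X}_F \\ \delta\boldsymbol{p} \end{bmatrix} = \begin{bmatrix} I & -W_F V_F^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} \boldsymbol{\epsilon}_F \\ \boldsymbol{\epsilon}_p \end{bmatrix}$$
23 of 21

Simulation Results for BA (cont.)
24 of 21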
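
The block elimination on the "Navigation Filter (cont.)" slide can be sketched in a few lines. This is an illustrative example rather than the authors' implementation: it assumes that the point-feature block V is block diagonal with one 3x3 block per feature, and the function name, dimensions, and random test system are all hypothetical. It solves the reduced system for the state correction first, then back-substitutes for the point corrections.

```python
import numpy as np

def solve_by_schur(U, W, V_blocks, eps_x, eps_p):
    """Solve [[U, W], [W.T, V]] @ [dx; dp] = [eps_x; eps_p] for block-diagonal V."""
    n = len(V_blocks)
    V_inv = np.zeros((3 * n, 3 * n))
    # Invert V one 3x3 block at a time: cost grows linearly with feature count.
    for k, Vk in enumerate(V_blocks):
        V_inv[3 * k:3 * k + 3, 3 * k:3 * k + 3] = np.linalg.inv(Vk)
    WVinv = W @ V_inv
    S = U - WVinv @ W.T                              # Schur complement of V
    dx = np.linalg.solve(S, eps_x - WVinv @ eps_p)   # reduced system
    dp = V_inv @ (eps_p - W.T @ dx)                  # back-substitution
    return dx, dp

# Hypothetical well-conditioned test system: 6 state parameters, 4 point features.
rng = np.random.default_rng(0)
n_x, n_pts = 6, 4
A = rng.standard_normal((n_x, n_x))
U = A @ A.T + n_x * np.eye(n_x)
V_blocks = []
for _ in range(n_pts):
    B = rng.standard_normal((3, 3))
    V_blocks.append(B @ B.T + 3.0 * np.eye(3))
W = 0.1 * rng.standard_normal((n_x, 3 * n_pts))
eps_x = rng.standard_normal(n_x)
eps_p = rng.standard_normal(3 * n_pts)

dx, dp = solve_by_schur(U, W, V_blocks, eps_x, eps_p)

# Reference: assemble the full matrix and solve it directly.
V = np.zeros((3 * n_pts, 3 * n_pts))
for k, Vk in enumerate(V_blocks):
    V[3 * k:3 * k + 3, 3 * k:3 * k + 3] = Vk
H = np.block([[U, W], [W.T, V]])
full = np.linalg.solve(H, np.concatenate([eps_x, eps_p]))
print("matches direct solve:", np.allclose(np.concatenate([dx, dp]), full))
```

The only dense solve is on the small reduced matrix, which is the same structural property behind the "linear in number of point features, but cubic in number of keyframes" complexity noted for BA.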