Machine Learning and Robotics


MACHINE LEARNING AND ROBOTICS

Lisa Lyons 10/22/08

OUTLINE

 Machine Learning Basics and Terminology
 An Example: DARPA Grand/Urban Challenge
 Multi-Agent Systems
 Netflix Challenge (if time permits)

INTRODUCTION

 Machine learning is commonly associated with robotics
 When some think of robots, they think of machines like WALL-E: human-looking, with feelings, capable of complex tasks
 Goals for machine learning in robotics aren't usually this advanced, but some think we're getting there
 The next three slides outline some goals that motivate researchers to continue work in this area

HOUSEHOLD ROBOT TO ASSIST THE HANDICAPPED

 Could come preprogrammed with general procedures and behaviors
 Needs to be able to learn to recognize objects, obstacles, and maybe even its owner (face recognition?)
 Also needs to be able to manipulate objects without breaking them
 May not always have all information about its environment (poor lighting, obscured objects)

FLEXIBLE MANUFACTURING ROBOT

 Configurable robot that could manufacture multiple items
 Must learn to manipulate new types of parts without damaging them

SPOKEN DIALOG SYSTEM FOR LEARNING REPAIRS

 Given some initial information about a system, a robot could converse with a human and help to repair it
 Speech understanding is a very hard problem in itself

MACHINE LEARNING BASICS AND TERMINOLOGY

With applications and examples in robotics

LEARNING ASSOCIATIONS

Association Rule: the conditional probability that an event Y will happen given that another event X already has, P(Y|X)
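The rule above can be sketched in a few lines of code; the transactions and item names here are made up purely for illustration.

```python
# Estimate the association rule P(Y | X) from a list of observed event sets:
# the fraction of transactions containing X that also contain Y.

def confidence(transactions, x, y):
    """P(Y|X): among transactions containing x, the share that also contain y."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0          # x never observed: no evidence for the rule
    return sum(1 for t in with_x if y in t) / len(with_x)

# Toy shopping-basket data (illustrative only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "milk", "butter"},
]

# Of the 3 baskets containing "bread", 2 also contain "milk".
p = confidence(transactions, "bread", "milk")
```

The same computation underlies association-rule mining at scale; only the counting strategy changes.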

CLASSIFICATION

 Classification: a model in which an input is assigned to a class based on some data
 Prediction: assuming a future scenario is similar to a past one, using past data to decide what the new scenario will look like
 Pattern Recognition: a method used to make predictions
   Face recognition
   Speech recognition
 Knowledge Extraction: learning a rule from data
 Outlier Detection: finding exceptions to the rules

REGRESSION

 Linear regression is an example
 Both classification and regression are "supervised learning" strategies, where the goal is to find a mapping from input to output
 Example: navigation of an autonomous car
   Training data: actions of human drivers in various situations
   Input: data from sensors (like GPS or video)
   Output: angle to turn the steering wheel
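The supervised input-to-output mapping described above can be sketched with ordinary least squares; the "sensor" data and weights below are synthetic, standing in for logged human-driving data.

```python
import numpy as np

# Supervised learning sketch: fit a linear map from sensor readings to a
# steering angle by least squares. Synthetic, noiseless data for clarity.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))     # 100 training samples, 3 sensor features
true_w = np.array([0.5, -1.2, 2.0])
y = X @ true_w                    # target steering angles

# Solve min_w ||X w - y||^2
w, *_ = np.linalg.lstsq(X, y, rcond=None)

pred = X @ w                      # predicted angles for the training inputs
```

With noiseless data and more samples than features, the recovered weights match the generating weights exactly (up to floating-point error).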

UNSUPERVISED LEARNING

 Only have input; want to find regularities in the input
 Density Estimation: finding patterns in the input space
   Clustering: finding groupings in the input
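Clustering, the concrete case named above, can be sketched with a tiny k-means loop; the data and cluster count are illustrative.

```python
import numpy as np

# Unsupervised learning sketch: k-means finds groupings in unlabeled input.

def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Two well-separated blobs of identical points.
X = np.vstack([np.zeros((10, 2)), 10 + np.zeros((10, 2))])
centers, labels = kmeans(X, k=2)
```

No labels were provided; the algorithm recovers the two groups from the input's structure alone.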

REINFORCEMENT LEARNING

 Policy: generating correct actions to reach the goal
 Learn from past good policies
 Example: a robot navigating an unknown environment in search of a goal
   Some data may be missing
   There may be multiple agents in the system
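The goal-seeking example above can be sketched with tabular Q-learning on a one-dimensional corridor; the environment, reward, and hyperparameters are all illustrative, not from any particular robot.

```python
import random

# Reinforcement learning sketch: the agent starts at cell 0 of a corridor
# and learns a policy that reaches the goal cell at the far end.

N = 5                      # cells 0..4, goal at cell 4
ACTIONS = (-1, +1)         # step left or right

def step(s, a):
    s2 = min(max(s + a, 0), N - 1)
    reward = 1.0 if s2 == N - 1 else 0.0
    return s2, reward, s2 == N - 1

Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
random.seed(0)
alpha, gamma, eps = 0.5, 0.9, 0.2   # learning rate, discount, exploration

for _ in range(500):                # episodes
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection.
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned greedy policy for each non-goal state.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N - 1)}
```

After training, the greedy policy steps toward the goal from every cell, illustrating how good past behavior is distilled into a policy.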

POSSIBLE APPLICATIONS

 Exploring a world
 Learning object properties
 Learning to interact with the world and with objects
 Optimizing actions
 Recognizing states in the world model
 Monitoring actions to ensure correctness
 Recognizing and repairing errors
 Planning
 Learning action rules
 Deciding actions based on tasks

WHAT WE EXPECT ROBOTS TO DO

 React promptly and correctly to changes in the environment or internal state
 Work in situations where information about the environment is imperfect or incomplete
 Learn through experience and human guidance
 Respond quickly to human interaction
 Unfortunately, these are very high expectations that don't always align well with current machine learning techniques

DIFFERENCES BETWEEN ROBOTICS AND OTHER TYPES OF MACHINE LEARNING

Other ML Applications

 Planning can frequently be done offline
 Actions are usually deterministic
 No major time constraints

Robotics

 Often requires simultaneous planning and execution (online)
 Actions can be nondeterministic depending on the data (or lack thereof)
 Real-time performance is often required

AN EXAMPLE: DARPA GRAND/URBAN CHALLENGE

THE CHALLENGE

 Run by the Defense Advanced Research Projects Agency (DARPA)
 Goal: build a vehicle capable of traversing unrehearsed off-road terrain
 Started in 2003
 142-mile course through the Mojave Desert
   No one made it through more than 5% of the course in the 2004 race
   In 2005, 195 teams registered, 23 teams raced, and 5 teams finished

THE RULES

 Must traverse a desert course up to 175 miles long in under 10 hours
 Course kept secret until 2 hours before the race
 Must follow speed limits for specific areas of the course to protect infrastructure and ecology
 If a faster vehicle needs to overtake a slower one, the slower one is paused so that vehicles don't have to handle dynamic passing
 Teams were given data on the course 2 hours before the race, so no global path planning was required

A CRASHING DARPA GRAND CHALLENGE VEHICLE

A DARPA GRAND CHALLENGE VEHICLE THAT DID NOT CRASH

 …namely Stanley, the winner of the 2005 challenge

TERRAIN MAPPING AND OBSTACLE DETECTION

 Data from 5 laser scanners mounted on top of the car is used to generate a point cloud of what's in front of the car
 The area in front of the vehicle is represented as a grid, and each cell is classified as:
   Drivable
   Occupied
   Unknown
 Stanley's system finds the probability that ∆h > δ, where ∆h is the observed height of the terrain in a given cell
 If this probability is higher than some threshold α, the system labels the cell as occupied
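The cell-labeling rule above can be sketched as follows; here the probability that ∆h > δ is approximated by the fraction of laser returns in the cell exceeding δ, and the readings and threshold values are illustrative, not Stanley's actual numbers.

```python
# Occupancy-labeling sketch: a grid cell is "occupied" when the estimated
# probability that its height difference dh exceeds delta is above alpha.

def classify_cell(height_diffs, delta=0.15, alpha=0.5):
    """Return 'occupied', 'drivable', or 'unknown' for one grid cell."""
    if not height_diffs:
        return "unknown"            # no laser returns landed in this cell
    p_obstacle = sum(1 for dh in height_diffs if dh > delta) / len(height_diffs)
    return "occupied" if p_obstacle > alpha else "drivable"

flat = [0.01, 0.02, 0.00, 0.03]     # near-flat ground: below delta
rock = [0.30, 0.28, 0.35, 0.02]     # mostly tall returns: above delta
```

Choosing δ and α well is exactly the tuning problem the next slide addresses.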

(CONT.)

 A discriminative learning algorithm is used to tune the parameters
 Data is collected as a human driver drives through a mapped terrain avoiding obstacles (supervised learning)
 The algorithm uses coordinate ascent to determine δ and α
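Coordinate ascent, named above, can be sketched like this: hold one parameter fixed, sweep the other over candidate values, keep the best, and repeat until nothing improves. The score function below is a made-up stand-in for agreement with the human driver's labels.

```python
# Coordinate ascent sketch for tuning (delta, alpha).

def coordinate_ascent(score, deltas, alphas, iters=10):
    best_d, best_a = deltas[0], alphas[0]
    best = score(best_d, best_a)
    for _ in range(iters):
        improved = False
        for d in deltas:                     # sweep delta with alpha fixed
            if score(d, best_a) > best:
                best, best_d, improved = score(d, best_a), d, True
        for a in alphas:                     # sweep alpha with delta fixed
            if score(best_d, a) > best:
                best, best_a, improved = score(best_d, a), a, True
        if not improved:
            break                            # no coordinate helps: done
    return best_d, best_a

# Toy score peaked at delta=0.15, alpha=0.6 (purely illustrative).
score = lambda d, a: -((d - 0.15) ** 2 + (a - 0.6) ** 2)
d, a = coordinate_ascent(score,
                         deltas=[0.05, 0.10, 0.15, 0.20],
                         alphas=[0.4, 0.5, 0.6, 0.7])
```

Coordinate ascent is attractive here because each one-dimensional sweep is cheap to evaluate against the logged driving data.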

COMPUTER VISION ASPECT

 Lasers alone only make it safe for the car to drive under 25 mph
   It needs to go faster to satisfy the time constraint
 A color camera is used for long-range obstacle detection
 It is still the same classification problem
 Now there are more factors to consider: lighting, material, dust on the lens
 Stanley takes an adaptive approach

VISION ALGORITHM

1. Take out the sky.
2. Map a quadrilateral onto the camera video corresponding to the laser sensor boundaries.
3. As long as this region is deemed drivable, use the pixels in the quadrilateral as a training set for the concept of drivable surface.
4. Maintain Gaussians that model the color of drivable terrain.
5. Adapt by adjusting the previous Gaussians and/or throwing them out and adding new ones.
    Adjustment allows slow adaptation to changing lighting conditions.
    Replacement allows for a rapid change in the color of the road.
6. Label regions as drivable if their pixel values are near one or more of the Gaussians and they are connected to the laser quadrilateral.

ROAD BOUNDARIES

 The best way to avoid obstacles on a desert road is to find the road boundaries and drive down the middle
 Uses low-pass one-dimensional Kalman filters to determine the road boundary on both sides of the vehicle
 Small obstacles don't really affect the boundary found; large obstacles, over time, have a stronger effect
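A one-dimensional Kalman filter of the kind described can be sketched as follows; the noise values and measurements are illustrative. Small process noise relative to measurement noise gives the low-pass behavior: a single outlier barely moves the estimate, while persistent evidence shifts it.

```python
# 1-D Kalman filter tracking the lateral position of one road boundary.

def kalman_1d(measurements, q=0.01, r=1.0, x0=0.0, p0=1.0):
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p += q                      # predict: variance grows by process noise
        k = p / (p + r)             # Kalman gain
        x += k * (z - x)            # update: move a fraction toward z
        p *= (1 - k)
        estimates.append(x)
    return estimates

# Boundary steady at 2.0 m, with one outlier from a small obstacle.
zs = [2.0] * 10 + [5.0] + [2.0] * 10
est = kalman_1d(zs, x0=2.0)
```

The outlier nudges the estimate only slightly, and it decays back toward 2.0; a large obstacle producing many such readings would shift the boundary for good, matching the slide's description.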

SLOPE AND RUGGEDNESS

 If the terrain becomes too rugged or steep, the vehicle must slow down to maintain control
 Slope is found from the vehicle's pitch estimate
 Ruggedness is determined by taking data from the vehicle's z-axis accelerometer, with gravity and vehicle vibration filtered out

PATH PLANNING

 No global planning necessary
 The coordinate system used is the base trajectory plus a lateral offset
 The base trajectory is a smoothed version of the driving corridor on the map given to contestants before the race

PATH SMOOTHING

 The base trajectory is computed in 4 steps:
1. Points are added to the map in proportion to local curvature.
2. Least-squares optimization is used to adjust trajectories for smoothing.
3. Cubic spline interpolation is used to find a path that can be resampled efficiently.
4. The speed limit is calculated.
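Step 2 can be sketched as a least-squares trade-off between staying close to the original corridor points and reducing curvature, solved here by simple gradient descent. The weights and the zigzag corridor are illustrative, and the spline resampling of step 3 is omitted.

```python
# Least-squares path smoothing sketch: minimize
#   w_fid * (distance from original points) + w_smooth * (second differences)
# with the endpoints held fixed.

def smooth(points, w_fid=0.2, w_smooth=0.8, iters=500, lr=0.1):
    path = [list(p) for p in points]
    for _ in range(iters):
        for i in range(1, len(path) - 1):        # endpoints stay fixed
            for d in range(2):                   # x and y coordinates
                grad = (w_fid * (path[i][d] - points[i][d])
                        + w_smooth * (2 * path[i][d]
                                      - path[i - 1][d] - path[i + 1][d]))
                path[i][d] -= lr * grad
    return path

# A zigzag corridor that a vehicle could not drive at speed.
corridor = [(0.0, 0.0), (1.0, 2.0), (2.0, -2.0), (3.0, 2.0), (4.0, 0.0)]
smoothed = smooth(corridor)
```

The smoothed path keeps the corridor's endpoints but pulls the interior points toward a gentler curve, which is what makes a higher speed limit (step 4) feasible.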

ONLINE PATH PLANNING

 Determines the actual trajectory of the vehicle during the race
 A search algorithm minimizes a linear combination of continuous cost functions, subject to dynamic and kinematic constraints:
   Maximum lateral acceleration
   Maximum steering angle
   Maximum steering rate
   Maximum acceleration
 The cost functions penalize hitting obstacles, leaving the corridor, and leaving the center of the road
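The search described above can be sketched as scoring candidate lateral offsets by a linear combination of cost terms and picking the cheapest feasible one. The candidates, weights, and obstacle layout are illustrative, and the steering/acceleration constraints are reduced to a single corridor bound.

```python
# Online-planning sketch: cost = w_obs * obstacle proximity
#                               + w_center * distance from road center,
# with offsets outside the corridor treated as infeasible.

def trajectory_cost(offset, obstacles, corridor_half_width=3.0,
                    w_obs=10.0, w_center=1.0):
    if abs(offset) > corridor_half_width:
        return float("inf")                 # leaves the corridor: infeasible
    # Cost ramps up within 1 m of each obstacle's lateral position.
    obstacle_cost = sum(max(0.0, 1.0 - abs(offset - o)) for o in obstacles)
    return w_obs * obstacle_cost + w_center * abs(offset)

def plan(obstacles, candidates):
    return min(candidates, key=lambda off: trajectory_cost(off, obstacles))

candidates = [-2.0, -1.0, 0.0, 1.0, 2.0]
best = plan(obstacles=[0.2], candidates=candidates)  # obstacle near center
```

With an obstacle just right of center, the cheapest candidate swerves slightly left: far enough to clear the obstacle, but no farther from the centerline than necessary.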

MULTI-AGENT SYSTEMS

RECURSIVE MODELING METHOD (RMM)

 Agents model the belief states of other agents
 Bayesian methods are implemented
 Useful in homogeneous non-communicating multi-agent systems (MAS)
 The recursion has to be cut off at some point (we don't want a situation where agent A thinks that agent B thinks that agent A thinks that…)
 Agents can affect other agents by affecting the environment to produce a desired reaction

HETEROGENEOUS NON-COMMUNICATING MAS

 Both competitive and cooperative learning are possible
   Competitive learning is more difficult because agents may end up in an "arms race"
 Credit-assignment problem: can't tell whether an agent benefited because its actions were good or because its opponent's actions were bad
 Experts and observers have proven useful
 Different agents may be given different roles to reach the goal
   Supervised learning can be used to "teach" each agent how to do its part

COMMUNICATION

 Allowing agents to communicate can lead to deeper levels of planning, since agents know (or think they know) the beliefs of others
 One agent could "train" another to follow its actions using reinforcement learning
 Negotiations and commitment
 Autonomous robots could estimate their position in an environment by querying other robots for their believed positions and making a guess based on that (Markov localization, SLAM)

NETFLIX CHALLENGE (if time permits)

REFERENCES

 Alpaydin, E. Introduction to Machine Learning. Cambridge, Mass.: MIT Press, 2004.
 Kreuziger, J. "Application of Machine Learning to Robotics - An Analysis." In Proceedings of the Second International Conference on Automation, Robotics, and Computer Vision (ICARCV '92), 1992.
 Mitchell et al. "Machine Learning." Annu. Rev. Comput. Sci. 4:417-433, 1990.
 Stone, P. and Veloso, M. "Multiagent Systems: A Survey from a Machine Learning Perspective." Autonomous Robots 8, 345-383, 2000.
 Thrun et al. "Stanley: The Robot that Won the DARPA Grand Challenge." Journal of Field Robotics 23(9), 661-692, 2006.