Agent2

WAIR: A Web Agent That Learns
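[Figure: WAIR as a reinforcement learning loop. Labels: State_i and Reward_i (inputs to WAIR); Learn (modify profile); Action_i (document filtering using the user profile); filtered documents delivered to the user; Reward_i+1 (relevance feedback from the user).]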
Zhang, B.-T. and Seo, Y.-W., Applied Artificial Intelligence, 15(7):665-685, 2001
© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/
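The loop in the WAIR figure (filter documents with the profile, collect relevance feedback as a reward, modify the profile) can be illustrated with a minimal Python sketch. The bag-of-words profile, the dot-product score, and the additive update rule with a `rate` parameter are assumptions chosen for illustration; they are not taken from the Zhang and Seo paper, and all function names are hypothetical.

```python
from collections import Counter
import math

def term_vector(text):
    """Bag-of-words term frequencies, normalised to unit length."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()}

def score(profile, doc_vec):
    """Dot product between the user profile and a document vector."""
    return sum(profile.get(t, 0.0) * w for t, w in doc_vec.items())

def filter_documents(profile, docs, k=2):
    """Action: return the k documents that best match the profile."""
    ranked = sorted(docs, key=lambda d: score(profile, term_vector(d)),
                    reverse=True)
    return ranked[:k]

def learn(profile, doc, reward, rate=0.5):
    """Learn (modify profile): shift term weights toward documents with
    positive feedback and away from those with negative feedback.
    This additive rule is an illustrative assumption."""
    updated = dict(profile)
    for t, w in term_vector(doc).items():
        updated[t] = updated.get(t, 0.0) + rate * reward * w
    return updated

docs = [
    "reinforcement learning for robot control",
    "stock market news today",
    "learning agents on the web",
]

profile = {"learning": 1.0}                    # initial user profile
shown = filter_documents(profile, docs, k=2)   # filtered documents

# Hypothetical relevance feedback from the user (the reward signal).
feedback = {docs[0]: -1.0, docs[2]: +1.0}
for doc in shown:
    profile = learn(profile, doc, feedback[doc])
```

After the update, terms from the document the user liked carry positive weight and terms from the disliked one carry negative weight, so later filtering favours similar web documents.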
Reinforcement Learning Agents



Generalized model learning for reinforcement learning on a humanoid robot:
http://www.youtube.com/watch?v=mRpX9DFCdwI
Autonomous spider learns to walk forward by reinforcement learning:
http://www.youtube.com/watch?v=RZf8fR1SmNY&feature=related
Reinforcement learning for a robotic soccer goalkeeper:
http://www.youtube.com/watch?v=CIF2SBVYJ0&feature=related
(Reference: Thore Graepel, MS Research Cambridge)
 Whole-audience Control of a Racing Game
http://www.youtube.com/watch?v=NS_L3Yyv2RI
Drivatar – Racing Game Agents
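Machine learning is used to model the driving patterns of players in Forza 2, a racing game for the MS Xbox 360; the agent then imitates that driving probabilistically to achieve human-level play.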
Microsoft Research in Cambridge, UK
 The Future of Racing Games
http://www.youtube.com/watch?v=TaUyzlKKu-E
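Driving patterns during game play
Position on the road
Driving line
Speed by course
Brake / accelerator
Segment every route
Learn the optimal path chosen by the gamer (Imitation Approach)
Probabilistic modeling of the driver's driving patterns
Because the model is probabilistic, it can generate unlimited driving variations at the same skill level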
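The idea on this slide (segment the route, model the player's driving per segment probabilistically, then sample new driving at the same skill level) can be sketched as follows. This is only a toy illustration: the independent Gaussian per segment, the `(lateral_offset, speed)` features, and the sample data are assumptions, not the actual Drivatar model.

```python
import random
import statistics

def fit_segment_models(laps):
    """Fit a mean and standard deviation per track segment.

    laps: list of recorded laps; each lap is a list with one
    (lateral_offset_m, speed_kmh) pair per track segment.
    """
    n_segments = len(laps[0])
    models = []
    for s in range(n_segments):
        offsets = [lap[s][0] for lap in laps]
        speeds = [lap[s][1] for lap in laps]
        models.append({
            "offset": (statistics.mean(offsets), statistics.stdev(offsets)),
            "speed": (statistics.mean(speeds), statistics.stdev(speeds)),
        })
    return models

def sample_lap(models, rng):
    """Draw one new lap from the fitted per-segment distributions."""
    return [(rng.gauss(*m["offset"]), rng.gauss(*m["speed"]))
            for m in models]

# Hypothetical recordings of one player: three laps, three segments each.
recorded_laps = [
    [(0.5, 120.0), (-1.0, 80.0), (0.0, 150.0)],
    [(0.7, 124.0), (-1.2, 78.0), (0.2, 148.0)],
    [(0.3, 122.0), (-0.8, 82.0), (-0.2, 152.0)],
]

models = fit_segment_models(recorded_laps)
rng = random.Random(0)
lap_a = sample_lap(models, rng)
lap_b = sample_lap(models, rng)
```

Each call to `sample_lap` yields a different lap that stays close to the player's recorded style, which is the sense in which a probabilistic model can generate many driving variations at the same level.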
(Reference: Willow Garage)
Personal Robots at Home and Office
 PR2 Robot Cleans Up
http://www.youtube.com/watch?v=gYqfa-YtvW4&feature=related
 PR2 Robot Plays Pool
http://www.youtube.com/watch?v=mgHUNfqIhAc&feature=related
 PR2 Robot of Willow Garage
Soccer Robots

RoboCup 2000:
Beyond Human: Robot Soccer

Humanoid Robot Soccer 2007:
RoboCup 2007 Final, Humanoid League

RoboCup 2008:
CMDragons RoboCup 2008 SSL Highlights

KondoCup Robot Soccer 2008:
12th KondoCup Robot Soccer: Cool Moves!
DARPA Grand Challenge
Autonomous Driving Robots
[Sebastian Thrun, Stanley & Junior, Stanford Univ.]
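The Stanford team applied machine learning techniques to the self-driving technology of its unmanned vehicles, winning the 2005 Grand Challenge (prize: 2 million dollars) and finishing second in the 2007 Urban Challenge.
2005 mission: cover 175 miles of desert terrain within 10 hours by autonomous driving alone
2007 mission: cover 96 km in an urban environment within 6 hours by autonomous driving alone
Terrain perception and route planning
Video: DARPA Grand Challenge: Final Part 1
Terrain perception using lasers
Learning human driving patterns
[Figure labels: human, probabilistic modeling, automobile]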
(Reference: Andrew Ng, Stanford Univ.)
Stanford Autonomous Helicopter - Airshow #2:
http://www.youtube.com/watch?v=VCdxqn0fcnE
Autonomous Helicopter Control
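Using reinforcement learning, the team succeeded in controlling an RC helicopter automatically and in performing a variety of highly difficult maneuvers.
RC helicopter equipped with acceleration and velocity sensors
Highly difficult flight through automatic control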
Automatic control through reinforcement learning (RL)
