POSTAG - Intelligent Software Lab.


Human Language Technology
Gary Geunbae Lee
Intelligent Software Lab. POSTECH
March,
Feb,
2002
2001
contents
- What is HLT? Definition, history, applications
- HLT workshop case study: ACL/SIGIR/HLT conferences
- Towards technology synergy: 21C Frontier project
Gary G. Lee, Postech
Goals of the HLT
Computers would be a lot more useful if they could handle our
email, do our library research, talk to us …
But they are fazed by natural human language.
How can we make computers have abilities to handle human
language? (Or help them learn it as kids do?)
A few applications of HLT
- Spelling correction, grammar checking …
- Better search engines
- Information extraction, gisting
- Psychotherapy; Harlequin romances; etc.
- New interfaces:
  - Speech recognition (and text-to-speech)
  - Dialogue systems (USS Enterprise onboard computer)
  - Machine translation; speech translation (the Babel tower??)
- Trans-lingual summarization, detection, extraction …
Levels of Language
- Phonetics/phonology/morphology: What words (or subwords) are we dealing with?
- Syntax: What phrases are we dealing with? Which words modify one another?
- Semantics: What’s the literal meaning?
- Pragmatics: What should you conclude from the fact that I said something? How should you react?
What’s hard: ambiguities, ambiguities, all different levels of ambiguities
John stopped at the donut store on his way home from work. He
thought a coffee was good every few hours. But it turned out to be
too expensive there. [from J. Eisner]
- donut: To get a donut (doughnut; spare tire) for his car?
- Donut store: store where donuts shop? or is run by donuts? or looks
like a big donut? or made of donut?
- From work: Well, actually, he stopped there from hunger and
exhaustion, not just from work.
- Every few hours: That’s how often he thought it? Or that’s for coffee?
- it: the particular coffee that was good every few hours? the donut store? the situation?
- Too expensive: too expensive for what? what are we supposed to
conclude about what John did?
Ubiquitous computing
- Ubiquitous computing
- Pervasive computing
- Third paradigm computing
- Calm technology
- Computing everywhere
- Invisible computing
- "I, Robot"-style interface: human language + hologram??
Intelligent service robot
[Figure labels: remote speech input; reverberation; robot noise; environmental noise]
Telematics – Eye busy and hand busy
[Figure labels: GPS; voice portal for email, VAD, and Internet contents; telematics device; PDA; car controller; CDMA; information center]
Smart Home
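[Figure speech bubbles: "Show me the channel that has Dae Jang Geum on." / "Yes, shall I record it?"]
- Woman: What is that popular drama with Lee Young-ae in it these days?
- DTV: It is Dae Jang Geum, currently airing on MBC.
- Woman: Where is the Dae Jang Geum rerun on?
- DTV: It is not on right now; it is scheduled to air on channel 36 at 2 p.m.
- Woman: Then record it for me.
- DTV: Yes, understood.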
ASR community (speech/ signal processing)
[from McTear]
- Research projects
  - Communicator, TRIPS, COMIC, DIPPER, TRINDI, GALAXY, SMARTKOM, Verbmobil, DUMAS, FASiL, EARS
- Platforms for development and research
  - CSLU, JASPIS, TRINDIKIT, SUEDE, WITAS, SpeechBuilder, …
- Conferences and workshops
  - ICSLP, Eurospeech, ICASSP, SigDial, …
- Journals
  - Computer Speech and Language, Speech Communication, IJHCS, IEEE Trans. SAP (Speech and Audio Processing)
NLP community (AI-NLP, Ling – CL)
- Research projects
  - SAM/PAM, Penn Treebank, TAG, GATE, MUC, TIPSTER, TDT, TIDES, etc.
- Platforms for development and research
  - Alembic, Alvey, GATE, LingPipe, Collins parser, Jasen, postag/K, … (see NLP software registry)
- Conferences and workshops
  - ACL, EACL, ANLP, COLING, IJCNLP, EMNLP, …
- Journals
  - Computational Linguistics, Natural Language Engineering, ACM TALIP, IJCPOL, Computers and the Humanities, …
IR community (Library science)
- Research projects
  - SMART, Digital Libraries, TREC, NTCIR, etc.
- Platforms for development and research
  - SMART, MG, Lemur, Z-PRIZE, etc.
- Conferences and workshops
  - ACM SIGIR, AIRS, ACM CIKM, JCDL, ASIST, …
- Journals
  - IPM, JASIST, Information Systems, …
Long History of Funding
- Long research history since the 1960s
- Significant research results thanks to constant funding (e.g. DARPA’s 20 years of funding); ready for practical solutions
- Five main desiderata for practical applications:
  - Integration at the proper level of analysis/understanding
  - Combination of appropriate modalities
  - Based on real examples (corpora)
  - Towards multi-lingual applications
  - Thorough evaluation (usability)
contents
- What is HLT? Definition, history
- HLT workshop case study: DARPA HLT examples
- Towards technology synergy: 21C Frontier project
History notes: common technologies
- HMM, SVM, CRF, MEMM, DBN for POS tagging, ASR, parsing, prosody modeling, statistical MT, etc.
- TF-IDF, n-grams, discounting/smoothing for sentence weighting, retrieval models, summarization, question answering, etc.
- Bayesian networks, causal networks, graphical models for information retrieval, topic detection, dialog, task planning, etc.
- Trie indexing, tree indexing, caching for ASR pronunciation modeling, morphological lexicons, IR indexing, TTS G2P, etc.
- Common themes: statistical language modeling, machine learning, and empirical evaluation (glass box/black box)
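As an illustration of the first item (HMMs for POS tagging), here is a minimal sketch of Viterbi decoding over a first-order HMM. The tag set and all probability tables are made-up toy values, not taken from any system mentioned in these slides, and the `floor` probability for unseen words is an assumption standing in for proper smoothing.

```python
import math

# Hypothetical toy model: three tags and hand-picked probabilities.
TAGS = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "can": 0.2, "meat": 0.3},
    "VERB": {"can": 0.6, "barks": 0.4},
}


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable tag sequence for a non-empty word list."""
    # best[i][t] = (log-prob of the best path ending in tag t at word i, previous tag)
    best = [{}]
    for t in tags:
        score = math.log(start_p[t]) + math.log(emit_p[t].get(words[0], floor))
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            emit = math.log(emit_p[t].get(words[i], floor))
            prev, score = max(
                ((p, best[i - 1][p][0] + math.log(trans_p[p][t]) + emit) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = (score, prev)
    # Backtrace from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return list(reversed(path))


print(viterbi(["the", "can", "barks"], TAGS, START_P, TRANS_P, EMIT_P))
```

With these toy tables the ambiguous word "can" is tagged as a noun because it follows a determiner. The same dynamic program is the kind of decoder that HMM-based ASR relies on.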
Written language vs. spoken language?
- Commonalities
  - Human languages
- Differences
  - Punctuation vs. prosodic cues
  - Disfluencies vs. linguistic competency
  - Recognition errors
    - I canned meat at eleven ten then ok
    - I can’t / meet at eleven ten then? / ok
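The recognition-error example above can be quantified with word error rate (WER), a standard ASR metric: the word-level edit distance between reference and hypothesis, divided by the reference length. The sketch below is a minimal version; the lowercasing and the removal of the prosodic "/" and "?" marks from the two strings are assumptions made for the illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
    return dist[-1][-1] / len(ref)


intended = "i can't meet at eleven ten then ok"
recognized = "i canned meat at eleven ten then ok"
print(word_error_rate(intended, recognized))  # 2 substitutions / 8 words = 0.25
```

Only two of eight words are wrong, yet the meaning of the utterance is reversed, which is why WER alone may not reflect how damaging an error is to understanding.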
IT839: new technology for economic growth needs synergy?
- 8 new services
  - WiBro
  - DMB
  - Home Network
  - Telematics
  - RFID applications
  - W-CDMA
  - Terrestrial DTV
  - Internet telephony (VoIP)
- 3 new infrastructures
  - Broadband Convergence Network (BcN)
  - u-sensor network
  - IPv6
- 9 new growth technologies
  - New mobile communication
  - Digital TV broadcasting
  - Home Network
  - IT SoC
  - Next-generation PC
  - Embedded SW
  - Digital contents (DC)
  - Telematics
  - Intelligent service robots
NLP/IR/speech merge: ACL-05 conferences
The Association for Computational Linguistics invites the
submission of papers for its 43rd Annual Meeting hosted jointly with
the North American Chapter of the ACL. Papers are invited on
substantial, original, and unpublished research on all aspects of
computational linguistics, including, but not limited to: pragmatics,
discourse, semantics, syntax, grammars and the lexicon; phonetics,
phonology and morphology; lexical semantics and ontologies; word
segmentation, tagging and chunking; parsing, generation and
summarization; language modeling, spoken language recognition
and understanding; linguistic, psychological and mathematical
models of language; language-oriented information retrieval,
question answering, and information extraction; machine learning for
natural language; corpus-based modeling of language, discourse
and dialogue; multi-lingual processing, machine translation and
translation aids; multi-modal and natural language interfaces and
dialogue systems; applications, tools and resources; and evaluation
of systems.
NLP/IR/speech merge: ACM SIGIR-05 conferences
SIGIR 2005 welcomes contributions related to any aspect of IR, but the major areas of interest are listed below. For each general area, two or more area coordinators will guide the reviewing process.
- Formal Models, Language Models, Fusion/Combination
- Text Representation and Indexing, XML and Metadata
- Performance, Compression, Scalability, Architectures, Mobile Applications
- Web IR, Intranet/Enterprise Search, Citation and Link Analysis, Digital Libraries, Distributed IR
- Cross-language Retrieval, Multilingual Retrieval, Machine Translation for IR
- Video and Image Access, Audio and Speech Retrieval, Music Retrieval
- Text Data Mining and Machine Learning for IR
- Text Categorization, Clustering
- Topic Detection and Tracking, Content-Based Filtering, Collaborative Filtering, Agents
- Summarization, Question Answering, Natural Language Processing for IR, Information Extraction, Lexical Acquisition
- Interactive IR, User Interfaces, Visualization, User Studies, User Models
- Specialized Applications of IR, including Genomic IR, IR in Software Engineering, and IR for Chemical Structures
- Evaluation, Building Test Collections, Experimental Design and Metrics
Speech/NLP/IR merge: recent HLT conference series
- HLT/NAACL2003: Edmonton, Canada
- HLT/NAACL2004: Boston, USA
- HLT/EMNLP2005: Vancouver, Canada

The joint conference provides a unified forum for researchers across a spectrum of disciplines to present recent, high-quality, cutting-edge work, to exchange ideas, and to explore emerging new research directions. The conference especially encourages submissions that discuss synergistic combinations of language technologies (e.g., Speech with Information Retrieval, Machine Translation with Speech, Question Answering with Natural Language Processing, etc.). Particular consideration will be given to papers addressing novel learning tasks and evaluation metrics in speech, natural language processing and information retrieval, including e.g.:
HLT/EMNLP2005 CFP
- learning tasks insufficiently addressed in the past, e.g. collaborative learning, learning in the presence of background knowledge, or finding anomalies in data;
- limits of standard evaluation methods on new tasks;
- novel performance measures incorporating user preferences, competence, or relevance to a given problem;
- learning and optimization algorithms addressing the above, e.g. novel statistical methods or cognitively inspired solutions.

We are interested in papers from academia, government, and industry on all areas of traditional interest to the HLT and SIGDAT communities, as well as aligned fields, including but not limited to:
- Speech processing, including:
  - Speech recognition
  - Speech generation
  - Speech summarization
  - Rich transcription: annotation of speech signals with metalinguistic information, such as speaker identity, attitude, emotion, etc.
  - Speech-based human-computer interfaces
- Text summarization
- Question answering
- Paraphrasing
- Computational analysis of phonology, morphology, prosody, syntax, semantics, pragmatics, discourse, style
- Statistical techniques for language processing, including:
  - Corpus-based language modeling
  - Lexical and knowledge acquisition
HLT/EMNLP2005 CFP (continued)
- Language generation and text planning
- Sentence parsing and discourse analysis
- Multilingual processing, including:
  - Machine translation of speech and text
  - Cross-language information retrieval
  - Multi-lingual speech recognition and language identification
- Evaluation, including:
  - Glass-box evaluation of HLT systems and system components
  - Black-box evaluation of HLT systems in application settings
- Development of language resources, including:
  - Lexicons and ontologies
  - Treebanks, proposition banks, and frame banks
- Understanding of human communication, including:
  - Natural language interfaces
  - Dialogue structure and dialogue systems
  - Message and narrative understanding systems
  - Information extraction from multiple media
- Information retrieval, including:
  - Formal models, clustering and classification
  - Web mining for IR
  - Natural language processing for IR
  - Spoken IR
  - Metadata annotation and XML IR
contents
- What is HLT? Definition, history
- HLT workshop case study: DARPA HLT examples
- Towards technology synergy: 21C Frontier project
Technology cross-over: some examples
- Spoken language understanding needs information extraction technology
- Language modeling (adaptation) for ASR needs information retrieval to expand the corpus for a specific domain
- Statistical MT needs fast Viterbi decoding
- ASR, SMT, and speech error correction use exactly the same HMM modeling process
- Language modeling for ASR needs parsing/structural analysis
- And more and more…
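The second cross-over above (information retrieval for corpus expansion) can be sketched as ranking sentences from a general corpus by TF-IDF cosine similarity to a few in-domain seed words, then keeping the top hits as extra language-model training text. This is only a minimal sketch under assumed conditions: the corpus sentences, the seed words, and the function names are all hypothetical, and a real system would use a proper retrieval engine and tokenizer.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return sparse TF-IDF vectors (dicts) for tokenized docs, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_in_domain(seed, corpus, top_k=2):
    """Return the top_k corpus sentences most similar to the seed words."""
    docs = [sentence.lower().split() for sentence in corpus]
    vectors, idf = tfidf_vectors(docs)
    query = {t: c * idf.get(t, 0.0) for t, c in Counter(seed.lower().split()).items()}
    ranked = sorted(range(len(corpus)), key=lambda i: cosine(query, vectors[i]), reverse=True)
    return [corpus[i] for i in ranked[:top_k]]


# Hypothetical general-purpose corpus and TV-domain seed words.
corpus = [
    "record the drama on channel 36",
    "the stock market fell on monday",
    "when is the drama rerun on tv",
    "the weather is sunny today",
]
print(select_in_domain("drama rerun channel record", corpus))
```

The selected sentences would then be added to the n-gram training data for the target domain; how many to keep and how to weight them against the original data is a tuning decision not covered here.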
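Some scenarios from 21C Frontier project
- Stage 3:
Elderly user: Kkoedori, when is the rerun of Lovers in Paris?
Robot: The rerun of Lovers in Paris is on the SBS Drama channel at 10 a.m. on Monday.
Elderly user: Schedule a recording of it. Is the rerun at the same time next week too?
Robot: Yes, I will schedule it. There is no rerun next week because of the Olympic broadcast.
Elderly user: By the way, Deacon Kang Young-soon said they would visit this Sunday. At what time? (domain switching)
Robot: Your appointment with Deacon Kang Young-soon on Monday is at 1 p.m.
Elderly user: All right. Remind me again one hour before, on that day. And bring me something to drink from the refrigerator. (domain switching)
Robot: Yes, understood. What would you like to drink? (mixed mode conversation)
Elderly user: Some cold water would be good.
Robot: Yes, I will bring you a glass of cold water.
Elderly user: (pointing at the desk in the corner of the living room) And bring me the yellow book on top of it. (multi-modal gesture)
Robot: Yes, I will bring you the yellow book on the desk.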
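Conversational SDS: integrated approach
Elderly user: "그 그래? 어 대장금 시작하믄 알려줘" ("R- really? Uh, let me know when Dae Jang Geum starts")
Speech recognition that reflects conversational phenomena (CSR itself)
- Filler: 어 ("uh")
- Repetition/repair: 그/ 그래? ("r- / really?")
- Pronunciation variation: 시작하믄 (for 시작하면, "when it starts")
Speech recognition result
- Misrecognized: "그래? 대장균 삭히면 알려줘" ("Really? Let me know when the E. coli ferments")
- Corrected: "그래? 대장금 시작하면 알려줘" ("Really? Let me know when Dae Jang Geum starts")
Recognition error correction using domain knowledge (TV guide) and syntactic/semantic/contextual knowledge (post error correction)
Spoken dialog management using the dialog model and domain knowledge (dialog understanding)
Robot: Yes. I will turn on the TV and let you know when Dae Jang Geum starts.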
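The post error correction step above uses domain knowledge (the TV guide) to repair misrecognized words. One very simple way to approximate that idea is fuzzy string matching of each recognized token against a lexicon of program titles. The sketch below uses Python's standard `difflib` for this; the romanized tokens, the two-title lexicon, and the 0.7 cutoff are all assumptions chosen for illustration, and the slides do not say which matching method the actual system uses.

```python
import difflib


def correct_with_lexicon(tokens, lexicon, cutoff=0.7):
    """Replace each token by its closest lexicon entry when the match is close enough."""
    corrected = []
    for token in tokens:
        if token in lexicon:
            corrected.append(token)
            continue
        # get_close_matches returns at most n candidates scoring at least `cutoff`.
        match = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else token)
    return corrected


# Hypothetical TV-guide lexicon (approximate romanizations of program titles).
tv_guide_titles = ["daejanggeum", "pariui yeonin"]

# Approximate romanization of the misrecognized utterance from the slide.
recognized = ["geurae", "daejanggyun", "sakhimyeon", "allyeojwo"]
print(correct_with_lexicon(recognized, tv_guide_titles))
```

In this toy run only the title-like token is replaced. Repairing the verb as well (삭히면 to 시작하면) would need the syntactic, semantic, and contextual knowledge that the slide lists alongside the TV guide.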
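HLT: speech/language synergy
[Diagram, reconstructed from its labels]
- Speech processing (Recognition & Correction)
  - Speech recognition system: HTK-based, restricted but practical speech recognition
  - Speech post-processing: speech recognition error correction
- Language understanding (Understanding)
  - Spoken language understanding: robust to speech recognition errors
- Dialog and question answering
  - Dialog system: dialog modeling, and correction of speech recognition through dialog
  - Question answering system: database query and result response system
- Future integrated information DB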
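From signal to dialog/knowledge
Conversational speech interface for Silver Mate
Example application scenario
Elderly user: Jjanggu, when is Janggeum on?
Robot: The drama Dae Jang Geum is on Mondays and Tuesdays at 9:55 p.m.
Elderly user: Wait, what day is it today?
Robot: Today is Monday.
Elderly user: Really? Let me know when Dae Jang Geum starts.
Robot: Yes, I will turn on the TV and let you know when the drama starts.
Improving speech pre-processing
- Sound source separation and sound source classification
- Compensation for the remote-microphone environment
- Compensation for robot noise and background noise
[Figure: conversational speech interface, with speech bubbles "When is Dae Jang Geum on?" / "It is at 9:55."]
Improving the conversational speech recognizer
- Speech recognition that reflects the conversational phenomena of elderly speakers
- Recognition error correction using contextual information
- Improving the perceived recognition rate using the dialog model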