Topic Learning in Text &
Conversational Speech
Constantinos Boulis
Introduction

Definition of Topic Learning

Supervised : Learn a mapping of data to topics
Unsupervised : Discover new topics

Applications of Topic Learning

Crucial step for information access
Google News, call-center automation

Challenges of Topic Learning

A learning problem of very high dimensionality
An Example
B.16: especially, you know, smaller areas,
A.17: Uh-huh.
B.18: smaller towns.
A.19: Uh-huh. Yeah. Probably the hardest thing in, in my family, uh, my
grandmother, she had to be put in a nursing home and, um, she had used the
walker for, for quite some time, probably about six to nine months. And, um,
she had a fall and, uh, finally, uh, she had Parkinson's disease,
B.20: Oh.
A.21: and it got so much that she could not take care of her house.
B.22: Right.
A.23: Then she lived in an apartment and, uh, that was even harder --
B.24: Uh-huh.
Impact



Interdisciplinary research on Natural Language
Processing, Data Mining and Speech Recognition
Core technology can be leveraged in fields such as
Bioinformatics
All these technologies come together on the 311 line
(TIME, Feb. 7th 2005)
Dimensions of Topic Learning
[Figure: past work and my work placed along two axes, Supervision (less to more) and Structured Input (more to less)]
Dissertation Contributions

General Topic Learning Contributions
(applicable to text, speech, gene expression etc)



Combining Multiple Clustering Partitions (*)
Feature Construction (*)
Topic Learning in Conversational Speech




Speech-to-text errors
Role of disfluencies
Separating content & style
Role of prominence (*)
Combining Multiple Clustering
Partitions


Classifier combination has been studied extensively, but
there is little work on combining clustering systems
Fundamental problem: Missing correspondence
between clusters of different systems
{1,2,2,1,3,1,2,3,3,3,2,1}
{3,1,1,3,2,3,1,2,1,2,3,2}

[Figure: clusters 1, 2, 3 of the first partition matched to clusters 3, 1, 2 of the second]
Contribution : New algorithms that estimate the
correspondence of clusters then combine them using
linear programming techniques and singular value
decomposition
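
To make the missing-correspondence problem concrete, here is a minimal sketch that aligns the two example partitions. It is not the linear programming or SVD method from the dissertation: it swaps in an exhaustive search over label permutations, which is only practical for a handful of clusters, and it assumes both partitions use the same number of clusters. The function name `align_clusters` is hypothetical.

```python
from collections import Counter
from itertools import permutations

def align_clusters(reference, other):
    """Map the labels of `other` onto the labels of `reference` so that the
    two partitions agree on as many items as possible (brute force)."""
    ref_labels = sorted(set(reference))
    oth_labels = sorted(set(other))
    # Co-occurrence counts: how often label o in `other` meets label r in `reference`.
    overlap = Counter(zip(other, reference))
    best_map, best_score = None, -1
    # Exhaustive search over one-to-one label assignments (factorial cost).
    for perm in permutations(ref_labels, len(oth_labels)):
        score = sum(overlap[(o, r)] for o, r in zip(oth_labels, perm))
        if score > best_score:
            best_map, best_score = dict(zip(oth_labels, perm)), score
    return best_map, best_score

p1 = [1, 2, 2, 1, 3, 1, 2, 3, 3, 3, 2, 1]
p2 = [3, 1, 1, 3, 2, 3, 1, 2, 1, 2, 3, 2]

mapping, agreed = align_clusters(p1, p2)
relabeled = [mapping[label] for label in p2]
print(mapping, agreed)  # {1: 2, 2: 3, 3: 1} 9
```

After relabeling, the second partition agrees with the first on 9 of the 12 items; the remaining 3 are genuine disagreements that a combination scheme still has to resolve.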
Feature Construction



A lot of work on supervised topic learning methods
but not much on constructing feature spaces
Bag-of-words representation too coarse but hard to
improve
Contribution : Add only those word pairs that
contribute sufficiently more information than their
constituent words, i.e. the whole is much more than
the sum of its parts


“second hand” >> “second” + “hand”
“big brother” >> “big” + “brother”
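
One way to picture "the whole is more than the sum of its parts" is to compare how much a word pair tells us about the topic with how much its best constituent word already tells us. The sketch below uses information gain for both and keeps the difference as a score. This scoring rule, the helper names and the toy documents are assumptions made for illustration; the criterion used in the dissertation may differ.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (bits) of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def info_gain(docs, labels, has_feature):
    """Information gain (bits) of a binary feature with respect to the topic label."""
    base = entropy(list(Counter(labels).values()))
    cond = 0.0
    for present in (True, False):
        subset = [y for d, y in zip(docs, labels) if has_feature(d) == present]
        if subset:
            cond += len(subset) / len(docs) * entropy(list(Counter(subset).values()))
    return base - cond

def has_word(w):
    return lambda d: w in d.split()

def has_bigram(w1, w2):
    return lambda d: (w1, w2) in zip(d.split(), d.split()[1:])

def bigram_gain(docs, labels, w1, w2):
    """Extra information the pair carries beyond its better constituent word."""
    pair = info_gain(docs, labels, has_bigram(w1, w2))
    parts = max(info_gain(docs, labels, has_word(w1)),
                info_gain(docs, labels, has_word(w2)))
    return pair - parts

# Toy corpus (made up): "second" and "hand" occur everywhere, the pair only in one topic.
docs = ["bought a second hand car", "second hand books are cheap", "a second hand sofa",
        "wait a second and give me a hand", "raise your hand a second time",
        "the second half hurt his hand"]
labels = ["shopping"] * 3 + ["other"] * 3

score = bigram_gain(docs, labels, "second", "hand")
print(score)
```

A pair would be added to the feature space only when this score exceeds a chosen threshold, so pairs that merely repeat what their words already convey are left out.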
Role of Prominence

Speech is a richer medium than text; it is not only
what we say but also how we say it.

Prominence is the emphasis we put on words

Contribution : The first study to show that
prominence can be combined with lexical saliency
measures to yield improved feature subsets for topic
learning
Summary



Topic learning is a key step for information
access (retrieval, extraction)
Key contribution : Advancing language
processing for spoken documents
Unique elements of this work: Combining
speech, language and data mining technology
Journal Publications Resulting from PhD



Deng, L., Wang, Y., Wang, K., Acero, A., Hon, H.-W., Droppo, J., Boulis, C.,
Mahajan, M., and Huang, X.D, February-March 2004, “Speech and Language
Processing for Multimodal Human-Computer Interaction”, Journal of VLSI
Signal Processing Systems, 36(2-3):161-187.
Boulis, C., Ostendorf, M., Riskin, E., Otterson, S. November 2002. “Graceful
Degradation of Speech Recognition Performance Over Packet-Erasure Networks”,
IEEE Transactions on Speech and Audio Processing, 10(8):580-590.
Deng, L., Wang, K., Acero, A., Hon, H.-W., Droppo, J., Boulis, C., Wang, Y.-Y.,
Jakoby, D., Mahajan, M., Chelba C., and Huang, X.D. November 2002.
“Distributed Speech Processing in MiPad's Multimodal User Interface”, IEEE
Transactions on Speech and Audio Processing, 10(8):605-619.
Conference Publications Resulting from PhD





Boulis, C., Kahn, J., Ostendorf, M., July 2005. “The Role of Disfluencies in Topic
Classification of Natural Human-Human Conversations”, Proc. of the Workshop on
Spoken Language Understanding, in press.
Boulis, C., Ostendorf, M., June 2005. “A Quantitative Analysis of Lexical Differences
Between Genders in Telephone Conversations”, Proc. of the 43rd Annual Meeting of the
Association for Computational Linguistics (ACL), in press.
Boulis, C., Ostendorf, M. April 2005. “Text Classification by Augmenting the Bag-of-Words
Representation with Redundancy-Compensated Bigrams”, Proc. of the International
Workshop on Feature Selection in Data Mining, pp 9-16.
Boulis, C., Ostendorf, M. September 2004. “Combining Multiple Clustering Systems”. Proc.
of the 8th European Conference on Principles and Practice of Knowledge
Discovery in Databases (PKDD), LNAI 3202, pp. 63-74.
Boulis, C. May 2004. “Speaker Recognition with Mixtures of Gaussians with Sparse
Regression Matrices”, Proc. of the Student Research Workshop of Human Language
Technology/North American Chapter of the Association for Computational
Linguistics (HLT/NAACL), companion volume, pp. 55-60.

Riskin, E., Boulis, C., Otterson, S., Ostendorf, M. September 2001. “Graceful Degradation of
Speech Recognition Performance Over Lossy Packet Networks”. Proc. of the 7th European
Conference on Speech Communication and Technology (Eurospeech 2001), pp.
2715-2719.
Future Publications & Awards Resulting
from PhD
Manuscripts under review


Boulis, C., Ostendorf, M., “Combining Multiple Clustering Partitions”, Journal of
Machine Learning.
Boulis, C., Ostendorf, M., “Using Symbolic Prominence to Help Design Feature
Subsets for Topic Classification and Clustering of Natural Human-Human
Conversations”, Interspeech-05.
Manuscripts under preparation

Boulis, C., Ostendorf, M., “Unsupervised Estimation of Word Confusability and its
Use in Topic Classification of Human-Human Conversations”
Awards

Best Student Paper Award, PKDD 2004. 581 total submissions, 17%
acceptance rate
Backup Slides
The following slides are not used
in the main presentation
Speech-to-Text Errors

Output of STT systems contains errors (~20%)

Some words have higher error rates than others

Contribution : Designed an algorithm that adaptively
clusters confusable words, modifying the vocabulary
provided for topic learning tasks

Provided gains in classification performance of 25% relative
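
The idea of modifying the vocabulary can be illustrated with a small sketch: words that the recognizer tends to confuse are merged into one class, and transcripts are rewritten with the class representative before topic learning. This is not the adaptive algorithm from the dissertation. The fixed threshold, the union-find merging, the function names and the confusion rates below are all hypothetical.

```python
def cluster_confusable(confusions, threshold):
    """Merge words whose confusion rate meets `threshold` into shared classes.

    `confusions` maps (word_a, word_b) to an estimated confusion rate
    (hypothetical input). Returns a dict: word -> class representative.
    """
    parent = {}

    def find(w):
        parent.setdefault(w, w)
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    for (a, b), rate in confusions.items():
        ra, rb = find(a), find(b)
        if rate >= threshold and ra != rb:
            # Keep the alphabetically smaller root so the result is deterministic.
            parent[max(ra, rb)] = min(ra, rb)
    return {w: find(w) for w in list(parent)}

def remap(tokens, word_class):
    """Replace each token by its class representative, if it has one."""
    return [word_class.get(t, t) for t in tokens]

# Made-up confusion rates, for illustration only.
confusions = {("their", "there"): 0.40,
              ("there", "they're"): 0.35,
              ("nursing", "nurse"): 0.05}
classes = cluster_confusable(confusions, threshold=0.3)
tokens = remap("they're in a nursing home over there".split(), classes)
print(tokens)
```

With these inputs the three homophones collapse into one vocabulary entry, while "nursing" and "nurse" stay separate because their assumed confusion rate is below the threshold.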
Role of Disfluencies

Disfluencies are very common in conversational
speech
That’s all you need you only need one boxcar (repetition)
So it’ll take um so you want to do what (repair)

Contribution : Demonstrate that removing
disfluencies does not affect topic classification
performance with the bag-of-words model, but does
affect more complex representations
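
As a rough illustration of what removing disfluencies involves, the sketch below drops filled pauses and immediate word repetitions from a transcript line. It is a simplified stand-in, not the procedure used in the study: the filler list and the function name are assumptions, and repairs such as the second example above need more than this rule.

```python
import re

FILLED_PAUSES = {"um", "uh"}  # assumed filler inventory

def remove_disfluencies(utterance):
    """Drop filled pauses and immediate word repetitions; return the token list."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    cleaned = []
    for tok in tokens:
        if tok in FILLED_PAUSES:
            continue  # filled pause
        if cleaned and cleaned[-1] == tok:
            continue  # immediate repetition, e.g. "in, in"
        cleaned.append(tok)
    return cleaned

cleaned = remove_disfluencies("the hardest thing in, in my family, uh, my grandmother")
print(cleaned)
```

Because fillers are skipped before the repetition check, a sequence like "the, uh, the" also collapses to a single "the".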
Separating Content & Style

When two people talk they bring their idiosyncrasies into the
discussion. Are there idiosyncrasies at the gender level?

Can this affect topic classification?

Contribution : The first quantitative study to show that there
are lexical differences between genders in telephone
conversations


Almost 100% accuracy in detecting the gender of a speaker based
on what he/she said
The gender of the speaker on one side can influence lexical
patterns on the other side