CHILDES - Brian MacWhinney

Learning from Interaction
Brian MacWhinney
Psychology
Carnegie Mellon/HKIEd
5/22/04
Interaction
1
Today
1. Overview of TalkBank and CHILDES
2. Collaborative Commentary
3. TalkBank and Child Language
4. TalkBank and Education
5. Proposed Research Directions
Part 1: Basic Issue
• Psycholinguistic experiments only control a single time dimension.
• Behavior emerges from the intersection of 7 major time dimensions.
• Different social objects have different patterns of mesh to time frames.
• Video can examine pivotal interactions to understand merging of time frames in the Moment.
Available Methods
• Microanalysis (CA, linguistic, ethology)
• Microgenetic analysis
• Group and treatment comparisons
  • Same consultant, different communities …
• Error analysis
• Diffusion analysis - Hall museum, Whiten chimps
• Longitudinal studies
• Large sample analysis
• Dynamic modeling
The TalkBank Approach
• Construct a web-accessible multimedia database for human communication -- TalkBank
• Maximize direct comparisons between forces/treatments -- Individual Projects
• Create powerful analytic tools
• Construct a community for collaborative commentary -- Change academic values
• Collaborative commentary and the web now become the central focus of research
Transcripts linked to media
In technical terms …
• Standard transcription format that merges 10 common formats (CA, ISL, SALT, DT, AG, MT, Columns, SyncWriter, Phon, Praat)
• XML Schema definition for format translation
• XML verification and roundtrip
• Suite of analytic tools, transcription tools
• Linkage to media and tools for linkage
• Codec standardization
• Streaming media server, locally deployable
• Metadata: OLAC, OAI, ISBN
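The "XML verification and roundtrip" item can be illustrated with a minimal sketch: convert a transcript line to XML, convert it back, and check that nothing changed. This is not the TalkBank converter or its schema. The element names `u` and `w`, the `who` attribute, and the helper functions are assumptions made for illustration, and the input is a simplified CHAT-style main line.

```python
import xml.etree.ElementTree as ET


def chat_to_xml(line: str) -> ET.Element:
    """Turn one simplified CHAT-style main line into a toy XML element."""
    if not line.startswith("*") or ":" not in line:
        raise ValueError(f"not a main line: {line!r}")
    speaker, text = line[1:].split(":", 1)
    utterance = ET.Element("u", {"who": speaker})  # hypothetical element name
    for token in text.split():
        ET.SubElement(utterance, "w").text = token  # hypothetical element name
    return utterance


def xml_to_chat(utterance: ET.Element) -> str:
    """Rebuild the main line from the toy XML element."""
    words = " ".join(w.text or "" for w in utterance.findall("w"))
    return f"*{utterance.get('who')}:\t{words}"


def roundtrips(line: str) -> bool:
    """True if text -> XML -> text reproduces the input exactly."""
    serialized = ET.tostring(chat_to_xml(line), encoding="unicode")
    restored = xml_to_chat(ET.fromstring(serialized))
    return restored == line


if __name__ == "__main__":
    print(roundtrips("*CHI:\tmore cookie ."))   # True
    print(roundtrips("*CHI:\tmore  cookie ."))  # False: doubled space is not preserved
```

A failed roundtrip, as in the second call, flags input that the format does not represent exactly. Catching such lines is the purpose of a verification step.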
The technology is here
• Huge disks, fast web streaming, powerful programs
• Emerging standards
• Collaborative commentary and event linkage are available.
  • WebDiver - Stanford
  • ProjectPad - Northwestern
  • TalkBankViewer, CLAN WebData - CMU
Data Sharing
• The CHILDES model.
• Data sharing is not crucial for established researchers. It is crucial for the field.
• Google, iTube, Wikipedia could not have happened without open data.
• Raw data sets are infinitely rich. No one can be “scooped”.
• Tenured faculty have a responsibility to share data, within IRB guidelines.
• Federal agencies have a responsibility to promote data sharing.
Data sharing and Big Science
• Human Genome Project
  • 3 billion base pairs
• Sloan Digital Sky Survey
  • 100 million stars
• Alzheimer’s Neuroimaging
  • 800 patients over 3 years
• fMRI Data Center
TalkBank Groups, Areas, Topics
• Child Language (CHILDES, PhonBank)
• Conversation Analysis (MOVIN, CA)
• SLA (SLAWeb, FLLOC, LIDES)
• Sociolinguistics (SLX)
• Aphasia (AphasiaBank)
• Classroom Discourse (DIVER, Chicago)
• Linguistic Exploration (DOBES)
• Gesture (FORM)
• Legal (SCOTUS)
The CHILDES Model
• Impressive data-sharing
• Over 2000 published articles based on CHILDES data
• New groups developing in individual languages (Chinese, Japanese, Dutch …)
• New tools: MOR, GRASP, PhonBank, Browser
• Careful treatment of Human Subjects
• Model for NSF CyberInfrastructure
TalkBank features
• Consistent transcription form -- CHAT/CA
• OLAC/IMDI metadata
• Central role of TalkBank XML Schema
• Core Utilities: CLAN, Praat, Phon
• Related Utilities: Transcriber, ELAN, TransAna
• 320 corpora available
• Ethics, data-sharing principles
Online Access
• http://talkbank.org
• http://childes.psy.cmu.edu
• Browsing from CLAN’s WebData
• Downloading of transcripts and media
Interdisciplinarity
• Human Communication is a unified fact
• It is studied by 18 disciplines
• TalkBank provides a common infrastructure to unify all this work
• Commentary will often be discipline-based
• However, TalkBank makes crossing disciplines directly possible
Part 2: Collaborative Commentary
the involvement of a research community in the interpretive annotation of electronic records
Commentary Circles
• Open: Lawrence Lessig on copyright law, Brian MacWhinney on social bases of word learning
• Distributed Groups: MPI Gesture coders, or AphasiaBank, Koschmann coding project
• Classes: Roy Pea teaches grads in Stanford Ed School to code and evaluate teacher practices
• Protected: Emergency medicine procedures
• Coding Reliability: Rollins-Snow Speech Acts
• Educational Standards
One model: Project Pad
PPad Visual Commentary
Student Commentary in Tutorials
Comment Tagging
• Automatic: author, date, media begin-end
• Author self-characterized metadata (role, faction, position, credentials)
• Commentary typing (refutation, defense, elaboration, analogy, statistics, case law, gesture-speech match)
• Pseudo author: GenghisMac
Commentary Filtering
• Only from Roy Pea, Jim Greeno
• Only Student A, then Student B, etc. while grading assignments
• Only comments that support or refute me.
• Only scenes where patient is at risk, then comments about causes of error
• Only researchers, only public, only students
• Only today, not including today
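Tagged comments and the filters listed here could be modeled roughly as follows. This is a sketch under stated assumptions, not the data model of any TalkBank or ProjectPad tool. The `Comment` fields, the `filter_comments` helper, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Comment:
    """One commentary record (hypothetical fields)."""
    author: str         # recorded automatically
    written: date       # recorded automatically
    media_begin: float  # seconds into the media, recorded automatically
    media_end: float
    role: str           # self-characterized: "researcher", "student", "public"
    kind: str           # commentary type: "refutation", "defense", ...
    text: str


def filter_comments(
    comments: Iterable[Comment],
    authors: Optional[Set[str]] = None,
    roles: Optional[Set[str]] = None,
    kinds: Optional[Set[str]] = None,
    on: Optional[date] = None,
    exclude: Optional[date] = None,
) -> List[Comment]:
    """Keep comments that satisfy every criterion that was supplied."""
    kept = []
    for c in comments:
        if authors is not None and c.author not in authors:
            continue
        if roles is not None and c.role not in roles:
            continue
        if kinds is not None and c.kind not in kinds:
            continue
        if on is not None and c.written != on:
            continue
        if exclude is not None and c.written == exclude:
            continue
        kept.append(c)
    return kept


# Invented sample records, for illustration only.
SAMPLE = [
    Comment("Student A", date(2004, 5, 21), 12.0, 18.5, "student", "elaboration", "..."),
    Comment("Student B", date(2004, 5, 22), 12.0, 18.5, "student", "refutation", "..."),
    Comment("Researcher C", date(2004, 5, 22), 40.0, 55.0, "researcher", "defense", "..."),
]

if __name__ == "__main__":
    for c in filter_comments(SAMPLE, roles={"student"}, kinds={"refutation"}):
        print(c.author, c.kind)
```

Each filter in the list maps to one keyword argument. For example, "only Student A" is `authors={"Student A"}` and "not including today" is `exclude=date.today()`.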
Naked Video
• Terabytes of video
• No transcripts
• Occasional sign posts
• Sparse speech recognition
• Automatic video analysis
Evidential Database
• Claims pointing to evidence through media
• Supporting pointers to PDFs, Picts of
  • Articles, Legal precedents, evidence
• Published as special issues
  • JLS, JOC, Discourse Processes, Cog&Inst
Part 3: Child Language
• Child Language Data Exchange System
• Founded 1984 in Concord MA
• Director: Brian MacWhinney
• Programmers: Leonid Spektor, Franklin Chen
• 4500 Members
• 130 corpora
• 1500 published articles
Why study child language?
• Special Gift -- Universals
• Typology -- Variation
• Emergentism -- Processes
• Language Disorders -- Differences
• Socialization, Literacy
• Language Maintenance
Universals
• Are there basic patterns to babbling?
• Are early word orders universal?
• Does UG give children a universal set of functional categories?
• Is the vocabulary spurt universal?
We need LOTS of data
Differences
• Do children have individual styles?
  • Gestalt vs. Analytic
  • Enactive (1S) vs. Depictive (3S)
• Do children respond differentially to parental recasts?
• Do children vary in their match to cue validity?
Again, we need LOTS of data.
Comparisons
• How should we match SLI children to normal controls -- MLU? Morphology? TTR?
• How should we compare language socialization processes across social classes? Between cultures?
• How should we compare the course of development across languages? The case of Romance.
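Two of the matching measures named above, MLU (mean length of utterance) and TTR (type-token ratio), can be sketched in a few lines. This is a simplified illustration, not CLAN's implementation. MLU is conventionally counted in morphemes, whereas this sketch counts whitespace-separated words, and the function names and sample utterances are invented.

```python
from typing import List


def mlu_in_words(utterances: List[str]) -> float:
    """Mean length of utterance, counted in words (a simplification of morpheme-based MLU)."""
    lengths = [len(u.split()) for u in utterances if u.split()]
    if not lengths:
        raise ValueError("no non-empty utterances")
    return sum(lengths) / len(lengths)


def type_token_ratio(utterances: List[str]) -> float:
    """Number of distinct word forms divided by the total number of words."""
    tokens = [word.lower() for u in utterances for word in u.split()]
    if not tokens:
        raise ValueError("no tokens")
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    sample = ["more cookie", "want more juice", "doggie"]  # invented utterances
    print(mlu_in_words(sample))      # 2.0  (6 words over 3 utterances)
    print(type_token_ratio(sample))  # 5 types / 6 tokens
```

TTR falls as a sample grows, so a comparison between groups is only meaningful when the samples are of similar size. This is one reason the choice of matching measure is posed here as a question.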
Programs, Format, Database
• CLAN
• CHAT
• CHILDES
Web Browsing
Speech Act Coding
Slider vs Transcript
Part 4: Classroom Interactions
• TIMSS - six countries
• PBL - Koschmann, LeBaron
• Gravity - TERC
• Science Museum
  • Atmospheric light diffusion - Rahm
  • Electricity generation - Crowley
• Dresden - SLA English, French, Czech
• Grimshaw - Oral Defense
• Greeno - Garden Plot, numerical series
Classroom - continued
• Numerical displays - Sfard, McClain, Cobb
• Lehrer - Carmen Curtis and quilt patterns
• Lectures -- MacWhinney gesture analysis
• Roth -- map lecture
• Stevens -- professional dialogs
• WorkGroup -- MacWhinney, CMU groups
• Moskovitch -- bilingual classroom, math
• Horowitz -- reading exercises
• Home/School -- Hall, Snow
Tutorial Interactions
• Circle - Physics
• Frederiksen - statistics
• Graesser - statistics
• DISPEL - collaborative problem solving
Sample Analyses
• James Greeno, Brian MacWhinney, and Carla van der Sande
• Learning as the construction of mental models that explain device representations.
• Humans represent (explain) devices through
  • Perspectival embodiment
  • Spatial imagery
Gabriel’s Model
Dad’s Model
Gravity and Pprims
Garden Plots
Sally’s backyard is 40 feet wide by 72 feet long,
and it is structured so that a central rectangle of
grass is surrounded by an even border of
flowers. The area of the border is 5/12 of the
area of the whole garden. If Sally wants to be
sure to walk her dog 1/4 mile, how many laps
will she have to take around the grass?
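One way to work the problem, sketched in code. The function name and parameters are illustrative, and the sketch assumes the walk follows the edge of the grass rectangle. With border width x, the grass measures (40 - 2x) by (72 - 2x), so the border area is 40·72 - (40 - 2x)(72 - 2x), which gives the quadratic 4x² - 2(40 + 72)x + (5/12)(40·72) = 0.

```python
import math
from fractions import Fraction

FEET_PER_MILE = 5280


def garden_laps(width=40, length=72,
                border_share=Fraction(5, 12), walk_miles=Fraction(1, 4)):
    """Return (border width, grass dimensions, grass perimeter, laps needed)."""
    border_area = border_share * width * length  # 1200 sq ft
    # Quadratic in the border width x: 4x^2 - 2(width + length)x + border_area = 0
    a, b, c = 4, -2 * (width + length), border_area
    root = math.sqrt(b * b - 4 * a * c)
    border = (-b - root) / (2 * a)  # smaller root; the larger root exceeds the garden
    grass = (width - 2 * border, length - 2 * border)
    perimeter = 2 * (grass[0] + grass[1])
    laps = float(walk_miles * FEET_PER_MILE) / perimeter
    return border, grass, perimeter, laps


if __name__ == "__main__":
    border, grass, perimeter, laps = garden_laps()
    print(border)           # 6.0 feet
    print(grass)            # (28.0, 60.0)
    print(perimeter)        # 176.0 feet
    print(laps)             # 7.5
    print(math.ceil(laps))  # 8 whole laps to be sure of covering 1/4 mile
```

The quadratic's other root, x = 50, is rejected because it is wider than the garden itself.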
Perspective Shifting
Developmental Levels
• Home Interactions, rare vocabulary, HSLLD
• Science Museum, projects
• Symmetry -- Carmen Curtis
• High School Math, Science
• Second Language Classrooms -- early, late
Implications
• Learner models are fragmentary.
• Working devices must have full linkage.
• These linked systems can be annotated by embedded causal/perspectival links.
• Teachers can facilitate linkage formation by retracing perspectival links with focus on:
  • Proceduralized subcomponents
  • Missing links between levels
Commitment to full database
• We must construct complete propositional tree analysis.
• Coding must be reliable.
• Model must apply across all TalkBank datasets.
• Model will be applied to Physics and Chemistry in NSF Pittsburgh Science of Learning Center
Part 5: Proposed Research Directions
• Early English/Cantonese bilingualism - analyses of overlapping spheres
• Crosscultural classroom video analyses (TIMSS for ECE)
• Early science learning and device representations
• Perspective shifting in teachers and learners
• Practices and outcomes of early English education -- current HKIEd initiatives
• Role of software in language learning, both in early years and later -- PSLC