Visualization, analysis and mining of geospatial information in educational data sets
using web-based tools
Aniruddha Desai | Winter 2013 Presentation
Center for Web and Data Science
University of Washington, Tacoma
Outline
Motivation / Background
Data sets for OSPI projects
Goals: Visualization, Analysis, Mining
Goals: Tools, Strategies
Motivation / Background
Paper: “Top-k popular routes based on
Foursquare check-ins”
Paper: “Top-k optimal store locations
based on Foursquare check-ins”
Class Project: Using GPS traces created
by a user to infer mode of transportation
OSPI Project: Educational data
Is the domain data-rich and tool-poor?
Does data have geo-spatial dimensions?
OSPI – RTI Reports
The RTI (Response to Intervention) Project website collects data using a standard rubric with a “rating scale”.
Goal: Measuring the effectiveness of interventions at individual school sites across various metrics.
OSPI – SNP Reports
The SNP (State Needs Projects) data dashboard collects data using web-based surveys.
Goal: Measuring the effectiveness of professional development / training efforts in special needs education.
Collect, Analyze, Share Data
– Surveys
– Geographically distributed data
– Field evidence
– Variety of data types
– Managing participants
– Multiple databases
– Sharing data
– Analysis
– Audio / Video
– Text
Data Visualization Goals
Choropleth maps visualize the number of educators trained by region at the state and national level (for SNP)
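A choropleth needs one value per region and a rule for turning that value into a color class. The sketch below is one possible way to do this, not the project's implementation: the records, county names, palette and the equal-interval binning rule are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical training records (participant_id, county); not real SNP data.
records = [
    ("p1", "Lincoln"), ("p2", "Lincoln"), ("p3", "Pierce"),
    ("p4", "Pierce"), ("p5", "Pierce"), ("p6", "King"),
    ("p7", "King"), ("p8", "King"), ("p9", "King"), ("p10", "Adams"),
]

# Assumed three-step light-to-dark palette, one color per class.
PALETTE = ["#deebf7", "#9ecae1", "#3182bd"]


def count_by_region(rows):
    """Count trained participants per region."""
    return Counter(region for _, region in rows)


def equal_interval_classes(counts, n_classes=3):
    """Assign each region a class index 0..n_classes-1 using equal-width bins."""
    lo, hi = min(counts.values()), max(counts.values())
    # Fall back to a width of 1 when every region has the same count.
    width = (hi - lo) / n_classes or 1
    return {
        region: min(int((value - lo) / width), n_classes - 1)
        for region, value in counts.items()
    }


counts = count_by_region(records)
classes = equal_interval_classes(counts, n_classes=len(PALETTE))
fill = {region: PALETTE[idx] for region, idx in classes.items()}
```

The resulting `fill` mapping (region name to color) is the kind of lookup a map layer could use when shading each region; quantile bins would be a reasonable alternative when counts are heavily skewed.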
Data Visualization Goals
Lincoln County
No. of Participants Trained: X
No. of Responses Rcvd: X’
Population Density: Y
Number of Trainings: Z
Go to Results
Interactive regions on the choropleth map display more information
Data Visualization Goals
RTI data is spread across several districts / counties across the state.
Visualize data points at individual school sites by zooming in
Data Visualization Goals
Tahoma School District
ESD No. 17409
Go to Results
At a low zoom factor, mouse-overs display more
information about data points and link to RTI results
Data Analysis / Mining Goals
Can visualizations answer some of these questions?
– Can we predict which area needs more
professional development training next year?
– Is the response rate on surveys correlated with the participant
attendance rate?
– High volume / variety of data (some of it geospatial): survey responses / qualitative
assessments / user zip codes / school locations /
district boundaries – how do we extract useful
information?
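The question about survey response rate and attendance rate could be checked with a correlation coefficient. The following is a minimal sketch using the Pearson coefficient; the per-training rates are made-up numbers, and Pearson is only one possible choice of measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rates per training event; not real survey data.
response_rate = [0.2, 0.4, 0.6, 0.8]
attendance_rate = [0.5, 0.6, 0.7, 0.9]

r = pearson(response_rate, attendance_rate)
```

A value of `r` near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests none; with only a handful of trainings the estimate would be noisy, so it is a starting point rather than a conclusion.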
Data Analysis / Mining Goals
– Are demographic data (census), income levels,
crime statistics and employment rates related to:
the outcomes of intervention (for RTI)?
the quality of professional development (for SNP)?
– Data collection, reporting and visualization are the
first step; finding patterns is potentially the next
step.
– How do the visualizations scale up from state to
national level?
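Relating outcomes to census-style indicators first requires lining the two data sets up on a shared region key. The sketch below shows one way this might look, assuming both sources can be keyed by county name; the scores and income figures are invented for illustration.

```python
# Hypothetical per-county values; not real RTI or census data.
outcome_score = {"Lincoln": 3.2, "Pierce": 4.1, "King": 3.8}
median_income = {"Lincoln": 45000, "King": 75000, "Adams": 41000}


def join_by_region(left, right):
    """Inner join of two region-keyed dicts, sorted by region name."""
    shared = sorted(left.keys() & right.keys())
    return [(region, left[region], right[region]) for region in shared]


joined = join_by_region(outcome_score, median_income)
```

Regions missing from either source (here `Pierce` and `Adams`) drop out of the join, which is worth tracking because gaps in coverage can bias any pattern found afterwards.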
Tools
– Drupal CMS (already in-place)
– Google Maps API – Gmap module to create an
interface to the Google Maps API within Drupal
http://drupal.org/project/gmap
– D3.js (Data-Driven Documents) visualizations such
as heat maps, choropleth maps, bar charts, pie
charts
https://github.com/mbostock/d3/wiki/Gallery
– OpenStreetMap API
http://www.openstreetmap.org/ (Drupal
integration?)
Strategy
– Implement geographic map-based visualizations
with an appropriate amount of information at
different zoom factors.
– At a high level of granularity, link data points on the map
to bar charts / reports for more detail.
– Analyze data visualizations for patterns.
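The zoom-dependent strategy above can be sketched as a lookup from zoom level to aggregation level. This assumes integer zoom levels where a larger number means a closer view, as in common web map APIs; the thresholds and level names are hypothetical and would need tuning against the real map.

```python
# Assumed zoom thresholds (exclusive upper bounds) and the level shown below each.
ZOOM_LEVELS = [
    (7, "state"),
    (10, "county"),
    (13, "district"),
]
FINEST_LEVEL = "school"

# Levels at which a data point links through to detailed charts / reports.
LINKED_LEVELS = {"district", "school"}


def detail_level(zoom):
    """Return the aggregation level to display at the given zoom."""
    if zoom < 0:
        raise ValueError("zoom must be non-negative")
    for upper_bound, level in ZOOM_LEVELS:
        if zoom < upper_bound:
            return level
    return FINEST_LEVEL


def links_to_report(zoom):
    """True when points at this zoom should link to bar charts / reports."""
    return detail_level(zoom) in LINKED_LEVELS
```

For example, `detail_level(5)` returns `"state"` and `detail_level(14)` returns `"school"`, so a map handler could call it on each zoom change to decide which layer to draw.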
Thank you!
Q&A