CPS 196.03: Information Management and Mining Third programming project


Third Programming Project

Three options:

Clustering project

PageRank project

Your own topic

1-page project proposal due by Friday, April 10, 5:00 PM

Project and report due on Tuesday April 21

Single demo for all three projects: April 21 and 22

30 minutes per team (team from Project 3)
Be prepared to run code on your laptop or by logging in to a CS department machine
Time slots will be determined through email
Clustering Project

Implement the BFR (Bradley-Fayyad-Reina) algorithm

Notes
www.cs.cornell.edu/Courses/cs678/2002sp/papers/bradley98scaling.ps

Evaluate on one or more datasets from the UCI repository

http://kdd.ics.uci.edu/
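
As a starting point for the clustering option, here is a minimal sketch of the per-cluster summary that BFR-style algorithms maintain: a count, a per-dimension sum, and a per-dimension sum of squares. The names `ClusterSummary` and `assign_chunk`, the `threshold` parameter, and the diagonal-covariance (axis-aligned) distance are assumptions made for illustration; the paper linked above defines the actual assignment test and the handling of the compression and retained sets.

```python
import math


class ClusterSummary:
    """Sufficient statistics (N, SUM, SUMSQ) for one cluster, kept per dimension."""

    def __init__(self, dim):
        self.n = 0
        self.sum = [0.0] * dim
        self.sumsq = [0.0] * dim

    def add(self, point):
        """Fold one point into the summary; the point itself can then be discarded."""
        self.n += 1
        for i, x in enumerate(point):
            self.sum[i] += x
            self.sumsq[i] += x * x

    def centroid(self):
        return [s / self.n for s in self.sum]

    def variance(self):
        # Var[x] = E[x^2] - E[x]^2, computed independently in each dimension.
        return [sq / self.n - (s / self.n) ** 2
                for s, sq in zip(self.sum, self.sumsq)]

    def mahalanobis(self, point, eps=1e-12):
        """Distance to the centroid, scaled by the per-dimension variance."""
        c = self.centroid()
        v = self.variance()
        return math.sqrt(sum((x - ci) ** 2 / max(vi, eps)
                             for x, ci, vi in zip(point, c, v)))


def assign_chunk(summaries, chunk, threshold):
    """Absorb points that lie close to an existing cluster; return the rest.

    `threshold` is a hypothetical cutoff on the scaled distance. Points that
    are not absorbed are returned as a simple retained set.
    """
    retained = []
    for point in chunk:
        nearest = min(summaries, key=lambda s: s.mahalanobis(point))
        if nearest.mahalanobis(point) < threshold:
            nearest.add(point)
        else:
            retained.append(point)
    return retained


if __name__ == "__main__":
    s = ClusterSummary(dim=2)
    for p in [(0.0, 0.0), (2.0, 2.0), (0.0, 2.0), (2.0, 0.0)]:
        s.add(p)
    leftover = assign_chunk([s], [(1.0, 1.5), (10.0, 10.0)], threshold=3.0)
    print(s.n, leftover)
```

Because only the three statistics are stored, memory use per cluster does not grow with the number of points absorbed, which is what makes a single pass over a large dataset feasible.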
PageRank Project

Implement PageRank computation algorithm for large Web graphs

Study running time, convergence properties, and robustness of the algorithm to spam/fraud

How is the Web graph represented?
Generate different types of Web graphs
Paper from Google: "The PageRank Citation Ranking: Bringing Order to the Web"

For discussion on Thursday (see readings page)
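
For the PageRank option, the sketch below shows one possible baseline: power iteration over a graph stored as an adjacency list. The function name `pagerank`, the dictionary representation, the damping value of 0.85, the tolerance, and the uniform redistribution of rank from pages with no out-links are all assumptions chosen for illustration, not requirements of the project. It returns the iteration count so that convergence can be measured.

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank.

    graph: dict mapping each page to a list of pages it links to
           (an adjacency-list representation, assumed here).
    Returns (rank, iterations), where rank maps page -> score and
    the scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    nodes = sorted(set(graph) | {v for outs in graph.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for iteration in range(1, max_iter + 1):
        # Rank held by pages without out-links is spread over all pages.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in nodes}

        # Each page passes an equal share of its rank along every out-link.
        for u, outs in graph.items():
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new_rank[v] += share

        # L1 change between successive iterates is the convergence measure.
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break

    return rank, iteration


if __name__ == "__main__":
    web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    scores, iters = pagerank(web)
    for page in sorted(scores, key=scores.get, reverse=True):
        print(page, round(scores[page], 4))
    print("iterations:", iters)
```

Timing this function on generated graphs of increasing size, and comparing scores before and after adding a group of pages that all link to one target, would be one way to approach the running-time and spam-robustness questions above.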