Course Detail
Units:
3.0
Course Components:
Lecture
Enrollment Information
Enrollment Requirement:
Prerequisites: "C-" or better in (CS 3505 AND CS 3130).
Course Attribute:
Flexible Schedule
Description
Meets with CS 6140. Data mining is the study of efficiently finding structures and patterns in data sets. These structures and patterns are based on statistical and probabilistic principles, and they are found efficiently through the use of clever algorithms. This class focuses on modeling problems and on efficient algorithms to solve them, especially at very large scale. Many of these techniques use randomized algorithms, which are often extremely simple to use but more difficult to analyze. We will focus more on how to use them, and give explanations (but often not proofs) of correctness. Topics will include: similarity search, clustering, regression/dimensionality reduction, link analysis (PageRank), and small-space summaries.