CS 462/562: Big Data Algorithms

Course Information

Instructor

Sunil Shende

Class Schedule

Th 6pm - 8:50pm in BSB 336 + synchronous online over Zoom

Office

308 Business & Science Building

Office Hours

MW 11:20am - 12:20pm (Free Period) over Zoom

Campus Telephone

Extension 6122

Email

sunil DOT shende AT rutgers DOT edu

Contents

This course provides a rigorous introduction to algorithms and techniques for processing very large data sets, both offline and streaming. The emphasis will be on developing a clear understanding of the mathematical foundations of algorithms for data mining (similarity detection, clustering, association rules for frequent itemsets, link analysis, etc.) and machine learning (supervised and unsupervised learning models, regression, support vector machines, etc.). Students will also work with Python-based libraries and software such as Pandas, scikit-learn, and PySpark to experiment with and develop big data applications.

By way of preparation for the course, students are expected to be comfortable with basic discrete mathematics, probability theory, data structures, elementary algorithms, and programming in a high-level language such as Python or Java. We will be programming exclusively in Python, so please review and practice writing Python code.