CS 462/562: Big Data Algorithms
Instructor | Sunil Shende
Class Schedule | Th 6pm - 8:50pm in BSB 336 + synchronous online over Zoom
Office | 308 Business & Science Building
Office Hours | MW 11:20am - 12:20pm (Free Period) over Zoom
Campus Telephone | Extension 6122
Email | sunil DOT shende AT rutgers DOT edu
Contents
This course provides a rigorous introduction to algorithms and techniques for processing very large data sets, both offline and streaming. The emphasis is on developing a clear understanding of the mathematical foundations of algorithms for data mining (similarity detection, clustering, association rules for frequent itemsets, link analysis, etc.) and machine learning (supervised and unsupervised learning models, regression, support vector machines, etc.). Students will also use Python-based libraries and software such as pandas, scikit-learn, and PySpark to experiment with these algorithms and to develop big data applications.
To prepare for the course, students are expected to be comfortable with basic discrete mathematics, probability theory, data structures, elementary algorithms, and programming in a high-level language such as Python or Java. We will program exclusively in Python, so please review and practice writing Python code.