CS/MATH/STAT:4740 Large Data Analysis, 3 s.h. A project-based course.
MAIN INSTRUCTOR: Prof. Suely Oliveira.
GUEST LECTURERS: Prof. David Stewart, Prof. Isabel Darcy, Prof. Kate Cowles.

This course is required for the award of the Large Data Analysis certificate.
--------------------------------------------------------------------------------------------------------------------------------------
Any student with the prerequisites is welcome.

Prerequisites: CS:1210, MATH:2700, and STAT:2010 (or STAT:2020).

1) Students who wish to use MATH:2550 instead of MATH:2700 must request an exception. Please fill out an "In Progress Transfer Work" form on your ISIS account, and give MATH:2550 as the reason.

2) Students who wish to use ENGR:2730 instead of CS:1210 must request an exception. Please fill out an "In Progress Transfer Work" form on your ISIS account, and give ENGR:2730 as the reason.
---------------------------------------------------------------------------------------------------------------------------

SUMMARY: The course includes a review of concepts taught in other classes that are necessary for the projects in this class. It is usually taken in the senior year but can be taken earlier. It also works as a beginning graduate class in informatics. Emphasis is given to programming and the mathematics behind the methods.

CATALOG DESCRIPTION: Current areas that deal with the problem of Big Data; techniques from computer science, mathematics, statistics; high performance and parallel computing, matrix techniques, cluster analysis, visualization; variety of applications including Google PageRank, seismology, Netflix-type problems, weather forecasting; fusion of data with simulation; projects.

TARGET AUDIENCE: This course reaches a broad audience at the University of Iowa, including students outside the computer and information sciences. Concepts of statistics, matrix analysis, and parallel algorithms are reviewed or introduced. Students will work on projects for half of the semester. This course is a requirement for students in the undergraduate certificate in Large Data Analysis. It can also be used as a beginning graduate class, including for students in other certificates involving Data or Network Sciences.

GOAL: While other classes on campus cover how to use existing methods in this area, the main goal of this course is to enable students to develop new computational and analytical tools for dealing with data when tackling new problems.

TOPICS COVERED:
  1. Introduction and Overview
  2. Programming Languages: Python and R Review
  3. Visualization
  4. Statistical Issues
  5. Clustering and Classification
  6. Matrix Methods 
  7. Topological Data Analysis
  8. Computational Issues: MapReduce, parallel computing, and memory issues