MATH 4322 - Introduction to Data Science and Machine Learning - University of Houston

# MATH 4322 - Introduction to Data Science and Machine Learning

***This is a course guideline. Students should contact the instructor for updated information on the current course syllabus, textbooks, and course content.***

Prerequisites: MATH 3339

Course Description: This course covers the theory and applications of statistical learning techniques such as linear and logistic regression, classification and regression trees, random forests, and neural networks. Other topics may include fit quality assessment, model validation, and resampling methods. The R statistical programming language will be used throughout the course.

Textbooks: While lecture notes will serve as the main source of material for the course, the following books are great references:
• “An Introduction to Statistical Learning (with applications in R)” by James, Witten et al. ISBN: 978-1461471370
• “Neural Networks with R” by G. Ciaburro. ISBN: 978-1788397872

Learning Objectives: By the end of the course a successful student should:

• Have a solid conceptual grasp of the described statistical learning methods.
• Be able to correctly identify the appropriate techniques to deal with particular data sets.
• Have a working knowledge of the R programming language in order to apply those techniques and subsequently assess the quality of the fitted models.
• Demonstrate the ability to clearly communicate the results of applying selected statistical learning methods to the data.

Course Outline:
• Introduction: What is Statistical Learning? Supervised and unsupervised learning. Regression and classification.
• Linear and Logistic Regression. Continuous response: simple and multiple linear regression. Binary response: logistic regression. Assessing quality of fit.
• Model Validation. Validation set approach. Cross-validation.
• Tree-based Models. Decision and regression trees: splitting algorithm, tree pruning. Random forests: bootstrap, bagging, random splitting.
• Neural Networks. Single-layer perceptron: neuron model, learning weights. Multi-layer perceptron: backpropagation, multi-class discrimination.