Statistical Learning

About this Course

This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines; neural networks and deep learning; survival models; multiple testing. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical). This is not a math-heavy class, so we try to describe the methods without heavy reliance on formulas and complex mathematics. We focus on what we consider to be the important elements of modern data science. Computing is done in R. There are lectures devoted to R, giving tutorials from the ground up, and progressing to more detailed sessions that implement the techniques in each chapter. The lectures cover all the material in An Introduction to Statistical Learning, with Applications in R (second edition) by James, Witten, Hastie and Tibshirani (Springer, 2021). The PDF of this book is available for free on the book website.

Created by: Stanford University

Level: Introductory

Related Online Courses

Structured Query Language (SQL) is a standardized programming language used to manage relational databases and perform various operations on their data. This is the first course of a two-part...
This course will enable you to develop skills as a decision-maker based on the following competencies: analysis of statistical elements of information, concepts and fun...
Do you want to build systems that learn from experience? Or exploit data to create simple predictive models of the world? In this course, part of the Data Science MicroMasters program, you will...
This course teaches the knowledge needed to start working with data visualization in the Python programming language. First, the course will explain the ca...
Data science is a field that today offers organizations very powerful analytical tools; those that have adopted these practices quickly have been able to gain advantages com...