The course provides a self-contained introduction to state-of-the-art
statistical learning techniques for the analysis of complex,
high-dimensional data. It is designed to be accessible to graduate
students in computational sciences.
The course is motivated by the observation that in many modern
applications, such as computational vision or bioinformatics, data are
represented by hundreds or thousands of variables. In this
high-dimensional setting, small sample sizes become a major concern and
classical statistical tools fail to provide satisfactory solutions.
Starting from the classical notion of smoothness underlying many kernel
methods, we present several principles for high-dimensional learning. In
particular, we discuss algorithms motivated by the geometry of the data
(manifold learning), sparsity, and transfer learning.
The emphasis of the course will be on algorithmic issues, and MATLAB
sessions will provide students with hands-on experience. Applications to
computer vision and bioinformatics will be discussed. Fundamental
concepts from the theory, such as generalization and regularization,
will also be described.
General references:
- O. Bousquet, S. Boucheron and G. Lugosi. Introduction to Statistical
Learning Theory. Advanced Lectures on Machine Learning, Lecture Notes in
Artificial Intelligence 3176, 169-207. (Eds.) Heidelberg, Germany (2004)
- F. Cucker and S. Smale. On The Mathematical Foundations of
Learning. Bulletin of the American Mathematical Society, 2002.
- T. Evgeniou, M. Pontil and T. Poggio. Regularization Networks and
Support Vector Machines. Advances in Computational Mathematics, 2000.
- T. Poggio and S. Smale. The Mathematics of Learning: Dealing
with Data. Notices of the AMS, 2003.
- L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory
of Pattern Recognition. Springer, 1997.
- V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
- T. Hastie, R. Tibshirani and J. H. Friedman. The Elements of
Statistical Learning. Springer, 2001.
- I. Steinwart and A. Christmann. Support Vector Machines. Springer,
New York, 2008.
- F. Cucker and D.-X. Zhou. Learning Theory: An Approximation Theory
Viewpoint. With a foreword by Stephen Smale. Cambridge Monographs on
Applied and Computational Mathematics. Cambridge University Press,
Cambridge, 2007. xii+224 pp.
Specific references for each class (see also http://www.mit.edu/~9.520/).
class 1 (C1) Welcome. Introduction to Learning
class 2 (C2) RKHS and Tikhonov Regularization - slides
- Aronszajn. Theory of Reproducing Kernels. Transactions of the American
Mathematical Society, 68, 337-404, 1950.
- Cucker and Smale. On the Mathematical Foundations of Learning. Bulletin
of the American Mathematical Society, 2002.
- Evgeniou, Pontil and Poggio. Regularization Networks and Support Vector
Machines. Advances in Computational Mathematics, 2000.
- Girosi, F. An Equivalence between Sparse Approximation and Support Vector
Machines. Neural Computation, Vol. 10, 1455-1480, 1998. (Appendix A)
- Wahba, G. Spline Models for Observational Data. Series in Applied
Mathematics, Vol. 59, SIAM, 1990. (Chapter 1)
class 3 (C3) Spectral Methods for Supervised Learning - slides
- L. Lo Gerfo, L. Rosasco, F. Odone, E. De Vito, and A. Verri. Spectral
Algorithms for Supervised Learning. Neural Computation, 20(7):1873-97,
2008, and references therein.
- Mosci, S., Rosasco, L. and Verri, A. "Dimensionality reduction and
generalization", ACM International Conference Proceeding Series, Vol.
227, Proceedings of the 24th International Conference on Machine
Learning.
- Yao, Y., Rosasco, L. and Caponnetto, A. "On Early Stopping in Gradient
Descent Learning", to be published in Constructive Approximation.
class 4 (C4) Error Analysis and Parameter Choice
class 5 (C5) Lab 1 - Binary classification and model selection
class 6 (C6) Sparsity Based Learning - slides
- De Mol, C., De Vito, E. and Rosasco, L. "Elastic Net Regularization in
Learning Theory", to appear in the Journal of Complexity (also CBCL paper
#273 / CSAIL Technical Report #TR-2008-046, Massachusetts Institute of
Technology, Cambridge, MA, July 24, 2008, and arXiv:0807.3423), and
references therein.
class 7 (C7) Regularization for Multi-Output Learning
class 8 (C8) Lab 2 - Sparsity based methods
class 9 (C9) Manifold Regularization
class 10 (C10) Regularization with multiple kernels
class 11 (C11) Lab 3 - Manifold regularization - Lab - code
- M. Belkin, P. Niyogi. Semi-supervised Learning on Riemannian Manifolds.
Machine Learning, 56, Special Issue on Clustering, 209-239, 2004.
- M. Belkin, P. Niyogi, V. Sindhwani. On Manifold Regularization. AISTATS 2005.
class 12 (C12) Applications to high dimensional problems - slides
- A. Destrero, C. De Mol, F. Odone, A. Verri. "A Regularized Framework for
Feature Selection in Face Detection and Authentication". IJCV (2009).
- A. Destrero, C. De Mol, F. Odone, A. Verri. "A sparsity-enforcing method
for learning face features". IEEE Transactions on Image Processing 18
(2009): 188-201.
- C. Basso, M. Santoro, A. Verri and M. Esposito. "Segmentation of Inflamed
Synovia in Multi-Modal MRI". In Proc. of IEEE ISBI 2009, June 28 - July
1 2009.
- P. Fardin, A. Cornero, A. Barla, S. Mosci, M. Acquaviva, L. Rosasco,
C. Gambini, A. Verri, L. Varesio. "Identification of multiple hypoxia
signatures in neuroblastoma cell lines by l1-l2 regularization and data
reduction". Journal of Biomedicine and Biotechnology, 2010.
- A. Barla, S. Mosci, L. Rosasco and A. Verri. "A method for robust variable
selection with significance assessment". Proc. of ESANN, European
Symposium on Artificial Neural Networks 2008.
- C. De Mol, S. Mosci, M. Traskine and A. Verri. "A Regularized Method for
Selecting Nested Groups of Relevant Genes from Microarray Data". Journal
of Computational Biology 2008.
class 13 (C13) Lab 4 - Applications