PhD Program in Computer Science

Regularization Methods for High Dimensional Learning
A course on Learning Theory

The course will be held from July 5 to July 9, 2010.

Registration: free. To register, send an e-mail to one of the instructors.

Instructors:
Francesca Odone, odone@disi.unige.it
Lorenzo Rosasco, lrosasco@mit.edu
Venue
DISI-Università degli Studi di Genova, Via Dodecaneso 35, 16146 Genova, IT
classes: room 711 (7th floor) - 217 (2nd floor)
labs: room SW2 (3rd floor)
Synopsis

The course provides a self-contained introduction to state-of-the-art statistical learning techniques for the analysis of complex high-dimensional data. It is designed to be accessible to graduate students in computational sciences.

The subject of the course is motivated by the observation that in many modern applications, such as computational vision or bioinformatics, data are represented by hundreds or thousands of variables. In this high-dimensional setting, small sample sizes become a major concern and classical statistical tools fall short of providing satisfactory solutions.

Starting from the classical notion of smoothness underlying many kernel methods, we present several principles for high-dimensional learning. In particular, we discuss algorithms motivated by the geometry of the data (manifold learning), sparsity, and transfer learning.

The emphasis of the course will be on algorithmic issues, and MATLAB sessions will provide students with hands-on experience. Applications to computer vision and bioinformatics will be discussed. Fundamental theoretical concepts, such as generalization and regularization, will also be described.

Syllabus

- each class lasts 90 minutes, with no breaks -

class 1 (C1) Welcome. Introduction to Learning
class 2 (C2) RKHS and Tikhonov Regularization
class 3 (C3) Spectral Methods for Supervised Learning
class 4 (C4) Error Analysis and Parameter Choice
class 5 (C5) Lab 1 - Binary classification and model selection
class 6 (C6) Sparsity Based Learning and Variable Selection
class 7 (C7) Regularization for Multi-Output Learning
class 8 (C8) Lab 2 - Sparsity based methods
class 9 (C9) Manifold Regularization
class 10 (C10) Regularization with multiple kernels
class 11 (C11) Lab 3 - Manifold regularization
class 12 (C12) Applications to high dimensional problems
class 13 (C13) Lab 4 - Applications

Course schedule
               MON 5   TUE 6      WED 7      THU 8       FRI 9
9:30-11:00     -       C3         C6         C9          C12
11:30-13:00    -       C4         C7         C10         C13 (lab)
14:30-16:00    C1      C5 (lab)   C8 (lab)   C11 (lab)   -
16:30-18:00    C2      -          -          -           -
Prerequisites
Multivariate calculus, basic probability theory, MATLAB.
Short reading list

General references are

  • O. Bousquet, S. Boucheron and G. Lugosi. Introduction to Statistical Learning Theory. Advanced Lectures on Machine Learning, Lecture Notes in Artificial Intelligence 3176, 169-207. (Eds.) Heidelberg, Germany, 2004.
  • F. Cucker and S. Smale. On The Mathematical Foundations of Learning. Bulletin of the American Mathematical Society, 2002.
  • T. Evgeniou, M. Pontil and T. Poggio. Regularization Networks and Support Vector Machines. Advances in Computational Mathematics, 2000.
  • T. Poggio and S. Smale. The Mathematics of Learning: Dealing with Data. Notices of the AMS, 2003.
  • L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1997.
  • V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
  • T. Hastie, R. Tibshirani and J. H. Friedman. The Elements of Statistical Learning. Springer, 2001.
  • I. Steinwart and A. Christmann. Support Vector Machines. Springer, New York, 2008.
  • F. Cucker and D.-X. Zhou. Learning Theory: An Approximation Theory Viewpoint. With a foreword by Stephen Smale. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2007. xii+224 pp.
Material and further readings

Specific references for each class (see also http://www.mit.edu/~9.520/).

class 1 (C1) Welcome. Introduction to Learning

class 2 (C2) RKHS and Tikhonov Regularization

  • slides
  • Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68(3), 337-404, 1950.
  • Cucker and Smale. On the mathematical foundations of learning. Bulletin of the American Mathematical Society, 2002.
  • Evgeniou, Pontil and Poggio. Regularization Networks and Support Vector Machines. Advances in Computational Mathematics, 2000.
  • Girosi, F. An Equivalence between Sparse Approximation and Support Vector Machines. Neural Computation, Vol. 10, 1455-1480, 1998. (Appendix A)
  • Wahba, G. Spline Models for Observational Data. Series in Applied Mathematics, Vol. 59, SIAM, 1990. (Chapter 1)

class 3 (C3) Spectral Methods for Supervised Learning

  • slides
  • L. Lo Gerfo, L. Rosasco, F. Odone, E. De Vito, and A. Verri. Spectral Algorithms for Supervised Learning,
    Neural Computation, 20(7):1873-97, 2008,
    and references therein.
  • Mosci, S., Rosasco, L. and Verri, A. "Dimensionality reduction and generalization", Proceedings of the 24th International Conference on Machine Learning, ACM International Conference Proceeding Series, Vol. 227.
  • Yao Y., Rosasco L. and Caponnetto, A. "On Early Stopping in Gradient Descent Learning", to be published in Constructive Approximation.

class 4 (C4) Error Analysis and Parameter Choice

class 5 (C5) Lab 1 - Binary classification and model selection

class 6 (C6) Sparsity Based Learning and Variable Selection

  • slides
  • De Mol, C., De Vito, E. and Rosasco, L. "Elastic Net Regularization in Learning Theory", to appear in the Journal of Complexity
    (also CBCL paper #273 / CSAIL Technical Report #TR-2008-046, Massachusetts Institute of Technology, Cambridge, MA, July 24, 2008 and arXiv:0807.3423), and references therein.

class 7 (C7) Regularization for Multi-Output Learning

class 8 (C8) Lab 2 - Sparsity based methods


class 9 (C9) Manifold Regularization

class 10 (C10) Regularization with multiple kernels

class 11 (C11) Lab 3 - Manifold regularization

  • Lab
  • code
  • M. Belkin, P. Niyogi. Semi-supervised Learning on Riemannian Manifolds. Machine Learning, 56, Special Issue on Clustering, 209-239, 2004.
  • M. Belkin, P. Niyogi, V. Sindhwani. On Manifold Regularization. AISTATS 2005.


class 12 (C12) Applications to high dimensional problems

  • slides
  • A. Destrero, C. De Mol, F. Odone, A. Verri. "A Regularized Framework for Feature Selection in Face Detection and Authentication". IJCV (2009). 
  • A. Destrero, C. De Mol, F. Odone, A. Verri. "A sparsity-enforcing method for learning face features". IEEE Transactions on Image Processing 18 (2009): 188-201.
  • C. Basso, M. Santoro, A. Verri and M. Esposito. "Segmentation of Inflamed Synovia in Multi-Modal MRI." In Proc. of IEEE ISBI 2009, June 28 - July 1 2009.
  • P. Fardin, A. Cornero, A. Barla, S. Mosci, M. Acquaviva, L. Rosasco, C. Gambini, A. Verri, L. Varesio. "Identification of multiple hypoxia signatures in neuroblastoma cell lines by l1-l2 regularization and data reduction". Journal of Biomedicine and Biotechnology, 2010.
  • A. Barla, S. Mosci, L. Rosasco and A. Verri. "A method for robust variable selection with significance assessment." Proc. of ESANN, European Symposium on Artificial Neural Networks 2008.
  • C. De Mol, S. Mosci, M. Traskine and A. Verri. "A Regularized Method for Selecting Nested Groups of Relevant Genes from Microarray Data". Journal of Computational Biology, 2008.

class 13 (C13) Lab 4 - Applications