Journal article, Computational Statistics and Data Analysis, Year: 2012

Selecting the number of components in PCA using cross-validation approximations

Abstract

Cross-validation is a tried and tested approach to selecting the number of components in principal component analysis (PCA); its main drawback, however, is its computational cost. In a regression (or nonparametric regression) setting, criteria such as generalized cross-validation (GCV) provide convenient approximations to leave-one-out cross-validation. They are based on the relation between the prediction error and the residual sum of squares weighted by elements of a projection matrix (or a smoothing matrix). Such a relation is then established in PCA using an original presentation of PCA with a single projection matrix. It enables the definition of two cross-validation approximation criteria: the smoothing approximation of the cross-validation criterion (SACV) and the GCV criterion. The method is assessed with simulations and gives promising results.
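The regression relation the abstract refers to can be illustrated with a minimal sketch. It covers only the ordinary least-squares case, not the PCA criteria (SACV, GCV) defined in the paper, and the function name `loo_and_gcv` and the simulated data are assumptions made for illustration. For a linear fit with projection matrix P, the leave-one-out residual of observation i equals the ordinary residual divided by 1 - P_ii, and GCV replaces each P_ii with the average tr(P)/n.

```python
import numpy as np


def loo_and_gcv(X, y):
    """Illustrative helper (hypothetical name): compare explicit leave-one-out
    error with its projection-matrix shortcut and with GCV for least squares."""
    n = X.shape[0]
    # Projection ("hat") matrix onto the column space of X.
    P = X @ np.linalg.pinv(X)
    resid = y - P @ y
    h = np.diag(P)

    # Shortcut: LOO residual = ordinary residual / (1 - P_ii).
    loo_shortcut = np.mean((resid / (1.0 - h)) ** 2)
    # GCV: replace every P_ii by the average leverage tr(P) / n.
    gcv = np.mean((resid / (1.0 - np.trace(P) / n)) ** 2)

    # Brute force: refit n times, each time leaving one observation out.
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errs[i] = (y[i] - X[i] @ beta) ** 2
    return errs.mean(), loo_shortcut, gcv


# Simulated data, used purely for illustration.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=50)

loo_explicit, loo_shortcut, gcv = loo_and_gcv(X, y)
print(loo_explicit, loo_shortcut, gcv)
```

The explicit and shortcut leave-one-out errors agree up to floating-point error, while GCV is a close approximation obtained from a single fit. This is the computational saving the abstract describes carrying over to PCA.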

Dates and versions

hal-00729614, version 1 (07-09-2012)

Identifiers

Cite

Julie Josse, François Husson. Selecting the number of components in PCA using cross-validation approximations. Computational Statistics and Data Analysis, 2012, 56 (6), pp.1869-1879. ⟨10.1016/j.csda.2011.11.012⟩. ⟨hal-00729614⟩
