MONK – Outlier-Robust Mean Embedding Estimation by Median-of-Means

Abstract: Mean embeddings provide an extremely flexible and powerful tool in machine learning and statistics to represent probability distributions and define a semi-metric (MMD, maximum mean discrepancy; also called N-distance or energy distance), with numerous successful applications. The representation is constructed as the expectation of the feature map defined by a kernel. Being a mean, however, its classical empirical estimator can be arbitrarily severely affected by even a single outlier when the features are unbounded. To the best of our knowledge, even the consistency of the few existing techniques that try to alleviate this serious sensitivity bottleneck is unknown. In this paper, we show how the recently emerged principle of median-of-means can be used to design minimax-optimal estimators for the kernel mean embedding and MMD, with strong finite-sample outlier-robustness guarantees.
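As a rough illustration of the median-of-means principle named in the abstract, the sketch below applies it to a plain scalar mean: the sample is split into blocks, each block is averaged, and the median of the block means is returned. This is not the MONK estimator from the report, which works with kernel feature maps. The function name `median_of_means`, the block count, the seed, and the toy data are all assumptions made for this example.

```python
import random
import statistics


def median_of_means(samples, n_blocks, seed=0):
    """Return the median of the means of n_blocks disjoint blocks.

    Illustrative scalar version only (hypothetical helper, not the
    report's kernel-based estimator).
    """
    if not 1 <= n_blocks <= len(samples):
        raise ValueError("n_blocks must be between 1 and len(samples)")
    # Shuffle a copy so the block assignment does not depend on input order.
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Strided slicing yields n_blocks disjoint blocks of near-equal size.
    blocks = [shuffled[i::n_blocks] for i in range(n_blocks)]
    block_means = [statistics.fmean(block) for block in blocks]
    return statistics.median(block_means)


# Toy data (assumed): 999 standard normal draws plus one huge outlier.
rng = random.Random(42)
clean = [rng.gauss(0.0, 1.0) for _ in range(999)]
contaminated = clean + [1e9]

empirical_mean = statistics.fmean(contaminated)       # dragged far from 0
mom_estimate = median_of_means(contaminated, n_blocks=11)  # stays near 0

print("empirical mean: ", empirical_mean)
print("median of means:", mom_estimate)
```

A single outlier can corrupt at most one block, so as long as more than half of the blocks are clean the median of the block means is unaffected by its magnitude. The number of blocks trades robustness (more blocks tolerate more outliers) against the accuracy of each block mean.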
Document type:
Report
[Research Report] Laboratoire de Mathématiques d'Orsay; Ecole Polytechnique (Palaiseau, France); ONERA. 2018
Cited literature [72 references]

https://hal.archives-ouvertes.fr/hal-01705881
Contributor: Zoltan Szabo
Submitted on: Friday, February 9, 2018 - 23:32:33
Last modified on: Wednesday, October 17, 2018 - 16:34:01

File

MONK.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01705881, version 1

Citation

Matthieu Lerasle, Zoltán Szabó, Gaspar Massiot, Eric Moulines. MONK – Outlier-Robust Mean Embedding Estimation by Median-of-Means. [Research Report] Laboratoire de Mathématiques d'Orsay; Ecole Polytechnique (Palaiseau, France); ONERA. 2018. 〈hal-01705881v1〉

Metrics

Record views: 70
File downloads: 5