SMILE: A novel dissimilarity-based procedure for detecting sparse-specific profiles in sparse contingency tables
Abstract
A novel statistical procedure for clustering individuals characterized by sparse-specific profiles is introduced in the context of data summarized in sparse contingency tables. The proposed procedure relies on single-linkage clustering based on a new dissimilarity measure designed to give equal influence to the sparsity and the specificity of profiles. Theoretical properties of the new dissimilarity are derived by characterizing single-linkage clustering using Minimum Spanning Trees. This characterization makes it possible to describe situations in which the proposed dissimilarity outperforms competing dissimilarities. Simulations are carried out to demonstrate the strength of the new dissimilarity compared to 11 other methods. The analysis of a genomic data set dedicated to the study of molecular signatures of selection illustrates the efficiency of the proposed method in a real situation.
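The link between single-linkage clustering and Minimum Spanning Trees mentioned in the abstract can be illustrated with a minimal sketch. It is not the SMILE procedure: the function names and the `l1_profile_dissimilarity` placeholder (L1 distance between row profiles) are assumptions made for illustration, and the dissimilarity proposed in the paper would be passed in instead. The sketch relies on the standard fact that running Kruskal's algorithm until k components remain gives the same partition as removing the k - 1 heaviest edges of a Minimum Spanning Tree, which is the single-linkage partition into k clusters.

```python
from itertools import combinations


def l1_profile_dissimilarity(row_a, row_b):
    """Placeholder dissimilarity (an assumption, NOT the SMILE measure):
    L1 distance between the row profiles (rows divided by their totals)."""
    total_a, total_b = sum(row_a), sum(row_b)
    return sum(abs(a / total_a - b / total_b) for a, b in zip(row_a, row_b))


def single_linkage_via_mst(rows, k, dissimilarity):
    """Return single-linkage cluster labels for `rows`, cut at k clusters.

    Edges are added in increasing order of dissimilarity (Kruskal's
    algorithm) and the procedure stops once k components remain.
    """
    n = len(rows)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dissimilarity(rows[i], rows[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    components = n
    for _, i, j in edges:
        if components <= k:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two clusters: it is an MST edge
            parent[root_i] = root_j
            components -= 1

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Toy contingency table (made-up counts): rows are individuals.
table = [
    [10, 0, 0, 1],
    [9, 1, 0, 0],
    [0, 0, 8, 0],
    [0, 1, 7, 0],
    [3, 3, 3, 3],
]
labels = single_linkage_via_mst(table, 3, l1_profile_dissimilarity)
```

On this toy table the first two rows share one label, the next two share another, and the last row forms its own cluster. Passing the dissimilarity as a function keeps the clustering step independent of the measure, so different dissimilarities can be compared on the same table.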
Main file
SMILE A novel dissimilarity-based procedure.pdf (692.79 KB)
Origin: Files produced by the author(s)