Journal article, Journal of Biomedical Semantics, Year: 2015

Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts

Abstract

Background: Discovering gene interactions and their characterizations from biological text collections is a crucial issue in bioinformatics. Text collections are large, and it is very difficult for biologists to take full advantage of this amount of knowledge. Natural Language Processing (NLP) methods have been applied to extract background knowledge from biomedical texts. Some existing NLP approaches are based on handcrafted rules and are thus time-consuming and often tied to a specific corpus. Machine-learning-based NLP methods give good results but generate outcomes that are not readily understandable by a user.

Results: We take advantage of a hybridization of data mining and natural language processing to propose an original symbolic method that automatically produces patterns conveying gene interactions and their characterizations. Our method therefore detects not only gene interactions but also semantic information on the extracted interactions (e.g., modalities, biological contexts, interaction types). Only a limited resource is required: the text collection that is used as a training corpus. Our approach gives results comparable to those of state-of-the-art methods and is even better for gene interaction detection in AIMed.

Conclusions: Experiments show how our approach enables the discovery of interactions and their characterizations. To the best of our knowledge, there are few methods that automatically extract both the interactions and the associated semantic information. The gene interactions extracted from PubMed are available through a simple web interface at https://bingotexte.greyc.fr/. The software is available at https://bingo2.greyc.fr/?q=node/22.

Dates and versions

hal-01192959, version 1 (04-09-2015)

Identifiers

Cite

Peggy Cellier, Thierry Charnois, Marc Plantevit, Christophe Rigotti, Bruno Crémilleux, et al. Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts. Journal of Biomedical Semantics, 2015, 6, pp.1-27. ⟨10.1186/s13326-015-0023-3⟩. ⟨hal-01192959⟩
1094 views
0 downloads
