Text mining to support gene ontology curation and vice versa

Ruch, Patrick

doi:10.1007/978-1-4939-3743-1_6

2016

Abstract

In this chapter, we explain how text mining can support the curation of molecular biology databases dealing with protein functions. We also show how curated data can play a disruptive role in the development of text mining methods. We review a decade of efforts to improve the automatic assignment of descriptors from the Gene Ontology (GO), the reference ontology for the characterization of genes and gene products. To illustrate the high potential of this approach, we compare the performance of an automatic text categorizer and show a large improvement of +225% in both precision and recall on benchmarked data. We argue that automatic text categorization functions can ultimately be embedded into a Question-Answering (QA) system to answer questions related to protein functions. Because GO descriptors can be relatively long and specific, traditional QA systems cannot answer such questions. A new type of QA system, so-called Deep QA, which uses machine learning methods trained with curated contents, is thus emerging. Finally, future advances in text mining instruments depend directly on the availability of high-quality annotated contents at every curation step. Database workflows must start explicitly recording all the data they curate and ideally also some of the data they do not curate.

Details

Title

Text mining to support gene ontology curation and vice versa

Author(s)

Ruch, Patrick (Haute école de gestion de Genève, HES-SO Haute Ecole Spécialisée de Suisse Occidentale)

Scientific editor(s)

Dessimoz, Christophe ; University College London, London, UK ; Swiss Institute of Bioinformatics, Lausanne, Switzerland ; University of Lausanne, Lausanne, Switzerland
Škunca, Nives ; ETH Zurich, Zurich, Switzerland ; SIB Swiss Institute of Bioinformatics, Zurich, Switzerland ; University College London, London, UK

Date

2016-11

Published in

The gene ontology handbook

Published by

New York, Springer

Pagination and equivalents

Pp. 69-84

ISBN

978-1-4939-3741-7

DOI

https://doi.org/10.1007/978-1-4939-3743-1_6

ISSN

1064-3745

Series and no.

Methods in molecular biology, 1446

Keywords (free)

automatic text categorization ; gene ontology ; data curation ; databases ; data stewardship ; information storage and retrieval

Field

Economics and Services

School

HEG - Genève

Institute

CRAG - Centre de Recherche Appliquée en Gestion

The document appears in

Book chapters
Global