Identificación de nombres de genes en la literatura biomédica (Identification of gene names in the biomedical literature)
Main Authors: | Galvez, Carmen, De-Moya-Anegón, Félix |
---|---|
Other Authors: | Guerrero-Bote, Vicente |
Format: | Proceeding PeerReviewed application/pdf |
Language: | es |
Published: | Open Institute of Knowledge, 2006 |
Subjects: | |
Online Access: | http://eprints.rclis.org/8817/1/Identificacin_de_Nombres_de_Genes_en_la_Literatura_Biomdica.pdf http://eprints.rclis.org/8817/ |
Table of Contents:
- Identifying gene terms in the biomedical literature is enormously complex. With the discovery of huge quantities of genes and the Human Genome Project (HGP), scientists have been left without easy and intuitive names. Gene names in genomic information vary in many ways because they are not standardized. Nomenclature and ontological specifications are valuable for processing, and efforts have been made toward the systematic naming of genes, but the difficulty persists. Procedures that resolve these problems would benefit research on molecular pathways, the extraction of gene-gene and gene-disease interactions, the delimitation of the structure of the genomic research domain through gene-document relations, and the discovery of knowledge hidden in the biomedical literature. Our proposal relies on approximate pattern-matching techniques, adopted from natural language processing (NLP), to find and filter matches of gene variants. To perform the gene matching, we apply finite-state transducers (FSTs). To implement our prototype system, we used publicly available gene and text databases, such as FlyBase (a biological database of the Drosophila genome projects) and PubMed (US National Library of Medicine).
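
The variant matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' prototype: the class `GeneTransducer`, the separator set, and the hand-written lexicon are all assumptions made for illustration, and nothing is loaded from FlyBase or PubMed. The sketch builds a small deterministic transducer (a trie) whose input side reads a surface variant while ignoring case and separators, and whose output side emits a canonical gene symbol at accepting states.

```python
from typing import Dict, List, Tuple

# Characters treated as optional separators inside a gene name (an assumption).
SEPARATORS = set(" -_/.")


class GeneTransducer:
    """Trie-shaped finite-state transducer: surface variant -> canonical symbol."""

    def __init__(self) -> None:
        self.transitions: List[Dict[str, int]] = [{}]  # state -> {char: next state}
        self.outputs: Dict[int, str] = {}              # accepting state -> symbol

    def add(self, variant: str, canonical: str) -> None:
        """Register one name or synonym; case and separators are normalized away."""
        state = 0
        for ch in variant:
            if ch in SEPARATORS:
                continue
            ch = ch.lower()
            nxt = self.transitions[state].get(ch)
            if nxt is None:
                nxt = len(self.transitions)
                self.transitions.append({})
                self.transitions[state][ch] = nxt
            state = nxt
        self.outputs[state] = canonical

    def find_genes(self, text: str) -> List[Tuple[str, str]]:
        """Return (surface form, canonical symbol) pairs, longest match first."""
        hits: List[Tuple[str, str]] = []
        i, n = 0, len(text)
        while i < n:
            # Only start a match at the first character of a word.
            if not text[i].isalnum() or (i > 0 and text[i - 1].isalnum()):
                i += 1
                continue
            state, best, j = 0, None, i
            while j < n:
                ch = text[j]
                if ch not in SEPARATORS:
                    nxt = self.transitions[state].get(ch.lower())
                    if nxt is None:
                        break
                    state = nxt
                    at_boundary = j + 1 == n or not text[j + 1].isalnum()
                    if state in self.outputs and at_boundary:
                        best = (j + 1, self.outputs[state])
                j += 1
            if best is not None:
                end, canonical = best
                hits.append((text[i:end], canonical))
                i = end
            else:
                i += 1
        return hits


def build_demo_lexicon() -> GeneTransducer:
    """Tiny illustrative lexicon of Drosophila gene names (hand-written)."""
    fst = GeneTransducer()
    for name, symbol in [
        ("decapentaplegic", "dpp"), ("dpp", "dpp"),
        ("wingless", "wg"), ("wg", "wg"),
        ("Sex lethal", "Sxl"), ("Sxl", "Sxl"),
    ]:
        fst.add(name, symbol)
    return fst


sentence = "Expression of Sex-lethal and DPP was reduced in wingless mutants."
print(build_demo_lexicon().find_genes(sentence))
```

Ignoring case and separators deliberately over-generates (for example, "w g" would match `wg`), so a real system would need a filtering step after matching, in line with the "find and filter" wording of the abstract.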