La recuperación de información en español y la normalización de términos [Information retrieval in Spanish and term normalization]

Main Authors: G.-Figuerola, Carlos, Zazo, Ángel F., Rodríguez-Vázquez-de-Aldana, Emilio, Alonso-Berrocal, José-Luis
Format: Journal PeerReviewed application/pdf
Language: es
Published: 2004
Subjects:
Online Access: http://eprints.rclis.org/13961/1/figuerola2004recuperacion.pdf
http://eprints.rclis.org/13961/
Contents:
  • Most Information Retrieval systems rely on counts of the frequencies of the words that occur in documents. Such counts require these terms to be normalized. A simple normalization of characters (upper/lowercase, accents and other diacritics) seems insufficient, since many words, through morphological inflection or derivation, could be grouped under a single form when their meanings are very close. Several normalization algorithms are analyzed and tested experimentally to evaluate their effectiveness.
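
The character-level normalization mentioned in the abstract can be sketched as follows. This is a minimal illustration, not one of the algorithms evaluated in the paper; the function names `normalize_chars` and `term_counts` and the sample text are assumptions introduced here. Note that stripping all combining marks also maps "ñ" to "n", which a real Spanish system might want to avoid.

```python
import unicodedata
from collections import Counter


def normalize_chars(word: str) -> str:
    """Lowercase a word and strip its diacritics (hypothetical helper)."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word.lower())
    # Drop the combining marks (Unicode category "Mn"), keep the base letters.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def term_counts(text: str) -> Counter:
    """Count whitespace-separated terms after character normalization."""
    return Counter(normalize_chars(token) for token in text.split())


# Illustrative input: three spellings of one word plus its plural.
counts = term_counts("Canción canción CANCION canciones")
print(counts)
```

In this sketch the three spellings of "canción" collapse into one term, but the plural "canciones" is still counted separately. That gap is the one the abstract points to: grouping inflected or derived forms needs stemming or lemmatization on top of character normalization.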