Desarrollo de un sistema de clasificación automática de contenidos en medios de comunicación españoles y mexicanos = Development of an automatic classification of content system in Spaniard-Mexican mass media
Main Author: | Blázquez-Ochando, Manuel |
---|---|
Format: | Proceeding PeerReviewed Book |
Language: | es |
Published: | Universidad Complutense de Madrid, 2012 |
Subjects: | |
Online Access: |
http://eprints.rclis.org/19031/1/9o-seminario-hispanomexicano-manuel-blazquez-ochando.pdf http://eprints.rclis.org/19031/ |
Table of Contents:
- The objective of this research is to develop an automatic classification system for the contents retrieved through the Resync platform, which specializes in the study of media information sources. The research is justified by the lack of automated methods for organizing the information gathered and by the need to examine the thematic categories that the media address in each country. To resolve these problems, we transform the Eurovoc multilingual thesaurus into a pseudo-ontology vocabulary that is used to qualify the documentary corpus. The test collection contains 400,000 items from Mexican and Spanish media published during June and July 2011. Additionally, five automatic classification algorithms are designed and tested, covering exact-match queries and generic classification with the vocabulary above, so that they can be matched against the test collection. The quantitative results of the experiment are reported in full, and they show a progressive increase in the percentage of classified content, which depends on the precision of the algorithm and its conditions. Finally, the basis is laid for a qualitative evaluation of the classification made by the system, in order to refine the process described here.
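The vocabulary-based classification described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the record does not describe the five algorithms in detail, so the code below only shows one plausible form of exact-match classification against a controlled vocabulary. The names `VOCABULARY`, `classify` and `classified_share` are hypothetical, and the sample terms are invented for illustration rather than taken from Eurovoc.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary: category label -> descriptor terms.
# These entries are illustrative only and are not taken from Eurovoc.
VOCABULARY = {
    "economy": ["inflation", "public debt"],
    "health": ["vaccine", "hospital"],
}


def classify(text, vocabulary):
    """Return the categories whose terms occur in text, most frequent first."""
    lowered = text.lower()
    counts = Counter()
    for category, terms in vocabulary.items():
        for term in terms:
            # Exact match on whole words or phrases, case-insensitive.
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            hits = len(re.findall(pattern, lowered))
            if hits:
                counts[category] += hits
    return [category for category, _ in counts.most_common()]


def classified_share(documents, vocabulary):
    """Return the fraction of documents that received at least one category."""
    if not documents:
        return 0.0
    classified = sum(1 for doc in documents if classify(doc, vocabulary))
    return classified / len(documents)
```

Under these assumptions, `classified_share` corresponds to the percentage of classified content mentioned in the abstract: a stricter or looser matching rule inside `classify` would change how many items receive at least one category.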