Análisis léxico sobre los tweets de Twitter
Main Authors: | Bográn, Astrid-Paola, Alonso-Berrocal, José-Luis, G. Figuerola, Carlos |
---|---|
Other Authors: | Cruz-Benito, Juan, García-Holgado, Alicia, García-Sánchez, Sergio, Hernández-Alfageme, Daniel, Navarro-Cáceres, María, Vega-Ruiz, Roberto |
Format: | Proceeding PeerReviewed Book |
Language: | es |
Published: | Departamento de Informática y Automática. Facultad de Ciencias. Universidad de Salamanca, 2013 |
Subjects: | |
Online Access: | http://eprints.rclis.org/29304/1/20132_bogran2013analisis.pdf http://eprints.rclis.org/29304/ |
Summary:
- This paper presents an approach to lexical analysis of tweets from Twitter. It describes the development of a web application that connects to Twitter and uses a web-based text classifier to discover the essential characteristics of selected tweets, either individually or in bulk. Processing runs in real time, or the content is stored in a database so that the user can reprocess the tweets later. Stemming and tokenization techniques help produce cleaner tweets with less noise. For classification, the Naïve Bayes algorithm is proposed, and several XML dictionaries were created covering the areas of Science and Technology, along with dictionaries that help identify stop words.
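
The pipeline summarized above (tokenization, stop-word removal, stemming, then Naïve Bayes classification) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word set, the crude suffix-stripping `stem` function, and the `NaiveBayes` class are all hypothetical stand-ins, and the paper's XML dictionaries are replaced here by a hard-coded set and a few made-up training tweets.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the paper stores its own in XML dictionaries.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "is", "and", "for", "on"}


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and return its word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-záéíóúñü]+", text)


def stem(token):
    """Crude suffix stripping, an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Tokenize, remove stop words, and stem what remains."""
    return [stem(t) for t in tokenize(tweet) if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()               # tweets seen per label
        self.word_counts = defaultdict(Counter)   # token counts per label
        self.vocab = set()

    def train(self, tweet, label):
        self.doc_counts[label] += 1
        for tok in preprocess(tweet):
            self.word_counts[label][tok] += 1
            self.vocab.add(tok)

    def classify(self, tweet):
        tokens = preprocess(tweet)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training tweets for the two subject areas named in the summary.
clf = NaiveBayes()
clf.train("New telescope images of the galaxy http://t.co/x", "science")
clf.train("Physics experiment measures particles", "science")
clf.train("New smartphone released with faster processor", "technology")
clf.train("Software update for the laptop @vendor", "technology")
print(clf.classify("galaxy particles experiment"))
print(clf.classify("laptop processor update"))
```

Working in log space avoids numeric underflow when many token probabilities are multiplied, and add-one smoothing keeps an unseen token from forcing a class probability to zero. A production system would likely swap the suffix stripper for an established stemmer suited to the tweets' language.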