COVID Indonesia Project: Data Training Preparation
Main Author: | Nugroho, Robertus Setiawan Aji |
---|---|
Format: | Monograph NonPeerReviewed Book |
Language: | eng |
Published: | UNIKA Soegijapranata |
Subjects: | |
Online Access: | http://repository.unika.ac.id/23533/1/penelitian_COVIDprepare.pdf http://repository.unika.ac.id/23533/ |
Table of Contents:
- To help the government of Indonesia combat COVID-19, CSIRO Data61 and Informatics Engineering of Soegijapranata Catholic University are collaborating to develop an epidemic intelligence system for tracing and tracking COVID-19 using social media. The system leverages big data, artificial intelligence, and natural language processing to provide a combined intelligence service for authorities. This paper reports on the part of the project that prepares the training data needed to process real-world data. We annotate 4000 Indonesian-language tweets and classify them according to whether they contain a personal health mention. The annotation process was successful, with a kappa value above 0.7. We also investigate whether the Indonesian data can be augmented with English-language data translated by the Google and Bing translation services, using a translation-picking experiment. We find that augmentation allows the training dataset to be extended dynamically.
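
The kappa value reported above measures inter-annotator agreement corrected for chance. The record does not say which kappa variant or how many annotators were used, so the following is only a minimal sketch that assumes two annotators and Cohen's kappa. The function name `cohen_kappa`, the labels `"phm"` and `"other"`, and the toy annotations are all hypothetical and are not taken from the project data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators and nominal labels; the source does not
    state which kappa variant the project actually used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal label frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy annotations (not project data): "phm" marks a tweet
# judged to contain a personal health mention.
annotator_1 = ["phm", "phm", "other", "other", "phm",
               "other", "other", "phm", "other", "other"]
annotator_2 = ["phm", "phm", "other", "other", "phm",
               "other", "phm", "phm", "other", "other"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))
```

In this toy example the annotators agree on 9 of 10 tweets (observed agreement 0.9) while chance agreement is 0.5, which gives a kappa of 0.8. With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the usual choice instead.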