Indonesian News Corpus
Main Author: | RAHUTOMO, FAISAL |
---|---|
Other Authors: | MIQDAD MUADZ MUZAD, AAD |
Format: | Dataset |
Published: | Mendeley, 2018 |
Subjects: | |
Online Access: | https://data.mendeley.com/datasets/2zpbjs22k3 |
Table of Contents:
- This corpus contains 150,466 news articles derived from several freely accessible Indonesian news websites. The corpus is designated for research purposes only. The news websites are:
  - kompas.com is a registered trademark of PT. Kompas Cyber Media. https://inside.kompas.com/about-us
  - tempo.co is a registered trademark of PT INFO MEDIA DIGITAL. https://www.tempo.co/about
  - merdeka.com is a registered trademark of PT KAPAN LAGI DOT COM NETWORKS. https://www.merdeka.com/company/tentang-kami.html
  - republika.co.id is a registered trademark of PT Republika Media Mandiri. https://www.republika.co.id/page/about
  - viva.co.id is a registered trademark of PT. Viva Media Baru. https://www.viva.co.id/tentang-kami
  - tribunnews.com is a registered trademark of PT Tribun Digital Online. http://www.tribunnews.com/about-us

  The corpus is part of the bachelor's thesis work of Aad Miqdad Muadz Muzad, carried out under the supervision of Faisal Rahutomo. We crawled several categories of these websites over 6 months, from July 2015 to December 2015.