Joint Distribution pada Weighted Majority Vote (WMV) untuk Peningkatan Kinerja Sentiment Analysis Tersupervisi pada Dataset Twitter

Main Author: Rintyarna, Bagus Setya
Format: Article info application/pdf eJournal
Language: ind
Published: Fakultas Ilmu Komputer, Universitas Brawijaya, 2022
Online Access: http://jtiik.ub.ac.id/index.php/jtiik/article/view/6185
http://jtiik.ub.ac.id/index.php/jtiik/article/view/6185/pdf
ctrlnum article-6185
fullrecord <?xml version="1.0"?> <dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><title lang="id">Joint Distribution pada Weighted Majority Vote (WMV) untuk Peningkatan Kinerja Sentiment Analysis Tersupervisi pada Dataset Twitter</title><creator lang="id">Rintyarna, Bagus Setya</creator><description lang="id">Sentiment analysis adalah teknik komputasi text mining berbasis natural language processing (NLP) untuk mengekstraksi pendapat seseorang yang diungkapkan dalam platform online, termasuk dalam platform microblogging Twitter, salah satu platform microblogging yang paling populer digunakan di Indonesia. Ada dua pendekatan yang umum digunakan dalam teknik sentiment analysis yaitu pendekatan berbasis machine learning (ML) dan pendekatan berbasis sentiment lexicon (SL). Fokus penelitian ini adalah untuk pengembangan teknik sentiment analysis berbasis machine learning yang disebut juga teknik tersupervisi pada dataset Twitter. Sebagian besar sentiment analysis pada dataset Twitter berbahasa Indonesia mengandalkan single machine learning algorithm. Penelitian ini menggabungkan kinerja berbagai algoritma/experts seraya mengurangi tingkat kesalahan klasifikasi dengan meng-update bobot secara dinamis menggunakan weighted majority vote (WMV) berbasis joint distribution dari Bayesian Network. Pada tahap pertama, data di grabbing dari Twitter dengan 3 hashtag terkait Covid-19 sebagai data eksperimen. Selanjutnya kinerja weighted majority vote secara ekstensif dibandingkan dengan 4 metode baseline sebagai pembanding, yaitu: Na&#xEF;ve Bayes, Gaussian Na&#xEF;ve Bayes, Multinomial Na&#xEF;ve Bayes dan Majority Vote dari ketiga single classifier tersebut. Metrics kinerja yang digunakan adalah precision, recall, F-measure, accuracy dan Matthews correlation coefficient (MCC). 
Dalam eksperimen, terbukti bahwa WMV mampu meningkatkan kinerja sentiment analysis pada ketiga topik dataset dengan evaluator berbagai metrics kinerja sentiment analysis.&#xA0;Abstract: Sentiment analysis is a computational text mining technique based on natural language processing (NLP) that extracts opinions expressed on online platforms, including Twitter, one of the most popular microblogging platforms in Indonesia. Two approaches are commonly used in sentiment analysis: the machine learning (ML) based approach and the sentiment lexicon (SL) based approach. This research focuses on developing machine learning-based, also called supervised, sentiment analysis techniques for Twitter datasets. Most sentiment analysis on Indonesian-language Twitter datasets relies on a single machine learning algorithm. This study combines the performance of several algorithms (experts) while reducing the misclassification rate by dynamically updating the weights using a weighted majority vote (WMV) based on the joint distribution of a Bayesian Network. In the first stage, data were collected from Twitter using 3 hashtags related to Covid-19 as experimental data. The performance of the weighted majority vote was then extensively compared with 4 baseline methods: Na&#xEF;ve Bayes, Gaussian Na&#xEF;ve Bayes, Multinomial Na&#xEF;ve Bayes, and a Majority Vote of the three single classifiers. The performance metrics used are precision, recall, F-measure, accuracy, and the Matthews correlation coefficient (MCC). 
In experiments, it is proven that WMV is able to improve sentiment analysis performance on the three dataset topics with various evaluators of sentiment analysis performance metrics.</description><publisher lang="en">Fakultas Ilmu Komputer, Universitas Brawijaya</publisher><date>2022-10-31</date><type>Journal:Article</type><type>Other:info:eu-repo/semantics/publishedVersion</type><type>File:application/pdf</type><identifier>http://jtiik.ub.ac.id/index.php/jtiik/article/view/6185</identifier><identifier>10.25126/jtiik.2022956185</identifier><source lang="id">Jurnal Teknologi Informasi dan Ilmu Komputer; Vol 9 No 5: Oktober 2022; 1083-1090</source><source lang="en">Jurnal Teknologi Informasi dan Ilmu Komputer; Vol 9 No 5: Oktober 2022; 1083-1090</source><source>2528-6579</source><source>2355-7699</source><source>10.25126/jtiik.202295</source><language>ind</language><relation>http://jtiik.ub.ac.id/index.php/jtiik/article/view/6185/pdf</relation><rights lang="en">Hak Cipta (c) 2022 Jurnal Teknologi Informasi dan Ilmu Komputer</rights><recordID>article-6185</recordID></dc>
language ind
format Journal:Article
Journal
Other:info:eu-repo/semantics/publishedVersion
Other
File:application/pdf
File
Journal:eJournal
author Rintyarna, Bagus Setya
title Joint Distribution pada Weighted Majority Vote (WMV) untuk Peningkatan Kinerja Sentiment Analysis Tersupervisi pada Dataset Twitter
publisher Fakultas Ilmu Komputer, Universitas Brawijaya
publishDate 2022
isbn 9782022956183
url http://jtiik.ub.ac.id/index.php/jtiik/article/view/6185
http://jtiik.ub.ac.id/index.php/jtiik/article/view/6185/pdf
contents Sentiment analysis is a computational text mining technique based on natural language processing (NLP) that extracts opinions expressed on online platforms, including Twitter, one of the most popular microblogging platforms in Indonesia. Two approaches are commonly used in sentiment analysis: the machine learning (ML) based approach and the sentiment lexicon (SL) based approach. This research focuses on developing machine learning-based, also called supervised, sentiment analysis techniques for Twitter datasets. Most sentiment analysis on Indonesian-language Twitter datasets relies on a single machine learning algorithm. This study combines the performance of several algorithms (experts) while reducing the misclassification rate by dynamically updating the weights using a weighted majority vote (WMV) based on the joint distribution of a Bayesian Network. In the first stage, data were collected from Twitter using 3 hashtags related to Covid-19 as experimental data. The performance of the weighted majority vote was then extensively compared with 4 baseline methods: Naïve Bayes, Gaussian Naïve Bayes, Multinomial Naïve Bayes, and a Majority Vote of the three single classifiers. The performance metrics used are precision, recall, F-measure, accuracy, and the Matthews correlation coefficient (MCC). The experiments show that WMV improves sentiment analysis performance on all three dataset topics across the various performance metrics.
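The abstract describes combining several classifiers ("experts") through a weighted majority vote whose weights are updated dynamically. The record does not give the article's joint-distribution weighting from the Bayesian Network, so the sketch below substitutes the classical multiplicative-penalty weighted majority scheme as a stand-in. The function names, the penalty factor `beta`, and the toy votes are all hypothetical and are not taken from the article.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight among the experts' votes."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def update_weights(predictions, weights, true_label, beta=0.5):
    """Multiply the weight of every expert that voted wrongly by beta (0 < beta < 1)."""
    return [w * beta if p != true_label else w
            for p, w in zip(predictions, weights)]


def run_wmv(rounds, n_experts, beta=0.5):
    """Process (expert_predictions, true_label) pairs in order.

    Returns the ensemble prediction made in each round and the final weights.
    Assumption: the true label becomes available after each prediction.
    """
    weights = [1.0] * n_experts
    outputs = []
    for predictions, true_label in rounds:
        outputs.append(weighted_vote(predictions, weights))
        weights = update_weights(predictions, weights, true_label, beta)
    return outputs, weights


# Hypothetical toy data: three experts labelling three tweets.
rounds = [
    (["pos", "neg", "neg"], "pos"),
    (["pos", "neg", "neg"], "pos"),
    (["neg", "pos", "pos"], "neg"),
]
outputs, weights = run_wmv(rounds, n_experts=3, beta=0.25)
print(outputs, weights)
```

In this toy run an unweighted majority vote would follow the two unreliable experts in every round, whereas the weighted vote shifts toward the first expert once the other two have been penalised.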
id IOS577.article-6185
institution Universitas Brawijaya
institution_id 30
institution_type library:university
library
library Perpustakaan Universitas Brawijaya
library_id 480
collection Jurnal Teknologi Informasi dan Ilmu Komputer
repository_id 577
subject_area Program Komputer dan Teknologi Informasi
city MALANG
province JAWA TIMUR
repoId IOS577
first_indexed 2024-06-02T20:42:51Z
last_indexed 2024-06-02T20:42:51Z
recordtype dc