Analysing Kinship in Severe Acute Respiratory Syndrome Coronavirus 2 DNA Sequences Based on Hierarchical and K-Means Clustering Methods Using Multiple Encoding Vector

Main Authors: Banjarnahor, Evander; Bustamam, Alhadi; Siswantining, Titin; Tampubolon, Patuan (all: Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Indonesia, Depok, 16424, Indonesia)
Other Authors: Enago
Format: Article info application/pdf eJournal
Language: eng
Published: International Journal on Advanced Science, Engineering and Information Technology, 2022
Subjects:
Online Access: http://insightsociety.org/ojaseit/index.php/ijaseit/article/view/15582
http://insightsociety.org/ojaseit/index.php/ijaseit/article/view/15582/pdf_2273
http://insightsociety.org/ojaseit/index.php/ijaseit/article/downloadSuppFile/15582/2762
http://insightsociety.org/ojaseit/index.php/ijaseit/article/downloadSuppFile/15582/2763
http://insightsociety.org/ojaseit/index.php/ijaseit/article/downloadSuppFile/15582/2764
http://insightsociety.org/ojaseit/index.php/ijaseit/article/downloadSuppFile/15582/2765
Table of Contents:
  • According to World Health Organization data from mid-April 2021, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) had infected more than 134.9 million people worldwide. The virus attacks the human respiratory system and can cause lung infections and even death. More than 2.9 million people worldwide have died from the infection. In Indonesia, more than 1.5 million people have been infected and 42.5 thousand have died. Given these figures, a kinship analysis of the coronavirus is important for helping to reduce its spread. The kinship of the COVID-19 virus and its spread can be identified by constructing a phylogenetic tree and by clustering. This study uses the Multiple Encoding Vector method to analyse the sequences and the Euclidean distance to build the distance matrix. Hierarchical clustering is then used to determine the number of initial centroids, which the K-Means clustering method subsequently uses to analyse kinship among the SARS-CoV-2 DNA sequences. The samples are SARS-CoV-2 DNA sequences taken from several infected countries. The simulation results indicate that the ancestors of SARS-CoV-2 came from China, and that the closest ancestors of the COVID-19 virus in Indonesia came from India. The SARS-CoV-2 DNA sequences formed nine clusters, of which the sixth has the most members.
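
    The pipeline described in the abstract (encode each sequence as a vector, measure Euclidean distances, use hierarchical clustering to seed K-Means) can be illustrated with a minimal sketch. Several parts are assumptions rather than the authors' implementation: the `encode` function is a simple nucleotide-frequency stand-in and is not the Multiple Encoding Vector method, average linkage is an arbitrary choice, the number of clusters `k` is passed in by hand, and the toy sequences are invented for illustration only.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def encode(seq):
    """Stand-in encoding (assumption): relative frequency of A, C, G, T.
    This is NOT the paper's Multiple Encoding Vector method."""
    seq = seq.upper()
    return [seq.count(base) / len(seq) for base in "ACGT"]


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def hierarchical(points, k):
    """Average-linkage agglomerative clustering, stopped at k clusters.
    Returns a list of clusters, each a list of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters


def kmeans(points, centroids, max_iter=100):
    """Plain K-Means starting from the given centroids."""
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Recompute centroids; keep the old one if a cluster ends up empty.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def cluster_sequences(seqs, k):
    """Encode sequences, seed centroids hierarchically, then refine with K-Means."""
    points = [encode(s) for s in seqs]
    seeds = [mean([points[i] for i in c]) for c in hierarchical(points, k)]
    return kmeans(points, seeds)


# Toy sequences (hypothetical, not real SARS-CoV-2 data).
toy = ["AAAAAAAT", "AAAAAATT", "GGGGGGGC", "GGGGGGCC"]
labels, centroids = cluster_sequences(toy, k=2)
print(labels)
```

    Seeding K-Means from the hierarchical groups makes the result deterministic, which is one plausible reason for combining the two methods; the abstract itself only states that hierarchical clustering determines the number of initial centroids.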