ctrlnum 164054
fullrecord <?xml version="1.0"?> <dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><creator>Varsos, Constantinos</creator><creator>Patkos, Theodore</creator><creator>Oulas, Anastasis</creator><creator>Pavloudi, Christina</creator><creator>Gougousis, Alexandros</creator><creator>Ijaz, Umer</creator><creator>Filiopoulou, Irene</creator><creator>Pattakos, Nikolaos</creator><creator>Vanden Berghe, Edward</creator><creator>Fern&#xE1;ndez-Guerra, Antonio</creator><creator>Faulwetter, Sarah</creator><creator>Chatzinikolaou, Eva</creator><creator>Pafilis, Evangelos</creator><creator>Bekiari, Chryssoula</creator><creator>Doerr, Martin</creator><creator>Arvanitidis, Christos</creator><date>2016-11-01</date><description>Parallel data manipulation using R has previously been addressed by members of the R community; however, most of these studies produce ad hoc solutions that are not readily available to the average R user. Our targeted users, ranging from expert ecologists and microbiologists to computational biologists, often experience difficulties in finding optimal ways to exploit the full capacity of their computational resources. In addition, improving the performance of commonly used R scripts becomes increasingly difficult, especially with large datasets. Furthermore, the implementations described here can be of significant interest to expert bioinformaticians or R developers. Therefore, our goals can be summarized as: (i) description of a complete methodology for the analysis of large datasets by combining capabilities of diverse R packages, (ii) presentation of their application through a virtual R laboratory (RvLab) that makes execution of complex functions and visualization of results easy and readily available to the end-user. 
In this paper, the novelty stems from implementations of parallel methodologies which rely on the processing of data on different levels of abstraction and the availability of these processes through an integrated portal. Parallel implementation R packages, such as the pbdMPI (Programming with Big Data &#x2013; Interface to MPI) package, are used to implement Single Program Multiple Data (SPMD) parallelization on primitive mathematical operations, allowing for interplay with functions of the vegan package. The dplyr and RPostgreSQL R packages are further integrated, offering connections to dataframe-like objects (databases) as secondary storage solutions whenever memory demands exceed available RAM resources. The RvLab is running on a PC cluster, using R version 3.1.2 (2014-10-31) on an x86_64-pc-linux-gnu (64-bit) platform, and offers an intuitive virtual environment interface enabling users to perform analysis of ecological and microbial communities based on optimized vegan functions. A beta version of the RvLab is available after registration at: https://portal.lifewatchgreece.eu/</description><identifier>https://zenodo.org/record/164054</identifier><identifier>10.3897/BDJ.4.e8357</identifier><identifier>oai:zenodo.org:164054</identifier><publisher>Pensoft Publishers</publisher><relation>url:https://zenodo.org/communities/biosyslit</relation><rights>info:eu-repo/semantics/openAccess</rights><rights>https://creativecommons.org/licenses/by/4.0/legalcode</rights><source>Biodiversity Data Journal 4 e8357</source><subject>Parallel data manipulation</subject><subject>R</subject><subject>pbdMPI package</subject><subject>Single Program Multiple Data (SPMD) parallelization</subject><subject>virtual environment</subject><subject>vegan package</subject><subject>biodiversity analyses</subject><subject>ecological analyses</subject><title>Optimized R functions for analysis of ecological community data using the R virtual laboratory 
(RvLab)</title><type>Journal:Article</type><type>Journal:Article</type><recordID>164054</recordID></dc>
format Journal:Article
Journal
Journal:eJournal
author Varsos, Constantinos
Patkos, Theodore
Oulas, Anastasis
Pavloudi, Christina
Gougousis, Alexandros
Ijaz, Umer
Filiopoulou, Irene
Pattakos, Nikolaos
Vanden Berghe, Edward
Fernández-Guerra, Antonio
Faulwetter, Sarah
Chatzinikolaou, Eva
Pafilis, Evangelos
Bekiari, Chryssoula
Doerr, Martin
Arvanitidis, Christos
title Optimized R functions for analysis of ecological community data using the R virtual laboratory (RvLab)
publisher Pensoft Publishers
publishDate 2016
topic Parallel data manipulation
R
pbdMPI package
Single Program Multiple Data (SPMD) parallelization
virtual environment
vegan package
biodiversity analyses
ecological analyses
url https://zenodo.org/record/164054
contents Parallel data manipulation using R has previously been addressed by members of the R community; however, most of these studies produce ad hoc solutions that are not readily available to the average R user. Our targeted users, ranging from expert ecologists and microbiologists to computational biologists, often experience difficulties in finding optimal ways to exploit the full capacity of their computational resources. In addition, improving the performance of commonly used R scripts becomes increasingly difficult, especially with large datasets. Furthermore, the implementations described here can be of significant interest to expert bioinformaticians or R developers. Therefore, our goals can be summarized as: (i) description of a complete methodology for the analysis of large datasets by combining capabilities of diverse R packages, (ii) presentation of their application through a virtual R laboratory (RvLab) that makes execution of complex functions and visualization of results easy and readily available to the end-user. In this paper, the novelty stems from implementations of parallel methodologies which rely on the processing of data on different levels of abstraction and the availability of these processes through an integrated portal. Parallel implementation R packages, such as the pbdMPI (Programming with Big Data – Interface to MPI) package, are used to implement Single Program Multiple Data (SPMD) parallelization on primitive mathematical operations, allowing for interplay with functions of the vegan package. The dplyr and RPostgreSQL R packages are further integrated, offering connections to dataframe-like objects (databases) as secondary storage solutions whenever memory demands exceed available RAM resources. 
The RvLab is running on a PC cluster, using R version 3.1.2 (2014-10-31) on an x86_64-pc-linux-gnu (64-bit) platform, and offers an intuitive virtual environment interface enabling users to perform analysis of ecological and microbial communities based on optimized vegan functions. A beta version of the RvLab is available after registration at: https://portal.lifewatchgreece.eu/
id IOS17403.164054
institution Universitas PGRI Palembang
institution_id 189
institution_type library:university
library
library Perpustakaan Universitas PGRI Palembang
library_id 587
collection Marga Life in South Sumatra in the Past: Puyang Concept Sacrificed and Demythosized
repository_id 17403
city KOTA PALEMBANG
province SUMATERA SELATAN
repoId IOS17403
first_indexed 2022-07-26T03:59:27Z
last_indexed 2022-07-26T03:59:27Z
recordtype dc
merged_child_boolean 1
_version_ 1739482377393012736
score 17.538404