Staff View: Scalable bioinformatics workflows for growing '-omics' datasets on National Collaborative Research Infrastructure Strategy Facilities

Scalable bioinformatics workflows for growing '-omics' datasets on National Collaborative Research Infrastructure Strategy Facilities

Main Authors:	Chew, Tracy, Samaha, Georgina, Gustafsson, Ove J. R., Beecroft, Sarah, De La Pierre, Marco, Ward, Nigel, Sadsad, Rosemarie
Format:	info Proceeding Journal
Language:	eng
Published:	2021
Subjects:	Bioinformatics Workflows National Collaborative Research Infrastructure High Performance Computing Computational Biology eResearch 2021
Online Access:	https://zenodo.org/record/5587827

ctrlnum	5587827
fullrecord	<?xml version="1.0"?> <dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><creator>Chew, Tracy</creator><creator>Samaha, Georgina</creator><creator>Gustafsson, Ove J. R.</creator><creator>Beecroft, Sarah</creator><creator>De La Pierre, Marco</creator><creator>Ward, Nigel</creator><creator>Sadsad, Rosemarie</creator><date>2021-10-13</date><description>This presentation took place during a session at the eResearch Australasia 2021 Conference 11-15 October 2021. A recording of this presentation is available on the Australian BioCommons YouTube Channel. Abstract: Introduction Australian researchers rely on High Performance Computing (HPC) to process increasingly large life science and ‘-omics’ datasets. Bioinformatics workflows are often complex and, unlike traditional HPC, orchestrate multiple data, memory, I/O, and/or time-intensive compute tasks that mismatch local infrastructure paradigms. As a part of the Australian BioCommons, we have re-engineered popular bioinformatics workflows to enable scalable and efficient use of National Collaborative Research Infrastructure Strategy (NCRIS) eResearch facilities. Methods Re-engineered workflows and end-user resources were developed to meet FAIR data principles. Workflows identified through national surveys were optimised in collaboration with multiple research groups and national compute facility specialists. This ensured user-friendliness, applicability across research interests, datasets, and optimal use of local HPC architecture. Workflows are registered on WorkflowHub for findability and pipeline adoption is supported through training, user guides and documentation that follow FAIR recommendations by the Australian BioCommons. Results Developed workflows are available through the Australian BioCommons WorkflowHub space (https://workflowhub.eu/programmes/8). 
Between 2019-21, optimised workflows were successfully adopted by independent groups across pathogen, human, domestic animal, agricultural, and wildlife research. They have contributed to successful Australian competitive grants, national/local HPC allocation applications and research publications. Conclusions Creating efficient and scalable bioinformatics workflows on Australian compute infrastructure requires custom development to meld specific infrastructure hardware, usage and access policies with domain best practices. We are refining support and ongoing maintenance models and aim to enhance existing pipelines by adding auto-deployment capabilities and portability to other compute infrastructure accessible to Australian researchers.</description><identifier>https://zenodo.org/record/5587827</identifier><identifier>10.5281/zenodo.5587827</identifier><identifier>oai:zenodo.org:5587827</identifier><language>eng</language><relation>doi:10.5281/zenodo.5587826</relation><relation>url:https://zenodo.org/communities/australianbiocommons</relation><rights>info:eu-repo/semantics/openAccess</rights><rights>https://creativecommons.org/licenses/by/4.0/legalcode</rights><subject>Bioinformatics</subject><subject>Workflows</subject><subject>National Collaborative Research Infrastructure</subject><subject>High Performance Computing</subject><subject>Computational Biology</subject><subject>eResearch 2021</subject><title>Scalable bioinformatics workflows for growing '-omics' datasets on National Collaborative Research Infrastructure Strategy Facilities</title><type>Other:info:eu-repo/semantics/lecture</type><type>Journal:Proceeding</type><recordID>5587827</recordID></dc>
language	eng
format	Other:info:eu-repo/semantics/lecture Other Journal:Proceeding Journal Journal:Journal
author	Chew, Tracy Samaha, Georgina Gustafsson, Ove J. R. Beecroft, Sarah De La Pierre, Marco Ward, Nigel Sadsad, Rosemarie
title	Scalable bioinformatics workflows for growing '-omics' datasets on National Collaborative Research Infrastructure Strategy Facilities
publishDate	2021
topic	Bioinformatics Workflows National Collaborative Research Infrastructure High Performance Computing Computational Biology eResearch 2021
url	https://zenodo.org/record/5587827
contents	This presentation took place during a session at the eResearch Australasia 2021 Conference 11-15 October 2021. A recording of this presentation is available on the Australian BioCommons YouTube Channel. Abstract: Introduction Australian researchers rely on High Performance Computing (HPC) to process increasingly large life science and ‘-omics’ datasets. Bioinformatics workflows are often complex and, unlike traditional HPC, orchestrate multiple data, memory, I/O, and/or time-intensive compute tasks that mismatch local infrastructure paradigms. As a part of the Australian BioCommons, we have re-engineered popular bioinformatics workflows to enable scalable and efficient use of National Collaborative Research Infrastructure Strategy (NCRIS) eResearch facilities. Methods Re-engineered workflows and end-user resources were developed to meet FAIR data principles. Workflows identified through national surveys were optimised in collaboration with multiple research groups and national compute facility specialists. This ensured user-friendliness, applicability across research interests, datasets, and optimal use of local HPC architecture. Workflows are registered on WorkflowHub for findability and pipeline adoption is supported through training, user guides and documentation that follow FAIR recommendations by the Australian BioCommons. Results Developed workflows are available through the Australian BioCommons WorkflowHub space (https://workflowhub.eu/programmes/8). Between 2019-21, optimised workflows were successfully adopted by independent groups across pathogen, human, domestic animal, agricultural, and wildlife research. They have contributed to successful Australian competitive grants, national/local HPC allocation applications and research publications. 
Conclusions Creating efficient and scalable bioinformatics workflows on Australian compute infrastructure requires custom development to meld specific infrastructure hardware, usage and access policies with domain best practices. We are refining support and ongoing maintenance models and aim to enhance existing pipelines by adding auto-deployment capabilities and portability to other compute infrastructure accessible to Australian researchers.
id	IOS16997.5587827
institution	ZAIN Publications
institution_id	7213
institution_type	library:special library
library	Cognizance Journal of Multidisciplinary Studies
library_id	5267
collection	Cognizance Journal of Multidisciplinary Studies
repository_id	16997
subject_area	Multidisciplinary
city	Stockholm
province	INTERNATIONAL
shared_to_ipusnas_str	1
repoId	IOS16997
first_indexed	2022-06-06T05:19:57Z
last_indexed	2022-06-06T05:19:57Z
recordtype	dc
_version_	1734904963051552768
score	17.538404
