Mid- and Base-sized Bio-ELECTRA pre-trained biomedical deep learning language representation models

Main Author: Burak Ozyurt
Format: Software
Published: 2021
Online Access: https://zenodo.org/record/4699034
Table of Contents:
  • Mid- and Base-sized Bio-ELECTRA pre-trained biomedical deep learning language representation models. The models are pre-trained on the 2021 baseline PubMed abstracts and PMC open access papers. For details, see 'Detecting Anatomical and Functional Connectivity Relations in Biomedical Literature via Language Representation Models', to appear in SDP 2021 @ NAACL-HLT 2021.

    Model                     Params  Architecture                          Steps  Train Time/Hardware
    Bio-ELECTRA Mid           50M     hidden: 512, layers: 12               1.2M   6.5d on 8 TPUv3s
    Bio-ELECTRA Base          110M    hidden: 768, layers: 12               1.2M   12.5d on 8 TPUv3s
    Bio-ELECTRA Mid-tall      88M     hidden: 512, layers: 24, batch: 128   1M     5.5d on 8 TPUv3s
    Bio-ELECTRA Mid Combined  50M     hidden: 512, layers: 12               1.2M   6.5d on 8 TPUv3s
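
    The Architecture column maps directly onto an ELECTRA model configuration. Below is a minimal sketch, not part of the record, showing how the three distinct architectures in the table could be instantiated with the Hugging Face Transformers library (PyTorch backend). Attention head counts, feed-forward sizes, embedding size, and vocabulary size are assumptions here (standard ELECTRA ratios and the 30522-token default vocabulary), so the printed parameter counts only approximate the Params column.

        # Minimal sketch (not part of the record): ELECTRA configurations
        # matching the table above. Head counts, feed-forward sizes,
        # embedding size, and vocab size are assumed, not stated in the record.
        from transformers import ElectraConfig, ElectraModel

        configs = {
            "Bio-ELECTRA Mid": ElectraConfig(
                embedding_size=512, hidden_size=512, num_hidden_layers=12,
                num_attention_heads=8, intermediate_size=2048),
            "Bio-ELECTRA Base": ElectraConfig(
                embedding_size=768, hidden_size=768, num_hidden_layers=12,
                num_attention_heads=12, intermediate_size=3072),
            "Bio-ELECTRA Mid-tall": ElectraConfig(
                embedding_size=512, hidden_size=512, num_hidden_layers=24,
                num_attention_heads=8, intermediate_size=2048),
        }

        for name, cfg in configs.items():
            # Randomly initialized skeleton; the trained weights come from
            # the archive at the Online Access URL above.
            model = ElectraModel(cfg)
            n_params = sum(p.numel() for p in model.parameters())
            print(f"{name}: ~{n_params / 1e6:.0f}M parameters")

    Loading the trained weights themselves depends on the checkpoint format inside the downloaded archive; the original ELECTRA code releases TensorFlow checkpoints, which would need a conversion step before ElectraModel.from_pretrained can read them.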