Dataset for deep-learning in-situ classification of HIV-1 virion morphology

Main Authors: Rey, Juan S., Li, Wen, Bryer, Alexander J., Beatson, Hagan, Lantz, Christian, Engelman, Alan N., Perilla, Juan R.
Format: info dataset Journal
Language: eng
Published: 2021
Subjects:
Online Access: https://zenodo.org/record/5149062
Table of Contents:
  • This dataset contains TEM micrographs of HIV-1 virion samples intended for classification and detection, as follows:

    HIV-1_virion_classification_backbone_dataset.zip: Contains 1806 .tif images of isolated HIV-1 virions extracted and augmented from TEM micrographs. The images are divided into training (1443 images) and validation (363 images) sets, and each of these is divided into folders labeled eccentric, immature, and mature:

      HIV-1_virion_classification_backbone_dataset/
        train/
          eccentric/
          immature/
          mature/
        val/
          eccentric/
          immature/
          mature/

    HIV-1_rcnn_dataset_full.zip: Contains 59 .tif TEM micrographs of HIV-1 samples as well as a matching .csv file recording the attributes of each viral instance and the coordinates of the rectangular region that contains it:

      region_data_<image_id>.csv:
        filename: Name of the image this csv refers to. (Ex: 0131001.png)
        file_size: Size (bytes) of the image this csv refers to. (Ex: 11755590)
        file_attributes: Specific attributes of the micrograph. (Ex: None)
        region_count: Number of viral instances detected in the micrograph. (Ex: 39)
        region_id: ID of a specific viral region. (Ex: 1)
        region_shape_attributes: Coordinates of the bounding box of <region_id> that contains a virion. (Ex: {"name":"rect","x":1022,"y":357,"width":225,"height":228})
        region_attributes: Classification of the virion enclosed in this region (eccentric/mature/immature). (Ex: {"particle_class":"mature"})

    The images are divided into training (46 images) and validation (13 images) sets, and each of these contains a folder for each micrograph:

      HIV-1_rcnn_dataset_full/
        train/
          0131001/
            0131001.png
            region_data_0131001.csv
          0131004/
            0131004.png
            region_data_0131004.csv
          ...
        val/
          0131002/
            0131002.png
            region_data_0131002.csv
          0131003/
            0131003.png
            region_data_0131003.csv
          ...
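
    The region_data_<image_id>.csv columns described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not the authors' loader: it assumes that each csv has a header row using the column names listed above and that region_shape_attributes and region_attributes are JSON strings quoted in the usual csv way. The helper name parse_region_rows and the sample row (built from the example values in this record) are illustrative only.

```python
import csv
import io
import json


def parse_region_rows(csv_file):
    """Yield (filename, region_id, box, particle_class) for each rectangular region.

    Assumes a header row with the column names given in the dataset description
    and JSON-encoded attribute cells (an assumption, not verified against the files).
    """
    for row in csv.DictReader(csv_file):
        shape = json.loads(row["region_shape_attributes"])
        attrs = json.loads(row["region_attributes"])
        if shape.get("name") != "rect":
            continue  # skip anything that is not a rectangular bounding box
        box = (shape["x"], shape["y"], shape["width"], shape["height"])
        yield row["filename"], int(row["region_id"]), box, attrs.get("particle_class")


# Synthetic one-row file assembled from the example values in the description.
sample = io.StringIO(
    "filename,file_size,file_attributes,region_count,region_id,"
    "region_shape_attributes,region_attributes\n"
    '0131001.png,11755590,None,39,1,'
    '"{""name"":""rect"",""x"":1022,""y"":357,""width"":225,""height"":228}",'
    '"{""particle_class"":""mature""}"\n'
)

regions = list(parse_region_rows(sample))
print(regions)
# [('0131001.png', 1, (1022, 357, 225, 228), 'mature')]
```

    For a real file, the same function could be called on open(path, newline=""); the box is returned as (x, y, width, height), matching the keys in the example.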
    The first dataset (HIV-1_virion_classification_backbone_dataset.zip) is intended for viral classification algorithms, while the second dataset (HIV-1_rcnn_dataset_full.zip) is intended for detection and segmentation algorithms (for example, RCNN). Applications of this dataset, as well as source code, can be found at https://github.com/Perilla-lab/TEMNet.
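
    For the classification dataset, the train/val and eccentric/immature/mature folder layout can be turned into a labeled sample list by walking the directory tree. The sketch below is one possible approach under stated assumptions and is not taken from the TEMNet source: the function names index_images and count_by_split_and_label are hypothetical, and the demo runs on a temporary directory of empty placeholder files rather than the real archive.

```python
import tempfile
from pathlib import Path

SPLITS = ("train", "val")
CLASSES = ("eccentric", "immature", "mature")


def index_images(root):
    """Return (path, split, label) for every .tif under root/<split>/<class>/."""
    root = Path(root)
    samples = []
    for split in SPLITS:
        for label in CLASSES:
            for path in sorted((root / split / label).glob("*.tif")):
                samples.append((path, split, label))
    return samples


def count_by_split_and_label(samples):
    """Count images per (split, label) pair."""
    counts = {}
    for _, split, label in samples:
        counts[(split, label)] = counts.get((split, label), 0) + 1
    return counts


# Demo on a throwaway tree that mimics the documented layout (placeholder files only).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "HIV-1_virion_classification_backbone_dataset"
    for split in SPLITS:
        for label in CLASSES:
            (root / split / label).mkdir(parents=True)
    (root / "train" / "mature" / "a.tif").touch()
    (root / "train" / "mature" / "b.tif").touch()
    (root / "val" / "eccentric" / "c.tif").touch()
    demo_counts = count_by_split_and_label(index_images(root))

print(demo_counts)
# {('train', 'mature'): 2, ('val', 'eccentric'): 1}
```

    Run against the extracted archive, the per-split totals should add up to the 1443 training and 363 validation images stated above, which makes this a quick integrity check after download.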