The Registry of Open Data on AWS is now available on AWS Data Exchange
All datasets on the Registry of Open Data are now discoverable on AWS Data Exchange alongside 3,000+ existing data products from category-leading data providers across industries. Explore the catalog to find open, free, and commercial datasets.

Usage examples for all datasets listed in the Registry of Open Data on AWS tagged with life sciences.


The Human Sleep Project

Tools & Applications
Publications

The Cancer Genome Atlas

Tools & Applications
Publications

Foldingathome COVID-19 Datasets

Tutorials
Tools & Applications
Publications

Therapeutically Applicable Research to Generate Effective Treatments (TARGET)

Tools & Applications
Publications

Gabriella Miller Kids First Pediatric Research Program (Kids First)

Tools & Applications
Publications

Allen Cell Imaging Collections

Tutorials
Tools & Applications
Publications

Genome Aggregation Database (gnomAD)

Tools & Applications
Publications

The Singapore Nanopore Expression Data Set

Tutorials
Tools & Applications
Publications

Fly Brain Anatomy: FlyLight Gen1 and Split-GAL4 Imagery

Tutorials
Tools & Applications
Publications

International Neuroimaging Data-Sharing Initiative (INDI)

Tutorials
Tools & Applications
Publications

Garvan Institute Long Read Sequencing Benchmark Data

Tutorials
Tools & Applications
Publications

Open NeuroData

Tutorials
Tools & Applications
Publications

PubSeq - Public Sequence Resource

Tutorials
Tools & Applications
Publications

Cancer Cell Line Encyclopedia (CCLE)

Tools & Applications
Publications

IBL Neuropixels Brainwide Map on AWS

Tutorials
Tools & Applications

Toxicant Exposures and Responses by Genomic and Epigenomic Regulators of Transcription (TaRGET)

Tutorials
Tools & Applications
Publications

Clinical Proteomic Tumor Analysis Consortium 2 (CPTAC-2)

Tools & Applications
Publications

ICGC on AWS

Tutorials
Publications

1000 Genomes Phase 3 Reanalysis with DRAGEN 3.5 and 3.7

Tutorials
Tools & Applications
Publications

BossDB Open Neuroimagery Datasets

Tutorials
Tools & Applications
Publications

Clinical Proteomic Tumor Analysis Consortium 3 (CPTAC-3)

Tools & Applications
Publications

IBL Neuropixels Reproducible Ephys Data on AWS

Tutorials
Tools & Applications
Publications

NYU Langone & FAIR FastMRI Dataset

Tutorials
Publications

Open Bioinformatics Reference Data for Galaxy

Tutorials
Tools & Applications
Publications

Serratus: Ultra-deep Search for Novel Viruses - Versioned Data Release

Tools & Applications
Publications

3000 Rice Genomes Project

Tools & Applications
Publications

CAncer MEtastases in LYmph nOdes challeNge (CAMELYON) Dataset

Tools & Applications
Publications

IBL Behavioral Data on AWS

Tutorials
Tools & Applications
Publications

NIH NCBI Sequence Read Archive (SRA) on AWS

Tutorials
Tools & Applications
Publications

The Human Connectome Project

Tutorials
Tools & Applications
Publications

Basic Local Alignment Sequences Tool (BLAST) Databases

Tools & Applications
Publications

Encyclopedia of DNA Elements (ENCODE)

Tutorials
Publications

Genome in a Bottle on AWS

Tools & Applications
Publications

Molecular Profiling to Predict Response to Treatment (phs001965)

Tools & Applications
Publications

Mouse Brain Anatomy: MouseLight Imagery

Tools & Applications
Publications

OpenCell on AWS

Tools & Applications
Publications

Refgenie reference genome assets

Tutorials
Tools & Applications
Publications

Synthea synthetic patient generator data in OMOP Common Data Model

Tutorials
Tools & Applications

UK Biobank Linkage Disequilibrium Matrices

Tutorials
Tools & Applications
Publications

UK Biobank Pan-Ancestry Summary Statistics

Tutorials
Tools & Applications
Publications

Allen Ivy Glioblastoma Atlas

Tutorials
Tools & Applications
Publications

Allen Mouse Brain Atlas

Tutorials
Tools & Applications
Publications

Beat Acute Myeloid Leukemia (AML) 1.0

Tools & Applications
Publications

Broad Genome References

Tutorials
Tools & Applications
Publications

COBRA

Tools & Applications

COVID-19 Harmonized Data

Tutorials
Tools & Applications

Cell Organelle Segmentation in Electron Microscopy (COSEM) on AWS

Publications

Clinical Trial Sequencing Project - Diffuse Large B-Cell Lymphoma

Tools & Applications
Publications

Distributed Archives for Neurophysiology Data Integration (DANDI)

Tools & Applications

Exceptional Responders Initiative

Tools & Applications
Publications

I-CARE: International Cardiac Arrest REsearch consortium Electroencephalography Database

Tools & Applications
Publications

MIMIC-III (‘Medical Information Mart for Intensive Care’)

Tutorials
Tools & Applications

Medical Segmentation Decathlon

Tutorials
Tools & Applications
Publications

NASA Space Biology Open Science Data Repository (OSDR)

Publications

OpenProteinSet

Tutorials
Publications

SPaRCNet data: Seizures, Rhythmic and Periodic Patterns in ICU Electroencephalography

Tools & Applications
Publications

STOIC2021 Training

Tools & Applications
Publications

The Human Microbiome Project

Publications

Variant Effect Predictor (VEP) and the Loss-Of-Function Transcript Effect Estimator (LOFTEE) Plugin

Tools & Applications

VirtualFlow Ligand Libraries

Tutorials
Tools & Applications
Publications

4D Nucleome (4DN)

Tutorials

Africa Soil Information Service (AfSIS) Soil Chemistry

Tutorials
Publications

Allen Institute for Brain Science - Synaptic Physiology Public Data Set

Tools & Applications
Publications

Allen Institute for Neural Dynamics - Extracellular Electrophysiology Compression Benchmark

Tutorials
Publications

Binding DB - Data Lakehouse Ready

Tutorials
Publications

COVID-19 Data Lake

Tutorials
Tools & Applications

Cancer Genome Characterization Initiatives - Burkitt Lymphoma, HIV+ Cervical Cancer

Tools & Applications
Publications

Cell Painting Image Collection

Tools & Applications
Publications

DNAStack COVID19 SRA Data

Tutorials
Tools & Applications

GATK Structural Variation (SV) Data

Tutorials
Tools & Applications

Genomic Characterization of Metastatic Castration Resistant Prostate Cancer

Tools & Applications
Publications

Harvard Electroencephalography Database

Tools & Applications
Publications

Harvard-Emory ECG Database

Tools & Applications
Publications

Hecatomb Databases

Tutorials
Publications

Indexes for Kaiju

Tutorials
Publications

Integrative Analysis of Lung Adenocarcinoma in Environment and Genetics Lung cancer Etiology (Phase 2)

Tools & Applications

OpenCRAVAT

Tutorials
Tools & Applications

Oregon Health & Science University Chronic Neutrophilic Leukemia Dataset

Tools & Applications
Publications

Protein Data Bank 3D Structural Biology Data

Publications

REDASA COVID-19 Open Data

Tools & Applications
Publications

Sounds of Central African landscapes

Publications

TIGER Training

Tools & Applications

UniProt

Tutorials

1000 Genomes

Publications

Allen Brain Observatory - Visual Coding AWS Public Data Set

Tutorials

Allen Institute for Neural Dynamics - Mouse Neuroanatomy and Physiology Data

Tutorials

CIViC (Clinical Interpretation of Variants in Cancer)

Publications

CMS 2008-2010 Data Entrepreneurs’ Synthetic Public Use File (DE-SynPUF) in OMOP Common Data Model

Tutorials
Tools & Applications

COVID-19 Genome Sequence Dataset

Tools & Applications

COVID-19 Open Research Dataset (CORD-19)

Tools & Applications

Conformational Space of Short Peptides

Tutorials

GATK Test Data

Tools & Applications

Global Biodiversity Information Facility (GBIF) Species Occurrences

Tutorials

Human Cancer Models Initiative (HCMI) Cancer Model Development Center

Tools & Applications

Human PanGenomics Project

Publications

NIH NCBI PubMed Central (PMC) Article Datasets - Full-Text Biomedical and Life Sciences Journal Articles on AWS

Tutorials

NYUMets Brain Dataset

Publications

Ohio State Cardiac MRI Raw Data (OCMR)

Tutorials

Oxford Nanopore Technologies Benchmark Datasets

Tutorials

Synthea Coherent Data Set

Publications

Tabula Muris

Publications

Tabula Muris Senis

Tutorials

Tabula Sapiens

Publications

VitalDB

Publications

ZINC Database

Publications

iHART Whole Genome Sequencing Data Set

Publications

recount3

Tutorials

ChEMBL - Data Lakehouse Ready

Tutorials
Publications

ClinVar - Data Lakehouse Ready

Tutorials
Publications

Open Targets - Data Lakehouse Ready

Tutorials
Publications

1000 Genomes Phase 3 Reanalysis with DRAGEN 3.5 - Data Lakehouse Ready

Tutorials

AWS iGenomes

Tools & Applications

Genome Aggregation Database (gnomAD) - Data Lakehouse Ready

Tutorials

Google Brain Genomics Sequencing Dataset for Benchmarking and Development

Publications

If you want to add a dataset or usage example to this registry, please follow the instructions on the Registry of Open Data on AWS GitHub repository or tell us about your project.
