Usage examples for all datasets listed in the Registry of Open Data on AWS.


The Cancer Genome Atlas

Tools & Applications
Publications

Therapeutically Applicable Research to Generate Effective Treatments (TARGET)

Tools & Applications
Publications

Common Crawl

Tutorials
Tools & Applications
Publications

Sentinel-2

Tutorials
Tools & Applications
Publications

Gabriella Miller Kids First Pediatric Research Program (Kids First)

Tools & Applications
Publications

Landsat 8

Tutorials
Tools & Applications
Publications

Sudachi Language Resources

Tutorials
Tools & Applications
Publications

Foldingathome COVID-19 Datasets

Tutorials
Tools & Applications
Publications

Genome Aggregation Database (gnomAD)

Tools & Applications
Publications

NEXRAD on AWS

Tutorials
Tools & Applications
Publications

Allen Cell Imaging Collections

Tutorials
Tools & Applications
Publications

CBERS on AWS

Tutorials
Tools & Applications
Publications

International Neuroimaging Data-Sharing Initiative (INDI)

Tutorials
Tools & Applications
Publications

Terrain Tiles

Tools & Applications
Publications

IRS 990 Filings

Tutorials
Tools & Applications

Multi-Scale Ultra High Resolution (MUR) Sea Surface Temperature (SST)

Tutorials
Tools & Applications
Publications

NOAA Geostationary Operational Environmental Satellites (GOES) 16 & 17

Tutorials
Tools & Applications
Publications

Cancer Cell Line Encyclopedia (CCLE)

Tools & Applications
Publications

NOAA Water-Column Sonar Data Archive

Tutorials
Tools & Applications
Publications

Sentinel-2 Cloud-Optimized GeoTIFFs

Tutorials
Tools & Applications
Publications

SpaceNet

Tutorials
Tools & Applications
Publications

eBird Status and Trends Model Results

Tutorials
Tools & Applications
Publications

Clinical Proteomic Tumor Analysis Consortium 2 (CPTAC-2)

Tools & Applications
Publications

Fly Brain Anatomy: FlyLight Gen1 and Split-GAL4 Imagery

Tutorials
Tools & Applications
Publications

ICGC on AWS

Tutorials
Publications

NREL Wind Integration National Dataset

Tutorials
Tools & Applications
Publications

Clinical Proteomic Tumor Analysis Consortium 3 (CPTAC-3)

Tools & Applications
Publications

Global Database of Events, Language and Tone (GDELT)

Tutorials
Tools & Applications

Low Altitude Disaster Imagery (LADI) Dataset

Tutorials
Tools & Applications
Publications

NYU Langone & FAIR FastMRI Dataset

Tutorials
Publications

Open NeuroData

Tutorials
Tools & Applications
Publications

Radiant MLHub

Tutorials
Tools & Applications
Publications

COVID-19 Data Lake

Tutorials
Tools & Applications

CoMMpass from the Multiple Myeloma Research Foundation

Tools & Applications
Publications

National Renewable Energy Laboratory's (NREL) PV Rooftop Database

Tools & Applications
Publications

Ozone Monitoring Instrument (OMI) / Aura NO2 Tropospheric Column Density

Tutorials
Tools & Applications

Prefeitura Municipal de São Paulo (PMSP) LiDAR Point Cloud

Tools & Applications
Publications

Southern California Earthquake Data

Tutorials

USGS 3DEP LiDAR Point Clouds

Tutorials
Tools & Applications
Publications

1000 Genomes Phase 3 Reanalysis with DRAGEN 3.5

Tutorials
Tools & Applications
Publications

3000 Rice Genomes Project

Tools & Applications
Publications

Community Earth System Model Large Ensemble (CESM LENS)

Tutorials
Tools & Applications
Publications

DOE's Water Power Technology Office's (WPTO) US Wave dataset

Tools & Applications
Publications

Department of Energy's Open Energy Data Initiative (OEDI)

Tools & Applications
Publications

Encyclopedia of DNA Elements (ENCODE)

Tutorials
Publications

First Street Foundation (FSF) Flood Risk Summary Statistics

Tools & Applications
Publications

GEOS-Chem Input Data

Tutorials
Publications

Genome in a Bottle on AWS

Tools & Applications
Publications

NIH NCBI Sequence Read Archive (SRA) on AWS

Tutorials
Tools & Applications
Publications

NOAA High-Resolution Rapid Refresh (HRRR) Model

Tutorials

New York City Taxi and Limousine Commission (TLC) Trip Record Data

Tutorials

OpenAQ

Tutorials
Tools & Applications
Publications

OpenStreetMap on AWS

Tutorials
Tools & Applications

Refgenie reference genome assets

Tutorials
Tools & Applications
Publications

Sentinel-1

Tools & Applications

Sentinel-3

Tutorials
Tools & Applications
Publications

Sentinel-5P Level 2

Tutorials
Tools & Applications
Publications

UK Biobank Pan-Ancestry Summary Statistics

Tutorials
Tools & Applications
Publications

Allen Mouse Brain Atlas

Tutorials
Tools & Applications
Publications

Basic Local Alignment Search Tool (BLAST) Databases

Tools & Applications
Publications

Beat Acute Myeloid Leukemia (AML) 1.0

Tools & Applications
Publications

COVID-19 Harmonized Data

Tutorials
Tools & Applications

Clinical Trial Sequencing Project - Diffuse Large B-Cell Lymphoma

Tools & Applications
Publications

Deutsche Börse Public Dataset

Tutorials
Tools & Applications

Distributed Archives for Neurophysiology Data Integration (DANDI)

Tools & Applications

Finnish Meteorological Institute Weather Radar Data

Tutorials

Foundation Medicine Adult Cancer Clinical Dataset (FM-AD)

Tools & Applications
Publications

JMA Himawari-8

Publications

Japanese Tokenizer Dictionaries

Tutorials
Tools & Applications
Publications

Medical Segmentation Decathlon

Tutorials
Tools & Applications
Publications

NASA NEX

Tutorials
Tools & Applications
Publications

NOAA Global Ensemble Forecast System (GEFS) Re-forecast

Tutorials
Publications

NOAA Global Historical Climatology Network Daily (GHCN-D)

Tutorials

NREL National Solar Radiation Database

Tools & Applications
Publications

National Herbarium of NSW

Tutorials
Publications

OpenEEW

Tutorials
Tools & Applications

SILO climate data on AWS

Tools & Applications

Storm EVent ImageRy (SEVIR)

Tutorials
Tools & Applications

The Human Microbiome Project

Publications

Variant Effect Predictor (VEP) and the Loss-Of-Function Transcript Effect Estimator (LOFTEE) Plugin

Tools & Applications

Africa Soil Information Service (AfSIS) Soil Chemistry

Tutorials
Publications

Amazon Bin Image Dataset

Publications

Atmospheric Models from Météo-France

Tools & Applications

Broad Genome References

Tutorials
Tools & Applications

Cancer Genome Characterization Initiatives - Burkitt Lymphoma, HIV+ Cervical Cancer

Tools & Applications
Publications

Copernicus Digital Elevation Model (DEM)

Tools & Applications

ECMWF ERA5 Reanalysis

Tutorials

Hubble Space Telescope Public Data

Tutorials
Publications

MIMIC-III (‘Medical Information Mart for Intensive Care’)

Tutorials
Tools & Applications

NAIP on AWS

Tools & Applications

NOAA National Water Model Reanalysis

Tutorials
Publications

NOAA World Ocean Database (WOD)

Publications

National Cancer Institute Center for Cancer Research - Diffuse Large B Cell Lymphoma (DLBCL) Genomics and Expression

Tools & Applications
Publications

Open City Model (OCM)

Tutorials

Oregon Health & Science University Chronic Neutrophilic Leukemia Dataset

Tools & Applications
Publications

Pancreatic Cancer Organoid Profiling

Tools & Applications
Publications

RAPID NRT Flood Maps

Publications

Rapid7 FDNS ANY Dataset

Tutorials

RarePlanes

Tools & Applications
Publications

Sentinel-1 SLC dataset for South and Southeast Asia, Taiwan, Korea and Japan

Tutorials
Publications

Terra Fusion Data Sampler

Tutorials
Tools & Applications

UK Met Office Atmospheric Deterministic and Probabilistic Forecasts

Tutorials

1000 Genomes

Publications

AWS iGenomes

Tools & Applications

Allen Brain Observatory - Visual Coding AWS Public Data Set

Tutorials

Answer Reformulation

Publications

Automatic Speech Recognition (ASR) Error Robustness

Publications

CMIP6 GCMs downscaled using WRF

Tutorials

COVID-19 Genome Sequence Dataset

Tools & Applications

Cell Painting Image Collection

Publications

Coupled Model Intercomparison Project 6

Publications

CoversBR

Tutorials

DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue

Publications

Enriched Topical-Chat Dataset for Knowledge-Grounded Dialogue Systems

Publications

Ford Multi-AV Seasonal Dataset

Tutorials

GATK Test Data

Tools & Applications

Geosnap Data, Center for Geospatial Sciences

Tools & Applications

Human Cancer Models Initiative (HCMI) Cancer Model Development Center

Tools & Applications

Human PanGenomics Project

Publications

IDEAM - Colombian Radar Network

Tutorials

LOFAR ELAIS-N1 cycle 2 observations on AWS

Publications

NOAA Global Forecast System (GFS)

Publications

NOAA National Digital Forecast Database (NDFD)

Publications

NOAA Operational Forecast System (OFS)

Tools & Applications

New Jersey Statewide Digital Aerial Imagery Catalog

Tutorials

New Jersey Statewide LiDAR

Tutorials

Ohio State Cardiac MRI Raw Data (OCMR)

Tutorials

Oxford Nanopore Technologies Benchmark Datasets

Tutorials

QIIME 2 User Tutorial Datasets

Tutorials

SILAM Air Quality

Tutorials

Safecast

Tools & Applications

Tabula Muris

Publications

The Multilingual Amazon Reviews Corpus

Publications

Transiting Exoplanet Survey Satellite (TESS)

Publications

U.S. Census ACS PUMS

Tutorials

Voices Obscured in Complex Environmental Settings (VOiCES)

Tutorials

Xiph.Org Test Media

Tutorials

Yale-CMU-Berkeley (YCB) Object and Model Set

Publications

ZINC Database

Publications

iHART Whole Genome Sequencing Data Set

Publications

PoroTomo

Tutorials
Publications

Sophos/ReversingLabs 20 Million malware detection dataset

Tutorials
Tools & Applications
Publications

Binding DB

Tutorials
Publications

Cell Organelle Segmentation in Electron Microscopy (COSEM) on AWS

Publications

ChEMBL

Tutorials
Publications

Covid Job Impacts - US Hiring Data Since March 1 2020

Tutorials
Tools & Applications

Open Targets

Tutorials
Publications

AgricultureVision

Publications

COVID-19 Open Research Dataset (CORD-19)

Tools & Applications

Cloud Indexes for Genomic Analyses

Tutorials

Humor Detection from Product Question Answering Systems

Publications

MODIS MYD13A1, MOD13A1, MYD11A1, MOD11A1, MCD43A4

Tools & Applications

Swiss Public Transport Stops

Tools & Applications

To add a dataset or usage example to this registry, follow the instructions in the Registry of Open Data on AWS GitHub repository.
