Usage examples for all datasets listed in the Registry of Open Data on AWS.


The Cancer Genome Atlas

Tools & Applications
Publications

Therapeutically Applicable Research to Generate Effective Treatments (TARGET)

Tools & Applications
Publications

Sentinel-2

Tutorials
Tools & Applications
Publications

Gabriella Miller Kids First Pediatric Research Program (Kids First)

Tools & Applications
Publications

Landsat 8

Tutorials
Tools & Applications
Publications

NEXRAD on AWS

Tutorials
Tools & Applications
Publications

Genome Aggregation Database (gnomAD)

Tools & Applications
Publications

CBERS on AWS

Tutorials
Tools & Applications
Publications

Common Crawl

Tutorials
Tools & Applications
Publications

International Neuroimaging Data-Sharing Initiative (INDI)

Tutorials
Tools & Applications
Publications

Allen Cell Imaging Collections

Tutorials
Tools & Applications
Publications

IRS 990 Filings

Tutorials
Tools & Applications

Multi-Scale Ultra High Resolution (MUR) Sea Surface Temperature (SST)

Tutorials
Tools & Applications
Publications

Terrain Tiles

Tools & Applications

Cancer Cell Line Encyclopedia (CCLE)

Tools & Applications
Publications

NOAA Water-Column Sonar Data Archive

Tutorials
Tools & Applications
Publications

eBird Status and Trends Model Results

Tutorials
Tools & Applications
Publications

Clinical Proteomic Tumor Analysis Consortium 2 (CPTAC-2)

Tools & Applications
Publications

Fly Brain Anatomy: FlyLight Gen1 and Split-GAL4 Imagery

Tutorials
Tools & Applications
Publications

ICGC on AWS

Tutorials
Publications

NOAA Geostationary Operational Environmental Satellites (GOES) 16 & 17

Tutorials
Publications

NREL Wind Integration National Dataset

Tutorials
Tools & Applications
Publications

SpaceNet

Tutorials
Tools & Applications
Publications

Clinical Proteomic Tumor Analysis Consortium 3 (CPTAC-3)

Tools & Applications
Publications

Global Database of Events, Language and Tone (GDELT)

Tutorials
Tools & Applications

Open NeuroData

Tutorials
Tools & Applications
Publications

Radiant MLHub

Tutorials
Tools & Applications
Publications

CoMMpass from the Multiple Myeloma Research Foundation

Tools & Applications
Publications

NYU Langone & FAIR FastMRI Dataset

Tutorials
Publications

National Renewable Energy Laboratory's (NREL) PV Rooftop Database

Tools & Applications
Publications

Ozone Monitoring Instrument (OMI) / Aura NO2 Tropospheric Column Density

Tutorials
Tools & Applications

Prefeitura Municipal de São Paulo (PMSP) LiDAR Point Cloud

Tools & Applications
Publications

3000 Rice Genomes Project

Tools & Applications
Publications

COVID-19 Data Lake

Tutorials
Tools & Applications

Community Earth System Model Large Ensemble (CESM LENS)

Tutorials
Tools & Applications
Publications

DOE's Water Power Technology Office's (WPTO) US Wave dataset

Tools & Applications
Publications

Encyclopedia of DNA Elements (ENCODE)

Tutorials
Publications

GEOS-Chem Input Data

Tutorials
Publications

NIH NCBI Sequence Read Archive (SRA) on AWS

Tutorials
Tools & Applications
Publications

New York City Taxi and Limousine Commission (TLC) Trip Record Data

Tutorials

OpenAQ

Tutorials
Tools & Applications
Publications

OpenStreetMap on AWS

Tutorials
Tools & Applications

Sentinel-1

Tools & Applications

Sentinel-3

Tutorials
Tools & Applications
Publications

Sentinel-5P Level 2

Tutorials
Tools & Applications
Publications

Allen Mouse Brain Atlas

Tutorials
Tools & Applications
Publications

Basic Local Alignment Search Tool (BLAST) Databases

Tools & Applications
Publications

Beat Acute Myeloid Leukemia (AML) 1.0

Tools & Applications
Publications

COVID-19 Harmonized Data

Tutorials
Tools & Applications

Clinical Trial Sequencing Project - Diffuse Large B-Cell Lymphoma

Tools & Applications
Publications

Deutsche Börse Public Dataset

Tutorials
Tools & Applications

Distributed Archives for Neurophysiology Data Integration (DANDI)

Tools & Applications

Foundation Medicine Adult Cancer Clinical Dataset (FM-AD)

Tools & Applications
Publications

JMA Himawari-8

Publications

Lawrence Berkeley National Laboratory (LBNL) Tracking the Sun Dataset

Tools & Applications
Publications

Medical Segmentation Decathlon

Tutorials
Tools & Applications
Publications

NASA NEX

Tutorials
Tools & Applications
Publications

NOAA Global Ensemble Forecast System (GEFS) Re-forecast

Tutorials
Publications

NREL National Solar Radiation Database

Tools & Applications
Publications

OpenEEW

Tutorials
Tools & Applications

SILO climate data on AWS

Tools & Applications

Southern California Earthquake Data

Tutorials

Storm EVent ImageRy (SEVIR)

Tutorials
Tools & Applications

The Human Microbiome Project

Publications

USGS 3DEP LiDAR Point Clouds

Tutorials
Publications

Variant Effect Predictor (VEP) and the Loss-Of-Function Transcript Effect Estimator (LOFTEE) Plugin

Tools & Applications

Africa Soil Information Service (AfSIS) Soil Chemistry

Tutorials
Publications

Amazon Bin Image Dataset

Publications

Atmospheric Models from Météo-France

Tools & Applications

Broad Genome References

Tutorials
Tools & Applications

Cancer Genome Characterization Initiatives - Burkitt Lymphoma, HIV+ Cervical Cancer

Tools & Applications
Publications

ECMWF ERA5 Reanalysis

Tutorials

Hubble Space Telescope Public Data

Tutorials
Publications

MIMIC-III (‘Medical Information Mart for Intensive Care’)

Tutorials
Tools & Applications

NAIP on AWS

Tools & Applications

NOAA Global Historical Climatology Network Daily (GHCN-D)

Tutorials

NOAA World Ocean Database (WOD)

Publications

National Cancer Institute Center for Cancer Research - Diffuse Large B Cell Lymphoma (DLBCL) Genomics and Expression

Tools & Applications
Publications

Open City Model (OCM)

Tutorials

Oregon Health & Science University Chronic Neutrophilic Leukemia Dataset

Tools & Applications
Publications

Pancreatic Cancer Organoid Profiling

Tools & Applications
Publications

Rapid7 FDNS ANY Dataset

Tutorials

RarePlanes

Tools & Applications
Publications

Sentinel-1 SLC dataset for South and Southeast Asia, Taiwan, Korea and Japan

Tutorials
Publications

Terra Fusion Data Sampler

Tutorials
Tools & Applications

UK Met Office Atmospheric Deterministic and Probabilistic Forecasts

Tutorials

1000 Genomes

Publications

AWS iGenomes

Tools & Applications

Allen Brain Observatory - Visual Coding AWS Public Data Set

Tutorials

COVID-19 Genome Sequence Dataset

Tools & Applications

Cell Painting Image Collection

Publications

Department of Energy's Open Energy Data Initiative (OEDI)

Publications

First Street Foundation (FSF) Flood Risk Summary Statistics

Tools & Applications
Publications

Ford Multi-AV Seasonal Dataset

Tutorials

GATK Test Data

Tools & Applications

Human Cancer Models Initiative (HCMI) Cancer Model Development Center

Tools & Applications

LOFAR ELAIS-N1 cycle 2 observations on AWS

Publications

NOAA National Digital Forecast Database (NDFD)

Publications

NOAA National Water Model Reanalysis

Publications

NOAA Operational Forecast System (OFS)

Tools & Applications

New Jersey Statewide Digital Aerial Imagery Catalog

Tutorials

New Jersey Statewide LiDAR

Tutorials

Ohio State Cardiac MRI Raw Data (OCMR)

Tutorials

QIIME 2 User Tutorial Datasets

Tutorials

Refgenie reference genome assets

Tutorials
Tools & Applications
Publications

SILAM Air Quality

Tutorials

Safecast

Tools & Applications

Tabula Muris

Publications

Transiting Exoplanet Survey Satellite (TESS)

Publications

U.S. Census ACS PUMS

Tutorials

UK Biobank Pan-Ancestry Summary Statistics

Tutorials
Tools & Applications
Publications

Voices Obscured in Complex Environmental Settings (VOiCES)

Tutorials

Xiph.Org Test Media

Tutorials

Yale-CMU-Berkeley (YCB) Object and Model Set

Publications

ZINC Database

Publications

iHART Whole Genome Sequencing Data Set

Publications

National Herbarium of NSW

Tutorials
Publications

PoroTomo

Tutorials
Publications

Cell Organelle Segmentation in Electron Microscopy (COSEM) on AWS

Publications

ChEMBL

Tutorials
Publications

Covid Job Impacts - US Hiring Data Since March 1 2020

Tutorials
Tools & Applications

Open Targets

Tutorials
Publications

RAPID NRT Flood Maps

Publications

Answer Reformulation

Publications

Automatic Speech Recognition (ASR) Error Robustness

Publications

COVID-19 Open Research Dataset (CORD-19)

Tools & Applications

Cloud Indexes for Genomic Analyses

Tutorials

Coupled Model Intercomparison Project 6

Publications

Enriched Topical-Chat Dataset for Knowledge-Grounded Dialogue Systems

Publications

Geosnap Data, Center for Geospatial Sciences

Tools & Applications

Humor Detection from Product Question Answering Systems

Publications

MODIS MYD13A1, MOD13A1, MYD11A1, MOD11A1, MCD43A4

Tools & Applications

Swiss Public Transport Stops

Tools & Applications

To add a dataset or usage example to this registry, follow the instructions in the Registry of Open Data on AWS GitHub repository.
