This registry exists to help people discover and share datasets that are available via AWS resources. See recent additions and learn more about sharing data on AWS.
See all usage examples for datasets listed in this registry tagged with computer vision.
You are currently viewing a subset of data tagged with computer vision.
If you want to add a dataset or example of how to use a dataset to this registry, please follow the instructions on the Registry of Open Data on AWS GitHub repository.
Unless specifically stated in the applicable dataset documentation, datasets available through the Registry of Open Data on AWS are not provided and maintained by AWS. Datasets are provided and maintained by a variety of third parties under a variety of licenses. Please check dataset licenses and related documentation to determine if a dataset may be used for your application.
If you have a project using a listed dataset, please tell us about it. We may work with you to feature your project in a blog post.
bioinformatics, biology, cancer, cell biology, cell imaging, cell painting, chemical biology, computer vision, csv, deep learning, fluorescence imaging, genetic, high-throughput imaging, image processing, image-based profiling, imaging, life sciences, machine learning, medicine, microscopy, organelle
The Cell Painting Gallery is a collection of image datasets created using the Cell Painting assay. The images of cells are captured by microscopy imaging and reveal the response of various labeled cell components to whatever treatments are tested, which can include genetic perturbations, chemicals or drugs, or different cell types. The datasets can be used for diverse applications in basic biology and pharmaceutical research, such as identifying disease-associated phenotypes, understanding disease mechanisms, and predicting a drug’s activity, toxicity, or mechanism of action (Chandrasekaran et al. 2020). This collection is maintained by the Carpenter–Singh lab and the Cimini lab at the Broad Institute. A human-friendly listing of datasets, instructions for accessing them, and other documentation is at the corresponding GitHub page abou...
computer vision, disaster response, earth observation, geospatial, machine learning, satellite imagery
SpaceNet launched in August 2016 as an open innovation project offering a repository of freely available imagery with co-registered map features. Before SpaceNet, computer vision researchers had minimal options to obtain free, precision-labeled, and high-resolution satellite imagery. Today, SpaceNet hosts datasets developed by its own team, along with datasets from projects like IARPA’s Functional Map of the World (fMoW).
aerial imagery, coastal, computer vision, disaster response, earth observation, earthquakes, geospatial, image processing, imaging, infrastructure, land, machine learning, mapping, natural resource, seismology, transportation, urban, water
The Low Altitude Disaster Imagery (LADI) Dataset consists of human and machine annotated airborne images collected by the Civil Air Patrol in support of various disaster responses from 2015 to 2023. Two key distinctions are the low-altitude, oblique perspective of the imagery and the disaster-related features, which are rarely featured in computer vision benchmarks and datasets.
autonomous vehicles, computer vision, lidar, robotics, transportation, urban
A public large-scale dataset for autonomous driving. It enables researchers to study challenging urban driving situations using the full sensor suite of a real self-driving car.
autonomous vehicles, computer vision, lidar, robotics
This autonomous driving dataset includes data from a 128-beam Velodyne Alpha-Prime lidar, a 5MP Blackfly camera, a 360-degree Navtech radar, and post-processed Applanix POS LV GNSS data. This dataset was collected in various weather conditions (sun, rain, snow) over the course of a year. The intended purpose of this dataset is to enable benchmarking of long-term all-weather odometry and metric localization across various sensor types. In the future, we hope to also support an object detection benchmark.
autonomous vehicles, computer vision, geospatial, lidar, robotics
Home of the Argoverse datasets. Public datasets supported by detailed maps to test, experiment, and teach self-driving vehicles how to understand the world around them. This bucket includes the following datasets:
cog, computer vision, earth observation, geospatial, image processing, satellite imagery, stac, synthetic aperture radar
Open Synthetic Aperture Radar (SAR) data from Capella Space. Capella Space is an information services company that provides on-demand, industry-leading, high-resolution synthetic aperture radar (SAR) Earth observation imagery. Through a constellation of small satellites, Capella provides easy access to frequent, timely, and flexible information affecting dozens of industries worldwide. Capella's high-resolution SAR satellites are matched with unparalleled infrastructure to deliver reliable global insights that sharpen our understanding of the changing world – improving decisions ...
computer vision, deep learning, earth observation, geospatial, labeled, machine learning, satellite imagery
RarePlanes is a unique open-source machine learning dataset from CosmiQ Works and AI.Reverie that incorporates both real and synthetically generated satellite imagery. The RarePlanes dataset specifically focuses on the value of AI.Reverie synthetic data to aid computer vision algorithms in their ability to automatically detect aircraft and their attributes in satellite imagery. Although other synthetic/real combination datasets exist, RarePlanes is the largest openly-available very high resolution dataset built to test the value of synthetic data from an overhead perspective. The real portion ...
biology, cell biology, computer vision, electron microscopy, imaging, life sciences, microscopy, segmentation
The Automated Segmentation of intracellular substructures in Electron Microscopy (ASEM) project provides deep learning models trained to segment structures in 3D images of cells acquired by Focused Ion Beam Scanning Electron Microscopy (FIB-SEM). Each model is trained to detect a single type of structure (mitochondria, endoplasmic reticulum, Golgi apparatus, nuclear pores, clathrin-coated pits) in cells prepared via chemical fixation (CF) or high-pressure freezing and freeze substitution (HPFS). You can use our open source pipeline to load a model and predict a class of sub-cellular structur...
cancer, computational pathology, computer vision, deep learning, grand-challenge.org, histopathology, life sciences
"This dataset contains the all data for the CAncer MEtastases in LYmph nOdes challeNge or CAMELYON. CAMELYON was the first challenge using whole-slide images in computational pathology and aimed to help pathologists identify breast cancer metastases in sentinel lymph nodes. Lymph node metastases are extremely important to find, as they indicate that the cancer is no longer localized and systemic treatment might be warranted. Searching for these metastases in H&E-stained tissue is difficult and time-consuming and AI algorithms can play a role in helping make this faster and more accura...
cancer, classification, computational pathology, computer vision, deep learning, digital pathology, grand-challenge.org, histopathology, imaging, life sciences, machine learning, medical image computing, medical imaging
This dataset contains the training data for the Machine learning for Optimal detection of iNflammatory cells in the KidnEY or MONKEY challenge. The MONKEY challenge focuses on the automated detection and classification of inflammatory cells, specifically monocytes and lymphocytes, in kidney transplant biopsies using Periodic acid-Schiff (PAS) stained whole-slide images (WSI). It contains 80 WSI, collected from 4 different pathology institutes, with annotated regions of interest. For each WSI up to 3 different PAS scans and one IHC slide scan are available. This dataset and challenge support th...
autonomous racing, autonomous vehicles, computer vision, GNSS, image processing, lidar, localization, object detection, object tracking, perception, radar, robotics
The RACECAR dataset is the first open dataset for full-scale and high-speed autonomous racing. Multi-modal sensor data has been collected from fully autonomous Indy race cars operating at speeds of up to 170 mph (273 kph). Six teams who raced in the Indy Autonomous Challenge during 2021-22 have contributed to this dataset. The dataset spans 11 interesting racing scenarios across two race tracks, including solo laps, multi-agent laps, overtaking situations, high accelerations, banked tracks, obstacle avoidance, and pit entry and exit at different speeds. The data is organized and released in bot...
biology, cell biology, cell imaging, computer vision, fluorescence imaging, imaging, life sciences, machine learning, microscopy
The OpenCell project is a proteome-scale effort to measure the localization and interactions of human proteins using high-throughput genome engineering to endogenously tag thousands of proteins in the human proteome. This dataset consists of the raw confocal fluorescence microscopy images for all tagged cell lines in the OpenCell library. These images can be interpreted both individually, to determine the localization of particular proteins of interest, and in aggregate, by training machine learning models to classify or quantify subcellular localization patterns.
biology, cancer, computer vision, gene expression, genetic, glioblastoma, Homo sapiens, image processing, imaging, life sciences, machine learning, neurobiology
This dataset consists of images of glioblastoma human brain tumor tissue sections that have been probed for expression of particular genes believed to play a role in development of the cancer. Each tissue section is adjacent to another section that was stained with a reagent useful for identifying histological features of the tumor. Each of these types of images has been completely annotated for tumor features by a machine learning process trained by expert medical doctors.
cancer, computational pathology, computer vision, deep learning, histopathology, life sciences
This page describes the COBRA (Classification Of Basal cell carcinoma, Risky skin cancers and Abnormalities) skin pathology dataset, which comprises over 7,000 histopathology whole-slide images related to the diagnosis of basal cell carcinoma (BCC), the most commonly diagnosed cancer. The dataset includes biopsies and excisions and is divided into four groups. The first group contains about 2,500 BCC biopsies with subtype labels, while the second group includes 2,500 non-BCC biopsies with different types of skin dysplasia. The third group has 1,000 labelled risky cancer biopsies, includin...
cell biology, computer vision, electron microscopy, imaging, life sciences, organelle
High-resolution images of subcellular structures.
agriculture, computer vision, IMU, lidar, localization, mapping, robotics
CitrusFarm is a multimodal agricultural robotics dataset that provides both multispectral images and navigational sensor data for localization, mapping and crop monitoring tasks.
computer vision, urban, us, video
The Multiview Extended Video with Activities (MEVA) dataset consists of video data of human activity, both scripted and unscripted, collected with roughly 100 actors over several weeks. The data was collected with 29 cameras with overlapping and non-overlapping fields of view. The current release consists of about 328 hours (516GB, 4259 clips) of video data, as well as 4.6 hours (26GB) of UAV data. Other data includes GPS tracks of actors, camera models, and a site map. We have also released annotations for roughly 184 hours of data. Further updates are planned.
autonomous vehicles, computer vision, lidar, marine navigation, robotics
This dataset presents a multi-modal maritime dataset acquired in restricted waters in Pohang, South Korea. The sensor suite is composed of three LiDARs (one 64-channel LiDAR and two 32-channel LiDARs), a marine radar, two visual cameras used as a stereo camera, an infrared camera, an omnidirectional camera with 6 directions, an AHRS, and a GPS with RTK. The dataset includes the sensor calibration parameters and SLAM-based baseline trajectory. It was acquired while navigating a 7.5 km route that includes a narrow canal area, inner and outer port areas, and a near-coastal area. The aim of this d...
computed tomography, computer vision, coronavirus, COVID-19, grand-challenge.org, imaging, life sciences, SARS-CoV-2
The STOIC project collected Computed Tomography (CT) images of 10,735 individuals suspected of being infected with SARS-CoV-2 during the first wave of the pandemic in France, from March to April 2020. For each patient in the training set, the dataset contains binary labels for COVID-19 presence, based on RT-PCR test results, and COVID-19 severity, defined as intubation or death within one month from the acquisition of the CT scan. This S3 bucket contains the training sample of the STOIC dataset as used in the STOIC2021 challenge on grand-challenge.org.
aerial imagery, agriculture, computer vision, deep learning, machine learning
Agriculture-Vision aims to be a publicly available large-scale aerial agricultural image dataset that is high-resolution and multi-band, with multiple types of patterns annotated by agronomy experts. The original dataset affiliated with the 2020 CVPR paper includes 94,986 512x512 images sampled from 3,432 farmlands with nine types of annotations: double plant, drydown, endrow, nutrient deficiency, planter skip, storm damage, water, waterway and weed cluster. All of these patterns have substantial impacts on field conditions and the final yield. These farmland images were captured between 201...
autonomous vehicles, computer vision, deep learning, image processing, lidar, machine learning, mapping, robotics, traffic, transportation, urban, weather
The Aurora Multi-Sensor Dataset is an open, large-scale multi-sensor dataset with highly accurate localization ground truth, captured between January 2017 and February 2018 in the metropolitan area of Pittsburgh, PA, USA by Aurora (via Uber ATG) in collaboration with the University of Toronto. The de-identified dataset contains rich metadata, such as weather and semantic segmentation, and spans all four seasons, rain, snow, overcast and sunny days, different times of day, and a variety of traffic conditions.
The Aurora Multi-Sensor Dataset contains data from a 64-beam Velodyne HDL-64E LiDAR sensor and seven 1920x1200-pixel resolution cameras including a forward-facing stereo pair and five wide-angle lenses covering a 360-degree view around the vehicle.
This data can be used to develop and evaluate large-scale long-term approaches to autonomous vehicle localization. Its size and diversity make it suitable for a wide range of research areas such as 3D reconstruction, virtual tourism, HD map construction, and map compression, among others.
The data was first presented at the International Conference on Intelligent Robots an...
activity detection, activity recognition, computer vision, labeled, machine learning, privacy, video
The Consented Activities of People (CAP) dataset is a fine-grained activity dataset for visual AI research, curated using the Visym Collector platform.
bioinformatics, biology, computer vision, csv, health, imaging, labeled, life sciences, machine learning, medical image computing, medical imaging, radiology, x-ray
The Emory Knee Radiograph (MRKR) dataset is a large, demographically diverse collection of 503,261 knee radiographs from 83,011 patients, 40% of which are African American. This dataset provides imaging data in DICOM format along with detailed clinical information, including patient-reported pain scores, diagnostic codes, and procedural codes, which are not commonly available in similar datasets. The MRKR dataset also features imaging metadata such as image laterality, view type, and presence of hardware, enhancing its value for research and model development. MRKR addresses significant gaps ...
autonomous vehicles, computer vision, lidar, mapping, robotics, transportation, urban, weather
This research presents a challenging multi-agent seasonal dataset collected by a fleet of Ford autonomous vehicles on different days and times during 2017-18. The vehicles were manually driven on an average route of 66 km in Michigan that included a mix of driving scenarios such as the Detroit Airport, freeways, city centres, a university campus, and suburban neighbourhoods. Each vehicle used in this data collection is a Ford Fusion outfitted with an Applanix POS-LV inertial measurement unit (IMU), four HDL-32E Velodyne 3D-lidar scanners, 6 Point Grey 1.3 MP Cameras arranged on the...
cancer, computational pathology, computer vision, deep learning, grand-challenge.org, histopathology, life sciences
"This dataset contains the all data for the LEarning biOchemical Prostate cAncer Recurrence from histopathology sliDes challenge or LEOPARD.Prostate cancer, impacting 1.4 million men annually, is a prevalent malignancy (H. Sung et al., 2021). A substantial number of these individuals undergo prostatectomy as the primary curative treatment. The efficacy of this surgery is assessed, in part, by monitoring the concentration of prostate-specific antigen (PSA) in the bloodstream. While the role of PSA in prostate cancer screening is debatable (W. F. Clark et al., 2018; E. A. M. Heijnsdijk et al., 2018), it serves as a valuable biomarker for postprostatectomy follow-up in patients. Following successful surgery, PSA concentration is typically undetectable (<0.1 ng/mL) within 4-6 weeks (S. S. Goonewardene et al., 2014). However, approximately 30% of patients experience biochemical recurrence, signifying the resurgence of prostate cancer cells. This recurrence serves as a prognostic indicator for progression to clinical metastases and eventual prostate cancer-related mortality (C. L. Amling, 2014; S. J. Freedland et al., 2005; M. Han et al., 2001; T. Van den Broeck et al., 2001. Current clinical practices gauge the risk of biochemical recurrence by considering the International Society of Urological Pathology (ISUP) grade, PSA value at diagnosis, and TNM staging criteria (J. I. Epstein et al., 2016). A recent European consensus guideline suggests categorizing patients into low-risk, intermediate-risk, and high-risk groups based on these factors (N. Mottet et al., 2021). Notably, a high ISUP grade independently assigns a patient to the intermediate (grade 2/3) or high-risk group (grade 4/5). The Gleason growth patterns, representing morphological patterns of prostate cancer, are used to categorize cancerous tissue into ISUP grade groups (J. I. Epstein, 2010; P. M. Pierorazio et al., 2013; G. J. L. H. van Leenders et al., 2020; J. I. Epstein et al., 2016). However, the ISUP grade has limitations, such as grading disagreement among pathologists (J. I. Epstein et al., 2016) and coarse descriptors of tissue morphology. Recently, deep learning was shown (H. Pinckaers et al., 2022; O. Eminaga et. al., 2024)...
cog, computer vision, earth observation, geospatial, image processing, satellite imagery, stac
The Satellogic EarthView dataset includes high-resolution satellite images captured over all continents. The dataset is organized in Hive partition format and hosted by AWS. It can be accessed via STAC browser or the AWS CLI. Each item of the dataset corresponds to a specific region and date, with some of the regions revisited for additional data. The dataset provides Top-of-Atmosphere (TOA) reflectance values across four spectral bands (Red, Green, Blue, Near-Infrared) at a Ground Sample Distance (GSD) of 1 meter, accompanied by comprehensive metadata such as off-nadir angles, sun elevation,...
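Since the listing notes the bucket is reachable with standard S3 tooling, here is a minimal sketch of anonymous access in Python; the bucket name and partition prefix below are hypothetical placeholders, not the dataset's documented location.

```python
# Minimal sketch of anonymous (unsigned) S3 access to a Registry of Open
# Data bucket; bucket and prefix are placeholders — consult the dataset
# documentation for the real values and Hive partition keys.
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

resp = s3.list_objects_v2(
    Bucket="satellogic-earthview",   # hypothetical bucket name
    Prefix="year=2022/",             # hypothetical Hive-style partition
    MaxKeys=10,
)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```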
cancer, computational pathology, computer vision, deep learning, grand-challenge.org, histopathology, life sciences
"This dataset contains the training data for the Tumor InfiltratinG lymphocytes in breast cancER or TIGER challenge. TIGER is the first challenge on fully automated assessment of tumor-infiltrating lymphocytes (TILs) in breast cancer histopathology slides. TILs are proving to be an important biomarker in cancer patients as they can play a part in killing tumor cells, particularly in some types of breast cancer. Identifying and measuring TILs can help to better target treatments, particularly immunotherapy, and may result in lower levels of other more aggressive treatments, including chemo...
computer vision, machine learning
3D CoMPaT is a richly annotated large-scale dataset of rendered compositions of Materials on Parts of thousands of unique 3D Models. This dataset primarily focuses on stylizing 3D shapes at part-level with compatible materials. Each object with the applied part-material compositions is rendered from four equally spaced views as well as four randomized views. We introduce a new task, called Grounded CoMPaT Recognition (GCR), to collectively recognize and ground compositions of materials on parts of 3D objects. We present two variations of this task and adapt state-of-the-art 2D/3D deep learning met...
autonomous vehicles, computer vision, deep learning, lidar, machine learning, mapping, robotics
An open multi-sensor dataset for autonomous driving research. This dataset comprises semantically segmented images, semantic point clouds, and 3D bounding boxes. In addition, it contains unlabelled 360-degree camera images, lidar, and bus data for three sequences. We hope this dataset will further facilitate active research and development in AI, computer vision, and robotics for autonomous driving.
agriculture, computer vision, machine learning
Dataset associated with the March 2021 Frontiers in Robotics and AI paper "Broad Dataset and Methods for Counting and Localization of On-Ear Corn Kernels", DOI: 10.3389/frobt.2021.627009
computer vision, deep learning, machine learning
Some of the most important datasets for image classification research, including CIFAR-10 and CIFAR-100, Caltech 101, MNIST, Food-101, Oxford-102-Flowers, Oxford-IIIT-Pets, and Stanford-Cars. This is part of the fast.ai datasets collection hosted by AWS for the convenience of fast.ai students. See the documentation link for citation and license details for each dataset.
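As a usage sketch (not part of the registry entry), these archives can typically be fetched through the fastai library itself, which wraps this AWS-hosted collection; this assumes a recent fastai install, and the URL constants may differ across versions.

```python
# Hedged sketch: download and extract one of the fast.ai-hosted image
# classification datasets via fastai's helper; results are cached after
# the first call.
from fastai.data.external import untar_data, URLs

path = untar_data(URLs.PETS)  # Oxford-IIIT-Pets; similar constants exist for MNIST, CIFAR, etc.
print(path)                   # local directory containing the extracted images
```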
aerial imagery, agriculture, computer vision, deep learning, machine learning
Dataset associated with the 2021 AAAI paper "Detection and Prediction of Nutrient Deficiency Stress using Longitudinal Aerial Imagery". The dataset contains three image sequences of aerial imagery from 386 farm parcels that have been annotated for nutrient deficiency stress.
autonomous vehicles, computer vision, deep learning, GPS, IMU, lidar, logistics, machine learning, object detection, object tracking, perception, radar, robotics, transportation
A large-scale multimodal dataset for autonomous trucking. Sensor data was recorded with a heavy truck from MAN equipped with 6 lidars, 6 radars, 4 cameras, and a high-precision GNSS. MAN TruckScenes allows the research community, for the first time, to engage with truck-specific challenges such as trailer occlusions, novel sensor perspectives, and terminal environments. It comprises more than 740 scenes of 20s each within a multitude of different environmental conditions. Bounding boxes are available for 27 object classes, 15 attributes, and a range of more than 230m. The scenes are t...
autonomous vehicles, computer vision, deep learning, event camera, global shutter camera, GNSS, GPS, h5, hdf5, IMU, lidar, machine learning, perception, robotics, RTK
M3ED is the first multi-sensor event camera (EC) dataset focused on high-speed dynamic motions in robotics applications. M3ED provides high-quality synchronized data from multiple platforms (car, legged robot, UAV), operating in challenging conditions such as off-road trails, dense forests, and performing aggressive flight maneuvers. M3ED also covers demanding operational scenarios for EC, such as high egomotion and multiple independently moving objects. M3ED includes high-resolution stereo EC (1280×720), grayscale and RGB cameras, a high-quality IMU, a 64-beam LiDAR, and RTK localization.
biology, cancer, computer vision, health, image processing, imaging, life sciences, machine learning, magnetic resonance imaging, medical imaging, medicine, neurobiology, neuroimaging, segmentation
This dataset contains 8,000+ brain MRIs of 2,000+ patients with brain metastases.
computer vision
A large database of annotated surfaces created from real-world consumer photographs.
computed tomography, computer vision, csv, labeled, life sciences, machine learning, medical image computing, medical imaging, radiology, x-ray tomography
Blunt force abdominal trauma is among the most common types of traumatic injury, with the most frequent cause being motor vehicle accidents. Abdominal trauma may result in damage and internal bleeding of the internal organs, including the liver, spleen, kidneys, and bowel. Detection and classification of injuries are key to effective treatment and favorable outcomes. A large proportion of patients with abdominal trauma require urgent surgery. Abdominal trauma often cannot be diagnosed clinically by physical exam, patient symptoms, or laboratory tests. Prompt diagnosis of abdominal trauma using...
computed tomography, computer vision, csv, labeled, life sciences, machine learning, medical image computing, medical imaging, radiology, x-ray tomography
Over 1.5 million spine fractures occur annually in the United States alone, resulting in over 17,730 spinal cord injuries. The most common site of spine fracture is the cervical spine. There has been a rise in the incidence of spinal fractures in the elderly, and in this population fractures can be more difficult to detect on imaging due to degenerative disease and osteoporosis. Imaging diagnosis of adult spine fractures is now almost exclusively performed with computed tomography (CT). Quickly detecting and determining the location of any vertebral fractures is essential to prevent ne...
computed tomography, computer vision, csv, labeled, life sciences, machine learning, medical image computing, medical imaging, radiology, x-ray tomography
RSNA assembled this dataset in 2019 for the RSNA Intracranial Hemorrhage Detection AI Challenge (https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/). De-identified head CT studies were provided by four research institutions. A group of over 60 volunteer expert radiologists recruited by RSNA and the American Society of Neuroradiology labeled over 25,000 exams for the presence and subtype classification of acute intracranial hemorrhage.
computed tomography, computer vision, csv, labeled, life sciences, machine learning, medical image computing, medical imaging, radiology, x-ray tomography
RSNA assembled this dataset in 2020 for the RSNA STR Pulmonary Embolism Detection AI Challenge (https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/). With more than 12,000 CT pulmonary angiography (CTPA) studies contributed by five international research centers, it is the largest publicly available annotated PE dataset. RSNA collaborated with the Society of Thoracic Radiology to recruit more than 80 expert thoracic radiologists who labeled the dataset with detailed clinical annotations.
computer vision, image processing, imaging, media, movies, multimedia, video
Uncompressed video used for video compression and video processing research.
computer vision, deep learning, machine learning
COCO is a large-scale object detection, segmentation, and captioning dataset. This is part of the fast.ai datasets collection hosted by AWS for convenience of fast.ai students. If you use this dataset in your research please cite arXiv:1405.0312 [cs.CV].
cog, computer vision, deep learning, earth observation, floods, geospatial, machine learning, satellite imagery, synthetic aperture radar
This dataset consists of chips of Sentinel-1 and Sentinel-2 satellite data. Each Sentinel-1 chip contains a corresponding label for water and each Sentinel-2 chip contains a corresponding label for water and clouds. Data is stored in folders by a unique event identifier as the folder name. Within each event folder there are subfolders for Sentinel-1 (s1) and Sentinel-2 (s2) data. Each chip is contained in its own sub-folder with the folder name being the source image id, followed by a unique chip identifier consisting of a hyphenated set of 5 numbers. All bands of the satellite data, as well a...
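Given that folder layout, a short Python sketch of walking the hierarchy; it assumes a local sync of the bucket, and every name below is a hypothetical placeholder.

```python
# Illustrative walk over the event/sensor/chip hierarchy described above,
# assuming the bucket has been synced to a local directory; directory
# names are placeholders and the chip-folder naming is an assumption.
from pathlib import Path

root = Path("flood-chips")                # hypothetical local copy of the bucket
for event_dir in sorted(root.iterdir()):  # one folder per unique event identifier
    for sensor in ("s1", "s2"):           # Sentinel-1 and Sentinel-2 subfolders
        sensor_dir = event_dir / sensor
        if not sensor_dir.is_dir():
            continue
        for chip_dir in sorted(sensor_dir.iterdir()):
            # Folder name: source image id plus a hyphenated 5-number chip id.
            print(event_dir.name, sensor, chip_dir.name)
```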
autonomous vehicles, broadband, computer vision, lidar, machine learning, segmentation, us
"The DARPA Invisible Headlights Dataset is a large-scale multi-sensor dataset annotated for autonomous, off-road navigation in challenging off-road environments. It features simultaneously collected off-road imagery from multispectral, hyperspectral, polarimetric, and broadband sensors spanning wave-lengths from the visible spectrum to long-wave infrared and provides aligned LIDAR data for ground-truth shape. Camera calibrations, LiDAR registrations, and traversability annotations for a subset of the data are available."
computer vision, deep learning, machine learning
Some of the most important datasets for image localization research, including Camvid and PASCAL VOC (2007 and 2012). This is part of the fast.ai datasets collection hosted by AWS for convenience of fast.ai students. See documentation link for citation and license details for each dataset.
autonomous vehicles, computer vision, deep learning, machine learning, robotics
Dataset and benchmarks for computer vision research in the context of autonomous driving. The dataset has been recorded in and around the city of Karlsruhe, Germany using the mobile platform AnnieWay (VW station wagon) which has been equipped with several RGB and monochrome cameras, a Velodyne HDL 64 laser scanner as well as an accurate RTK corrected GPS/IMU localization unit. The dataset has been created for computer vision and machine learning research on stereo, optical flow, visual odometry, semantic segmentation, semantic instance segmentation, road segmentation, single image depth predic...
benchmark, computer vision, deep learning, internet
The MegaScenes Dataset is an extensive collection of around 430k scenes, featuring over 100k structure-from-motion reconstructions and over 2 million registered images. MegaScenes includes a diverse array of scenes, such as minarets, building interiors, statues, bridges, towers, religious buildings, and natural landscapes. The images of these scenes are captured under varying conditions, including different times of day, various weather and illumination, and from different devices with distinct camera intrinsics.
computer vision, machine learning, multimedia, video
The Multimedia Commons is a collection of audio and visual features computed for the nearly 100 million Creative Commons-licensed Flickr images and videos in the YFCC100M dataset from Yahoo! Labs, along with ground-truth annotations for selected subsets. The International Computer Science Institute (ICSI) and Lawrence Livermore National Laboratory are producing and distributing a core set of derived feature sets and annotations as part of an effort to enable large-scale video search capabilities. They have released this feature corpus into the public domain, under Creative Commons License 0, s...
computer vision, image processing, imaging, life sciences, machine learning, magnetic resonance imaging, neuroimaging, neuroscience, nifti
Here, we collected and pre-processed a massive, high-quality 7T fMRI dataset that can be used to advance our understanding of how the brain works. A unique feature of this dataset is the massive amount of data available per individual subject. The data were acquired using ultra-high-field fMRI (7T, whole-brain, 1.8-mm resolution, 1.6-s TR). We measured fMRI responses while each of 8 participants viewed 9,000–10,000 distinct, color natural scenes (22,500–30,000 trials) in 30–40 weekly scan sessions over the course of a year. Additional measures were collected including resting-state data, retin...
breast cancer, cancer, computer vision, csv, labeled, life sciences, machine learning, mammography, medical image computing, medical imaging, radiology
According to the WHO, breast cancer is the most commonly occurring cancer worldwide. In 2020 alone, there were 2.3 million new breast cancer diagnoses and 685,000 deaths. Yet breast cancer mortality in high-income countries has dropped by 40% since the 1980s when health authorities implemented regular mammography screening in age groups considered at risk. Early detection and treatment are critical to reducing cancer fatalities, and your machine learning skills could help streamline the process radiologists use to evaluate screening mammograms. Currently, early detection of breast cancer requi...
computer vision, machine learning, machine translation, natural language processing
MMID is a large-scale, massively multilingual dataset of images paired with the words they represent, collected at the University of Pennsylvania. The dataset is doubly parallel: for each language, words are stored parallel to images that represent the word, and parallel to the word's translation into English (and corresponding images).
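To make the "doubly parallel" layout concrete, a toy sketch follows; the field names and paths are invented for illustration and are not MMID's actual on-disk schema.

```python
# Toy illustration only: one MMID-style record, showing how a word keys
# both its own images and its English translation (with its images).
# Field names and file paths here are hypothetical, not the real schema.
entry = {
    "language": "fr",
    "word": "chien",
    "word_images": ["fr/chien/0001.jpg", "fr/chien/0002.jpg"],
    "english_translation": "dog",
    "translation_images": ["en/dog/0001.jpg", "en/dog/0002.jpg"],
}
print(f"{entry['word']} ({entry['language']}) -> {entry['english_translation']}")
```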
amazon.science, computer vision, machine learning
The Amazon Bin Image Dataset contains over 500,000 images and metadata from bins of a pod in an operating Amazon Fulfillment Center. The bin images in this dataset are captured as robot units carry pods as part of normal Amazon Fulfillment Center operations.
amazon.science, computer vision, labeled, machine learning, parquet, video
This is both the original .tfrecords and a Parquet representation of the YouTube 8 Million dataset. YouTube-8M is a large-scale labeled video dataset that consists of millions of YouTube video IDs, with high-quality machine-generated annotations from a diverse vocabulary of 3,800+ visual entities. It comes with precomputed audio-visual features from billions of frames and audio segments, designed to fit on a single hard disk. This dataset also includes the YouTube-8M Segments data from June 2019. This dataset is 'Lakehouse Ready', meaning you can query this data in-place straight out of...
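A minimal sketch of such an in-place query with pyarrow; the bucket path, region, and column names here are assumptions for illustration, not the dataset's documented schema.

```python
# Hedged sketch: scan the Parquet representation directly from S3 with
# pyarrow; path, region, and column names are assumptions — check the
# dataset documentation for the real layout.
import pyarrow.dataset as ds
from pyarrow import fs

s3 = fs.S3FileSystem(anonymous=True, region="us-east-1")  # region is a guess
table = ds.dataset(
    "example-bucket/yt8m/train",     # hypothetical bucket/prefix
    filesystem=s3,
    format="parquet",
).head(5, columns=["id", "labels"])  # column names are guesses from the YT8M schema
print(table.to_pandas())
```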
amazon.science, computer vision, deep learning
The first large public body measurement dataset, including 8,978 frontal and lateral silhouettes for 2,505 real subjects paired with height, weight, and 14 body measurements. The following artifacts are made available for each subject.
amazon.science, computer vision
PersonPath22 is a large-scale multi-person tracking dataset containing 236 videos captured mostly from static-mounted cameras, collected from sources where we were given the rights to redistribute the content and participants have given explicit consent. Each video has ground-truth annotations including both bounding boxes and tracklet-ids for all the persons in each frame.
amazon.science, computer vision, deep learning, machine learning
Airborne Object Tracking (AOT) is a collection of 4,943 flight sequences of around 120 seconds each, collected at 10 Hz in diverse conditions. There are 5.9M+ images and 3.3M+ 2D annotations of airborne objects in the sequences. There are 3,306,350 frames without labels as they contain no airborne objects. For images with labels, there are on average 1.3 labels per image. All airborne objects in the dataset are labelled.
amazon.science, computer vision, deep learning, information retrieval, machine learning, machine translation
Amazon Berkeley Objects (ABO) is a collection of 147,702 product listings with multilingual metadata and 398,212 unique catalog images. 8,222 listings come with turntable photography (also referred to as "spin" or "360º-View" images), as sequences of 24 or 72 images, for a total of 586,584 images in 8,209 unique sequences. For 7,953 products, the collection also provides high-quality 3D models as glTF 2.0 files.
amazon.science, computer vision, machine learning
Fine-grained localized visual similarity and search for fashion.