Voices Obscured in Complex Environmental Settings (VOiCES)

automatic speech recognition, denoising, machine learning, speaker identification, speech processing

Description

VOiCES is a speech corpus recorded in acoustically challenging settings using distant microphones. Speech was recorded in real rooms with a range of acoustic characteristics (reverberation, echo, HVAC systems, outside noise, etc.). Adversarial noise (television, music, or babble) was played concurrently with the clean speech. Audio was captured by multiple microphones placed strategically throughout each room. The corpus includes audio recordings, orthographic transcriptions, and speaker labels.

Update Frequency

Data from two additional rooms will be added to the corpus in Fall 2018.

License

Creative Commons BY 4.0

Documentation

https://voices18.github.io/

Managed By

In-Q-Tel

Contact

https://github.com/voices18/utilities/issues

How to Cite

Voices Obscured in Complex Environmental Settings (VOiCES) was accessed on DATE from https://registry.opendata.aws/lab41-sri-voices.

Usage Examples

Tutorials

Getting started with VOiCES data by M.A. Barrios

Resources on AWS

Description

WAV audio files, orthographic transcriptions, and speaker IDs

Resource type

S3 Bucket

Amazon Resource Name (ARN)

arn:aws:s3:::lab41openaudiocorpus

AWS Region

us-east-1

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://lab41openaudiocorpus/
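
The same anonymous listing can also be sketched without the AWS CLI. The Python example below is a minimal illustration, not an official client: it assumes the bucket allows unsigned ListObjectsV2 requests over the S3 REST endpoint (as the `--no-sign-request` command above suggests), and the function names `build_list_url`, `parse_listing`, and `list_objects` are hypothetical helpers introduced here. It uses only the standard library.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET = "lab41openaudiocorpus"
# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def build_list_url(bucket, prefix="", max_keys=20):
    """Build an unsigned ListObjectsV2 URL (assumes the bucket is publicly listable)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_listing(xml_data):
    """Return (key, size_in_bytes) pairs from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_data)
    return [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]


def list_objects(bucket=BUCKET, prefix="", max_keys=20):
    """Fetch one page of the listing anonymously (requires network access)."""
    url = build_list_url(bucket, prefix, max_keys)
    with urllib.request.urlopen(url) as resp:
        return parse_listing(resp.read())
```

Calling `list_objects(max_keys=5)` would fetch the first five entries, if the assumption about anonymous access holds. A single call returns one page only; a fuller client would follow the continuation token in the response to walk the entire bucket.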