Oxford Nanopore Technologies Benchmark Datasets

bioinformatics biology fast5 fastq genomic Homo sapiens life sciences whole genome sequencing

Description

The ont-open-data registry provides reference sequencing data from Oxford Nanopore Technologies to support: 1) exploration of the characteristics of nanopore sequence data; 2) assessment and reproduction of performance benchmarks; 3) development of tools and methods. The deposited data showcase DNA sequences from a representative subset of sequencing chemistries. The datasets correspond to publicly available reference samples (e.g. GM24385 as a human reference). Raw data are provided with metadata and scripts to describe sample and data provenance.

Update Frequency

Additional datasets will be added periodically. Updates and amendments will be made to existing entries when algorithmic advancements are made (e.g. improvements to basecalling algorithms).

License

Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/ The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM24385.

Documentation

https://nanoporetech.github.io/ont-open-datasets

Managed By

Oxford Nanopore Technologies

Contact

support@nanoporetech.com

Usage Examples

Tutorials

Resources on AWS

  • Description
    Oxford Nanopore Open Datasets
    Resource type
    S3 Bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::ont-open-data
    AWS Region
    eu-west-1
    AWS CLI Access (No AWS account required)
    aws s3 ls s3://ont-open-data/ --no-sign-request
  • Description
    Nanopore sequencing data of the Genome in a Bottle sample GM24385, generated on multiple PromethION flowcells using both the R9.4.1 and R10.3 nanopores. The direct sequencer output is included: raw signal data stored in .fast5 files and basecalled data in .fastq files. Additional secondary analyses are also included, notably alignments of the sequence data to the reference genome, along with statistics derived from them. The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM24385.
    Resource type
    S3 Bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::ont-open-data/gm24385_2020.09
    AWS Region
    eu-west-1
    AWS CLI Access (No AWS account required)
    aws s3 ls s3://ont-open-data/gm24385_2020.09/ --no-sign-request
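
The same unauthenticated listing can also be sketched without the AWS CLI. The example below is a minimal illustration, not an official client: it assumes the bucket accepts anonymous list requests over HTTPS (which the `--no-sign-request` commands above suggest) and that it is reachable at the standard virtual-hosted S3 endpoint for eu-west-1. The function names `listing_url`, `parse_listing`, and `list_prefix` are hypothetical helpers introduced here for illustration only.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET = "ont-open-data"  # bucket name taken from the ARN listed above
REGION = "eu-west-1"      # region taken from the resource listing above
# XML namespace used by S3 list responses
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def listing_url(prefix="", bucket=BUCKET, region=REGION):
    """Build an unsigned ListObjectsV2 URL for one 'directory' level.

    Assumption: the bucket is served from the standard virtual-hosted
    endpoint and allows anonymous listing.
    """
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "delimiter": "/"}
    )
    return f"https://{bucket}.s3.{region}.amazonaws.com/?{query}"


def parse_listing(xml_data):
    """Return (sub_prefixes, [(key, size_in_bytes), ...]) from a list response."""
    root = ET.fromstring(xml_data)
    sub_prefixes = [
        el.findtext(f"{S3_NS}Prefix")
        for el in root.findall(f"{S3_NS}CommonPrefixes")
    ]
    objects = [
        (el.findtext(f"{S3_NS}Key"), int(el.findtext(f"{S3_NS}Size")))
        for el in root.findall(f"{S3_NS}Contents")
    ]
    return sub_prefixes, objects


def list_prefix(prefix=""):
    """Fetch and parse the first page of results under a prefix.

    S3 returns at most 1000 entries per request; this sketch does not
    follow continuation tokens.
    """
    with urllib.request.urlopen(listing_url(prefix), timeout=30) as resp:
        return parse_listing(resp.read())
```

Calling `list_prefix("gm24385_2020.09/")` should return roughly what the second CLI command above prints: the sub-prefixes directly under the GM24385 dataset, plus any objects stored at that level. For bulk downloads, the AWS CLI is likely still the more practical choice.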
