Open Human Genome Library

bioinformatics, biology, genomic, life sciences

Description

The Open Human Genome Library (OpenHGL) is a collection of high-quality de novo human assemblies that are publicly available in genomic databases (e.g. NCBI and CNCB) or from individual research papers. It provides consistent naming and uniform formats across datasets, supporting efficient subsequence retrieval and approximate string search.

Update Frequency

As new data or new analyses become available

License

Creative Commons Zero (CC0)

Documentation

https://lh3.github.io/OpenHGL/

Managed By

Heng Li lab at Dana-Farber Cancer Institute and Harvard Medical School

Contact

https://github.com/lh3/OpenHGL/issues

How to Cite

Open Human Genome Library was accessed on DATE from https://registry.opendata.aws/openhgl.

Resources on AWS

  • Description: This bucket contains genomic sequences in the AGC format and the corresponding FM-index in the ropebwt3 format.
    Resource type: S3 Bucket
    Amazon Resource Name (ARN): arn:aws:s3:::openhgl
    AWS Region: us-east-1
    AWS CLI Access (No AWS account required): aws s3 ls --no-sign-request s3://openhgl/
  • Description: Notifications for OpenHGL updates
    Resource type: SNS Topic
    Amazon Resource Name (ARN): arn:aws:sns:us-east-1:104240442756:openhgl-object_created
    AWS Region: us-east-1
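
For readers who prefer a script to the AWS CLI, the two resources above can be used roughly as follows. This is a minimal sketch using only the Python standard library, not an official client. It assumes the bucket answers anonymous S3 ListObjectsV2 requests over HTTPS (consistent with the --no-sign-request access shown above) and that the SNS topic delivers standard S3 event notifications, which the topic name suggests but this entry does not state. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET = "openhgl"    # from arn:aws:s3:::openhgl
REGION = "us-east-1"  # region listed for the bucket
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"  # S3 XML namespace


def list_url(prefix="", max_keys=1000, token=None):
    """Build an unsigned ListObjectsV2 request URL for the bucket."""
    params = {"list-type": "2", "max-keys": str(max_keys)}
    if prefix:
        params["prefix"] = prefix
    if token:
        params["continuation-token"] = token
    query = urllib.parse.urlencode(params)
    return f"https://{BUCKET}.s3.{REGION}.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return ([(key, size), ...], next_token) from a ListObjectsV2 response."""
    root = ET.fromstring(xml_text)
    objects = [
        (c.findtext(f"{S3_NS}Key"), int(c.findtext(f"{S3_NS}Size")))
        for c in root.iter(f"{S3_NS}Contents")
    ]
    # next_token is None on the last page
    next_token = root.findtext(f"{S3_NS}NextContinuationToken")
    return objects, next_token


def list_bucket(prefix=""):
    """Yield (key, size) for every object under prefix, following pagination."""
    token = None
    while True:
        with urllib.request.urlopen(list_url(prefix, token=token)) as resp:
            objects, token = parse_listing(resp.read())
        yield from objects
        if not token:
            break


def keys_from_sns(envelope_json):
    """Extract object keys from an SNS message body.

    Assumption: the "Message" field holds a standard S3 event notification,
    in which object keys are URL-encoded.
    """
    event = json.loads(json.loads(envelope_json)["Message"])
    return [
        urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        for record in event.get("Records", [])
    ]
```

Iterating over list_bucket() performs network requests and yields (key, size) pairs, much like the aws s3 ls command above. keys_from_sns() is meant for the body an SNS subscriber (for example an SQS queue or HTTPS endpoint) receives when new objects are added.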
