bioinformatics, biology, biotech, blueprint, genetic, genomic, life sciences, parquet, population genetics, vcf, whole genome sequencing
Amazon is no longer hosting this Data Lakehouse Ready dataset
The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. Sign up for the gnomAD mailing list here. This dataset was derived from summary data from gnomAD release 3.1 and was made available on the Registry of Open Data on AWS for ready enrollment into the Data Lake as Code.
Not updated
https://github.com/aws-samples/data-lake-as-code/tree/roda#readme
Managed by Amazon Web Services.
https://github.com/aws-samples/data-lake-as-code/issues
Genome Aggregation Database (gnomAD) - Data Lakehouse Ready was accessed on DATE from https://registry.opendata.aws/gnomad-data-lakehouse-ready.
arn:aws:s3:::aws-roda-hcls-datalake/gnomad
us-east-1
aws s3 ls --no-sign-request s3://aws-roda-hcls-datalake/gnomad/
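The resource ARN, region, and listing command above all describe the same S3 location. As a minimal sketch, the S3 URI can be derived from the ARN with plain POSIX parameter expansion. The variable names (`arn`, `bucket`, `prefix`, `s3_uri`) are illustrative assumptions, not part of the listing. Because the notice above says Amazon is no longer hosting this dataset, the listing call itself may fail, so it is left commented out.

```shell
#!/bin/sh
# Resource ARN and region as given in the listing.
arn="arn:aws:s3:::aws-roda-hcls-datalake/gnomad"
region="us-east-1"

# Drop the fixed S3 ARN prefix, leaving "bucket/key-prefix".
path="${arn#arn:aws:s3:::}"

# Split at the first slash: bucket name before it, key prefix after it.
bucket="${path%%/*}"
prefix="${path#*/}"

# Rebuild the URI form that the AWS CLI expects.
s3_uri="s3://${bucket}/${prefix}/"
echo "$s3_uri"

# Anonymous listing (requires the AWS CLI and network access; may fail now
# that the dataset is no longer hosted):
# aws s3 ls --no-sign-request --region "$region" "$s3_uri"
```

Run as written, the script prints `s3://aws-roda-hcls-datalake/gnomad/`, which matches the path used in the `aws s3 ls` command above.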