ClinVar - Data Lakehouse Ready

biotech blueprint chemistry genetic genomic life sciences parquet

Description

ClinVar is a freely accessible, public archive of reports on the relationships among human variations and phenotypes, with supporting evidence. It thereby facilitates access to and communication about the relationships asserted between human variation and observed health status, and the history of those interpretations. ClinVar processes submissions reporting variants found in patient samples, assertions made regarding their clinical significance, information about the submitter, and other supporting data. The alleles described in submissions are mapped to reference sequences and reported according to the HGVS standard. ClinVar then presents the data to interactive users as well as to those wishing to use ClinVar in daily workflows and other local applications. ClinVar works in collaboration with interested organizations to meet the needs of the medical genetics community as efficiently and effectively as possible. This representation of ClinVar is stored in Parquet format and is most easily queried through Amazon Athena. Follow the documentation link for installation instructions (installation takes under 2 minutes).

Update Frequency

Every Sunday at 1AM UTC

License

https://github.com/aws-samples/data-lake-as-code/blob/roda/docs/roda_attributions.txt

Documentation

https://github.com/aws-samples/data-lake-as-code/blob/roda/docs/roda_install.md

Managed By

Amazon Web Services

Contact

https://github.com/aws-samples/data-lake-as-code/issues

How to Cite

ClinVar - Data Lakehouse Ready was accessed on DATE from https://registry.opendata.aws/clinvar.

Resources on AWS

  • Description: ClinVar
    Resource type: S3 Bucket
    Amazon Resource Name (ARN): arn:aws:s3:::aws-roda-hcls-datalake/clinvar_summary_variants/
    AWS Region: us-east-1
    AWS CLI Access (No AWS account required): aws s3 ls --no-sign-request s3://aws-roda-hcls-datalake/clinvar_summary_variants/
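
For readers without the AWS CLI installed, the same anonymous listing can be sketched in Python using only the standard library. This is a minimal sketch, not part of the dataset's documentation: it assumes the bucket accepts unsigned S3 ListObjectsV2 requests over HTTPS (which the --no-sign-request command above suggests), and the function names build_list_url, parse_listing, and list_objects are hypothetical helpers introduced here for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Values taken from the resource entry above.
BUCKET = "aws-roda-hcls-datalake"
PREFIX = "clinvar_summary_variants/"
REGION = "us-east-1"

# XML namespace used by S3 list responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def build_list_url(bucket=BUCKET, prefix=PREFIX, region=REGION, token=None):
    """Build an unsigned ListObjectsV2 URL (hypothetical helper)."""
    params = {"list-type": "2", "prefix": prefix}
    if token:
        # Continue a listing that was truncated by the previous page.
        params["continuation-token"] = token
    query = urllib.parse.urlencode(params)
    return f"https://{bucket}.s3.{region}.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return ([(key, size_in_bytes), ...], next_token_or_None)."""
    root = ET.fromstring(xml_text)
    objects = [
        (item.findtext(f"{S3_NS}Key"), int(item.findtext(f"{S3_NS}Size")))
        for item in root.iter(f"{S3_NS}Contents")
    ]
    token = root.findtext(f"{S3_NS}NextContinuationToken")
    return objects, token


def list_objects():
    """Yield (key, size) for every object under the prefix.

    Requires network access; assumes anonymous listing is permitted.
    """
    token = None
    while True:
        with urllib.request.urlopen(build_list_url(token=token)) as resp:
            objects, token = parse_listing(resp.read())
        yield from objects
        if token is None:
            break
```

Iterating over list_objects() and printing each key and size would give roughly the same information as the aws s3 ls command with the --recursive flag added. Keeping URL construction and XML parsing in separate functions lets both be checked without a network connection.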
