
Synthea synthetic patient generator data in OMOP Common Data Model

bioinformatics health life sciences natural language processing us


The Synthea-generated data is provided here as 1,000-person (1k), 100,000-person (100k), and 2,800,000-person (2.8m) data sets in the OMOP Common Data Model format. Synthea™ is a synthetic patient generator that models the medical history of synthetic patients. Our mission is to output high-quality synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions. It can be used without restriction for a variety of secondary uses in academia, research, industry, and government (although a citation would be appreciated). You can read our first academic paper here:

Update Frequency

Not updated



Managed By

Amazon Web Services

See all datasets managed by Amazon Web Services.


Post any questions to re:Post and use the AWS Open Data tag.

How to Cite

Synthea synthetic patient generator data in OMOP Common Data Model was accessed on DATE from

Usage Examples

Tools & Applications

Resources on AWS

  • Description: Project data files
    Resource type: S3 Bucket
    Amazon Resource Name (ARN):
    AWS Region:
    AWS CLI Access (No AWS account required):
    aws s3 ls --no-sign-request s3://synthea-omop/
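
For readers who would rather list the bucket from a script than from the AWS CLI, the following is a minimal sketch using only the Python standard library. It assumes the bucket allows anonymous requests (as the `--no-sign-request` flag above suggests) and that it is reachable through the generic `https://<bucket>.s3.amazonaws.com/` endpoint; the listing only shows AWS Region as a field, so a region-specific endpoint may be needed instead. The helper names `build_list_url`, `parse_listing`, and `list_bucket` are hypothetical and chosen for illustration only.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Bucket name taken from the CLI command in the listing above.
BUCKET = "synthea-omop"

# XML namespace used by S3 ListObjectsV2 responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def build_list_url(bucket, prefix="", delimiter="/"):
    """Build an unsigned ListObjectsV2 URL (assumes the generic S3 endpoint)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "delimiter": delimiter}
    )
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return (prefixes, objects) from a ListObjectsV2 XML response.

    prefixes: list of "folder" prefixes at this level
    objects:  list of (key, size_in_bytes) tuples at this level
    """
    root = ET.fromstring(xml_text)
    prefixes = [
        el.findtext(f"{S3_NS}Prefix")
        for el in root.findall(f"{S3_NS}CommonPrefixes")
    ]
    objects = [
        (el.findtext(f"{S3_NS}Key"), int(el.findtext(f"{S3_NS}Size")))
        for el in root.findall(f"{S3_NS}Contents")
    ]
    return prefixes, objects


def list_bucket(bucket=BUCKET, prefix=""):
    """Fetch one page of the listing anonymously (requires network access)."""
    with urllib.request.urlopen(build_list_url(bucket, prefix)) as resp:
        return parse_listing(resp.read())
```

Calling `list_bucket()` should return roughly what `aws s3 ls --no-sign-request s3://synthea-omop/` prints, one level at a time. The sketch reads only the first page of results (up to 1,000 entries) and does not follow continuation tokens.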

