bioinformatics, biology, deep learning, life sciences, machine learning, protein
ProteinGym is a benchmark suite for evaluating protein fitness prediction and design models. It comprises a large curated collection of 200+ high-throughput experimental assays (~3M mutated sequences), together with expert clinical annotations of the pathogenicity of mutants in over 3,000 human genes.
Update frequency: Quarterly
License: MIT License
Documentation: https://github.com/OATML-Markslab/ProteinGym/blob/main/README.md
Managed by: Harvard Medical School; University of Oxford
ProteinGym was accessed on DATE from https://registry.opendata.aws/proteingym.
Amazon Resource Name (ARN): arn:aws:s3:::proteingym
AWS Region: us-east-2
aws s3 ls --no-sign-request s3://proteingym/
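
The same unsigned listing can be sketched in Python with only the standard library, for readers who do not have the AWS CLI installed. This is a minimal illustration, not an official client. It assumes the bucket allows anonymous listing over HTTPS (as the `--no-sign-request` command above suggests) and uses the standard S3 ListObjectsV2 REST request with a virtual-hosted-style URL. The helper names `listing_url`, `parse_listing`, and `list_bucket` are hypothetical, and pagination (S3 returns at most 1000 keys per response) is deliberately left out.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET = "proteingym"  # from the ARN arn:aws:s3:::proteingym
REGION = "us-east-2"   # region listed for the bucket

# XML namespace used by S3 REST API responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def listing_url(bucket, region, prefix="", delimiter="/"):
    """Build an unsigned ListObjectsV2 URL (virtual-hosted style)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "delimiter": delimiter}
    )
    return f"https://{bucket}.s3.{region}.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return (sub-prefixes, [(key, size_in_bytes)]) from a listing response."""
    root = ET.fromstring(xml_text)
    prefixes = [
        el.findtext(f"{S3_NS}Prefix")
        for el in root.findall(f"{S3_NS}CommonPrefixes")
    ]
    objects = [
        (el.findtext(f"{S3_NS}Key"), int(el.findtext(f"{S3_NS}Size")))
        for el in root.findall(f"{S3_NS}Contents")
    ]
    return prefixes, objects


def list_bucket(prefix=""):
    """Fetch one page of the listing anonymously (requires network access)."""
    url = listing_url(BUCKET, REGION, prefix)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_listing(resp.read())


# Example usage (uncomment to run against the live bucket):
# folders, files = list_bucket()
# print(folders, files)
```

Calling `list_bucket()` with no arguments is meant to mirror the top-level `aws s3 ls` call. Passing one of the returned prefixes as `prefix` descends one level, in the same way as appending a folder name to the `s3://proteingym/` path.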