CZ CELLxGENE Discover Census

bioinformatics, cell biology, single-cell transcriptomics, transcriptomics

Description

CZ CELLxGENE Discover (cellxgene.cziscience.com) is a free-to-use platform for the exploration, analysis, and retrieval of single-cell data. It hosts the largest aggregation of standardized single-cell data from the major human and mouse tissues, with modalities that include gene expression, chromatin accessibility, DNA methylation, and spatial transcriptomics. CZ CELLxGENE Discover has made all of its human and mouse single-cell RNA data available through Census (https://chanzuckerberg.github.io/cellxgene-census/), a free-to-use service that combines the data with an API for querying the single-cell corpus directly from Python or R. The API is built on TileDB-SOMA, which allows efficient, low-latency querying. The data are fully standardized and hosted publicly for free access. They consist of a count matrix of tens of millions of cells (observations) by more than 60,000 genes (features), accompanied by standard cell metadata variables (e.g. cell type, tissue, sequencing technology, donor ID) and gene metadata that includes GENCODE-based IDs and gene names. Although the data are built from hundreds of datasets, the API provides convenient cell- and gene-based filtering, so any slice of interest can be obtained in a matter of seconds. All data can be quickly converted to NumPy, Pandas, AnnData, Seurat, or base R objects. Data loaders are provided so that the data can be used directly by PyTorch for modeling at scale. In addition, all of the source dataset files are available for retrieval in H5AD format.
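As an illustration of the cell- and gene-based filtering described above, the following is a minimal Python sketch, not an official recipe. It assumes the `cellxgene_census` package is installed and that the Census schema exposes the columns `cell_type`, `tissue_general`, `is_primary_data` (cell metadata) and `feature_name` (gene metadata). The helper names `build_obs_filter` and `fetch_slice` are hypothetical and exist only for this example; column names and release labels should be checked against the Census documentation.

```python
def build_obs_filter(cell_types, tissue_general=None, primary_only=True):
    """Build a value-filter string over cell (obs) metadata.

    Hypothetical helper: the column names used here are assumptions
    about the Census schema and should be verified in the docs.
    Values containing quote characters are not handled.
    """
    clauses = [f"cell_type in {sorted(cell_types)!r}"]
    if tissue_general is not None:
        clauses.append(f"tissue_general == {tissue_general!r}")
    if primary_only:
        # Assumed flag marking a cell's primary (non-duplicated) record.
        clauses.append("is_primary_data == True")
    return " and ".join(clauses)


def fetch_slice(obs_filter, gene_names, census_version="stable"):
    """Query human cells matching obs_filter for the given genes.

    Needs network access and the cellxgene_census package, so the
    import is deferred until the function is actually called.
    """
    import cellxgene_census

    var_filter = f"feature_name in {list(gene_names)!r}"
    with cellxgene_census.open_soma(census_version=census_version) as census:
        # Returns an AnnData object holding the selected cells and genes.
        return cellxgene_census.get_anndata(
            census,
            organism="Homo sapiens",
            obs_value_filter=obs_filter,
            var_value_filter=var_filter,
        )


if __name__ == "__main__":
    obs_filter = build_obs_filter(["T cell", "B cell"], tissue_general="lung")
    print(obs_filter)
    # adata = fetch_slice(obs_filter, ["CD3E", "MS4A1"])  # requires network
```

Building the filter string separately from the query keeps the selection logic easy to inspect before any data is downloaded.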

Update Frequency

New releases are published weekly. Long-term supported (LTS) releases are published every 6 months.

License

CC BY license

Documentation

https://chanzuckerberg.github.io/cellxgene-census/

Managed By

Chan Zuckerberg Initiative Foundation

Contact

cellxgene@chanzuckerberg.com

How to Cite

CZ CELLxGENE Discover Census was accessed on DATE from https://registry.opendata.aws/czi-cellxgene-census.

Resources on AWS

  • Description: CZ CELLxGENE Discover Census Data
    Resource type: S3 Bucket
    Amazon Resource Name (ARN): arn:aws:s3:::cellxgene-census-public-us-west-2/cell-census
    AWS Region: us-west-2
    AWS CLI Access (No AWS account required): aws s3 ls --no-sign-request s3://cellxgene-census-public-us-west-2/cell-census/
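
The same anonymous listing can be approximated from Python. The sketch below assumes the `boto3` package is installed; the helper names `split_s3_arn` and `list_census_prefixes` are hypothetical and introduced only for illustration. It uses unsigned requests, mirroring the `--no-sign-request` flag, and returns only the top-level "folders" under the prefix rather than individual objects.

```python
CENSUS_ARN = "arn:aws:s3:::cellxgene-census-public-us-west-2/cell-census"


def split_s3_arn(arn):
    """Split an S3 ARN into (bucket, key_prefix)."""
    marker = "arn:aws:s3:::"
    if not arn.startswith(marker):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    bucket, _, key_prefix = arn[len(marker):].partition("/")
    return bucket, key_prefix


def list_census_prefixes(arn=CENSUS_ARN, region="us-west-2"):
    """List top-level prefixes under the Census location without credentials.

    Needs network access and boto3, so imports are deferred until call time.
    Only the first page of results is read.
    """
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    bucket, key_prefix = split_s3_arn(arn)
    # Unsigned requests: no AWS account or credentials are required.
    s3 = boto3.client(
        "s3", region_name=region, config=Config(signature_version=UNSIGNED)
    )
    response = s3.list_objects_v2(
        Bucket=bucket, Prefix=key_prefix.rstrip("/") + "/", Delimiter="/"
    )
    return [entry["Prefix"] for entry in response.get("CommonPrefixes", [])]


if __name__ == "__main__":
    print(split_s3_arn(CENSUS_ARN))
    # print(list_census_prefixes())  # requires network access
```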
