Description
The Cancer Genome Atlas (TCGA), a collaboration between the National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI), aims to generate comprehensive, multi-dimensional maps of the key genomic changes in major types and subtypes of cancer. TCGA has analyzed matched tumor and normal tissues from 11,000 patients, allowing for the comprehensive characterization of 33 cancer types and subtypes, including 10 rare cancers.
The dataset contains open Clinical Supplement, Biospecimen Supplement, RNA-Seq Gene Expression Quantification, miRNA-Seq Isoform Expression Quantification, miRNA Expression Quantification, Genotyping Array Copy Number Segment, Genotyping Array Masked Copy Number Segment, Genotyping Array Gene Level Copy Number Scores, and WXS Masked Somatic Mutation data from Genomic Data Commons (GDC).
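As an illustration of how the open data types listed above might be located, the following is a minimal sketch that builds a query for the public GDC files endpoint (`https://api.gdc.cancer.gov/files`). It assumes the reader queries GDC directly rather than the AWS copy, and the helper name `build_tcga_file_query` and the chosen return fields are hypothetical. The block only constructs the request parameters and performs no network call.

```python
import json

# Public GDC API endpoint for file metadata.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_tcga_file_query(data_type, access="open", size=10):
    """Build query parameters for TCGA files of one data type.

    Hypothetical helper for illustration; the filter fields follow the
    GDC API's nested filter syntax.
    """
    filters = {
        "op": "and",
        "content": [
            # Restrict to the TCGA program.
            {"op": "in", "content": {"field": "cases.project.program.name", "value": ["TCGA"]}},
            # "open" or "controlled" access tier.
            {"op": "in", "content": {"field": "access", "value": [access]}},
            # For example "Gene Expression Quantification".
            {"op": "in", "content": {"field": "data_type", "value": [data_type]}},
        ],
    }
    return {
        "filters": json.dumps(filters),
        "fields": "file_id,file_name,data_type,access",
        "format": "JSON",
        "size": str(size),
    }


params = build_tcga_file_query("Gene Expression Quantification")
```

The resulting `params` dictionary could then be passed to an HTTP client, for example `requests.get(GDC_FILES_ENDPOINT, params=params)`. Controlled-access files require authorization and are not retrievable this way without a GDC token.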
This dataset also contains controlled Whole Exome Sequencing (WXS), RNA-Seq, miRNA-Seq, ATAC-Seq Aligned Reads, WXS Annotated Somatic Mutation, WXS Raw Somatic Mutation, and WXS Aggregated Somatic Mutation data from GDC.
TCGA is made available on AWS via the NIH STRIDES Initiative.
Update Frequency
The Genomic Data Commons (GDC) is the source of truth for this dataset. GDC offers monthly data releases, although this dataset may not be updated at every release.
License
NIH Genomic Data Sharing Policy: https://gdc.cancer.gov/access-data/data-access-policies
Documentation
https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga
Managed By
Center for Translational Data Science at The University of Chicago
Contact
dcf-support@datacommons.io
How to Cite
The Cancer Genome Atlas was accessed on DATE from https://registry.opendata.aws/tcga.
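A small sketch of filling in the DATE placeholder of the citation programmatically. The ISO 8601 date format and the helper name `format_citation` are assumptions; the registry does not prescribe a date format.

```python
from datetime import date

# Citation text from the registry, with DATE turned into a placeholder.
CITATION_TEMPLATE = (
    "The Cancer Genome Atlas was accessed on {accessed} "
    "from https://registry.opendata.aws/tcga."
)


def format_citation(accessed: date) -> str:
    """Return the citation with the access date in ISO 8601 form (an assumption)."""
    return CITATION_TEMPLATE.format(accessed=accessed.isoformat())


citation = format_citation(date(2024, 1, 15))
print(citation)
```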
Usage Examples
Tools & Applications
Publications
- Before and After: A Comparison of Legacy and Harmonized TCGA Data at the Genomic Data Commons by Galen F. Gao, Joel S. Parker, et al.
- A Pan-Cancer Analysis of Enhancer Expression in Nearly 9000 Patient Samples by Han Chen, Chunyan Li, et al.
- An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics by Jianfang Liu, Tara Lichtenberg, et al.
- Cell-of-Origin Patterns Dominate the Molecular Classification of 10,000 Tumors from 33 Types of Cancer by Katherine A. Hoadley, Christina Yau, et al.
- Comparative Molecular Analysis of Gastrointestinal Adenocarcinomas by Yang Liu, Nilay S. Sethi, et al.
- Comprehensive Analysis of Alternative Splicing Across Tumors from 8,705 Patients by André Kahles, Kjong-Van Lehmann, et al.
- Comprehensive Characterization of Cancer Driver Genes and Mutations by Matthew H. Bailey, Collin Tokheim, et al.
- Genomic and Functional Approaches to Understanding Cancer Aneuploidy by Alison M. Taylor, Juliann Shih, et al.
- Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas by Theo A. Knijnenburg, Linghua Wang, et al.
- Genomic, Pathway Network, and Immunologic Features Distinguishing Squamous Carcinomas by Joshua D. Campbell, Christina Yau, et al.
- Integrated Genomic Analysis of the Ubiquitin Pathway across Cancer Types by Zhongqi Ge, Jake S. Leighton, et al.
- Machine Learning Detects Pan-Cancer Ras Pathway Activation in The Cancer Genome Atlas by Gregory P. Way, Francisco Sanchez-Vega, et al.
- Machine Learning Identifies Stemness Features Associated with Oncogenic Dedifferentiation by Tathiane M. Malta, Artem Sokolov, et al.
- Molecular Characterization and Clinical Relevance of Metabolic Expression Subtypes in Human Cancers by Xinxin Peng, Zhongyuan Chen, et al.
- Oncogenic Signaling Pathways in The Cancer Genome Atlas by Francisco Sanchez-Vega, Marco Mina, et al.
- Pan-Cancer Alterations of the MYC Oncogene and its Proximal Network Across The Cancer Genome Atlas by Franz X. Schaub, Varsha Dhankani, et al.
- Pan-Cancer Analysis of lncRNA Regulation Supports Their Targeting of Cancer Genes in Each Tumor Context by Hua-Sheng Chiu, Sonal Somvanshi, et al.
- Pathogenic Germline Variants in 10,389 Adult Cancers by Kuan-lin Huang, R. Jay Mashl, et al.
- Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines by Kyle Ellrott, Matthew H. Bailey, et al.
- Spatial Organization And Molecular Correlation Of Tumor-Infiltrating Lymphocytes Using Deep Learning On Pathology Images by Joel Saltz, Rajarsi Gupta, et al.
- The chromatin accessibility landscape of primary human cancers by M. Ryan Corces, Jeffrey M. Granja, et al.
- The Immune Landscape of Cancer by Vésteinn Thorsson, David L. Gibbs, et al.