CNAs (gene-associated CNAs identified from TCGA data using the GISTIC2 tool [75]) were downloaded via Firehose from 'https://gdac.broadinstitute.org'. Both amplified and deleted genes were collected, and ultramutated samples listed in syn1729383 were removed. The copy number rate of a gene was defined as the number of times it was amplified or deleted in a given cohort. Amplifications and deletions were not distinguished: all types of CNAs were aggregated for each cancer type.
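The counting step can be illustrated with a minimal sketch. It assumes the gene-level calls have already been parsed into a nested dictionary with integer values (0 for no change, positive for amplification, negative for deletion); that encoding, the function name `copy_number_rate`, and the sample and gene IDs are hypothetical and are not taken from the original analysis.

```python
from collections import Counter


def copy_number_rate(calls, excluded_samples=()):
    """Count, per gene, the samples in which the gene is amplified or deleted.

    calls: dict mapping sample ID -> dict of gene -> integer CNA call
           (assumed encoding: 0 = no change, >0 = amplified, <0 = deleted).
    excluded_samples: sample IDs to drop, e.g. ultramutated samples.
    """
    excluded = set(excluded_samples)
    rate = Counter()
    for sample, gene_calls in calls.items():
        if sample in excluded:
            continue
        for gene, call in gene_calls.items():
            if call != 0:  # amplifications and deletions are pooled
                rate[gene] += 1
    return rate


# Toy cohort with made-up identifiers; S3 plays the role of an excluded sample.
calls = {
    "S1": {"GENE_A": 1, "GENE_B": 0, "GENE_C": -1},
    "S2": {"GENE_A": 2, "GENE_B": 0, "GENE_C": 0},
    "S3": {"GENE_A": -2, "GENE_B": 1, "GENE_C": 0},
}
rates = copy_number_rate(calls, excluded_samples=["S3"])
```

Because a `Counter` is used, genes that are never altered in the retained samples simply report a rate of zero when looked up.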

Chromosomal location information for each gene was obtained from the UCSC Genome Browser ('https://genome.ucsc.edu/'). This information was used to group genes located on the same chromosome and calculate the total number of CNAs affecting each chromosome in each cancer type.
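As an illustration of the per-chromosome aggregation, the sketch below sums gene-level copy number rates by chromosome for a single cancer type. The gene-to-chromosome mapping is assumed to have been extracted from a UCSC annotation table beforehand; the function name `chromosome_totals` and all identifiers are hypothetical, and skipping genes without a known location is an assumption rather than a documented step.

```python
from collections import defaultdict


def chromosome_totals(rates, gene_to_chrom):
    """Sum gene-level copy number rates over the genes on each chromosome."""
    totals = defaultdict(int)
    for gene, rate in rates.items():
        chrom = gene_to_chrom.get(gene)
        if chrom is not None:  # genes without a known location are skipped
            totals[chrom] += rate
    return dict(totals)


# Toy values for one cancer type; GENE_D has no chromosome annotation.
rates = {"GENE_A": 4, "GENE_B": 1, "GENE_C": 5, "GENE_D": 3}
gene_to_chrom = {"GENE_A": "chr1", "GENE_B": "chr1", "GENE_C": "chr17"}
totals = chromosome_totals(rates, gene_to_chrom)
```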

To compare CNAs across different cancer types, the copy number rate of each gene was normalized by the total number of CNAs affecting the corresponding chromosome in the same cancer type. This normalization accounts for the fact that some chromosomes are more commonly affected by CNAs than others.
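In other words, the normalized value for a gene is its copy number rate divided by the summed rates of all genes on the same chromosome in the same cancer type. A minimal sketch of that calculation follows; the function name `normalize_rates` and the toy inputs are hypothetical, and leaving out genes on chromosomes with no CNAs (to avoid division by zero) is an assumption.

```python
def normalize_rates(rates, gene_to_chrom):
    """Divide each gene's copy number rate by the total for its chromosome."""
    # First pass: total number of CNAs per chromosome in this cancer type.
    totals = {}
    for gene, rate in rates.items():
        chrom = gene_to_chrom.get(gene)
        if chrom is not None:
            totals[chrom] = totals.get(chrom, 0) + rate

    # Second pass: normalize each gene by its chromosome total.
    normalized = {}
    for gene, rate in rates.items():
        chrom = gene_to_chrom.get(gene)
        if chrom is None or totals[chrom] == 0:
            continue  # unmapped gene, or chromosome without any CNAs
        normalized[gene] = rate / totals[chrom]
    return normalized


rates = {"GENE_A": 4, "GENE_B": 1, "GENE_C": 5}
gene_to_chrom = {"GENE_A": "chr1", "GENE_B": "chr1", "GENE_C": "chr17"}
normalized = normalize_rates(rates, gene_to_chrom)
```

One consequence of this definition is that the normalized values of the genes on any one chromosome sum to 1 within a cancer type, so the values describe how CNAs are distributed along a chromosome rather than how often the chromosome is altered overall.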

Principal component analysis (PCA) was used to visualize similarities and differences in CNA profiles between cancer types. PCA reduces the dimensionality of the data by identifying the directions (principal components) that capture the most variance. The first two principal components were used to generate a scatter plot in which each point represents a cancer type; points that lie closer together have more similar CNA profiles.
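The projection onto the first two principal components can be sketched with NumPy's singular value decomposition. This assumes a matrix with one row per cancer type and one column per gene, holding the normalized copy number rates; the data are centered but not rescaled, which is an assumption because the text does not state whether features were standardized. The function name `pca_2d` and the toy matrix are hypothetical.

```python
import numpy as np


def pca_2d(X):
    """Project the rows of X onto the first two principal components.

    Returns the 2-D coordinates of each row and the fraction of the total
    variance captured by each of the two components.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)  # center each gene (column)
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :2] * S[:2]  # coordinates along PC1 and PC2
    explained = S ** 2 / np.sum(S ** 2)
    return scores, explained[:2]


# Toy matrix: 3 cancer types (rows) x 4 genes (columns), made-up values.
X = np.array([
    [0.80, 0.20, 1.00, 0.00],
    [0.75, 0.25, 0.90, 0.10],
    [0.10, 0.90, 0.20, 0.80],
])
scores, explained = pca_2d(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives the scatter plot described above. In this toy example the first two rows have similar profiles and therefore land close together, while the third lies far from both.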

Cancer Type Comparison Using Copy Number Alterations (CNAs) from TCGA Data
