
BioModels Database


Mathias Uhlen et al., (2017). A pathology atlas of the human cancer transcriptome.

August 2017, model of the month by Varun Kothamachu and Rahuman Sheriff
Original models can be found in our new section on Patient Derived Genome Scale Metabolic Models (PDGSMM).
This month, we are making an exception to our usual model of the month article, which describes a model from the curated branch.
We will instead discuss models from a recent submission to BioModels that could have a significant impact on personalised medicine and cancer research.


Cancer is a highly complex, multifactorial disease in which genetic and environmental factors work together in ways that we do not yet fully understand. Researchers increasingly use mathematical modelling to describe cancer initiation, progression and metastasis, to elucidate mechanisms and to derive quantitative predictions [1]. Previous efforts either modelled cancer growth in the abstract or considered only a small signalling network. Recently, M. Uhlen et al. [2] developed genome scale metabolic models that factor in genomic variability amongst major types of cancer and amongst individual patients. They also added clinical context to the models, which has a significant impact on how well the models predict therapeutic outcomes.


To understand inter-cancer and inter-patient variability and the underlying molecular mechanisms, this study carried out a system-wide analysis of the transcriptomic data from tumour samples available in “The Cancer Genome Atlas” (TCGA) [3, 4]. By comparing differential gene expression across samples and correlating this information with the clinical outcome of each patient, a list of candidate prognostic genes was derived. Combining RNA-seq data from individual patients, data from the Human Protein Atlas (HPA) [5] and a generic human metabolic model (HMR2 [6, 7]), personalised Genome Scale Metabolic Models (GSMMs) representing tumour growth were reconstructed. With data from nearly 8000 patients across 17 major human cancer types, these efforts resulted in 6753 patient-specific models. These models are currently hosted in the new “Patient Derived Genome Scale Metabolic Models (PDGSMM)” section [8] of BioModels, and have the accession numbers MODEL1707110000 to MODEL1707116752.
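As a quick consistency check, the accession range quoted above can be enumerated to confirm that it covers exactly 6753 models. The sketch below only builds the identifier strings; the function name `pdgsmm_accessions` is hypothetical, and nothing here queries BioModels itself.

```python
def pdgsmm_accessions(first=1707110000, last=1707116752):
    """Return the accession identifiers in the inclusive range first..last.

    The two bounds are taken from the accession numbers quoted in the text.
    """
    return [f"MODEL{number}" for number in range(first, last + 1)]


accessions = pdgsmm_accessions()

# The inclusive range should contain one identifier per patient model.
print(len(accessions))
print(accessions[0], accessions[-1])
```

A list of identifiers like this could serve as the input to a batch download script, provided the identifiers are first checked against the PDGSMM section.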

Model Construction

To build these patient-specific GSMMs, the authors used a task-driven model reconstruction algorithm called tINIT [7] that combines the generic human GSMM HMR2 with RNA-seq data from each patient sample. The presence or absence of specific proteins, determined from the expression data, was used to customise the metabolic pathways described in HMR2 for each patient model. Considering cell growth as the main metabolic task for model reconstruction, the distribution of metabolic fluxes in the model was determined. This approach resulted in 6753 personalised models, each representing an individual patient. These models contain between 2070 and 4058 metabolites, 2093 and 5261 reactions, and 978 and 2102 associated genes. Of these, 1419 metabolites, 1020 reactions and 334 genes were found in all the patient derived models.
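The idea of customising a generic model with per-patient expression data, and then asking which components are common to all patient models, can be illustrated with a deliberately simplified sketch. This is not tINIT: the real algorithm solves an optimisation problem and enforces metabolic tasks such as growth, whereas the toy below only applies an expression threshold. The reaction identifiers, expression values and the threshold are all hypothetical.

```python
# Hypothetical toy "generic model": reaction -> genes able to catalyse it.
GENERIC_MODEL = {
    "R_fumarate_hydratase": {"FH"},
    "R_succinate_dehydrogenase": {"SDHA"},
    "R_ATP_citrate_lyase": {"ACLY"},
    "R_transport": set(),  # no gene association: always retained
}


def prune_model(generic_model, expression, threshold=1.0):
    """Keep reactions with no gene association or with at least one
    associated gene expressed at or above the (assumed) threshold."""
    kept = set()
    for reaction, genes in generic_model.items():
        if not genes or any(expression.get(g, 0.0) >= threshold for g in genes):
            kept.add(reaction)
    return kept


# Hypothetical expression values for two patients.
patients = {
    "patient_A": {"FH": 12.0, "SDHA": 5.0, "ACLY": 0.2},
    "patient_B": {"FH": 8.0, "SDHA": 0.1, "ACLY": 3.0},
}

models = {pid: prune_model(GENERIC_MODEL, expr) for pid, expr in patients.items()}

# Reactions present in every patient model (the "common core").
core_reactions = set.intersection(*models.values())
print(sorted(core_reactions))
```

The same set intersection, applied to the metabolites, reactions and genes of all 6753 models, is the kind of operation that yields counts of commonly shared components.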

Figure 1

Figure 1. Workflow for building a personalised Genome Scale Metabolic Model (GSMM). Figure derived from [2].


By comparing gene expression data across all patient samples collected by TCGA, this study identified a list of candidate prognostic genes associated with different clinical outcomes. The impact of a prognostic gene on the clinical outcome (for example, patient survival) can be positive (favourable) or negative (unfavourable): favourable genes are those whose overexpression increases the chances of patient survival, and unfavourable genes are those whose overexpression decreases them. Examining the functional significance of protein-coding prognostic genes showed that genes with a favourable influence are usually involved in positive regulation of immune cell activation and cell-to-cell adhesion (see Figure 3B, from [2]), while unfavourable genes are associated with cell proliferation, cell cycle regulation and nucleic acid metabolism. Examining the clinical outcomes associated with different types of cancer in nearly 8000 tissue samples revealed that prostate and testis cancer had the highest 3-year survival rates (98% and 97% respectively), while high grade glioma and pancreatic cancer had the lowest (8% and 35% respectively).
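One common way to relate a gene's expression to survival is to split patients into high and low expression groups and compare their Kaplan-Meier survival estimates. The sketch below illustrates that general idea only; it is not claimed to be the authors' exact procedure, the median split is an assumption, and the cohort data are invented for illustration.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival_probability)] at each
    time with an observed death. events[i] is True for a death and
    False for a censored observation."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    prob = 1.0
    for time, value in curve:
        if time > t:
            break
        prob = value
    return prob


# Hypothetical cohort: (expression, follow-up in years, death observed).
cohort = [(9.1, 5.0, False), (8.4, 4.2, False), (7.9, 3.5, True),
          (2.0, 1.0, True), (1.5, 2.0, True), (3.1, 2.5, False)]

cutoff = median(expr for expr, _, _ in cohort)
high = [(t, e) for expr, t, e in cohort if expr > cutoff]
low = [(t, e) for expr, t, e in cohort if expr <= cutoff]

high_3yr = survival_at(kaplan_meier(*zip(*high)), 3.0)
low_3yr = survival_at(kaplan_meier(*zip(*low)), 3.0)
print(high_3yr, low_3yr)
```

In this invented cohort the high expression group has the better 3-year survival, so the gene would count as favourable in the sense defined above. A real analysis would also need a significance test, such as a log-rank test, before calling a gene prognostic.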

None of the genes from the hallmarks of cancer [9] (see Figure 5, from [2]) were shared across different types of cancer. A functional analysis of the enriched prognostic genes in different prognostic clusters shows that these enriched genes have functional similarities with the hallmark genes involved in DNA repair, angiogenesis, cell to cell signalling and cell proliferation. Such functional comparisons could enable identification of previously unknown genes involved in the onset and progression of cancer.

Furthermore, the authors used the personalised GSMMs derived from HMR2 to analyse changes in metabolism across patients and cancer types. The study found that genes important for tumour growth showed different expression patterns across patients. For example, lung cancer patients exhibit significant differences in the expression of essential genes participating in the TCA cycle. Expression of genes like FH (fumarate hydratase) is conserved across all liver cancer patients, while SDHA (succinate dehydrogenase complex, subunit A) is found in around 60% of the liver cancer patients and ACLY in fewer than 5% of them.

Of a total of 2553 essential genes identified as important for tumour growth, 55 (2%) were found in patients across all types of cancer. Within patient groups with the same type of cancer, 80% of the patients shared between 10 and 25% of the essential genes. When examining these as potential drug targets, the study found that many of the shared essential genes play an important role in central metabolism and hence cannot be targeted for drug development. Toxicity tests conducted using the personalised GSMMs showed that targeting 75-81% of these essential genes would cause severe side effects because of the important roles they play in healthy tissues. Using these models, the authors identified 32 other genes involved in nucleic acid metabolism as candidate drug targets, as they were non-toxic in healthy cells but essential in 80% of the tumours irrespective of the type of cancer.
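The selection logic described above (essential in a large fraction of tumours, yet not toxic in healthy tissue) can be sketched as a simple filter over per-tumour essential gene sets. This is an illustration under stated assumptions, not the authors' pipeline: the gene names, the tumour identifiers and the toxicity set are placeholders, and only the 80% threshold comes from the text.

```python
from collections import Counter


def candidate_targets(essential_by_tumour, toxic_in_healthy, min_fraction=0.8):
    """Return genes that are essential in at least min_fraction of the
    tumour models and are not flagged as toxic in healthy tissue."""
    n_tumours = len(essential_by_tumour)
    counts = Counter(
        gene for genes in essential_by_tumour.values() for gene in genes
    )
    return sorted(
        gene
        for gene, count in counts.items()
        if count / n_tumours >= min_fraction and gene not in toxic_in_healthy
    )


# Hypothetical essential gene sets predicted from five tumour models.
essential_by_tumour = {
    "tumour_1": {"GENE_A", "GENE_B", "GENE_C"},
    "tumour_2": {"GENE_A", "GENE_B"},
    "tumour_3": {"GENE_A", "GENE_B", "GENE_C"},
    "tumour_4": {"GENE_A", "GENE_B"},
    "tumour_5": {"GENE_A"},
}

# Hypothetical genes whose inhibition is predicted to harm healthy tissue.
toxic_in_healthy = {"GENE_A"}

targets = candidate_targets(essential_by_tumour, toxic_in_healthy)
print(targets)
```

Here `GENE_A` is essential everywhere but excluded as toxic, `GENE_C` is essential in too few tumours, and only `GENE_B` passes both criteria.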

Significance of these Models

Personalised GSMMs of individual patients have allowed an in-depth examination of metabolic variability amongst cancer patients and across different cancer types. These models can therefore be used to perform in-silico toxicity studies and to predict side effects in individual patients during clinical trials. In summary, genome scale models could help to develop better therapies and support more personalised treatments for cancer.

Scientific Value and Impact

The scale of this study, involving data from ~8000 patients representing 17 different cancer types, is unprecedented. By examining differences in gene expression between individual patients and cancer types, this study derived a system-wide understanding of the molecular mechanisms controlling cancer incidence and progression. The open access Human Pathology Atlas (part of the Human Protein Atlas [10]) created by the authors contains >900,000 survival plots (see Figure 2C from [2]) across all major types of human cancer, and shows protein-coding prognostic genes and their associated clinical outcomes. The patient-specific models built in this study are a good starting point for efforts towards developing personalised medicine for treating cancer. By sharing these models on BioModels, the authors have made them available for use by the wider scientific community. The approach used in this study to develop metabolic models for cancer can be extended to other diseases to predict prognostic markers and identify new targets for drug development.

With the development of efficient data analysis pipelines and increasing efforts towards collecting clinical data, research based on systems-wide “big data” approaches like the one described here will have a significant impact on identifying novel prognostic markers and developing effective therapies.


  1. Altrock, P.M., Liu, L.L., and Michor, F. (2015). The mathematics of cancer: integrating quantitative models. Nat Rev Cancer 15, 730–745.

  2. Uhlen, M., Zhang, C., Lee, S., Sjöstedt, E., Fagerberg, L., Bidkhori, G., Benfeitas, R., Arif, M., Liu, Z., Edfors, F., et al. (2017). A pathology atlas of the human cancer transcriptome. Science 357, eaan2507.

  3. Peng, L., Bian, X.W., Li, D.K., Xu, C., Wang, G.M., Xia, Q.Y., and Xiong, Q. (2015). Large-scale RNA-Seq Transcriptome Analysis of 4043 Cancers and 548 Normal Tissue Controls across 12 TCGA Cancer Types. Scientific Reports 5, srep13413.

  4. Genomic Data Commons (

  5. Uhlén, M., Fagerberg, L., Hallström, B.M., Lindskog, C., Oksvold, P., Mardinoglu, A., Sivertsson, Å., Kampf, C., Sjöstedt, E., Asplund, A., et al. (2015). Tissue-based map of the human proteome. Science 347, 1260419.

  6. Mardinoglu, A., Agren, R., Kampf, C., Asplund, A., Uhlen, M., and Nielsen, J. (2014). Genome-scale metabolic modelling of hepatocytes reveals serine deficiency in patients with non-alcoholic fatty liver disease. Nature Communications 5, ncomms4083.

  7. Agren, R., Mardinoglu, A., Asplund, A., Kampf, C., Uhlen, M., and Nielsen, J. (2014). Identification of anticancer drugs for hepatocellular carcinoma through personalized genome-scale metabolic modeling. Mol. Syst. Biol. 10, 721.

  8. Patient Derived Genome Scale Metabolic Models section on BioModels (

  9. Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of Cancer: The Next Generation. Cell 144, 646–674.

  10. Human Protein Atlas (