Comprehensive molecular characterization of mitochondrial genomes in human cancers
Abstract
Mitochondria are essential cellular organelles that play critical roles in cancer. Here, as part of the International Cancer Genome Consortium/The Cancer Genome Atlas Pan-Cancer Analysis of Whole Genomes Consortium, which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumor types, we performed a multidimensional, integrated characterization of mitochondrial genomes and related RNA sequencing data. Our analysis presents the most definitive mutational landscape of mitochondrial genomes and identifies several hypermutated cases. Truncating mutations are markedly enriched in kidney, colorectal and thyroid cancers, suggesting oncogenic effects with the activation of signaling pathways. We find frequent somatic nuclear transfers of mitochondrial DNA, some of which disrupt therapeutic target genes. Mitochondrial copy number varies greatly within and across cancers and correlates with clinical variables. Co-expression analysis highlights the function of mitochondrial genes in oxidative phosphorylation, DNA repair and the cell cycle, and shows their connections with clinically actionable genes. Our study lays a foundation for translating mitochondrial biology into clinical applications.
Article type: Research Article
Affiliations: Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX USA; Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK; Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea; Department of Health Science and Technology, Samsung Advanced Institute for Health Science and Technology, Sungkyunkwan University School of Medicine, Seoul, Korea; Samsung Genome Institute, Samsung Medical Center, Seoul, Korea; Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, TX USA; Division of Biostatistics, The University of Texas Health Science Center at Houston School of Public Health, Houston, TX USA; Department of Medicine and Dan L. Duncan Cancer Center Division of Biostatistics, Baylor College of Medicine, Houston, TX USA;
Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX USA; Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD USA; Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston McGovern Medical School, Houston, TX USA; Department of Biochemistry, Ewha Womans University School of Medicine, Seoul, Korea; Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Division of Hematology/Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea; Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University School of Medicine, Seoul, South Korea; Department of Haematology, University of Cambridge, Cambridge, UK; Applied Tumor Genomics Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK; Memorial Sloan Kettering Cancer Center, New York, NY USA; Genome Science Division, Research Center for Advanced Science and Technology, University of Tokyo, Tokyo, Japan; Department of Surgery, University of Chicago, Chicago, IL USA; Department of Surgery, Division of Hepatobiliary and Pancreatic Surgery, School of Medicine, Keimyung University Dongsan Medical Center, Daegu, South Korea; Department of Oncology, Gil Medical Center, Gachon University, Incheon, South Korea;
Hiroshima University, Hiroshima, Japan; University of Texas MD Anderson Cancer Center, Houston, TX USA; King Faisal Specialist Hospital and Research Centre, Al Maather, Riyadh, Saudi Arabia; Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain; Bioinformatics Core Facility, University Medical Center Hamburg, Hamburg, Germany; Heinrich Pette Institute, Leibniz Institute for Experimental Virology, Hamburg, Germany; Ontario Tumour Bank, Ontario Institute for Cancer Research, Toronto, ON Canada; Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX USA; Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, Bethesda, MD USA; Department of Cellular and Molecular Medicine and Department of Bioengineering, University of California San Diego, La Jolla, CA USA; UC San Diego Moores Cancer Center, San Diego, CA USA; Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC Canada; Sir Peter MacCallum Department of Oncology, Peter MacCallum Cancer Centre, University of Melbourne, Melbourne, VIC Australia; Centre for Research in Molecular Medicine and Chronic Diseases (CiMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain; Department of Zoology, Genetics and Physical Anthropology, (CiMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain; The Biomedical Research Centre (CINBIO), Universidade de Vigo, Vigo, Spain;
Royal National Orthopaedic Hospital – Bolsover, London, UK; Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX USA; The Jackson Laboratory for Genomic Medicine, Farmington, CT USA; Genome Informatics Program, Ontario Institute for Cancer Research, Toronto, ON Canada; Institute of Human Genetics, Christian-Albrechts-University, Kiel, Germany; Institute of Human Genetics, Ulm University and Ulm University Medical Center, Ulm, Germany; Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St. Lucia, Brisbane, QLD Australia; Salford Royal NHS Foundation Trust, Salford, UK; Department of Surgery, Pancreas Institute, University and Hospital Trust of Verona, Verona, Italy; Molecular and Medical Genetics, OHSU Knight Cancer Institute, Oregon Health and Science University, Portland, OR USA; Department of Molecular Oncology, BC Cancer Research Centre, Vancouver, BC Canada; The McDonnell Genome Institute at Washington University, St. Louis, MO USA;
University College London, London, UK; Division of Cancer Genomics, National Cancer Center Research Institute, National Cancer Center, Tokyo, Japan; DLR Project Management Agency, Bonn, Germany; Tokyo Women’s Medical University, Tokyo, Japan; Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY USA; Los Alamos National Laboratory, Los Alamos, NM USA; Department of Pathology, University Health Network, Toronto General Hospital, Toronto, ON Canada; Nottingham University Hospitals NHS Trust, Nottingham, UK; Epigenomics and Cancer Risk Factors, German Cancer Research Center (DKFZ), Heidelberg, Germany; Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON Canada; Vector Institute, Toronto, ON Canada; Hematopathology Section, Institute of Pathology, Christian-Albrechts-University, Kiel, Germany; Department of Pathology and Laboratory Medicine, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Norway; Pathology, Hospital Clinic, Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), University of Barcelona, Barcelona, Spain; Department of Veterinary Medicine, Transmissible Cancer Group, University of Cambridge, Cambridge, UK; Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO USA;
Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Glasgow, UK; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; Broad Institute of MIT and Harvard, Cambridge, MA USA; Dana-Farber/Boston Children’s Cancer and Blood Disorders Center, Boston, MA USA; Department of Pediatrics, Harvard Medical School, Boston, MA USA; Leeds Institute of Medical Research @ St. James’s, University of Leeds, St. James’s University Hospital, Leeds, UK; Department of Pathology and Diagnostics, University and Hospital Trust of Verona, Verona, Italy; Department of Surgery, Princess Alexandra Hospital, Brisbane, QLD Australia; Surgical Oncology Group, Diamantina Institute, University of Queensland, Brisbane, QLD Australia; Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH USA; Research Health Analytics and Informatics, University Hospitals Cleveland Medical Center, Cleveland, OH USA; Gloucester Royal Hospital, Gloucester, UK; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK; Diagnostic Development, Ontario Institute for Cancer Research, Toronto, ON Canada; Barcelona Supercomputing Center (BSC), Barcelona, Spain; Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, AB Canada; Departments of Surgery and Oncology, University of Calgary, Calgary, AB Canada;
Department of Pathology, Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Norway; PanCuRx Translational Research Initiative, Ontario Institute for Cancer Research, Toronto, ON Canada; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University School of Medicine, Baltimore, MD USA; University Hospital Southampton NHS Foundation Trust, Southampton, UK; Royal Stoke University Hospital, Stoke-on-Trent, UK; Genome Sequence Informatics, Ontario Institute for Cancer Research, Toronto, ON Canada; Human Longevity Inc, San Diego, CA USA; Olivia Newton-John Cancer Research Institute, La Trobe University, Heidelberg, VIC Australia; Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Genome Canada, Ottawa, ON Canada; CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Buck Institute for Research on Aging, Novato, CA USA; Duke University Medical Center, Durham, NC USA; Department of Human Genetics, Hannover Medical School, Hannover, Germany; Center for Bioinformatics and Functional Genomics, Cedars-Sinai Medical Center, Los Angeles, CA USA; Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA USA; The Hebrew University Faculty of Medicine, Jerusalem, Israel;
Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK; Department of Computer Science, Bioinformatics Group, University of Leipzig, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany; Transcriptome Bioinformatics, LIFE Research Center for Civilization Diseases, University of Leipzig, Leipzig, Germany; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA USA; Harvard Medical School, Boston, MA USA; USC Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA USA; Department of Diagnostics and Public Health, University and Hospital Trust of Verona, Verona, Italy; Department of Mathematics, Aarhus University, Aarhus, Denmark; Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus N, Denmark; Instituto Carlos Slim de la Salud, Mexico City, Mexico; Department of Medical Biophysics, University of Toronto, Toronto, ON Canada; Cancer Division, Garvan Institute of Medical Research, Kinghorn Cancer Centre, University of New South Wales (UNSW Sydney), Sydney, NSW Australia; South Western Sydney Clinical School, Faculty of Medicine, University of New South Wales (UNSW Sydney), Liverpool, NSW Australia; West of Scotland Pancreatic Unit, Glasgow Royal Infirmary, Glasgow, UK; Center for Digital Health, Berlin Institute of Health and Charité – Universitätsmedizin Berlin, Berlin, Germany;
Heidelberg Center for Personalized Oncology (DKFZ-HIPO), German Cancer Research Center (DKFZ), Heidelberg, Germany; The Preston Robert Tisch Brain Tumor Center, Duke University Medical Center, Durham, NC USA; Massachusetts General Hospital, Boston, MA USA; National Institute of Biomedical Genomics, Kalyani, West Bengal India; Institute of Clinical Medicine and Institute of Oral Biology, University of Oslo, Oslo, Norway; University of North Carolina at Chapel Hill, Chapel Hill, NC USA; ARC-Net Centre for Applied Research on Cancer, University and Hospital Trust of Verona, Verona, Italy; The Institute of Cancer Research, London, UK; Centre for Computational Biology, Duke-NUS Medical School, Singapore, Singapore; Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, Singapore; Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, Lund, Sweden; Department of Pediatric Oncology, Hematology and Clinical Immunology, Heinrich-Heine-University, Düsseldorf, Germany; Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Department of Internal Medicine/Hematology, Friedrich-Ebert-Hospital, Neumünster, Germany; Departments of Dermatology and Pathology, Yale University, New Haven, CT USA; Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain;
Radcliffe Department of Medicine, University of Oxford, Oxford, UK; Canadian Center for Computational Genomics, McGill University, Montreal, QC Canada; Department of Human Genetics, McGill University, Montreal, QC Canada; Department of Human Genetics, University of California Los Angeles, Los Angeles, CA USA; Department of Pharmacology, University of Toronto, Toronto, ON Canada; Faculty of Medicine and Health Technology, Tampere University and Tays Cancer Center, Tampere University Hospital, Tampere, Finland; Haematology, Leeds Teaching Hospitals NHS Trust, Leeds, UK; Translational Research and Innovation, Centre Léon Bérard, Lyon, France; Fox Chase Cancer Center, Philadelphia, PA USA; International Agency for Research on Cancer, World Health Organization, Lyon, France; Earlham Institute, Norwich, UK; Norwich Medical School, University of East Anglia, Norwich, UK; Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Radboud University, Nijmegen, HB The Netherlands; CRUK Manchester Institute and Centre, Manchester, UK; Department of Radiation Oncology, University of Toronto, Toronto, ON Canada; Division of Cancer Sciences, Manchester Cancer Research Centre, University of Manchester, Manchester, UK; Radiation Medicine Program, Princess Margaret Cancer Centre, Toronto, ON Canada; Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA USA;
Department of Surgery, Division of Thoracic Surgery, The Johns Hopkins University School of Medicine, Baltimore, MD USA; Division of Molecular Pathology, The Netherlands Cancer Institute, Oncode Institute, Amsterdam, CX The Netherlands; Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA USA; UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA USA; Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany; German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Heidelberg, Germany; Center for Biological Sequence Analysis, Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark; Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark; Institute for Molecular Bioscience, University of Queensland, St. Lucia, Brisbane, QLD Australia;
Biomedical Engineering, Oregon Health and Science University, Portland, OR USA; Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany; Institute of Pharmacy and Molecular Biotechnology and BioQuant, Heidelberg University, Heidelberg, Germany; Federal Ministry of Education and Research, Berlin, Germany; Melanoma Institute Australia, University of Sydney, Sydney, NSW Australia; Pediatric Hematology and Oncology, University Hospital Muenster, Muenster, Germany; Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD USA; McKusick-Nathans Institute of Genetic Medicine, Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University School of Medicine, Baltimore, MD USA; Foundation Medicine, Inc, Cambridge, MA USA; Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA USA; Bakar Computational Health Sciences Institute and Department of Pediatrics, University of California, San Francisco, CA USA; Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway; National Cancer Institute, National Institutes of Health, Bethesda, MD USA; Royal Marsden NHS Foundation Trust, London and Sutton, UK; Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany; Department of Oncology, University of Cambridge, Cambridge, UK;
Li Ka Shing Centre, Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK; Institut Gustave Roussy, Villejuif, France; Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK; Anatomia Patológica, Hospital Clinic, Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), University of Barcelona, Barcelona, Spain; Spanish Ministry of Science and Innovation, Madrid, Spain; University of Michigan Comprehensive Cancer Center, Ann Arbor, MI USA; Department for BioMedical Research, University of Bern, Bern, Switzerland; Department of Medical Oncology, Inselspital, University Hospital and University of Bern, Bern, Switzerland; Graduate School for Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland; University of Pavia, Pavia, Italy; University of Alabama at Birmingham, Birmingham, AL USA; UHN Program in BioSpecimen Sciences, Toronto General Hospital, Toronto, ON Canada; Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY USA; Centre for Law and Genetics, University of Tasmania, Sandy Bay Campus, Hobart, TAS Australia; Faculty of Biosciences, Heidelberg University, Heidelberg, Germany; Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON Canada; Division of Anatomic Pathology, Mayo Clinic, Rochester, MN USA; Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD USA;
Illawarra Shoalhaven Local Health District L3 Illawarra Cancer Care Centre, Wollongong Hospital, Wollongong, NSW Australia; BioForA, French National Institute for Agriculture, Food, and Environment (INRAE), ONF, Orléans, France; Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD USA; University of California San Diego, San Diego, CA USA; Division of Experimental Pathology, Mayo Clinic, Rochester, MN USA; Centre for Cancer Research, The Westmead Institute for Medical Research, University of Sydney, Sydney, NSW Australia; Department of Gynaecological Oncology, Westmead Hospital, Sydney, NSW Australia; PDXen Biosystems Inc, Seoul, South Korea; Korea Advanced Institute of Science and Technology, Daejeon, South Korea; Electronics and Telecommunications Research Institute, Daejeon, South Korea; Institut National du Cancer (INCA), Boulogne-Billancourt, France; Department of Genetics, Informatics Institute, University of Alabama at Birmingham, Birmingham, AL USA; Division of Medical Oncology, National Cancer Centre, Singapore, Singapore; Medical Oncology, University and Hospital Trust of Verona, Verona, Italy; Department of Pediatrics, University Hospital Schleswig-Holstein, Kiel, Germany; Hepatobiliary/Pancreatic Surgical Oncology Program, University Health Network, Toronto, ON Canada; School of Biological Sciences, University of Auckland, Auckland, New Zealand; Department of Surgery, University of Melbourne, Parkville, VIC Australia;
The Murdoch Children’s Research Institute, Royal Children’s Hospital, Parkville, VIC Australia; Walter and Eliza Hall Institute, Parkville, VIC Australia; Vancouver Prostate Centre, Vancouver, Canada; Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON Canada; University of East Anglia, Norwich, UK; Norfolk and Norwich University Hospital NHS Trust, Norwich, UK; Victorian Institute of Forensic Medicine, Southbank, VIC Australia; Department of Biomedical Informatics, Harvard Medical School, Boston, MA USA; Department of Chemistry, Centre for Molecular Science Informatics, University of Cambridge, Cambridge, UK; Ludwig Center at Harvard Medical School, Boston, MA USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX USA; Peter MacCallum Cancer Centre, University of Melbourne, Melbourne, VIC Australia; Physics Division, Optimization and Systems Biology Lab, Massachusetts General Hospital, Boston, MA USA; Department of Medicine, Baylor College of Medicine, Houston, TX USA; University of Cologne, Cologne, Germany; International Genomics Consortium, Phoenix, AZ USA; Genomics Research Program, Ontario Institute for Cancer Research, Toronto, ON Canada; Barking Havering and Redbridge University Hospitals NHS Trust, Romford, UK; Children’s Hospital at Westmead, University of Sydney, Sydney, NSW Australia;
Department of Medicine, Section of Endocrinology, University and Hospital Trust of Verona, Verona, Italy; Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, NY USA; Department of Biology, ETH Zurich, Zürich, Switzerland; Department of Computer Science, ETH Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Weill Cornell Medical College, New York, NY USA; Academic Department of Medical Genetics, University of Cambridge, Addenbrooke’s Hospital, Cambridge, UK; MRC Cancer Unit, University of Cambridge, Cambridge, UK; Departments of Pediatrics and Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; Seven Bridges Genomics, Charlestown, MA USA; Annai Systems, Inc, Carlsbad, CA USA; Department of Pathology, General Hospital of Treviso, Department of Medicine, University of Padua, Treviso, Italy; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, CH Switzerland; Swiss Institute of Bioinformatics, University of Geneva, Geneva, CH Switzerland; The Francis Crick Institute, London, UK; University of Leuven, Leuven, Belgium; Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany; Computational and Systems Biology, Genome Institute of Singapore, Singapore, Singapore;
School of Computing, National University of Singapore, Singapore, Singapore; Big Data Institute, Li Ka Shing Centre, University of Oxford, Oxford, UK; Biomedical Data Science Laboratory, Francis Crick Institute, London, UK; Bioinformatics Group, Department of Computer Science, University College London, London, UK; The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON Canada; Breast Cancer Translational Research Laboratory JC Heuson, Institut Jules Bordet, Brussels, Belgium; Department of Oncology, Laboratory for Translational Breast Cancer Research, KU Leuven, Leuven, Belgium; Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain; Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain; Division of Medical Oncology, Princess Margaret Cancer Centre, Toronto, ON Canada; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY USA; Department of Pathology, UPMC Shadyside, Pittsburgh, PA USA; Independent Consultant, Wellesley, USA; Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden; Department of Medicine and Department of Genetics, Washington University School of Medicine, St. Louis, St. Louis, MO USA;
Hefei University of Technology, Anhui, China; Translational Cancer Research Unit, GZA Hospitals St.-Augustinus, Center for Oncological Research, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Simon Fraser University, Burnaby, BC Canada; University of Pennsylvania, Philadelphia, PA USA; Faculty of Science and Technology, University of Vic – Central University of Catalonia (UVic-UCC), Vic, Spain; The Wellcome Trust, London, UK; The Hospital for Sick Children, Toronto, ON Canada; Department of Pathology, Queen Elizabeth University Hospital, Glasgow, UK; Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, QLD Australia; Department of Oncology, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, UK; Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, UK; Prostate Cancer Canada, Toronto, ON Canada; University of Cambridge, Cambridge, UK; Department of Laboratory Medicine, Translational Cancer Research, Lund University Cancer Center at Medicon Village, Lund University, Lund, Sweden; Heidelberg University, Heidelberg, Germany; New BIH Digital Health Center, Berlin Institute of Health (BIH) and Charité – Universitätsmedizin Berlin, Berlin, Germany; CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain; Research Group on Statistics, Econometrics and Health (GRECS), UdG, Barcelona, Spain;
Quantitative Genomics Laboratories (qGenomics), Barcelona, Spain; grid.507118.a0000 0001 0329 4954Icelandic Cancer Registry, Icelandic Cancer Society, Reykjavik, Iceland; grid.233520.50000 0004 1761 4404State Key Laboratory of Cancer Biology, and Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Shaanxi, China; grid.5608.b0000 0004 1757 3470Department of Medicine (DIMED), Surgical Pathology Unit, University of Padua, Padua, Italy; grid.475435.4Rigshospitalet, Copenhagen, Denmark; grid.94365.3d0000 0001 2297 5165Center for Cancer Genomics, National Cancer Institute, National Institutes of Health, Bethesda, MD USA; grid.14848.310000 0001 2292 3357Department of Biochemistry and Molecular Medicine, University of Montreal, Montreal, QC Canada; grid.1011.10000 0004 0474 1797Australian Institute of Tropical Health and Medicine, James Cook University, Douglas, QLD Australia; Department of Neuro-Oncology, Istituto Neurologico Besta, Milano, Italy; grid.484025.fBioplatforms Australia, North Ryde, NSW Australia; grid.83440.3b0000000121901201Department of Pathology (Research), University College London Cancer Institute, London, UK; grid.415224.40000 0001 2150 066XDepartment of Surgical Oncology, Princess Margaret Cancer Centre, Toronto, ON Canada; grid.5645.2000000040459992XDepartment of Medical Oncology, Josephine Nefkens Institute and Cancer Genomics Centre, Erasmus Medical Center, Rotterdam, CN The Netherlands; grid.415184.d0000 0004 0614 0266The University of Queensland Thoracic Research Centre, The Prince Charles Hospital, Brisbane, QLD Australia; grid.5808.50000 0001 1503 7226CIBIO/InBIO – Research Center in Biodiversity and Genetic Resources, Universidade do Porto, Vairão, Portugal; grid.420746.30000 0001 1887 2462HCA Laboratories, London, UK; grid.10025.360000 0004 1936 8470University of Liverpool, Liverpool, UK; grid.22098.310000 0004 1937 0503The Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel; grid.15276.370000 0004 1936 
8091Department of Neurosurgery, University of Florida, Gainesville, FL USA; grid.26999.3d0000 0001 2151 536XDepartment of Pathology, Graduate School of Medicine, University of Tokyo, Tokyo, Japan; grid.7563.70000 0001 2174 1754University of Milano Bicocca, Monza, Italy; grid.21155.320000 0001 2034 1839BGI-Shenzhen, Shenzhen, China; grid.55325.340000 0004 0389 8485Department of Pathology, Oslo University Hospital Ulleval, Oslo, Norway; grid.38142.3c000000041936754XCenter for Biomedical Informatics, Harvard Medical School, Boston, MA USA; grid.5841.80000 0004 1937 0247Department of Biochemistry and Molecular Biomedicine, University of Barcelona, Barcelona, Spain; grid.94365.3d0000 0001 2297 5165Office of Cancer Genomics, National Cancer Institute, National Institutes of Health, Bethesda, MD USA; grid.7497.d0000 0004 0492 0584Cancer Epigenomics, German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.240145.60000 0001 2291 4776Department of Cancer Biology, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.240145.60000 0001 2291 4776Department of Surgical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.47100.320000000419368710Department of Computer Science, Yale University, New Haven, CT USA; grid.47100.320000000419368710Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT USA; grid.47100.320000000419368710Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT USA; grid.32224.350000 0004 0386 9924Center for Cancer Research, Massachusetts General Hospital, Boston, MA USA; grid.32224.350000 0004 0386 9924Department of Pathology, Massachusetts General Hospital, Boston, MA USA; grid.51462.340000 0001 2171 9952Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY USA; grid.66875.3a0000 0004 0459 167XDivision of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN USA; grid.1013.30000 0004 1936 834XUniversity of Sydney, 
Sydney, NSW Australia; grid.4991.50000 0004 1936 8948University of Oxford, Oxford, UK; grid.5335.00000000121885934Department of Surgery, Academic Urology Group, University of Cambridge, Cambridge, UK; grid.8379.50000 0001 1958 8658Department of Medicine II, University of Würzburg, Wuerzburg, Germany; grid.26790.3a0000 0004 1936 8606Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL USA; grid.20522.370000 0004 1767 9005Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Barcelona, Spain; grid.280664.e0000 0001 2110 5790Genome Integrity and Structural Biology Laboratory, National Institute of Environmental Health Sciences (NIEHS), Durham, NC USA; grid.425213.3St. Thomas’s Hospital, London, UK; Osaka International Cancer Center, Osaka, Japan; grid.4514.40000 0001 0930 2361Department of Pathology, Skåne University Hospital, Lund University, Lund, Sweden; grid.422301.60000 0004 0606 0717Department of Medical Oncology, Beatson West of Scotland Cancer Centre, Glasgow, UK; grid.94365.3d0000 0001 2297 5165National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA; grid.1008.90000 0001 2179 088XCentre for Cancer Research, Victorian Comprehensive Cancer Centre, University of Melbourne, Melbourne, VIC Australia; grid.170205.10000 0004 1936 7822Department of Medicine, Section of Hematology/Oncology, University of Chicago, Chicago, IL USA; grid.452463.2German Center for Infection Research (DZIF), Partner Site Hamburg-Borstel-Lübeck-Riems, Hamburg, Germany; grid.7048.b0000 0001 1956 2722Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark; grid.410865.eDepartment of Biotechnology, Ministry of Science and Technology, Government of India, New Delhi, Delhi India; grid.410724.40000 0004 0620 9745National Cancer Centre Singapore, Singapore, Singapore; grid.253264.40000 0004 1936 9473Brandeis University, Waltham, MA USA; grid.17091.3e0000 0001 2288 9830Department of Urologic Sciences, University of British 
Columbia, Vancouver, BC Canada; grid.168010.e0000000419368956Department of Internal Medicine, Stanford University, Stanford, CA USA; grid.267308.80000 0000 9206 2401The University of Texas Health Science Center at Houston, Houston, TX USA; grid.7445.20000 0001 2113 8111Imperial College NHS Trust, Imperial College, London, INY UK; grid.7839.50000 0004 1936 9721Senckenberg Institute of Pathology, University of Frankfurt Medical School, Frankfurt, Germany; grid.266100.30000 0001 2107 4242Department of Medicine, Division of Biomedical Informatics, UC San Diego School of Medicine, San Diego, CA USA; grid.468222.8Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX USA; Oxford Nanopore Technologies, New York, NY USA; grid.26999.3d0000 0001 2151 536XInstitute of Medical Science, University of Tokyo, Tokyo, Japan; grid.205975.c0000 0001 0740 6917Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA USA; grid.412857.d0000 0004 1763 1087Wakayama Medical University, Wakayama, Japan; grid.10698.360000000122483208Department of Internal Medicine, Division of Medical Oncology, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; grid.267301.10000 0004 0386 9246University of Tennessee Health Science Center for Cancer Research, Memphis, TN USA; grid.412346.60000 0001 0237 2025Department of Histopathology, Salford Royal NHS Foundation Trust, Salford, UK; grid.5379.80000000121662407Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK; grid.11135.370000 0001 2256 9319BIOPIC, ICG and College of Life Sciences, Peking University, Beijing, China; grid.11135.370000 0001 2256 9319Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China; grid.239552.a0000 0001 0680 8770Children’s Hospital of Philadelphia, Philadelphia, PA USA; grid.240145.60000 0001 2291 4776Department of Bioinformatics and 
Computational Biology and Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.4714.60000 0004 1937 0626Karolinska Institute, Stockholm, Sweden; grid.17063.330000 0001 2157 2938The Donnelly Centre, University of Toronto, Toronto, ON Canada; grid.256753.00000 0004 0470 5964Department of Medical Genetics, College of Medicine, Hallym University, Chuncheon, South Korea; grid.5612.00000 0001 2172 2676Department of Experimental and Health Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, Barcelona, Spain; grid.411941.80000 0000 9194 7179Health Data Science Unit, University Clinics, Heidelberg, Germany; grid.32224.350000 0004 0386 9924Massachusetts General Hospital Center for Cancer Research, Charlestown, MA USA; grid.39158.360000 0001 2173 7691Hokkaido University, Sapporo, Japan; grid.272242.30000 0001 2168 5385Department of Pathology and Clinical Laboratory, National Cancer Center Hospital, Tokyo, Japan; grid.10698.360000000122483208Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; grid.418245.e0000 0000 9999 5706Computational Biology, Leibniz Institute on Aging – Fritz Lipmann Institute (FLI), Jena, Germany; grid.1008.90000 0001 2179 088XUniversity of Melbourne Centre for Cancer Research, Melbourne, VIC Australia; grid.266813.80000 0001 0666 4105University of Nebraska Medical Center, Omaha, NE USA; Syntekabio Inc, Daejeon, South Korea; grid.5650.60000000404654431Department of Pathology, Academic Medical Center, Amsterdam, AZ The Netherlands; grid.507779.b0000 0004 4910 5858China National GeneBank-Shenzhen, Shenzhen, China; grid.7497.d0000 0004 0492 0584Division of Molecular Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.24515.370000 0004 1937 1450Division of Life Science and Applied Genomics Center, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China; grid.59734.3c0000 0001 0670 2351Icahn 
School of Medicine at Mount Sinai, New York, NY USA; Geneplus-Shenzhen, Shenzhen, China; grid.43169.390000 0001 0599 1243School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; grid.431072.30000 0004 0572 4227AbbVie, North Chicago, IL USA; grid.6363.00000 0001 2218 4662Institute of Pathology, Charité – University Medicine Berlin, Berlin, Germany; grid.248762.d0000 0001 0702 3000Centre for Translational and Applied Genomics, British Columbia Cancer Agency, Vancouver, BC Canada; grid.418716.d0000 0001 0709 1919Edinburgh Royal Infirmary, Edinburgh, UK; grid.419491.00000 0001 1014 0849Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany; grid.5253.10000 0001 0328 4908Department of Pediatric Immunology, Hematology and Oncology, University Hospital, Heidelberg, Germany; grid.7497.d0000 0004 0492 0584German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.482664.aHeidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM), Heidelberg, Germany; grid.5386.8000000041936877XInstitute for Computational Biomedicine, Weill Cornell Medical College, New York, NY USA; grid.429884.b0000 0004 1791 0895New York Genome Center, New York, NY USA; grid.21107.350000 0001 2171 9311Department of Urology, James Buchanan Brady Urological Institute, Johns Hopkins University School of Medicine, Baltimore, MD USA; grid.26999.3d0000 0001 2151 536XDepartment of Preventive Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; grid.39382.330000 0001 2160 926XDepartment of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX USA; grid.39382.330000 0001 2160 926XDepartment of Pathology and Immunology, Baylor College of Medicine, Houston, TX USA; grid.413890.70000 0004 0420 5521Michael E. 
DeBakey Veterans Affairs Medical Center, Houston, TX USA; grid.5170.30000 0001 2181 8870Technical University of Denmark, Lyngby, Denmark; grid.49606.3d0000 0001 1364 9317Department of Pathology, College of Medicine, Hanyang University, Seoul, South Korea; grid.411714.60000 0000 9825 7840Academic Unit of Surgery, School of Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow Royal Infirmary, Glasgow, UK; grid.267370.70000 0004 0533 4667Department of Pathology, Asan Medical Center, College of Medicine, Ulsan University, Songpa-gu, Seoul South Korea; Science Writer, Garrett Park, MD USA; grid.419890.d0000 0004 0626 690XInternational Cancer Genome Consortium (ICGC)/ICGC Accelerating Research in Genomic Oncology (ARGO) Secretariat, Ontario Institute for Cancer Research, Toronto, ON Canada; grid.8954.00000 0001 0721 6013University of Ljubljana, Ljubljana, Slovenia; grid.170205.10000 0004 1936 7822Department of Public Health Sciences, University of Chicago, Chicago, IL USA; grid.240372.00000 0004 0400 4439Research Institute, NorthShore University HealthSystem, Evanston, IL USA; grid.5734.50000 0001 0726 5157Department for Biomedical Research, University of Bern, Bern, Switzerland; grid.411640.6Centre of Genomics and Policy, McGill University and Génome Québec Innovation Centre, Montreal, QC Canada; grid.10698.360000000122483208Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; grid.510964.fHopp Children’s Cancer Center (KiTZ), Heidelberg, Germany; grid.7497.d0000 0004 0492 0584Pediatric Glioma Research Group, German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.11485.390000 0004 0422 0975Cancer Research UK, London, UK; Indivumed GmbH, Hamburg, Germany; Genome Integration Data Center, Syntekabio, Inc, Daejeon, South Korea; grid.412004.30000 0004 0478 9977University Hospital Zurich, Zurich, Switzerland; grid.419765.80000 0001 2223 3006Clinical Bioinformatics, Swiss Institute 
of Bioinformatics, Geneva, Switzerland; grid.412004.30000 0004 0478 9977Institute for Pathology and Molecular Pathology, University Hospital Zurich, Zurich, Switzerland; grid.7400.30000 0004 1937 0650Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; grid.4305.20000 0004 1936 7988MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Edinburgh, UK; grid.50956.3f0000 0001 2152 9905Women’s Cancer Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA USA; grid.4808.40000 0001 0657 4636Department of Biology, Bioinformatics Group, Division of Molecular Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia; grid.412468.d0000 0004 0646 2097Department for Internal Medicine II, University Hospital Schleswig-Holstein, Kiel, Germany; grid.414733.60000 0001 2294 430XGenetics and Molecular Pathology, SA Pathology, Adelaide, SA Australia; grid.272242.30000 0001 2168 5385Department of Gastric Surgery, National Cancer Center Hospital, Tokyo, Japan; grid.272242.30000 0001 2168 5385Department of Bioinformatics, Division of Cancer Genomics, National Cancer Center Research Institute, Tokyo, Japan; grid.435025.50000 0004 0619 6198A.A. 
Kharkevich Institute of Information Transmission Problems, Moscow, Russia; grid.465331.6Oncology and Immunology, Dmitry Rogachev National Research Center of Pediatric Hematology, Moscow, Russia; grid.454320.40000 0004 0555 3608Skolkovo Institute of Science and Technology, Moscow, Russia; grid.253615.60000 0004 1936 9510Department of Surgery, The George Washington University, School of Medicine and Health Science, Washington, DC USA; grid.48336.3a0000 0004 1936 8075Endocrine Oncology Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD USA; grid.1004.50000 0001 2158 5405Melanoma Institute Australia, Macquarie University, Sydney, NSW Australia; grid.116068.80000 0001 2341 2786MIT Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA USA; grid.413249.90000 0004 0385 0051Tissue Pathology and Diagnostic Oncology, Royal Prince Alfred Hospital, Sydney, NSW Australia; grid.9786.00000 0004 0470 0856Cholangiocarcinoma Screening and Care Program and Liver Fluke and Cholangiocarcinoma Research Centre, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand; Controlled Department and Institution, New York, NY USA; grid.5386.8000000041936877XEnglander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY USA; grid.410914.90000 0004 0628 9810National Cancer Center, Gyeonggi, South Korea; grid.255649.90000 0001 2171 7754Department of Biochemistry, College of Medicine, Ewha Womans University, Seoul, South Korea; grid.266100.30000 0001 2107 4242Health Sciences Department of Biomedical Informatics, University of California San Diego, La Jolla, CA USA; grid.410914.90000 0004 0628 9810Research Core Center, National Cancer Centre Korea, Goyang-si, South Korea; grid.264381.a0000 0001 2181 989XDepartment of Health Sciences and Technology, Sungkyunkwan University School of Medicine, Seoul, South Korea; Samsung Genome Institute, Seoul, South Korea; 
grid.417747.60000 0004 0460 3896Breast Oncology Program, Dana-Farber/Brigham and Women’s Cancer Center, Boston, MA USA; grid.51462.340000 0001 2171 9952Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY USA; grid.62560.370000 0004 0378 8294Division of Breast Surgery, Brigham and Women’s Hospital, Boston, MA USA; grid.280664.e0000 0001 2110 5790Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences (NIEHS), Durham, NC USA; grid.7914.b0000 0004 1936 7443Department of Clinical Science, University of Bergen, Bergen, Norway; grid.412484.f0000 0001 0302 820XCenter For Medical Innovation, Seoul National University Hospital, Seoul, South Korea; grid.412484.f0000 0001 0302 820XDepartment of Internal Medicine, Seoul National University Hospital, Seoul, South Korea; grid.413454.30000 0001 1958 0162Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland; grid.7497.d0000 0004 0492 0584Functional and Structural Genomics, German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.94365.3d0000 0001 2297 5165Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD USA; grid.9647.c0000 0004 7669 9786Institute for Medical Informatics Statistics and Epidemiology, University of Leipzig, Leipzig, Germany; grid.240145.60000 0001 2291 4776Morgan Welch Inflammatory Breast Cancer Research Program and Clinic, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.7450.60000 0001 2364 4210Department of Hematology and Oncology, Georg-Augusts-University of Göttingen, Göttingen, Germany; grid.5718.b0000 0001 2187 5445Institute of Cell Biology (Cancer Research), University of Duisburg-Essen, Essen, Germany; grid.420545.20000 0004 0489 3985King’s College London and Guy’s and St. 
Thomas’ NHS Foundation Trust, London, UK; grid.251017.00000 0004 0406 2057Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI USA; grid.416100.20000 0001 0688 4634The University of Queensland Centre for Clinical Research, Royal Brisbane and Women’s Hospital, Herston, QLD Australia; grid.6190.e0000 0000 8580 3777Department of Pediatric Oncology and Hematology, University of Cologne, Cologne, Germany; grid.411327.20000 0001 2176 9917University of Düsseldorf, Düsseldorf, Germany; grid.418119.40000 0001 0684 291XDepartment of Pathology, Institut Jules Bordet, Brussels, Belgium; grid.8761.80000 0000 9919 9582Institute of Biomedicine, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden; grid.414235.50000 0004 0619 2154Children’s Medical Research Institute, Sydney, NSW Australia; ILSbio, LLC Biobank, Chestertown, MD USA; grid.2515.30000 0004 0378 8438Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, MA USA; grid.49606.3d0000 0001 1364 9317Institute for Bioengineering and Biopharmaceutical Research (IBBR), Hanyang University, Seoul, South Korea; grid.205975.c0000 0001 0740 6917Department of Statistics, University of California Santa Cruz, Santa Cruz, CA USA; grid.482251.80000 0004 0633 7958National Genotyping Center, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan; grid.419538.20000 0000 9071 0620Department of Vertebrate Genomics/Otto Warburg Laboratory Gene Regulation and Systems Biology of Cancer, Max Planck Institute for Molecular Genetics, Berlin, Germany; grid.411640.6McGill University and Genome Quebec Innovation Centre, Montreal, QC Canada; grid.431797.fbiobyte solutions GmbH, Heidelberg, Germany; grid.137628.90000 0004 1936 8753Gynecologic Oncology, NYU Laura and Isaac Perlmutter Cancer Center, New York University, New York, NY USA; grid.4367.60000 0001 2355 7002Division of Oncology, Stem Cell Biology Section, Washington University School of Medicine, St. 
Louis, MO USA; grid.38142.3c000000041936754XHarvard University, Cambridge, MA USA; grid.48336.3a0000 0004 1936 8075Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD USA; grid.5510.10000 0004 1936 8921University of Oslo, Oslo, Norway; grid.17063.330000 0001 2157 2938University of Toronto, Toronto, ON Canada; grid.11135.370000 0001 2256 9319Peking University, Beijing, China; grid.11135.370000 0001 2256 9319School of Life Sciences, Peking University, Beijing, China; grid.419407.f0000 0004 4665 8158Leidos Biomedical Research, Inc, McLean, VA USA; grid.5841.80000 0004 1937 0247Hematology, Hospital Clinic, Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), University of Barcelona, Barcelona, Spain; grid.73113.370000 0004 0369 1660Second Military Medical University, Shanghai, China; Chinese Cancer Genome Consortium, Shenzhen, China; grid.414350.70000 0004 0447 1045Department of Medical Oncology, Beijing Hospital, Beijing, China; grid.412474.00000 0001 0027 0586Laboratory of Molecular Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing, China; grid.11914.3c0000 0001 0721 1626School of Medicine/School of Mathematics and Statistics, University of St. 
Andrews, St. Andrews, Fife UK; grid.64212.330000 0004 0463 2320Institute for Systems Biology, Seattle, WA USA; Department of Biochemistry and Molecular Biology, Faculty of Medicine, University Institute of Oncology-IUOPA, Oviedo, Spain; grid.476460.70000 0004 0639 0505Institut Bergonié, Bordeaux, France; grid.5335.00000000121885934Cancer Unit, MRC University of Cambridge, Cambridge, UK; grid.239546.f0000 0001 2153 6013Department of Pathology and Laboratory Medicine, Center for Personalized Medicine, Children’s Hospital Los Angeles, Los Angeles, CA USA; grid.1001.00000 0001 2180 7477John Curtin School of Medical Research, Canberra, ACT Australia; MVZ Department of Oncology, PraxisClinic am Johannisplatz, Leipzig, Germany; grid.5342.00000 0001 2069 7798Department of Information Technology, Ghent University, Ghent, Belgium; grid.5342.00000 0001 2069 7798Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium; grid.240344.50000 0004 0392 3476Institute for Genomic Medicine, Nationwide Children’s Hospital, Columbus, OH USA; grid.5288.70000 0000 9758 5690Computational Biology Program, School of Medicine, Oregon Health and Science University, Portland, OR USA; grid.26009.3d0000 0004 1936 7961Department of Surgery, Duke University, Durham, NC USA; grid.425902.80000 0000 9601 989XInstitució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain; grid.7080.f0000 0001 2296 0625Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain; grid.8756.c0000 0001 2193 314XUniversity of Glasgow, Glasgow, UK; grid.10403.360000000091771775Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; grid.4367.60000 0001 2355 7002Division of Oncology, Washington University School of Medicine, St. 
Louis, MO USA; grid.7445.20000 0001 2113 8111Department of Surgery and Cancer, Imperial College, London, INY UK; grid.437060.60000 0004 0567 5138Applications Department, Oxford Nanopore Technologies, Oxford, UK; grid.266102.10000 0001 2297 6811Department of Obstetrics, Gynecology and Reproductive Services, University of California San Francisco, San Francisco, CA USA; grid.27860.3b0000 0004 1936 9684Department of Biochemistry and Molecular Medicine, University of California at Davis, Sacramento, CA USA; grid.415224.40000 0001 2150 066XSTTARR Innovation Facility, Princess Margaret Cancer Centre, Toronto, ON Canada; grid.1029.a0000 0000 9939 5719Discipline of Surgery, Western Sydney University, Penrith, NSW Australia; grid.47100.320000000419368710Yale School of Medicine, Yale University, New Haven, CT USA; grid.10698.360000000122483208Department of Genetics, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; grid.413103.40000 0001 2160 8953Departments of Neurology and Neurosurgery, Henry Ford Hospital, Detroit, MI USA; grid.5288.70000 0000 9758 5690Precision Oncology, OHSU Knight Cancer Institute, Oregon Health and Science University, Portland, OR USA; grid.13648.380000 0001 2180 3484Institute of Pathology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany; grid.177174.30000 0001 2242 4849Department of Health Sciences, Faculty of Medical Sciences, Kyushu University, Fukuoka, Japan; grid.461593.c0000 0001 1939 6592Heidelberg Academy of Sciences and Humanities, Heidelberg, Germany; grid.1008.90000 0001 2179 088XDepartment of Clinical Pathology, University of Melbourne, Melbourne, VIC Australia; grid.240614.50000 0001 2181 8635Department of Pathology, Roswell Park Cancer Institute, Buffalo, NY USA; grid.7737.40000 0004 0410 2071Department of Computer Science, University of Helsinki, Helsinki, Finland; grid.7737.40000 0004 0410 2071Institute of Biotechnology, University of Helsinki, Helsinki, Finland; 
grid.7737.40000 0004 0410 2071Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland; grid.4367.60000 0001 2355 7002Department of Obstetrics and Gynecology, Division of Gynecologic Oncology, Washington University School of Medicine, St. Louis, MO USA; grid.430183.d0000 0004 6354 3547Penrose St. Francis Health Services, Colorado Springs, CO USA; grid.410712.10000 0004 0473 882XInstitute of Pathology, Ulm University and University Hospital of Ulm, Ulm, Germany; grid.272242.30000 0001 2168 5385National Cancer Center, Tokyo, Japan; grid.418377.e0000 0004 0620 715XGenome Institute of Singapore, Singapore, Singapore; grid.47100.320000000419368710Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT USA; grid.453370.60000 0001 2161 6363German Cancer Aid, Bonn, Germany; grid.428397.30000 0004 0385 0924Programme in Cancer and Stem Cell Biology, Centre for Computational Biology, Duke-NUS Medical School, Singapore, Singapore; grid.10784.3a0000 0004 1937 0482The Chinese University of Hong Kong, Shatin, NT, Hong Kong China; grid.233520.50000 0004 1761 4404Fourth Military Medical University, Shaanxi, China; grid.5335.00000000121885934The University of Cambridge School of Clinical Medicine, Cambridge, UK; grid.240871.80000 0001 0224 711XSt. 
Jude Children’s Research Hospital, Memphis, TN USA; grid.415224.40000 0001 2150 066XUniversity Health Network, Princess Margaret Cancer Centre, Toronto, ON Canada; grid.205975.c0000 0001 0740 6917Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA USA; grid.170205.10000 0004 1936 7822Department of Medicine, University of Chicago, Chicago, IL USA; grid.66875.3a0000 0004 0459 167XDepartment of Neurology, Mayo Clinic, Rochester, MN USA; grid.24029.3d0000 0004 0383 8386Cambridge Oesophagogastric Centre, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK; grid.253692.90000 0004 0445 5969Department of Computer Science, Carleton College, Northfield, MN USA; grid.8756.c0000 0001 2193 314XInstitute of Cancer Sciences, College of Medical Veterinary and Life Sciences, University of Glasgow, Glasgow, UK; grid.265892.20000000106344187Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL USA; grid.417691.c0000 0004 0408 3720HudsonAlpha Institute for Biotechnology, Huntsville, AL USA; grid.265892.20000000106344187O’Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL USA; grid.26091.3c0000 0004 1936 9959Department of Pathology, Keio University School of Medicine, Tokyo, Japan; grid.272242.30000 0001 2168 5385Department of Hepatobiliary and Pancreatic Oncology, National Cancer Center Hospital, Tokyo, Japan; grid.430406.50000 0004 6023 5303Sage Bionetworks, Seattle, WA USA; grid.410724.40000 0004 0620 9745Lymphoma Genomic Translational Research Laboratory, National Cancer Centre, Singapore, Singapore; grid.416008.b0000 0004 0603 4965Department of Clinical Pathology, Robert-Bosch-Hospital, Stuttgart, Germany; grid.17063.330000 0001 2157 2938Department of Cell and Systems Biology, University of Toronto, Toronto, ON Canada; grid.4714.60000 0004 1937 0626Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden; grid.410914.90000 0004 0628 
9810Center for Liver Cancer, Research Institute and Hospital, National Cancer Center, Gyeonggi, South Korea; grid.264381.a0000 0001 2181 989XDivision of Hematology-Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea; grid.263136.30000 0004 0533 2389Cheonan Industry-Academic Collaboration Foundation, Sangmyung University, Cheonan, South Korea; grid.240324.30000 0001 2109 4251NYU Langone Medical Center, New York, NY USA; grid.239578.20000 0001 0675 4725Department of Hematology and Medical Oncology, Cleveland Clinic, Cleveland, OH USA; grid.266102.10000 0001 2297 6811Department of Radiation Oncology, University of California San Francisco, San Francisco, CA USA; grid.66875.3a0000 0004 0459 167XDepartment of Health Sciences Research, Mayo Clinic, Rochester, MN USA; grid.414316.50000 0004 0444 1241Helen F. Graham Cancer Center at Christiana Care Health Systems, Newark, DE USA; grid.5253.10000 0001 0328 4908Heidelberg University Hospital, Heidelberg, Germany; CSRA Incorporated, Fairfax, VA USA; grid.83440.3b0000000121901201Research Department of Pathology, University College London Cancer Institute, London, UK; grid.13097.3c0000 0001 2322 6764Department of Research Oncology, Guy’s Hospital, King’s Health Partners AHSC, King’s College London School of Medicine, London, UK; grid.1004.50000 0001 2158 5405Faculty of Medicine and Health Sciences, Macquarie University, Sydney, NSW Australia; grid.411158.80000 0004 0638 9213University Hospital of Minjoz, INSERM UMR 1098, Besançon, France; grid.7719.80000 0000 8700 1153Spanish National Cancer Research Centre, Madrid, Spain; grid.415180.90000 0004 0540 9980Center of Digestive Diseases and Liver Transplantation, Fundeni Clinical Institute, Bucharest, Romania; Cureline, Inc, South San Francisco, CA USA; grid.412946.c0000 0001 0372 6120St. 
Luke’s Cancer Centre, Royal Surrey County Hospital NHS Foundation Trust, Guildford, UK; grid.24029.3d0000 0004 0383 8386Cambridge Breast Unit, Addenbrooke’s Hospital, Cambridge University Hospital NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge, UK; grid.416266.10000 0000 9009 9462East of Scotland Breast Service, Ninewells Hospital, Aberdeen, UK; grid.5841.80000 0004 1937 0247Department of Genetics, Microbiology and Statistics, University of Barcelona, IRSJD, IBUB, Barcelona, Spain; grid.30760.320000 0001 2111 8460Department of Obstetrics and Gynecology, Medical College of Wisconsin, Milwaukee, WI USA; grid.516089.30000 0004 9535 5639Hematology and Medical Oncology, Winship Cancer Institute of Emory University, Atlanta, GA USA; grid.16750.350000 0001 2097 5006Department of Computer Science, Princeton University, Princeton, NJ USA; grid.152326.10000 0001 2264 7217Vanderbilt Ingram Cancer Center, Vanderbilt University, Nashville, TN USA; grid.261331.40000 0001 2285 7943Ohio State University College of Medicine and Arthur G. 
James Comprehensive Cancer Center, Columbus, OH USA; grid.268441.d0000 0001 1033 6139Department of Surgery, Yokohama City University Graduate School of Medicine, Kanagawa, Japan; grid.7497.d0000 0004 0492 0584Division of Chromatin Networks, German Cancer Research Center (DKFZ) and BioQuant, Heidelberg, Germany; grid.10698.360000000122483208Research Computing Center, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; grid.30064.310000 0001 2157 6568School of Molecular Biosciences and Center for Reproductive Biology, Washington State University, Pullman, WA USA; grid.5254.60000 0001 0674 042XFinsen Laboratory and Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark; grid.17063.330000 0001 2157 2938Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON Canada; grid.51462.340000 0001 2171 9952Department of Pathology, Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY USA; grid.411067.50000 0000 8584 9230University Hospital Giessen, Pediatric Hematology and Oncology, Giessen, Germany; grid.418189.d0000 0001 2175 1768Oncologie Sénologie, ICM Institut Régional du Cancer, Montpellier, France; grid.9764.c0000 0001 2153 9986Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany; grid.8379.50000 0001 1958 8658Institute of Pathology, University of Wuerzburg, Wuerzburg, Germany; grid.418484.50000 0004 0380 7221Department of Urology, North Bristol NHS Trust, Bristol, UK; grid.419385.20000 0004 0620 9905SingHealth, Duke-NUS Institute of Precision Medicine, National Heart Centre Singapore, Singapore, Singapore; grid.17063.330000 0001 2157 2938Department of Computer Science, University of Toronto, Toronto, ON Canada; grid.5734.50000 0001 0726 5157Bern Center for Precision Medicine, University Hospital of Bern, University of Bern, Bern, Switzerland; grid.5386.8000000041936877XEnglander Institute for Precision Medicine, 
Weill Cornell Medicine and New York Presbyterian Hospital, New York, NY USA; grid.5386.8000000041936877XMeyer Cancer Center, Weill Cornell Medicine, New York, NY USA; grid.5386.8000000041936877XPathology and Laboratory, Weill Cornell Medical College, New York, NY USA; grid.411083.f0000 0001 0675 8654Vall d’Hebron Institute of Oncology: VHIO, Barcelona, Spain; grid.411475.20000 0004 1756 948XGeneral and Hepatobiliary-Biliary Surgery, Pancreas Institute, University and Hospital Trust of Verona, Verona, Italy; grid.22401.350000 0004 0502 9283National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India; grid.411377.70000 0001 0790 959XIndiana University, Bloomington, IN USA; grid.428965.40000 0004 7536 2436Department of Pathology, GZA-ZNA Hospitals, Antwerp, Belgium; grid.422639.80000 0004 0372 3861Analytical Biological Services, Inc, Wilmington, DE USA; grid.1013.30000 0004 1936 834XSydney Medical School, University of Sydney, Sydney, NSW Australia; grid.38142.3c000000041936754XcBio Center, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA; grid.38142.3c000000041936754XDepartment of Cell Biology, Harvard Medical School, Boston, MA USA; grid.410869.20000 0004 1766 7522Advanced Centre for Treatment Research and Education in Cancer, Tata Memorial Centre, Navi Mumbai, Maharashtra India; grid.266842.c0000 0000 8831 109XSchool of Environmental and Life Sciences, Faculty of Science, The University of Newcastle, Ourimbah, NSW Australia; grid.410718.b0000 0001 0262 7331Department of Dermatology, University Hospital of Essen, Essen, Germany; grid.7497.d0000 0004 0492 0584Bioinformatics and Omics Data Analytics, German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.6363.00000 0001 2218 4662Department of Urology, Charité Universitätsmedizin Berlin, Berlin, Germany; grid.13648.380000 0001 2180 3484Martini-Clinic, Prostate Cancer Center, University Medical Center Hamburg-Eppendorf, Hamburg, Germany; grid.9764.c0000 
0001 2153 9986Department of General Internal Medicine, University of Kiel, Kiel, Germany; grid.7497.d0000 0004 0492 0584German Cancer Consortium (DKTK), Partner site Berlin, Berlin, Germany; grid.239395.70000 0000 9011 8547Cancer Research Institute, Beth Israel Deaconess Medical Center, Boston, MA USA; grid.21925.3d0000 0004 1936 9000University of Pittsburgh, Pittsburgh, PA USA; grid.38142.3c000000041936754XDepartment of Ophthalmology and Ocular Genomics Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA USA; grid.240372.00000 0004 0400 4439Center for Psychiatric Genetics, NorthShore University HealthSystem, Evanston, IL USA; grid.251017.00000 0004 0406 2057Van Andel Research Institute, Grand Rapids, MI USA; grid.26999.3d0000 0001 2151 536XLaboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan; grid.480536.c0000 0004 5373 4593Japan Agency for Medical Research and Development, Tokyo, Japan; grid.222754.40000 0001 0840 2678Korea University, Seoul, South Korea; grid.414467.40000 0001 0560 6544Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, MD USA; grid.9764.c0000 0001 2153 9986Human Genetics, University of Kiel, Kiel, Germany; grid.65499.370000 0001 2106 9910Department of Oncologic Pathology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA; grid.5288.70000 0000 9758 5690Oregon Health and Science University, Portland, OR USA; grid.240145.60000 0001 2291 4776Center for RNA Interference and Noncoding RNA, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.240145.60000 0001 2291 4776Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.240145.60000 0001 2291 4776Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.15628.380000 0004 0393 1193University Hospitals Coventry and 
Warwickshire NHS Trust, Coventry, UK; grid.10417.330000 0004 0444 9382Department of Radiation Oncology, Radboud University Nijmegen Medical Centre, Nijmegen, GA The Netherlands; grid.170205.10000 0004 1936 7822Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL USA; grid.459927.40000 0000 8785 9045Clinic for Hematology and Oncology, St.-Antonius-Hospital, Eschweiler, Germany; grid.51462.340000 0001 2171 9952Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY USA; grid.14013.370000 0004 0640 0021University of Iceland, Reykjavik, Iceland; grid.7497.d0000 0004 0492 0584Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.416266.10000 0000 9009 9462Dundee Cancer Centre, Ninewells Hospital, Dundee, UK; grid.410712.10000 0004 0473 882XDepartment for Internal Medicine III, University of Ulm and University Hospital of Ulm, Ulm, Germany; grid.418596.70000 0004 0639 6384Institut Curie, INSERM Unit 830, Paris, France; grid.268441.d0000 0001 1033 6139Department of Gastroenterology and Hepatology, Yokohama City University Graduate School of Medicine, Kanagawa, Japan; grid.10417.330000 0004 0444 9382Department of Laboratory Medicine, Radboud University Nijmegen Medical Centre, Nijmegen, GA The Netherlands; grid.7497.d0000 0004 0492 0584Division of Cancer Genome Research, German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.163555.10000 0000 9486 5048Department of General Surgery, Singapore General Hospital, Singapore, Singapore; grid.4280.e0000 0001 2180 6431Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore; grid.7737.40000 0004 0410 2071Department of Medical and Clinical Genetics, Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland; grid.24029.3d0000 0004 0383 8386East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, 
Cambridge, UK; grid.21729.3f0000000419368729Irving Institute for Cancer Dynamics, Columbia University, New York, NY USA; grid.418812.60000 0004 0620 9243Institute of Molecular and Cell Biology, Singapore, Singapore; grid.410724.40000 0004 0620 9745Laboratory of Cancer Epigenome, Division of Medical Science, National Cancer Centre Singapore, Singapore, Singapore; Universite Lyon, INCa-Synergie, Centre Léon Bérard, Lyon, France; grid.66875.3a0000 0004 0459 167XDepartment of Urology, Mayo Clinic, Rochester, MN USA; grid.416177.20000 0004 0417 7890Royal National Orthopaedic Hospital – Stanmore, Stanmore, Middlesex UK; grid.6312.60000 0001 2097 6738Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain; Giovanni Paolo II / I.R.C.C.S. Cancer Institute, Bari, BA Italy; grid.7497.d0000 0004 0492 0584Neuroblastoma Genomics, German Cancer Research Center (DKFZ), Heidelberg, Germany; grid.414603.4Fondazione Policlinico Universitario Gemelli IRCCS, Rome, Italy, Rome, Italy; grid.5611.30000 0004 1763 1124University of Verona, Verona, Italy; grid.418135.a0000 0004 0641 3404Centre National de Génotypage, CEA – Institute de Génomique, Evry, France; grid.5012.60000 0001 0481 6099CAPHRI Research School, Maastricht University, Maastricht, ER The Netherlands; grid.418116.b0000 0001 0200 3174Department of Biopathology, Centre Léon Bérard, Lyon, France; grid.7849.20000 0001 2150 7757Université Claude Bernard Lyon 1, Villeurbanne, France; grid.419082.60000 0004 1754 9200Core Research for Evolutional Science and Technology (CREST), JST, Tokyo, Japan; grid.26999.3d0000 0001 2151 536XDepartment of Biological Sciences, Laboratory for Medical Science Mathematics, Graduate School of Science, University of Tokyo, Yokohama, Japan; grid.265073.50000 0001 1014 9130Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, Japan; grid.10306.340000 0004 0606 5382Cancer Ageing and Somatic Mutation Programme, 
Wellcome Sanger Institute, Hinxton, UK; grid.412563.70000 0004 0376 6589University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; grid.4777.30000 0004 0374 7521Centre for Cancer Research and Cell Biology, Queen’s University, Belfast, UK; grid.240145.60000 0001 2291 4776Breast Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.21107.350000 0001 2171 9311Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD USA; grid.4714.60000 0004 1937 0626Department of Oncology-Pathology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden; grid.5491.90000 0004 1936 9297School of Cancer Sciences, Faculty of Medicine, University of Southampton, Southampton, UK; grid.6988.f0000000110107715Department of Gene Technology, Tallinn University of Technology, Tallinn, Estonia; grid.42327.300000 0004 0473 9646Genetics and Genome Biology Program, SickKids Research Institute, The Hospital for Sick Children, Toronto, ON Canada; grid.189967.80000 0001 0941 6502Departments of Neurosurgery and Hematology and Medical Oncology, Winship Cancer Institute and School of Medicine, Emory University, Atlanta, GA USA; grid.5947.f0000 0001 1516 2393Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway; Argmix Consulting, North Vancouver, BC Canada; grid.5342.00000 0001 2069 7798Department of Information Technology, Ghent University, Interuniversitair Micro-Electronica Centrum (IMEC), Ghent, Belgium; grid.4991.50000 0004 1936 8948Nuffield Department of Surgical Sciences, John Radcliffe Hospital, University of Oxford, Oxford, UK; grid.9845.00000 0001 0775 3222Institute of Mathematics and Computer Science, University of Latvia, Riga, LV Latvia; grid.1013.30000 0004 1936 834XDiscipline of Pathology, Sydney Medical School, University of Sydney, Sydney, NSW Australia; grid.5335.00000000121885934Department of Applied 
Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, UK; grid.51462.340000 0001 2171 9952Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY USA; grid.21729.3f0000000419368729Department of Statistics, Columbia University, New York, NY USA; grid.8993.b0000 0004 1936 9457Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden; grid.43169.390000 0001 0599 1243School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China; grid.24029.3d0000 0004 0383 8386Department of Histopathology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK; grid.4991.50000 0004 1936 8948Oxford NIHR Biomedical Research Centre, University of Oxford, Oxford, UK; grid.410427.40000 0001 2284 9329Georgia Regents University Cancer Center, Augusta, GA USA; grid.417286.e0000 0004 0422 2524Wythenshawe Hospital, Manchester, UK; grid.4367.60000 0001 2355 7002Department of Genetics, Washington University School of Medicine, St.Louis, MO USA; grid.423940.80000 0001 2188 0463Department of Biological Oceanography, Leibniz Institute of Baltic Sea Research, Rostock, Germany; grid.4991.50000 0004 1936 8948Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK; grid.39382.330000 0001 2160 926XDepartment of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA; grid.66875.3a0000 0004 0459 167XThoracic Oncology Laboratory, Mayo Clinic, Rochester, MN USA; grid.66875.3a0000 0004 0459 167XDepartment of Obstetrics and Gynecology, Division of Gynecologic Oncology, Mayo Clinic, Rochester, MN USA; grid.510975.f0000 0004 6004 7353International Institute for Molecular Oncology, Poznań, Poland; grid.22254.330000 0001 2205 0971Poznan University of Medical Sciences, Poznań, Poland; grid.7497.d0000 0004 0492 0584Genomics and Proteomics Core Facility High Throughput Sequencing Unit, German 
Cancer Research Center (DKFZ), Heidelberg, Germany; grid.410724.40000 0004 0620 9745NCCS-VARI Translational Research Laboratory, National Cancer Centre Singapore, Singapore, Singapore; grid.4367.60000 0001 2355 7002Edison Family Center for Genome Sciences and Systems Biology, Washington University, St. Louis, MO USA; grid.301713.70000 0004 0393 3981MRC-University of Glasgow Centre for Virus Research, Glasgow, UK; grid.5288.70000 0000 9758 5690Department of Medical Informatics and Clinical Epidemiology, Division of Bioinformatics and Computational Biology, OHSU Knight Cancer Institute, Oregon Health and Science University, Portland, OR USA; grid.33199.310000 0004 0368 7223School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China; grid.136593.b0000 0004 0373 3971Department of Cancer Genome Informatics, Graduate School of Medicine, Osaka University, Osaka, Japan; grid.7700.00000 0001 2190 4373Institute of Computer Science, Heidelberg University, Heidelberg, Germany; grid.1013.30000 0004 1936 834XSchool of Mathematics and Statistics, University of Sydney, Sydney, NSW Australia; grid.170205.10000 0004 1936 7822Ben May Department for Cancer Research, University of Chicago, Chicago, IL USA; grid.170205.10000 0004 1936 7822Department of Human Genetics, University of Chicago, Chicago, IL USA; grid.5386.8000000041936877XTri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY USA; grid.43169.390000 0001 0599 1243The First Affiliated Hospital, Xi’an Jiaotong University, Xi’an, China; grid.10784.3a0000 0004 1937 0482Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Shatin, NT, Hong Kong China; grid.240145.60000 0001 2291 4776Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX USA; grid.428397.30000 0004 0385 0924Duke-NUS Medical School, Singapore, Singapore; grid.16821.3c0000 0004 0368 8293Department of 
Surgery, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; grid.8756.c0000 0001 2193 314XSchool of Computing Science, University of Glasgow, Glasgow, UK; grid.55325.340000 0004 0389 8485Division of Orthopaedic Surgery, Oslo University Hospital, Oslo, Norway; grid.1002.30000 0004 1936 7857Eastern Clinical School, Monash University, Melbourne, VIC Australia; grid.414539.e0000 0001 0459 5396Epworth HealthCare, Richmond, VIC Australia; grid.65499.370000 0001 2106 9910Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA USA; grid.261331.40000 0001 2285 7943Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH USA; grid.413944.f0000 0001 0447 4797The Ohio State University Comprehensive Cancer Center (OSUCCC – James), Columbus, OH USA; grid.267308.80000 0000 9206 2401The University of Texas School of Biomedical Informatics (SBMI) at Houston, Houston, TX USA; grid.10698.360000000122483208Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC USA; grid.16753.360000 0001 2299 3507Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL USA; grid.1013.30000 0004 1936 834XFaculty of Medicine and Health, University of Sydney, Sydney, NSW Australia; grid.5645.2000000040459992XDepartment of Pathology, Erasmus Medical Center Rotterdam, Rotterdam, GD The Netherlands; grid.430814.a0000 0001 0674 1393Division of Molecular Carcinogenesis, The Netherlands Cancer Institute, Amsterdam, CX The Netherlands; grid.7400.30000 0004 1937 0650Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
License: © The Author(s) 2020 CC BY 4.0 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Article links: DOI: 10.1038/s41588-019-0557-x | PubMed: 32024997 | PMC: PMC7058535
Main
Mitochondria are crucial cellular organelles in eukaryotes, and there can be several hundred mitochondria in a single human cell1. Known as ‘the powerhouses of the cell’, mitochondria play essential roles in generating most of the cell’s energy through oxidative phosphorylation2. Despite its small size (16.6 kilobases (kb)), the circular mitochondrial genome encodes 13 proteins that form respiratory chain complexes with other proteins of nuclear origin3. The involvement of mitochondria in carcinogenesis has long been suspected4,5 because altered energy metabolism is a common feature of cancer6. Furthermore, mitochondria play important roles in other tasks, such as biosynthesis, signaling, cellular differentiation, apoptosis, maintaining control of the cell cycle and cell growth, all of which are intrinsically linked to tumorigenesis5,7.
In several recent studies, molecular characterization of mitochondria was performed in cancer by using next-generation sequencing data8–13, but these studies usually describe one specific dimension of the mitochondrial genome (for example, somatic mutations) based on relatively small sample cohorts. Furthermore, because these studies relied on whole-exome sequencing data, the relatively low sequencing depth of the mitochondrial genome limits their accuracy and scope. Thus, a comprehensive, multidimensional molecular portrait of mitochondria across a broad range of cancer types has not been achieved. Moreover, previous studies have focused on the patterns of mitochondrial alterations alone, without fully exploring the interplay between the mitochondrial genome and the nuclear genome, as well as the biomedical significance of mitochondrial alterations.
The Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium aggregated whole-genome sequencing (WGS) data from 2,658 cancers across 38 tumor types generated by the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) projects. These sequencing data were re-analyzed with standardized, high-accuracy pipelines to align to the human genome (reference build hs37d5) and identify germline variants and somatically acquired mutations, as described14. Meanwhile, TCGA has generated RNA sequencing (RNA-seq) data from a large number of patient samples, which allow for assessment of the transcriptional activities of mitochondrial genes15. These large-scale datasets create a tremendous resource for characterizing cancer mitochondrial genomes at an unprecedented level (Fig. 1a). We first characterized mitochondrial somatic mutations, nuclear transfers and copy numbers, then investigated their interactions with nuclear somatic alterations16,17, and finally examined the expression profiles of mitochondrial genes and their connections with clinically relevant nuclear genes.

Results
Mutational landscape of cancer mitochondrial genomes
To characterize somatic mutations in mitochondrial genomes across cancer types, we extracted the mitochondrial DNA (mtDNA)-mapped reads of 2,658 cancer and matched control sample pairs from the PCAWG Consortium. The samples we surveyed covered 21 cancer tissues and 38 specific cancer types (Supplementary Table 1). On average, the sequencing depth for the mitochondrial genome was 9,959×, which was much higher than that obtained from whole-exome sequencing data, allowing for confident detection of somatic mutations at a very low heteroplasmic level (variant allele fraction (VAF) > 1%; Supplementary Fig. 1). By applying a computational pipeline that carefully considered various potentially confounding factors (for example, sample cross-contamination, mismapping of reads from nuclear mtDNA-like sequences18, and artifactual mutations caused by oxidative DNA damage during library preparation19), we identified a total of 7,611 somatic substitutions and 930 small indels in 2,536 high-quality cancer samples (122 samples were excluded from the mutation analysis because of the issues mentioned above; Supplementary Fig. 2 and Methods). The high reliability of the mutations was confirmed by long-range PCR-based validation (Supplementary Table 2) and by inspection of the mutational spectrum of the very low-VAF mutation candidates (Supplementary Fig. 3).
Of the 7,611 substitutions, >85% were clearly heteroplasmic, showing VAFs lower than 0.6 (average: 0.2; median: 0.045). Overall, mtDNA mutations located in the transcribed regions were also found in RNA-seq with similar VAFs, except for a fraction of transfer RNA (tRNA) mutations showing much higher VAFs in transcripts due to the accumulation of unprocessed tRNA precursors during the processing of polycistronic mitochondrial transcripts10 (Supplementary Fig. 4). Across all of the cancer samples, we observed several mutational hotspots in the regulatory D-loop region and the ND4 gene (Fig. 1b). Of the 13 protein-coding genes, ND5 was the most frequently mutated in most cancer types, while ND4 was most frequently mutated in prostate and lung cancers, and COX1 was most frequently mutated in breast, cervical and bladder cancers (Supplementary Fig. 5). We found that cancer type and gene identity were each associated with the mutation status of the 13 coding genes (log-linear model, P < 2.2 × 10−16 for cancer type; P < 2.2 × 10−16 for gene), but the effect of their interaction was not significant (P = 0.12 for the cancer type × gene interaction).
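The VAF thresholds quoted above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `MtCall` container, the `classify` function and the category labels are hypothetical names introduced here, and only the two cutoffs (VAF > 1% for detection, VAF < 0.6 for clearly heteroplasmic calls) are taken from the text.

```python
from dataclasses import dataclass

MIN_VAF = 0.01             # detection floor quoted in the text (VAF > 1%)
HETEROPLASMY_CUTOFF = 0.6  # VAFs below this are described as clearly heteroplasmic


@dataclass(frozen=True)
class MtCall:
    """Hypothetical container for read counts at one mtDNA position."""
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # all reads covering the position

    @property
    def vaf(self) -> float:
        """Variant allele fraction = variant reads / total reads."""
        if self.total_reads <= 0:
            raise ValueError("total_reads must be positive")
        return self.alt_reads / self.total_reads


def classify(call: MtCall) -> str:
    """Assign an illustrative category based on the thresholds in the text."""
    vaf = call.vaf
    if vaf <= MIN_VAF:
        return "below_detection"
    if vaf < HETEROPLASMY_CUTOFF:
        return "heteroplasmic"
    return "near_homoplasmic"
```

At the reported average depth of 9,959×, the 1% floor corresponds to roughly 100 variant-supporting reads, which may help explain why low-level heteroplasmy is detectable in WGS data but not in shallower exome-derived mtDNA coverage.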
In contrast with somatic mutations in nuclear genomes (where cancer type-specific mutational signatures are observed)20, mtDNA mutational signatures were very similar across tumor types, with C:G>T:A (58.3%) and T:A>C:G (34.2%) substitutions being the most and second most frequent mutation types, respectively (Fig. 1c and Supplementary Fig. 6). Indeed, the impact of well-known carcinogens (for example, tobacco smoking (C:G>A:T dominant; signature 4), ultraviolet light (C:G>T:A dominant at dipyrimidine contexts; signature 7) and reactive oxygen species (G:C>T:A dominant)) was minimal (Supplementary Fig. 7) even in lung and skin cancers (the latest mutational signatures of nuclear genomes are available from the Catalogue of Somatic Mutations in Cancer database: https://cancer.sanger.ac.uk/cosmic/signatures). Instead, the vast majority of mtDNA mutations manifested extreme replicational strand bias9,21,22; that is, predominant G>A and T>C substitutions and deficient complementary C>T and A>G substitutions on the light (L) strand of the mtDNA genome sequence (+strand of the revised Cambridge Reference Sequence) despite the relative depletion of guanines and thymines on the L strand (Supplementary Fig. 6). These mutational signatures suggest that mitochondria-specific, replication-coupled mutational processes (such as mtDNA polymerase gamma error9,21,23 or other replication-coupled DNA damage mechanisms) are dominantly responsible for somatic mtDNA mutations in cancer.
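The distinction between the strand-collapsed classes (for example, C:G>T:A) and the strand-specific counts (G>A versus C>T on the L strand) can be made concrete with a short Python sketch. The function names and the toy input are assumptions for illustration only; the input is taken to be (reference, alternate) base pairs reported on the L strand.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def six_class(ref: str, alt: str) -> str:
    """Collapse a substitution onto its pyrimidine reference base (G>A becomes C>T)."""
    if ref in "CT":
        return f"{ref}>{alt}"
    return f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"


def spectra(l_strand_subs):
    """Return (strand-collapsed counts, strand-specific counts) for L-strand calls."""
    collapsed = Counter(six_class(r, a) for r, a in l_strand_subs)
    stranded = Counter(f"{r}>{a}" for r, a in l_strand_subs)
    return collapsed, stranded


# Toy data (not from the study): G>A and T>C dominate on the L strand.
toy = [("G", "A")] * 5 + [("C", "T")] + [("T", "C")] * 3 + [("A", "G")]
collapsed, stranded = spectra(toy)
```

In the collapsed view, G>A and C>T calls fall into the same class, so a strand bias is invisible there; it only appears when the strand-specific counts of complementary pairs (G>A against C>T, T>C against A>G) are compared.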
In agreement with their endogenous origin, we observed clock-like properties in mtDNA mutations, as in nuclear genome mutations24. The number of mtDNA mutations in our study was largely proportional to the age of the patient at the time of tissue sampling (Supplementary Fig. 8). In addition, the maximum VAF of somatic mutations in a cancer tissue showed a positive correlation with age, on average (Fig. 1d; P < 2.2 × 10−16). Collectively, these results suggest that the vast majority of mtDNA somatic mutations were: (1) acquired at an earlier age when the cell lineage was phenotypically normal; and (2) overall shifted towards homoplasmy throughout life in the cellular lineage of the neoplastic cells. The spread to homoplasmy can, in theory, be caused by either physiological advantage (selection) or a series of asymmetric segregations during cell divisions (drift)25, or both.
To further assess the potential impact of mtDNA mutations, we performed integrative analysis by examining alterations from mitochondrial and nuclear genomes simultaneously17. We observed significant positive correlations between the mutation burdens of mitochondrial and nuclear genomes in several cancer types, with the highest correlations observed in kidney chromophobe and thyroid cancers (magenta bars in Fig. 1e). Some of these correlations may be explained by the age effect, as the mutation numbers in both mitochondrial and nuclear genomes were significantly correlated with patient age in the corresponding cancer types (bars marked with an asterisk in Fig. 1e). In addition, we examined the mtDNA mutation frequency in the context of nuclear drivers. Although nuclear driver alterations exist in the majority of patients in most cancer types, a notable proportion of patients (22.2% with kidney chromophobe cancer and 18.8% with thyroid cancer) bear non-silent mtDNA mutations but no known nuclear drivers, suggesting a potential functional contribution of mtDNA mutations in the absence of nuclear drivers in these cancer types (Fig. 1f).
Hypermutation process in mitochondrial genomes
Hypermutation processes have been well established for a small proportion of cancer nuclear genomes (for example, microsatellite instability)26,27, but have not been reported for mitochondrial genomes. Of the 2,536 cancer samples surveyed, seven cases showed extremely large numbers of mtDNA somatic substitutions (>13 mutations), which were larger than expected from the background distribution (Fig. 2a; around three somatic substitutions per sample on average, with a standard deviation of 2.6). The mutational spectra in these hypermutated samples were sometimes clearly distinguished from the background L-strand G>A and T>C substitution dominant signature (Fig. 2b), suggesting that the massive numbers of mutations are not the consequence of the gradual accumulation of ordinary mtDNA substitutions.
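To give a sense of how far the >13 cutoff lies from the background, the following Python sketch expresses a per-sample substitution count in standard deviations above the reported mean. The constants are the rounded values quoted above; the function names are hypothetical, and the z-score is only a rough yardstick because mutation counts are discrete and skewed rather than normally distributed.

```python
MEAN_SUBS = 3.0            # approximate mean substitutions per sample (from the text)
SD_SUBS = 2.6              # standard deviation quoted in the text
HYPERMUTATION_CUTOFF = 13  # samples with more substitutions than this are flagged


def z_score(count: int) -> float:
    """Number of standard deviations a count lies above the background mean."""
    return (count - MEAN_SUBS) / SD_SUBS


def is_hypermutated(count: int) -> bool:
    """Apply the >13 substitutions rule described in the text."""
    return count > HYPERMUTATION_CUTOFF
```

Under these rounded values, the cutoff of 13 sits about 3.8 standard deviations above the mean, and a sample with 33 substitutions sits about 11.5 standard deviations above it.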

The most striking case was a breast cancer sample (sample ID: SP6730) harboring 33 mutations, 30 of which were localized in a 2-kb region (Fig. 2c), resulting in a local hypermutational rate (>75× higher than the background mutational rate). The mutations were neither of germline origin (~70% were novel) nor caused by sequencing errors, as confirmed by independent exome and RNA-seq analyses (Supplementary Fig. 9). Interestingly, most of the localized mutations (n = 28) were T>C substitutions on the L strand (Fig. 2b,c) and were co-clonal with one another, with highly similar VAFs (~7%) and direct physical phasing by Illumina sequencing reads (Supplementary Fig. 9). Collectively, these lines of evidence strongly suggest that the 28 localized mutations (19 missense, four silent and five tRNA mutations) were acquired by a ‘single-hit’ catastrophic mutational mechanism with strand-specific T>C substitutions as a dominant spectrum, reminiscent of the kataegis phenomenon in the nuclear genome28 (Fig. 2c) and/or complex somatic mutations reported in mtDNA29. The mutated mtDNA copy is then likely to have shifted to an appreciable VAF (~7% frequency) through a series of replications throughout the cell lineages, despite the low probability that it causes a defective phenotype.
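The order of magnitude of the local enrichment can be checked with a back-of-the-envelope Python sketch. The definition of the background rate is not given in this section, so the calculation below assumes it is the cohort-wide average number of substitutions per sample spread evenly over the 16.6-kb genome; that assumption, and the variable names, are introduced here for illustration only.

```python
GENOME_KB = 16.6            # mitochondrial genome size quoted in the introduction
TOTAL_SUBSTITUTIONS = 7611  # somatic substitutions across the cohort
TOTAL_SAMPLES = 2536        # high-quality samples analyzed

# Assumed background: average substitutions per sample per kb of mtDNA.
background_per_kb = TOTAL_SUBSTITUTIONS / TOTAL_SAMPLES / GENOME_KB

# Local rate in sample SP6730: 30 mutations within a 2-kb window.
local_per_kb = 30 / 2.0

fold_enrichment = local_per_kb / background_per_kb
print(f"background: {background_per_kb:.3f} per kb, fold enrichment: {fold_enrichment:.0f}x")
```

Under this assumed background, the local rate comes out at roughly 80-fold above average, which is consistent with the >75× figure reported in the text, although the authors may have defined the background differently.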
Cancer type-specific selective pressures on mtDNA mutations
To investigate the functional consequences of mutations in mtDNA genes, we examined the dN/dS ratio (a common measure of selective pressure on missense mutations) with consideration of the unique mtDNA mutational signature9. We found that dN/dS was overall close to 1 for missense mutations at different VAFs across cancer types, suggesting that overall selection for mtDNA missense mutations is nearly neutral (Supplementary Fig. 10). However, this should not be interpreted to mean that all missense mtDNA mutations are passengers.
For truncating mutations in the 13 mtDNA genes, we found evidence of negative selection in most cancer types, suggesting the importance of intact mitochondrial function in cancer cells. For example, the VAFs of mtDNA truncating mutations were notably lower than those of missense or silent mutations (Fig. 3a). Interestingly, kidney, colorectal and thyroid cancers showed the opposite trend, where mtDNA truncating mutations exhibited significantly higher VAFs than the background (F-test, P < 2.2 × 10−16; Fig. 3a). The enrichment of nearly homoplasmic (>60% VAF) truncating mutations was very striking in kidney cancers, especially in chromophobe and papillary types, suggesting that inactivation of normal mitochondrial function is an important step in tumorigenesis30 (Fig. 3b and Supplementary Fig. 11). The mtDNA truncating mutations were enriched in ND5. Compared with kidney chromophobe and colorectal cancers, kidney papillary cancers harbored ND5 truncating mutations enriched in the amino-terminal region (Fisher’s exact test, P = 0.05; Fig. 3c). By integrating the mutation data of nuclear genes, we found that the high-VAF truncating mutations in the two kidney cancer types were mutually exclusive with mutations in known cancer genes (Fisher’s exact test, P = 0.01; Fig. 3d). Moreover, samples with mtDNA truncating mutations showed upregulation of gene expression in cancer-related pathways, such as mammalian target of rapamycin signaling, tumor necrosis factor-α signaling, oxidative phosphorylation and protein secretion (false discovery rate (FDR) < 0.05; Fig. 3e). Collectively, these results strongly suggest functional oncogenic impacts of mitochondrial truncating mutations in the initiation and clonal evolution of the specific cancer types.

Somatic transfer of mtDNA into the nuclear genome
The migration of mtDNA into the nuclear genome has been assessed using different technologies31–33. Recently, somatic mtDNA nuclear transfers (SMNTs) have been more systematically studied in nucleotide resolution11, mostly in breast cancers. In this study, of the 2,658 cancer cases across 21 tissue types, we found 55 positive cases (2.1% overall positive rate) (Methods). The SMNT rate varied according to the cancer tissue type (Fisher’s exact test, P < 1 × 10−5; Fig. 4a). For example, lung, skin, breast and uterine cancers showed frequencies higher than 5%. In particular, human epidermal growth factor receptor 2-positive (HER2+) breast cancers and squamous cell lung cancers showed positive rates of 16.0% (four out of 25 cases) and 14.6% (seven out of 48 cases), respectively, which were significantly higher than the average (Fisher’s exact test, P < 0.003 and P < 0.001, respectively). In contrast, we did not find any positive cases from blood, kidney, esophagogastric, liver, prostate and colorectal cancers. The samples with SMNTs showed a much higher number of global and local structural variations in the nuclear genome than the control samples16 (P = 1 × 10−4; Fig. 4b and Supplementary Fig. 12). SMNT integration sites (breakpoints) were spatially closer to inversion and translocation breakpoints than expected (Fig. 4c). These results suggest that the integration of mtDNA segments into nuclear DNA is often mechanistically combined with some specific processes underlying structural variations in the nuclear genome.

Despite the overall low SMNT frequency (~2%), some cancer samples showed up to three independent SMNT events (Fig. 4d and Supplementary Fig. 13). Sometimes, somatically transferred mtDNA segments were extensively rearranged (Supplementary Fig. 13b), implying extreme genomic instability at the time of the SMNT events. We observed 42 SMNT events in 35 tumor cases that were integrated within genes, mostly in introns (n = 37), with a few events in protein-coding regions (n = 3) and untranslated regions (n = 2) (Supplementary Table 3). Among these, open reading frames of at least 23 genes (23/42 = 55%), including cancer genes such as ERBB2, FOLH1 and ULK2, were predicted to be altered by these SMNTs and their combined structural variant events in the vicinity (Supplementary Fig. 14). Of particular interest, one SMNT was involved in transforming focal amplification of the ERBB2 gene in a HER2+ breast cancer genome (Fig. 4e).
Copy-number and structural variations of mtDNA
Although previous studies have examined mtDNA copy numbers in individual cancer types34–36 or from a collection of whole-exome sequencing data12, we performed a systematic and accurate analysis of mtDNA copy numbers per cell over the largest sample cohort with WGS data so far, accounting for confounding factors such as normal-cell contamination and the genome ploidy of tumor cells (Supplementary Fig. 15 and Methods).
Based on the 2,157 cancer samples that passed the purity filter, we observed great variation in mtDNA copy numbers across and within cancer types: mtDNAs were most abundant in samples of ovarian cancer (median: 644 copies per cell) and least abundant in myeloid cancer (median: 90 copies per cell) (Fig. 5a). Different cancer subtypes originating from the same tissue sometimes showed distinct mtDNA copy-number distributions (Fig. 5b and Supplementary Fig. 16). For example, the mtDNA copy numbers for kidney chromophobe were significantly higher than those for kidney clear cell and kidney papillary (analysis of variance (ANOVA), P < 7.8 × 10−6; Fig. 5b). This may be interlinked with the general inadequacy of mitochondrial quality control and resultant increase in the steady-state mtDNA copy number, as seen in renal oncocytoma37. Indeed, we found that the mtDNA copy number was significantly higher in the samples with high-allele-frequency truncating mutations (ANOVA with consideration of confounders, P < 1.7 × 10−4; Fig. 5c), suggesting that the dosage effect of mtDNAs was selected to compensate for the deleterious effect of truncating mutations. For the cancer samples with WGS data from matched normal tissues (n = 507), we observed increased mtDNA copy numbers in cancer samples in patients with chronic lymphocytic leukemia, lung squamous cell carcinoma and pancreatic adenocarcinoma, but decreased copy numbers in cancer samples in patients with kidney clear cell carcinoma, hepatocellular carcinoma and myeloproliferative neoplasm (Fig. 5d). At face value, the distinct patterns in different cancer types may be due to cancer-specific oncogenic stimulation, metabolic activity and mitochondrial malfunctions. 
For example, a recent study12 suggested that significantly decreased mtDNA copy number in kidney clear cell cancer may be due to downregulation of peroxisome proliferator-activated receptor-γ coactivator 1α (a central regulator of mitochondrial biogenesis) by hyperactivated hypoxia-inducible factor 1α, which is most frequently mutated and activated in this disease38. However, since the available mtDNA copy numbers in normal tissues are average values from mixtures of many heterogeneous cell types with unknown relative contributions, a direct comparison between tumor and adjacent normal tissues should be interpreted cautiously.

To assess the potential biomedical significance of mtDNA copy numbers, we examined their correlations with key clinical variables. We found significant positive correlations between the mtDNA copy number and the patient’s age at diagnosis in prostate (Spearman’s rank, Rs = 0.31; P < 1.7 × 10−4; Fig. 5e), colorectal and skin cancers (Supplementary Fig. 17). In contrast, we observed negative correlations of normal blood mtDNA copy number with patient age in most cases (Supplementary Fig. 18). We further observed correlations between mtDNA copy number and tumor stage in multiple cancer types (Fig. 5f and Supplementary Fig. 19).
Using WGS data, we examined the focal copy gain and loss in the mitochondrial genomes that were known to be present in prostate cancers and aged tissues39. Of the 2,658 cancer samples, three (0.11%) showed notable structural variants in the mtDNA (Fig. 5g). For example, a pancreatic cancer case (sample ID: SP76017) harbored a ~3.4-kb-long mtDNA loss that truncated ribosomal RNA and ND1 genes. The VAF of this mutant mtDNA was estimated at 63%. Similarly, a melanoma case (sample ID: SP127680) showed tandem duplication of an mtDNA segment of ~4 kb, with 100% VAF. Thus, our analysis identified structural variants in mtDNA genomes based on WGS.
Co-expression network analysis of mitochondrial genes
To understand the functional impact of 13 mtDNA genes in cancers, we quantified the gene expression levels using RNA-seq data profiled from 4,689 TCGA tumor samples of 13 cancer types (Supplementary Table 4). The correlation between the gene expression levels and the mtDNA copy number varied by cancer type (Supplementary Fig. 20). Among the cancer types, the mtDNA genes were highly expressed in the three types of kidney cancer (chromophobe, papillary and clear cell) but weakly expressed in the three types of squamous cell carcinoma (cervical, lung and head and neck) (Fig. 6a). This observation was partially due to the relative abundance of mtDNA copy number across cancer types and is consistent with a study of normal tissues40.

To gain more insight into the functions of mtDNA genes and their related nuclear genes and pathways, for each cancer type, we used the weighted gene co-expression network analysis (WGCNA) package41 to build a weighted gene co-expression network that consisted of both nuclear genes and mitochondrial genes (Methods). We then performed gene set enrichment analysis (GSEA)42 based on the rank of all nuclear genes by measuring their edge strength to a mitochondrial gene in the co-expression network. We found oxidative phosphorylation to be the top-ranked enriched pathway, and to be enriched in eight out of the 13 cancer types examined (FDR < 0.05), highlighting the essential role of mitochondrial genes in energy generation (Fig. 6b). Pathways related to the cell cycle (MYC targets, mitotic spindle, G2/M checkpoint and E2F targets) and DNA repair were also enriched in multiple cancer types (Fig. 6b), consistent with the established notion that mtDNA plays an important role in these pathways37,43.
We also examined the mtDNA-centric co-expression networks (Fig. 6c and Methods). Across cancer types, the mtDNA genes were almost always strongly interconnected, which is expected since they are transcribed as long polycistronic precursor transcripts44. Interestingly, several clinically actionable genes were among the neighboring genes that showed strong co-expression patterns with mtDNA genes (Fig. 6c and Supplementary Fig. 21). For example, AR, EGFR, DDR2 and MAP2K2 were connected with mtDNA genes in prostate cancer, and TMPRSS2, NF1, PIK3CA, BRCA1 and TOP1 were the top neighbors of mtDNA genes in multiple cancer types. This correlation-based analysis does not necessarily suggest causality, and further efforts are needed to investigate these relationships. Elucidating the underlying mechanisms may lay a foundation for developing mtDNA-related cancer therapy.
An open-access Cancer Mitochondrial Atlas data portal
To facilitate mitochondria-related biological discoveries and clinical applications, we developed an open-access, user-friendly data portal, The Cancer Mitochondrial Atlas (TCMA), for fluent exploration of the various types of molecular data characterized in this study (Supplementary Fig. 22). The data portal can be accessed at http://bioinformatics.mdanderson.org/main/TCMA:Overview. There are four modules in TCMA: somatic mutations, nuclear transfer, copy number and gene expression. The first three modules are based on the ICGC WGS data and provide detailed annotations for the corresponding features of each cancer sample. The last module is based on TCGA RNA-seq data and provides an interactive interface through which users can visualize the co-expression network with convenient navigation and zoom features. Not only can users browse and query the molecular data by cancer type, they can also download all of the data for their own analysis.
Discussion
This work characterizes the cancer mitochondrial genome in a comprehensive manner, including somatic mutations, nuclear transfer, copy number, structural variants and mtDNA gene expression. Because of the ultra-high coverage of mtDNA from the WGS data and the large number of patient samples surveyed, our study provides a definitive landscape of mtDNA somatic mutations and identifies several unique features. First, we report hypermutated mitochondrial cases, highlighting the dynamic mutational processes in this tiny genome. Second, our systematic analysis of mitochondrial genomes has firmly shown that several cancer types are enriched for high-allele-frequency truncating mutations, including the previously reported kidney chromophobe30,45 as well as the newly identified kidney papillary, thyroid and colorectal cancers. Interestingly, the thyroid and kidney are the most frequent sites of oncocytomas, which are rare, benign tumors characterized by frequent nuclear chromosomal aneuploidy as well as vast accumulation of defective mitochondria45,46, further supporting the functional association between mitochondrial inactivation and the pathogenesis of these cancer types. Third, in contrast with the diversified mutational signatures observed in the nuclear genomes of different cancers20, mtDNAs show very similar mutational signatures regardless of cancer tissue origins: predominantly G>A and T>C substitutions on the L strand. This monotonous pattern may partially stem from different mutational generators and DNA repair processes between the nucleus and mitochondria9,47,48. Due to their large numbers of copies per cell, mitochondria may simply remove mtDNA damaged by external mutagens (for example, ultraviolet radiation, tobacco smoking and reactive oxygen species) through autophagy and other mitochondrial dynamic mechanisms49, rather than employing a complex array of repair proteins as in the nucleus.
One unique aspect of our study is the integrative analysis of mitochondrial molecular alterations with those in the nuclear genome that are characterized by the PCAWG Consortium. We found that: (1) high-allele-frequency truncating mtDNA mutations are mutually exclusive to mutated cancer genes in kidney cancer; (2) mtDNA nuclear transfers are associated with increased numbers of structural variants in the nuclear genome; and (3) mtDNA co-expressed nuclear genes are enriched in several processes critical for tumor development. These results indicate that the mitochondrial genome is an essential component in understanding the complex molecular patterns observed in cancer genomes and helping to pinpoint potential cancer driver events. Our results, such as the nuclear transfer of mtDNA into a therapeutic target gene, correlations of mtDNA copy numbers with clinical variables, and the co-expression of mtDNA and clinically actionable genes, underscore the clinical importance of mitochondria.
Taken together, this study has untangled and characterized the full spectrum of molecular alterations of mitochondria in human cancers. Our analyses have provided essentially complete catalogs of somatic mtDNA alterations in cancers, including substitutions, indels, copy-number alterations and structural variants. Furthermore, we have developed a user-friendly web resource to enable the broader biomedical community to capitalize on our results. These efforts lay a foundation for translating mitochondrial biology into clinical investigations.
Methods
Data generation and collection
We extracted BAM files of mtDNA sequencing reads from the whole-genome alignment files of 2,658 cancer samples and their matched normal tissue samples generated by the PCAWG Consortium. BWA was used to align the reads to the human reference genome (hs37d5). From the CGHub, we obtained TCGA RNA-seq BAM files of 13 cancer types, all of which employed paired-end sequencing strategies. We used Cufflinks to quantify the messenger RNA expression levels (in fragments per kilobase per million mapped fragments) of the 13 mitochondrial protein-coding genes. We obtained the nuclear somatic mutations and annotated driver mutations of corresponding samples as described17.
Somatic mutation calling
The nuclear genome mutations were called using the Sanger pipeline, provided by the PCAWG. The mitochondrial variants were initially called using VarScan2 (ref. 50) with the same parameter settings as previously reported9: --strand-filter 1 (mismatches should be reported by both forward and reverse reads), --min-var-freq 0.01 (minimum VAF 1%), --min-avg-qual 20 (minimum base quality 20), --min-coverage × and --min-reads2 ×. We then applied a series of downstream bioinformatic filters to further remove false positives, as follows (Supplementary Fig. 2a).
First, we filtered germline polymorphisms and false positive calls (for example, frequent mapping errors due to known mtDNA homopolymers, candidates with substantial mapping strand bias and candidates with substantial mutant alleles in the matched normal sample). For analytic simplicity, we removed multi-allelic mtDNA mutations and back mutations from the non-reference to the reference allele. After this filtration step, we obtained 10,083 somatic substitution candidates.
Second, we examined DNA cross-contamination because even minor DNA cross-contamination (that is, contamination level < 3%) would generate many low-VAF false positive calls that are in fact germline polymorphisms from the contaminating sample. We tested whether mtDNA somatic mutations detected from a cancer sample show greater overlap with known mtDNA polymorphisms than expected from the overall average rate (73.5%; 3,922/5,337 substitutions) using the binomial test with a cutoff P < 0.01. From this step, we removed 96 samples with evidence of DNA cross-contamination (harboring 935 known mutations out of 1,131 known mutation candidates).
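The cross-contamination screen described above can be sketched as a binomial test on the overlap with known polymorphisms. This is a minimal illustration and not the pipeline used in the study: the function names and the one-sided alternative are assumptions, and only the background rate (3,922/5,337) and the P < 0.01 cutoff are taken from the text.

```python
from math import comb

# Overall fraction of substitution candidates at known polymorphic sites (from the text).
BACKGROUND_RATE = 3922 / 5337
# Significance cutoff stated in the text.
P_CUTOFF = 0.01


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def looks_contaminated(n_known, n_total):
    """Flag a sample whose candidates overlap known polymorphisms more than expected.

    n_known: candidates in the sample that coincide with known mtDNA polymorphisms.
    n_total: all substitution candidates in the sample.
    The one-sided (greater) alternative is an assumption of this sketch.
    """
    if n_total == 0:
        return False
    return binomial_upper_tail(n_known, n_total, BACKGROUND_RATE) < P_CUTOFF
```

Under this one-sided reading, a sample with only a handful of candidates cannot reach P < 0.01 even if all of them are known polymorphisms, so the screen mainly removes samples with many low-VAF candidates at polymorphic sites.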
Third, we examined the overall mtDNA substitution signatures in the 96 possible mutation classes. We removed four samples with extremely high proportions of C>A substitutions with strong sequence context bias (at CpCpN>CpApN; most frequently at CpCpG>CpApG; Supplementary Fig. 2b). This spectrum is known to arise from artificial guanine oxidation during sequencing library preparation steps19 with low VAF (1–2%). We explicitly removed these samples from further analyses.
Then, we examined the possibility of false positive calls due to mismapping of reads from inherited nuclear mtDNA-like sequences (known as numts) not represented in the human reference genome18, especially when the specific numts regions were amplified in the cancer nuclear genome. These mutation candidates showed some specific features: (1) they appeared as highly recurrent mtDNA somatic mutations among multiple samples; (2) VAFs in mitochondria were only slightly higher than our 1% cutoff criterion; and (3) the matched normal samples also had small but substantial mutant allele counts. To remove these false positive calls, we applied two statistical tests of: (1) whether the VAF of a mutation candidate in the matched normal sequences was within the normal range (<0.0024; the cutoff is determined by the median VAF of all mutation candidates +2× the interquartile range); and (2) whether:
\[
\frac{N_{\mathrm{mut}_{\mathrm{nor}}}/\mathrm{RD}_{\mathrm{nor}}}{N_{\mathrm{mut}_{\mathrm{nor}}}/\mathrm{RD}_{\mathrm{nor}} + N_{\mathrm{mut}_{\mathrm{tum}}}/\mathrm{RD}_{\mathrm{tum}}}
\]
was within the normal range (<0.0357; the cutoff is determined by the median VAF of all mutation candidates +2× the interquartile range), where Nmut is the mutation allele count, RD is the average read depth for the nuclear genome, and nor and tum are normal and matched tumor tissues, respectively. When a mutation appeared to be an outlier according to both criteria, we removed the candidate from our downstream analyses.
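As a rough illustration of the two-criterion numts filter above, the following sketch computes the depth-normalized ratio and removes a candidate only when both criteria mark it as an outlier. The function and argument names are hypothetical; the two cutoffs are the values quoted in the text, and treating a value equal to the cutoff as an outlier is an assumption.

```python
# Cutoffs quoted in the text (median + 2x interquartile range of all candidates).
VAF_NORMAL_CUTOFF = 0.0024   # criterion (1): VAF in the matched normal
RATIO_CUTOFF = 0.0357        # criterion (2): depth-normalized normal fraction


def normal_fraction(n_mut_nor, rd_nor, n_mut_tum, rd_tum):
    """Share of depth-normalized mutant allele counts contributed by the normal sample.

    n_mut_*: mutant allele counts; rd_*: average nuclear read depths,
    for the normal (nor) and matched tumor (tum) samples.
    """
    nor = n_mut_nor / rd_nor
    tum = n_mut_tum / rd_tum
    total = nor + tum
    return nor / total if total > 0 else 0.0


def is_numts_artifact(vaf_normal, n_mut_nor, rd_nor, n_mut_tum, rd_tum):
    """Return True when a candidate is an outlier by both criteria."""
    outlier_vaf = vaf_normal >= VAF_NORMAL_CUTOFF
    outlier_ratio = normal_fraction(n_mut_nor, rd_nor, n_mut_tum, rd_tum) >= RATIO_CUTOFF
    return outlier_vaf and outlier_ratio
```

Requiring both criteria keeps genuine somatic mutations that show only a trace of mutant reads in the normal sample, while removing candidates whose signal is shared between tumor and normal in proportion to nuclear depth.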
In our previous study9, we could not detect mutations under a 3% VAF cutoff because mtDNA was sequenced with a read depth of ~100× from the majority of samples surveyed. Taking advantage of the ultra-high depth (>8,000×) in this study, we used a 1% VAF cutoff to obtain better sensitivity. We found 2,133 more substitutions when the VAF was between 1 and 3%. Because of the ultra-high depth, even 1% VAF mutations were considered to be specific, and were supported by a high number (n = ~80) of mutation alleles. We confirmed the high specificity of these mutations using the unique mtDNA mutational signatures robustly observed even from these low-VAF mutations: (1) the mutational spectrum is generally consistent with those from higher heteroplasmic levels of mutations (that is, VAFs from 3–10% and 10–100%); (2) we observed the absolute dominance of C>T and T>C substitutions in the expected trinucleotide contexts (NpCpG for C>T and NpTpC for T>C substitutions); and (3) we also observed extreme replication strand bias (Supplementary Fig. 3). These features would not be observed if contaminations resulted in many false positive calls. To assess the factors affecting the mutation frequency of the 13 coding genes, we performed the sample-level analysis using log-linear modeling: we assigned the binary mutation indicator (1: with mutation; 0: without mutation) to each sample for each gene and then fit this binary response variable to a logistic regression model, including cancer type, gene identity and their interaction as explanatory variables, which were later summarized using ANOVA. In addition, within each cancer type, we used Spearman’s rank correlation to assess the association between the numbers of nuclear and mtDNA somatic mutations, as well as their individual association with patient age.
Truncating mutation analysis
Taking into account the mtDNA-specific mutational signature, we examined the dN/dS ratio for mtDNA missense substitutions as reported previously9. We defined truncating mutations as those that lead to truncated protein products (that is, nonsense mutations and frameshift indels), and accordingly categorized the samples into the truncating group (bearing at least one truncating mutation with VAF ≥ 60%). The ND5 protein domain information was obtained from Pfam (http://pfam.xfam.org/protein/P03915). The cancer gene census list was obtained from http://cancer.sanger.ac.uk/cosmic/download. Cancer census genes with recurrent somatic mutations in kidney chromophobe and kidney papillary cancers were selected for analysis of mutual exclusivity and heat-map representation. One sample with a nuclear DNA hypermutator phenotype was excluded from this analysis. To examine the functional consequences of mtDNA truncating mutations, we performed GSEA based on the ranks of differentially expressed genes between samples with and samples without mtDNA truncating mutations for kidney chromophobe, kidney papillary, colorectal and thyroid cancers and their combination, and identified significantly enriched pathways at FDR = 0.05.
SMNT analysis
We examined the WGS data from the cancer and matched control tissue samples using a pipeline for the identification of mtDNA translocation to the nuclear genome, as reported previously11. The specificity was shown to be 100% in the previous study11. Briefly, we extracted and clustered discordant reads from cancer genomes, where one end aligned to nuclear DNA and the other aligned to mtDNA. Then, to determine the nucleotide resolution breakpoints, we searched for split reads near putative breakpoint junctions (1,000 base pairs upstream and downstream), where a fraction of a single read aligned to genomic DNA near the junctions and the rest aligned to mtDNA. All filtering criteria were the same as previously reported, except that we did not use BLAT51 for split-read detection because the BWA-MEM alignment tool used to map all pan-cancer samples fundamentally enables split-read mapping. We removed candidate mitochondrion–nuclear DNA junctions that overlapped with clusters from matched and unmatched normal samples and/or known human numts, a combined set from the human reference genome (hg19; n = 123) and a published study52 (n = 766), because the source of the mtDNA sequence fused to the nuclear genome might be numts rather than real mitochondria in the cytoplasm of cells. We obtained the structural variation calls from the PCAWG Structural Variation Working Group16 and compared the numbers of structural variations between samples with and without SMNTs by t-test. To study the relationship of SMNTs and structural variant breakpoints, we randomly chose the same number of structural variant breakpoints from each sample 100 times to estimate the random expectation.
MtDNA copy-number analysis
To better estimate the mtDNA copy number for cancer samples, we employed the following formula, which incorporates both tumor purity and ploidy information:
\[
\mathrm{CN}_{\mathrm{tumor}} = \frac{\mathrm{coverage\_depth}_{\mathrm{mtDNA}}}{\mathrm{coverage\_depth}_{\mathrm{gDNA}}} \times \left( f \times \mathrm{ploidy}_{\mathrm{cancer}} + (1 - f) \times 2 \right)
\]
where f is the tumor purity (ranging from 0 to 1, where 1 stands for pure cancer cells and 0 stands for pure normal cells), CN is the mtDNA copy number, coverage_depthmtDNA and coverage_depthgDNA are the mean coverage depths for mtDNA and the nuclear genome in individual WGS BAM files, respectively, and ploidycancer is the number of sets of chromosomes in tumor cells, while ploidy in the normal cells is 2. Both f and ploidycancer were obtained using allele-specific copy-number analysis of tumors estimation53, provided by the PCAWG Consortium. Donors with multiple samples were preselected so that each donor came with one representative primary cancer sample. We excluded cancer samples with low purity (<0.4, estimated by allele-specific copy-number analysis of tumors) for further downstream analyses. We used ANOVA (if there were more than two cancer types) or t-test to compare the mtDNA copy number of cancer types derived from the same tissue. Since many of the normal samples were from blood, we focused on the cancer types with at least ten samples from the normal tissue adjacent to the tumor in order to compare the mtDNA copy number of the paired cancer and normal samples. We used the Wilcoxon signed-rank test to compare the mtDNA copy number for each selected cancer type and further adjusted the raw P values based on the FDR. To assess the correlation of mtDNA copy number with truncating mutations, we employed ANOVA (with the cancer type included in the model, to account for its potential effect). We assessed the correlations of the mtDNA copy number with the patient’s age, overall survival time and cancer stage using Spearman’s rank correlation, Cox model/log-rank test and ANOVA, respectively. We log2-transformed the mtDNA copy-number values when using ANOVA and the t-test, to conform to the normality assumption.
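The copy-number formula above translates directly into a few lines of code. The sketch below is illustrative only: the function names and input validation are assumptions, while the formula and the purity cutoff (samples below 0.4 are excluded) follow the text.

```python
# Samples with purity below this value were excluded, per the text.
MIN_PURITY = 0.4


def mtdna_copy_number(depth_mt, depth_nuclear, purity, ploidy_cancer):
    """Estimate the mtDNA copy number from mean coverage depths.

    depth_mt / depth_nuclear: mean coverage of mtDNA and of the nuclear genome.
    purity: tumor purity f in [0, 1]; ploidy_cancer: ploidy of the tumor cells.
    Normal cells are assumed diploid, as in the text.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie between 0 and 1")
    if depth_nuclear <= 0:
        raise ValueError("nuclear coverage depth must be positive")
    # Average nuclear ploidy of the tumor/normal cell mixture.
    average_ploidy = purity * ploidy_cancer + (1.0 - purity) * 2.0
    return depth_mt / depth_nuclear * average_ploidy


def passes_purity_filter(purity):
    """Keep samples with purity of at least 0.4."""
    return purity >= MIN_PURITY
```

For example, with a hypothetical mtDNA depth of 8,000×, a nuclear depth of 40×, purity 0.5 and tumor ploidy 3, the estimate is 200 × 2.5 = 500 copies.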
mtDNA structural variation analysis
To investigate large deletions or duplications in the mtDNA genome, we sought the read-depth change of tumor mtDNA sequences using normal mtDNA sequences as a reference. To this end, we calculated the normalized depth of mtDNA loci in 100-base pair-sized bins from all of the normal samples. Then, we calculated the deviation of mtDNA read depth in each tumor sample. When the relative depth was sufficiently increased or decreased (z score > 3) in ten consecutive bins, we considered the region a structural variation candidate. For all of the candidates, we sought discordant paired-end reads, or breakpoint-spanning reads, which strongly support structural variations11.
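The read-depth scan described above can be sketched as a run-length search over per-bin z scores. This is a simplified illustration under stated assumptions: the function names are hypothetical, the z score is computed per bin against the spread across normal samples, and "z score > 3" is read as an absolute deviation in a consistent direction. The bin size (100 bp), threshold (3) and run length (ten bins) come from the text.

```python
from statistics import mean, stdev


def bin_z_scores(tumor_depth, normal_depths):
    """Per-bin z score of a tumor's normalized depth against the normal samples.

    tumor_depth: normalized depth per bin for one tumor.
    normal_depths: one normalized per-bin depth list for each normal sample.
    """
    scores = []
    for b, value in enumerate(tumor_depth):
        reference = [sample[b] for sample in normal_depths]
        mu, sd = mean(reference), stdev(reference)
        scores.append((value - mu) / sd if sd > 0 else 0.0)
    return scores


def candidate_regions(z, threshold=3.0, min_bins=10, bin_size=100):
    """Return (start_bp, end_bp, kind) for runs of deviating bins.

    A run needs at least min_bins consecutive bins that all exceed the
    threshold in the same direction (gain) or all fall below its negative (loss).
    """
    regions = []
    run_start, run_sign = None, 0
    for i, value in enumerate(list(z) + [0.0]):  # trailing sentinel closes an open run
        sign = 1 if value > threshold else -1 if value < -threshold else 0
        if sign != run_sign:
            if run_sign != 0 and i - run_start >= min_bins:
                kind = "gain" if run_sign > 0 else "loss"
                regions.append((run_start * bin_size, i * bin_size, kind))
            run_start, run_sign = i, sign
    return regions
```

Regions returned by such a scan are only candidates; as stated in the text, support from discordant or breakpoint-spanning reads is still needed before calling a structural variant.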
Co-expression analysis
For each cancer type, we used the WGCNA package41 to build a weighted gene co-expression network that contains ~20,000 nodes (including both nuclear genes and mitochondrial genes). The key parameter, β, for a weighted network construction was optimized to maintain both the scale-free topology and sufficient node connectivity, as recommended in the manual. In such a network, any two genes were connected and the edge weight was determined by the topology overlap measure provided in WGCNA. This measure considered not only the expression correlation between two partner genes, but also how many ‘friends’ the two genes shared. The weights ranged from 0 to 1, which reflected the strength of the connection between the two genes. To identify mitochondria-related pathways, we performed GSEA42 on the basis of the full set of nuclear protein-coding genes, ranked on the basis of the weights of the edge connecting the mitochondrial genes, and detected significant pathways at FDR = 0.05. To construct the mitochondria-centric network, we focused on the top 500 neighboring genes that showed the strongest connections with the mitochondrial genes, with a minimum weight of 0.05. Among these neighboring genes, we detected the clinically actionable genes (defined as FDA-approved therapeutic targets and their relevant predictive markers54) in at least one of the cancer types we surveyed. We examined the correlations of mtDNA gene expression levels with mtDNA copy numbers using Spearman’s rank correlations.
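To make the edge weight concrete, the sketch below computes an unsigned soft-threshold adjacency and the standard topological overlap measure in plain Python. It is a toy illustration of the general formula rather than the WGCNA package itself: the function names are hypothetical, the network is assumed to be unsigned, and β is supplied by the caller rather than optimized.

```python
def adjacency(corr, beta):
    """Unsigned soft-threshold adjacency a_ij = |cor_ij| ** beta, with zero diagonal."""
    n = len(corr)
    return [[0.0 if i == j else abs(corr[i][j]) ** beta for j in range(n)]
            for i in range(n)]


def topological_overlap(adj):
    """Topological overlap: (shared + a_ij) / (min(k_i, k_j) + 1 - a_ij).

    shared sums a_iu * a_uj over all other genes u (the shared 'friends'),
    and k_i is the connectivity of gene i (sum of its adjacencies).
    """
    n = len(adj)
    k = [sum(row) for row in adj]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u != i and u != j)
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom
```

Two genes with a modest direct correlation can still receive a high weight when they share many strongly connected neighbors, which is why the measure ranges from 0 to 1 and reflects more than the pairwise correlation alone.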
TCMA data portal construction
We stored the precalculated mtDNA molecular data (including mtDNA mutation, nuclear transfer, copy number and expression) in a CouchDB database. The Web interface was implemented in JavaScript, tables were rendered with DataTables, and the co-expression network visualization was implemented with Cytoscape Web.
Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41588-019-0557-x.
References
- EA Schon, S DiMauro, M Hirano. Human mitochondrial DNA: roles of inherited and somatic mutations. Nat. Rev. Genet., 2012. [DOI | PubMed]
- J Smeitink, L van den Heuvel, S DiMauro. The genetics and pathology of oxidative phosphorylation. Nat. Rev. Genet., 2001. [DOI | PubMed]
- S Anderson. Sequence and organization of the human mitochondrial genome. Nature, 1981. [DOI | PubMed]
- M Brandon, P Baldi, DC Wallace. Mitochondrial mutations in cancer. Oncogene, 2006. [DOI | PubMed]
- WX Zong, JD Rabinowitz, E White. Mitochondria and cancer. Mol. Cell, 2016. [DOI | PubMed]
- D Hanahan, RA Weinberg. Hallmarks of cancer: the next generation. Cell, 2011. [DOI | PubMed]
- MO Hengartner. The biochemistry of apoptosis. Nature, 2000. [DOI | PubMed]
- TC Larman. Spectrum of somatic mitochondrial mutations in five cancers. Proc. Natl Acad. Sci. USA, 2012. [DOI | PubMed]
- YS Ju. Origins and functional consequences of somatic mitochondrial DNA mutations in human cancer. eLife, 2014. [DOI | PubMed]
- JB Stewart. Simultaneous DNA and RNA mapping of somatic mitochondrial mutations across diverse human cancers. PLoS Genet., 2015. [DOI | PubMed]
- YS Ju. Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells. Genome Res., 2015. [DOI | PubMed]
- E Reznik. Mitochondrial DNA copy number variation across human cancers. eLife, 2016. [DOI | PubMed]
- JF Hopkins. Mitochondrial mutations drive prostate cancer aggression. Nat. Commun., 2017. [DOI | PubMed]
- The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature, 2020. doi:10.1038/s41586-020-1969-6
- The Cancer Genome Atlas Research Network et al. The Cancer Genome Atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120, 2013.
- Y Li et al. Patterns of somatic structural variation in human cancer genomes. Nature, 2020. doi:10.1038/s41586-019-1913-9
- E Rheinbay et al. Analyses of non-coding somatic drivers in 2,693 cancer whole genomes. Nature, 2020. doi:10.1038/s41586-020-1965-x
- G Dayama, SB Emery, JM Kidd, RE Mills. The genomic landscape of polymorphic human nuclear mitochondrial insertions. Nucleic Acids Res., 2014. [DOI | PubMed]
- M Costello. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation. Nucleic Acids Res., 2013. [DOI | PubMed]
- MS Lawrence. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature, 2013. [DOI | PubMed]
- SR Kennedy, JJ Salk, MW Schmitt, LA Loeb. Ultra-sensitive sequencing reveals an age-related increase in somatic mitochondrial mutations that are inconsistent with oxidative damage. PLoS Genet., 2013. [DOI | PubMed]
- M Tanaka, T Ozawa. Strand asymmetry in human mitochondrial DNA mutations. Genomics, 1994. [DOI | PubMed]
- W Zheng, K Khrapko, HA Coller, WG Thilly, WC Copeland. Origins of human mitochondrial point mutations as DNA polymerase γ-mediated errors. Mutat. Res., 2006. [DOI | PubMed]
- LB Alexandrov. Clock-like mutational processes in human somatic cells. Nat. Genet., 2015. [DOI | PubMed]
- HA Coller. High frequency of homoplasmic mitochondrial DNA mutations in human tumors can be explained without selection. Nat. Genet., 2001. [DOI | PubMed]
- Comprehensive molecular characterization of human colon and rectal cancer. Nature, 2012. [DOI | PubMed]
- The Cancer Genome Atlas Research Network et al. Integrated genomic characterization of endometrial carcinoma. Nature 497, 67–73, 2013.
- S Nik-Zainal. Mutational processes molding the genomes of 21 breast cancers. Cell, 2012. [DOI | PubMed]
- JW Pak, F Vang, C Johnson, D McKenzie, JM Aiken. MtDNA point mutations are associated with deletion mutations in aged rat. Exp. Gerontol., 2005. [DOI | PubMed]
- CF Davis. The somatic genomic landscape of chromophobe renal cell carcinoma. Cancer Cell, 2014. [DOI | PubMed]
- P Caro. Mitochondrial DNA sequences are present inside nuclear DNA in rat tissues and increase with age. Mitochondrion, 2010. [DOI | PubMed]
- D Chen, W Xue, J Xiang. The intra-nucleus integration of mitochondrial DNA (mtDNA) in cervical mucosa cells and its relation with c-myc expression. J. Exp. Clin. Cancer Res., 2008. [DOI | PubMed]
- V Srinivasainagendra. Migration of mitochondrial DNA in the nuclear genome of colorectal adenocarcinoma. Genome Med., 2017. [DOI | PubMed]
- H Cui. Association of decreased mitochondrial DNA content with the progression of colorectal cancer. BMC Cancer, 2013. [DOI | PubMed]
- A Dickinson. The regulation of mitochondrial DNA copy number in glioblastoma cells. Cell Death Differ., 2013. [DOI | PubMed]
- FH Van Osch. Mitochondrial DNA copy number in colorectal cancer: between tissue comparisons, clinicopathological characteristics and survival. Carcinogenesis, 2015. [PubMed]
- HM McBride, M Neuspiel, S Wasiak. Mitochondria: more than just a powerhouse. Curr. Biol., 2006. [DOI | PubMed]
- S Vyas, E Zaganjor, MC Haigis. Mitochondria and cancer. Cell, 2016. [DOI | PubMed]
- MT Lott. mtDNA variation and analysis using mitomap and mitomaster. Curr. Protoc. Bioinformatics, 2013. [DOI | PubMed]
- TR Mercer. The human mitochondrial transcriptome. Cell, 2011. [DOI | PubMed]
- P Langfelder, S Horvath. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics, 2008. [DOI | PubMed]
- A Subramanian. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA, 2005. [DOI | PubMed]
- CA Koczor. Mitochondrial DNA damage initiates a cell cycle arrest by a Chk2-associated mechanism in mammalian cells. J. Biol. Chem., 2009. [DOI | PubMed]
- D Ojala, J Montoya, G Attardi. tRNA punctuation model of RNA processing in human mitochondria. Nature, 1981. [DOI | PubMed]
- S Joshi. The genomic landscape of renal oncocytoma identifies a metabolic barrier to tumorigenesis. Cell Rep., 2015. [DOI | PubMed]
- G Gasparre, G Romeo, M Rugolo, AM Porcelli. Learning from oncocytic tumors: why choose inefficient mitochondria?. Biochim. Biophys. Acta, 2011. [DOI | PubMed]
- DA Clayton, JN Doda, EC Friedberg. The absence of a pyrimidine dimer repair mechanism in mammalian mitochondria. Proc. Natl Acad. Sci. USA, 1974. [DOI | PubMed]
- NJ Haradhvala. Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair. Cell, 2016. [DOI | PubMed]
- AS Bess, TL Crocker, IT Ryde, JN Meyer. Mitochondrial dynamics and autophagy aid in removal of persistent mitochondrial DNA damage in Caenorhabditis elegans. Nucleic Acids Res., 2012. [DOI | PubMed]
- DC Koboldt. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res., 2012. [DOI | PubMed]
- WJ Kent. BLAT—the BLAST-like alignment tool. Genome Res., 2002. [PubMed]
- D Simone, FM Calabrese, M Lang, G Gasparre, M Attimonelli. The reference human nuclear mitochondrial sequences compilation validated and implemented on the UCSC genome browser. BMC Genomics, 2011. [DOI | PubMed]
- P Van Loo. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA, 2010. [DOI | PubMed]
- EM Van Allen. Whole-exome sequencing and clinical interpretation of formalin-fixed, paraffin-embedded tumor samples to guide precision cancer medicine. Nat. Med., 2014. [DOI | PubMed]
