Genomic Data Leads to Better Understanding of the Genomic Progression of Breast Cancer

Computational and machine learning methods sift through vast amounts of data to make more accurate predictions

A transformative recent development in biology has been the systematic collection of massive datasets. These datasets capture the states of, and physical interactions between, molecules such as DNA, RNA, and proteins across hundreds of human tissues. Increasingly, the challenge in biology has therefore become computational: scientists must find ways to make the best use of this data to understand human biology and disease. Dr. Serafim Batzoglou, of Stanford University, develops computational and machine learning methods that study biological and genomic data in order to make the most accurate predictions. Dr. Batzoglou hopes to understand the progression of cancer, and specifically breast cancer, at the genomic level. His research has important implications for the diagnosis of cancer and ultimately could aid providers and patients in deciding on treatment plans. In addition, his work in population genomics makes it possible to analyze massive numbers of human genomes more accurately and rapidly than previous approaches allowed.

Dr. Batzoglou's research may one day allow providers to distinguish early tumors that become cancerous from early tumors that seem to be unrelated to the patient's cancer. Such a finding would be an important diagnostic for women with early breast tumors who are facing a choice of whether to undergo mastectomy. Moreover, understanding this early progression is important for therapy. The early stage, where a few key mutations in the DNA cause the initial cancer phenotype, is the best place for drugs to target, for two reasons: (1) every single cancer cell in a patient's tissue carries these early mutations, because the cells that first acquired them are the ancestors of all later cancer cells; and (2) early treatment should in principle be much more likely to succeed and kill all the cancer cells, while later treatments face multiple variants of cancer cells, some of which may be killed by a treatment while others are left to proliferate.
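The point that early mutations are carried by every cancer cell can be illustrated with a minimal sketch. It assumes that several samples of one patient's tumor have been sequenced and that each sample has been reduced to a set of mutation labels; the sample names, the labels, and the function name split_truncal are hypothetical, and this is an illustration of the idea rather than the method used in Dr. Batzoglou's research.

```python
def split_truncal(samples):
    """Split mutations into truncal (found in every sample) and
    subclonal (found in only some samples).

    samples: dict mapping a sample name to a set of mutation labels.
    Assumes at least one sample is given.
    """
    sets = [set(mutations) for mutations in samples.values()]
    truncal = set.intersection(*sets)        # shared by all samples
    subclonal = set.union(*sets) - truncal   # arose after the early stage
    return truncal, subclonal


# Hypothetical, made-up mutation labels for three samples of one tumor.
samples = {
    "early_tumor": {"m1", "m2"},
    "invasive": {"m1", "m2", "m3"},
    "metastasis": {"m1", "m2", "m3", "m4"},
}

truncal, subclonal = split_truncal(samples)
print(sorted(truncal))    # ['m1', 'm2']
print(sorted(subclonal))  # ['m3', 'm4']
```

In this toy example, a drug aimed at m1 or m2 would in principle reach every sampled cell population, while one aimed at m4 would reach only the metastasis.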

Current research includes:

  • Cancer Genomics: In cancer genomics, Dr. Batzoglou and his team focus on understanding the genomic evolution of cancer. Cancer is a disease of the genome: as cells divide, their genomes accumulate mutations, and an unlucky combination of mutations causes a cell to proliferate beyond control. Cells that proliferate acquire more mutations, which can further increase their ability to grow. This is an evolutionary process. To study it, Dr. Batzoglou and his collaborators collect multiple samples of a tumor, and of normal tissue, from patients who have undergone an operation. The researchers then reconstruct the cancer genome in each sample and develop computational methods that connect the samples from each patient in an evolutionary tree. This tree summarizes how a tumor evolved within each patient's tissues: how the normal cells first became tumorous, and how the early tumors became cancer and then invasive and metastatic cancer. Understanding which mutations cause each transition is important for diagnosis and treatment. Thus far, Dr. Batzoglou has applied these techniques to a study of six breast cancer patients together with his collaborators Rob West and Arend Sidow from Stanford Pathology, with some fascinating early results.

  • Population Genomics: Dr. Batzoglou develops computational methods that rapidly analyze massive numbers of human genomes to (1) predict all the variation within the genomes, and (2) find all the segments that are shared across individuals through familial descent. The scientific community has moved from a few available human genomes a few years ago to about 100,000 in 2015, and expects millions in a few years. The methodologies that Dr. Batzoglou develops are necessary for analyzing such volumes of data.
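The idea of finding segments shared between individuals can be sketched in a few lines. The example below is a deliberately simplified illustration, not Dr. Batzoglou's methodology: it assumes each individual is represented by a single string of alleles ('0' or '1') at the same ordered markers, and it reports runs of identical alleles as candidate shared segments. Methods built for real genomes must also handle genotyping errors, phasing, and chance matches. The function name shared_segments, the min_markers threshold, and the allele strings are hypothetical.

```python
def shared_segments(hap_a, hap_b, min_markers=3):
    """Return (start, end) index pairs, end exclusive, for each maximal run
    of markers where the two haplotypes carry identical alleles and the run
    spans at least min_markers markers."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same markers")
    segments = []
    start = None  # index where the current matching run began, if any
    for i, (a, b) in enumerate(zip(hap_a, hap_b)):
        if a == b:
            if start is None:
                start = i
        else:
            # A mismatch closes the current run; keep it if long enough.
            if start is not None and i - start >= min_markers:
                segments.append((start, i))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(hap_a) - start >= min_markers:
        segments.append((start, len(hap_a)))
    return segments


# Made-up allele strings for two individuals at ten markers.
hap_a = "0110100111"
hap_b = "0110011110"
print(shared_segments(hap_a, hap_b))  # [(0, 4)]
```

This scan is linear in the number of markers for one pair of individuals. Comparing every pair among millions of genomes this way would be quadratic in the number of individuals, which is one reason faster methods are needed at that scale.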


Serafim Batzoglou received his B.S. in Mathematics, B.S. in Computer Science, and M.Eng. in Electrical Engineering and Computer Science from MIT in 1996. He received his Ph.D. in Computer Science from MIT in 2000, under the supervision of Professor Bonnie Berger and co-supervision of Professor Eric Lander. He was a research scientist at MIT's Whitehead Institute in 2000-2001 and joined Stanford University's Department of Computer Science in 2001, where he was an Assistant Professor until 2008 and has been an Associate Professor since then. He is a recipient of the Alfred P. Sloan Fellowship and the NSF CAREER Award, and was named one of the top 100 Young Technology Innovators by MIT's Technology Review Magazine in 2003. In addition to his academic career, Dr. Batzoglou is a co-founder and scientific advisor at DNAnexus, and a member of the scientific advisory boards of 23andMe and Eve Biomedical.

As a young child, growing up in Greece, Dr. Batzoglou enjoyed watching Carl Sagan's Cosmos series on TV. He was fascinated by the discoveries about the universe and knew he wanted to be a part of it. As he continued his studies, he found that he especially enjoyed math. When he started college, he wanted to study either physics or artificial intelligence, in order to understand either the universe or the human mind. As a senior, he took a class in computational biology and found that computational genomics was the perfect place to continue his professional career. Dr. Batzoglou now applies math, computer science, and artificial intelligence to the study of genomic and biological data. In his free time, Dr. Batzoglou enjoys spending time with friends, traveling, and watching movies.



Alfred P. Sloan Fellowship, 2004

NSF CAREER Award, 2004

MIT Technology Review Magazine, 100 Top Young Technology Innovators, 2003