Identifying the Clues Hidden in our Genomes

Our genomes harbor a trove of clues that point toward novel therapies that can alleviate human disease

Dr. Benjamin Voight witnessed some of his close family members struggle with health complications at an early age, and he became dedicated to translational research that could one day better the human condition. He pursues this goal by studying both the evolutionary history of humans, using genomics data, and the genetic risk that segregates in the genomes of human populations. Both veins of research improve the ability to map complex disease genes, which creates the possibility of new therapies, and they also teach us about the complex history and biological forces that drove the ascent of modern humans as a species. These narratives are intrinsically interesting, and they also help identify why and how human populations are distinctive. Dr. Voight also studies biological oddities in nature beyond humans: in particular, the silk of spiders, one of the strongest biomaterials known. Such effort informs general biology and could lead to novel material production with potentially revolutionary biomedical and industrial applications. Dr. Voight works with a highly interdisciplinary team composed of professionals in fields from molecular biology and genetics to statistics and computer science who take an unflinching approach to large-scale genomics projects. With translation as the key focus of his lab, the findings that Dr. Voight and his team gather are well positioned to make a tangible impact on medicine and ultimately human health.

The growing catalog of human genetic loci that contribute susceptibility to complex disease offers great potential to improve human health, either by direct clinical action or through therapeutic development. Dr. Benjamin Voight, Assistant Professor in the Department of Systems Pharmacology and Translational Therapeutics and the Department of Genetics at the University of Pennsylvania, is dedicated to determining the locations in the human genome responsible for susceptibility to coronary heart disease and type 2 diabetes. But activating the translational potential of this research requires addressing several key questions. How many genes and alleles underlie complex disease, and what are their effects and frequencies? For a given susceptibility locus, what are the causal genes and functional alleles that modify risk of disease, and what are the genetic mechanisms? What pathways or gene networks do these genes modulate? In what direction (overexpression or knockdown) do causal genes or networks need to be perturbed in order to obtain therapeutic benefits? How can current genetics data for causal and predictive biomarkers further improve clinical risk prediction and patient stratification?

To address these questions, Dr. Voight's lab builds computational tools grounded in the principles of population biology and statistics to then analyze data from tens of thousands of human genomes. Understanding the genetic, biological, and evolutionary basis of metabolic and cardiovascular phenotypes in human populations will ultimately lead to the development of new clinical practices, whether through new therapeutics or in the development of computational models that help explain how a disease functions.

Current research efforts within his lab focus on the following areas:

  • Inference of modifiable risk factors that cause disease. Using an analytical methodology based on genetic data, Dr. Voight is efficiently distinguishing clinically measurable biomarkers that cause disease from biomarkers that are merely correlated with a disease. For example, low-density lipoprotein (LDL) cholesterol, otherwise known as "bad" cholesterol, has been identified as a causative agent for heart disease and heart attack, as it accumulates as plaques in patients’ coronary arteries. LDL cholesterol levels are heritable: they reflect genetic variants inherited from one's parents in addition to diet. Some individuals have exceedingly high LDL cholesterol for genetic reasons, which causes cholesterol nodules to accumulate in the knees and arms and eventually leads to premature death. Conversely, some individuals carry "lucky variants," genetic factors that naturally lower cholesterol levels. Thus, they are actually protected against heart disease. Beyond LDL cholesterol, there are numerous biomarkers that are measurable in serum, but identifying which biomarkers cause a given disease (heart disease, diabetes, even migraine headache) is critical for the efficient development of effective therapies for these conditions. Using genetic data, Dr. Voight's broad objective is to identify particular genetic variants that influence these measurable biomarkers, and to determine whether “lucky individuals” carry genetic variants that actually protect them against a disease endpoint. Identifying such factors opens new opportunities for drug discovery and development.

  • Discovery and characterization of modern evolution in humans. Are humans still evolving? Dr. Voight’s graduate work demonstrated that the human genome has continued to evolve as humans adapted to their local environments over the last ~120 thousand years. For example, Northern Europeans are unusual in their ability to digest lactose, the sugar in milk, into adulthood, whereas many Asian and African populations are lactose intolerant in adulthood. Pastoralism, the practice of raising cattle for milk, was an agricultural development utilized by inhabitants of Northern European regions, and could easily be hypothesized as an advantageous practice for our human ancestors. The preponderance of evidence indicates that there are hundreds of hidden stories encoded in the human genome: from resistance to pathogens and adaptation to xenobiotics based on regional diets, to increased tolerance of UV exposure. Although the locations where adaptation is likely to have occurred are known, the mechanism, phenotype, and historical context in which Nature has acted are, unlike in the story of pastoralism above, not known for the vast majority of these locations. Dr. Voight's effort in this area is to use genetics and computational models to understand the evolutionary history of these sites in greater detail. He has developed approaches to understand the natural history of genetic variants, to elucidate the story hiding at each location that has recently adapted. Quantifying these stories will reveal the forces that drove our species to evolve recently, and will also uncover why susceptibility to disease differs so dramatically across diverse human populations. In addition, differences in genetic variation and historical adaptation to local environments may also explain why certain therapeutics used to treat disease vary in their efficacy, or why adverse outcomes are prevalent (and predictable) in particular human groups.

Dr. Voight is approaching this research by comparing the genomes of different people spread across different geographic areas. The relatively low expense needed to characterize individual genomes has led to the cataloging of thousands of diverse genomes from individuals around the world. Dr. Voight's efforts aim to design and apply novel statistical and computational methodology, and an intuitive knowledge of population biology, to address questions of causative players that will tell the narrative hidden in human evolution.

  • Spider Genomics (Order Araneae). The scientific community has complete genomes sequenced for dozens of mammals and insects, including fruit flies and mosquitos, and even for ticks. Shockingly, until very recently little was known about spider genome sequences at a high-resolution molecular level, despite spiders' deep mythology, historical interest, unusual sex-dimorphic trait distributions, and production of unusual macromolecules: venom and silk. Upon realizing this gap, Dr. Voight combined forces with Drs. Linden Higgins and Ingi Agnarsson, and together they are working to generate genome data on two species of the Order Araneae: Nephila clavipes (a species of Golden orb-web spider) and Caerostris darwini (Darwin's bark spider), spiders whose silk is among the strongest biomaterials known. How have spider genomes and anatomy evolved together to produce and weave the silks that they do? The group is poised to describe the first complete catalog of silk proteins used by these orb-weaving spiders, and will connect the structure of such proteins to biological function. By coupling high-throughput DNA sequencing, to obtain complete genome sequences, with RNA sequencing, to obtain the coding sequences that produce proteins, the group will be able to study the genetic basis of silk strength, an achievable starting point for the arachnid research community. A long-term objective would be to create novel silk proteins in vitro, based on knowledge gained by studying the genomes of these spiders, which could have revolutionary applications in biomedicine and industry.
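As an illustration of the first research area above (separating biomarkers that cause disease from those that are merely correlated with it), the following is a minimal sketch of Mendelian randomization, the technique named in the news coverage and publications listed in this profile. It is not code from Dr. Voight's lab. The function names and the per-variant numbers are hypothetical and chosen only to show the arithmetic; the two estimators shown (the Wald ratio and the fixed-effect inverse-variance weighted estimate) are standard textbook forms, and the standard error uses a first-order approximation that ignores uncertainty in the variant-biomarker effect.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal effect estimate from a single genetic variant.

    beta_exposure: effect of the variant on the biomarker (e.g. LDL cholesterol)
    beta_outcome:  effect of the same variant on disease risk
    se_outcome:    standard error of beta_outcome
    """
    estimate = beta_outcome / beta_exposure
    # First-order approximation: treats beta_exposure as known without error.
    std_error = se_outcome / abs(beta_exposure)
    return estimate, std_error

def ivw_estimate(variants):
    """Fixed-effect inverse-variance weighted estimate across variants.

    variants: iterable of (beta_exposure, beta_outcome, se_outcome) tuples.
    """
    variants = list(variants)
    denominator = sum(bx ** 2 / se ** 2 for bx, _, se in variants)
    numerator = sum(bx * by / se ** 2 for bx, by, se in variants)
    return numerator / denominator, math.sqrt(1.0 / denominator)

# Hypothetical summary statistics, invented for illustration only.
example_variants = [
    (0.10, 0.050, 0.01),
    (0.20, 0.100, 0.02),
    (0.05, 0.025, 0.01),
]

single_estimate, single_se = wald_ratio(*example_variants[0])
pooled_estimate, pooled_se = ivw_estimate(example_variants)
print(f"Wald ratio (first variant): {single_estimate:.3f} (SE {single_se:.3f})")
print(f"IVW estimate (all variants): {pooled_estimate:.3f} (SE {pooled_se:.3f})")
```

The intuition is that a variant inherited at conception acts like a natural experiment: if variants that raise a biomarker also raise disease risk in proportion, the biomarker is more plausibly causal. If they do not, the biomarker may only be a correlate.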


The subject of Dr. Voight's research is very personal to him, due in large part to illnesses in his family's history. His father, who served as a Major in the United States Air Force, suffered a heart attack in his late 30s, very early for such an event, which effectively ended his career as an Air Force pilot. At the age of six, Dr. Voight was a bone-marrow donor for his sister in an attempt to cure her disease; she ultimately passed away from childhood leukemia. These events were difficult to process, and perhaps expectedly, he turned to science fiction as an outlet. Drawing heavily on these experiences, at the age of 13 he decided that he was going to pursue a career in research, specifically biology, and focus on the applications of gene therapy. At such a young age, he did not appreciate the degree of difficulty of this research, but at the time it seemed a viable opportunity that could have prevented the untimely fates of his loved ones. He adopted a sincere desire to improve the human condition, a sentiment that has persisted even as his career developed along a slightly different trajectory.

In high school, one of the most interesting and thought-provoking experiences Dr. Voight can recall occurred during his AP Biology class, where he essentially followed in the footsteps of Alfred Sturtevant, the American geneticist who constructed the first genetic map of a chromosome, by performing a genetic cross with a model organism (the fruit fly) to identify the ‘order’ in which segregating traits (and their mutations) occur along a chromosome, via linkage analysis. It was at this moment that he realized there were ways to study biology that involved mathematics, one of his perceived strengths. As an undergraduate, he was fortunate enough to participate in directed research following a transfer to the University of Washington. His mentor, Dr. Maynard Olson, was trained in both computational and wet-lab experimental sciences. Dr. Olson took note of Dr. Voight's track record in mathematics and biology, but not in computer programming, and suggested that Dr. Voight take a class or two on the subject at the end of his senior year before departing for graduate school. He had never seriously programmed up until that point, but followed his mentor’s advice, took to the discipline, and quickly identified it as another of his core strengths. Thus, Dr. Voight found his niche applying mathematics and computation to biology.

While attending graduate school at the University of Chicago, Dr. Voight focused his research on complex trait models as well as evolutionary studies in humans. During that time, his father passed away at quite an early age due to neurological and vascular complications that Dr. Voight believes stemmed from his father's cardiovascular disease. This event led him to add to his research focus efforts to understand human disease, particularly the cardiometabolic diseases of type 2 diabetes and coronary heart disease, both of which are medically relevant and place a substantial burden on our health care system.

To his amazement, the collection of skills and aptitudes he acquired is unusual and highly desirable in today's era of "Big Data." In the early days of his own independent laboratory, Dr. Voight continues to maintain a diversity of interests, encouraged by his fascination with problems in biology and medicine that can be addressed through interdisciplinary efforts: using biological knowledge and instincts to identify key questions in medicine, developing mathematical and statistical models to address them, and evaluating those models in “Big Data” using advanced computational techniques. This keeps him constantly thinking about science, planning ahead and focusing on important clinical and evolutionary genetics questions. As a mentor, this is the same line of thinking he hopes to impart to the students who come into contact with his research group. Rather gratifyingly, Dr. Voight's lab has become a popular hub for M.D./Ph.D. students who have recently defended their doctoral theses and wish to develop some computational skills before returning to medical school to complete their M.D. degrees.

As a self-proclaimed "tech geek," Dr. Voight is fascinated by all sorts of trends in technology. He has been building computers since he was an undergraduate. To get away from technology, Dr. Voight works with his wife, in a pioneering “Do it Yourself” spirit, to renovate their late 19th-century Victorian home. He makes it a point in his life to remain open-minded and scientifically minded so that he is fully ready to approach any challenge as it presents itself.

In the News

"Still Evolving, Human Genes Tell New Story"

Front Page Article of the New York Times Featured Story on Natural Selection in Humans

"Doubt cast on 'Good' in 'Good Cholesterol'"

Front Page Article of the New York Times Featured Story on Mendelian Randomization


A map of recent positive selection in the human genome


Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis


Plasma HDL cholesterol and risk of myocardial infarction: a mendelian randomisation study


MR_predictor: a simulation engine for Mendelian Randomization studies.



New York Times Front Page News article, “Doubt cast on ‘Good’ in ‘Good Cholesterol’,” 2012

Selected Alfred P. Sloan Research Fellow, 2012

Semi-finalist, Trainee Research Award, 59th Meeting of the American Society of Human Genetics, 2009

Team Award for Outstanding Research, Clinical Research Day, Massachusetts General Hospital, 2007

Best [Ph.D. Dissertation] in the Biological Sciences Division, University of Chicago

New York Times Front Page News article, “Still evolving, human genes tell new story,” 2006