It’s an issue that attracts debate because the overlapping, gray areas are large enough to make a definitive answer difficult across a range of circumstances.
“I had a professor in graduate school who put it this way: If you have the genetic variant for Huntington’s disease, you will get Huntington’s disease,” said W. Richard McCombie, a professor and director of the Stanley Institute for Cognitive Genomics at Cold Spring Harbor Laboratory. “If you walk in front of a truck that’s going 70 miles per hour on an interstate, your genes are irrelevant. Everything else is in between.”
Indeed, McCombie and his lab have become expert genetic speed readers of a sort, sequencing genes in numbers that were almost unthinkable just a decade or so ago.
“Next-generation sequencing has dramatically changed the field of genomics, allowing researchers to access an unprecedented amount of data,” he said. “The challenge lies in the analysis of these large data sets.”
The sequences he describes are combinations of the four bases, adenine, guanine, cytosine and thymine, paired and strung together in a double-helix ladder design.
The implications of these new genetic sequences and libraries range from generating personalized medicine and understanding the prognosis for different diseases and likelihoods of effective therapy to seeking ways to enhance the production of food and energy crops.
The basic question he’s asking is “what’s the correlation between the structure and function of a living organism, in terms of the genome?”
From a practical standpoint, working in different systems helps when McCombie is applying for funding, he suggested.
The technology and expertise he develops also have applications across systems. When he gets funding to explore the sequence of large plant genomes, he can then use what he learns from that to work on studying cancer.
McCombie’s contributions have spanned several areas, including developing next-generation sequencing, contributing to plant genome sequencing and studying the genetic basis of cognitive disorders, said Greg Hannon, the Royal Society Wolfson Research Professor at the Cancer Research UK Cambridge Institute at the University of Cambridge, who has co-authored 17 papers with McCombie.
“He has made tremendous impacts across multiple fields,” Hannon said.
McCombie is “a real hero of the lab,” and Hannon said he “can’t think of anyone else who has had the diversity of impact he has.”
Sequencing in general has involved instruments that look at small bits of data at a time, fragments of around 100 bases. Using something called long-read technology, researchers can now examine pieces that are around 10,000 base pairs long.
This technology is “really coming along” and has implications for cancer, where tumors are often due to rearrangements, insertions or deletions, while it also might impact plant genomics, where the long-read technology can be 100 to 1,000 times as effective as the short-read technology, McCombie said.
Sequencing pieces of genes is like taking a picture of, say, the Grand Canyon and turning that into a jigsaw puzzle. In the short-read technology, the pieces are smaller and, in some cases, show some of the same features. In the long-read technology, the pieces are much larger, turning the picture into something closer to a small child’s puzzle.
The long reads have a lower raw accuracy, he said, but with enough coverage, scientists can achieve a high consensus accuracy because the errors are mostly random.
The long-read technology is like having a puzzle with four pieces, instead of 1,000, he said.
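The point about coverage and consensus accuracy can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not from McCombie: errors are modeled as random substitutions (real long reads also contain insertions and deletions), the reads are already aligned and of equal length, and the 15 percent error rate and 30-fold coverage are illustrative numbers.

```python
import random
from collections import Counter

BASES = "ACGT"

def noisy_read(truth, error_rate, rng):
    """Copy `truth`, swapping each base for a different random base
    with probability `error_rate` (substitution errors only: an assumption)."""
    out = []
    for base in truth:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def consensus(reads):
    """Majority vote at each position across equal-length, pre-aligned reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

def accuracy(seq, truth):
    """Fraction of positions where `seq` matches `truth`."""
    return sum(a == b for a, b in zip(seq, truth)) / len(truth)

rng = random.Random(0)  # fixed seed so the run is repeatable
truth = "".join(rng.choice(BASES) for _ in range(2000))

# Hypothetical numbers: 30 reads covering the same region, 15% raw error.
reads = [noisy_read(truth, 0.15, rng) for _ in range(30)]

raw_accuracy = accuracy(reads[0], truth)
consensus_accuracy = accuracy(consensus(reads), truth)
print(f"single read: {raw_accuracy:.3f}  consensus: {consensus_accuracy:.3f}")
```

Because each read is wrong in different places, the correct base wins the vote at nearly every position. If the errors were systematic rather than random, the same wrong base would keep recurring and adding coverage would not help.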
Comparing genes or looking for a smoking-gun set of causative genes involved in disease is difficult, especially when comparing the genes of an individual with a representative healthy set of genes.
“Searching for causative genes can be very challenging particularly in complex diseases where more than one gene (and often many genes) contributes to the disease,” McCombie explained. “Trying to pinpoint causative variants is complicated by the normal background variation.”
Indeed, it’s more productive and instructive to look at larger sample sizes of people or to examine trios — the genes of parents unaffected by a genetic disease and their affected child.
Using these trios, McCombie and other scientists have found some overlap in potentially causative genes across disorders from schizophrenia and bipolar disorder to autism and intellectual impairment. McCombie is currently exploring multiple sets of genes in cases of depression.
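One common way trios cut through normal background variation, which the article does not spell out, is to set aside every variant the child shares with either unaffected parent and keep only what is new in the child. The sketch below shows that filter; the variant tuples and the function name are hypothetical, and a real analysis would also have to account for sequencing errors and for inherited recessive variants.

```python
def de_novo_candidates(child, mother, father):
    """Return variants seen in the child but in neither parent.

    Each variant is a (chromosome, position, reference, alternate) tuple.
    """
    return sorted(child - mother - father)

# Made-up variant calls, for illustration only.
mother = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")}
father = {("chr1", 1000, "A", "G"), ("chr3", 42, "G", "A")}
child = {
    ("chr1", 1000, "A", "G"),  # inherited from both parents
    ("chr3", 42, "G", "A"),    # inherited from the father
    ("chr7", 9001, "T", "C"),  # absent in both parents
}

candidates = de_novo_candidates(child, mother, father)
print(candidates)
```

Only the chr7 variant survives the filter in this toy example, which is how a trio narrows thousands of differences to a short list worth examining.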
McCombie and his wife Janice, a computer technician who works in Manhattan, live in Port Washington, which, he says, is convenient to the many operas they enjoy.
Given the flood of information available through all the genetic data that comes out daily, McCombie said scientists entering this field have to have some skill and understanding of bioinformatics, which makes sense of vast amounts of data.
“I give a short talk to the first-year grad students on their research every year,” he said. “One of them asked me if I thought bioinformatics was important in biology research. To be realistic, people in [the next generation] have no future if [they’re] not adept at working on computers and don’t understand bioinformatics.”