Peter Koo Photo from CSHL

By Daniel Dunaief

The goal sounds like a dystopian version of a future in which computers make critical decisions that may or may not help humanity.

Peter Koo, Assistant Professor and Cancer Center Member at Cold Spring Harbor Laboratory, would like to learn how to design neural networks so they are more interpretable, which will help build trust in the networks.

The neural networks he’s describing are artificial intelligence programs designed to link a molecular function to DNA sequences, which can then inform how mutations to the DNA sequences alter the molecular function. This can help “propose a potential mechanism that plays a causal role” for a mutation in a given disease, he explained in an email.

Researchers have created numerous programs that learn a range of tasks. In computer vision, for instance, scientists have developed neural networks that perform object recognition, such as differentiating between a wolf and a dog.

Koo when he received a COVID vaccination.

With the pictures, people can double check the accuracy of these programs by comparing the program’s results to their own observations about different objects they see.

While the artificial intelligence might get most or even all of the head-to-head comparisons between dogs and wolves correct, the program might arrive at the right answer for the wrong reason. The pictures of wolves, for example, might have all been taken during the winter, with snow in the background. The photos of dogs, on the other hand, might have cues that include green grass.

The neural network program can arrive at the right answer for the wrong reason if it is focused on snow and grass rather than on the features of the animal in a picture.
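The snow-versus-wolf problem can be sketched in a few lines of Python. This is a toy illustration only: the photos, the two yes/no features (`wolf_anatomy` and `snow`) and the deliberately naive "learner" are all invented for the example and are not drawn from Koo's work.

```python
# Toy illustration of shortcut learning. All data and feature names are invented.
# Each photo is reduced to two yes/no features plus a label (1 = wolf, 0 = dog).

def accuracy(examples, feature):
    """Fraction of examples where the feature value alone equals the label."""
    return sum(ex[feature] == ex["label"] for ex in examples) / len(examples)

def pick_best_feature(examples, features):
    """A deliberately naive 'learner': keep whichever single feature scores best."""
    return max(features, key=lambda f: accuracy(examples, f))

FEATURES = ["wolf_anatomy", "snow"]

# Training photos: every wolf is in snow, every dog is on grass.
# The anatomy feature is slightly noisy (one wolf is missed).
train = (
    [{"wolf_anatomy": 1, "snow": 1, "label": 1}] * 3
    + [{"wolf_anatomy": 0, "snow": 1, "label": 1}]
    + [{"wolf_anatomy": 0, "snow": 0, "label": 0}] * 4
)

# New photos: wolves photographed in summer, dogs photographed in winter.
new_photos = (
    [{"wolf_anatomy": 1, "snow": 0, "label": 1}] * 2
    + [{"wolf_anatomy": 0, "snow": 1, "label": 0}] * 2
)

chosen = pick_best_feature(train, FEATURES)
print("feature chosen on training photos:", chosen)
print("training accuracy:", accuracy(train, chosen))
print("accuracy on new photos:", accuracy(new_photos, chosen))
print("anatomy accuracy on new photos:", accuracy(new_photos, "wolf_anatomy"))
```

Because the background separates the training photos perfectly, the naive learner prefers it to the animal's anatomy, and its accuracy collapses once the seasons are swapped.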

Extending this example to the world of disease, researchers would like computer programs to process information at a pace far quicker than the human brain as it looks for mutations or genetic variability that suggests a predisposition for a disease.

The problem is that the programs are learning in the same way as their programmers, developing an understanding of patterns based on so-called black box thinking. Even when people have designed the programs, they don’t necessarily know how the machine learned to emphasize one alteration over another, which might mean that the machine is focused on the snow instead of the wolf.

Koo, however, would like to understand the artificial intelligence processes that lead to these conclusions.

In research presented in the journal Nature Machine Intelligence, Koo provides a way to access one level of information learned by the network, particularly DNA patterns called motifs, which are sites associated with proteins. It also makes the current tools that look inside black boxes more reliable.

“My research shows that just because the model’s predictions are good doesn’t mean that you should trust the network,” Koo said. “When you start adding mutations, it can give you wildly different results, even though its predictions were good on some benchmark test set.”

Indeed, a performance benchmark is usually how scientists evaluate networks. Some of the data is held out so the network never sees it during training. This allows researchers to evaluate how well the network can generalize to data it’s never seen before.
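A minimal sketch of the held-out idea, under invented assumptions: the data are all 256 four-letter DNA sequences, the label is a made-up GC-content rule, and the helper names (`split_held_out`, `memorizer`) are hypothetical. It contrasts a model that memorizes its training data with one that has learned the underlying rule.

```python
import itertools
import random

def split_held_out(examples, held_out_fraction=0.2, seed=0):
    """Shuffle a copy of the data and set aside a fraction the model never trains on."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_held_out = int(len(shuffled) * held_out_fraction)
    return shuffled[n_held_out:], shuffled[:n_held_out]

def accuracy(predict, examples):
    """Fraction of (sequence, label) pairs the predictor gets right."""
    return sum(predict(seq) == label for seq, label in examples) / len(examples)

def gc_rule(seq):
    """Invented ground truth: label 1 if at least two of the four letters are G or C."""
    return int(sum(base in "GC" for base in seq) >= 2)

# Toy data set: every possible 4-letter DNA sequence with its label.
sequences = ["".join(bases) for bases in itertools.product("ACGT", repeat=4)]
data = [(seq, gc_rule(seq)) for seq in sequences]
train, held_out = split_held_out(data)

# A "model" that memorizes the training set and guesses 0 for anything unseen.
lookup = dict(train)

def memorizer(seq):
    return lookup.get(seq, 0)

print("memorizer on training data:", accuracy(memorizer, train))
print("memorizer on held-out data:", accuracy(memorizer, held_out))
print("rule on held-out data:", accuracy(gc_rule, held_out))
```

The memorizer looks flawless on the data it trained on; only the held-out sequences reveal that it has not learned the rule.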

When Koo tests how well the predictions do with mutations, they can “vary significantly,” he said. They are “given arbitrary DNA positions important scores, but those aren’t [necessarily] important. They are just really noisy.”
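One common way to assign importance scores to DNA positions is to mutate each letter in turn and record how much a model's prediction changes. The sketch below assumes that approach; the stand-in model (`toy_model`, which simply checks for a hypothetical motif `TGACTCA`) and the helper `mutation_importance` are illustrative and are not Koo's networks or code.

```python
MOTIF = "TGACTCA"  # hypothetical binding motif, used only for this toy model

def toy_model(seq):
    """Stand-in for a trained network: predicts 1.0 if the motif is present."""
    return 1.0 if MOTIF in seq else 0.0

def mutation_importance(model, seq, alphabet="ACGT"):
    """For each position, the largest prediction change caused by any single-letter mutation."""
    baseline = model(seq)
    scores = []
    for i, original in enumerate(seq):
        effects = [
            abs(model(seq[:i] + base + seq[i + 1:]) - baseline)
            for base in alphabet
            if base != original
        ]
        scores.append(max(effects))
    return scores

sequence = "CCCC" + MOTIF + "CCCC"
importance = mutation_importance(toy_model, sequence)
for base, score in zip(sequence, importance):
    print(base, score)
```

For this clean toy model, only the motif letters receive nonzero scores. The noise Koo describes would show up as sizable scores on the flanking letters, which play no role in the function.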

Through something Koo calls an “exponential activation trick,” he reduces the network’s false positive predictions, cutting back the noise dramatically.

“What it’s showing you is that you can’t only use performance metrics like how accurate you are on examples that you’ve never seen before as a way to evaluate the model’s ability to predict the importance of mutations,” he explained.

Like using the snow to choose between a wolf and a dog, some models are using shortcuts to make predictions.

“While these shortcuts can help them make predictions that [seem more] accurate, like with the data you trained it on, it may not necessarily have learned the true essence of what the underlying biology is,” Koo said.

By learning the essence of the underlying biology, the predictions become more reliable, which means that the neural networks will be making predictions for the right reason.

The exponential activation is a noise suppressor, allowing the artificial intelligence program to focus on the biological signal.
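The noise-suppressing effect can be illustrated with a toy calculation. The article does not spell out the method, so this sketch rests on an assumption: that the trick amounts to passing a motif detector's match scores through an exponential rather than a ReLU-style function. The motif, the sequence and the helper names are invented.

```python
import math

MOTIF = "TGACTCA"  # hypothetical motif, used only for illustration

def match_scores(seq, motif=MOTIF):
    """Slide the motif along the sequence and count matching letters in each window."""
    return [
        sum(a == b for a, b in zip(seq[start:start + len(motif)], motif))
        for start in range(len(seq) - len(motif) + 1)
    ]

def share_at_peak(activations):
    """Fraction of the total activation that sits at the strongest window."""
    return max(activations) / sum(activations)

sequence = "CCCC" + MOTIF + "CCCC"
scores = match_scores(sequence)

# ReLU leaves partial matches with sizable activations.
relu_activations = [max(0.0, s) for s in scores]
# The exponential grows so quickly that the full match dominates.
exp_activations = [math.exp(s) for s in scores]

print("match scores:", scores)
print("ReLU share at true motif:", share_at_peak(relu_activations))
print("exp share at true motif:", share_at_peak(exp_activations))
```

In this example, partial matches elsewhere in the sequence hold more than half of the total ReLU activation, while the exponential concentrates more than 90 percent of the activation on the true motif.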

The data Koo trains the program on come from ENCODE, which is the ENCyclopedia Of DNA Elements.

“In my lab, we want to use these deep neural networks on cancer,” Koo said. “This is one of the major goals of my lab’s research at the early stages: to develop methods to interpret these things to trust their predictions so we can apply them in a cancer setting.”

At this point, the work he’s doing is more theoretical than practical.

“We’re still looking at developing further tools to help us interpret these networks down the road so there are additional ways we can perform quality control checks,” he said.

Koo feels well-supported by others who want to understand what these networks are learning and why they are making a prediction.

From here, Koo would like to move to the next stage of looking into specific human diseases, such as breast cancer and autism spectrum disorder, using techniques his lab has developed.

He hopes to link disease-associated variants with a molecular function, which can help researchers understand the disease and provide potential therapeutic targets.

While he’s not a doctor and doesn’t conduct clinical experiments, Koo hopes his impact will involve enabling more trustworthy and useful artificial intelligence programs.

Artificial intelligence is “becoming bigger and it’s undoubtedly impactful already,” he said. “Moving forward, we want to have transparent artificial intelligence we can trust. That’s what my research is working towards.”

He hopes the methods he develops in making the models for artificial intelligence more interpretable and trustworthy will help doctors learn more about diseases.

Koo has increased the size and scope of his lab amid the pandemic. He currently has eight people in his lab: postdoctoral students, graduate students, undergraduates and a master’s candidate.

Some people in his lab have never met in person, Koo said. “I am definitely looking forward to a normal life.”

Peter Koo. Photo by ©Gina Motisi, 2019/ CSHL

By Daniel Dunaief

We built a process that works, but we don’t know why. That’s what one of the newest additions to Cold Spring Harbor Laboratory hopes to find out.

Researchers have applied artificial intelligence in many areas in biology and health care. These systems are making useful predictions for the tasks they are trained to perform. Artificial intelligence, however, is mostly a hands-off process. After these systems receive training for a particular task, they learn patterns on their own that help them make predictions.

How these machines learn, however, has become as much of a black box as the human brains that created these learning programs in the first place. Deep learning is a way to build hierarchical representations of data, explained Peter Koo, an assistant professor at the Simons Center for Quantitative Biology at CSHL, who studies the way each layer transforms data and the next layer builds upon this in a hierarchical manner.

Koo, who earned his doctorate at Yale University and performed his postdoctoral research at Harvard University, would like to understand exactly what the machines we created are learning and how they are coming up with their conclusions.

“We don’t understand why [these artificial intelligence programs] are making their predictions,” Koo said. “My postdoctoral research and future research will continue this line of work.”

Koo is interested not only in applying deep learning to improve performance on biological problems, but also in extracting the knowledge these machines learn from the data sets to understand why they perform better than some of the traditional methods.

“How do we guide black box models to learn biologically meaningful” information? he asked. “If you have a data set and you have a predictive model that predicts the data well, you assume it must have learned something biologically meaningful,” he suggested. “It turns out, that’s not always the case.”

Deep learning can pick up other trends or links in the data that might not be biologically meaningful. In a simplistic example, an artificial intelligence weather system that tracked rain patterns during the spring might conclude, after seven rainy Tuesdays, that it rains on Tuesdays, even if the day of the week and the rain don’t have a causative link.

“If the model is trained with limited data that is not representative, it can easily learn patterns that are correlative in the training data,” Koo said. He tries to combat this in practice by holding out some data, which is called validation data. Scientists use it to evaluate how well the model generalizes to new data.

Koo plans to collaborate with numerous biologists at Cold Spring Harbor Laboratory, as well as other quantitative biologists, like assistant professors Justin Kenney and David McCandlish.

In an email, Kenney explained that the Simons Center is “very interested in moving into this area, which is starting to have a major impact on biology just as it has in the technology industry.”

The quantitative team is interested in high-throughput data sets that link sequence to function, which includes assays for protein binding, gene expression, protein function and a host of others. Koo plans to take a “top down” approach to interpret what the models have learned. The benefit of this perspective is that it doesn’t set any biases in the models.

Deep learning, Koo suggested, is a rebranding of artificial neural networks. Researchers create a network of simple computational units and collectively they become a powerful tool to approximate functions.
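As a rough sketch of how simple units combine to approximate a function, the example below builds a one-layer network of ReLU units that approximates x squared between 0 and 1. The weights are set by hand rather than learned, and the function names are hypothetical; the point is only that adding units shrinks the error.

```python
def relu(z):
    """A single simple unit: pass positive inputs through, zero out the rest."""
    return max(0.0, z)

def make_network(n_units):
    """Build f(x) from n_units ReLU units so that f approximates x**2 on [0, 1].

    Each unit switches on at a different point (knot) and adds a little slope,
    so the sum is a piecewise-linear curve that touches x**2 at every knot.
    """
    knots = [k / n_units for k in range(n_units)]
    weights = [1.0 / n_units] + [2.0 / n_units] * (n_units - 1)

    def network(x):
        return sum(w * relu(x - knot) for w, knot in zip(weights, knots))

    return network

def max_error(network, n_points=1001):
    """Largest gap between the network and x**2 on an evenly spaced grid over [0, 1]."""
    xs = [i / (n_points - 1) for i in range(n_points)]
    return max(abs(network(x) - x * x) for x in xs)

for n_units in (2, 4, 16):
    print(n_units, "units -> max error", max_error(make_network(n_units)))
```

Real networks learn their weights from data and stack many such layers, but the same principle applies: each unit is trivial, and the approximation comes from combining them.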

A physicist by training, Koo taught himself his expertise in deep learning, Kenney wrote in an email. “He thinks far more deeply about problems than I suspect most researchers in this area do,” he wrote. Kenney is moving into this area himself, because he sees a close connection between the problem of how artificial intelligence algorithms learn to do things and how biological systems mechanistically work.

While plenty of researchers are engaged in the field of artificial intelligence, interpretable deep learning, which is where Koo has decided to make his mark, is a considerably smaller field.

“People don’t trust it yet,” Koo said. “They are black box models and people don’t understand the inner workings of them.” These systems learn some function that relates inputs to output predictions, but scientists don’t know what function they have learned.

Koo chose to come to Cold Spring Harbor Laboratory in part because he was impressed with the questions and discussions during the interview process.

Koo, daughter Evie (left) and daughter Yeonu (right) during Halloween last year. Photo by Soohyun Cho

He started his research career in experimental physics. As an undergraduate, he worked in a condensed matter lab of John Clarke at the University of California at Berkeley. He transitioned to genomics, in part because he saw a huge revolution in next-generation sequencing. He hopes to leverage what he has learned to make an impact toward precision medicine. 

Biological researchers were sequencing all kinds of cancers and were trying to make an impact toward precision medicine. “To me, that’s a big draw,” Koo said, “to make contributions here.”

A resident of Jericho, Koo lives with his wife, Soohyun Cho, their 6-year-old daughter, Evie, and their 4-year-old daughter, Yeonu.

Born and raised in the Los Angeles area, he joined the Army Reserves after high school, attended community college and then transferred to UC Berkeley to get his bachelor’s degree in physics.

As for his decision to join Cold Spring Harbor Laboratory, Koo said he is excited about the opportunity to combine his approach to his work with the depth of research in other areas.

“Cold Spring Harbor Laboratory is one of those amazing places for biological research,” Koo said. “What brought me here is the quantitative biology program. It’s a pretty new program” that has “incredibly deep thinkers.”