
Could AI be applied to protein folding?


Two years later, here is a follow-up question to the one asked here: How do we know if the [email protected] project results are right? Since we are quite sure [email protected] is working correctly, and following this article's statement:

Similar techniques could be applied to protein folding, reducing energy consumption, or searching for revolutionary new drugs and materials.

… I would like to ask whether AI techniques such as deep learning, neural networks and the rest of today's buzzwords could be applied to molecular dynamics, especially in the protein folding field?


Yes, and no :-)

In the meantime many protein structures can be predicted quite accurately - even those for which no reference fold had been known before.

In this case the important buzzword is "big data": co-mutations (for example, of charged amino acids) that show up when many independent genomes are sequenced indicate which residues are likely to be in contact in the folded protein. (This indirectly bypasses the emphasis on dynamics in protein folding.)
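
As a rough illustration of the co-mutation idea, the sketch below scores how strongly two alignment columns vary together using mutual information. This is a deliberately simple stand-in: published methods generally use more elaborate statistical models, and the toy alignment and the function name `column_mutual_information` are made up for this example.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences. A high value means
    the two positions tend to change together across the sequences.
    """
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 1 and 4 swap charge together
# (K with E, E with K), column 2 is fully conserved, and column 3
# varies independently of column 1.
toy_msa = [
    "AKGLEV",
    "AKGIEV",
    "AEGLKV",
    "AEGIKV",
]

mi_coupled = column_mutual_information(toy_msa, 1, 4)
mi_conserved = column_mutual_information(toy_msa, 1, 2)
mi_independent = column_mutual_information(toy_msa, 1, 3)
print(mi_coupled, mi_conserved, mi_independent)
```

Column pairs with a high score are candidates for residues that touch in the folded structure. With only four sequences the numbers mean little; real analyses rely on many thousands of sequences.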

Editorial: 2017, Science: http://www.sciencemag.org/news/2017/01/hundreds-elusive-protein-structures-pinned-down-genome-data

Perspective: Soding et al., 2017, Science ( http://science.sciencemag.org/content/355/6322/248 )

Research article: Ovchinnikov et al. 2017, Science ( https://www.bakerlab.org/wp-content/uploads/2017/01/ovchinnikov_science_2017.pdf )


DeepMind develops AI solution to 50-year-old protein challenge

London, 30 November 2020 - In a major scientific advance, the latest version of DeepMind's AI system AlphaFold has been recognised as a solution to the 50-year-old grand challenge of protein structure prediction, often referred to as the 'protein folding problem', according to a rigorous independent assessment. This breakthrough could significantly accelerate biological research over the long term, unlocking new possibilities in disease understanding and drug discovery among other fields.

Today, results from CASP14 show that DeepMind's latest AlphaFold system achieves unparalleled levels of accuracy in structure prediction. The system is able to determine highly-accurate structures in a matter of days. CASP, the Critical Assessment of protein Structure Prediction, is a biennial community-run assessment started in 1994, and the gold standard for assessing predictive techniques. Participants must blindly predict the structure of proteins that have only recently - or in some cases not yet - been experimentally determined, and wait for their predictions to be compared to experimental data.

CASP uses the "Global Distance Test (GDT)" metric to assess accuracy, ranging from 0-100. The new AlphaFold system achieves a median score of 92.4 GDT overall across all targets. The system's average error is approximately 1.6 Angstroms - about the width of an atom. According to Professor John Moult, Co-founder and Chair of CASP, a score of around 90 GDT is informally considered to be competitive with results obtained from experimental methods.

Professor John Moult, Co-Founder and Chair of CASP, University of Maryland said:

"We have been stuck on this one problem - how do proteins fold up - for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts wondering if we'd ever get there, is a very special moment."

Why protein structure prediction matters

Proteins are essential to life and their shapes are closely linked with their functions. The ability to predict protein structures accurately enables a better understanding of what they do and how they work. There are currently over 200 million proteins in the main database and only a fraction of their 3D structures have been mapped out.

A major challenge is the astronomical number of ways a protein could theoretically fold before settling into its final 3D structure. Many of the greatest challenges facing society, like developing treatments for diseases or finding enzymes that break down industrial waste, are fundamentally tied to proteins and the role they play. Determining protein shapes and functions is a major field of scientific research, primarily using experimental techniques that can take years of painstaking and laborious work per structure, and require the use of multi-million dollar specialised equipment.

DeepMind's approach to the protein folding problem

This breakthrough builds on DeepMind's first entry at CASP13 in 2018, where the initial version of AlphaFold achieved the highest level of accuracy among all participants. Now, DeepMind has developed new deep learning architectures for CASP14, drawing inspiration from the fields of biology, physics, and machine learning, as well as the work of many scientists in the protein folding field over the past half-century.

A folded protein can be thought of as a "spatial graph", where residues are the nodes and edges connect the residues in close proximity. This graph is important for understanding the physical interactions within proteins, as well as their evolutionary history. For the latest version of AlphaFold used at CASP14, DeepMind created an attention-based neural network system, trained end-to-end, that attempts to interpret the structure of this graph, while reasoning over the implicit graph that it's building. It uses evolutionarily related sequences, multiple sequence alignment (MSA), and a representation of amino acid residue pairs to refine this graph.
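
The "spatial graph" picture can be sketched in a few lines: treat each residue as a node and add an edge when two residues are close in space. The sketch below is not AlphaFold's internal representation, which the text does not describe. The 8 Angstrom cutoff, the use of one point per residue, the minimum sequence separation and the toy coordinates are all assumptions chosen for illustration.

```python
import math


def contact_graph(coords, threshold=8.0, min_separation=3):
    """Build an adjacency map {residue index: set of neighbouring indices}.

    coords: one (x, y, z) point per residue, in Angstroms.
    threshold: distance cutoff for an edge (assumed value).
    min_separation: skip pairs closer than this along the chain, since
        neighbours in sequence are trivially close in space.
    """
    graph = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) <= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


# Hypothetical chain that runs out along x and then doubles back,
# so residues far apart in sequence end up close in space.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (11.4, 5.0, 0.0),
    (7.6, 5.0, 0.0),
    (3.8, 5.0, 0.0),
    (0.0, 5.0, 0.0),
]

graph = contact_graph(toy_coords)
for residue, neighbours in graph.items():
    print(residue, sorted(neighbours))
```

Here the first residue ends up connected to the last two, which is the kind of long-range contact that makes the graph informative about the fold.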

By iterating this process, the system develops strong predictions of the underlying physical structure of the protein. Additionally, AlphaFold can predict which parts of each predicted protein structure are reliable using an internal confidence measure.

The system was trained on publicly available data consisting of 170,000 protein structures from the Protein Data Bank, using a relatively modest amount of compute by modern machine learning standards - approximately 128 TPUv3 cores (roughly equivalent to 100-200 GPUs) run over a few weeks.

Potential for real world impact

DeepMind is excited to collaborate with others to learn more about AlphaFold's potential, and the AlphaFold team is looking into how protein structure predictions could contribute to understanding of certain diseases with a few specialist groups.

There are also signs that protein structure prediction could be useful in future pandemic response efforts, as one of many tools developed by the scientific community. Earlier this year, DeepMind predicted several protein structures of the SARS-CoV-2 virus, and impressively quick work by experimentalists has now confirmed that AlphaFold achieved a high degree of accuracy on its predictions.

AlphaFold is one of DeepMind's most significant advances to date. But as with all scientific research, there's still much to be done, including figuring out how multiple proteins form complexes, how they interact with DNA, RNA, or small molecules, and how to determine the precise location of all amino acid side chains.

As with its earlier CASP13 AlphaFold system, DeepMind is planning to submit a paper detailing the workings of this system to a peer-reviewed journal in due course, and is simultaneously exploring how best to provide broader access to the system in a scalable way.

AlphaFold breaks new ground in demonstrating the stunning potential for AI as a tool to aid fundamental scientific discovery. DeepMind looks forward to collaborating with others to unlock that potential.

Statements from independent scientists:

Professor Venki Ramakrishnan, Nobel Laureate and President of the Royal Society
"This computational work represents a stunning advance on the protein-folding problem, a 50-year old grand challenge in biology. It has occurred decades before many people in the field would have predicted. It will be exciting to see the many ways in which it will fundamentally change biological research."

Professor Dame Janet Thornton, Director Emeritus & Senior Scientist, EMBL-EBI
"What the DeepMind team has managed to achieve is fantastic and will change the future of structural biology and protein research. After decades of studying proteins, the molecules that provide the structure and functions of all living things, I awoke this morning feeling that progress has been made."

Arthur D. Levinson, PhD, Founder & CEO Calico, Former Chairman & CEO, Genentech
"AlphaFold is a once in a generation advance, predicting protein structures with incredible speed and precision. This leap forward demonstrates how computational methods are poised to transform research in biology and hold much promise for accelerating the drug discovery process."

Professor Andrei Lupas, Director, Max Planck Institute for Developmental Biology
"AlphaFold's astonishingly accurate models have allowed us to solve a protein structure we were stuck on for close to a decade, relaunching our effort to understand how signals are transmitted across cell membranes."

Professor Ewan Birney, Deputy Director General EMBL, Director EMBL-EBI
"I nearly fell off my chair when I saw these results. I know how rigorous CASP is - it basically ensures that computational modelling must perform on the challenging task of ab-initio protein folding. It was humbling to see that these models could do that so accurately. There will be many aspects to understand but this is a huge advance for science."

Statements from DeepMind / Alphabet:

Demis Hassabis, PhD, Founder and CEO, DeepMind
"The ultimate vision behind DeepMind has always been to build AI and then use it to help further our knowledge about the world around us by accelerating the pace of scientific discovery. For us AlphaFold represents a first proof point for that thesis. This advance is our first major breakthrough in a long-standing grand challenge in science, which we hope will have a big real-world impact on disease understanding and drug discovery."

Pushmeet Kohli, PhD, Head of AI for Science, DeepMind
"These incredible results are testament to DeepMind's unique research philosophy - bringing together mission-focused, multidisciplinary teams to target ambitious scientific goals. Critical assessments like CASP are important for driving research progress, and we look forward to building on this work, deepening our understanding of proteins and biological mechanisms, and opening up new avenues of exploration."

John Jumper, PhD, AlphaFold Lead, DeepMind
"Protein biology is fantastically complex and defies simple characterisation. Our team's work demonstrates that machine learning techniques are finally able to meet the complexity of describing these incredible protein machines, and we are truly excited to see what new breakthroughs in both human health and fundamental biology it will bring."

Kathryn Tunyasuvunakool, PhD, Science Engineer, DeepMind
"The ability to predict high accuracy protein structures with AI could change how we approach biology, with potential applications in drug design and bioremediation. Particularly for experimentally challenging proteins, good predictive techniques could make a huge difference."

Sundar Pichai, CEO, Google and Alphabet
"This is an incredible AI-powered breakthrough in protein folding, which will help us better understand one of life's most fundamental building blocks. This huge leap forward from DeepMind has immediate practical implications, enabling researchers to tackle new and difficult problems, from future pandemic response to environmental sustainability."

Note, CASP also has a press release available, which you can obtain from Kerry Noble: [email protected]

DeepMind is a multidisciplinary team of scientists, engineers, machine learning experts and more, working together to research and build safe AI systems that learn how to solve problems and advance scientific discovery for all.

Best-known for developing AlphaGo, the first program to beat a world champion at the complex game of Go, DeepMind has published over 1000 research papers - including more than a dozen in Nature and Science - and achieved breakthrough results in many challenging AI domains from StarCraft II to protein folding.

DeepMind was founded in London in 2010, and joined forces with Google in 2014 to accelerate its work. Since then, its community has expanded to include teams in Alberta, Montreal, Paris, and Mountain View in California.

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.



DeepMind, an AI research lab that was bought by Google and is now an independent part of Google’s parent company Alphabet, announced a major breakthrough this week that one evolutionary biologist called “a game changer.”

“This will change medicine,” the biologist, Andrei Lupas, told Nature. “It will change research. It will change bioengineering. It will change everything.”

The breakthrough: DeepMind says its AI system, AlphaFold, has solved the “protein folding problem” — a grand challenge of biology that has vexed scientists for 50 years.

Proteins are the basic machines that get work done in your cells. They start out as strings of amino acids (imagine the beads on a necklace) but they soon fold up into a unique three-dimensional shape (imagine scrunching up the beaded necklace in your hand).

That 3D shape is crucial because it determines how the protein works. If you’re a scientist developing a new drug, you want to know the protein’s shape because that will help you come up with a molecule that can bind to it, fitting into it to alter its behavior. The trouble is, predicting which shape a protein will take is incredibly hard.

Every two years, researchers who work on this problem try to prove how good their predictive powers are by submitting predictions about the shapes that certain proteins will take. Their entries are judged at the Critical Assessment of protein Structure Prediction (CASP) conference, which is basically a fancy science contest for grown-ups.

By 2018, DeepMind’s AI was already outperforming everyone at CASP, provoking some melancholic feelings among the human researchers. DeepMind took home the win that year, but it still hadn’t solved the protein folding problem. Not even close.

This year, though, its AlphaFold system was able to predict — with impressive speed and accuracy — what shapes given strings of amino acids would fold up into. The AI is not perfect, but it’s pretty great: When it makes mistakes, it’s generally only off by the width of an atom. That’s comparable to the mistakes you get when you do physical experiments in a lab, except that those experiments are much slower and much more expensive.

“This is a big deal,” John Moult, who co-founded and oversees CASP, told Nature. “In some sense the problem is solved.”

Why this is a big deal for biology

The AlphaFold technology still needs to be refined, but assuming the researchers can pull that off, this breakthrough will likely speed up and improve our ability to develop new drugs.

Let’s start with the speed. To get a sense of how much AlphaFold can accelerate scientists’ work, consider the experience of Andrei Lupas, an evolutionary biologist at the Max Planck Institute in Germany. He spent a decade — a decade! — trying to figure out the shape of one protein. But no matter what he tried in the lab, the answer eluded him. Then he tried out AlphaFold and he had the answer in half an hour.

AlphaFold has implications for everything from Alzheimer’s disease to future pandemics. It can help us understand diseases, since many (like Alzheimer’s) are caused by misfolded proteins. It can help us find new treatments, and also help us quickly determine which existing drugs can be usefully applied to, for example, a new virus. When another pandemic comes along, it could be very helpful to have a system like AlphaFold in our back pocket.

“We could start screening every compound that is licensed for use in humans,” Lupas told the New York Times. “We could face the next pandemic with the drugs we already have.”

But for this to be possible, DeepMind would have to share its technology with scientists. The lab says it’s exploring ways to do that.

Why this is a big deal for artificial intelligence

Over the past few years, DeepMind has made a name for itself by playing games. It has built AI systems that crushed pro gamers at strategy games like StarCraft and Go. Much like the chess matches between IBM’s Deep Blue and Garry Kasparov, these matches mostly served to prove that DeepMind can make an AI that surpasses human abilities.

Now, DeepMind is proving that it has grown up. It has graduated from playing video games to addressing scientific problems with real-world significance — problems that can be life-or-death.

The protein folding problem was a perfect thing to tackle. DeepMind is a world leader in building neural networks, a type of artificial intelligence loosely inspired by the neurons in a human brain. The beauty of this type of AI is that it doesn’t require you to preprogram it with a lot of rules. Just feed a neural network enough examples of something, and it can learn to detect patterns in the data, then draw inferences based on that.

So, for example, you can present it with many thousands of strings of amino acids and show it what shape they folded into. Gradually, it detects patterns in the way given strings tend to shape up — patterns that human experts may not have detected. From there, it can make predictions about how other strings will fold.

This is exactly the sort of problem at which neural networks excel, and DeepMind recognized that, marrying the right type of AI to the right type of puzzle. (It also integrated some more complex knowledge — about physics and evolutionarily related amino acid sequences, for example — though the details remain scant as DeepMind is still preparing a peer-reviewed paper for publication.)

Other labs have already harnessed the power of neural networks to make breakthroughs in biology. At the beginning of this year, AI researchers trained a neural network by feeding it data on 2,335 molecules known to have antibacterial properties. Then they used it to predict which other molecules — out of 107 million possibilities — would also have these properties. In this way, they managed to identify brand-new types of antibiotics.

DeepMind researchers are capping the year with another achievement that shows just how much AI has matured. It’s genuinely great news for a generally terrible 2020.




Proteins are essential to life, supporting practically all its functions. They are large complex molecules, made up of chains of amino acids, and what a protein does largely depends on its unique 3D structure. Figuring out what shapes proteins fold into is known as the “protein folding problem”, and has stood as a grand challenge in biology for the past 50 years. In a major scientific advance, the latest version of our AI system AlphaFold has been recognised as a solution to this grand challenge by the organisers of the biennial Critical Assessment of protein Structure Prediction (CASP). This breakthrough demonstrates the impact AI can have on scientific discovery and its potential to dramatically accelerate progress in some of the most fundamental fields that explain and shape our world.

A protein’s shape is closely linked with its function, and the ability to predict this structure unlocks a greater understanding of what it does and how it works. Many of the world’s greatest challenges, like developing treatments for diseases or finding enzymes that break down industrial waste, are fundamentally tied to proteins and the role they play.

“We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment.”

Professor John Moult, Co-Founder and Chair of CASP, University of Maryland

This has been a focus of intensive scientific research for many years, using a variety of experimental techniques to examine and determine protein structures, such as nuclear magnetic resonance and X-ray crystallography. These techniques, as well as newer methods like cryo-electron microscopy, depend on extensive trial and error, which can take years of painstaking and laborious work per structure, and require the use of multi-million dollar specialised equipment.

The ‘protein folding problem’

In his acceptance speech for the 1972 Nobel Prize in Chemistry, Christian Anfinsen famously postulated that, in theory, a protein’s amino acid sequence should fully determine its structure. This hypothesis sparked a five decade quest to be able to computationally predict a protein’s 3D structure based solely on its 1D amino acid sequence as a complementary alternative to these expensive and time consuming experimental methods. A major challenge, however, is that the number of ways a protein could theoretically fold before settling into its final 3D structure is astronomical. In 1969 Cyrus Levinthal noted that it would take longer than the age of the known universe to enumerate all possible configurations of a typical protein by brute force calculation – Levinthal estimated 10^300 possible conformations for a typical protein. Yet in nature, proteins fold spontaneously, some within milliseconds – a dichotomy sometimes referred to as Levinthal’s paradox.
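
The scale of Levinthal's estimate is easier to appreciate with the arithmetic written out. The sketch below works in powers of ten. Only the 10^300 figure comes from the text; the sampling rate is a hypothetical value picked for illustration, and the age of the universe is taken as roughly 13.8 billion years.

```python
import math

# Levinthal's estimate quoted above: 10^300 possible conformations.
LOG10_CONFORMATIONS = 300

# Assumptions for illustration only (not from the text):
SAMPLES_PER_SECOND = 1e13        # hypothetical rate of trying conformations
AGE_OF_UNIVERSE_YEARS = 1.38e10  # roughly 13.8 billion years
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Work with base-10 logarithms so the huge numbers stay manageable.
log10_seconds_needed = LOG10_CONFORMATIONS - math.log10(SAMPLES_PER_SECOND)
log10_universe_seconds = math.log10(AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR)
log10_ratio = log10_seconds_needed - log10_universe_seconds

print(f"Exhaustive search: about 10^{log10_seconds_needed:.0f} seconds")
print(f"Age of the universe: about 10^{log10_universe_seconds:.1f} seconds")
print(f"Search time / age of universe: about 10^{log10_ratio:.0f}")
```

Even with a far more generous sampling rate, the exponent barely moves, which is why brute-force enumeration cannot be how proteins find their folded state.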


Results from the CASP14 assessment

In 1994, Professor John Moult and Professor Krzysztof Fidelis founded CASP as a biennial blind assessment to catalyse research, monitor progress, and establish the state of the art in protein structure prediction. It is both the gold standard for assessing predictive techniques and a unique global community built on shared endeavour. Crucially, CASP chooses protein structures that have only very recently been experimentally determined (some were still awaiting determination at the time of the assessment) to be targets for teams to test their structure prediction methods against; they are not published in advance. Participants must blindly predict the structure of the proteins, and these predictions are subsequently compared to the ground truth experimental data when they become available. We’re indebted to CASP’s organisers and the whole community, not least the experimentalists whose structures enable this kind of rigorous assessment.


The main metric used by CASP to measure the accuracy of predictions is the Global Distance Test (GDT) which ranges from 0-100. In simple terms, GDT can be approximately thought of as the percentage of amino acid residues (beads in the protein chain) within a threshold distance from the correct position. According to Professor Moult, a score of around 90 GDT is informally considered to be competitive with results obtained from experimental methods.
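
The "percentage of residues within a threshold distance" idea can be sketched as follows. The commonly used GDT_TS variant averages that percentage over cutoffs of 1, 2, 4 and 8 Angstroms. The sketch assumes the predicted and experimental coordinates (one point per residue) have already been superimposed. The real metric also searches over superpositions, which is omitted here, so treat this as an approximation and the coordinates as invented toy data.

```python
import math


def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS (0-100) for two already-superimposed structures.

    predicted, reference: equal-length lists of (x, y, z) points in
    Angstroms, one per residue. For each cutoff, compute the fraction of
    residues whose predicted position lies within that distance of the
    reference position, then average the fractions.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy data: four residues displaced by 0.5, 1.5, 3.0 and 10.0 Angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts(predicted, reference)
perfect = gdt_ts(reference, reference)
print(score, perfect)
```

Because the score averages over several cutoffs, a prediction is rewarded for residues that are nearly right as well as those that are almost exact, and one badly placed residue lowers the score without dominating it.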

In the results from the 14th CASP assessment, released today, our latest AlphaFold system achieves a median score of 92.4 GDT overall across all targets. This means that our predictions have an average error (RMSD) of approximately 1.6 Angstroms, which is comparable to the width of an atom (or 0.1 of a nanometer). Even for the very hardest protein targets, those in the most challenging free-modelling category, AlphaFold achieves a median score of 87.0 GDT (data available here).
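
For comparison with GDT, the root-mean-square deviation (RMSD) mentioned above can be computed as in the sketch below. As before, it assumes the two structures are already superimposed and uses invented toy coordinates. The helper name `rmsd` is just for this example.

```python
import math

ANGSTROM_TO_NM = 0.1  # 1 Angstrom = 0.1 nanometres


def rmsd(predicted, reference):
    """Root-mean-square deviation between two superimposed structures.

    predicted, reference: equal-length lists of (x, y, z) points in
    Angstroms. Returns the RMSD in Angstroms.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = [math.dist(p, r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data: per-residue deviations of 1, 1, 2 and 2 Angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 2.0, 0.0), (11.4, 2.0, 0.0)]

error_angstrom = rmsd(predicted, reference)
error_nm = error_angstrom * ANGSTROM_TO_NM
print(f"RMSD: {error_angstrom:.2f} Angstroms = {error_nm:.3f} nm")
```

Unlike GDT, RMSD squares each deviation, so a few badly misplaced residues can dominate the result. That is one reason the two numbers are reported side by side.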

Improvements in the median accuracy of predictions in the free modelling category for the best team in each CASP, measured as best-of-5 GDT.

Two examples of protein targets in the free modelling category. AlphaFold predicts highly accurate structures measured against experimental results.

These exciting results open up the potential for biologists to use computational structure prediction as a core tool in scientific research. Our methods may prove especially helpful for important classes of proteins, such as membrane proteins, that are very difficult to crystallise and therefore challenging to experimentally determine.

“This computational work represents a stunning advance on the protein-folding problem, a 50-year-old grand challenge in biology. It has occurred decades before many people in the field would have predicted. It will be exciting to see the many ways in which it will fundamentally change biological research.”

Professor Venki Ramakrishnan, Nobel Laureate and President of the Royal Society

Our approach to the protein folding problem

We first entered CASP13 in 2018 with our initial version of AlphaFold, which achieved the highest accuracy among participants. Afterwards, we published a paper on our CASP13 methods in Nature with associated code, which has gone on to inspire other work and community-developed open source implementations. Now, new deep learning architectures we’ve developed have driven changes in our methods for CASP14, enabling us to achieve unparalleled levels of accuracy. These methods draw inspiration from the fields of biology, physics, and machine learning, as well as of course the work of many scientists in the protein folding field over the past half-century.

A folded protein can be thought of as a “spatial graph”, where residues are the nodes and edges connect the residues in close proximity. This graph is important for understanding the physical interactions within proteins, as well as their evolutionary history. For the latest version of AlphaFold, used at CASP14, we created an attention-based neural network system, trained end-to-end, that attempts to interpret the structure of this graph, while reasoning over the implicit graph that it’s building. It uses evolutionarily related sequences, multiple sequence alignment (MSA), and a representation of amino acid residue pairs to refine this graph.

By iterating this process, the system develops strong predictions of the underlying physical structure of the protein and is able to determine highly-accurate structures in a matter of days. Additionally, AlphaFold can predict which parts of each predicted protein structure are reliable using an internal confidence measure.

We trained this system on publicly available data consisting of 170,000 protein structures from the Protein Data Bank together with large databases containing protein sequences of unknown structure. It uses approximately 16 TPUv3s (which is 128 TPUv3 cores or roughly equivalent to 100-200 GPUs) run over a few weeks, a relatively modest amount of compute in the context of most large state-of-the-art models used in machine learning today. As with our CASP13 AlphaFold system, we are preparing a paper on our system to submit to a peer-reviewed journal in due course.

An overview of the main neural network model architecture. The model operates over evolutionarily related protein sequences as well as amino acid residue pairs, iteratively passing information between both representations to generate a structure.

The potential for real-world impact

When DeepMind started a decade ago, we hoped that one day AI breakthroughs would help serve as a platform to advance our understanding of fundamental scientific problems. Now, after 4 years of effort building AlphaFold, we’re starting to see that vision realised, with implications for areas like drug design and environmental sustainability.

Professor Andrei Lupas, Director of the Max Planck Institute for Developmental Biology and a CASP assessor, let us know that, “AlphaFold’s astonishingly accurate models have allowed us to solve a protein structure we were stuck on for close to a decade, relaunching our effort to understand how signals are transmitted across cell membranes.”

We’re optimistic about the impact AlphaFold can have on biological research and the wider world, and excited to collaborate with others to learn more about its potential in the years ahead. Alongside working on a peer-reviewed paper, we’re exploring how best to provide broader access to the system in a scalable way.

In the meantime, we’re also looking into how protein structure predictions could contribute to our understanding of specific diseases with a small number of specialist groups, for example by helping to identify proteins that have malfunctioned and to reason about how they interact. These insights could enable more precise work on drug development, complementing existing experimental methods to find promising treatments faster.

“AlphaFold is a once in a generation advance, predicting protein structures with incredible speed and precision. This leap forward demonstrates how computational methods are poised to transform research in biology and hold much promise for accelerating the drug discovery process.”

Arthur D. Levinson, PhD, Founder & CEO Calico, Former Chairman & CEO, Genentech

We’ve also seen signs that protein structure prediction could be useful in future pandemic response efforts, as one of many tools developed by the scientific community. Earlier this year, we predicted several protein structures of the SARS-CoV-2 virus, including ORF3a, whose structures were previously unknown. At CASP14, we predicted the structure of another coronavirus protein, ORF8. Impressively quick work by experimentalists has now confirmed the structures of both ORF3a and ORF8. Despite their challenging nature and having very few related sequences, we achieved a high degree of accuracy on both of our predictions when compared to their experimentally determined structures.

As well as accelerating understanding of known diseases, we’re excited about the potential for these techniques to explore the hundreds of millions of proteins we don’t currently have models for – a vast terrain of unknown biology. Since DNA specifies the amino acid sequences that comprise protein structures, the genomics revolution has made it possible to read protein sequences from the natural world at massive scale – with 180 million protein sequences and counting in the Universal Protein database (UniProt). In contrast, given the experimental work needed to go from sequence to structure, only around 170,000 protein structures are in the Protein Data Bank (PDB). Among the undetermined proteins may be some with new and exciting functions and – just as a telescope helps us see deeper into the unknown universe – techniques like AlphaFold may help us find them.

Unlocking new possibilities

AlphaFold is one of our most significant advances to date but, as with all scientific research, there are still many questions to answer. Not every structure we predict will be perfect. There’s still much to learn, including how multiple proteins form complexes, how they interact with DNA, RNA, or small molecules, and how we can determine the precise location of all amino acid side chains. In collaboration with others, there’s also much to learn about how best to use these scientific discoveries in the development of new medicines, ways to manage the environment, and more.

For all of us working on computational and machine learning methods in science, systems like AlphaFold demonstrate the stunning potential for AI as a tool to aid fundamental discovery. Just as 50 years ago Anfinsen laid out a challenge far beyond science’s reach at the time, there are many aspects of our universe that remain unknown. The progress announced today gives us further confidence that AI will become one of humanity’s most useful tools in expanding the frontiers of scientific knowledge, and we’re looking forward to the many years of hard work and discovery ahead!

Until we’ve published a paper on this work, please cite:

High Accuracy Protein Structure Prediction Using Deep Learning

John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Kathryn Tunyasuvunakool, Olaf Ronneberger, Russ Bates, Augustin Žídek, Alex Bridgland, Clemens Meyer, Simon A A Kohl, Anna Potapenko, Andrew J Ballard, Andrew Cowie, Bernardino Romera-Paredes, Stanislav Nikolov, Rishub Jain, Jonas Adler, Trevor Back, Stig Petersen, David Reiman, Martin Steinegger, Michalina Pacholska, David Silver, Oriol Vinyals, Andrew W Senior, Koray Kavukcuoglu, Pushmeet Kohli, Demis Hassabis.

In Fourteenth Critical Assessment of Techniques for Protein Structure Prediction (Abstract Book), 30 November - 4 December 2020.


London A.I. Lab Claims Breakthrough That Could Accelerate Drug Discovery

Researchers at DeepMind say they have solved “the protein folding problem,” a task that has bedeviled scientists for more than 50 years.

Some scientists spend their lives trying to pinpoint the shape of tiny proteins in the human body.

Proteins are the microscopic mechanisms that drive the behavior of viruses, bacteria, the human body and all living things. They begin as strings of chemical compounds, before twisting and folding into three-dimensional shapes that define what they can do — and what they cannot.

For biologists, identifying the precise shape of a protein often requires months, years or even decades of experimentation. It requires skill, intelligence and more than a little elbow grease. Sometimes they never succeed.

Now, an artificial intelligence lab in London has built a computer system that can do the job in a few hours — perhaps even a few minutes.

DeepMind, a lab owned by the same parent company as Google, said on Monday that its system, called AlphaFold, had solved what is known as “the protein folding problem.” Given the string of amino acids that make up a protein, the system can rapidly and reliably predict its three-dimensional shape.

This long-sought breakthrough could accelerate the ability to understand diseases, develop new medicines and unlock mysteries of the human body.

Computer scientists have struggled to build such a system for more than 50 years. For the last 25, they have measured and compared their efforts through a global competition called the Critical Assessment of Structure Prediction, or C.A.S.P. Until now, no contestant had even come close to solving the problem.

DeepMind solved the problem with a wide range of proteins, reaching an accuracy level that rivaled physical experiments. Many scientists had assumed that moment was still years, if not decades, away.

“I always hoped I would live to see this day,” said John Moult, a professor at the University of Maryland who helped create C.A.S.P. in 1994 and continues to oversee the biennial contest. “But it wasn’t always obvious I was going to make it.”

As part of this year’s C.A.S.P., DeepMind’s technology was reviewed by Dr. Moult and other researchers who oversee the contest.

If DeepMind’s methods can be refined, he and other researchers said, they could speed the development of new drugs as well as efforts to apply existing medications to new viruses and diseases.

The breakthrough arrives too late to make a significant impact on the coronavirus. But researchers believe DeepMind’s methods could accelerate the response to future pandemics. Some believe it could also help scientists gain a better understanding of genetic diseases along the lines of Alzheimer’s or cystic fibrosis.

Still, experts cautioned that this technology would affect only a small part of the long process by which scientists identify new medicines and analyze disease. It was also unclear when or how DeepMind would share its technology with other researchers.

DeepMind is one of the key players in a sweeping change that has spread across academia, the tech industry and the medical community over the past 10 years. Thanks to an artificial intelligence technology called a neural network, machines can now learn to perform many tasks that were once beyond their reach — and sometimes beyond the reach of humans.

A neural network is a mathematical system loosely modeled on the network of neurons in the human brain. It learns skills by analyzing vast amounts of data. By pinpointing patterns in thousands of cat photos, for instance, it can learn to recognize a cat.

This is the technology that recognizes faces in the photos you post to Facebook, identifies the commands you bark into your smartphone and translates one language into another on Skype and other services. DeepMind is using this technology to predict the shape of proteins.

If scientists can predict the shape of a protein in the human body, they can determine how other molecules will bind or physically attach to it. This is one way drugs are developed: A drug binds to particular proteins in your body and alters their behavior.

By analyzing thousands of known proteins and their physical shapes, a neural network can learn to predict the shapes of others. In 2018, using this method, DeepMind entered the C.A.S.P. contest for the first time and its system outperformed all other competitors, signaling a significant shift. But its team of biologists, physicists and computer scientists, led by a researcher named John Jumper, were nowhere close to solving the ultimate problem.

In the two years since, Dr. Jumper and his team designed an entirely new kind of neural network specifically for protein folding, and this drove an enormous leap in accuracy. Their latest version provides a powerful, if imperfect, solution to the protein folding problem, said the DeepMind research scientist Kathryn Tunyasuvunakool.

The system can accurately predict the shape of a protein about two-thirds of the time, according to the results of the C.A.S.P. contest. And its mistakes with these proteins are smaller than the width of an atom — an error rate that rivals physical experiments.

“Most atoms are within an atom diameter of where they are in the experimental structure,” said Dr. Moult, the contest organizer. “And with those that aren’t, there are other possible explanations of the differences.”

Andrei Lupas, director of the department of protein evolution at the Max Planck Institute for Developmental Biology in Germany, is among those who worked with AlphaFold. He is part of a team that spent a decade trying to determine the physical shape of a particular protein in a tiny bacteria-like organism called an archaeon.

This protein straddles the membrane of individual cells — part is inside the cell, part is outside — and that makes it difficult for scientists like Dr. Lupas to determine the shape of the protein in the lab. Even after a decade, he could not pinpoint the shape.

With AlphaFold, he cracked the problem in half an hour.

If these methods continue to improve, he said, they could be a particularly useful way of determining whether a new virus could be treated with a cocktail of existing drugs.

“We could start screening every compound that is licensed for use in humans,” Dr. Lupas said. “We could face the next pandemic with the drugs we already have.”

During the current pandemic, a simpler form of artificial intelligence proved helpful in some cases. A system built by another London company, BenevolentAI, helped pinpoint an existing drug, baricitinib, that could be used to treat seriously ill Covid-19 patients. Researchers have now completed a clinical trial, though the results have not yet been released.

As researchers continue to improve the technology, AlphaFold could further accelerate this kind of drug repurposing, as well as the development of entirely new vaccines, especially if we encounter a virus that is even less understood than Covid-19.

David Baker, the director of the Institute for Protein Design at the University of Washington, who has been using similar computer technology to design anti-coronavirus drugs, said DeepMind’s methods could accelerate that work.

“We were able to design coronavirus-neutralizing proteins in several months,” he said. “But our goal is to do this kind of thing in a couple of weeks.”

Still, development speed must contend with other issues, like massive clinical trials, said Dr. Vincent Marconi, a researcher at Emory University in Atlanta who helped lead the baricitinib trial. “That takes time,” he said.

But DeepMind’s methods could be a way of determining whether a clinical trial will fail because of toxic reactions or other problems, at least in some cases.

Demis Hassabis, DeepMind’s chief executive and co-founder, said the company planned to publish details describing its work, but that was unlikely to happen until sometime next year. He also said the company was exploring ways of sharing the technology itself with other scientists.

DeepMind is a research lab. It does not sell products directly to other labs or businesses. But it could work with other companies to share access to its technology over the internet.

The lab’s biggest breakthroughs in the past have involved games. It built systems that surpassed human performance on the ancient strategy game Go and the popular video game StarCraft — enormously technical achievements with no practical application. Now, the DeepMind team are eager to push their artificial intelligence technology into the real world.

“We don’t want to be a leader board company,” Dr. Jumper said. “We want real biological relevance.”


Pichai’s DeepMind AI Solves 50-Year-Old Problem, By Unlocking Mysteries Of Protein

Alphabet’s DeepMind has achieved a breakthrough in the field of AI-based protein structure prediction.

The AI company has revealed that its AlphaFold system has managed to solve a 50-year-old mystery pertaining to protein folding.

For the uninitiated: when a protein is made, it usually emerges in the form of a long string. For this string to be used by the body, however, it needs to fold into a three-dimensional shape. The resulting shape determines how the protein works, for example what can enter or exit a particular cell.

Cracking the mystery of this protein shape will allow future drug makers to find a solution to a disease. Understanding a protein’s shape can help researchers stop a disease from spreading or, in the case of neurodegenerative and cognitive disorders, correct the mistakes and offer people a new lease on life. Doing this is a complex and time-consuming task.

However, DeepMind’s AlphaFold has used AI to correctly figure out the structure of the protein in a matter of days.

If you think you’ve heard about protein folding before, you’re not wrong. In the early days of the pandemic, Folding@home allowed people to share their computers’ processing power with researchers to help them crack the SARS-CoV-2 code. DeepMind’s application, however, is more standalone.

DeepMind makes use of an ‘attention-based neural network system’, which in layman’s terms is a neural network that concentrates on the most relevant inputs to improve efficiency. The system is able to continuously adapt and refine its predictive graph of possible protein folding outcomes, looking at their history, and delivering very accurate predictions most of the time.
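To make the idea of "concentrating on specific inputs" concrete, here is a minimal sketch of scaled dot-product attention, the generic building block behind attention-based networks. It is not AlphaFold's actual architecture, which had not been published at the time of writing. The function names and the toy vectors are made up for illustration.

```python
import math

def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Score each input by how well its key matches the query.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Turn the scores into weights that are positive and sum to 1.
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy inputs (hypothetical): three input positions with 2-d keys and values.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 1.0]

weights, output = attend(query, keys, values)
print(weights)  # the third input gets the largest weight
print(output)
```

The third key matches the query best, so the network "pays the most attention" to the third input when building its output. In a real model the queries, keys and values are learned from data rather than written by hand.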

Alphabet CEO Sundar Pichai announced the discovery in a statement on Twitter: “@DeepMind's incredible AI-powered protein folding breakthrough will help us better understand one of life’s fundamental building blocks + enable researchers to tackle new and hard problems, from fighting diseases to environmental sustainability.”

DeepMind’s edge in cracking the code of proteins in a shorter span of time could help the world better prepare for potential pandemics, such as the one caused by SARS-CoV-2, and help scientists develop solutions that could save millions of lives around the world.


Proteins by design

Proteins are life’s workhorses. They speed up vital chemical reactions, enable muscles to tug, carry out communication between and within cells, and defend against invaders. Given proteins’ talents, researchers have long wanted to create their own versions. They have modified many existing proteins by making small tweaks to an organism’s DNA code, but this year, they took protein modification to a whole new level: They created a suite of designer proteins unlike anything found in nature, setting the stage for novel medicines and materials.

Designing new proteins from scratch has been a hit-or-miss activity. It’s easy enough to write any desired DNA code, but researchers have had no way of knowing how the novel strings of amino acids encoded by this DNA would fold into complex 3D shapes. That’s a problem, because for proteins, shape dictates function. Recently, however, computational biologists have made heady progress in designing computer programs that accurately predict how designer proteins will fold. Those advances made possible this year’s surge in designer proteins.

In February, a team led by researchers in Washington state used one such program to design what may become a universal flu vaccine, able to spark immune defenses to all flu strains simultaneously. In July, a team that included many of the same researchers created proteins that self-assemble into hollow cages, which some day could be filled with drugs or snippets of DNA to treat a range of diseases. Another team used a similar program to produce 3D, folded RNA molecules, which present a folding problem similar to proteins, as well as RNA-protein complexes, opening up new research possibilities.

Now, researchers want to use their skills to create everything from novel biosensors to new ways to remove carbon dioxide from the atmosphere. And because life makes only a tiny fraction of all possible proteins, protein designers have a vast new territory to explore. –Robert F. Service


Tryptophan technique illuminates protein folding.

Misfolded proteins are implicated in diseases from Alzheimer's to Parkinson's, but tracking the process by which they occur remains one of biochemistry's greatest challenges. Now, a team at Universite de Montreal has shown that a technique based on the fluorescence of tryptophan might be a better tool to probe protein folding than anyone previously thought.

The transitions involved in protein folding are notoriously difficult to study as the half-folded intermediates don't usually last long enough for their unique signatures to be unambiguously detected by traditional methods such as crystallography or nuclear magnetic resonance (NMR) spectroscopy. An alternate method is based on the fluorescence of tryptophan (while several amino acids exhibit fluorescence, tryptophan's is the strongest). By measuring changes in the light emitted by excited tryptophan molecules, researchers can glean information about the local environment in a specific part of the protein.

"It has become dogma that tryptophan has to be at least partially buried in the folded structure in order to see a strong change in fluorescence between unfolded and folded states," says Stephen Michnick, a biochemist at Universite de Montreal. In a technical report published in Nature Structural and Molecular Biology, Michnick and Alexis Vallee-Belisle disproved that theory. They created mutant versions of the protein ubiquitin with tryptophan substituted at sites that were exposed on the surface of the protein. Fluorescence spectroscopy showed that even at these sites, the electronic differences between folded and unfolded states were still enough to cause detectable changes in fluorescence. The team went on to create mutant versions of ubiquitin with up to 27 of its 76 amino acids replaced with tryptophan. The larger number of probes allows researchers to study many areas of the protein at once.

The technique is surprisingly simple, which Michnick says is precisely the point. "I hope this gives the protein community a license to try something that they probably wanted to try but didn't have the nerve to, because they thought it was crazy," he says. He adds that the technique could be applied not only to protein folding intermediates, but any conformational change in proteins including allosteric transitions and macromolecular assembly.


Special Issue Editors

The protein homeostasis system, including autophagy, chaperones, and the unfolded protein response, is one of the most essential gatekeepers of cell homeostasis. Thus, the loss of protein homeostasis due to protein misfolding and aggregation often results in cell death, with gain-of-toxic-function and loss-of-function diseases such as neurodegenerative and many other aging-related disorders. Key and fundamental properties of protein (mis)folding and aggregation in causing cell death have been increasingly revealed. However, a clearer understanding of the complex mechanisms of protein misfolding diseases remains elusive. This Special Issue focuses on in vitro and in vivo studies of protein folding, aggregation, and cell death.

Proteins are one of the most fundamental players for numerous biological processes that sustain living organisms. The correct folding of nascent polypeptides from unstructured conformations to the native structure in cells is a prerequisite for gain-of-function of proteins. Stochastic and environment-induced misfolding often results in irreversible protein aggregation, which causes loss-of-function and gain-of-toxic-function of proteins by damaging organelles and inducing cell death.

The protein homeostasis (proteostasis) system, including autophagy, the unfolded protein response, and chaperone function, is fundamental to maintaining cell homeostasis. Among these, chaperones play a central role in protein homeostasis by assisting correct folding and preventing aggregation of misfolded proteins. Most seriously, errors in the proteostasis system in organelles, cells, tissues, and organs give rise to various protein misfolding diseases, including Alzheimer's disease (AD), Parkinson's disease (PD), type 2 diabetes mellitus (T2DM), and amyotrophic lateral sclerosis (ALS). Misfolded and aggregated proteins interact with distinct cellular membranes, e.g., the plasma and organelle membranes, which affects the regulation of organelle function and intracellular signaling associated with apoptotic and non-apoptotic cell death processes. In addition, the interaction of mis-regulated protein aggregates with cellular organelles such as the mitochondria and endoplasmic reticulum has been linked to the loss of cellular calcium homeostasis with oxidative stress, thereby leading to several types of protein misfolding diseases.

Although several key and underlying aspects of protein folding, misfolding-induced aggregation, chaperone function, and their relation to cell death have been suggested, complicated linkages among these biological and pathological processes remain to be further discussed. In order to gain a deeper understanding, we invite cutting-edge researchers to submit original and review articles on the broad topic of “Protein Folding, Aggregation, and Cell Death”. This Special Issue will collect comprehensive manuscripts and provide valuable insight for the scientific community, ranging from basic to clinical researchers. Original research articles, timely reviews, and short communications are welcome.

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Biology is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.


Artificial intelligence tool cracks code to imagine proteins in 3D

An artificial intelligence network has solved a scientific problem that has stumped researchers for half a century, successfully predicting the way proteins fold into three-dimensional shapes, a task that has typically required expensive and painstaking lab work that could go on for decades.

The way proteins, one of the building blocks of life, fold drives their functionality and behaviour. For instance, SARS-CoV-2 has a protein that folds as a spike. A protein’s shape is therefore important to biologists, including in the search for cures for illnesses. It isn’t easy, though, to predict the shape of a protein from the way amino acids come together to form it. That’s because there are countless ways in which a protein can fold into a three-dimensional structure.

DeepMind, owned by Google, created a computer programme called AlphaFold, which predicted to surprising accuracy the 3D shapes of proteins after being fed their constituent parts – data depicting strings of amino acids.

“This computational work represents a stunning advance on the protein-folding problem, a 50-year-old grand challenge in biology. It has occurred decades before many people in the field would have predicted. It will be exciting to see the many ways in which it will fundamentally change biological research,” said professor Venki Ramakrishnan, Nobel laureate and president of the Royal Society, according to a blog post by DeepMind.

“It’s a breakthrough of the first order, certainly one of the most significant scientific results of my lifetime,” a report in Nature quoted Mohammed AlQuraishi, a computational biologist at Columbia University in New York City, as saying. “I think it’s fair to say this will be very disruptive to the protein-structure-prediction field. I suspect many will leave the field as the core problem has arguably been solved,” he added.

AlQuraishi was part of the Critical Assessment of Structure Prediction (CASP), a competition held every two years to accelerate research in the field, where AlphaFold reached the threshold for what is considered as having “solved” the problem.

DeepMind became a subsidiary of Google after a 2014 acquisition and is best known for its game-playing AI, which taught itself to beat Atari video games and defeated world-renowned Go players like Lee Sedol. The company’s ambition has been to develop AI that can be applied to broader problems, and it has so far created systems to make Google’s data centres more energy-efficient, identify eye disease from scans and generate human-sounding speech.

“These algorithms are now becoming strong enough and powerful enough to be applicable to scientific problems,” DeepMind chief executive officer Demis Hassabis said in a call with reporters, news agency Bloomberg reported. After four years of development “we have a system that’s accurate enough to actually have biological significance and relevance for biological researchers.”

The DeepMind blog post referred to comments by eminent scientists on the topic in the past to illustrate the significance of the breakthrough. “In his acceptance speech for the 1972 Nobel Prize in Chemistry, Christian Anfinsen famously postulated that, in theory, a protein’s amino acid sequence should fully determine its structure. This hypothesis sparked a five decade quest to be able to computationally predict a protein’s 3D structure based solely on its 1D amino acid sequence as a complementary alternative to these expensive and time consuming experimental methods,” it said.

“A major challenge, however, is that the number of ways a protein could theoretically fold before settling into its final 3D structure is astronomical. In 1969 Cyrus Levinthal noted that it would take longer than the age of the known universe to enumerate all possible configurations of a typical protein by brute force calculation – Levinthal estimated 10^300 possible conformations for a typical protein,” it added.
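The scale of Levinthal's estimate can be checked with a few lines of arithmetic. The sketch below takes the 10^300 figure quoted above and divides it by a sampling rate. The rate of 10^13 conformations per second and the 13.8-billion-year age of the universe are assumptions added here for illustration. They do not come from the DeepMind post.

```python
# Back-of-the-envelope check of Levinthal's paradox.
# Values marked "assumed" are illustrative and are not taken from the article.
CONFORMATIONS = 10**300                 # Levinthal's estimate, as quoted above
SAMPLES_PER_SECOND = 10**13             # assumed: one conformation tried every 0.1 picoseconds
SECONDS_PER_YEAR = 31_557_600           # 365.25 days
AGE_OF_UNIVERSE_YEARS = 13_800_000_000  # assumed: roughly 13.8 billion years

# Python integers have arbitrary precision, so these divisions do not overflow.
seconds_needed = CONFORMATIONS // SAMPLES_PER_SECOND
years_needed = seconds_needed // SECONDS_PER_YEAR
universe_ages = years_needed // AGE_OF_UNIVERSE_YEARS

def order_of_magnitude(n):
    # Number of decimal digits minus one, i.e. the power of ten.
    return len(str(n)) - 1

print("seconds needed: about 10^%d" % order_of_magnitude(seconds_needed))
print("years needed: about 10^%d" % order_of_magnitude(years_needed))
print("ages of the universe: about 10^%d" % order_of_magnitude(universe_ages))
```

Even with this very generous sampling rate, a brute-force search would take on the order of 10^269 times the age of the universe. Real proteins fold in milliseconds to seconds, so they clearly do not search at random, which is why prediction methods look for shortcuts instead of enumerating every configuration.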

CASP scientists analysed the shape of amino acid sequences for about 100 proteins. Competitors were given the sequences, and charged with predicting their shape.

AlphaFold’s assessment lined up almost perfectly with the CASP analysis for two-thirds of the proteins, compared to about 10% from the other teams, and better than what DeepMind’s tool achieved two years ago.

Hassabis said his inspiration for AlphaFold came from “citizen science” attempts to find unknown protein structures, like Foldit, which presented amateur volunteers with the problem in the form of a puzzle.

In its first two years, the human gamers proved to be surprisingly good at solving the riddles, and ended up discovering a structure that had baffled scientists and designing a new enzyme that was later confirmed in the lab.