(Nanowerk News) In 2019, scientists at the Laboratory of Protein Design and Immunoengineering (LPDI), a joint laboratory of the School of Engineering and School of Life Sciences led by Bruno Correia, developed MaSIF: a machine learning-based method that scans millions of protein surfaces in minutes to analyze their structure and functional properties. The researchers’ ultimate goal was to computationally design protein interactions by finding optimal matches between molecules based on their surface chemical and geometric ‘fingerprints’.
Four years later, they have achieved just that. In a paper published in Nature (“De novo design of protein interactions with learned surface fingerprints”), they report that they have created novel proteins, called binders, designed to interact with four therapeutically relevant protein targets, including the SARS-CoV-2 spike protein.
Engineering the perfect molecular match
Physical interactions between proteins affect everything from cell signaling and growth to immune response, so the ability to control protein-protein interactions is of great interest to the fields of biology and biotechnology. While textbook depictions of protein binding may seem as simple as putting puzzle pieces together, the reality is more complex: protein surfaces are highly variable and dynamic, making it difficult to predict how and where binding events will occur.
“A piece of the puzzle is two-dimensional, but with a protein surface, we see many dimensions: chemical composition, such as the interaction of positive versus negative charges; complementarity in form, curvature, etc.,” explains LPDI PhD student and co-author Anthony Marchand.
“The idea that everything in nature that binds is complementary – for example, positive charges bind with negative charges – has been a longstanding concept in the field, which we captured in our computational framework.”
To design new protein binders, the researchers used MaSIF to create protein surface ‘fingerprints’, and then searched a database of fragments for surfaces complementary to key sites on the target proteins. They then digitally grafted the fragments onto larger protein scaffolds, and selected the resulting binders that were predicted to interact best with their targets. After synthesizing and testing these selected binders in the laboratory, the researchers were able to confirm the computationally generated hypotheses.
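The matching step described above can be illustrated with a toy sketch. This is not MaSIF's actual method or code: the three-number descriptors, the fragment names, and the scoring rule (opposite charge and curvature, similar hydrophobicity) are all assumptions invented for illustration, whereas the real fingerprints are learned by a neural network.

```python
import math

# Hypothetical hand-made descriptors: (charge, curvature, hydrophobicity).
# All values are invented; real learned fingerprints are far richer.

def ideal_partner(patch):
    """Return the descriptor a perfectly complementary patch would have.

    Assumption: charge and curvature should be opposite (positive meets
    negative, bump meets pocket), while hydrophobicity should match.
    """
    charge, curvature, hydrophobicity = patch
    return (-charge, -curvature, hydrophobicity)

def mismatch(target, fragment):
    """Euclidean distance between the fragment and the target's ideal partner."""
    return math.dist(ideal_partner(target), fragment)

def rank_fragments(target, fragments):
    """Sort fragment names from most to least complementary to the target."""
    return sorted(fragments, key=lambda name: mismatch(target, fragments[name]))

target_site = (0.8, -0.5, 0.3)    # positively charged, concave pocket
fragment_db = {
    "frag_A": (-0.7, 0.6, 0.2),   # negative and convex: near the ideal partner
    "frag_B": (0.9, -0.4, 0.3),   # almost identical to the target: poor partner
    "frag_C": (-0.1, 0.1, 0.9),   # weakly charged, flat, very hydrophobic
}

ranking = rank_fragments(target_site, fragment_db)
print(ranking)
```

In this sketch the best-ranked fragment is the one whose descriptor lies closest to the target's "ideal partner". In the pipeline described above, fragments found this way would then be grafted onto scaffolds and re-scored before any laboratory testing.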
“The fact that we can design new, site-specific protein binders in just a few months makes this method very attractive for therapy. It’s not just a tool: it’s a pipeline,” said Marchand.
‘Right from the computer’
The researchers were developing protein binders for three major cancer immunotherapy targets when the COVID-19 pandemic hit, so they added the SARS-CoV-2 spike protein to their list. Using their approach, the four binders they produced displayed excellent affinity for their targets.
MaSIF’s success rate, combined with its speed and ability to produce high-quality, site-specific designs, demonstrates its therapeutic potential. For example, the ability to generate accurate protein binders very quickly could be a great advantage in epidemiological contexts, such as in the case of the SARS-CoV-2 spike protein. Marchand also sees the pipeline’s potential to facilitate the development of chimeric antigen receptor (CAR-T) proteins, which can be engineered to allow a patient’s immune cells to target cancer cells.
“Further advances in machine learning methods will help improve our methods, but our work today has provided a strategy for developing innovative therapies that benefit patients through the rapid design of protein-based therapies – right from the computer.”