(Nanowerk News) A research team from Baidu Research has developed an artificial intelligence (AI) algorithm that can rapidly design highly stable COVID-19 mRNA vaccine sequences that were previously unattainable. The algorithm, named LinearDesign, represents a major leap in both the stability and the efficacy of mRNA vaccine sequences, achieving a 128-fold increase in the antibody response of the COVID-19 vaccine.
“This research can extend mRNA-encoded drugs to a wider range of therapeutic proteins, such as monoclonal antibodies and anticancer drugs, promising broad applications and far-reaching impact,” said Dr. He Zhang, Staff Software Engineer at Baidu Research.
Conducted in collaboration with Oregon State University, StemiRNA Therapeutics, and the University of Rochester Medical Center, the study appears in the scientific journal Nature (“Algorithm for Optimized mRNA Design Improves Stability and Immunogenicity”).
The paper shows how complex problems in biology can be tackled with a classic technique from natural language processing (NLP), applying elegantly simple solutions originally developed to understand words and grammar.
mRNA, or messenger RNA, has emerged as a revolutionary technology for developing vaccines and potential treatments against cancer and other diseases. Serving as the messenger that carries genetic instructions from DNA to the protein-making machinery of the cell, mRNA directs the manufacture of specific proteins for various functions in the human body. With multiple advantages in terms of safety, efficacy and production, mRNA was rapidly adopted in COVID-19 vaccine development.
However, the inherent instability of mRNA results in insufficient protein expression, which weakens a vaccine’s capacity to stimulate a strong immune response. This instability also poses challenges for storing and transporting mRNA vaccines, especially in developing countries where resources are often limited.
Previous studies have shown that optimizing the stability of the mRNA secondary structure, when combined with optimal codons, leads to increased protein expression. The challenge lies in the mRNA design space, which is vast because most amino acids can be encoded by several synonymous codons. For example, there are approximately 10^632 mRNAs that can be translated into the same SARS-CoV-2 Spike protein, presenting an insurmountable challenge for previous methods.
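To make the size of that design space concrete, here is a minimal Python sketch that counts how many mRNAs encode a given protein by multiplying the number of synonymous codons at each position. The degeneracy values are those of the standard genetic code; the function names and the four-residue peptide are illustrative assumptions, and the stop codon is ignored.

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code.
DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_synonymous_mrnas(protein):
    """Exact number of coding sequences for a protein (stop codon ignored)."""
    total = 1
    for amino_acid in protein:
        total *= DEGENERACY[amino_acid]
    return total


def log10_design_space(protein):
    """Same count on a log10 scale, convenient for long proteins."""
    return sum(math.log10(DEGENERACY[amino_acid]) for amino_acid in protein)


peptide = "MKLV"  # hypothetical peptide, chosen only for illustration
print(count_synonymous_mrnas(peptide))
print(round(log10_design_space(peptide), 2))
```

Because the count is a product over positions, it grows exponentially with protein length, which is why exhaustive enumeration is out of reach for a protein the size of Spike.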
Although NLP and biology may seem unrelated at first glance, the two fields share a strong mathematical connection. In human language, a sentence consists of a sequence of words and an underlying syntax tree with noun and verb phrases, which together convey meaning. Likewise, an RNA strand has a nucleotide sequence and an associated secondary structure determined by its folding pattern.
The researchers used a language-processing technique called lattice parsing, which represents candidate word sequences in a lattice graph and selects the option that makes the most sense according to the grammar. Similarly, they constructed a graph that compactly represents all candidate mRNAs, using a deterministic finite-state automaton (DFA). With lattice parsing applied to mRNAs, finding the optimal mRNA is akin to identifying the most likely sentence among a set of similar-sounding alternatives.
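The idea of a compact graph over all candidate mRNAs can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the LinearDesign implementation: the codon table covers only four amino acids, the state naming is made up for this example, and a toy score (the number of G and C nucleotides) stands in for the folding-stability and codon objectives that the real algorithm optimizes through lattice parsing.

```python
from collections import defaultdict

# Subset of the standard genetic code, limited to the amino acids used below.
CODONS = {
    "M": ["AUG"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "V": ["GUU", "GUC", "GUA", "GUG"],
}


def build_dfa(protein):
    """Build a DFA whose accepted strings are all mRNAs encoding `protein`.

    A state is (amino-acid index, nucleotides read so far in this codon),
    so codons that share a prefix share states.
    """
    transitions = defaultdict(dict)
    for i, amino_acid in enumerate(protein):
        for codon in CODONS[amino_acid]:
            prefix = ""
            for nucleotide in codon:
                read = prefix + nucleotide
                target = (i + 1, "") if len(read) == 3 else (i, read)
                transitions[(i, prefix)][nucleotide] = target
                prefix = read
    return dict(transitions), (0, ""), (len(protein), "")


def count_paths(transitions, start, final):
    """Count accepted sequences without enumerating them."""
    memo = {final: 1}

    def visit(state):
        if state not in memo:
            memo[state] = sum(visit(t) for t in transitions[state].values())
        return memo[state]

    return visit(start)


def best_path(transitions, start, final, score):
    """Dynamic programming for the highest-scoring sequence in the DFA."""
    memo = {final: (0, "")}

    def visit(state):
        if state not in memo:
            memo[state] = max(
                (score(nt) + visit(t)[0], nt + visit(t)[1])
                for nt, t in transitions[state].items()
            )
        return memo[state]

    return visit(start)


def gc_score(nucleotide):
    """Toy stand-in objective: reward G and C nucleotides."""
    return 1 if nucleotide in "GC" else 0


transitions, start, final = build_dfa("MKLV")
print(count_paths(transitions, start, final))
print(best_path(transitions, start, final, gc_score))
```

The number of states grows linearly with protein length while the number of paths grows exponentially, so dynamic programming over the graph can pick a best sequence without listing every candidate. The real objective depends on how distant nucleotides pair when the strand folds, which is why a grammar-based parser is needed there rather than this per-nucleotide score.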
Using this approach, it took LinearDesign only 11 minutes to generate the most stable mRNA sequence encoding the Spike protein.
In head-to-head comparisons, the sequences designed by LinearDesign showed substantially improved results compared to existing vaccine sequences. For the COVID-19 mRNA vaccine sequence, the algorithm achieved up to a 5-fold increase in stability (mRNA half-life), a 3-fold increase in protein expression levels (within 48 hours), and a remarkable 128-fold increase in antibody response. For the VZV mRNA vaccine sequence, the study reported up to a 6-fold increase in stability (mRNA half-life), a 5.3-fold increase in protein expression levels (within 48 hours), and an 8-fold increase in antibody response.
“Vaccines designed through our method can offer better protection at the same dose, and potentially provide equivalent protection at a smaller dose, causing fewer side effects. This will greatly reduce vaccine research and development costs for biopharmaceutical companies while increasing yields,” Dr. Zhang added. In 2021, Baidu and Sanofi began a partnership to integrate the LinearDesign algorithm into Sanofi’s product design pipeline for mRNA vaccine and drug development.
Baidu has created a bio-computing platform based on PaddlePaddle called PaddleHelix, which includes the ERNIE-Bio-Computing Big Model. This platform explores the application of AI to areas such as small molecules, proteins/peptides, and RNA, offering a new research paradigm for AI in the life sciences. Baidu’s ERNIE Big Model has grown into a comprehensive large-model technology system, covering NLP, vision, cross-modal tasks, and bio-computing. The recently launched ERNIE Bot, a knowledge-enhanced large language model (LLM) capable of understanding and generating human language, is part of the ERNIE Big Model family.
Going forward, Baidu will continue to explore the application of AI in life sciences, broaden the scope and depth of inclusive technologies, and strive for the health and well-being of all mankind.