(Nanowerk News) Enzymes are the molecular factories in biological cells. However, which basic molecular building blocks they use to assemble the target molecule is often unknown and difficult to measure. An international team including bioinformaticists from Heinrich Heine University Düsseldorf (HHU) has now taken an important step forward in this regard: Their AI method predicts with a high degree of accuracy whether an enzyme can work with a particular substrate.
They are now presenting their results in the scientific journal Nature Communications (“Enzyme substrate scope: general predictive models based on machine and deep learning”).
Enzymes are important biocatalysts in all living cells: They facilitate chemical reactions, in which all molecules essential to organisms are produced from basic substances (substrates). Most organisms have thousands of different enzymes, with each one responsible for very specific reactions. The collective function of all enzymes shapes metabolism and thus provides the conditions for the life and survival of organisms.
Although genes that encode enzymes can be easily identified, the exact function of the resulting enzyme is unknown in the majority – more than 99% – of cases. This is because the experimental characterization of an enzyme's function – that is, which initial molecule a particular enzyme converts into which final molecule – is very time-consuming.
Together with colleagues from Sweden and India, a research team led by Professor Dr Martin Lercher from the Computational Cell Biology research group at HHU has developed an AI-based method to predict whether an enzyme can use a particular molecule as a substrate for the reaction it catalyzes.
Professor Lercher: “A special feature of our ESP (‘Enzyme Substrate Prediction’) model is that we are not limited to individual specific enzymes and others closely related to them, as was the case in previous models. Our general model can work with any combination of an enzyme and more than 1,000 different substrates.”
PhD student Alexander Kroll, lead author of the study, has developed a so-called deep learning model in which information about enzymes and substrates is encoded in mathematical structures known as numerical vectors. The vectors of around 18,000 experimentally validated enzyme-substrate pairs – where the enzyme and substrate are known to work together – were used as input to train the deep learning model.
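The idea of encoding enzyme-substrate pairs as numerical vectors and training a classifier on them can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two-dimensional vectors are invented, the non-binding example is made up, and a plain logistic regression stands in for the deep learning model, whose architecture is not described here.

```python
import math

def pair_features(enzyme_vec, substrate_vec):
    """Concatenate an enzyme vector and a substrate vector into one input."""
    return list(enzyme_vec) + list(substrate_vec)

def predict_proba(weights, bias, x):
    """Probability that the pair in x is a true enzyme-substrate pair."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=500, lr=0.5):
    """Fit a logistic regression by stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict_proba(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

# Hypothetical toy vectors (real encodings would be far larger).
enzymes = {"E1": [1.0, 0.0], "E2": [0.0, 1.0]}
substrates = {"S1": [1.0, 0.0], "S2": [0.0, 1.0]}

# Label 1 = pair known to work together, 0 = assumed non-binding pair.
pairs = [("E1", "S1", 1), ("E1", "S2", 1), ("E2", "S2", 1), ("E2", "S1", 0)]

samples = [pair_features(enzymes[e], substrates[s]) for e, s, _ in pairs]
labels = [label for _, _, label in pairs]

weights, bias = train(samples, labels)
predictions = {
    (e, s): predict_proba(weights, bias, pair_features(enzymes[e], substrates[s]))
    for e, s, _ in pairs
}
for (e, s), p in predictions.items():
    print(f"{e} + {s}: probability of being a true pair = {p:.2f}")
```

A linear model on concatenated vectors cannot capture every enzyme-substrate interaction, which is one reason more expressive models are used in practice. The toy data here were chosen so that this simple stand-in can separate them.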
Alexander Kroll: “After training the model in this way, we then apply it to an independent test dataset where we already know the correct answer. In 91% of cases, the model correctly predicted which substrates went with which enzymes.”
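Measuring accuracy on an independent test set amounts to comparing the model's predictions with the known answers and counting the matches. The following minimal sketch shows that calculation. The label lists are hypothetical and are not the study's data.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of test pairs for which the prediction matches the known answer."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("prediction and label lists must have the same length")
    if not true_labels:
        raise ValueError("the test set must not be empty")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)

# Hypothetical held-out test set: 1 = true enzyme-substrate pair, 0 = not a pair.
true_labels = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
predicted_labels = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]  # one mistake, at position 9

test_accuracy = accuracy(predicted_labels, true_labels)
print(f"Test accuracy: {test_accuracy:.0%}")  # 9 of 10 correct
```

The test pairs must not overlap with the training pairs. Otherwise the score would reflect memorization rather than the ability to generalize.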
This method offers a variety of potential applications. In both drug and biotechnology research, it is very important to know which substances can be converted by enzymes. Professor Lercher: “This will enable research and industry to narrow down a large number of possible pairs to the most promising pairs, which they can then use for the enzymatic production of new drugs, chemicals or even biofuels.”
Kroll added: “It will also allow for the creation of better models to simulate cell metabolism. Plus, it will help us understand the physiology of various organisms – from bacteria to humans.”
Apart from Kroll and Lercher, Professor Dr Martin Engqvist from Chalmers University of Technology in Gothenburg, Sweden, and Sahasra Ranjan from the Indian Institute of Technology Bombay in Mumbai were also involved in the research. Engqvist helped design the study, while Ranjan implemented the model that encodes the enzyme information fed into the overall model developed by Kroll.