Structural analysis with machine learning reveals tactics of the SARS-CoV-2 virus
by Dr. Andreas Battenberg
(17.09.2021) The proteins of the SARS-CoV-2 virus play a key role in the ability of the virus to outsmart the human immune system and replicate in patients' cells. An international research team with the participation of the Technical University of Munich (TUM) has now compiled the most comprehensive and detailed overview of all 3D structures of the virus proteins available worldwide. The analysis, which used artificial intelligence methods, yielded surprising findings.
Prof. Dr. Burkhard Rost, Bioinformatics, Technical University of Munich
Photo: Juli Eberle / ediundsepp / TUM
Structure of the SARS-CoV-2 protein NSP1 (blue) in complex with a host ribosome (gray). NSP1 blocks the reading of host mRNA in virus-infected cells.
Photo: Seán O’Donoghue, Garvan Institute of Medical Research
How does the SARS-CoV-2 virus manage to evade the immune system and replicate in the cells of patients? To answer this question, an international research team has compiled the most comprehensive overview to date of all available analyses of the exact three-dimensional shape of the SARS-CoV-2 proteins, including the well-known spike protein. To compile this overview, the team used high-throughput machine learning. This approach makes it possible to predict the structural states of coronavirus proteins based on analyses of related proteins. The database now consists of 2,060 3D models with atomic resolution. All structural models are freely available on the Aquaria-COVID website (https://aquaria.ws/covid).
“This offers an unprecedented wealth of detail that will help researchers better understand the molecular mechanisms of COVID-19 infection and develop therapies to combat the pandemic, for example by identifying potential new targets for future treatments or vaccines,” says Prof. Burkhard Rost, who holds the Chair of Bioinformatics at the Technical University of Munich.
The structural map makes the collected knowledge accessible
In the second part of the study, the team used a complementary approach known as human-in-the-loop machine learning. For this, they created a novel visual interface that brings together everything currently known about the three-dimensional shape of SARS-CoV-2 proteins, and shows what is not yet known.
Researchers can also use the visual interface as a navigation aid to find suitable structural models for specific research questions. Working with the models has already yielded some important clues as to how coronaviruses manage to take control of our cells.
With the help of machine learning algorithms, the team identified three coronavirus proteins (NSP3, NSP13 and NSP16) that “mimic” human proteins and successfully trick the host cells into believing that they are endogenous proteins that work in the best interests of the cell. The modeling also revealed five coronavirus proteins (NSP1, NSP3, spike glycoprotein, envelope protein and ORF9b protein) that “misuse” or disrupt processes in human cells. In this way, the virus manages to take control, complete its life cycle and spread.
Understanding how the virus works – and how to stop it
“When analyzing these structural models, we also found new evidence of how the virus copies its own genome – this is the central process that enables the virus to spread quickly in infected people,” says Burkhard Rost. “The findings from our study bring us closer to understanding how the virus works and what we can do to stop it.”
“The longer the virus circulates, the greater the risk that it will mutate and form new variants such as the Delta strain,” says Sean O’Donoghue, first author of the study and professor at the Garvan Institute in Sydney. “Our resource will help researchers understand how new strains of the virus differ from one another – a piece of the puzzle that we hope will help combat newly emerging strains.”
The research was funded by the Garvan Research Foundation, Sony Foundation Australia, Tour de Cure Australia, Wellcome Trust, Biotechnology and Biological Sciences Research Council, the Federal Ministry of Education and Research (BMBF), and Amazon Web Services (AWS). Researchers from the Garvan Institute of Medical Research, the Commonwealth Scientific and Industrial Research Organization (CSIRO) and the University of New South Wales (Sydney, Australia), the Technical University of Munich (Garching / Munich), the Weihenstephan-Triesdorf University of Applied Sciences (Freising), the University of Dundee (Scotland) and University College London (UCL, UK) were involved.
Sean I. O’Donoghue, Andrea Schafferhans, Neblina Sikta, Christian Stolte, Sandeep Kaur, Bosco K. Ho, Stuart Anderson, James B. Procter, Christian Dallago, Nicola Bordin, Matt Adcock, Burkhard Rost
SARS-CoV-2 structural coverage map reveals viral protein assembly, mimicry, and hijacking mechanisms
Molecular Systems Biology, Sept. 14, 2021 – DOI: 10.15252/msb.202010079