The Covid-19 Proteins: Machine Learning Structural Analysis Reveals SARS-CoV-2 Virus Tactics

The proteins of the SARS-CoV-2 virus play a key role in the ability of the virus to outsmart the human immune system and multiply in patient cells. An international research team with the participation of the Technical University of Munich (TUM) has now compiled the most comprehensive and detailed overview of all 3D structures of virus proteins available worldwide. The analysis using artificial intelligence methods revealed surprising findings.

How does the SARS-CoV-2 virus manage to evade the immune system and replicate in the cells of patients? To answer this question, an international research team has compiled the most comprehensive overview to date of all available analyses of the exact three-dimensional shape of the SARS-CoV-2 proteins, including the well-known spike protein.

To compile this overview, the team used high-throughput machine learning. This approach makes it possible to predict the structural states of coronavirus proteins based on analyses of related proteins. The database now consists of 2,060 3D models with atomic resolution. All structural models are freely available on the Aquaria-COVID website.
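The idea of modeling a protein from its relatives can be illustrated with a toy example. The sketch below is not the team's pipeline, which is not described here; it only shows one simple, commonly used ingredient of such approaches: ranking related proteins by sequence identity to pick a candidate structural template. The sequences, the names `related_protein_A` and `related_protein_B`, and both helper functions are hypothetical and assume the sequences have already been aligned.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical residues between two pre-aligned sequences.

    Positions where either sequence has a gap ("-") are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def best_template(query: str, templates: dict[str, str]) -> tuple[str, float]:
    """Return the name and identity score of the most similar related protein."""
    scored = {name: percent_identity(query, seq) for name, seq in templates.items()}
    name = max(scored, key=scored.get)
    return name, scored[name]


# Hypothetical, made-up sequences for illustration only.
query = "MKTAYIAKQR"
templates = {
    "related_protein_A": "MKTAYLAKQR",
    "related_protein_B": "MRSAY-AKHR",
}

name, identity = best_template(query, templates)
print(f"Best template: {name} ({identity:.1f}% identity)")
```

In practice, a related protein with a known 3D structure and high similarity to the query would serve as the starting point for a model; real pipelines use far more sensitive comparisons than raw identity.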

“This offers an unprecedented wealth of detail that will help researchers better understand the molecular mechanisms of COVID-19 infection and develop therapies to combat the pandemic, for example by identifying potential new targets for future treatments or vaccines,” says Burkhard Rost, holder of the Chair of Bioinformatics at the Technical University of Munich.

The structural map makes the collected knowledge accessible

In the second part of the study, the team used a complementary approach known as human-in-the-loop machine learning. They created a novel visual interface that brings together everything that is currently known about the three-dimensional shape of SARS-CoV-2 proteins – and what is not.

Researchers can also use the visual interface as a navigation aid to find suitable structural models for specific research questions. Working with the models has already given some important clues as to how coronaviruses manage to take command in our cells.

How coronaviruses take command in our cells

Using machine learning algorithms, the team identified three coronavirus proteins (NSP3, NSP13 and NSP16) that “mimic” human proteins and successfully trick the host cells into believing that they are endogenous proteins that work in the best interests of the cell.

The modeling also revealed five coronavirus proteins (NSP1, NSP3, spike glycoprotein, envelope protein and ORF9b protein) that “misuse” or disrupt processes in human cells. In this way, the virus manages to take control, complete its life cycle and spread.

Understanding how the virus works – and how to stop it

“When analyzing these structural models, we also found new evidence of how the virus copies its own genome – this is the central process that enables the virus to spread quickly in infected people,” says Burkhard Rost. “The findings from our study bring us closer to understanding how the virus works and what we can do to stop it.”

“The longer the virus circulates, the greater the risk that it will mutate and form new variants such as the Delta strain,” says Sean O’Donoghue, first author of the study and professor at the Garvan Institute in Sydney. “Our resource will help researchers understand how new strains of the virus differ from one another – a piece of the puzzle that we hope will help combat emerging variants.” (Molecular Systems Biology, 2021; doi: 10.15252/msb.202010079)

Source: Technical University of Munich
