Genetic algorithm application for the prediction of potential SARS-CoV-2 new variant of concern
Maleki A.;Di Salvatore V.;Russo G.;Crispino E.;Pappalardo F.
2022-01-01
Abstract
The COVID-19 pandemic has prompted intense debate over the high transmissibility of the virus and the lack of an effective vaccine covering all existing variants. It has also raised critical questions, such as concerns about new mutations and genetic recombination that could lead to novel variants of concern. The density of mutations observed at different residue positions of the spike protein sequence may correlate with the speed at which the virus spreads. Therefore, accurately determining mutation rates is essential for understanding the evolution of this virus and assessing the risk of emerging infectious disease. The current study uses a genetic algorithm approach to predict mutations that may give rise to new variants of concern. To this end, we randomly mutated the wild-type sequence of the SARS-CoV-2 spike protein to generate an initial population of 100 different sequences, each of which was modelled individually and evaluated by its discrete optimized protein energy score. After applying crossover and breeding 200 new generations, one of the sequences with the lowest discrete optimized protein energy score was identified and selected for further analysis to determine whether it has the potential to become the next variant of concern.
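The genetic algorithm loop outlined in the abstract (random mutation of a wild-type sequence, an initial population of 100, crossover, 200 generations, selection of the lowest-scoring sequence) could be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`mutate`, `crossover`, `evolve`, `toy_score`), the single-point crossover, the choice of keeping the best half of each generation as parents, and the number of mutations per sequence are all hypothetical, since the abstract does not specify them. In particular, `toy_score` is a made-up stand-in for the discrete optimized protein energy score, which in practice requires building a 3D model of each sequence with dedicated modelling software.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutate(seq, n_mutations, rng):
    """Return a copy of seq with n_mutations random point substitutions."""
    residues = list(seq)
    for pos in rng.sample(range(len(residues)), n_mutations):
        # Always substitute a residue different from the current one.
        residues[pos] = rng.choice([a for a in AMINO_ACIDS if a != residues[pos]])
    return "".join(residues)


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two equal-length sequences (assumed scheme)."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def toy_score(seq):
    """Hypothetical placeholder for an energy score (lower is better).

    It has no physical meaning; a real pipeline would model each
    sequence and compute its energy score instead.
    """
    return -sum(seq.count(a) for a in "AILMFVW")


def evolve(wild_type, score, pop_size=100, generations=200, n_mutations=1, seed=0):
    """Run the loop and return the lowest-scoring sequence found."""
    rng = random.Random(seed)
    # Initial population: random mutants of the wild type.
    # Duplicates are possible here; uniqueness is not enforced in this sketch.
    population = [mutate(wild_type, n_mutations, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Assumption: the best-scoring half survives and acts as parents.
        parents = sorted(population, key=score)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), n_mutations, rng))
        population = parents + children
    return min(population, key=score)
```

A call such as `evolve(wild_type_sequence, toy_score)` would use the population size and generation count mentioned in the abstract as defaults. Passing the scoring function as an argument is a deliberate choice in this sketch, so that the placeholder can be swapped for a real structure-based score without touching the loop.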