NLP Based Protein Sequence Classification Through Convolutional Neural Network

Pooja Sharma
Manish Maheshwari

Abstract

Redesigning and modifying proteins is a leading objective in the pharmaceutical industry today. Modern technology has made it possible to redesign proteins efficiently by simulating mutation, natural selection, and amplification in the lab. The number of possible mutations for each protein is astronomically large, so it is impractical to synthesise every sequence or even to examine every variant that could be beneficial. Machine learning is therefore increasingly used to aid protein redesign, because prediction models can evaluate a large number of candidate sequences virtually. However, modern machine learning models, notably deep learning models, remain poorly understood, and few descriptors of protein sequences have been considered. This paper presents a novel classification method for protein sequences that is driven by artificial intelligence. Two distinct single-amino-acid descriptors and one structure-based, three-dimensional descriptor are used to build prediction models, and their effectiveness is compared. Several evaluation metrics were applied across a variety of public and private data sets to assess the accuracy of the predictions. The study's findings indicate that the convolutional neural network models built on amino acid property descriptors are the most relevant to the protein redesign problems encountered in the pharmaceutical industry.
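
To make the idea of a single-amino-acid descriptor feeding a convolutional model concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name the descriptors used, so the one-hot encoding, the Kyte-Doolittle hydropathy scale, the function names, the example sequence, and the averaging kernel are all assumptions chosen for illustration. A trained network would learn its kernels from data.

```python
# Illustrative sketch only. The descriptors, the example sequence, and the
# kernel are assumptions, not the descriptors or network used in the paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Kyte-Doolittle hydropathy scale, used here as a stand-in "property" descriptor.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def one_hot(sequence):
    """Encode each residue as a 20-dimensional indicator vector."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        row[index[residue]] = 1.0
        encoded.append(row)
    return encoded


def property_encode(sequence):
    """Encode each residue by one physicochemical value (one channel)."""
    return [[HYDROPATHY[residue]] for residue in sequence]


def conv1d(features, kernel):
    """Valid 1D convolution (cross-correlation, as in CNN layers).

    features: L positions x C channels; kernel: K positions x C channels.
    Returns L - K + 1 activations, one per window position.
    """
    width = len(kernel)
    activations = []
    for start in range(len(features) - width + 1):
        total = 0.0
        for k in range(width):
            for c in range(len(kernel[k])):
                total += features[start + k][c] * kernel[k][c]
        activations.append(total)
    return activations


sequence = "MKVLA"  # hypothetical toy sequence
x_onehot = one_hot(sequence)        # 5 positions x 20 channels
x_prop = property_encode(sequence)  # 5 positions x 1 channel

# A fixed width-3 averaging kernel stands in for a learned filter.
window_mean = conv1d(x_prop, [[1 / 3], [1 / 3], [1 / 3]])
```

The two encodings differ only in the number of channels per residue, so the same convolution routine applies to both. In practice a library layer such as a 1D convolution in a deep learning framework would replace the hand-written loop.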

Article Details

How to Cite
Pooja Sharma, & Manish Maheshwari. (2024). NLP Based Protein Sequence Classification Through Convolutional Neural Network. Educational Administration: Theory and Practice, 30(1), 1324–1333. https://doi.org/10.53555/kuey.v30i1.6208
Section
Articles
Author Biographies

Pooja Sharma

PhD Research Scholar, MCNUJC, Bhopal

Manish Maheshwari

Professor, MCNUJC, Bhopal