1st International and 10th National Iranian Conference on Bioinformatics
Development of a new oligonucleotide block location-based feature extraction (BLBFE) method for the classification of riboswitches
Paper ID: 1214-ICB10
Authors:
Faegheh Golabi *1, Mousa Shamsi2, Mohammad Hossein Sedaaghi3, Abolfazl Barzegar4, Mohammad Saeid Hejazi5
1Department of Biomedical Engineering, Faculty of Advanced Biomedical Sciences, Tabriz University of Medical Sciences
2Faculty of Biomedical Engineering, Sahand University of Technology
3Faculty of Electrical Engineering, Sahand University of Technology
4Department of Medical Biotechnology, Faculty of Advanced Biomedical Sciences, Tabriz University of Medical Sciences
5Department of Pharmaceutical Biotechnology, Faculty of Pharmacy, Tabriz University of Medical Sciences
Abstract:
As knowledge of genetics and genome elements increases, the demand for bioinformatics tools to analyze these data is rising. Riboswitches are genetic components, usually located in the untranslated regions of mRNAs, that regulate gene expression [1-16]. Additionally, their interaction with antibiotics has recently been suggested, implying a role in antibiotic effects and resistance [9, 10, 17-21]. Following a previously published sequential block finding algorithm [22], we report herein the development of a new block location-based feature extraction (BLBFE) strategy. This procedure uses the locations of family-specific sequential blocks on riboswitch sequences as features. For comparison, the performance of other feature extraction strategies, including mono- and dinucleotide frequencies [23], k-mer [24], DAC, DCC, DACC [25-27], PC-PseDNC-General and SC-PseDNC-General [27, 28] methods, was also investigated [29, 30]. KNN [31], LDA [32], naïve Bayes [33-35], PNN [36] and decision tree [37] classifiers, combined with V-fold cross-validation [38], were applied to every feature extraction method, and the resulting performances were compared. Accuracy, sensitivity, specificity and F-score were evaluated for each feature extraction method [39, 40]. The proposed feature extraction strategy classified riboswitches with an average correct classification rate (CCR) of 90.8%. Furthermore, the results confirmed the performance of the developed feature extraction method, with an average accuracy of 96.1%, an average sensitivity of 90.8%, an average specificity of 97.52% and an average F-score of 90.69%. Our results imply that the proposed BLBFE method can classify and discriminate riboswitch families with high CCR, accuracy, sensitivity, specificity and F-score values.
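The performance measures named in the abstract can be illustrated with a short sketch. It assumes the usual one-vs-rest definitions computed from a multi-class confusion matrix and a plain average over classes; the abstract does not state how the averages were formed, so this is an assumption rather than the authors' procedure. The function names are hypothetical, and any matrix passed in is illustrative data, not results from the paper.

```python
from typing import Dict, List

Matrix = List[List[int]]  # confusion[i][j]: samples of true class i predicted as class j


def correct_classification_rate(confusion: Matrix) -> float:
    """Fraction of all samples that lie on the diagonal (CCR)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_measures(confusion: Matrix) -> List[Dict[str, float]]:
    """One-vs-rest accuracy, sensitivity, specificity and F-score per class."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # class k samples missed
        fp = sum(confusion[i][k] for i in range(n)) - tp  # other classes called k
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f_score = 2 * precision * sensitivity / denom if denom else 0.0
        results.append({
            "accuracy": (tp + tn) / total,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f_score": f_score,
        })
    return results


def macro_average(results: List[Dict[str, float]]) -> Dict[str, float]:
    """Unweighted mean of each measure over the classes (assumed averaging)."""
    return {key: sum(r[key] for r in results) / len(results) for key in results[0]}
```

Under these definitions, when every family has the same number of test sequences the averaged sensitivity equals the CCR, while the one-vs-rest accuracy and specificity are typically higher because they also count true negatives.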
Keywords:
Riboswitches; feature extraction; sequential blocks; block location-based feature extraction; classification; performance measures.
Status: Paper Accepted (Oral Presentation)