1st International and 10th National Iranian Conference on Bioinformatics
Prediction of immunogenic peptides derived from FVIII by machine learning approach
Paper ID : 1445-ICB10
Authors:
Mahsa Bahramimoghadam1, Kaveh Kavousi *2, Mohammadali Mazloumi3, Ali Mohammad BanaeiMoghaddam4, Gholam Ali Kardar5
1Institute of Biochemistry and Biophysics (IBB) University of Tehran
2University of Tehran
3Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Tehran University of Medical Sciences
4Biochemistry, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
5Immunology, Asthma and Allergy Research Center, Tehran University of Medical Sciences
Abstract:
Various factors are involved in the development of an immune response to factor VIII (FVIII) in hemophilia patients, among which T cell epitopes (FVIII-derived peptides) play the most important role. Mutating T cell epitopes while preserving the structure of FVIII could be a good solution to this unwanted immune response. This research is needed because treating patients in whom exogenous FVIII elicits an immune response (inhibitor patients) is much more difficult than treating hemophiliacs who do not have this problem. In addition, the treatment methods and strategies available today are costly, carry complications, and are not very successful. In this study, we build a model using machine learning algorithms to predict the immunogenicity of peptide sequences. We first used compositional features to predict the peptides that bind to class II molecules of the Major Histocompatibility Complex (MHCII), including AAC, APAAC, CKSAAGP, CTDC, CTDT, DPC, GDPC and PAAC. We then evaluated these features with several classifiers: Random Forest, Support Vector Machine, Decision Tree, Naive Bayes, XGBoost, and Perceptron, which reached accuracies of 0.62, 0.51, 0.35, 0.54, 0.58, and 0.66, respectively. In the next step, some of the best features were selected. The resulting accuracies of Random Forest (0.51), Support Vector Machine (0.59), Decision Tree (0.44), Naive Bayes (0.58), XGBoost (0.51), and Perceptron (0.46) were not good enough. The data used in this method cover all types of human HLA-DR, and the features used were the most up-to-date features related to peptide-MHCII binding. We hope to achieve higher accuracy by enhancing them.
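As an illustration of the compositional features named in the abstract, the following minimal sketch computes two of them, AAC (amino acid composition) and DPC (dipeptide composition), using their commonly used definitions. It is not the authors' implementation: the function names `aac`, `dpc`, and `featurize` and the example peptide are hypothetical, and the other descriptors (APAAC, CKSAAGP, CTDC, CTDT, GDPC, PAAC) are not covered.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def aac(peptide):
    """AAC: fraction of each standard residue in the peptide (20 values)."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    return [peptide.count(residue) / len(peptide) for residue in AMINO_ACIDS]


def dpc(peptide):
    """DPC: fraction of each adjacent residue pair in the peptide (400 values)."""
    peptide = peptide.upper()
    n_pairs = len(peptide) - 1
    if n_pairs < 1:
        raise ValueError("peptide must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = peptide[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are ignored
            counts[pair] += 1
    return [counts[pair] / n_pairs for pair in DIPEPTIDES]


def featurize(peptide):
    """Concatenate AAC and DPC into one 420-dimensional feature vector."""
    return aac(peptide) + dpc(peptide)


# Hypothetical toy peptide, used only to show the call; not taken from the study.
features = featurize("ACDA")
```

Vectors built this way, one per peptide, could then be passed to any of the classifiers listed in the abstract; the fixed residue and dipeptide ordering keeps each column aligned across peptides.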
Keywords:
T cell epitopes, immunogenic peptides, prediction, machine learning, FVIII
Status : Paper Accepted (Poster Presentation)