1st International and 10th National Iranian Conference on Bioinformatics
Similarity detection between modern human genome and their ancestors DNA sequences by Deep Learning
Paper ID : 1474-ICB10
Authors:
Keivan Naseri *1, Mahboobeh Golchinpour2
1University of Tehran
2Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran
Abstract:
Neanderthals were a species of archaic human that lived in Europe and parts of western Asia, Central Asia, and northern China (Altai). The earliest evidence of early Neanderthals dates back about 350,000 years in Europe. There is ample genetic evidence that modern humans interbred with Neanderthals, Denisovans, and other archaic relatives.
In this study, we used deep learning to identify regions of Neanderthal introgression in the modern human genome. Existing methods for detecting Neanderthal ancestry in the genome, such as hidden Markov models (HMMs), are memoryless and do not account for long-range dependencies between nucleotides along DNA sequences. We therefore used deep learning to process raw genomic sequences and to capture long-range nucleotide dependencies in genomes with a long short-term memory (LSTM) network.
This model performs better than linear models such as support vector machines (SVMs) or naive Bayes classifiers, so we recommend the LSTM method for analyzing ancient biological data.
We first converted DNA sequences into space-delimited k-mers. We then used a bag-of-words model to compare k-mer frequencies between sequences inherited from Neanderthals and sequences depleted of archaic ancestry. Finally, for sequence classification, we learned word embeddings using a sequential model with the Keras Embedding layer. The model achieved an accuracy of 87.6% on the data set when classifying Neanderthal-introgressed input sequences against depleted ones.
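The preprocessing steps described above can be illustrated with a minimal sketch. It is not the authors' code: the k-mer length (k = 5), the toy sequences, and the helper names `to_kmer_sentence`, `bag_of_words`, and `encode` are all assumptions made for illustration. It shows how a DNA string might be split into space-delimited k-mers, how k-mer counts could be compared between two classes of sequences, and how k-mers could be mapped to integer ids of the kind an embedding layer expects.

```python
from collections import Counter


def to_kmer_sentence(seq, k=5):
    """Slide a window of length k over seq and join the k-mers with spaces."""
    seq = seq.upper()
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))


def bag_of_words(sentences):
    """Count how often each k-mer occurs across a list of k-mer sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.split())
    return counts


def encode(sentence, vocab):
    """Map each k-mer to its integer id; 0 is reserved for unknown k-mers."""
    return [vocab.get(kmer, 0) for kmer in sentence.split()]


# Toy sequences (hypothetical, not taken from the study's data).
class_a = ["ACGTACGTAC", "TTGACGTACG"]  # stands in for one class of sequences
class_b = ["GGGCCCAAAT", "CCCAAATTTG"]  # stands in for the other class

a_sentences = [to_kmer_sentence(s, k=5) for s in class_a]
b_sentences = [to_kmer_sentence(s, k=5) for s in class_b]

# Per-class k-mer frequencies, which can then be compared directly.
a_counts = bag_of_words(a_sentences)
b_counts = bag_of_words(b_sentences)

# Shared vocabulary with ids starting at 1, so 0 stays free for padding/unknown.
all_kmers = sorted(set(a_counts) | set(b_counts))
vocab = {kmer: i + 1 for i, kmer in enumerate(all_kmers)}

# Integer-encoded sequences, suitable as input to an embedding layer.
encoded = [encode(s, vocab) for s in a_sentences + b_sentences]
```

In a full pipeline, the integer-encoded sequences would typically be padded to a common length before being passed to an embedding layer followed by an LSTM; that part is omitted here because the abstract does not specify the architecture.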
In future work, we plan to use the LSTM model to search for similarities between modern humans and their ancestors in genomic data from patients with skin diseases.
Keywords:
Neanderthal genome; DNA-Sequencing; Deep Learning; SVM; LSTM
Status : Paper Accepted (Poster Presentation)