1st International and 10th National Iranian Conference on Bioinformatics
A diffusion kernel-based approach for protein domain identification
Paper ID : 1435-ICB10
Authors:
Amirali Zandieh1, Seyed Peyman Shariatpanahi1, Mohammad Taheri-Ledari2, Changiz Eslahchi *3
1Department of Biophysics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.
2Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran.
3Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
Abstract:
It is almost half a century since the concept of the protein domain, a compact and recurring unit able to fold and function independently, was introduced. Nevertheless, the inherent ambiguity of this definition, together with the growing number of newly solved structures, keeps accurate automated methods in high demand. In contrast to most state-of-the-art methods, we employ enhanced measures of proximity between amino acids rather than developing context-specific clustering algorithms. Here, we investigate the power of kernel functions to separate structural domains in their corresponding Hilbert spaces. For this purpose, we develop a novel pipeline for protein domain assignment that uses four different diffusion kernels on protein graphs. On commonly used benchmark data sets, the presented method performs marginally better than the best available methods according to two different metrics. Moreover, by offering alternative partitionings, our method addresses the problem of subjectivity in protein domain definition. The high prediction accuracy of the approach reveals the potential of diffusion kernels to split the entangled structures of complex proteins. In addition to outperforming other methods while employing only general (rather than context-specific) clustering algorithms, our pipeline offers the flexibility to incorporate other graph node kernels that could further boost its performance.
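To make the idea of a diffusion kernel on a protein graph concrete, the following is a minimal sketch and not the authors' pipeline. The abstract does not name the four kernels used, so the heat kernel K = exp(-beta * L) is assumed here as one common diffusion kernel. The toy contact graph, the value of beta, the seed nodes, and the nearest-seed assignment (standing in for a general clustering algorithm) are all illustrative assumptions.

```python
# Hypothetical illustration: heat diffusion kernel on a toy "residue contact" graph.
# Pure standard-library Python; all names and parameters are assumptions.

def laplacian(n, edges):
    """Combinatorial graph Laplacian L = D - A as a dense list of lists."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L

def matmul(A, B):
    """Dense square matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def heat_kernel(L, beta, squarings=8, terms=12):
    """Approximate K = exp(-beta * L) via Taylor series with scaling and squaring."""
    n = len(L)
    scale = beta / (2 ** squarings)
    M = [[-scale * L[i][j] for j in range(n)] for i in range(n)]
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in K]
    for k in range(1, terms + 1):
        # term becomes M^k / k!
        term = [[v / k for v in row] for row in matmul(term, M)]
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        # exp(M)^(2^squarings) = exp(-beta * L)
        K = matmul(K, K)
    return K

def kernel_distance_sq(K, i, j):
    """Squared distance between nodes i and j in the kernel-induced feature space."""
    return K[i][i] + K[j][j] - 2.0 * K[i][j]

def assign_to_seeds(K, seeds):
    """Label each node with the index of its nearest seed (assumed toy clustering)."""
    return [min(range(len(seeds)), key=lambda s: kernel_distance_sq(K, i, seeds[s]))
            for i in range(len(K))]

# Toy graph: two densely connected groups of four nodes joined by one contact (3-4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4)]
K = heat_kernel(laplacian(8, edges), beta=0.5)
labels = assign_to_seeds(K, seeds=[0, 7])
print(labels)
```

In this sketch the kernel turns graph connectivity into a similarity measure, so the two tightly connected groups end up far apart in the induced feature space even though they share a contact. A real application would build the graph from residue coordinates and replace the nearest-seed step with a general-purpose clustering algorithm.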
Keywords:
Protein structure; Graph node kernel; Protein domain assignment; Clustering; Diffusion kernel
Status : Paper Accepted (Oral Presentation)