1st International and 10th National Iranian Conference on Bioinformatics
Progressively Multiple Protein Sequences Alignment Using Intuitionistic Fuzzy Approach
Paper ID : 1143-ICB10
Authors:
Behzad Hajieghrari *1, Naser Farrokhi2, Mojahed Kamalizadeh3
1Department of Biotechnology, Faculty of Agriculture, Jahrom University
2Department of Cell and Molecular Biology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University
3Department of Biotechnology, Faculty of Agriculture, Jahrom University
Abstract:
The progressive alignment approach is one of the most convenient and effective ways to align multiple sequences. Atanassov extended the fuzzy set by proposing the intuitionistic fuzzy set, in which each element is assigned both a membership (relationship) degree and a non-membership (non-relationship) degree. The sum of these two degrees is less than or equal to one, so the hesitancy degree equals one minus their sum. Traditional hierarchical clustering algorithms are widely used to cluster numerical data, but they need some modification to handle data expressed as intuitionistic fuzzy sets. In this study, we propose a measure of the distance between pairs of protein sequences based on the intuitionistic fuzzy approach and construct the merge tree by hierarchical clustering to improve the sensitivity of progressive multiple sequence alignment. Both unweighted pair group method with arithmetic mean (UPGMA)- and neighbor-joining (NJ)-based hierarchical clustering were employed to evaluate the performance of the algorithm. Merging continues until a single group remains. Finally, the sequences are progressively aligned according to the branching order in the merge tree. Reference sequences from BAliBASE 4.0 (hand-aligned), PREFAB 4.0 (structurally supervised), and OXBench were used to evaluate the method. We assessed alignment quality in terms of the SP-, C-, and TC-scores and compared the results using the Friedman rank test at the P < 0.05 significance level. Both the UPGMA and NJ clustering variants of the proposed method performed well in improving alignment sensitivity and accuracy. Where the sequences are not close to each other, the NJ clustering variant performed more reliably; however, the UPGMA variant was the top performer across all the BAliBASE reference sequence sets. The drawback of this approach is its higher time complexity, with memory usage similar to that of ClustalW.
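As an illustration of the quantities named in the abstract, the following Python sketch computes the hesitancy degree, a distance between two intuitionistic fuzzy sets, and a UPGMA-style merge order. The abstract does not give the authors' distance formula or say how a protein sequence is encoded as an intuitionistic fuzzy set, so the normalized Hamming distance used here, the function names, and the toy (membership, non-membership) profiles are all assumptions made for illustration only, not the authors' implementation.

```python
from itertools import combinations


def hesitancy(mu, nu):
    """Hesitancy degree pi = 1 - (mu + nu) of an intuitionistic fuzzy value."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + 1e-12):
        raise ValueError("need 0 <= mu, nu <= 1 and mu + nu <= 1")
    return max(0.0, 1.0 - mu - nu)


def ifs_distance(a, b):
    """Normalized Hamming distance between two intuitionistic fuzzy sets.

    Each set is a list of (mu, nu) pairs over the same elements.
    ASSUMPTION: this particular distance is chosen for illustration;
    the abstract does not state which distance the authors use.
    """
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        total += abs(mu_a - mu_b) + abs(nu_a - nu_b)
        total += abs(hesitancy(mu_a, nu_a) - hesitancy(mu_b, nu_b))
    return total / (2 * len(a))


def upgma_merge_order(profiles):
    """Average-linkage (UPGMA) agglomeration over named profiles.

    Returns the list of merges, each a pair of clusters (tuples of names),
    in the order they are joined. Merging stops when one cluster remains.
    """
    clusters = [(name,) for name in profiles]

    def average_distance(c1, c2):
        # UPGMA: arithmetic mean of all pairwise member distances.
        pair_sum = sum(
            ifs_distance(profiles[x], profiles[y]) for x in c1 for y in c2
        )
        return pair_sum / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2), key=lambda pair: average_distance(*pair)
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2))
    return merges


# Hypothetical (mu, nu) profiles; real values would come from the sequences.
profiles = {
    "seqA": [(0.8, 0.1), (0.6, 0.3)],
    "seqB": [(0.7, 0.2), (0.6, 0.3)],
    "seqC": [(0.2, 0.7), (0.1, 0.8)],
}
merges = upgma_merge_order(profiles)
for left, right in merges:
    print("merge", left, "with", right)
```

In a progressive aligner, the merge order produced this way would serve as the guide tree: the closest pair is aligned first, and each later merge aligns the corresponding groups.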
Keywords:
Intuitionistic fuzzy approach; Multiple sequence alignment; Progressive alignment.
Status : Paper Accepted (Poster Presentation)