1st International and 10th National Iranian Conference on Bioinformatics
The Application of Feature Weighting Models for Identification of Key Genes Associated with the Transcriptomic Response to Drought Stress in Populus Species
Paper ID : 1198-ICB10
Authors:
Sahar Akrami *, Ahmad Tahmasebi, Ali Niazi
Institute of Biotechnology, Shiraz University, Shiraz, Iran
Abstract:
Introduction: Poplar varieties are planted in short-rotation coppice and are expected to show high biomass production. Drought is a major abiotic stressor. High-throughput gene expression technologies provide valuable information about the transcriptome, and feature weighting models are attractive strategies for gaining new biological insights. At the transcriptome level, algorithms for identifying key signatures related to environmental stress have not yet been applied in Populus. In this study, we used large-scale transcriptome data to gain a comprehensive view of the drought stress response in Populus.
Method: Array expression datasets were retrieved from GEO and ArrayExpress. The RMA algorithm, as implemented in the affy R package, was used for background correction and normalization of the gene expression data. An empirical Bayes method was then applied to correct non-biological differences and remove batch effects from the gene expression datasets, using the ComBat function in the sva R package. Feature selection algorithms were employed to reduce the dimensionality of the expression dataset and identify informative gene expression features. We applied several attribute weighting algorithms, including SVM, Chi Squared, Information Gain, Information Gain Ratio, Deviation, Gini Index, Uncertainty, Relief, and PCA, in RapidMiner Studio to identify the most important genes.
Result: In total, 13 microarray datasets comprising 324 arrays were considered. After pre-processing and batch-effect removal, the normalized datasets were used for downstream analysis. In total, 648 genes were identified as the most important features by at least one of the models. Functional annotation showed that the feature genes were enriched in response to abiotic stimulus and the MAPK signaling pathway. In addition, many genes were related to secondary metabolic processes. Interestingly, seven of the methods selected auxin response factor 2-like and PYL4-like as important features.
Conclusion: Our analysis suggests that the auxin response factor 2-like (ARF2-like) and PYL4-like genes are potential candidates for screening and breeding purposes in Populus.
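The idea of weighting genes with several criteria and then counting how many criteria agree on each gene can be illustrated with a minimal sketch. This is not the RapidMiner workflow used in the study: it is plain Python, it covers only three of the nine criteria (information gain, Gini index, and deviation), and the toy expression values, the gene names, the median-split discretization, and the 0.7 cutoff on normalized weights are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy, invented expression values (one list of samples per gene); not study data.
expression = {
    "gene_A": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],  # separates the classes
    "gene_B": [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 1.9, 2.0],  # nearly constant
    "gene_C": [1.0, 3.0, 1.1, 3.1, 1.0, 3.0, 1.1, 3.1],  # variable, unrelated to class
}
labels = ["control"] * 4 + ["drought"] * 4


def entropy(classes):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(classes)
    return -sum((c / n) * math.log2(c / n) for c in Counter(classes).values())


def gini(classes):
    """Gini impurity of a list of class labels."""
    n = len(classes)
    return 1.0 - sum((c / n) ** 2 for c in Counter(classes).values())


def split_gain(values, classes, impurity):
    """Impurity reduction after splitting the samples at the gene's median value.
    The median split is an assumed discretization, chosen for simplicity."""
    cut = sorted(values)[len(values) // 2]
    low = [y for v, y in zip(values, classes) if v < cut]
    high = [y for v, y in zip(values, classes) if v >= cut]
    n = len(classes)
    weighted = sum(len(p) / n * impurity(p) for p in (low, high) if p)
    return impurity(classes) - weighted


def deviation(values):
    """Population standard deviation of one gene (ignores the class labels)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def min_max(weights):
    """Rescale a {gene: weight} mapping to the range [0, 1]."""
    lo, hi = min(weights.values()), max(weights.values())
    span = hi - lo
    return {g: (w - lo) / span if span else 0.0 for g, w in weights.items()}


scorers = {
    "information_gain": lambda v: split_gain(v, labels, entropy),
    "gini_index": lambda v: split_gain(v, labels, gini),
    "deviation": deviation,
}

THRESHOLD = 0.7  # hypothetical cutoff on normalized weights, not from the abstract

# Count, for each gene, how many scorers give it a normalized weight >= THRESHOLD.
support = {gene: 0 for gene in expression}
for scorer in scorers.values():
    normalized = min_max({g: scorer(v) for g, v in expression.items()})
    for gene, weight in normalized.items():
        if weight >= THRESHOLD:
            support[gene] += 1

for gene, count in sorted(support.items(), key=lambda item: -item[1]):
    print(f"{gene}: selected by {count} of {len(scorers)} scorers")
```

In this toy example the class-separating gene is picked by every scorer, while a gene that is merely variable is picked only by the unsupervised deviation criterion, which is why agreement across several criteria is a more conservative signal than any single weight.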
Keywords:
Populus Species; Feature weighting models; SVM; PCA
Status : Paper Accepted (Poster Presentation)