1st International and 10th National Iranian Conference on Bioinformatics
Identification of genes affecting non-small cell lung cancer using machine learning techniques and bioinformatics tools
Paper ID : 1287-ICB10
Authors:
Marzie Shadpirouz *, Morteza Hadizadeh1, Sadegh Raoufi2
1Physiology Research Center, Institute of Neuropharmacology, Kerman University of Medical Sciences, Kerman, Iran
2Modeling in Health Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
Abstract:
Background & Objective: Lung cancer is the second most common cancer after breast cancer and the leading cause of cancer death in women and men globally [1]. Non-small cell lung cancer (NSCLC) accounts for 85% of lung cancers [2]. Because early detection plays a vital role in treatment, this study sought to identify genes that could be useful in early NSCLC screening.
Material & Methods: Three microarray datasets (GSE1987, GSE44077, and GSE74706) related to non-small cell lung cancer were downloaded from the Gene Expression Omnibus (GEO). After the datasets were integrated and batch effects were removed, Lasso logistic regression was used to select important genes. All data processing was performed in the R statistical programming language. Gene Set Enrichment Analysis (GSEA) was also performed with the Metascape bioinformatics tool to identify enriched KEGG pathways and Gene Ontology terms.
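The gene-selection step can be illustrated with a minimal sketch of L1-penalized (Lasso) logistic regression. This is not the authors' code: the study used R, whereas this sketch is plain Python, and the toy expression matrix, the penalty `lam`, the step size, and all function names are assumptions chosen only for illustration. It fits the model by proximal gradient descent, where a soft-thresholding step drives the coefficients of uninformative genes to exactly zero, so the genes with nonzero coefficients form the selected set.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, lam=0.2, step=0.1, n_iter=500):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    X is a list of samples (rows) by genes (columns); y holds 0/1 labels.
    The intercept is not penalized. lam and step are illustrative values.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= step * grad_b / n
        # Gradient step on the logistic loss, then soft-thresholding.
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_expression(n_samples=60, n_genes=20, informative=(0, 1), seed=0):
    """Synthetic stand-in for an expression matrix (hypothetical data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate normal (0) and tumor (1) samples
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for j in informative:
            row[j] += 2.0 if label else -2.0
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_expression()
weights, intercept = fit_lasso_logistic(X, y)

# Genes whose coefficients survived the L1 penalty are the selected ones.
selected = [j for j, wj in enumerate(weights) if wj != 0.0]

predictions = [
    1 if sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) >= 0.5 else 0
    for row in X
]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print("selected gene indices:", selected)
print("training accuracy:", accuracy)
```

In practice a library implementation with a cross-validated penalty would normally replace this hand-written loop; the sketch only shows why the L1 penalty yields a short gene list rather than a dense coefficient vector.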
Results: The model selected 15 genes (ACVRL1, ANKRD1, C11orf80, CA4, EIF1B, FGF2, GRK5, KLHL18, LILRA1, MME, SDC1, STX11, TMOD1, TTN, WIF1). The model achieved an accuracy of 100%. These genes are related to the Wnt signaling pathway, which plays a significant role in NSCLC [3]. To date, seven of these genes (47%) have been reported in biological studies as relevant to NSCLC [4-10].
Conclusion: Using machine learning techniques and bioinformatics tools, this study identified new candidate genes that may serve as targets for the early diagnosis or treatment of NSCLC.
Keywords:
Non-small cell lung cancer; Gene expression; Gene selection; Machine learning; Lasso logistic regression.
Status : Paper Accepted (Poster Presentation)