1st International and 10th National Iranian Conference on Bioinformatics
Nominal p-value: An Inaccurate Predictor of the Gene Significance
Paper ID : 1150-ICB10
Authors:
Maryam Maghsoudi *
Abstract:
A recent discovery in cancer research revealed that, in some cancer types, many randomly selected genes are significantly associated with patients' survival time. Studies show that in breast cancer this phenomenon is driven by the activity of the proliferation signature, and that removing the effect of this signature from the expression data dramatically reduces the association of a random gene set with survival time. Another study, however, demonstrated that applying a proliferation signature to other breast cancer datasets does not remove this association. We argue that the nominal p-value is not a good estimator of the significance of a random gene set's association with survival time. We therefore define a function that can distinguish a random gene set from published signatures. To do this, we used a random gene set, denoted by X, to divide samples into two groups using principal component analysis (PCA), and compared the survival times of the two groups with a log-rank test to obtain a nominal p-value, pvalue(X). If the nominal p-value was significant, we used significance analysis of microarrays (SAM) to find the genes, denoted by Y, that are differentially expressed between these two groups. Applying to Y the same procedure used for the random gene set X, we obtained a p-value for the set Y, pvalue(Y), and defined the function as $f(X)=\frac{\left|X\cup Y\right|+1}{\left|X\cap Y\right|^{2}+1}\ast\max\left(pvalue(X),pvalue(Y)\right)$. We consider the gene set X to be significantly associated with survival time if f(X) is less than 0.05. Applying this function to 34 different cancer types, the results show that randomly selected genes that are biologically meaningless and unrelated to cancer progression and metastasis are no longer considered significantly associated with patient survival time.
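The scoring function defined in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the name `gene_set_score` and its parameters are hypothetical, and the two p-values are assumed to have been computed beforehand (for example by the PCA split and log-rank test described above), which the sketch does not reproduce.

```python
def gene_set_score(x_genes, y_genes, pvalue_x, pvalue_y):
    """Evaluate f(X) = (|X u Y| + 1) / (|X n Y|^2 + 1) * max(pvalue(X), pvalue(Y)).

    x_genes  : iterable of gene identifiers in the candidate set X
    y_genes  : iterable of gene identifiers in the derived set Y
    pvalue_x : nominal log-rank p-value obtained with X (assumed precomputed)
    pvalue_y : nominal log-rank p-value obtained with Y (assumed precomputed)
    """
    x, y = set(x_genes), set(y_genes)
    union_size = len(x | y)      # |X u Y|
    overlap_size = len(x & y)    # |X n Y|
    # The ratio grows when X and Y share few genes and shrinks when they overlap.
    ratio = (union_size + 1) / (overlap_size ** 2 + 1)
    return ratio * max(pvalue_x, pvalue_y)


def is_significant(score, threshold=0.05):
    """Apply the abstract's decision rule: X is significant if f(X) < 0.05."""
    return score < threshold


# Hypothetical gene names, chosen only to exercise the formula.
disjoint = gene_set_score({"g1", "g2"}, {"g3"}, 0.01, 0.01)
overlapping = gene_set_score({"g1", "g2", "g3", "g4"}, {"g3", "g4", "g5"}, 0.01, 0.01)
```

With identical p-values, the disjoint pair receives a larger (less significant) value than the overlapping pair, which is the behaviour the formula is built to produce. Note that the ratio can exceed 1, so f(X) is a penalized score rather than a probability.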
Keywords:
breast cancer; nominal p-value; principal component analysis; survival time
Status : Paper Accepted (Poster Presentation)