Suggestions/hints for doing GSEA #820

@desmodus1984

Hi,
I have tested some genes for positive selection, and I would like to run an enrichment analysis on them, using all the genes tested for selection as the background.
I tried enrichment analysis with GOfuncR and got one biological process. When I tried enricher in clusterProfiler, I was surprised to get no significant biological or molecular pathway at all. As suggested in the manual, since I did the functional annotation myself, I used buildGOmap. Because my gene universe is smaller than the total number of genes, I filtered the annotation to retain only the genes tested for selection, and I also retained only the GO terms present among them.
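To make the filtering step concrete, here is a minimal base R sketch of what I mean. The table contents, the gene IDs ending in 00000x, and all object names are hypothetical; the only assumption about layout is a two-column table with the GO term first and the gene ID second, as used for TERM2GENE.

```r
# Hypothetical GO-to-gene table (term first, gene second).
term2gene <- data.frame(
  GO   = c("GO:0035608", "GO:0035608", "GO:0071499", "GO:0006276", "GO:0006276"),
  Gene = c("GNX-029059", "GNX-000001", "GNX-037622", "GNX-022108", "GNX-000002"),
  stringsAsFactors = FALSE
)

# Hypothetical universe: only the genes that were tested for selection.
universe <- c("GNX-029059", "GNX-037622", "GNX-022108", "GNX-023602")

# Keep only annotation rows whose gene belongs to the universe.
term2gene_bg <- term2gene[term2gene$Gene %in% universe, ]

# Universe genes that carry at least one GO annotation after filtering.
annotated_universe <- intersect(universe, term2gene_bg$Gene)

# Number of background genes per term after filtering.
term_sizes <- table(term2gene_bg$GO)
```

Filtering this way shrinks the per-term background counts, so very small terms (a handful of genes) become more common in the filtered table.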

Surprisingly, I got no significant pathway or process with enricher. I read that, instead of using enricher, I should perhaps run a gene set enrichment analysis with GSEA.

I read that GSEA needs a ranked list of all the genes; my issue is what to use as the ranking parameter. I did the selection analysis with HyPhy aBSREL, which tests for selection per branch and reports p-values, and I thought that the corrected p-values might be sufficient to rank the genes. It also outputs the LRT statistic for each branch.
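As a rough sketch of the two ranking options, assuming the aBSREL output has been collected into one row per gene/branch pair: the data frame, its column names, and the gene IDs below are all hypothetical, and collapsing branches with max/min is my own assumption, not something aBSREL or clusterProfiler prescribes. The result is a named numeric vector sorted in decreasing order, which is the shape a ranked gene list for GSEA takes.

```r
# Hypothetical per-branch aBSREL results: one row per gene/branch pair.
absrel <- data.frame(
  gene   = c("GNX-A", "GNX-A", "GNX-B", "GNX-B", "GNX-C", "GNX-D", "GNX-E"),
  branch = c("b1", "b2", "b1", "b2", "b1", "b1", "b1"),
  lrt    = c(12.4, 0.3, 1.1, 4.8, 0.0, 25.9, 0.2),
  p_corr = c(0.004, 1.0, 0.9, 0.12, 1.0, 0.00001, 1.0),
  stringsAsFactors = FALSE
)

# Option 1 (assumption): score each gene by its largest per-branch LRT.
gene_lrt <- tapply(absrel$lrt, absrel$gene, max)

# Option 2 (assumption): score each gene by -log10 of its smallest corrected p-value.
gene_logp <- -log10(tapply(absrel$p_corr, absrel$gene, min))

# Ranked list: named numeric vector, sorted from highest to lowest score.
ranked <- sort(setNames(as.numeric(gene_lrt), names(gene_lrt)), decreasing = TRUE)

# Count genes whose -log10(p) score duplicates an earlier gene's score.
n_tied <- sum(duplicated(gene_logp))
```

In this toy data the corrected p-values produce ties (several genes sit at p = 1), while the LRT statistic still separates them, which is one reason I am unsure that corrected p-values alone are a good ranking.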

Perhaps I should optimize the parameters instead. For instance, the gene universe is 13445 genes and the gene set is 92 genes, of which only 86 have annotations.
I ran an analysis with the p-value and q-value cutoffs set to 1 as a test, and I saw that some terms have a low gene ratio (2-3/92).

> head(top_hits_Hydlep_cluster_Sel_TERM, 10)
                   ID                                          Description GeneRatio   BgRatio RichFactor FoldEnrichment    zScore
GO:0035608 GO:0035608                              protein deglutamylation      3/92   7/13445 0.42857143      62.631988 13.538066
GO:0035609 GO:0035609                   C-terminal protein deglutamylation      3/92   7/13445 0.42857143      62.631988 13.538066
GO:0071499 GO:0071499      cellular response to laminar fluid shear stress      3/92  17/13445 0.17647059      25.789642  8.489025
GO:0018410 GO:0018410           C-terminal protein amino acid modification      3/92  19/13445 0.15789474      23.074943  7.992301
GO:0120222 GO:0120222                 regulation of blastocyst development      2/92   5/13445 0.40000000      58.456522 10.665800
GO:0018200 GO:0018200                  peptidyl-glutamic acid modification      3/92  24/13445 0.12500000      18.267663  7.027737
GO:1904375 GO:1904375 regulation of protein localization to cell periphery     10/92 439/13445 0.02277904       3.328959  4.118044
GO:0002536 GO:0002536  respiratory burst involved in inflammatory response      3/92  31/13445 0.09677419      14.142707  6.080724
GO:0004181 GO:0004181                     metallocarboxypeptidase activity      3/92  31/13445 0.09677419      14.142707  6.080724
GO:0006276 GO:0006276                                  plasmid maintenance      2/92   8/13445 0.25000000      36.535326  8.344934
                 pvalue   p.adjust     qvalue
GO:0035608 1.063922e-05 0.09103983 0.09103983
GO:0035609 1.063922e-05 0.09103983 0.09103983
GO:0071499 1.966897e-04 1.00000000 1.00000000
GO:0018410 2.775172e-04 1.00000000 1.00000000
GO:0120222 4.569984e-04 1.00000000 1.00000000
GO:0018200 5.654846e-04 1.00000000 1.00000000
GO:1904375 8.199401e-04 1.00000000 1.00000000
GO:0002536 1.213131e-03 1.00000000 1.00000000
GO:0004181 1.213131e-03 1.00000000 1.00000000
GO:0006276 1.262573e-03 1.00000000 1.00000000
                                                                                                                  geneID Count
GO:0035608                                                                              GNX-029059/GNX-037418/GNX-034376     3
GO:0035609                                                                              GNX-029059/GNX-037418/GNX-034376     3
GO:0071499                                                                              GNX-037622/GNX-025087/GNX-037126     3
GO:0018410                                                                              GNX-029059/GNX-037418/GNX-034376     3
GO:0120222                                                                                         GNX-029059/GNX-037418     2
GO:0018200                                                                              GNX-029059/GNX-037418/GNX-034376     3
GO:1904375 GNX-030347/GNX-022077/GNX-025087/GNX-025668/GNX-024960/GNX-035899/GNX-022108/GNX-024787/GNX-011804/GNX-016902    10
GO:0002536                                                                              GNX-019781/GNX-022624/GNX-023678     3
GO:0004181                                                                              GNX-029059/GNX-037418/GNX-034376     3
GO:0006276                                                                                         GNX-022108/GNX-023602     2
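For reference, the numbers in the first row can be reproduced in base R, assuming a one-sided hypergeometric test and Benjamini-Hochberg adjustment. The total number of tested terms is not shown in the output, so the value of `m_terms` below is only inferred from the ratio of p.adjust to pvalue and should be read as an assumption.

```r
# Counts for GO:0035608, taken from the table above.
k <- 3      # selected genes annotated to the term (GeneRatio numerator)
n <- 92     # selected genes (GeneRatio denominator)
M <- 7      # background genes annotated to the term (BgRatio numerator)
N <- 13445  # background size (BgRatio denominator)

rich_factor     <- k / M              # RichFactor column
fold_enrichment <- (k / n) / (M / N)  # FoldEnrichment column

# One-sided hypergeometric p-value: P(X >= k).
p_hyper <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# The ten raw p-values shown in the table.
p_shown <- c(1.063922e-05, 1.063922e-05, 1.966897e-04, 2.775172e-04,
             4.569984e-04, 5.654846e-04, 8.199401e-04, 1.213131e-03,
             1.213131e-03, 1.262573e-03)

# ASSUMPTION: number of tested terms, inferred from p.adjust / pvalue, not reported.
m_terms <- 17114

# BH adjustment as if all m_terms terms had been tested.
p_bh <- p.adjust(p_shown, method = "BH", n = m_terms)
```

If that inference is right, a raw p-value of about 1e-05 ends up near 0.09 only because of the very large number of terms being tested, many of which have just a few genes in the background.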

Thanks.
