| Title: | 'C++' Implementations of Functional Enrichment Analysis |
|---|---|
| Description: | Fast implementations of functional enrichment analysis methods using 'C++' via 'Rcpp'. Currently provides Over-Representation Analysis (ORA) and Gene Set Enrichment Analysis (GSEA). The multilevel GSEA algorithm is derived from the 'fgsea' package. Methods are described in Subramanian et al. (2005) <doi:10.1073/pnas.0506580102> and Korotkevich et al. (2021) <doi:10.1101/060012>. |
| Authors: | Guangchuang Yu [aut, cre] |
| Maintainer: | Guangchuang Yu <[email protected]> |
| License: | Artistic-2.0 |
| Version: | 0.1.4 |
| Built: | 2026-05-08 09:04:43 UTC |
| Source: | https://github.com/cran/enrichit |
Class "compareClusterResult"
This class represents the comparison result of gene clusters by GO categories at a specific level or by GO enrichment analysis.
Slots:
compareClusterResult: cluster comparison result
geneClusters: a list of genes
fun: one of groupGO, enrichGO and enrichKEGG
gene2Symbol: gene ID to Symbol
keytype: Gene ID type
readable: logical flag indicating whether gene IDs have been converted to symbols
.call: function call
termsim: similarity between terms
method: method of calculating the similarity between nodes
dr: dimension reduction result
organism: organism
Guangchuang Yu https://yulab-smu.top
Common parameters for enrichit functions
| Argument | Description |
|---|---|
| geneList | A named numeric vector of gene statistics (e.g., log fold change), ranked in descending order. |
| gene_sets | A named list of gene sets. Each element is a character vector of genes. |
| nPerm | Number of permutations for p-value calculation (default: 1000). |
| exponent | Weighting exponent for enrichment score (default: 1.0). |
| minGSSize | Minimum size of a gene set to be analyzed. |
| maxGSSize | Maximum size of a gene set to be analyzed. |
| pvalueCutoff | P-value cutoff. |
| pAdjustMethod | P-value adjustment method (e.g., "BH"). |
| verbose | Logical. Print progress messages. |
| gson | A GSON object containing gene set information. |
| method | Permutation method. |
| adaptive | Logical. Use adaptive permutation. |
| minPerm | Minimum permutations for adaptive mode. |
| maxPerm | Maximum permutations for adaptive mode. |
| pvalThreshold | P-value threshold for early stopping. |
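The input shapes described above can be illustrated with base R alone. The following is a minimal sketch, not the package's code: the toy gene names are made up, and `filter_by_size` is a hypothetical helper that only mirrors the idea behind `minGSSize` and `maxGSSize`, assuming set size is counted after intersecting with the genes in `geneList`.

```r
# Toy statistics; gene names and values are made up for illustration.
set.seed(1)
stats <- rnorm(200)
names(stats) <- paste0("Gene", seq_along(stats))

# geneList: named, numeric, sorted in decreasing order.
geneList <- sort(stats, decreasing = TRUE)

# gene_sets: a named list of character vectors.
gene_sets <- list(
  SmallSet = paste0("Gene", 1:5),
  MidSet   = paste0("Gene", 1:30),
  Unmapped = paste0("Other", 1:30)
)

# Hypothetical helper (not part of the package): keep sets whose overlap
# with the ranked genes lies within [minGSSize, maxGSSize].
filter_by_size <- function(gene_sets, genes, minGSSize = 10, maxGSSize = 500) {
  sizes <- vapply(gene_sets, function(gs) sum(unique(gs) %in% genes), integer(1))
  gene_sets[sizes >= minGSSize & sizes <= maxGSSize]
}

kept <- filter_by_size(gene_sets, names(geneList))
names(kept)
```

In this toy case only `MidSet` survives: `SmallSet` is below the lower bound and `Unmapped` shares no genes with the ranked list.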
Class "enrichResult"
This class represents the result of enrichment analysis.
Slots:
result: enrichment analysis result
pvalueCutoff: pvalueCutoff
pAdjustMethod: pvalue adjust method
qvalueCutoff: qvalueCutoff
organism: only "human" supported
ontology: biological ontology
gene: Gene IDs
keytype: Gene ID type
universe: background genes
gene2Symbol: mapping gene to Symbol
geneSets: gene sets
readable: logical flag indicating whether gene IDs have been converted to symbols
termsim: similarity between terms
method: method of calculating the similarity between nodes
dr: dimension reduction result
Guangchuang Yu https://yulab-smu.top
mapping gene ID to gene Symbol
EXTID2NAME(OrgDb, geneID, keytype, toType = "SYMBOL")
| Argument | Description |
|---|---|
| OrgDb | OrgDb |
| geneID | entrez gene ID |
| keytype | keytype |
| toType | ID type of the output |
gene symbol
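The mapping idea can be sketched without an annotation database. In the snippet below, `id2symbol` is a hypothetical lookup vector standing in for an OrgDb query; how `EXTID2NAME` itself treats unmatched IDs is not stated here, so the `NA` result is a property of this sketch only.

```r
# Hypothetical lookup table standing in for an OrgDb query
# (Entrez gene ID -> gene symbol).
id2symbol <- c("7157" = "TP53", "672" = "BRCA1", "1956" = "EGFR")

gene_ids <- c("672", "7157", "99999")

# Indexing a named vector by ID returns the symbol, or NA when the ID
# is not present in the lookup table.
symbols <- unname(id2symbol[gene_ids])
symbols
```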
Guangchuang Yu https://yulab-smu.top
geneID generic
geneID(x)
| Argument | Description |
|---|---|
| x | enrichResult object |

'geneID' returns the 'geneID' column of the enrichment result, which can be converted to a data.frame via 'as.data.frame'.
data(geneList, package="DOSE")
de <- names(geneList)[1:100]
x <- DOSE::enrichDO(de)
geneID(x)
geneInCategory generic
geneInCategory(x)
| Argument | Description |
|---|---|
| x | enrichResult |

'geneInCategory' returns a list of genes, obtained by splitting the input gene vector into the enriched functional categories.
data(geneList, package="DOSE")
de <- names(geneList)[1:100]
x <- DOSE::enrichDO(de)
geneInCategory(x)
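To make the relationship between a 'geneID' column and a per-category gene list concrete, here is a small base R sketch. It assumes gene IDs are stored as a single '/'-separated string per term; the table `res` and its values are made up.

```r
# Hypothetical result table; term IDs and gene IDs are made up.
res <- data.frame(
  ID = c("TermA", "TermB"),
  geneID = c("1001/1002/1003", "1002/1004"),
  stringsAsFactors = FALSE
)

# Split each '/'-separated string into a character vector and name
# every list element by its term ID.
genes_by_term <- setNames(strsplit(res$geneID, "/", fixed = TRUE), res$ID)
genes_by_term$TermB
```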
Perform Gene Set Enrichment Analysis (GSEA) using a ranked gene list.
gsea(
  geneList, gene_sets,
  minGSSize = 10, maxGSSize = 500,
  nPerm = 1000, exponent = 1,
  method = "multilevel", adaptive = FALSE,
  minPerm = 101, maxPerm = 1e+05, pvalThreshold = 0.1,
  eps = 1e-10, sampleSize = 101, seed = FALSE,
  nPermSimple = 1000, scoreType = "std", verbose = TRUE
)
| Argument | Description |
|---|---|
| geneList | A named numeric vector of gene statistics (e.g., log fold change), ranked in descending order. |
| gene_sets | A named list of gene sets. Each element is a character vector of genes. |
| minGSSize | Minimum size of a gene set to be analyzed (default: 10). |
| maxGSSize | Maximum size of a gene set to be analyzed (default: 500). |
| nPerm | Number of permutations for p-value calculation (default: 1000). |
| exponent | Weighting exponent for enrichment score (default: 1.0). |
| method | Permutation method (default: "multilevel"). |
| adaptive | Logical. Use adaptive permutation (default: FALSE). |
| minPerm | Minimum permutations for adaptive mode (default: 101). |
| maxPerm | Maximum permutations for adaptive mode (default: 1e+05). |
| pvalThreshold | P-value threshold for early stopping (default: 0.1). |
| eps | Epsilon for multilevel methods (default: 1e-10). Sets the smallest p-value that can be estimated. |
| sampleSize | Sample size for multilevel methods (default: 101). |
| seed | Random seed for reproducibility (default: FALSE). If FALSE, a random seed is generated. |
| nPermSimple | Number of permutations for the simple method (default: 1000). |
| scoreType | Type of enrichment score calculation: "std", "pos", "neg" (default: "std"). |
| verbose | Logical. Print progress messages (default: TRUE). |
A data.frame with columns:
ID: Gene set name
enrichmentScore: Enrichment Score
NES: Normalized Enrichment Score
pvalue: Empirical p-value from permutation test
setSize: Size of the gene set (number of genes found in geneList)
nPerm: (adaptive mode only) Actual number of permutations used
rank: Rank at which the maximum enrichment score is attained
leading_edge: Leading edge statistics (tags, list, signal)
core_enrichment: Genes in the leading edge, separated by '/'
# Example data
stats <- rnorm(1000)
names(stats) <- paste0("Gene", 1:1000)
stats <- sort(stats, decreasing = TRUE)
gs1 <- paste0("Gene", 1:50)
gs2 <- paste0("Gene", 500:550)
gene_sets <- list(Pathway1 = gs1, Pathway2 = gs2)
# Use default fixed permutation method
result <- gsea(geneList=stats, gene_sets=gene_sets, nPerm=100)
# Use adaptive permutation for more accurate p-values
result_adaptive <- gsea(geneList=stats, gene_sets=gene_sets, adaptive=TRUE)
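For readers who want to see what an empirical permutation p-value and a normalized enrichment score amount to, the following base R sketch may help. It is a plain gene-label permutation test written for illustration, not the package's multilevel or adaptive algorithm; `es_score` is a hypothetical helper, and the +1 correction in the p-value is an assumption of this sketch.

```r
# Hypothetical helper: weighted Kolmogorov-Smirnov-like enrichment score.
# `stats` must be sorted in decreasing order; `in_set` flags set members.
es_score <- function(stats, in_set, exponent = 1) {
  hit <- abs(stats)^exponent * in_set
  step <- hit / sum(hit) - (!in_set) / sum(!in_set)
  running <- cumsum(step)
  running[which.max(abs(running))]   # maximum deviation from zero
}

set.seed(42)
stats <- sort(rnorm(500), decreasing = TRUE)
names(stats) <- paste0("Gene", seq_along(stats))
in_set <- names(stats) %in% paste0("Gene", 1:25)   # the 25 top-ranked genes

observed <- es_score(stats, in_set)

# Null distribution: same set size, random membership.
n_perm <- 200
null_es <- replicate(n_perm, es_score(stats, sample(in_set)))

# Empirical p-value for a positive score (with a +1 correction), and the
# score normalized by the mean of the positive null scores.
pvalue <- (sum(null_es >= observed) + 1) / (n_perm + 1)
nes <- observed / mean(null_es[null_es > 0])
```

With a fixed number of permutations the smallest attainable p-value is 1 / (n_perm + 1), which is the resolution limit that the `eps` argument addresses for the multilevel method.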
generic function for gene set enrichment analysis
gsea_gson(
  geneList, gson,
  nPerm = 1000, exponent = 1,
  minGSSize = 10, maxGSSize = 500,
  pvalueCutoff = 0.05, pAdjustMethod = "BH",
  method = "multilevel", adaptive = FALSE,
  minPerm = 101, maxPerm = 1e+05, pvalThreshold = 0.1,
  verbose = TRUE, ...
)
| Argument | Description |
|---|---|
| geneList | A named numeric vector of gene statistics (e.g., log fold change), ranked in descending order. |
| gson | A GSON object containing gene set information. |
| nPerm | Number of permutations for p-value calculation (default: 1000). |
| exponent | Weighting exponent for enrichment score (default: 1.0). |
| minGSSize | Minimum size of a gene set to be analyzed (default: 10). |
| maxGSSize | Maximum size of a gene set to be analyzed (default: 500). |
| pvalueCutoff | P-value cutoff (default: 0.05). |
| pAdjustMethod | P-value adjustment method (default: "BH"). |
| method | Permutation method (default: "multilevel"). |
| adaptive | Logical. Use adaptive permutation (default: FALSE). |
| minPerm | Minimum permutations for adaptive mode (default: 101). |
| maxPerm | Maximum permutations for adaptive mode (default: 1e+05). |
| pvalThreshold | P-value threshold for early stopping (default: 0.1). |
| verbose | Logical. Print progress messages (default: TRUE). |
| ... | Additional parameters passed to gsea() |
gseaResult object
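The roles of `pAdjustMethod` and `pvalueCutoff` can be illustrated with base R's `p.adjust`. This is only a sketch: the table below is made up, and whether the package applies the cutoff to raw or adjusted p-values is not stated here, so filtering on the adjusted column is an assumption.

```r
# Made-up p-values for five gene sets.
res <- data.frame(
  ID = paste0("Set", 1:5),
  pvalue = c(0.001, 0.010, 0.020, 0.200, 0.800),
  stringsAsFactors = FALSE
)

# Benjamini-Hochberg adjustment, as selected by pAdjustMethod = "BH".
res$p.adjust <- p.adjust(res$pvalue, method = "BH")

# Assumed filtering step: keep rows at or below the cutoff.
pvalueCutoff <- 0.05
sig <- res[res$p.adjust <= pvalueCutoff, ]
sig$ID
```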
Guangchuang Yu
Class "gseaResult"
This class represents the result of GSEA analysis.
Slots:
result: GSEA analysis
organism: organism
setType: setType
geneSets: geneSets
geneList: ordered, ranked geneList
keytype: ID type of gene
permScores: permutation scores
params: parameters
gene2Symbol: gene ID to Symbol
readable: whether to convert gene ID to symbol
dr: dimension reduction result
Guangchuang Yu https://yulab-smu.top
Calculate GSEA Running Enrichment Scores
gseaScores(geneList, geneSet, exponent = 1, fortify = FALSE)
| Argument | Description |
|---|---|
| geneList | a named numeric vector of gene statistics (e.g., t-statistics or log-fold changes), sorted in decreasing order. |
| geneSet | a character vector of gene IDs belonging to the gene set. |
| exponent | a numeric value defining the weight of the running enrichment score. Default is 1. |
| fortify | logical. If TRUE, returns a data frame with columns |
If fortify = TRUE, a data frame containing the running enrichment scores and positions.
If fortify = FALSE, a numeric value representing the Enrichment Score (ES).
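The running enrichment score from Subramanian et al. (2005) can be written in a few lines of base R. The function below is a sketch under stated assumptions, not the package's C++ implementation: `running_es` is a hypothetical name, and the data frame columns `x`, `runningScore` and `position` are chosen for illustration only.

```r
# Hypothetical re-implementation of the weighted running-sum statistic.
running_es <- function(geneList, geneSet, exponent = 1, fortify = FALSE) {
  in_set <- names(geneList) %in% geneSet
  hit <- abs(geneList)^exponent * in_set
  p_hit <- cumsum(hit) / sum(hit)              # weighted fraction of hits so far
  p_miss <- cumsum(!in_set) / sum(!in_set)     # fraction of misses so far
  running <- unname(p_hit - p_miss)
  if (fortify) {
    # Column names are assumptions of this sketch.
    return(data.frame(x = seq_along(running),
                      runningScore = running,
                      position = as.integer(in_set)))
  }
  running[which.max(abs(running))]             # ES: maximum deviation from zero
}

geneList <- c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2)
es <- running_es(geneList, c("g1", "g2"))
curve <- running_es(geneList, c("g1", "g2"), fortify = TRUE)
```

Because both set members sit at the top of the ranking, the running score climbs to its maximum after the second gene and then decays to zero at the end of the list.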
Guangchuang Yu
filter enriched result by gene set size or gene count
gsfilter(x, by = "GSSize", min = NA, max = NA)
| Argument | Description |
|---|---|
| x | instance of enrichResult or compareClusterResult |
| by | one of 'GSSize' or 'Count' |
| min | minimal size |
| max | maximal size |

updated object
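The filtering logic can be pictured on a plain data frame. The sketch below assumes a result table with `GSSize` and `Count` columns and treats `NA` as "no bound"; `filter_rows` is a hypothetical stand-in and does not operate on enrichResult objects.

```r
# Made-up result table with gene set sizes and gene counts.
res <- data.frame(
  ID = c("A", "B", "C"),
  GSSize = c(8, 40, 600),
  Count = c(3, 12, 50),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose `by` column lies within [min, max];
# an NA bound is ignored.
filter_rows <- function(df, by = "GSSize", min = NA, max = NA) {
  keep <- rep(TRUE, nrow(df))
  if (!is.na(min)) keep <- keep & df[[by]] >= min
  if (!is.na(max)) keep <- keep & df[[by]] <= max
  df[keep, , drop = FALSE]
}

by_size <- filter_rows(res, by = "GSSize", min = 10, max = 500)
by_count <- filter_rows(res, by = "Count", min = 10)
```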
Guangchuang Yu
Perform over-representation analysis using hypergeometric test (Fisher's exact test).
ora(gene, gene_sets, universe)
| Argument | Description |
|---|---|
| gene | Character vector of differentially expressed genes (or gene list of interest). |
| gene_sets | A named list of gene sets. Each element is a character vector of genes. |
| universe | Character vector of background genes (e.g., all genes in the platform). |
A data.frame with columns:
| Column | Description |
|---|---|
| GeneSet | Gene set name |
| SetSize | Number of genes in the gene set (intersected with universe) |
| DEInSet | Number of differentially expressed genes in the gene set |
| DESize | Total number of differentially expressed genes in universe |
| PValue | Raw p-value from hypergeometric test |
# Example data
de_genes <- c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5")
all_genes <- paste0("Gene", 1:1000)
gs1 <- paste0("Gene", 1:50)
gs2 <- paste0("Gene", 51:150)
gs3 <- paste0("Gene", 151:300)
gene_sets <- list(Pathway1 = gs1, Pathway2 = gs2, Pathway3 = gs3)
result <- ora(gene=de_genes, gene_sets=gene_sets, universe=all_genes)
head(result)
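The hypergeometric test behind these columns can be reproduced with base R's `phyper`. The function below is a sketch, not the package's C++ code: `ora_sketch` is a hypothetical name, and it assumes an upper-tail test (the probability of observing at least `DEInSet` hits) with both the gene list and the gene sets intersected with the universe.

```r
# Hypothetical base R version of an over-representation test.
ora_sketch <- function(gene, gene_sets, universe) {
  universe <- unique(universe)
  gene <- intersect(unique(gene), universe)
  rows <- lapply(names(gene_sets), function(nm) {
    gs <- intersect(gene_sets[[nm]], universe)
    k <- sum(gene %in% gs)
    data.frame(
      GeneSet = nm,
      SetSize = length(gs),
      DEInSet = k,
      DESize = length(gene),
      # P(X >= k) when drawing DESize genes from the universe
      PValue = phyper(k - 1, length(gs), length(universe) - length(gs),
                      length(gene), lower.tail = FALSE),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

de_genes <- paste0("Gene", 1:5)
all_genes <- paste0("Gene", 1:1000)
gene_sets <- list(
  Pathway1 = paste0("Gene", 1:50),
  Pathway2 = paste0("Gene", 51:150),
  Pathway3 = paste0("Gene", 151:300)
)
res <- ora_sketch(de_genes, gene_sets, all_genes)
```

For Pathway1 all five genes of interest fall inside the set, so the p-value equals `choose(50, 5) / choose(1000, 5)`. For the other two sets there is no overlap and the upper-tail probability is 1.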
internal method for enrichment analysis
ora_gson(
  gene, pvalueCutoff,
  pAdjustMethod = "BH", universe = NULL,
  minGSSize = 10, maxGSSize = 500,
  qvalueCutoff = 0.2, gson
)
| Argument | Description |
|---|---|
| gene | a vector of Entrez gene IDs. |
| pvalueCutoff | P-value cutoff. |
| pAdjustMethod | P-value adjustment method (default: "BH"). |
| universe | background genes, default is the intersection of the 'universe' with genes that have annotations. Users can set |
| minGSSize | Minimum size of a gene set to be analyzed (default: 10). |
| maxGSSize | Maximum size of a gene set to be analyzed (default: 500). |
| qvalueCutoff | q-value cutoff (default: 0.2). |
| gson | A GSON object containing gene set information. |

using the hypergeometric model
An enrichResult instance.
Guangchuang Yu https://yulab-smu.top
mapping geneID to gene Symbol
setReadable(x, OrgDb, keyType = "auto", toType = "SYMBOL")
| Argument | Description |
|---|---|
| x | enrichResult Object |
| OrgDb | OrgDb |
| keyType | keyType of gene |
| toType | ID type of the output |
enrichResult Object
Guangchuang Yu
show method for gseaResult instance
show method for enrichResult instance
show(object)
| Argument | Description |
|---|---|
| object | A |
message
message
Guangchuang Yu https://yulab-smu.top
summary method for gseaResult instance
summary method for enrichResult instance
summary(object, ...)
| Argument | Description |
|---|---|
| object | A |
| ... | additional parameter |
A data frame
A data frame
Guangchuang Yu https://yulab-smu.top