PAVER

Overview

PAVER is an R package for interpreting and visualizing pathway analysis results. PAVER uses embedding representations and hierarchical clustering to identify and characterize similar pathways in a given dataset. PAVER is designed to be used with any GSEA tool that produces a ranked list of pathways, like GSEA or Enrichr.

Data Preparation

PAVER requires 3 inputs: a ranked list of pathways, a matrix of pathway embeddings, and a mapping of pathway IDs to pathway names. The following code chunk demonstrates how to prepare these inputs using the example data provided in the PAVER package.

library(PAVER)

input <- gsea_example

embeddings <- readRDS(url("https://github.com/willgryan/PAVER_embeddings/raw/main/2023-03-06/embeddings_2023-03-06.RDS"))

term2name <- readRDS(url("https://github.com/willgryan/PAVER_embeddings/raw/main/2023-03-06/term2name_2023-03-06.RDS"))

PAVER_result <- prepare_data(input, embeddings, term2name)

Identifying and Naming Pathways Clusters

After preparing your data, PAVER can generate a set of pathway clusters and identify the most representative pathway (theme) for each cluster. The following code chunk demonstrates how to generate pathway clusters using the example data provided in the PAVER package. To constrain the pathway clustering, we pass the following arguments to (dynamicTreeCut)[https://cran.r-project.org/package=dynamicTreeCut]. Increasing minClusterSize will result in fewer clusters, while increasing maxCoreScatter will result in more clusters.

minClusterSize <- 3
maxCoreScatter <- 0.33
minGap <- (1 - maxCoreScatter) * 3 / 4
PAVER_result <- generate_themes(
  PAVER_result,
  maxCoreScatter = maxCoreScatter,
  minGap = minGap,
  minClusterSize = minClusterSize
)

Visualization

PAVER offers different visualizations for exploring and interpreting pathway clusters, described below.

Theme Plot

The theme plot is a scatter plot showing all pathways in the dataset, colored by theme. The theme plot is useful for identifying pathways that are similar to each other, and for identifying pathways that are outliers. The theme plot can be generated using the PAVER_theme_plot function.

PAVER_theme_plot(PAVER_result)

Interpretation Plot

The interpretation plot is a scatter plot showing all clusters in the dataset, colored by theme. The interpretation plot is useful for identifying clusters that are similar to each other, and for identifying clusters that are outliers. The interpretation plot can be generated using the PAVER_interpretation_plot function.

PAVER_interpretation_plot(PAVER_result)

Regulation Plot

The regulation plot is similar to the theme plot, except each pathway is colored by whether it is upregulated or downregulated. The regulation plot is useful for qualitatively identifying differences in pathway regulation across different pathway analyses. The regulation plot can be generated using the PAVER_regulation_plot function.

PAVER_regulation_plot(PAVER_result)

Heatmap Plot

The heatmap plot shows the enrichment scores for each pathway in each cluster. The heatmap plot is useful for quantitatively identifying identifying differences in pathway regulation across different pathway analyses. The heatmap plot can be generated using the PAVER_hunter_plot function.

PAVER_hunter_plot(PAVER_result)

Combined Plot

The combined plot simply combines all four plots for ease of use and quick inspection of output. The combined plot can be generated using the PAVER_combined_plot function.

PAVER_combined_plot(PAVER_result)

Export Results

PAVER offers a function for exporting the clustering results of a PAVER analysis.

clustered_input <- PAVER_export(PAVER_result)