An interactive R Shiny application for RNA-seq data analysis with Spark backend support. It offers high-performance differentially expressed gene (DEG) analysis, real-time visualization (volcano plots, heatmaps), PCA, and functional enrichment (GO, KEGG, GSEA) in a modular, reactive framework.
This platform is designed to integrate seamlessly with the Atgenomix Seqslab platform. Spark connections and delta table queries are executed through the Seqslab DataHub, making the app ideal for large-scale biomedical applications in cloud-based environments.
RNAseqShinyAppSpark allows users to explore and analyze large-scale RNA-seq expression datasets directly from Spark-based delta tables. The app is optimized for both public demo usage and internal data upload scenarios, with full support for asynchronous computation, gene filtering, and biological insight discovery through visual analytics.
- Interactive DEG Analysis – Built-in pipelines for differential expression using `edgeR` and `limma`, with adjustable fold change and p-value thresholds.
- Dynamic Visualization – Real-time rendering of volcano plots, violin plots, scatter plots, and heatmaps.
- Gene Set Enrichment – GO (BP, CC, MF), KEGG, and GSEA visualization using `clusterProfiler` and `enrichplot`.
- PCA Explorer – Perform and visualize Principal Component Analysis with optional clustering.
- Modular UI – Shiny modules for plot controls, filtering, and enrichment make extension easy.
- Asynchronous Execution – Responsive UI powered by `future_promise` and task progress popups.
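To make the adjustable fold change and p-value thresholds concrete, here is a minimal base-R sketch of how a DEG table could be labeled. The column names (`logFC`, `PValue`) follow the usual `edgeR` output convention, and the `classify_deg` helper, the toy genes, and the default cutoffs are all hypothetical; the app's own filtering logic may differ.

```r
# Toy DEG table. Column names mimic edgeR-style output (an assumption);
# the gene names and values are made up for illustration.
deg <- data.frame(
  gene   = c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"),
  logFC  = c(2.5, -1.8, 0.3, -0.2, 1.2),
  PValue = c(0.001, 0.02, 0.5, 0.8, 0.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label each gene as "Up", "Down", or "NS" (not significant).
classify_deg <- function(deg, fc_cutoff = 2, p_cutoff = 0.05) {
  lfc_cutoff <- log2(fc_cutoff)  # thresholds are applied on the log2 scale
  significant <- deg$PValue < p_cutoff & abs(deg$logFC) >= lfc_cutoff
  deg$status <- ifelse(!significant, "NS",
                       ifelse(deg$logFC > 0, "Up", "Down"))
  deg
}

result <- classify_deg(deg, fc_cutoff = 2, p_cutoff = 0.05)
print(result)
```

Changing `fc_cutoff` or `p_cutoff` and re-running `classify_deg` corresponds to moving the threshold controls in the UI; the `status` column is the kind of label a volcano plot would typically color by.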
The app manages all Spark and data access processes behind the scenes, so users can focus entirely on analysis and visualization.
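As an illustration of what PCA with optional clustering involves, the sketch below uses only base R (`prcomp` and `kmeans`) on simulated data. It is a stand-in under stated assumptions, not the app's implementation, which may rely on packages such as `factoextra` or `pcaPP`; the matrix, sample names, and the choice of two clusters are all made up.

```r
set.seed(1)  # reproducible simulated data

# Simulated log-expression matrix: 50 genes x 6 samples (not real data).
expr <- matrix(
  rnorm(50 * 6), nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("sample", 1:6))
)
# Shift the first 10 genes in samples 4-6 so that two sample groups exist.
expr[1:10, 4:6] <- expr[1:10, 4:6] + 5

# prcomp() expects observations (samples) in rows, so transpose first.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Optional clustering on the first two principal components.
clusters <- kmeans(pca$x[, 1:2], centers = 2, nstart = 10)$cluster

scores <- data.frame(
  sample  = rownames(pca$x),
  PC1     = pca$x[, 1],
  PC2     = pca$x[, 2],
  cluster = clusters
)
print(scores)
print(round(var_explained[1:2], 3))
```

The `scores` data frame holds what a PCA scatter plot needs: one point per sample, positioned by the first two components and colored by cluster.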
- R >= 4.1
- Apache Spark >= 3.0 with delta table support
- The following R packages are either required or suggested based on the `DESCRIPTION` file:
# CRAN packages (BiocManager is needed to install the Bioconductor packages below)
install.packages(c(
  "shiny", "sparklyr", "promises", "future", "DT", "ggplot2", "shinyjs",
  "shinycssloaders", "factoextra", "ggiraph", "shinydashboard", "pcaPP",
  "dplyr", "tidyr", "viridis", "reshape2", "stringr", "readr", "readxl",
  "tibble", "RColorBrewer", "pheatmap", "ggrepel", "bslib", "BiocManager"
))
# Bioconductor packages are not on CRAN, so install.packages() cannot find them
BiocManager::install(c(
  "InteractiveComplexHeatmap", "clusterProfiler", "enrichplot",
  "org.Hs.eg.db", "org.Mm.eg.db", "MultiAssayExperiment", "SummarizedExperiment"
))
BiocManager::install(c(
  "airway", "edgeR", "limma", "DESeq2", "ComplexHeatmap"
))
RNAseqShinyAppSpark()
The application will automatically connect to the configured Spark cluster and initialize required datasets via Seqslab DataHub.
- Select Spark database (`*_cus_username`).
- Automatically loads:
  - `normcounts_*` table (normalized expression)
  - `exacttest_*` table (DEG results)
  - `coldata_*` table (sample metadata)
- Visualize:
  - DEG tables and download CSV
  - Volcano + violin + scatter plots
  - Heatmaps by gene list
  - GO/KEGG enrichment
  - GSEA analysis (up/down-regulated)
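The automatic loading step depends on locating tables by name prefix. The base-R sketch below shows one way such prefix matching could work. The three prefixes come from the list above; the `find_tables` helper, the example table names, and the error handling are hypothetical, and in the app the table listing would come from the selected Spark database rather than a hard-coded vector.

```r
# Hypothetical table listing for one database (names are made up).
tables <- c("normcounts_run01", "exacttest_run01", "coldata_run01", "qc_summary")

# Glob patterns for the three table types named in this README.
patterns <- c(
  normcounts = "normcounts_*",  # normalized expression
  exacttest  = "exacttest_*",   # DEG results
  coldata    = "coldata_*"      # sample metadata
)

# Hypothetical helper: return the table names matching each glob pattern.
find_tables <- function(tables, patterns) {
  lapply(patterns, function(p) grep(glob2rx(p), tables, value = TRUE))
}

matched <- find_tables(tables, patterns)

# Fail early if any required table type is absent from the database.
missing <- names(matched)[lengths(matched) == 0]
if (length(missing) > 0) {
  stop("No table found for: ", paste(missing, collapse = ", "))
}

str(matched)
```

`glob2rx()` converts the wildcard patterns into anchored regular expressions, so a table such as `old_normcounts_x` would not be picked up by the `normcounts_*` pattern.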
.
├── RNAseqShinyApp.R # Main Shiny app (UI + server)
├── mod_enrichment.R # GSEA module (GO/KEGG support)
├── mod_volcanoplot.R # Volcano, scatter, violin plot module
├── mod_sample_selection.R # Sample filtering and gene selector module
├── mod_progress_popup.R # Popup progress UI module
├── mod_spark.R # Spark connection + DB browser module
├── plot_enrichment.R # GO/KEGG enrichment plotting
├── plot_volcano.R # Static and interactive volcano plotting utils
├── plot_heatmap.R # Heatmap generation from MAE
├── plot_pca.R # PCA plot with clustering
├── spark_query.R # Async Spark query and pattern matching
├── utils.R # GTF conversion, expr comparison, helper functions
We welcome contributions via pull requests or issues. To contribute:
- Fork this repo
- Create a new branch (`feature/your-feature`)
- Test your changes
- Submit a PR
This project is licensed under the Apache License 2.0.
Copyright © 2025 Charles Chuang, atgenomix.
See the LICENSE file for details.
Please create a GitHub Issue for bug reports, questions, or feature requests.