Introduction

It is well established that transcription of many genes occurs in bursts1,2, resulting in large mRNA number variability between cells, even if they are from the same underlying state. The importance of cell-to-cell gene expression variability in determining and shaping cell function has been recognized for some time3,4,5. Increased variability in gene expression has been associated with the differentiation of embryonic stem cells6, reprogramming of induced pluripotent stem cells7, circadian rhythm regulation8,9, differentiation10,11, and aging12,13. In the single-cell omics era, investigators now have a higher resolution tool such as single-cell RNA sequencing (scRNA-seq) to reveal gene expression variability between cells, allowing us to better understand its role. In a recent study14, we demonstrated that single-cell gene expression variability is intrinsically linked to the function of genes. Individual cells of the same cell type exhibit stochastic gene expression beyond technical noise. This variability is primarily driven by highly variable genes (HVGs), which may or may not be highly expressed in all cells but show significant variation between cells and predominantly contribute to cell type-specific functions. Given that variability is central to cellular function, as our findings suggest, developing analytical methods and computational tools for single-cell data should take a variability-centric view. Unfortunately, in practice, the importance of variability is often overlooked. Many single-cell data analytical frameworks are mean-centric and neglect gene expression variability by design. Significant refinement is required in many methods developed under non-variability-centric principles.

Differential expression (DE) analysis has been a mainstay of gene expression studies since the days of microarray and bulk RNA-seq. DE analysis focuses on identifying genes that are up- or down-regulated (with increased or decreased expression) across conditions, using a mean-difference approach, which is also applied to scRNA-seq data using R packages like DESeq2, EdgeR, and limma15. Despite the development of single-cell-specific DE methods16, these do not consistently outperform bulk-oriented methods17. These methods often consider cell‐to‐cell variability and dropout as mere technical noise, overlooking the crucial biological insights embedded within this variability. Recent studies have demonstrated that dropout events can be as biologically informative as expression levels18,19, further suggesting that DE methods may miss key aspects of gene expression regulation. Moreover, inconsistent identification of DE genes across existing methods20, underscores a need for approaches that capture the full spectrum of biologically relevant expression variability.

Here, we advocate for moving beyond methods that focus solely on detecting mean expression differences in scRNA-seq data to embrace gene expression variability as a crucial dimension of cellular biology. We acknowledge the central role of gene expression variability, challenging the current dominance of mean-based DE analysis in single-cell studies. This shift becomes particularly urgent in light of recent efforts, such as the study by Squair et al.21 which proposes analyzing pooled, pseudobulk scRNA-seq data to mimic bulk RNA-seq for DE analysis. While such approaches may seem appealing, they disregard scRNA-seq’s primary strength—capturing cell-to-cell variability. Critically, Squair et al.21 use bulk RNA-seq DE results as a reference to evaluate the performance of scRNA-seq DE methods, reinforcing a mean-centric view that considers bulk RNA-seq as the gold standard, despite its limitations in capturing cellular variability. Thus, the assumption that the pseudobulk method accurately recovers bulk RNA-seq expression patterns needs further investigation, as these techniques likely reveal distinct biological phenomena. Mean-centric thinking in molecular biology due to the legacy of microarray and bulk RNA-seq analysis poses a risk of obscuring key biological phenomena when applied to scRNA-seq. Such a focus may miss critical insights that a variability-driven approach could uncover. We propose that prioritizing variability in gene expression offers equally if not more, valuable insights into cellular function by capturing a larger spectrum of biological heterogeneity.

Results

Illustrating the “variation-is-function” concept

The “variation-is-function” hypothesis proposed by Dueck et al.4 posits that cell-to-cell gene expression variability is key to population-level cellular functions. Our previous study14 empirically supports this hypothesis, showing that highly variable genes (HVGs) in homogeneous cellular populations participate in biological processes and provide molecular functions specific to their respective cell types. While both mean-based and variability-based analyses may identify enriched functions, our findings suggest that variability-based approaches may be particularly effective in capturing the functional relevance of within-cell-type variation. Interestingly, most HVGs are not highly expressed, whereas highly expressed genes—such as housekeeping genes and marker genes—tend not to convey cell-type specific functions. Figure 1 illustrates the “variation-is-function” concept through a scatter plot, where genes are represented using three summary statistics: mean, coefficient of variation (CV), and dropout rate. We applied a spline-fit algorithm from scGEAToolbox22 to generate a curve for all genes, identifying HVGs as those that deviate most from this curve. In the simulated data (Fig. 1A), no HVGs emerge, as all genes align closely along the spline-fit curve. By contrast, human embryonic stem cells (hESCs) and human umbilical vein endothelial cells (hUVECs) show gene expression variability with the level increased with differentiation—undifferentiated hESCs display fewer detected HVGs (Fig. 1B), while differentiated HUVECs exhibit higher variability, more HVGs (Fig. 1C). As undifferentiated cells (e.g., hESCs) differentiate into more distinct, mature forms (e.g., HUVECs), more genes are expressed increasingly variably and become HVGs. One of such genes is ANKRD1 (shown in Fig. 1C), encoding the ankyrin repeat domain 1 transcription factor, which localizes to the nucleus of endothelial cells to play its role.

Fig. 1: Single-cell gene expression variability, driven by highly variable genes (HVGs), increases with cell differentiation level.
figure 1

A Simulated data: a 5000-by-3000 gene-by-cell count matrix, randomly generated using a negative binomial model with a fixed dispersion level of 0.1 (“Methods”). Three 2D-scatter plots show the pairwise relationships between gene-level metrics (mean, CV, and dropout rate), which are combined into a 3D-scatter plot where genes (blue dots) align along a characteristic spline-fit curve (red line). Each gene’s position is defined by its mean, CV, and dropout rate. B Human embryonic stem cell (hESC) data from scRNA-seq51 showing gene expression variability in a 3D-scatter plot, with a spline-fit curve (red line) capturing the global pattern of the three metrics. C Human umbilical vein endothelial cell (HUVEC) data representing a differentiated cell type. Compared to undifferentiated hESCs in (B), HUVECs display more HVGs and increased gene expression variability. One HVG, ANKRD1, is highlighted in the 3D-scatter plot as an example.

Identification of differentially variable genes

We introduce the spline-DV method—a nonparametric, model-free framework for analyzing changes in single-cell gene expression variability between two conditions. The aim is to identify genes showing differential variability (DV), which are functionally more active or transcriptionally more engaged in one condition than the other. By focusing on variability changes independent of mean gene expression, DV analysis provides a new, distinct perspective on cellular state transitions across conditions. Figure 2 illustrates the spline-DV method. This approach uses three gene-level metrics—mean expression, CV, and dropout rate as x, y, and z coordinates, respectively, to create a 3D model for estimating gene expression variability. Within this 3D space, two spline-fit curves are generated for two conditions independently (Fig. 2A) and merged and visualized together for comparative assessment (Fig. 2B). For a given gene (e.g., Gene A in Fig. 2A, B), its position in this 3D space is determined by the observed mean expression, CV, and dropout rate. A vector originating at the nearest point on the spline curve to the gene’s position represents the gene’s deviation from the expected expression statistics. For two conditions, two vectors \({\vec{v}}_{1}\) and \({\vec{v}}_{2}\) are defined. The magnitude of each vector was used to quantify the level of deviation, i.e., the expression variability of the gene in each condition. To obtain the level of DV between two conditions, the difference between \({\vec{v}}_{1}\) and \({\vec{v}}_{2}\) is computed (Fig. 2C). The resulting DV vector, \(\vec{{dv}}={\vec{v}}_{2}-{\vec{v}}_{1}\), captures the difference in variability between two conditions. In this way, a fair comparison between conditions is made by first comparing each gene to its expected statistics rather than making a direct comparison. The magnitude of \(\vec{{dv}}\) is called DV score, which is used to quantify the level of DV of a gene between two conditions (Methods). Finally, spline-DV ranked the list of genes based on their DV scores. Top DV genes across conditions are prioritized for further investigation as they likely play crucial roles in the biological processes under study.

Fig. 2: Illustration of the spline-DV method proposed in this study.
figure 2

A Two spline-fit curves are created independently with scRNA-seq data of two conditions, blue for condition 1 and red for condition 2. These spline curves represent the “expected” expression profile for genes based on their statistical properties in each condition. B Two spline-fit curves are projected into the same 3D space for comparison. For a given gene such as Gene A, its position is compared with respect to the condition 1 spline-fit curve. The vector \({\vec{v}}_{1}\) extends from the closest point on the blue curve (spline-fit) to the blue point (Gene A in condition 1). Similarly, for condition 2 (red point), the vector \({\vec{v}}_{2}\) extends from the closest point on the red curve to the red point (Gene A in condition 2). The L2-norm of \({\vec{v}}_{1}\) (or \({\vec{v}}_{2})\) quantifies the deviation of the same Gene A from its expectation in condition 1 (or condition 2). C Given that the deviation is the shortest Euclidean distance from the blue (or red) point to the blue (or red) spline-fit curve, representing the level of expression variability of the gene in condition 1 (or condition 2), \(\vec{{dv}}={\vec{v}}_{2}-{\vec{v}}_{1}\), is used to measure the level of differential variability. The L2-norm of \(\vec{{dv}}\) is called DV score. Note that, in B, \({\vec{v}}_{1}\) and \({\vec{v}}_{2}\) have their own origin with respect to their respective conditions; in C, \({\vec{v}}_{1}\) and \({\vec{v}}_{2}\) share the same origin to compute \(\vec{{dv}}\).

Simulated data analysis for method validation

To evaluate the performance of spline-DV, we first simulated scRNA-seq data with controlled variability modifications. Simulated data was chosen to mimic real-world scRNA-seq challenges, including sparse expression patterns, variations in dropout rates, and relevant changes in variability. This controlled environment allowed us to rigorously test spline-DV’s ability to detect DV under conditions representative of actual biological systems. A gene-by-cell matrix was generated with 500 genes and 1000 cells, divided into two equal groups (A and B). Ten genes were then randomly selected, with seven assigned increased variability and three assigned decreased variability in group B compared to group A. Variability modifications included scaling expression values to emulate biologically relevant changes in variability, incorporating changes in the CV and dropout rates (“Methods”). Spline-DV successfully identified all “ground-truth” DV genes, highlighting its accuracy and robustness (Supplementary Fig. 1).

Case studies

Case study 1—More insightful genes identified in adipocytes in response to diet-induced obesity

We applied the spline-DV method to real scRNA-seq data sets to assess its performance. Our first case study utilized a scRNA-seq data set from a study on diet-induced obesity23, in which adipocytes dissociated from adipose tissues in mice fed either low-fat diet (LFD) or high-fat diet (HFD) for 18 weeks (Fig. 3A) were collected and compared (Fig. 3B). Using the spline-DV, we identified 249 DV genes (Supplementary Table 1), showing differential variability between the two conditions. The top genes include Plpp1, Thrsp, Blcap, Nnat, and Lyz2 (Fig. 3C). Supplementary videos showcase the differential deviation of Plpp1 and Thrsp from spline curves across two conditions (Supplementary Videos 1 and 2). Plpp1, which encodes a protein of the phosphatidic acid phosphatase family, showed increased variability in the HFD sample (Fig. 3D). Plpp1 deletion increases endogenous lipid lysophosphatidic acid concentrations in hepatocytes and reduces glucose production24. Thrsp, which encodes a thyroid hormone-inducible hepatic protein, exhibited decreased variability under HFD conditions (Fig. 3D). Thrsp deletion decreases mitochondrial respiration and fatty acid oxidation is known to contribute to metabolic dysfunction in obese adipose tissue25. Functional enrichment analysis revealed that DV genes are enriched in core adipocyte pathways related to lipid metabolism, insulin response, and fatty acid biosynthesis (Supplementary Table 2). In particular, DV analysis uniquely identified Hadh in the fatty acid biosynthesis pathway, a gene not flagged by DE analysis (see Section “Limited overlap between DV and DE genes” for more details). Hadh, encoding hydroxyacyl-CoA dehydrogenase, regulates fatty acid breakdown and tryptophan metabolism (Fig. 3E). Ptgis and Nr1h3, which are critical regulators of adipogenesis, lipid metabolism, and inflammation, and Acsl1, Slc27a1, Nr1h3, which are PPAR signaling genes essential for fatty acid transport and metabolism, were also identified.

Fig. 3: Application of the spline-DV method in analyzing scRNA-seq data from a mouse nutrition study.
figure 3

A Experimental design for the scRNA-seq study involving low-fat diet (LFD) and high-fat diet (HFD) mice23. Illustration created using BioRender. B T-SNE representation of HFD and LFD adipocyte cells. C List of top DV genes ranked by DV score, with normal font indicating genes with decreased variability and bold font indicating genes with increased variability in HFD samples. Two genes highlighted with underlined font are selected as representatives. The column D.V. gives the DV score values. D HFD-LFD spline-DV variance representation of gene statistics (mean, CV, and dropout rate) for the two selected genes, Plpp1 and Thrsp. Supplementary Videos 1 and 2 are provided to visualize the differential deviation of two genes from the spline-curves. E Pathway-gene network plot highlighting significant pathways enriched among HFD-LFD DV genes in adipocyte cells, including glycerolipids & phospholipids, triglyceride synthesis, adipogenesis genes, fatty acid biosynthesis, fatty acid beta-oxidation, omega-9 fatty acid synthesis, and PPAR signaling pathway.

Case study 2 – Enhanced knowledge of gene activity in hepatic stellate cells with simulated chronic liver fibrosis

Our second case study for the spline-DV method utilized a scRNA-seq data set from a study focusing on potential drivers of liver fibrosis26. The study was designed with a well-established chronic injury model using carbon tetrachloride administration in mice (Fig. 4A). Carbon tetrachloride effectively mimics human centrilobular fibrosis, allowing researchers to investigate disease mechanisms. Using this publicly available data set, we compared the healthy control and 6-week chronic injury hepatic stellate cell samples (Fig. 4B). We identified 205 DV genes (Supplementary Table 3) including Acta2, Gpx3, and Dpt (Fig. 4C). DV genes identified using the spline-DV method were more representative of genes associated with the progression of fibrosis, in contrast to DE genes that were not. For example, DV genes, Col1a1, Col3a1, Col6a3, and Col8a1, which are directly linked to fibrosis progression, were not captured by DE. The products of these collagen genes are the major protein component of the extracellular matrix that accumulates excessively in fibrosis27. Additionally, we identified DV genes encoding enzymes critical for extracellular matrix remodeling, including Mmp2, Mmp14, Adamts4, and Adamts5, which are involved in the proteolytic processes that regulate fibrosis progression28. The presence of Timp1 and Timp3, which encode tissue inhibitors of metalloproteinases, indicates a regulatory mechanism that balances extracellular matrix degradation and synthesis, crucial in fibrosis progression28. Furthermore, Gas6, involved in cell survival and inflammation, showed increased variability, suggesting its role in modulating hepatic stellate cell activation and inflammation in the fibrotic liver29. Functional enrichment analysis revealed that DV genes were enriched in core fibrotic processes, particularly in collagen fibril organization, extracellular matrix organization, and regulation of fibroblast proliferation (Fig. 4D, Supplementary Table 4). For instance, genes like Pdgfra, Tgfb3, and Csf1, which are crucial for hepatic stellate cell proliferation and activation during liver fibrosis, were uniquely identified as DV genes, demonstrating the spline-DV method’s ability to capture the dynamic changes in gene expression variability that occur in response to chronic injury. Other DV unique genes of interest included Loxl1, which encodes an enzyme involved in collagen cross-linking, playing a vital role in maintaining extracellular matrix integrity and stability30,31, and Clec11a, which has been implicated in cell proliferation and differentiation. The increased variability of these genes indicates their potential roles in the activation and function of hepatic stellate cells during fibrosis development.

Fig. 4: Application of the spline-DV method in analyzing scRNA-seq data from a chronic liver fibrosis study.
figure 4

A Experimental design for the scRNA-seq study with chronic liver fibrosis induced by carbon tetrachloride in mice26. Illustration created using BioRender. B T-SNE representation of hepatic stellate cells from healthy control (orange) and chronic injury (blue) samples, illustrating clustering by sample condition. C List of top DV genes ranked by DV score, with normal font indicating genes with decreased variability and bold font indicating genes with increased variability in fibrosis samples. D Pathway-gene network plot highlighting significant pathways enriched among chronic injury-healthy DV genes in hepatic stellate cells, including regulation of cell population proliferation, extracellular matrix organization, positive regulation of fibroblast proliferation, and collagen fibril organization.

Case study 3—Deeper understanding of driver genes in epithelial cells affected by colorectal cancer

Our third case study utilized a scRNA-seq data set from a study focused on the malignant transformation of colorectal cancer (CRC)32 (Fig. 5A). In comparing epithelial cells from unaffected and cancerous patient samples (Fig. 5B), 197 DV genes (Supplementary Table 5) were identified. These include genes associated with CRC progression such as MACF1, SMOC2, and FGFR2 (Fig. 5C), which are critical in cell adhesion, extracellular matrix remodeling, and fibroblast growth factor signaling—all processes known to support cancer cell proliferation and metastatic potential33,34,35,36. Additionally, functional enrichment analysis revealed DV genes represented by JAG1, GMDS, SORBS2, and RBPJ are involved in Notch signaling, the pathway known to enhance colonic epithelial cell survival and promote tumorigenesis37. The complete list of enriched pathways information is provided in Supplementary Table 6. Notably, high JAG1 expression correlates with poor post-surgical prognosis in CRC patients38. Furthermore, DV analysis revealed distinct genes and nuanced regulatory functions that are missed by DE analysis (see Section “Limited overlap between DV and DE genes” for more details). For instance, DV analysis identified genes (CEACAM1, PDE3B, SORBS1, LPIN2) involved in insulin response, which is relevant to CRC as insulin signaling is implicated in cell growth, energy metabolism, and survival of cancer cells39. Similarly, the DV-specific gene, TMSB4X, is known to be involved in ATP biosynthetic and metabolic processes, in line with the increased energy demands and metabolic shifts characteristic of cancer cells40. Additional DV-specific genes, such as RNF213, RUBCNL, and RORA, were enriched in lipid metabolism pathways, further supporting the role of altered lipid metabolism in CRC progression by providing essential structural components for rapidly dividing cells and influencing intracellular signaling41. In immune-related pathways, VAV3 and MALT1 in antigen receptor-mediated signaling were identified, indicating potential interactions between cancer cells and immune modulation in the tumor microenvironment. This DV-derived gene-level insight refines our understanding of CRC biology, revealing unique targets across critical regulatory functions—including ERBB signaling, the ERK1/ERK2 cascade, protein kinase activity, and cell motility—while also mapping metabolic and immune pathways integral to the complex network of CRC progression and metastasis.

Fig. 5: Application of the spline-DV method in analyzing scRNA-seq data from a colorectal cancer (CRC) study.
figure 5

A Experimental design for the scRNA-seq study involving malignant transformation of CRC32. Illustration created using BioRender. B T-SNE representation of epithelial cells from CRC (orange) and unaffected (green) samples, illustrating clustering by sample condition. C List of top DV genes ranked by DV score, with normal font indicating genes with decreased variability and bold font indicating genes with increased variability in CRC samples. D Pathway-gene network plot highlighting significant pathways enriched among CRC-unaffected DV genes in epithelial cells, including regulation of lipid metabolic process, response to insulin, antigen receptor-mediated signaling pathway, and ATP biosynthetic and metabolic processes.

Overlap between DV and DE genes

In this section, we conducted an overlap analysis between DV and DE genes identified from the same data sets. We included three different DE methods, namely the Wilcoxon rank-sum test, DESeq242, and MAST43, and applied them to the data sets of three case studies. In each comparison, overlap between the top 200 DV genes and the top 200 up- or down-regulated DE genes was shown (Fig. 6). While DE and DV analyses revealed overlapping genes, the numbers were limited—in most cases, regardless of data sets and DE methods, less than 25% of DV genes were also identified as top up- or down-regulated genes. The limited overlap between DE and DV genes is not unexpected as it reflects the analysis quantity difference between the two methods. Supplementary Table 7 provides complete lists of genes that were exclusively identified as significant DV genes but not DE genes by the three DE methods. These unique DV genes highlighted the distinct roles of gene expression variability in shaping respective cellular diversity and provided unique insights into gene functions and cellular states. For example, in the case of diet-induced obesity in mouse adipocytes (Fig. 6A), there are 123, 126, and 121 exclusive DV genes when compared to DE genes identified by the Wilcoxon rank-sum test, DESeq2, and MAST, respectively. They are enriched in significant pathways including inflammatory response pathway, extracellular matrix remodeling, fatty acid biosynthesis, and white fat cell differentiation. The data of case studies 2 (Fig. 6B) and 3 (Fig. 6C) revealed a similarly large proportion of unique DV genes along with respective significant pathways. These findings support the enhanced performance of DV genes, which capture a broader functional spectrum across comparisons using spline-DV’s non-mean-centric analysis.

Fig. 6: Limited overlap between DV genes and DE genes across three case studies.
figure 6

Venn diagrams showing the overlap between DV genes identified by spline-DV and DE genes identified by three statistical approaches: Wilcoxon Rank-Sum Test, DESeq2, and MAST. In each panel, DE genes are split into down-regulated (DE Down) and up-regulated (DE Up) groups and overlap with DV genes is depicted separately. The numbers represent the counts of genes unique to each category and those shared across categories. Panels AC represent results from three different data sets of case studies: A diet-induced obesity, B liver fibrosis, and C colorectal cancer.

Spline-DV leverages dropout rate to gain additional insights

While dropout can arise from technical factors, it also captures meaningful biological signals, such as biologically relevant sparsity in gene expression18,19. This dual role makes dropout a critical dimension for understanding cellular heterogeneity, and spline-DV uniquely leverages this property to detect genes with sparse yet functionally significant expression patterns. To evaluate the added value of incorporating dropout as a third dimension in spline-DV, we compared its results to the spline-2d method, which is a truncated version of spline-DV that models gene variability using only mean and CV. Across three case studies, spline-DV consistently identified biologically significant genes that were missed by spline-2d. These findings highlight the importance of integrating dropout to capture condition-specific variability patterns in single-cell data.

In case study 1, spline-DV identified genes such as SERPINE1, CD74, and CCL2, which play pivotal roles in inflammatory signaling within adipose tissue. These genes were not detected by spline-2d because their mean and CV values overlapped with other genes, making them indistinguishable in a 2D space. Similarly, in case study 2, spline-DV uniquely identified key fibrosis-associated genes like CXCL10 and MMP14, which are critical for immune modulation and extracellular matrix remodeling in liver fibrosis. These genes had sparse yet biologically relevant expression patterns that were resolved only through the incorporation of dropout. Finally, in case study 3, dropout incorporation allowed for the detection of PPARG and CDH17, genes critical for colorectal cancer progression. These genes, with sparse but biologically relevant expression patterns, were indistinguishable using spline-2d. Importantly, the genes identified with dropout were highly enriched in condition-specific pathways, such as pro-inflammatory signaling, fibrosis progression, and metabolic reprogramming, underscoring their functional relevance.

Spline-DV and BASiCS

We further compared the results obtained from spline-DV with those derived from BASiCS44, which is another established DV analysis method. BASiCS takes batch information and uses a Bayesian hierarchical model to estimate expression variability; DV is measured by residual dispersion distance, which quantifies how far a gene’s variability deviates from a global mean/over-dispersion trend, comparing the degree of variability between two samples.

We benchmarked BASiCS and spline-DV on the diet-induced obesity and colorectal cancer data sets. The BASiCS-based differential expression analysis with default parameters (see Methods) identified only a small number of genes; 1179 genes were identified in the diet-induced obesity data set (915 common with spline-DV) and 1194 genes were identified in the colorectal cancer data set (762 common with spline-DV). For these genes that had a significant residual dispersion using BASiCS, we observed strong agreement in gene rankings between BASiCS (based on residual dispersion distances) and spline-DV (based on DV scores) for the genes they had in common (Supplementary Fig. 2). Next, we performed an enrichment analysis with the top 200 unique DV genes from spline-DV that were not identified by BASiCS and found biologically relevant pathways for each experiment (Supplementary Tables 8 and 9). For example, Regulation of Angiogenesis and Regulation of Smooth Muscle Cell Proliferation were significantly enriched for the colorectal cancer data set. Acetyl-CoA Metabolic Process, Lipid Biosynthetic Process, and Fatty Acid Metabolic Process were significantly enriched for the diet-induced obesity data set. These confirm that unique genes captured by spline-DV are functionally relevant.

More importantly, we found that, for a given gene, residual dispersion measured in BASiCS is equivalent to the DV score in spline-DV. The difference is that spline-DV assumes no prior distribution and uses separate trend estimates for each condition. Residual dispersion in BASiCS models variability relative to a global mean/over-dispersion trend, which can overlook condition-specific differences. Such differences are often key to understanding dynamic gene regulation under distinct biological conditions. In contrast, spline-DV uses separate trend estimates for each condition, ensuring that the unique variability patterns specific to each condition are captured. This allows spline-DV to identify genes whose variability may be masked by the assumptions of a global trend in BASiCS. By incorporating mean, CV, and dropout, spline-DV captures a multi-dimensional view of gene expression variability, which better reflects biological heterogeneity than single-dimensional metrics like residual dispersion. The enrichment of unique DV genes in functionally relevant pathways further supports this approach as biologically meaningful.

Notably, spline-DV demonstrated faster runtimes, processing the diet-induced obesity data set in 3.7 sec and the colorectal cancer data set in 5.6 s on a 14-core Intel Core i5-13500 CPU (2.50 GHz) with 32 GB of RAM. In contrast, the same tasks took BASiCS more than 11 h for processing the diet-induced obesity data set, and more than 44 for the colorectal cancer data set on a computer cluster (“Methods”).

Spline-DV method is robust against sampling bias

Last, to assess the stability of the spline-DV method, we performed a sampling analysis experiment. Our goal was to evaluate how much of the DV estimate between cells from two different conditions would fluctuate if, instead, it was based on cells sampled from the same condition. We used a subsampling strategy. For cells from two conditions, for example, condition A and condition B, we subsampled cells randomly into A-1 and A-2 as well as B-1 and B-2. Each contains half of the cells from their original group. Then spline-DV was applied to cells within group (i.e., A-1 vs. A-2 and B-1 vs. B-2) as well as between groups (e.g., A-1 vs. B-1 or A-1 vs. B-2). This process was repeated 100 times to capture variability across random permutations of cells and account for inherent variability within the samples. The violin plots in Fig. 7 show the differences between within-group and between-group estimates of DV scores for top DV genes. With data sets from all three case studies, the violin plots demonstrate a clear separation between within-group and between-group estimates. High DV scores indicate greater DV levels, with “between” scores exhibiting larger spreads than “within” scores, suggesting greater DV levels between groups than within. The differences in DV scores highlight genes with significant changes in expression variability from the difference between conditions, allowing for the identification of condition-specific variability patterns in gene expression. Together, these results indicate that the spline-DV method effectively captures condition-specific DV with neglectable sampling bias in the identified DV genes.

Fig. 7: DV scores estimated with cells from between- and within-group sampling.
figure 7

Violin plots display the DV scores for each gene. DV scores are shown separately for “between” group comparisons (light blue) and “within” group comparisons (dark gray), providing a comparison of the variability patterns for each gene under the two conditions. Panels AC represent results from three different data sets of case studies: A diet-induced obesity, B liver fibrosis, and C colorectal cancer.

Discussion

Our goal was to develop a robust method, both biologically and statistically optimized, to identify transcriptomic changes at a single-cell level that extends beyond average gene expression changes. Our solution, the spline-DV method, is an analytical framework for comparative gene expression variability analysis, based on single-cell gene expression mean, CV, and dropout rate. Spline-DV is expected to identify novel, functionally relevant genes that are typically overlooked by traditional DE methods.

Statisticians have been undoubtedly influenced by biologists in developing statistical methods for solving biological questions. DE methods are designed to estimate the mean expression changes as accurately as possible. The strategy often involves complex statistical modeling and individual gene expression distribution fitting to precisely estimate the mean. However, concentrating solely on mean differences may be misguided if variability in gene expression holds a significant contribution in cellular function. Based on this premise, a comparison of mean expression levels between samples would not represent the complete biological complexities of the data. Unfortunately, traditional methods are centered on DE analysis. The effort of developing DV analysis is relatively limited44,45,46,47. It is worth noting that Liu et al.48 have previously proposed a method using scRNA-seq data to detect changes in overall variability between two groups of cells. However, that analysis is focused on assessing the variability of cell populations rather than genes, on which our spline-DV method prioritizes.

A shift in focus from expression mean to variability is essential for a more comprehensive understanding of cellular function. As a promising research direction, after identifying DV genes, we will unlock a deeper understanding of cellular function by delving into the cause of DV. Indeed, evidence from coupled scRNA-seq and scATAC-seq data (i.e., simultaneous profiling of transcriptome and chromatin accessibility and gene expression from the same cell/nucleus) shows that the expression of HVGs is strongly regulated49. Thus, changes in single-cell gene expression variability may be further evaluated using information from correlated single-cell open chromatin profiling.

Spline-DV uses two distinct data-driven spline-fit curves (baselines) for the two conditions. This approach was designed to capture gene expression changes relative to the characteristic spline-fit curves of each condition, rather than emphasizing only the differences between the baseline gene expression. By doing so, our method provides a second-order measure of variability, reflecting relative shifts in gene expression dynamics between the two conditions. Thus, spline-DV estimates variability by leveraging the observed CV and dropout rates of genes. With advances in scRNA-seq technology, this approach suggests that imputation and empirical distribution estimations have minimal improvement in modeling scRNA-seq data. Instead, real observations provide reliable measures of gene expression variability.

The spline-DV method has several advantages in addition to its ability to identify significant genes that could be more functionally relevant to the specific cell type under study. One of the advantages is its robustness against the batch effect of input data, which is a common issue in scRNA-seq analyses. For example, we know that the baseline expression of genes often tends to shift between different biological treatment conditions, resulting in a batch effect. The spline-DV is resilient to such bias and provides a fair means to compare scRNA-seq data from two treatment groups. This is because the 3D spline curve, the building block of spline-DV, is computed in a treatment-specific manner, i.e., two conditions are processed independently. Other advantages of the spline-DV method are related to its impressive computational efficiency and its unsupervised methodology. The spline-fit method itself has been considered one of the best methods for feature selection50.

In conclusion, we propose the spline-DV method for comparing changes in expression variability in single-cell data between different conditions. The approach, termed “differential variability”, can potentially provide additional insight into the role of gene function within the continuum of cell state transitions compared to traditional DE analysis. Based on the work presented here, we demonstrate the effectiveness of spline-DV using real scRNA-seq data case studies. By embracing cell-to-cell variability, we can gain a deeper understanding of cellular processes and dynamics within complex tissues. In addition to its application in the case studies presented here, spline-DV’s ability to capture condition-specific gene expression variability positions it as a valuable tool for a wide range of future applications, including disease modeling, drug response studies, and understanding complex tissue heterogeneity. Its computational efficiency further makes it suitable for large-scale single-cell studies, where rapid and robust methods are essential. Moreover, spline-DV is computationally efficient, processing large data sets in significantly less time compared to existing methods. This scalability ensures its applicability for analyzing increasingly complex and high-dimensional single-cell data sets, making it a valuable tool for both small-scale exploratory analyses and large-scale studies requiring rapid and robust results. As scRNA-seq data sets grow in complexity, spline-DV’s scalable, robust framework will serve as a cornerstone for future analyses, unlocking novel insights across diverse biological domains.

Methods

Single-cell gene expression data sets

The scRNA-seq data sets for the case studies were sourced from three models: the HFD-LFD mouse adipose model by Sabari et al.23 the mouse liver model by Dobie et al.26 and the human colon model by Becker et al.32 respectively. The first case study involves the analysis of data from adipocytes of mouse adipose tissue23. The data was downloaded from the NCBI GEO database using accession numbers GSM4878207 and GSM4878210. The second case study involves the analysis of data from hepatic stellate cells in mouse liver26. The data was downloaded from the NCBI GEO database using accession numbers GSM4085625 and GSM4085623. The third case study involves the analysis of data from cancerous and unaffected epithelial cells in the human colon32. The data was downloaded from the NCBI GEO database using accession numbers GSM6061702 and GSM6061683. The real scRNA-seq data sets used for generating Fig. 1 were derived from human embryonic stem cells (hESCs) and human umbilical vein endothelial cells (HUVECs), respectively. The hESC data set was obtained from Xiu et al.51 which focused on the role of FLI1 in the hESC-EC system. The HUVEC data set was obtained from the GEO database using accession number GSM7511518. This scRNA-seq study investigated the role of vascular endothelial cell growth factor (VEGF), with the data derived from a control sample not treated with human recombinant VEGF-A165.

Processing of scRNA-seq data for case studies

The downloaded scRNA-seq UMI count matrices were imported into scGEAToolbox22 for pre-processing before running the spline-DV. The default quality control filtering was applied with thresholds of library size of 1000 minimum reads per cell, 15% maximum mitochondrial DNA ratio per cell, 15 minimum nonzero cells per gene, and 500 minimum nonzero genes per cell. The cells were then embedded using t-SNE and clustered using the K-means algorithm and annotated using marker genes from the PanglaoDB database. Functional enrichment analysis of genes was conducted using Enrichr52.

Simulating scRNA-seq data

The simulated scRNA-seq data for making Fig. 1A was generated using the algorithm proposed by Lun, Bach, and Marioni53. Briefly, for each gene i in a cell j, the expression count Yij was sampled from a negative binomial distribution with mean θjλi, where θj is cell-specific variability term from a normal distribution with log2(θj)~N(0, 0.25), and λi is a gene-specific constant sampled from a gamma distribution with shape and rate parameters set to 2. The negative binomial dispersion was set for each gene at φi = 0.1.

To evaluate the performance of spline-DV, we simulated scRNA-seq data using the Splatter R package54. A gene-by-cell matrix was generated with 500 genes and 1000 cells, divided into two groups, A and B, with equal proportions (50%) and a baseline variability parameter (bcv.common = 0.2). These settings provided a controlled environment for testing differential variability detection. To introduce realistic variability changes, 10 genes were randomly selected, with 7 assigned increased variability and 3 assigned decreased variability in group B compared to group A. Variability modifications were implemented by scaling expression values, where genes with increased variability were scaled up by a factor of 4, with slight adjustments to mean values and a dropout fraction of 10%, and genes with decreased variability were scaled down by a factor of 0.25, with adjustments to mean values and a dropout fraction of 30%. These transformations emulated biologically relevant changes in variability, incorporating modifications in CV and dropout rate, both critical metrics utilized by spline-DV.

Differential expression analysis and BASiCS analysis

Differential expression analysis was conducted using the Wilcoxon rank-sum test, DESeq242, and MAST43, all implemented with their default parameters using the Seurat R package55. BASiCS44 analysis was performed with its R implementation on the Texas A&M High-Performance Research Computing platform, using a 48-core Intel Xeon 6248R (Cascade Lake) compute node with a 3.0 GHz processor and 128 GB of random-access memory (RAM). BASiCS was executed for each treatment group, using the default parameters as suggested in its tutorial, with 20,000 iterations, a thinning interval of 20, and a burn-in period of 10,000 iterations. The BASiCS differential test was run using the default parameter of \({{\varepsilon }}^{M}\) and \({{\varepsilon }}^{D}\) set to \({\rm{Log}}_{2}1.5,\) and an FDR cutoff of 0.1.

The spline-DV framework

The input of the spline-DV is two lists of three statistics: mean, CV, and dropout rate, which together depict the expression profile of genes across cells in two conditions. These statistics are defined as follows:

Mean (\({\mu }_{i}):\) \({\mu }_{i}=\frac{{\sum }_{j=1}^{{n\,{\rm{cells}}}}{X}_{{ij}}}{{n}_{\rm{cells}}}\), representing the average gene expression.

CV (\({C}_{\sigma ,i}):\) \({C}_{\sigma ,i}=\frac{{\sigma }_{i}}{{\mu }_{i}}\), representing the normalized variability of gene expression.

Dropout rate (\(D{r}_{i}):\) \(D{r}_{i}={\sum }_{j=1}^{{n}_{\rm{cells}}}I({X}_{{ij}}=0)/{n}_{\rm{cells}}\), capturing the proportion of cells where the gene is not detected.

Here, \({\sigma }_{i}={\sum }_{j=1}^{{n}_{\rm{cells}}}{({X}_{{ij}}-{\mu }_{i})}^{2}/({n}_{\rm{cells}}-1)\) is the standard deviation of a gene, and \(I({X}_{{ij}}=0)\) is an indicator function that equals 1 if \({X}_{{ij}}=0\) (indicating a dropout event) and 0 otherwise. \(i\) denotes the gene index and \(j\) denotes the cell index.

For each condition, these three summary statistics are processed independently. The genes are sorted by mean expression, and a 3D spline-fit curve is constructed to represent the cumulative sum of the logarithmic differences between successive elements:

$$\delta {\mu }_{i}=\mathrm{ln}\left({\mu }_{i+1}+1\right)-\mathrm{ln}\left({\mu }_{i}+1\right)$$
$$\delta {C}_{\sigma ,i}=\mathrm{ln}\left({C}_{\sigma ,i+1}+1\right)-\mathrm{ln}\left({C}_{\sigma ,i}+1\right)$$
$$\delta {{Dr}}_{i}={{Dr}}_{i+1}-{{Dr}}_{i}$$

Based on these metrics, the 3D spline-fit curve encapsulates the expected behavior of all genes within a condition. This expected variability is used as a baseline for comparison and is defined as:

$${s}_{j}=\mathop{\sum}\limits_{i=1}^{i\le j}\sqrt{{\left(\delta {\mu }_{i}\right)}^{2}+{\left(\delta {C}_{\sigma ,i}\right)}^{2}+{\left(\delta {{Dr}}_{i}\right)}^{2}}\forall j\in \{1,\ldots ,n_{\rm{genes}}-1\}$$

The spline-DV method uses condition-specific spline-fit curves as baselines to account for the unique variability patterns in each condition. This approach avoids assumptions about a global trend, which could mask condition-specific differences in variability. By constructing separate baselines, spline-DV ensures that deviations (\({\vec{v}}_{1}\) and \({\vec{v}}_{2}\)) accurately reflect variability relative to the expected behavior within each condition.

Gene statistics \({\vec{{gs}}}_{i}=\left({\mu }_{i},{C}_{\sigma ,i},D{r}_{i}\right)\) are compared to their closest point on the spline-fit curve \(({\vec{s}}_{{fit},i})\), which represents the “expected” variability. The deviation from the expected variability for each condition is calculated as: \({\vec{v}}_{1}={\vec{gs}}_{i}-{\vec{s}}_{{{fit}},i}\) (for condition 1) and \({\vec{v}}_{2}={\vec{{gs}}^{\prime} }_{i}-{\vec{s}^{\prime} }_{{{fit}},i}\) (for condition 2).

The difference in the variability between the two conditions is given by:

$$\vec{{dv}}={\vec{v}}_{2}-{\vec{v}}_{1}$$

and the magnitude of this difference \((\Vert{\vec{dv}}{\Vert})\) is the DV score—a measure of relative changes in variability between the two conditions.

Identifying differentially variable genes

Genes with a significantly greater DV score are considered DV genes. The direction of variability change between conditions is given by \({||}{\vec{v}}_{1}{||}-{||}{\vec{v}}_{2}{||}\), indicating whether variability increased or decreased. To assess the confidence of DV scores, the distance difference magnitude (\({||}\vec{{dv}}{||}\)) is standardized using z-scores. Assuming a Gaussian error distribution, the cumulative density function is used to compute p-values, identifying genes with distinct variability changes in the right tail of the distribution. A p-value threshold of <0.05, rather than a more stringent cutoff, was used to obtain sufficiently large sets of DV genes in each case study, facilitating the interpretation of gene functions. Notably, the Benjamini-Hochberg procedure56 or the Benjamini-Yekutieli correction57, the latter applicable when test statistics are correlated, can be employed to control the false discovery rate.