Integrating spatial and chemical information enhances differentiation of non-alcoholic steatohepatitis states in Raman imaging

Kondo, Ryoya; Mizuno, Yuta; Mochizuki, Kentaro; Hashimoto, Kosuke; Clément, Jean-Emmanuel; Kumamoto, Yasuaki; Fujita, Katsumasa; Harada, Yoshinori; Komatsuzaki, Tamiki

doi:10.1038/s41598-025-17495-z

Article
Open access
Published: 13 October 2025

Scientific Reports volume 15, Article number: 35216 (2025)

Abstract

Machine learning studies of Raman imaging have addressed the differentiation of normal and diseased states in biomedical applications by grouping Raman spectra according to their spectral similarity across the sample. However, Raman imaging provides both chemical information on the underlying chemical species and their spatial distribution across biological samples. Using both the chemical and spatial information may discriminate sample states better than using spectral similarity alone, which ignores spatial structure. Here, we develop a Raman image analysis method that integrates spatial and chemical information. The crux of our method is to introduce a measure that quantifies the spatial heterogeneity among Raman spectra at each pixel, and to classify Raman images using information theory, based not directly on the Raman spectra at individual pixels but on the spatial heterogeneity in spectral space over their surroundings. We applied the method to a set of liver tissues dissected from a non-alcoholic fatty liver disease (NAFLD) rat model, each of which was pathologically classified as normal, non-alcoholic fatty liver (NAFL), or non-alcoholic steatohepatitis (NASH). We show how the pathologically identified liver states can be further classified using chemical information alone and using both chemical and spatial information. NASH tissues that all belong to the same cluster in terms of spectral similarity are found to be further divided into substates that correlate with the progression of NAFLD and with subtle blood contamination, which often hinders appropriate pathological judgment. Using both chemical and spatial information in Raman imaging is expected to enhance the differentiability of disease states in biological samples.

Introduction

Raman imaging has significant potential to provide information that reflects not only the (relative) population of various chemical species but also their spatial distribution in biological samples^{1,2,3,4,5,6,7,8}. Its label-free and noninvasive nature, combined with the ability to perform measurements without any pretreatment such as staining, allows pathological specimens to be preserved for subsequent histological and biomolecular analysis^9,10. Each pixel of a two-dimensional Raman image possesses a Raman spectrum, that is, the intensity of Raman scattering by molecules in the pixel region at each Raman shift. Therefore, a Raman image has a data structure spanned by the 2D spatial position and the Raman shift. Because the number of measured Raman shifts reaches several hundred or even thousands, Raman images contain rich chemical information.

Recent advances in Raman microscopy have enabled us to access rich information in subcellular, in vitro^11,12, and ex vivo tissue imaging^13,14,15. Because the spectral differences of biological samples are often subtle, chemometrics aids their use in many respects^16,17,18, involving classification models such as k-nearest neighbor and discriminant analysis^{19,20,21,22,23,24}, support vector machines^{25,26,27,28,29,30,31,32,33,34,35}, neural networks^{17,36,37,38,39,40,41,42,43,44,45,46,47,48}, and fuzzy clustering^{49,50,51,52,53,54}. Regression models in terms of Raman signals have succeeded in predicting pH⁵⁵. However, these machine learning methods focus on chemical information and rarely consider spatial information; that is, they treat Raman images as ensembles of Raman spectra, neglecting how the spectra are arranged relative to one another in space over the field of view (FOV)^14,56,57. Chemical species in cells and tissues exhibit spatially heterogeneous structures and patterns. Such spatial structure is considered informative, especially in pathology, where spatial contrasts in the cytoplasm and nuclei of cells and in tissues are observed in stained images⁵⁸. Using the spatial information of Raman signals in addition to their chemical information can also accelerate measurements through selective illumination on the fly^59,60,61,62. In addition, most pathological inspections of biological samples are performed in relation to morphological characteristics. For instance, nonalcoholic fatty liver disease (NAFLD), characterized by abnormal lipid accumulation in more than 5% of the liver without significant alcohol consumption, is classified pathologically into nonalcoholic fatty liver (NAFL) and nonalcoholic steatohepatitis (NASH) through microscopic inspection of stained images by pathologists in terms of spatial structure such as hepatocellular inflammation, ballooning, and fibrosis^63,64.
Here, NAFL generally has a stable and non-progressive course, with the presence of lipid droplets in the hepatocytes. In contrast, NAFL followed by NASH is associated with a poor prognosis and the risk of advancing to liver cirrhosis or liver cancer^65,66,67. Recent evidence suggests a more complex and diverse progression of NAFLD^68,69. Research on rapid progression to advanced fibrosis and NASH with cirrhosis has also progressed^68,70,71, and recently developed histological and staging criteria have improved the clinical diagnosis of NAFLD^63,72,73,74. However, predicting future fibrosis progression and identifying rapid progressors among NAFL patients remain challenging^68,70. While other hyperspectral imaging techniques, such as near-infrared spectroscopy⁴⁸, provide powerful capabilities for biological imaging, spontaneous Raman scattering microscopy offers distinct advantages essential for this study: non-invasive spectral collection from water-rich tissues, broad spectral coverage to capture chemo-spatial diversity, and high-sensitivity detection of biomolecules including cytochromes and vitamin A, facilitated by resonance Raman effects. Furthermore, the fact that sample staining is unnecessary is another advantage over fluorescence-based hyperspectral imaging⁷⁵.

In this paper, we present a method based on information theory to classify Raman images of biological samples in terms of both chemical (Raman) information and spatial information of chemical signals by incorporating an auxiliary variable in the classification. The auxiliary variable quantifies the spatial heterogeneity of the spectra, that is, how much Raman spectra reflecting the microscopic chemical states at neighboring pixels are different from each other. We analyze liver tissues dissected from the NAFLD rat model annotated with pathological classifications such as normal, NAFL, and NASH, respectively. We address how the addition of spatial information on top of chemical information further provides a clue to differentiate tissue state, especially at NASH samples. In what follows, we refer to the spectral information combined with the spatial position information as chemo-spatial information.

Results

Spatial heterogeneity in a Raman image

Spatial heterogeneity is defined at each pixel based on the Raman spectral distances between the pixel and its surrounding pixels as follows (see Fig. 1). We define the set of surrounding pixels of pixel r as the pixels within the circle centered at r with radius $\lambda _0$, and denote this set by $S_r$. In this paper, as a typical spatial scale, we employed $\lambda _0=20$ µm (ca. 4 pixels), slightly larger than the typical size of hepatocytes, so that our chemo-spatial information is relevant to the spatial diversity of Raman signals on the scale of a hepatocyte.

In Fig. 1, for two given center pixels r and $r'$, their Raman spectra at a given wavenumber $w$ are denoted as $I_r(w)$ and $I_{r'}(w)$, respectively (blue box in Fig. 1). In this example, the set of Raman spectra near each of the pixels r and $r'$ is collectively depicted in the orange box (i.e., the spectra at the surrounding pixels $s \in S_r$ and $s' \in S_{r'}$). One can see that the diversity of Raman spectra differs markedly between the neighborhoods of the two center pixels r and $r'$. This difference clearly cannot be captured by the Raman signal of the central pixel alone. To quantify such diverse characteristics of the chemical composition in space, we first introduce a spectral distance $v$ between pixel r and one of its surrounding pixels $s \in S_r$, defined by

$$\begin{aligned} v\left( I_{r},I_{s}\right) = \sum _{w \in W}|I_{r}\left( w \right) - I_{s}\left( w\right) | \end{aligned}$$

(1)

where W denotes the set of all measured wavenumbers.
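As a minimal sketch of Eq. (1) and of the neighborhood $S_r$, the following plain-Python functions compute the spectral distances around one pixel. The image layout (a dictionary from `(row, col)` to a list of intensities), the function names, the radius given in pixel units, and the exclusion of the center pixel from $S_r$ are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import product


def spectral_distance(spec_a, spec_b):
    """Eq. (1): sum over wavenumbers of |I_r(w) - I_s(w)|."""
    return sum(abs(a - b) for a, b in zip(spec_a, spec_b))


def neighbor_distances(image, r, radius_px):
    """Distances v(I_r, I_s) for every pixel s within radius_px of pixel r.

    image: dict mapping (row, col) -> list of intensities (hypothetical layout).
    The center pixel itself is excluded from S_r (an assumption of this sketch).
    """
    r0, c0 = r
    k = int(radius_px)
    distances = []
    for dr, dc in product(range(-k, k + 1), repeat=2):
        if (dr, dc) == (0, 0) or dr * dr + dc * dc > radius_px ** 2:
            continue  # skip the center and anything outside the circle
        s = (r0 + dr, c0 + dc)
        if s in image:  # ignore positions outside the field of view
            distances.append(spectral_distance(image[r], image[s]))
    return distances
```

With a radius of 4 pixels, as used in the text, an interior pixel has 48 neighbors under this convention; pixels near the image border have fewer.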

It is not a single Raman spectrum at a pixel s near the center pixel r, but the whole set of surrounding Raman spectra, that characterizes the spatial heterogeneity at the given center pixel through their spectral distances. That is, for any given (center) pixel r, there exists a set of different values of $v$. This provides us with the conditional probability distribution of the spectral distance, $p(V=v|R=r)$, as a descriptor to quantify the spatial heterogeneity for a given Raman image. Here, following standard notation in statistics, the capital letters V and R denote stochastic variables, while the lowercase letters $v$ and r denote their actual values. In this paper, we abbreviate $p(V=v|R=r)$ as $p_{V|R}\left( v| r\right)$, and we write $p_{V|R}$ when referring to the function over all values of $v$ and r (see the Supplementary Information (SI) for details of how the conditional probability distribution is computed). The spatial heterogeneity described by $p_{V|R}(v|r)$ reflects the spatial structure of the Raman image within the spatial scale defined by the radius $\lambda _0$. Figure 1 exemplifies the probability distribution functions (PDFs) $p_{V|R}(v|r)$ and $p_{V|R}(v|r')$. The former has a larger variance with a longer tail, reflecting the heterogeneous chemo-spatial information near the pixel r, while the latter has a smaller variance with a sharper peak, reflecting the homogeneous local chemo-spatial information in the vicinity of the pixel $r'$. Note that randomly shuffling the spatial positions of the Raman spectra results in a $p_{V|R}$ profile completely different from the original one, whereas it does not affect the results of commonly used clustering analyses (e.g.,^{14,17,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54}) because they do not take into account any spatial information in the images (see SI Fig. S1).
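One simple way to turn the spectral distances around a pixel into an estimate of $p_{V|R}(v|r)$ is a normalized histogram. The sketch below is illustrative only: the paper's actual estimator is described in its SI, and the equal-width binning on $[0, v_{\max}]$, the function name, and the parameters `v_max` and `n_bins` are assumptions.

```python
def conditional_pdf(distances, v_max, n_bins):
    """Histogram estimate of p(v | r) from the distances around one pixel.

    Equal-width bins on [0, v_max] are an assumption of this sketch;
    values at or above v_max are clamped into the last bin.
    """
    counts = [0] * n_bins
    for v in distances:
        idx = min(int(v / v_max * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(distances)
    if total == 0:
        return [0.0] * n_bins  # no neighbors: return an all-zero placeholder
    return [c / total for c in counts]


# A heterogeneous neighborhood spreads its mass over many bins,
# a homogeneous one concentrates it in the low-distance bins.
heterogeneous = conditional_pdf([0.5, 1.5, 1.6, 3.9], v_max=4.0, n_bins=4)
homogeneous = conditional_pdf([0.1, 0.2, 0.3, 0.2], v_max=4.0, n_bins=4)
```

Using the same bin edges for every pixel keeps the resulting distributions directly comparable across the image.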

Using the information bottleneck method⁷⁶, we performed clustering based on the probability distribution function $p_{V|R}$ as a descriptor. In the clustering, the total number of clusters was set to 4 (We confirmed that the conclusions below do not change for the number of clusters 3–6). The details of the clustering method are provided in the Methods section below.
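To illustrate the idea of grouping pixels by their distributions $p_{V|R}$, the sketch below implements a simplified hard-assignment variant in the spirit of the information bottleneck: each pixel is assigned to the cluster whose distribution $p(v|c)$ is closest in Kullback-Leibler divergence, and each $p(v|c)$ is then updated as the average over its members. This is not the authors' implementation (their procedure is given in the Methods section); the uniform weighting of pixels, the user-supplied initial centroids, and all function names are assumptions.

```python
import math


def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def hard_ib(pdfs, init_centroids, n_iter=50):
    """Hard-assignment clustering of per-pixel distributions p(v|r).

    pdfs: list of p(v|r), one per pixel (uniform p(r) assumed).
    init_centroids: initial guesses for p(v|c) (hypothetical initialization).
    Returns (labels, centroids).
    """
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest cluster distribution in KL divergence.
        new_labels = [
            min(range(len(centroids)), key=lambda c: kl(p, centroids[c]))
            for p in pdfs
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: p(v|c) is the mean of its members' p(v|r).
        for c in range(len(centroids)):
            members = [p for p, lab in zip(pdfs, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

The full information bottleneck uses soft assignments controlled by a trade-off parameter; the hard variant here corresponds to the limit in which only the divergence term decides the assignment.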

Chemo-spatial clustering of Raman images of liver tissues from the NAFLD model

We applied our chemo-spatial clustering method to Raman images of NAFLD-model liver tissues. As recently reported⁷⁷, a total of 48 rats were housed with different diets: standard diet (SD, 16 rats), high-fat diet (HFD, 60% lard, 16 rats) and high-fat high-cholesterol diet (HFHC, 60% lard, 1.25% cholesterol, 0.5% cholic acid, 16 rats). Four rats from each group were sacrificed at 2, 4, 8, and 16 weeks (after starting designated feeding) for sample collection. Raman images of sliced liver tissues excised from each sacrificed rat were taken. Raman images of NAFLD liver tissues were preprocessed using the well-known protocol as before⁴⁹. The detailed protocol is provided in the Methods section below.

Figure 2a shows the conditional probability distribution of the spectral distance, denoted by $p_{V|C}(v|c)$, which represents the chemical heterogeneity of pixels belonging to the cluster c. (Here, C denotes the stochastic variable that represents the cluster index.) The index of clusters is taken from 1 to 4 in decreasing order of its chemical heterogeneity, so that the larger the index is, the lesser the Raman spectral difference near the pixels (i.e., more homogeneous).

Then, how is each cluster distributed over space depending on the state of NAFLD? What additional information can we acquire compared with a simple clustering based solely on Raman spectra? Fig. 2b shows the spatial distributions of the clusters, hereinafter referred to as ‘cluster maps’, for each of the liver tissue samples from rats, grouped by the dietary conditions SD, HFD, and HFHC. The cluster colors on the cluster maps are the same as those in Fig. 2a. In the figure, for example, the label HFD8w3 denotes rat #3 of the four individuals that were fed a high-fat diet (HFD) for 8 weeks. The pathologically identified tissue states, Normal, NAFL and NASH, are indicated by thin black, thick pink, and thick blue frames, respectively. As the overall trend of the chemo-spatial analysis for NAFLD, the most severely diseased tissues (NASH) exhibit chemically homogeneous spatial patterns (see the green contiguous regions, which correspond to the most chemically homogeneous cluster). Interestingly, some, but not all, NAFL tissues from the HFHC diet model also exhibit chemical homogeneity similar to that of NASH (e.g., HFHC4w2 and HFHC8w2). In terms of dietary model categories, Cluster 3 (blue) is distributed over the tissues in the SD and HFD images, in which the more spatially heterogeneous Clusters 2 (yellow) and 1 (orange) are scattered sparsely. In turn, HFHC images seem to be divided roughly into two classes according to the abundance of the spatially homogeneous Cluster 4, and these classes do not necessarily coincide with the pathologically identified NAFL and NASH classes. That is, HFHC4w1, HFHC4w2, HFHC8w2 and most HFHC16w images (except HFHC16w4) have a highly populated Cluster 4 compared to the other HFHC images. It should be noted that such a distinct pattern of chemo-spatial heterogeneity across dietary and pathologically identified states cannot be captured by standard clustering based solely on Raman spectral features free from spatial information [see further detailed analysis in the Supplemental Materials (Fig. S1)].

Then, what chemical species actually affect chemical heterogeneity in space? To address this question, let us compare the current results with the cluster analysis based solely on Raman spectral information existing in the same field of view (FOV) without referring to the spatial location^14,77. Hereinafter, we call this ‘chemical clustering’ while the former clustering taking into account chemo-spatial heterogeneity in Raman signals is called ‘chemo-spatial clustering.’

Figure 3a presents the results of the chemical clustering analysis, in which seven different spectral clusters are identified^14,77. Note that the resolution in cluster difference is higher in chemical clustering than in chemo-spatial clustering. Fat accumulation increases from Cluster 1′ to Cluster 7′ (see the high-wavenumber lipid region, 2790–3045 cm$^{-1}$) while vitamin A decreases (1595 cm$^{-1}$). Figure 3b shows the spatial distribution of the clusters in each image. These indicate the localization of vitamin A in normal tissues, while some NAFLs with HFHC diets (termed NAFL-$\beta$⁷⁷) and NASH seem to exhibit a wide range of fat accumulation. It should be noted that the size (i.e., population) of each cluster in such standard chemical clustering analysis is exactly preserved even when the spatial information of the Raman signals is completely destroyed by random shuffling among pixels (see Fig. S1). Some patterns appear partially similar to those in Fig. 2b; for example, the spatial patterns of Cluster 7′ (gray) ($\sim$ lipid-rich, with less vitamin A and cytochrome) in the chemical clustering and of Cluster 4 (green) ($\sim$ most homogeneous) in the chemo-spatial clustering (Fig. 2b) seem to be similar in the HFHC dietary models.

To quantify how chemo-spatial clustering and the chemical clustering results are related to each other, we examine the joint probability distribution $p_{C,C'}(c,c')$ of the two types of clustering, shown in Figs. 2b and 3b, that is, the probability of a pixel belonging to the cth cluster in chemo-spatial clustering and the $c'$th cluster using chemical clustering simultaneously. As seen in the cluster maps (first row: chemo-spatial clustering, second row: chemical clustering) of HFHC4w2 (NAFL) and also HFHC8w2 (NAFL) in Fig. 4a, for example, the chosen two pixels both belong to Cluster 4 in chemo-spatial clustering (i.e., $C=4$) and Cluster $7'$ in chemical clustering (i.e., $C'=7'$) simultaneously, and are counted in the total number of such pixels $N(C=4,C'=7')$. For example, the joint probability distribution for a Raman image is computed by $p_{C,C'}(c,c')=N(C=c,C'=c')/N_\text {pixel}$. The leftmost figure in Fig. 4a shows the heatmap of the joint probability distribution $p_{C,C'}(c,c')$ whose statistics are taken on all Raman images of the liver tissue in order to glimpse the overall relationship between the two clustering schemes. The horizontal and vertical axes correspond to the cluster indices of chemical clustering free from spatial structure and those of chemo-spatial clustering, respectively. Cluster 3 has the highest probability in chemo-spatial clustering, and Cluster $4'$ has the largest in chemical clustering (see the marginal probability distributions $P_C(c)$ and $P_{C'}(c')$ in Fig. 4a). Very roughly, relatively high joint probabilities appear to align diagonally on the heat map as the overall trend: the more the cluster index increases in chemo-spatial clustering [i.e., more spatially homogeneous (Fig. 2a)], the more the cluster index in chemical clustering increases [i.e., more lipids and less vitamin A and cytochrome contents (Fig. 3a)]. This might coincide with the fact that Figs. 2b and 3b look similar as overall trend.
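The counting formula $p_{C,C'}(c,c')=N(C=c,C'=c')/N_\text{pixel}$ and its marginals can be sketched in a few lines. The function names and the representation of each clustering as a flat list of per-pixel labels are assumptions for illustration.

```python
from collections import Counter


def joint_distribution(labels_c, labels_cp):
    """p_{C,C'}(c, c') = N(C=c, C'=c') / N_pixel from two per-pixel label lists."""
    n_pixel = len(labels_c)
    counts = Counter(zip(labels_c, labels_cp))
    return {pair: k / n_pixel for pair, k in counts.items()}


def marginals(joint):
    """Marginal distributions P_C(c) and P_{C'}(c') of a joint distribution."""
    p_c, p_cp = Counter(), Counter()
    for (c, cp), p in joint.items():
        p_c[c] += p
        p_cp[cp] += p
    return dict(p_c), dict(p_cp)


# Toy example: four pixels, chemo-spatial labels C and chemical labels C'.
joint = joint_distribution([4, 4, 3, 3], ["7'", "7'", "4'", "5'"])
p_c, p_cp = marginals(joint)
```

Pooling the label lists of several images before calling `joint_distribution` gives ensemble statistics of the kind shown in the heat maps.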

However, when the joint probabilities are taken over the liver tissue ensemble only for the same pathological states with the same diet, the pattern of the joint probabilities becomes more diversified (Fig. 4b). For example, (diet, pathological state) = (HFD, NAFL) corresponds to the set of liver tissues of rats that have been fed a high-fat diet for a relatively long period (4, 8, 16 weeks) and are identified as pathologically NAFL. Likewise, (HFHC, NAFL) corresponds to the set of those fed a high-fat, high-cholesterol diet (e.g. 2, 4, 8 weeks) and are identified pathologically as NAFL. As an overall trend as (SD, Normal) $\rightarrow$ (HFD, Normal) $\rightarrow$ (HFD, NAFL) $\rightarrow$ (HFHC, NAFL) $\rightarrow$ (HFHC, NASH), the most populous cluster C shifts from Cluster 3 to 4 in chemo-spatial clustering and the most populous cluster $C'$ shifts from Cluster $4'$ to $7'$ in chemical clustering (see $P_{C'}(c')$ and $P_C(c)$ attached to each figure vertically and horizontally in Fig. 4b). This indicates that as NAFLD progresses, lipids increase but vitamin A and cytochrome decrease in liver tissues, while their local chemical environment becomes more homogeneous within the typical spatial scale of a hepatocyte.

The first striking consequence in Fig. 4b is that, for the same disease state pathologically identified as NAFL but with different diet models (HFD and HFHC), the joint probability distribution $p_{C,C'}(c,c')$ of (HFHC, NAFL) looks closer to that of (HFHC, NASH) than to that of (HFD, NAFL), indicating that the liver tissues of (HFHC, NAFL) resemble those of (HFHC, NASH). This indicates that, before the morphological signatures of NASH emerge, Raman imaging has some potential to infer the closeness to NASH (i.e., poor prognosis) among NAFL tissues not only chemically but also chemo-spatially. Likewise, when (SD, Normal) and (HFD, Normal) are compared for the same pathological state ‘Normal,’ their joint probability distributions are not necessarily similar. For both, the most populous cluster in chemo-spatial clustering is Cluster 3, which may indicate that the heterogeneity of the local chemical environment does not change much whereas the chemical composition does (the most populous cluster changes from Cluster $4'$ to Cluster $5'$).

Let us look deeper into the image-wise analysis, especially for the liver tissue Raman images pathologically identified as NAFL and NASH, because NAFL is a “driver” of the transition to NASH. Fig. 5 juxtaposes the two types of cluster maps (chemo-spatial and chemical clustering) and presents, for each of these images, the heat map of the joint probability distribution $p_{C,C'}(c,c')$ together with its marginals $P_C(c)$ and $P_{C'}(c')$ attached vertically and horizontally. Comparing the heat map of each image of rats fed the HFHC diet for 16 weeks (HFHC16w), expected to be the most severe condition with the state of NASH, with the overall (HFHC, NASH) heat map (Fig. 4b) shows that the images of HFHC16w1, HFHC16w2, and HFHC16w3 follow the general characteristics of the HFHC-NASH heat map. The spectra of HFHC16w1, HFHC16w2, and HFHC16w3 belong predominantly to Cluster 4 in chemo-spatial clustering and simultaneously to Cluster $7'$ in chemical clustering, showing that a chemical environment characterized by more lipids and less vitamin A and cytochromes spreads homogeneously over a large spatial region.

The second striking consequence of chemo-spatial clustering is as follows: Among the HFHC16w rat model, the fourth individual (represented as HFHC16w4) deviates from the most populous averaged characteristics of chemo-spatial clustering: Unlike the other HFHC16w models, many spectra belong to Clusters 1 through 3 rather than dominantly to Cluster 4. This is opposite to the situation of chemical clustering in that the Raman spectra of HFHC16w4 belong mainly to Cluster $7'$ just as the other HFHC16w model. This coincides with our visual inspection of the two types of cluster maps of chemical and chemo-spatial clusterings. Even though Raman spectra indicate populous lipid with fewer vitamin A and cytochromes (from the dominance of Cluster $7'$), the pattern of how chemical ingredients are distributed in tissue differs from the averaged Raman signature. Note that the distinction between HFHC16w4 and the other HFHC16w model is not manifested using chemical clustering because it does not take into account the spatial structure of Raman signals, and visual inspection of the cluster map for chemical clustering appears not to show significant differences between the HFHC16w model accompanying the populous Clusters $6'$ and $7'$.

Finally, we address why spatial heterogeneity increases within regions where similar spectra are distributed in HFHC16w4. To consider the origin of the increased spatial heterogeneity, the cluster maps of HFHC16w4 in chemo-spatial (Fig. 6a) and chemical clusterings (Fig. 6b) are shown with the radius of 4 pixels indicated by the white circles ($\sim$ the typical size of hepatocyte) composed of similar spectra (Cluster 7′) but relatively heterogeneous in space. The Raman spectra within the white circle zone, which were used to calculate the spatial heterogeneity, are shown in Fig. 6c (c.f., the corresponding Raman spectra in a region of HFHC16w1, shown in Fig. S2, where both clustering methods present a uniform distribution, mainly composed of a single cluster).

The comparison of Fig. 6c and Fig. S2c suggests that there may be three distinct features in the Raman spectra of the HFHC16w4 model. The first is the relatively high and fluctuating intensity of the blood component (hemoglobin). In Fig. 6c, several peaks are assigned to specific vibration modes of heme b, heme c, or both, such as 750, 1127, 1360, 1550, 1585, and 1635 cm$^{-1}$^79,80,81,82. Under these experimental conditions, heme b can be mainly attributed to hemoglobin in blood and heme c to cytochrome c. Among the peaks attributed to heme above, those at 1360 and 1550 cm$^{-1}$ are generally more intense in heme b than in heme c, and the peak at 1635 cm$^{-1}$ is assigned only to heme b (oxygenated). In Fig. 6c, the intensities of these three peaks are correlated with each other across the spectra. In this study, we were unable to clarify the origin of the hemoglobin, but the most plausible scenario is blood residue left in the tissue during excision and slicing. This study used freshly resected liver samples; therefore, the influence of red blood cells that escape from the resected area cannot be completely ruled out. This interpretation is also supported by the higher amount of oxygenated hemoglobin (1635 cm$^{-1}$) in HFHC16w4 compared to HFHC16w1.

The second is the presence of relatively abundant vitamin A. Spectra with peaks at 1015, 1160, 1200, 1275, and 1595 cm$^{-1}$ attributed to vitamin A exist within the local region. In the HFHC16w1 model, by contrast, few spectra reflecting vitamin A were measured. Vitamin A is known to be localized in fat droplets under normal conditions, and its decrease is known to occur in several types of disease, including fatty liver⁸³. Thus, the presence of abundant vitamin A suggests that the pathology of HFHC16w4 is still in progress compared to HFHC16w1. The third feature is the large fluctuation in the Raman intensity in the fat region. The region indicated by the white circle in HFHC16w4, where chemical clustering yields almost a single cluster although chemo-spatial clustering indicates spatial heterogeneity, has larger fat fluctuations than the corresponding region in HFHC16w1.

To confirm whether the above discussion holds for all pixels in the HFHC16w family, Fig. 7 presents the relationship between chemo-spatial heterogeneity and the Raman intensities of hemoglobin-related components, vitamin A, and lipids across that family. Fig. 7a presents the heat maps of Raman intensities at 1130 cm$^{-1}$, 1360 cm$^{-1}$, 1550 cm$^{-1}$, 1635 cm$^{-1}$ (all hemoglobin-related), 1595 cm$^{-1}$ (vitamin A), and 2856 cm$^{-1}$ (lipids), standardized by the Amide-I Raman shift, with the cluster maps of the HFHC16w family generated by chemo-spatial clustering and chemical clustering in Fig. 7b. One can see that, not only in the white circle region but also in the other regions on the heat maps of HFHC16w4, hemoglobin-related components are observed more frequently than in the other members of the HFHC16w family (that is, reddish pixels are more abundant across the hemoglobin-related HFHC16w4 images). For example, in HFHC16w2 and HFHC16w3, hemoglobin components at 1550 cm$^{-1}$ are sparsely distributed (indicated by black circles and black arrows in Fig. 7a). Chemical clustering (with a total of seven clusters) cannot discriminate such sparsely distributed minor components. Specifically, chemo-spatial clustering identifies the hemoglobin-related minor components within the right black circle in HFHC16w3 as belonging to Cluster 2 (yellow) (i.e., more heterogeneous than the neighboring Cluster 4 (green)) while chemical clustering does not. Likewise, chemo-spatial clustering differentiates the minor components within the black circles in HFHC16w2 as belonging to Cluster 2 (yellow) (i.e., more heterogeneous than the neighboring Clusters 3 (light blue) and 4 (green)) while chemical clustering does not differentiate them from their neighbors and assigns them to Cluster $6'$, like the cytochrome-rich (1130 cm$^{-1}$) regions.

To further quantify the predictive performance of the two clustering schemes and to address the limitations of chemo-spatial clustering, we synthesized a set of Raman images in which chemical contamination was artificially added to the Raman images of HFHC8w2 and SD16w3, as representatives of relatively homogeneous and heterogeneous chemical environments, respectively, and evaluated to what degree chemo-spatial clustering and chemical clustering can predict the chemical contamination. Detailed results of the analysis are presented in Supporting Information, Section 4. In the analysis, cholesterol (resp. polystyrene) was randomly introduced into 10% of the pixels of the entire image with several mixing ratios; e.g., a mixing ratio of 20:80 means that the original Raman spectrum located in the chosen pixel is contaminated 20% by cholesterol (resp. polystyrene) (Fig. S3). The deviations between the clustering results of the two clustering schemes before and after chemical contamination are shown in Figs. S4 through S7, together with the corresponding analysis of the F-measures for predicting which pixels are chemically contaminated from the change in the cluster index of each pixel (Fig. S8). For example, we found that the F-measures increase as the mixing ratio of the contamination increases in both clusterings, and that chemo-spatial clustering yields higher F-measures in a more chemically homogeneous environment, whereas chemical clustering shows no such systematic trend and its F-measures are relatively low. However, for chemo-spatial clustering, contamination is harder to locate in more heterogeneous environments. Overall, chemo-spatial clustering, which takes into account the spatial information of Raman signals across a sample and is more sensitive to chemical contamination, is complementary to conventional chemical clustering.
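The contamination test can be sketched as follows: blend a contaminant spectrum into chosen pixels, flag the pixels whose cluster index changes, and score the flags against the truly contaminated pixels. Interpreting the 20:80 mixing ratio as a linear blend of intensities and the F-measure as the balanced F1 score are assumptions of this sketch, as are the function names; the paper's exact procedure is in its Supporting Information.

```python
def mix_spectrum(original, contaminant, ratio):
    """Linear blend; ratio=0.2 stands for the 20:80 mixing ratio (assumed meaning)."""
    return [ratio * c + (1.0 - ratio) * o for o, c in zip(original, contaminant)]


def changed_pixels(labels_before, labels_after):
    """Indices of pixels whose cluster index differs after contamination."""
    return {i for i, (a, b) in enumerate(zip(labels_before, labels_after)) if a != b}


def f_measure(contaminated, flagged):
    """Balanced F1 score between the true and the flagged sets of pixel indices."""
    tp = len(contaminated & flagged)
    if tp == 0:
        return 0.0
    precision = tp / len(flagged)
    recall = tp / len(contaminated)
    return 2 * precision * recall / (precision + recall)


# Toy example: pixels 1 and 3 are contaminated; relabeling flags pixels 1 and 2.
flagged = changed_pixels([4, 4, 4, 4], [4, 2, 2, 4])
score = f_measure({1, 3}, flagged)
```

In this toy case one of two flags is correct and one of two contaminated pixels is found, so precision and recall are both one half.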

Conclusion and outlook

We have developed an information-theoretic clustering approach taking into account chemo-spatial information without compromising the structure of hyperspectral images. We introduced the spatial heterogeneity measure to quantify local chemical heterogeneity in space, to which the clustering is performed. In the NAFLD disease model of rats, we demonstrated that the increase and decrease in certain chemical species and the homogenization of the local chemical environment occurred simultaneously. Some tissues in NAFL also exhibited similar surrounding chemical environments as seen in NASH. This feature in NAFL was not apparent from the morphology-based observations. Furthermore, this chemo-spatial clustering revealed in some of the NASH tissues the existence of spatially heterogeneous blood deposits, which could not be easily characterized by chemical clustering.

In June 2023, NAFLD and NASH were renamed metabolic dysfunction-associated steatotic liver disease (MASLD) and metabolic dysfunction-associated steatohepatitis (MASH), respectively⁸⁴. This change in nomenclature involves a review of diagnostic criteria with a focus on metabolic abnormalities and a comprehensive understanding of the disease, including those who drink small amounts of alcohol. However, MASLD/MASH and NAFLD/NASH are conceptually similar, so the findings of this study, derived from the NAFLD/NASH model, may apply to the updated concept.

Since histopathologies utilize tissue morphological information in stained images, it is natural to reference the spatial structure of Raman images in addition to chemical compositions. In this study, we set the spatial heterogeneity range at a radius $\lambda _0$ of 20 µm corresponding to the typical size of hepatocytes. Increasing the radius reflects a larger chemical environment in the space, while the difference among each pixel in the spatial heterogeneity measure is expected to decrease because of larger overlap of regimes centered at the individual pixels. Conversely, decreasing the radius may induce a larger fluctuation irrelevant to the subcellular and cellular structure. One of the future subjects to be studied should be the dependency of classification performance on the size of local regions within which chemical heterogeneity in physical space is characterized.

In this paper, we used the information bottleneck method to group the spatial heterogeneity distributions $p_{V|R}$ at each pixel. This method prevents the whole data set R from being divided into groups more finely than the experimental error warrants. The cluster C should capture less detailed information about the target data V than the pixel data R do, because the cluster representation C retains only a simplified, “coarse-grained” version of the details of the pixels in R. Owing to inherent experimental variation, the amount of information that R captures about V can fluctuate between experimental instances. Meanwhile, as the number of clusters increases, C tends to capture more information about V. Therefore, if the information retained by C in the current set-up exceeds that retained by R in some (hypothetical) experimental instance, it is advisable to reduce the number of clusters, because the current C captures too much detailed information about V (see also⁸⁵).
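This argument can be stated compactly, as a sketch rather than as a formula taken from the original analysis. Since the cluster label C is assigned from the pixel R alone, C cannot carry more information about V than R does (the data-processing inequality). The symbol $R'$, standing for the pixel data of a hypothetical repeated experiment, is notation introduced here for illustration only.

$$\begin{aligned} I(C;V) \le I(R;V), \qquad \text{reduce the number of clusters if } I(C;V) > I(R';V). \end{aligned}$$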

The proposed method may be useful for detecting foreign bodies with specific geometries in collected tissue samples, e.g. asbestos or microplastics. For example, various types of asbestos are known to be distinguishable by Raman microscopy even in biological tissues. However, detecting their Raman spectra requires a relatively high excitation energy (such as a laser exposure time of several hundred seconds), so the comprehensive detection of their distribution in tissues by spontaneous Raman imaging has been technically limited⁸⁶. One advantage of this method is that it can increase the detection sensitivity for specific targets by detecting differences in both chemical composition and spatial distribution. This may make it possible to detect such foreign bodies in biological tissue even when the Raman spectrum of the target is weak or its pattern is unknown.

Finally, we address how our information-theoretic approach differs from the deep learning (DL) approach to Raman image analysis. When Raman images are used as input to DL, in principle any chemical and chemo-spatial information is taken into account in a much more abstract, black-box fashion, provided that the spatial information also helps to differentiate the disease states of the sample. However, this approach faces the issue of information leakage¹⁶, in which DL can exploit information irrelevant to the sample, such as (unnoticeable) experimental conditions involving inhomogeneous irradiation profiles⁸⁷. Despite this issue, when appropriate preprocessing can be applied to the Raman images in question, DL could be another means of using both chemical and chemo-spatial information to differentiate the sample states.

Methods

Information bottleneck method

We use the information bottleneck method⁷⁶ to perform clustering according to spatial heterogeneity $p_{V|R}$. The information bottleneck method computes the probability that a pixel r belongs to a class (or a cluster) c, denoted by $p_{C|R}(c|r)$, by solving the following variational problem:

$$\begin{aligned} \operatorname*{minimize}_{p_{C|R}} \left( I(C;R)-\beta I(C;V) \right) . \end{aligned}$$

(2)

Here, the capitals C and R denote the stochastic variables that represent classes and pixels, respectively, and the capital V denotes the stochastic variable of spatial heterogeneity. I(C; R) is the mutual information between the classes and the pixel data, measuring how much information from the original data remains in the clustering result. In turn, I(C; V) is the mutual information between the classes and the spatial heterogeneity data, indicating how much of the spatial heterogeneity information is retained in the clustering result. The objective functional of the above variational problem represents a trade-off between maximizing data compression (a smaller I(C; R) compresses, or unifies, the pixel data more strongly) and maximizing preservation of the spatial heterogeneity information (a larger I(C; V) preserves more of the details of spatial heterogeneity, which requires a larger number of clusters), where the parameter $\beta ~(>0)$ controls the trade-off between the two quantities.
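To make the two terms of the objective concrete, the following minimal sketch evaluates $I(C;R)-\beta I(C;V)$ for a given assignment of pixels to clusters. It is an illustration under stated assumptions, not the implementation used in this study: the toy distributions, the value of $\beta$, and the function names `mutual_information` and `ib_objective` are hypothetical, and the only structural assumption is that C depends on V through R alone.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in nats for a joint probability table joint[x][y]."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    return sum(
        p_xy * math.log(p_xy / (p_x[i] * p_y[j]))
        for i, row in enumerate(joint)
        for j, p_xy in enumerate(row)
        if p_xy > 0.0
    )


def ib_objective(p_r, p_v_given_r, p_c_given_r, beta):
    """Evaluate I(C;R) - beta * I(C;V) for an assignment p_c_given_r[r][c]."""
    n_r, n_c, n_v = len(p_r), len(p_c_given_r[0]), len(p_v_given_r[0])
    joint_rc = [[p_r[r] * p_c_given_r[r][c] for c in range(n_c)] for r in range(n_r)]
    # C depends on V only through R, so p(c, v) = sum_r p(r) p(c|r) p(v|r).
    joint_cv = [
        [sum(p_r[r] * p_c_given_r[r][c] * p_v_given_r[r][v] for r in range(n_r))
         for v in range(n_v)]
        for c in range(n_c)
    ]
    return mutual_information(joint_rc) - beta * mutual_information(joint_cv)


# Toy data (hypothetical): four pixels, two heterogeneity bins.
p_r = [0.25] * 4
p_v_given_r = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
one_cluster = [[1.0] for _ in range(4)]
two_clusters = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
for name, assignment in (("one cluster", one_cluster), ("two clusters", two_clusters)):
    print(name, ib_objective(p_r, p_v_given_r, assignment, beta=10.0))
```

In this toy case, merging all pixels into one cluster gives both mutual information terms a value of zero, whereas splitting the pixels into two groups costs some compression but preserves enough information about V to lower the objective at this value of $\beta$.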

A formal solution of Eq. (2) is given by⁷⁶

$$\begin{aligned} p_{C|R}(c|r)=\frac{p_C(c)}{Z(r,\beta )} \textrm{e}^{-\beta D_{\textrm{KL}}\left[ p_{V|R}(v|r)\Vert p_{V|C}(v|c)\right] }, \end{aligned}$$

(3)

where $Z(r,\beta )$ is the normalizing constant and the symbol $D_{\textrm{KL}}$ denotes the Kullback–Leibler divergence, which measures the “distance” between two probability distributions. As $D_{\textrm{KL}}$ decreases, that is, as the spatial heterogeneity profile at r, $p_{V|R}(v|r)$, and the representative profile of class c, $p_{V|C}(v|c)$, become closer, the probability $p_{C|R}(c|r)$ increases. In other words, the information bottleneck method classifies each pixel in the Raman image based on the similarity of the spatial heterogeneity characterized by $p_{V|R}$.
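Because $p_C$ and $p_{V|C}$ on the right-hand side of Eq. (3) themselves depend on $p_{C|R}$, the formal solution is usually obtained by iterating the three relations until they are self-consistent. The sketch below shows one such fixed-point iteration. It is not the code used in this study: the uniform pixel weights, the random initialization, the fixed iteration count, and the toy profiles are all assumptions made for illustration.

```python
import math
import random


def kl_divergence(p, q, eps=1e-12):
    """D_KL[p || q] in nats; eps guards against log(0) when q has empty bins."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)


def information_bottleneck(p_v_given_r, n_clusters, beta, n_iter=200, seed=0):
    """Iterate Eq. (3) together with the updates of p(c) and p(v|c)."""
    rng = random.Random(seed)
    n_r, n_v = len(p_v_given_r), len(p_v_given_r[0])
    p_r = [1.0 / n_r] * n_r  # assumption: every pixel carries equal weight

    # Random soft initialization of p(c|r).
    p_c_given_r = []
    for _ in range(n_r):
        w = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(w)
        p_c_given_r.append([x / total for x in w])

    for _ in range(n_iter):
        # p(c) = sum_r p(r) p(c|r)
        p_c = [sum(p_r[r] * p_c_given_r[r][c] for r in range(n_r))
               for c in range(n_clusters)]
        # p(v|c) = sum_r p(v|r) p(c|r) p(r) / p(c)
        p_v_given_c = [
            [sum(p_r[r] * p_c_given_r[r][c] * p_v_given_r[r][v] for r in range(n_r))
             / max(p_c[c], 1e-300)
             for v in range(n_v)]
            for c in range(n_clusters)
        ]
        # Eq. (3), evaluated in log space for numerical stability.
        for r in range(n_r):
            log_w = [
                math.log(max(p_c[c], 1e-300))
                - beta * kl_divergence(p_v_given_r[r], p_v_given_c[c])
                for c in range(n_clusters)
            ]
            shift = max(log_w)
            w = [math.exp(x - shift) for x in log_w]
            total = sum(w)
            p_c_given_r[r] = [x / total for x in w]
    return p_c_given_r


# Toy heterogeneity profiles (hypothetical): two pairs of similar pixels.
profiles = [[0.9, 0.1], [0.85, 0.15], [0.1, 0.9], [0.15, 0.85]]
for row in information_bottleneck(profiles, n_clusters=2, beta=50.0):
    print([round(x, 3) for x in row])
```

Like other fixed-point schemes of this kind, the iteration can converge to different solutions from different starting points, so in practice several initializations would typically be compared.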

In this study, the hyperparameter $\beta$, which controls the trade-off between data compression and information preservation, was set to 850 to ensure that the resulting clustering corresponds to a hard clustering; that is, the cluster attribution probability for each pixel, $p_{C|R}(c|r)$, is close to one for a single cluster and close to zero for all others. This parameter choice was also adopted in Ref. ⁴⁹. The total number of clusters was set to 4; we confirmed that the conclusions of this paper do not change when the number of clusters is varied from 3 to 6.
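The way a large $\beta$ drives the assignment toward a hard clustering can be seen directly from Eq. (3). In the following sketch the two Kullback–Leibler divergences (0.010 and 0.020 nats) and the equal cluster weights are hypothetical values chosen for illustration; only $\beta = 850$ is taken from this study, and the function name `membership` is introduced here.

```python
import math


def membership(p_c, kl_to_clusters, beta):
    """Evaluate Eq. (3) for one pixel, given its KL divergence to each cluster."""
    log_w = [math.log(pc) - beta * d for pc, d in zip(p_c, kl_to_clusters)]
    shift = max(log_w)  # subtract the maximum to avoid underflow
    w = [math.exp(x - shift) for x in log_w]
    total = sum(w)
    return [x / total for x in w]


# Hypothetical pixel that is only slightly closer to cluster 0 than to cluster 1.
p_c = [0.5, 0.5]
kl = [0.010, 0.020]
for beta in (1.0, 100.0, 850.0):
    print(beta, [round(x, 4) for x in membership(p_c, kl, beta)])
```

With these illustrative numbers the assignment is nearly even at $\beta = 1$ and almost entirely on the closer cluster at $\beta = 850$, since the difference in divergence is multiplied by $\beta$ in the exponent.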

Model animals, sample preparation, and measurement condition

This study was performed and described in accordance with the ARRIVE (Animal Research: Reporting of In Vivo Experiments) guidelines 2.0 (https://arriveguidelines.org/). Animal models were prepared according to established protocols with minor modifications^{10,88,89,90,91,92}. Eight-week-old male Slc: Sprague Dawley rats (Shimizu Laboratory Supplies, Kyoto, Japan) were housed in 12-hour light/dark cycles with free access to food and water. As recently reported⁷⁷, the 48 rats were divided into three groups receiving different diets: standard diet (SD), high-fat diet (HFD, 60% lard), and high-fat high-cholesterol diet (HFHC, 60% lard, 1.25% cholesterol, 0.5% cholic acid). All diets were purchased from Oriental Yeast Co., Ltd. (Tokyo, Japan). Four rats from each group were sacrificed at 2, 4, 8, and 16 weeks after the start of the designated feeding for sample collection. The weights (mean ± standard deviation) of the rats at sample collection, in grams, were 409.0±21.9 (SD), 430.2±27.7 (HFD), and 401±27.9 (HFHC) after 4 weeks of feeding; 472.5±35.5 (SD), 510.3±42.1 (HFD), and 529.5±49.0 (HFHC) after 8 weeks; and 563.3±33.3 (SD), 681.3±47.5 (HFD), and 633.8±43.1 (HFHC) after 16 weeks.

For sample collection, rats were placed under deep general anesthesia by intraperitoneal injection of a combination of three anesthetic agents (0.1 mg/kg medetomidine, 3.0 mg/kg midazolam, and 5.0 mg/kg butorphanol), and the liver was then surgically excised. The rats were euthanized by gentle exsanguination after liver extraction. All procedures were approved by, and are in accordance with the guidelines of, the Committee for Animal Research of the Kyoto Prefectural University of Medicine (M2022-229/238). Liver tissues excised from each sacrificed rat were sliced to approximately 1 mm thickness, immediately transferred to 4$^{\circ }$C Krebs-Henseleit buffer (KHB), and used for Raman measurements within 2 hours. During the measurements, the tissue slices were placed in glass-bottom plates and fully immersed in KHB.

The remainder of each liver was fixed with 10% formalin buffer solution (Wako Pure Chemical Industries, Ltd., Osaka, Japan), then embedded in paraffin and cut to a thickness of 4 mm, following standard protocols for each step. The slices were stained with hematoxylin and eosin (HE) after deparaffinization and then classified as normal liver tissue, NAFL, or NASH based on the NAFLD Activity Score (NAS)⁶³ and Brunt’s criteria⁷³, focusing on the overall cellular and tissue architecture, including the extent of steatosis, lobular inflammation, hepatocellular ballooning, and fibrosis staging. The classification was performed by a pathologist and a hepatologist familiar with liver histopathology. Histopathology identified 26 normal livers, 15 NAFL, and 6 NASH (and, unexpectedly, 1 fibrosis) among the livers from the 48 rats. Liver histology was mostly the same within a dietary condition but differed slightly between conditions. In particular, HFD and HFHC feeding for 16 weeks produced NAFL and NASH, respectively, and the 2-, 4-, and 8-week time points were needed to obtain livers in progression toward NAFL/NASH. Thus, a liver model reflecting the stepwise progression from normal liver to NAFL/NASH was prepared.

Raman measurements of the thick liver slices were performed with a commercial confocal Raman microscope (Raman-11, Nanophoton, Osaka, Japan) equipped with a 532-nm excitation light source and a 20$\times$/0.75 dry-type objective lens (Olympus / Evacuation, Tokyo, Japan). Raman scattering from the liver tissues was excited by point illumination of the excitation light (68 mW · µm$^{-2}$) and detected through a 50 µm confocal pinhole by a CCD camera (Pixis 400BR, Teledyne Princeton Instrument, NJ, USA) operated at -70$^{\circ }$C. Raman images covering an area of 95 µm $\times$ 345 µm (20 $\times$ 70 pixels) were recorded with a scan step of 5 µm and an exposure time of 1 s. The Raman images of NAFLD liver tissues were preprocessed using the established protocol described previously⁴⁹.
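As a consistency check on the stated scan geometry, the short sketch below relates the pixel counts, scan step, and exposure time. It assumes that the image extent is measured between the centres of the first and last pixels in each direction and that the exposure time applies to each pixel in sequence without overhead; both are assumptions made here, and the function name `scan_summary` is hypothetical.

```python
def scan_summary(n_x, n_y, step_um, exposure_s):
    """Return (width_um, height_um, n_pixels, total_exposure_s) for a raster scan."""
    width_um = (n_x - 1) * step_um   # distance between first and last pixel centres
    height_um = (n_y - 1) * step_um
    n_pixels = n_x * n_y
    return width_um, height_um, n_pixels, n_pixels * exposure_s


width, height, n_pixels, total_s = scan_summary(20, 70, step_um=5.0, exposure_s=1.0)
print(width, height, n_pixels, total_s)
```

Under these assumptions, 20 $\times$ 70 pixels at a 5 µm step reproduce the stated 95 µm $\times$ 345 µm area and correspond to 1400 spectra per image.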

Data availability

The data that support the findings of this study are available from the corresponding authors, upon request.

References

Hamada, K. et al. Raman microscopy for dynamic molecular imaging of living cells. J. Biomed. Opt. 13, 044027–044027 (2008).
Puppels, G. et al. Studying single living cells and chromosomes by confocal Raman microspectroscopy. Nature 347, 301–303 (1990).
Okada, M. et al. Label-free Raman observation of cytochrome c dynamics during apoptosis. Proc. Natl. Acad. Sci. USA 109, 28–32 (2012).
Kumamoto, Y., Harada, Y., Takamatsu, T. & Tanaka, H. Label-free molecular imaging and analysis by Raman spectroscopy. Acta Histochem. Cytochem. 51, 101–110 (2018).
Haka, A. S. et al. In vivo margin assessment during partial mastectomy breast surgery using Raman spectroscopy. Cancer Res. 66, 3317–3322 (2006).
Bergholt, M. S. et al. Fiberoptic confocal Raman spectroscopy for real-time in vivo diagnosis of dysplasia in Barrett’s esophagus. Gastroenterology 146, 27–32 (2014).
Wang, J. et al. Simultaneous fingerprint and high-wavenumber fiber-optic Raman spectroscopy improves in vivo diagnosis of esophageal squamous cell carcinoma at endoscopy. Sci. Rep. 5, 12957 (2015).
Yamamoto, T. et al. Label-free evaluation of myocardial infarct in surgically excised ventricular myocardium by Raman spectroscopy. Sci. Rep. 8, 14671 (2018).
Harada, Y. & Takamatsu, T. Raman molecular imaging of cells and tissues: towards functional diagnostic imaging without labeling. Curr. Pharm. Biotechnol. 14, 133–140 (2013).
Takemura, M. et al. Label-free assessment of the nascent state of rat non-alcoholic fatty liver disease using spontaneous Raman microscopy. Acta Histochem. Cytochem. 55, 57–66 (2022).
Takeuchi, M., Kajimoto, S. & Nakabayashi, T. Experimental evaluation of the density of water in a cell by Raman microscopy. J. Phys. Chem. Lett. 8, 5241–5245 (2017).
Adamczyk, A. et al. Toward Raman subcellular imaging of endothelial dysfunction. J. Med. Chem. 64, 4396–4409 (2021).
Belanger, M. C., Anbaei, P., Dunn, A. F., Kinman, A. W. & Pompano, R. R. Spatially resolved analytical chemistry in intact, living tissues. Anal. Chem. 92, 15255–15262 (2020).
Helal, K. M. et al. Raman spectroscopic histology using machine learning for nonalcoholic fatty liver disease. FEBS Lett. 593, 2535–2544. https://doi.org/10.1002/1873-3468.13520 (2019).
Nishiki-Muranishi, N. et al. Label-free evaluation of myocardial infarction and its repair by spontaneous Raman spectroscopy. Anal. Chem. 86, 6903–6910 (2014).
Bocklitz, T. W., Guo, S., Ryabchykov, O., Vogler, N. & Popp, J. Raman based molecular imaging and analytics: a magic bullet for biomedical applications!?. Anal. Chem. 88, 133–151 (2016).
Lussier, F., Thibault, V., Charron, B., Wallace, G. Q. & Masson, J.-F. Deep learning and artificial intelligence methods for Raman and surface-enhanced Raman scattering. TrAC Trends Anal. Chem. 124, 115796 (2020).
Guo, S., Popp, J. & Bocklitz, T. Chemometric analysis in Raman spectroscopy from experimental design to machine learning-based modeling. Nat. Protoc. 16, 5426–5459 (2021).
Brownfield, B., Lemos, T. & Kalivas, J. H. Consensus classification using non-optimized classifiers. Anal. Chem. 90, 4429–4437 (2018).
Zhang, H. et al. Rapid identification of cervical adenocarcinoma and cervical squamous cell carcinoma tissue based on Raman spectroscopy combined with multiple machine learning algorithms. Photodiagn. Photodyn. Ther. 33, 102104 (2021).
Parlatan, U. et al. Raman spectroscopy as a non-invasive diagnostic technique for endometriosis. Sci. Rep. 9, 19795 (2019).
Lee, K.-M. et al. Rapid detection and prediction of chlortetracycline and oxytetracycline in animal feed using surface-enhanced Raman spectroscopy (SERS). Food Control 114, 107243 (2020).
Zhang, H. et al. Feature fusion combined with Raman spectroscopy for early diagnosis of cervical cancer. IEEE Photonics J. 13, 1–11 (2021).
Mandrell, C. T. et al. Machine learning approach to Raman spectrum analysis of MIA Paca-2 pancreatic cancer tumor repopulating cells for classification and feature analysis. Life 10, 181 (2020).
Kang, S., Kim, I. & Vikesland, P. J. Discriminatory detection of ssDNA by surface-enhanced Raman spectroscopy (SERS) and tree-based support vector machine (Tr-SVM). Anal. Chem. 93, 9319–9328 (2021).
Moawad, A. A. et al. A machine learning-based Raman spectroscopic assay for the identification of Burkholderia mallei and related species. Molecules 24, 4516 (2019).
He, C., Wu, X., Zhou, J., Chen, Y. & Ye, J. Raman optical identification of renal cell carcinoma via machine learning. Spectrochim. Acta Part A: Mol. Biomol. Spectrosc. 252, 119520 (2021).
Zhang, L. et al. Raman spectroscopy and machine learning for the classification of breast cancers. Spectrochim. Acta Part A: Mol. Biomol. Spectrosc. 264, 120300 (2022).
Yin, G. et al. An efficient primary screening of COVID-19 by serum Raman spectroscopy. J. Raman Spectrosc. 52, 949–958 (2021).
Sciortino, T. et al. Raman spectroscopy and machine learning for IDH genotyping of unprocessed glioma biopsies. Cancers 13, 4196 (2021).
Wen, J. et al. Detection and classification of multi-type cells by using confocal Raman spectroscopy. Front. Chem. 9, 641670 (2021).
Zhu, S., Cui, X., Xu, W., Chen, S. & Qian, W. Weighted spectral reconstruction method for discrimination of bacterial species with low signal-to-noise ratio Raman measurements. RSC Adv. 9, 9500–9508 (2019).
Zheng, X. et al. Rapid and non-invasive screening of high renin hypertension using Raman spectroscopy and different classification algorithms. Spectrochim. Acta Part A: Mol. Biomol. Spectrosc. 215, 244–248 (2019).
Bakhtiaridoost, S. et al. Raman spectroscopy-based label-free cell identification using wavelet transform and support vector machine. RSC Adv. 6, 50027–50033 (2016).
Widjaja, E., Zheng, W. & Huang, Z. Classification of colonic tissues using near-infrared Raman spectroscopy and support vector machines. Int. J. Oncol. 32, 653–662 (2008).
Ralbovsky, N. M., Halamkova, L., Wall, K., Anderson-Hanley, C. & Lednev, I. K. Screening for Alzheimer’s disease using saliva: a new approach based on machine learning and Raman hyperspectroscopy. J. Alzheimer’s Dis. 71, 1351–1359 (2019).
Maruthamuthu, M. K., Raffiee, A. H., De Oliveira, D. M., Ardekani, A. M. & Verma, M. S. Raman spectra-based deep learning: A tool to identify microbial contamination. MicrobiologyOpen 9, e1122 (2020).
Yang, J. et al. Application of serum SERS technology combined with deep learning algorithm in the rapid diagnosis of immune diseases and chronic kidney disease. Sci. Rep. 13, 15719 (2023).
Ibtehaz, N. et al. RamanNet: a generalized neural network architecture for Raman spectrum analysis. Neural Comput. Appl. 35, 18719–18735 (2023).
Ali, S., Hassan, M., Saleem, M. & Tahir, S. F. Deep transfer learning based hepatitis B virus diagnosis using spectroscopic images. Int. J. Imaging Syst. Technol. 31, 94–105 (2021).
Deng, L., Zhong, Y., Wang, M., Zheng, X. & Zhang, J. Scale-adaptive deep model for bacterial Raman spectra identification. IEEE J. Biomed. Health Inform. 26, 369–378 (2021).
Lee, W., Lenferink, A. T., Otto, C. & Offerhaus, H. L. Classifying Raman spectra of extracellular vesicles based on convolutional neural networks for prostate cancer detection. J. Raman Spectrosc. 51, 293–300 (2020).
Huang, S. et al. Blood species identification based on deep learning analysis of Raman spectra. Biomed. Opt. Express 10, 6129–6144 (2019).
Shin, H. et al. Early-stage lung cancer diagnosis by deep learning-based spectroscopic analysis of circulating exosomes. ACS Nano 14, 5435–5444 (2020).
Otange, B., Birech, Z., Rop, R. & Oyugi, J. Estimation of HIV-1 viral load in plasma of HIV-1-infected people based on the associated Raman spectroscopic peaks. J. Raman Spectrosc. 50, 620–628 (2019).
Huang, L. et al. Rapid, label-free histopathological diagnosis of liver cancer based on Raman spectroscopy and deep learning. Nat. Commun. 14, 48 (2023).
Weng, S., Xu, X., Li, J. & Wong, S. T. Combining deep learning and coherent anti-Stokes Raman scattering imaging for automated differential diagnosis of lung cancer. J. Biomed. Opt. 22, 106017–106017 (2017).
Baffa, M. F. O. et al. Deep neural networks can differentiate thyroid pathologies on infrared hyperspectral images. Comput. Methods Prog. Biomed. 247, 108100 (2024).
Taylor, J. N. et al. High-resolution Raman microscopic detection of follicular thyroid cancer cells with unsupervised machine learning. J. Phys. Chem. B 123, 4358–4372 (2019).
Dina, N. E. et al. Characterization of clinically relevant fungi via SERS fingerprinting assisted by novel chemometric models. Anal. Chem. 90, 2484–2492 (2018).
Dina, N. E., Gherman, A. M. R., Colnţă, A., Marconi, D. & Sârbu, C. Fuzzy characterization and classification of bacteria species detected at single-cell level by surface-enhanced Raman scattering. Spectrochim. Acta Part A: Mol. Biomol. Spectrosc. 247, 119149 (2021).
Durgarao, N. & Sudhavani, G. Diagnosing skin cancer via C-means segmentation with enhanced fuzzy optimization. IET Image Processing 15, 2266–2280 (2021).
Wang, Y.-P., Wang, Y. & Spencer, P. Fuzzy clustering of Raman spectral imaging data with a wavelet-based noise-reduction approach. Appl. Spectrosc. 60, 826–832 (2006).
El Abbassi, M. et al. Benchmark and application of unsupervised classification approaches for univariate data. Commun. Phys. 4, 50 (2021).
Kang, S., Nam, W., Zhou, W., Kim, I. & Vikesland, P. J. Nanostructured Au-based surface-enhanced Raman scattering substrates and multivariate regression for pH sensing. ACS Appl. Nano Mater. 4, 5768–5777 (2021).
Jermyn, M. et al. Neural networks improve brain cancer detection with Raman spectroscopy in the presence of operating room light artifacts. J. Biomed. Opt. 21, 094002–094002. https://doi.org/10.1117/1.jbo.21.9.094002 (2016).
Pavillon, N., Hobro, A. J., Akira, S. & Smith, N. I. Noninvasive detection of macrophage activation with single-cell resolution through machine learning. Proc. Natl. Acad. Sci. USA 115, E2676–E2685. https://doi.org/10.1073/pnas.1711872115 (2018).
Li, Y. et al. Hematoxylin and eosin staining of intact tissues via delipidation and ultrasound. Sci. Rep. 8, 1–8. https://doi.org/10.1038/s41598-018-30755-5 (2018).
Kong, K., Rowlands, C. J., Elsheikha, H. & Notingher, I. Label-free molecular analysis of live Neospora caninum tachyzoites in host cells by selective scanning Raman micro-spectroscopy. Analyst 137, 4119–4122 (2012).
Rowlands, C. J. et al. Rapid acquisition of Raman spectral maps through minimal sampling: applications in tissue imaging. J. Biophotonics 5, 220–229 (2012).
Zhang, S. et al. Dynamic sparse sampling for confocal Raman microscopy. Anal. Chem. 90, 4461–4469 (2018).
Tabata, K. et al. On-the-fly Raman microscopy guaranteeing the accuracy of discrimination. Proc. Natl. Acad. Sci. USA 121, e2304866121 (2024).
Kleiner, D. E. et al. Design and validation of a histological scoring system for nonalcoholic fatty liver disease. Hepatology 41, 1313–1321. https://doi.org/10.1002/hep.20701 (2005).
Tokushige, K. et al. Evidence-based clinical practice guidelines for nonalcoholic fatty liver disease/nonalcoholic steatohepatitis 2020. J. Gastroenterol. 56, 951–963. https://doi.org/10.1007/s00535-021-01796-x (2021).
Brunt, E. M. Nonalcoholic steatohepatitis: definition and pathology. Semin. Liver Dis. 21, 003–016 (2001).
Angulo, P. Nonalcoholic fatty liver disease. N. Engl. J. Med. 346, 1221–1231 (2002).
Liu, W., Baker, R. D., Bhatia, T., Zhu, L. & Baker, S. S. Pathogenesis of nonalcoholic steatohepatitis. Cell. Mol. Life Sci. 73, 1969–1987 (2016).
McPherson, S. et al. Evidence of NAFLD progression from steatosis to fibrosing-steatohepatitis using paired biopsies: implications for prognosis and clinical management. J. Hepatol. 62, 1148–1155 (2015).
Lindenmeyer, C. C. & McCullough, A. J. The natural history of nonalcoholic fatty liver disease-an evolving view. Clin. Liver Dis. 22, 11–21 (2018).
Pais, R. et al. A systematic review of follow-up biopsies reveals disease progression in patients with non-alcoholic fatty liver. J. Hepatol. 59, 550–556 (2013).
Singh, S. et al. Fibrosis progression in nonalcoholic fatty liver vs nonalcoholic steatohepatitis: a systematic review and meta-analysis of paired-biopsy studies. Clin. Gastroenterol. Hepatol. 13, 643–654 (2015).
Matteoni, C. A. et al. Nonalcoholic fatty liver disease: a spectrum of clinical and pathological severity. Gastroenterology 116, 1413–1419 (1999).
Brunt, E. M., Janney, C. G., Di Bisceglie, A. M., Neuschwander-Tetri, B. A. & Bacon, B. R. Nonalcoholic steatohepatitis: a proposal for grading and staging the histological lesions. Am. J. Gastroenterol. 94, 2467–2474 (1999).
Brunt, E. M. et al. Nonalcoholic fatty liver disease (NAFLD) activity score and the histopathologic diagnosis in NAFLD: distinct clinicopathologic meanings. Hepatology 53, 810–820 (2011).
Studer, V. et al. Compressive fluorescence microscopy for biological and hyperspectral imaging. Proc. Natl. Acad. Sci. USA 109, E1679–E1687. https://doi.org/10.1073/pnas.1119511109 (2012).
Tishby, N., Pereira, F. C. & Bialek, W. The information bottleneck method. The 37th annual Allerton Conf. on Communication, Control, and Computing, 368–377 (1999).
Helal, K. M. et al. Raman imaging of rat nonalcoholic fatty liver tissues reveals distinct biomolecular states. FEBS Lett. 597, 1517–1527. https://doi.org/10.1002/1873-3468.14600 (2023).
Ikemoto, K. et al. Raman spectroscopic assessment of myocardial viability in Langendorff-perfused ischemic rat hearts. Acta Histochem. Cytochem. 54, 65–72 (2021).
Qiu, X. et al. Effect of red light-emitting diodes irradiation on hemoglobin for potential hypertension treatment based on confocal micro-Raman spectroscopy. Scanning 2017, 5067867 (2017).
Kakita, M., Okuno, M. & Hamaguchi, H.-o. Quantitative analysis of the redox states of cytochromes in a living L929 (NCTC) cell by resonance Raman microspectroscopy. J. Biophotonics 6, 256–259 (2013).
Li, M. et al. Label-free chemical imaging of cytochrome P450 activity by Raman microscopy. Commun. Biol. 5, 778 (2022).
Abramczyk, H., Surmacki, J. M., Kopeć, M., Jarczewska, K. & Romanowska-Pietrasiak, B. Hemoglobin and cytochrome c. Reinterpreting the origins of oxygenation and oxidation in erythrocytes and in vivo cancer lung cells. Sci. Rep. 13, 14731 (2023).
Kochan, K., Marzec, K., Maslak, E., Chlopicki, S. & Baranska, M. Raman spectroscopic studies of vitamin A content in the liver: a biomarker of healthy liver. Analyst 140, 2074–2079 (2015).
Rinella, M. E. et al. A multisociety Delphi consensus statement on new fatty liver disease nomenclature. Hepatology 78, 1966–1986 (2023).
Taylor, J. N., Li, C.-B., Cooper, D. R., Landes, C. F. & Komatsuzaki, T. Error-based extraction of states and energy landscapes from experimental single-molecule time-series. Sci. Rep. 5, 9174 (2015).
Petry, R. et al. Asbestos mineral analysis by UV Raman and energy-dispersive X-ray spectroscopy. Chemphyschem 7, 414–420 (2006).
Taylor, J. N. et al. Correction for extrinsic background in Raman hyperspectral images. Anal. Chem. 95, 12298–12305 (2023).
Van Herck, M. A., Vonghia, L. & Francque, S. M. Animal models of nonalcoholic fatty liver disease-a starter’s guide. Nutrients 9, 1072. https://doi.org/10.3390/nu9101072 (2017).
Omagari, K. et al. Effects of a long-term high-fat diet and switching from a high-fat to low-fat, standard diet on hepatic fat accumulation in Sprague-Dawley rats. Dig. Dis. Sci. 53, 3206–3212 (2008).
Okada, Y. et al. Rosuvastatin ameliorates high-fat and high-cholesterol diet-induced nonalcoholic steatohepatitis in rats. Liver Int. 33, 301–311 (2013).
Ichimura, M. et al. High-fat and high-cholesterol diet rapidly induces non-alcoholic steatohepatitis with advanced fibrosis in Sprague-Dawley rats. Hepatol. Res. 45, 458–469 (2015).
Kanuri, G. & Bergheim, I. In vitro and in vivo models of non-alcoholic fatty liver disease (NAFLD). Int. J. Mol. Sci. 14, 11963–11980 (2013).

Acknowledgements

We thank J. Nicholas Taylor for his discussions to improve the manuscript. We also thank Atsuyoshi Nakamura, Koji Tabata, and Khalifa Helal for important discussions during this research. This work was partially supported by JST SPRING Grant Number JPMJSP2119 (to R.K.), Japan Science and Technology Agency (JST) COI-NEXT Grant Number JPMJPF2009, Japan Society for the Promotion of Science (JSPS), Grant-in-Aid for Scientific Research (No. 25287105) (to T.K.), Grant-in-Aid for Exploratory Research (No. 25650044) (to T.K.), JST Core Research for Evolutional Science and Technology (CREST), Grant Number JPMJCR1662, Japan (to T.K., K.F., Y.H.) and JPMJCR2333 (to T.K. and K.F.), and the Research Program of “Dynamic Alliance for Open Innovation Bridging Human, Environment and Materials” in “Network Joint Research Center for Materials and Device” (to Y.H.).

Author information

Yuta Mizuno
Present address: Global Research and Development Center for Business by Quantum-AI technology (G-QuAT), National Institute of Advanced Industrial Science and Technology (AIST), 1-1-1 Umezono, Tsukuba, Ibaraki, 305-8560, Japan
Kosuke Hashimoto
Present address: Research Center for Pre-disease Science, University of Toyama, 2630 Sugitani, 930-0194, Toyama, Japan
Jean-Emmanuel Clément
Present address: UMR 8516 – LASIRE – Laboratoire de Spectroscopie pour Les Interactions, La Réactivité et L’Environnement, CNRS, Univ. Lille, Lille, France, F-59000
Ryoya Kondo and Yuta Mizuno have contributed equally to this work.

Authors and Affiliations

Graduate School of Chemical Sciences and Engineering, Hokkaido University, Kita 13 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-8628, Japan
Ryoya Kondo, Yuta Mizuno & Tamiki Komatsuzaki
Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, Kita 20 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0020, Japan
Yuta Mizuno & Tamiki Komatsuzaki
Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan
Yuta Mizuno, Jean-Emmanuel Clément & Tamiki Komatsuzaki
Department of Pathology and Cell Regulation, Graduate School of Medical Science, Kyoto Prefectural University of Medicine, Kajii-cho, Kawaramachi-Hirokoji, Kamigyo-ku, Kyoto, 602-8566, Japan
Kentaro Mochizuki, Kosuke Hashimoto, Yasuaki Kumamoto & Yoshinori Harada
Department of Biomedical Sciences, School of Biological and Environmental Sciences, Kwansei Gakuin University, 1 Gakuen, Uegahara, Sanda, Hyogo, 669-1330, Japan
Kosuke Hashimoto
Department of Applied Physics, The University of Osaka, 2-1 Yamadaoka, Suita, Osaka, 565-0871, Japan
Yasuaki Kumamoto & Katsumasa Fujita
Institute for Open and Transdisciplinary Research Initiatives, The University of Osaka, Yamadaoka, Suita, Osaka, 565-0871, Japan
Yasuaki Kumamoto, Katsumasa Fujita & Tamiki Komatsuzaki
Advanced Photonics and Biosensing Open Innovation Laboratory, AIST-Osaka University, Yamadaoka, Suita, Osaka, 565-0871, Japan
Katsumasa Fujita
The Institute of Scientific and Industrial Research (SANKEN), The University of Osaka, 8-1 Mihogaoka, Ibaraki, Osaka, 567-0047, Japan
Tamiki Komatsuzaki

Authors

Ryoya Kondo
Yuta Mizuno
Kentaro Mochizuki
Kosuke Hashimoto
Jean-Emmanuel Clément
Yasuaki Kumamoto
Katsumasa Fujita
Yoshinori Harada
Tamiki Komatsuzaki

Contributions

R.K. performed data analysis under the supervision of Y.M., J.-E.C., and T.K. K.M., K.H., Y.K., K.F., and Y.H. contributed to the preparation of the Raman images of the liver tissues of the NAFLD model. T.K. and Y.H. supervised the entire research project. R.K., Y.M., and T.K. co-wrote the original draft. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Yoshinori Harada or Tamiki Komatsuzaki.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Kondo, R., Mizuno, Y., Mochizuki, K. et al. Integrating spatial and chemical information enhances differentiation of non-alcoholic steatohepatitis states in Raman imaging. Sci Rep 15, 35216 (2025). https://doi.org/10.1038/s41598-025-17495-z

Received: 01 April 2025
Accepted: 25 August 2025
Published: 13 October 2025
Version of record: 13 October 2025
DOI: https://doi.org/10.1038/s41598-025-17495-z