Introduction

Understanding the migration of gases following an underground nuclear explosion (UNE) is a vital part of the National Nuclear Security Administration (NNSA) mission of world-wide nuclear test monitoring1. UNEs produce radionuclide gases such as xenon (Xe) that can migrate to the surface minutes to months following an explosion, driven by gas migration mechanisms that include pressure and thermal forcing combined with barometric (i.e., atmospheric) pressure variations2,3,4,5,6,7,8,9,10. Detection of certain radioactive gases that have leaked into the atmosphere is considered a “smoking gun” that an explosion was nuclear, rather than chemical.

Understanding gas migration from underground explosions is improved through controlled observations of subsurface gas transport through experimentation. The Physics Experiment 1 (PE1) is series of chemical explosions (PE1-A, -B, -DL) designed to allow real-time monitoring of gas migration in the rocks surrounding the explosion1. Successful predictions and interpretation of gas transport experiments require rigorous characterization of hydraulic properties of the geologic units surrounding the blast11,12. Characterization of the hydraulic properties of the rocks at P-Tunnel, which houses PE1, has been done through a joint effort of multiple national laboratories1. PE1-A, the first experiment of the series, was a chemical explosion test performed in P-Tunnel in October 2023.

P-Tunnel is located within Aqueduct Mesa in the Nevada National Security Site (NNSS; Fig. 1)13. The PE1-A experiment is located within primarily volcanic tuff geology. Volcanic tuff is highly heterogeneous and its hydraulic properties are controlled by the degree of welding, presence of fractures, mineralogy, and weathering14. Characterization of tuff lithologies is an integral part of research into the monitoring of nuclear tests, subsurface remediation of legacy disposal sites, and siting of deep geologic nuclear waste repositories. At Los Alamos National Laboratory (LANL), legacy nuclear and hazardous waste disposal areas are located within volcanic tuff (Bandelier Tuff Formation). Previous research has relied on flow and transport characterization of volcanic tuff to analyze and simulate groundwater chromium plume migration in the aquifer beneath LANL15,16, volatile organic compounds (VOCs) and tritium plumes in the unsaturated zone17,18,19, radioactive contamination in soils surrounding transuranic liquid waste facility in TA-50 and -5420 and moisture and contaminant migration through the vadose zone from waste pits at Material Disposal Area G21,22. Extensive permeability and tracer transport tests have been performed to analyze the flow properties of welded and partially welded tuffs and to determine their suitability to store nuclear waste at Yucca Mountain, NV23,24. Additional tuff testing related to the Yucca Mountain Project was performed at the Apache Leap Research Site, Arizona, USA (ALRS)25,26,27,28.

Fig. 1
figure 1

(A) Location of the Nevada National Security Site, Aqueduct Mesa (red box) which houses U12p Tunnel (P-Tunnel). (B) Cross section of Aqueduct Mesa from X – X’ in (C). The layers correspond to the following lithologies: upper nonwelded to partially welded tuff (UNPWT), upper welded tuff (UWT), vitric nonwelded tuff (VNT), and upper zeolitic nonwelded tuff (UZNT). (C) Topographic contours of Aqueduct Mesa and P-Tunnel Complex. Yellow stars represent the locations of the Physical Experiment 1 (PE1) chemical explosion experiments (PE1-A, -B, and -DL). Figure modified from Myers et al. (2024).

Quantification of permeability depends heavily on the scale at which the experiment/simulation is performed29. Derived permeabilities are observed to change by orders of magnitude between measurements at the laboratory (0.01 – 10 m), borehole (10 – 100 m) and regional (1000 + m) scales29. There is strong evidence of scaling effects in the volcanic tuffs at the Yucca Mountain and ALRS. Cross-well permeability tests exhibit an increase in permeability of two orders of magnitude (ALRS25,26,27,28; Yucca Mountain24) compared to single-well tests. The scaling effects at the ALRS have been attributed to fractures in welded tuff observed across boreholes30. The orientation, density, and connectivity of fractures create a directional dependency in the scaling effects31. In contrast, the non-welded tuffs of the Crater Flats Group in the Yucca Mountains are zeolitized and contain few fractures, leading to a matrix-flow-dominated regime. Cross-well tests in boreholes at Yucca Mountain are thought to be influenced by a nearby fault24. However, within a single borehole, the scaling effect is negligible. Illman et al. (2004)31 incrementally increased the testing interval of a packer test within a single borehole (0.5 – 20 m) and found no distinguishable trend; permeability was within one order of magnitude. The scaling effects at the ALRS have been explained numerically using stochastic-scaling theory32,33 which treats the permeability field as a random fractal34,35. Similar scaling effects have been seen in other non-tuff geology and are highly dependent on the scale at which they are sampled and the heterogeneities the scale incorporates36,37,38,39,40.

In preparation for the PE1-A experiment, a high explosive source was emplaced in a 4 m per side rectangular cavity. After emplacement of the high explosives and backfilling the void space with sand, the access drift to the cavity was sealed by a grout plug. Gas samplers with pressure transducers were placed in boreholes 11 to 41 m away from the cavity and grouted in place. A cavity pressurization test was performed to assess the integrity of the grout plug representing the upper end of the spatial scales we used to measure permeability. Estimating the permeability field serves two primary roles: informing engineering decisions such as optimizing borehole sensor placement by way of predictive modeling, and in interpretation of the flow and transport patterns measured following the experiment. Accuracy in the permeability estimates at the relevant scale of interest is therefore critical to the success of several main goals of PE1-A. Uncertainty and heterogeneity in the vitric non-welded tuff (VNT) permeability specifically stems from several factors. Although microfractures are observed in the VNT layers41, the weak strength of the rock inhibits the maintenance of open fractures14. The VNT units lie on the boundary between the non-welded vitric tuff and altered zeolitic tuff14. Within this transition zone, chemical alteration has influenced the pore structure from the addition of clays and zeolites which affect permeability42. We hypothesize that estimated values of matrix permeability will be influenced by scale effects as the incorporation of heterogeneity increases with growing spatial scale. Much like the upscaling from single- to cross-well pumping tests in the ALRS and Yucca Mountain sites, VNT unit bulk permeability is expected to increase with the introduction of preferential flow resulting from the presence of high-permeability channels within the matrix as the length scale increases.

In this study, we estimate permeability in the PE1-A test bed at three scales. Permeabilities are estimated from mercury intrusion measurements on core samples (measurement scale ≈ 2.54 cm), within boreholes (measurement scale ≈ 1 m), and by pressurizing the pre-shot experimental cavity (measurement scale > 10 m). Core permeability was estimated using mercury intrusion. Borehole permeability was calculated using data from a dual-packer system that interrogated 0.5 to 2 m sections of boreholes by measuring the flow rate of air versus injection interval pressure. The cavity pressurization test to assess the grout plug integrity was leveraged to characterize subsurface tuff gas-phase permeability by measuring pressure changes in nearby gas sampling boreholes (11 to 41 m away) caused by pressure in the cavity. Both the permeability estimates of the packer and cavity pressurization tests rely on parameter estimation using multiphase flow and transport simulations.

Physical and hydraulic characterization of P-tunnel geology for numerical models

The geology in which the P-Tunnel complex is situated is primarily composed of pumice- and ash-fall and ash flow deposits (Timber Mountain Formation) from the southwestern Nevada volcanic field approximately 10 km away, which erupted between 15 and 11 Ma43. The volcanic deposits cover the pre-eruption topography of sedimentary rock heavily deformed after a compressional event in the Cretaceous44. Some tuffs have been reworked and exhibit various degrees of welding and zeolitization, however, much of the volcanic lithology has experienced little structural deformation from Basin and Range Extension45. The P-Tunnel is approximately 350 m above the water table. Perched aquifers have been observed in other nearby tunnels, but P-Tunnel remains relatively dry14. Figure 1B shows the lithology of the Aqueduct Mesa, where P-Tunnel resides. From the surface to 400 m depth, the lithology is as follows: upper nonwelded to partially welded tuff (UNPWT), upper welded tuff (UWT), vitric nonwelded tuff (VNT) and upper zeolitic nonwelded tuff (UZNT). The eastern portion of P-Tunnel lies mostly within the VNT14. The PE1-A cavity is positioned within the transition zone where the nonwelded tuff becomes progressively zeolitized with increasing depth as it grades downward into the UZNT. Eleven geologically distinct subunits have been identified within the VNT units exposed by P-Tunnel excavation. The units are ordered from shallowest to deepest, VNT-a lies above VNT-1–10, with VNT-10 being the deepest.

Heterogeneity in tuff forms via depositional or diagenetic processes. These heterogeneities include the degree of welding, weathering, and presence of fractures all of which are present at the P-Tunnel site14. The degree of welding is dependent on the temperature of the ash when it was deposited. The zeolitization and clay formation occurs via weathering of tuff in events such as water table fluctuations. The presence of these minerals reduces porosity and permeability. Tuff may develop fractures or microfractures due to proximal seismic events, both anthropogenic and natural. The development is controlled by the strength of the tuff and the proximity to the seismic event. Nine fractures were identified in the drift near the PE1-A cavity with apertures less than 5 mm12. Depending on the type and prevalence of heterogeneous features, the scaling effects will increase or decrease the permeability of the volume of rock. Characterization studies were performed within the P-Tunnel to understand the physical and hydraulic characteristics of the VNT units. A multi-scale approach was utilized to quantify the scaling effects at three scales: 1) core/grab sample, 2) packer test, 3) cavity pressurization tests.

VNT layer core/grab sample characterization

A series of boreholes (AC-1, GI-2, GI-3, GI-4, GI-5, HF-1, and GS-1–8) were drilled in the drifts to characterize the inter- and intra-subunit heterogeneity in the VNT. Core and grab sample analysis is described in detail in Bodmer et al. (2024)12. To briefly summarize, core samples from the boreholes and samples collected from the drift walls (i.e., grab samples) underwent in-situ water content analysis, thermogravimeteric analysis, and gas and mercury porosimetry to characterize their hydraulic and physical properties (saturation, density, porosity, and relative permeability)11. Permeabilities derived from mercury injection capillary pressure (MICP) analysis were calculated following Swanson (1981)46. Pore size distributions were obtained using the Young–Laplace equation and values of the surface tension of mercury and the mercury-air/vacuum-rock contact angle11. The porosity measurements obtained from these analyses were corroborated through petrographic observations of thin sections.

To understand the nature of unsaturated flow, Heath et al. (2021)41 performed mercury intrusion and direct air–water measurements on tuff core samples across the NNSS. Of the samples the authors analyzed, one is in the P-tunnel complex with similar lithology to the VNT layers (UE-12p#7, 601.5). Petrographic analysis found the pore structure is composed of a triple continuum of matrix, pre-existing and explosion-induced microfractures. Mercury intrusion and direct air–water measurements showed that this complicated pore size distribution creates a complex capillary pressure response that can be fit by a multimodal van Genutchen model and is discussed in Supplementary Information section. Macrofractures are not present in the VNT layers described as poorly fractured by Prothro et al. (2018)14. This is due to the relatively low strength of tuff and lack of cooling joints. Consequently, we did not consider a dual permeability/porosity model in our simulations, which can represent the effects of both fracture and matrix.

Borehole permeability test design

The packer tests were performed in boreholes GI-2, GI-3, and GI-4 during March 2021, AC-1, GI-5, GI-6, and HF-1 during June 2022 and GS-1–8 during August to November 2022. The procedures for each packer test are explained in detail in Fritz et al. (2023)47. In short, the dual-packer assembly consists of two inflatable packers inserted into a borehole. Air is pumped at a constant rate into the void space between the packers, known as the injection interval. Pressure is recorded once a steady-state pressure is reached in the injection interval. Packer tests are usually performed in triplicate with varying injection rates. The steady-state interval pressure can be used to estimate the permeability of the surrounding rock, analytically or numerically.

Gas samplers were installed in the GS-1–8 boreholes. Boreholes GS-2, -4, -6, -7, and -8 were drilled from the tunnel upward into the surrounding rock, while GS-1, -3, -5 were drilled with a downward orientation. During August-November 2022, the upward-oriented holes utilized a single-packer assembly installed with the gas sampler whereas down-going boreholes have a dual-packer system. Preliminary testing of hardware, grout mixture and grouting method (in a near-by practice borehole) indicated that there was some potential for grout migration around the packer, resulting in an impact on the permeability of the sampling zone. Several measures were taken to mitigate this risk, including the use of a dual packer string on the down-going boreholes. Other measures included a thicker grout recipe close to the packer and a multi-lift grout emplacement strategy to reduce the head pressure on the liquid grout closest to the sampling zone. Packer pressurizations were performed before and after grout was emplaced behind the packers to ensure minimal disruption to the VNT permeability. In total, 105 packer tests were performed within 15 boreholes. In the AC, GI, and HF boreholes, packer tests were performed at multiple depths to characterize each of the VNT units.

A previous PE1-related investigation of gas permeability used numerical modeling to replicate experimental data from the dual-packer; pressure versus flow tests in boreholes and determined the gas permeability of eleven VNT subunits12. This preliminary characterization assumed constant hydraulic properties across the VNT units. Subsequent analysis of core and “grab” samples of the eleven VNT subunits showed that saturation and porosity vary significantly between units on a meter to submeter scale. Variations in saturation and porosity that were not included in the original permeability estimates require an update to the initial analysis.

Cavity pressurization test design

Prior to the PE1-A experiment, a cavity pressurization test was performed to verify the integrity of the grout plug, sealing the drift access from the cavity. Air was pumped into the cavity (injection rate: 15,000 to 30,000 L/min) in two stages following the first and third pour of the grout plug. Five pours were conducted in total. The cavity was pressurized for 2.5 h. At the time of the pressurization test, the volume of the cavity was 32 m3 and was filled the explosive device and sand. Pressure transducers installed in the cavity and in the GS-1–8 boreholes recorded pressure signal at 5 to 35 m away from the center of the cavity and across 6 of the 11 VNT layers. Pressure in the cavity peaked at ~ 0.15 MPa (~ 22 psi) and transducers continued recording for several days after the pressurization stopped to observe the pressure decay proximal to the cavity. The integrity of the grout plug was successfully verified, and the pressure recorded in GS-1–8 was used to calculate the permeability of each VNT layer.

Numerical simulations

Permeability calibration of the VNT units was achieved using numerical models and an inverse code, iteratively adjusting parameters and running the model until the data and model results reach a best fit. The porous media flow and transport simulator (FEHM)48,49 simulates gas flow in the packer and cavity pressurization tests. Parameter optimization is performed by the inverse code PEST50. PEST calibrates permeability either by the Gauss-Marquardt-Levenberg method51 for the packer test and Tikhonov Regularization52 for the cavity pressurization test. Previous investigations using FEHM coupled with PEST have been used to calibrate subsurface flow and transport properties in similar hydrologic environments including Yucca Mountain, NV and the Apache Leap Research Site, AZ5,24,30,53.

Governing equations for flow and mass transport

FEHM is used to simulate multiphase isothermal porous flow through the VNT matrix. Since the permeability tests were conducted in the subsurface, temperatures are assumed to be constant and isothermal properties are used. The gas flow and injection in unsaturated porous media is given by:

$$\frac{{\partial \left( {\left( {\theta - 1} \right)\rho _{a} \phi } \right)}}{{\partial t}} = ~\nabla ~ \cdot \left[ {\frac{{k\rho _{a} }}{{\mu _{a} }}k\left( \theta \right)\left( {\nabla p_{a} - \rho _{a} gz} \right) + Q_{a} } \right]$$
(1)

where \(\theta\) is saturation, \({\rho }_{a}\) and \({\mu }_{a}\) is the density and viscosity of air, respectively, \(\phi\) is porosity, \(t\) is time, \(k\) is the intrinsic permeability, \({p}_{a}\) is the air pressure, \(g\) is gravity, \(z\) is depth, and \(Qa\) is an air source term. The relative permeability of each phase, \(k\left(\theta \right)\), is a function of saturation. Here, we use a multimodal van Genutchen relative permeability model based on Heath et al. (2021)41. The non-linearity is a result of the pore distribution, a combination of matrix and post-/pre-existing microfractures, in VNT units. All simulations are performed on a 2D mesh that assumes radial symmetry around the x-axis or y-axis in the packer and cavity pressurization tests, respectively.

Packer test model design and assumptions

The packer test domain extends 10 m radially and 10 m in length. The center of the borehole lies along the x = 0 m axis with 0.06 m radius (Fig. 2). Cell size in the x-direction varies from 1 cm within the borehole to Δx = 0.1 m and spacing increases geometrically by a factor of 1.25 radially away from the borehole to a maximum Δx of 2 m. The Δy is constant at 5 cm spacing.

Fig. 2
figure 2

Model domains of the two permeability experiments. (A) Dual packer assembly with dashed box showing the model domain (C). Boundary conditions (BC) are labeled on the domain edges (Const P = Constant Pressure). Dimensions of packer assemblies differ between tests (Supplementary Figure S1). (B) Model domain and boundary conditions of the cavity pressurization test simulation. Dashed box (B) shows the VNT units and positions of the GS pressure transducers (D). Translation of GS-1–8 preserves the distance from the cavity to the sampler and the host VNT layer.

Position, length, and number of the packers in the model are dependent on the packer assembly and change for each set of packer tests (Supplementary Figure S1). The packers are assumed to be impermeable (k = 1E-21 m2), and the surrounding rock is homogeneous with properties corresponding to the unit where the test was performed. Hydraulic properties (saturation, porosity, and grain density) are averaged for VNT layers 3–5 from data collected from grab samples (Table 1). Saturation and grain density have not been analyzed experimentally for the other VNT layers so we use the average of the three analyzed VNT layers, 65% saturation and density of 2,260 kg/m3. The hydraulic properties of the packers, VNT layers, and borehole void space are reported in Table 1. Simulations were isothermal since we do not expect a large change in temperature at the scale or pumping rates of the packer tests.

Table 1 Physical properties and multi-modal van Genutchen parameters41 of simulated units/features. Results for in situ water content, thermogravimeteric analysis, and gas and mercury porosimetry on P-Tunnel grab/core samples are in parentheses11. Average of saturation and density values from other VNT units (0.65 and 2260, respectively) are used for VNT1, 6–10.

Constant pressure (type 2 Neumann) boundary conditions are enforced on the longitudinal boundaries of the domain and a constant air flux (\({Q}_{a}\); Eq. 1) boundary condition is used in the injection chamber. A reflection boundary condition is used on the radial boundary. Depending on the packer test, pumping rates ranged between 0.1 to 20 L/min. Boundary effects are not observed to influence the model at pumping rates of 20 L/min (see Supplementary Figure S4).

An extensive description of the methodology of the PEST calibration used in this study is found in the Supplementary Information section. To summarize here, we allow PEST to run the FEHM simulation and adjust the VNT permeability with the assumption that the permeability value is between 1E-12 to 1E-18 m2 (initially \(k\) = 1E-15 m2 or ~ 1 mD). After comparing the simulated versus observed steady-state pressure within the injection interval using the Gauss-Marquardt–Levenberg method51, PEST calculates a new permeability for the FEHM simulation. This iterative process continues until PEST minimizes the error between the observed and simulated data, known as the objective function.

Cavity pressure test model design and assumptions

The cavity pressurization test is simulated using FEHM on a 150 m by 150 m mesh to avoid boundary effects. The element length discretization of the tetrahedral mesh is uniform (Δx = Δy = 1 m) except within the cavity (Δx = Δy = 0.25 m). The mesh is constructed using the LANL gridding tool LaGrit54.

The spherical cavity lies within the VNT layers. The positions of the GS-1–8 samplers are translated from 3D space onto 2D space based on their residing VNT layer and distance from the cavity center (Fig. 2). The cavity is assumed to be filled with dry sand that is highly permeable. We allow the permeability to vary from 1E-9 to 1E-12 in the calibration process. The saturation, porosity and density of the VNT layers are assigned using data collected from SNL (Table 1). The timestep size varied from 0.0086 to 650 s. Constant pressure boundary conditions (type 1; Dirichlet) are assumed on the top, bottom, and right lateral edge of the model domain (P = 0.08 MPa; atmospheric pressure at the NNSS). The effects of air-static gradient and the influence of barometric fluctuations at the land surface were not considered since the forcing from the cavity pressurization is much greater (0.15 MPa). The left lateral edge is a no-flux/reflection (type 2; Neumann) boundary condition, about which the model solution is radially symmetric. Pressurization of the cavity is represented by air injected uniformly within the entire cavity volume as a transient specified flux (type 2; Neumann) boundary condition. The air source term is distributed within the cavity volume and varies from 3,000 to 30,000 L/min over the 2.5-h injection time.

Unlike the packer tests, pressures at the gas samplers in the cavity pressurization test never achieved steady-state pressure due to the time-varying injection rate into the cavity. PEST is provided with 500 randomly selected observation points from the pressure data recorded by the GS samplers from the start of sampling through one day. PEST iteratively adjusts the permeability of the FEHM forward models using the Tikhonov regularization method50 until the objective function is minimized.

Results

Core measurements

Physical and hydrologic characterization results from core measurements are displayed in Tables 1 and 2. The VNT units remain moderately to nearly fully saturated (40–92%). Overall, porosity decreases with depth from VNT-1 to VNT-10. However, the porosity within one unit may vary considerably. Porosity measurements were taken from multiple cores within VNT-3–5 and range in porosity from 18–42%. Core permeability estimates range two orders of magnitude. MICP measurements were performed on samples from VNT-1, -3, -4, -5, -6, -9, -10. The highest core permeability estimate was observed in VNT-6 (2E-13 m2). The variability in permeability of the units captures the heterogeneous nature of tuff hydraulic properties at the local scale.

Table 2 Estimated and calibrated permeability across three scales of measurements: (1) Core permeability estimate based on capillary intrusion, (2) packer tests (3) cavity pressurization test. A full list of the 106 packer test measurements in supplemental materials (Supplementary Table S2). All boreholes were used in the packer tests with the exception of DA-1.

Packer test

Calibrated permeability values from packer tests demonstrate the spatial variability of flow properties between and within the VNT units. Table 2 reports the permeability range for each VNT unit. Maximum, minimum and median permeability values are reported to demonstrate the range of permeability in each unit. Multiple packer tests were performed in the AC, HF, and GI boreholes to characterize multiple VNT units with one borehole.

In the shallower units (VNT-a-5), the calibrated permeability fits the data reasonably well. The maximum residual, difference between the observed and simulated pressure, is between 0.039 to 260 Pa. Calibrated permeabilities of VNT-4 and -5 span across two orders of magnitude. These VNT layers were characterized by packer tests in 5–7 boreholes and show spatial heterogeneity.

Below the VNT layers, the tuff has been zeolitized (UZNT) due to historical fluctuations of the water table and geochemical interaction with groundwater14. Prothro (2018)14 notes that the contact between the VNT and UZNT is a transitional contact, and the VNT units lie within this transitional boundary. The weathering of the tuff has left zeolites (clinoptilolite) and other argillic clays in the pores, reducing the porosity of the UZNT14. This permeability-depth relationship is observed in the deeper VNT units, where VNT-7–10 are consistently less permeable and possess higher maximum residual error. Due to the low porosity within VNT-9 and -10, slight variations in flow rates lead to large fluctuations in interval pressure. Within the VNT-9/10 boundary, packer tests in the AC-1 borehole used two different injection rates and saw a 72.5 kPa pressure difference between the two tests. In the calibration process, small adjustments to permeability lead to large non-linear pressure deviations and consequently, higher residual error (See Supplementary Figure S5).

Cavity pressurization test

The VNT permeability values determined from the cavity pressurization test PEST calibration were much higher than the packer test calibration results (Table 2). During the experiment, seven pressure transducers (GS-1–8, excluding 7) recorded data within four of the ten VNT units (VNT-a, -1, -5 and -8). GS-7 developed a leak post emplacement and recorded ambient pressure. Figure 3 shows the observed versus simulated pressure signals and mean residual at each sensor used in the cavity pressurization experiment. The PEST calibration results in a geometric mean residual of 281 Pa for all observation groups for the VNT layers.

Fig. 3
figure 3

Calibrated permeability from the cavity pressurization test (top left panel; values from Table 2, S2). Four VNT units contain GS samplers: VNT1 (GS-2), VNT5 (GS-1 and 3), VNT 8 (GS-5) and VNTa (GS-4, 6 and 8). Pumping rates into the cavity during the test are shown in Figure S2. Simulated (dashed lines) versus observed (solid lines) pressure signals at GS samplers and cavity during the cavity pressurization test from PEST calibration. The pressure observations used as PEST calibration targets are shown as black stars.

Calibrated cavity pressurization tests consistently yield higher permeability data than packer tests in all units except VNT-a and VNT-10 (Fig 4). Red error bars indicate that adjustments to that VNT unit made little difference to the final pressure signal at the GS samplers. Blue error bars show the 95% confidence intervals52. The red circles are the values used in the final calibration. Upscaling from the packer test to the cavity pressurization test increases the length scale from 2 m (packer) to 10 – 40 m (cavity), with the potential of incorporating numerous heterogeneous features. Both tests show a permeability-depth relationship (Fig 4). However, contrary to the packer tests, VNT-3 shows the highest permeability (k = 2.13E-12 m2) instead of VNT-a (2.33E-12 m2). High permeability in VNT layers adjacent to the cavity (VNT-3–5) is necessary for the pressure to rapidly bleed off.

Fig. 4
figure 4

Composite VNT permeability obtained from PEST/FEHM numerical models in this study (116 measurements total). VNT-a is the shallowest unit. Depth increases from VNT-a to VNT-10. The line, whiskers and black open circles of the box plots are the median, 95% confidence interval and outliers of the packer test permeability distribution. Circle and error bars for the cavity pressurization test represent the calibrated permeability and 95% confidence interval, respectively. Error bars that extend beyond the range of PEST permeability estimates are labeled in red, indicating they are relatively insensitive to the calibration process.

Discussion

Impact of choice of permeability measurement scale on flow and transport

Scaling effects of hydraulic properties can have a profound impact how representative our models are for subsurface flow and transport. The maximum and minimum permeability derived from the packer inverse models both over- and underestimate the GS-1 pressure signal during the cavity pressurization test (Fig. 5). Except for VNT-a, -1, and -5, permeability calibrated via the cavity pressurization test are orders of magnitude higher than all permeabilities calibrated from the packer tests. To replicate the pressure signals of the GS samplers, the permeability of the VNT units adjacent to the cavity must be high, around one darcy, and capped by lower permeability layers above (VNT-a) and below (VNT-5).

Fig. 5
figure 5

Simulated (lines) and observed (blue dots) pressure signal at GS-1 during the cavity pressurization test with three different permeability scenarios: 1) Inverted permeability from the cavity pressurization test, 2) maximum and 3) minimum inverted permeability from the packer tests.

During the cavity pressurization test, the three GS samplers in VNT-a did not observe a measurable change in pressure and were difficult to calibrate. These three pressure transducers fluctuate between + /- 200 Pa with no change after pumping is stopped. No change in pressure signal during the cavity pressurization test indicates either 1) the permeability is too low, preventing the pressure signal from reaching the GS sampler, or 2) VNT-a is too permeable, causing the pressure signal to attenuate before reaching the sampler. The latter is the more likely, given that the packer permeability estimates are around one darcy. Our results suggest that because of the complexity of hydraulic properties and heterogeneous nature of tuff, it is best practice is to calibrate permeability at the scale of interest.

Limitations of cavity pressurization inversion models

The calibration process for the cavity pressurization test results in permeabilities are non-unique. VNT-a will be slightly higher (1E-13 m2 instead of 9E-14 m2) if the cavity permeability is decreased from 1E-11 m2 to 6E-12 m2. However, decreasing cavity permeability will result in higher permeability in all VNT units to reproduce pressure decay around the cavity. Additional errors may be due to the presence of asymmetric fast pathways in the matrix or fractures that cannot be adequately accounted for on an axisymmetric geometry. The relative proximity of samplers to one another is not preserved in 3D (e.g. GS-4 and GS-6 may not be as close as the mesh implies – only their relative distance from the cavity).

Additional uncertainty analyses were conducted to understand the impact of porosity and saturation ranges found in core analyses. The recalibrated permeabilities were within twice the calibrated permeability values when using the average porosity and saturation values. The sensitivity study is described in the Supplementary Information section.

Insight into permeability scaling: local packer tests versus cavity pressurization tests

When the hydraulic properties of a rock are conceptualized as a continuum, scaling effects can occur by the gradual incorporation of heterogeneities as the scale of the measurement increases. Scaling effects will continue to increase until the scale reaches the representative elementary volume (REV) and the scale effect will reach a maximum.

Upon further examination of scale-related permeability measurements in volcanic tuff, we begin to see a general trend. Figure 6 shows the permeability of tuffs within this study (NNSS), Yucca Mountain (YM), NV24,54, and the ALRS, AZ25,56,57, over 1400 + measurements in total. Outliers in the tuff permeability distribution are present at all scales (outside the 95% confidence interval) and are represented by the open circles in the box plot diagram. The outliers are all above the median, represented by the orange line, indicating high permeability. Even with outliers taken into consideration, no statistically significant changes in permeability were observed as the packer interval increased from 0.5 to 20 m in a single well test29. The outliers likely represent high permeability features within the volcanic tuff matrix that may be missed without dense sampling. A single individual discrete high permeability conduit from the cavity could be invoked to rapidly drain pressure; however this scenario would not fit the fact that all discrete monitoring points located within a 10’s of meter radial 3-D volume sampled around the injection cavity see rapid pressure changes. Thus, to fit the data at the seven outlying sample locations, permeability within the entire formation must be higher than core or packer measured permeability.

Fig. 6
figure 6

Permeability estimates of volcanic tuffs performed at three scales: 1) laboratory-analyzed core samples, 2) single-hole packer tests (calibrated), and 3) cross-hole pumping tests/cavity pressurization test (calibrated). The orange line, whiskers and black circles of the box plots are the median, 95% confidence interval and outliers of the permeability distribution. The data were obtained from the Apache Leap Research Site, AZ (ALRS, 1309 measurements)25,56,57, Yucca Mountain, NV (YM, 25 measurements)24,55 and the Nevada National Security Site, NV (NNSS, 116 measurements in this study and 8 measurements in Reference 11). Scale of the measurements are based on the size of the core, pressurized interval in packer tests, and maximum distance between packers in cross-hole and cavity pressurization tests.

These previous studies reported that both pneumatic and hydrologic cross-well tests consistently yielded permeability measurements higher than single-well packer tests, despite differing textural characteristics (i.e. welding and zeolitization) across each study area, a result that is echoed in our study. At the local scale, regions with higher permeability could connect to form pathways that contribute to an overall increase in bulk permeability across the cross-borehole scale.

The ALRS tuffs are partially welded ash-flow deposits that are heavily fractured58, whereas the Crater Flat and Bullfrog tuffs of Yucca Mountain are non-welded and zeolitized55. As discussed previously, the VNT units at P-Tunnel are vitric and non-welded, however zeolites are more present with increasing depth. Within volcanic tuff, scaling effects appear relevant at scales greater than 10 m across these three study areas (Fig. 6).

Previous authors have attributed scaling effects from the laboratory scale to the borehole scale to 1) the skin effects that describe the influence of the well on the pressure field25, 2) turbulence near sampling ports and injection boreholes25,26,27,28,57 3) the dual-continuum porous media structures introduced by fractures59. Skin effects have been shown to have little influence on the pressure field during single- and cross-borehole tests25. Because the VNT units are structurally weak, fracturing is unlikely and the damage to the periphery of the cavity during construction is minimal14. Furthermore, it would be accounted for in the calibration process since the cavity permeability is an adjustable parameter.

Upscaling permeability estimates is fundamental to modeling contamination transport, underground nuclear explosion monitoring and nuclear waste disposition. Understanding the spatial distribution of permeability, both in terms of magnitude and its variability, is crucial for assessing long-term containment strategies and for developing effective monitoring systems in geologically complex environments like volcanic tuff.

Conclusions

We present inverted permeabilities derived from numerical models that represent the conditions of experiments performed at multiple spatial scales. The numerical models reproduce steady-state pressures in packer tests (1–2 m-scale) and pressure signals observed in cavity pressurization test (10–40 m scale) performed in P-Tunnel from 2021 to 2023.

The tuff permeability clearly shows a depth-dependence as the shallower nonwelded tuff becomes progressively more zeolitic with increasing depth. We observe via inverse analyses of the tuff units (VNT) that derived permeability varies depending on the scale of measurement. Increasing the volume of the rock over which pressure measurements are performed (i.e., cavity pressurization test) results in higher bulk permeability than localized packer tests. The discrepancy between derived permeabilities from packer and cavity pressurization test is due to the sampling of additional heterogeneity as the spatial scale increases. These scaling effects highlight the importance of accurately characterizing the complex geology of P-Tunnel at multiple scales depending on the scale of the problem of interest. For instance, a similar permeability scaling relationship was not observed when comparing bench-scale core measurements to packer tests. For certain applications localized permeability measurements may be sufficient; on the other hand, if one is interested in predicting far-field flow and transport, simply using multiple scales at the core and packer test level will miss a significant degree of heterogeneity.

Improving multi-phase flow and transport simulations in volcanic tuff supports the PE1 experiment process by optimizing the placement of pressure sensors and gas sampling equipment in future experiments in the P-Tunnel. We demonstrate the impact of measurement scale on the resultant pressure wave propagation through the rock by comparing simulations using permeability measurements collected at two distinct scales. In so doing, we find in this simple comparison that even though measurements at each independent scale were correct, measurements closer to the relevant flow processes of interest provide a much better estimate of pressure propagation. Moreover, characterizing how scaling effects influence hydraulic properties is critical in applying our models to new lesser-known environments for nuclear explosion monitoring, environmental remediation, and nuclear waste disposition.