-
Multiscale Change Point Detection for Functional Time Series
Authors:
Tim Kutta,
Holger Dette,
Shixuan Wang
Abstract:
We study the problem of detecting and localizing multiple changes in the mean parameter of a Banach space-valued time series. The goal is to construct a collection of narrow confidence intervals, each containing at least one (or exactly one) change, with globally controlled error probability. Our approach relies on a new class of weighted scan statistics, called Hölder-type statistics, which allow a smooth trade-off between efficiency (enabling the detection of closely spaced, small changes) and robustness (against heavier tails and stronger dependence). For Gaussian noise, maximum weighting can be applied, leading to a generalization of optimality results known for scalar, independent data. Even for scalar time series, our approach is advantageous, as it accommodates broad classes of dependency structures and non-stationarity. Its primary advantage, however, lies in its applicability to functional time series, where few methods exist and established procedures impose strong restrictions on the spacing and magnitude of changes. We obtain general results by employing new Gaussian approximations for the partial sum process in Hölder spaces. As an application of our general theory, we consider the detection of distributional changes in a data panel. The finite-sample properties and applications to financial datasets further highlight the merits of our method.
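The weighting idea behind Hölder-type statistics can be illustrated with a brute-force scan in Python. This is a generic sketch, not the statistic constructed in the paper: `alpha` is a hypothetical weighting exponent, with values near 1/2 corresponding to the strongest weighting (favouring short intervals and hence closely spaced, small changes) and smaller values giving a more robust, less efficient scan.

```python
import numpy as np

def weighted_scan(x, alpha=0.5, min_len=5):
    """Scan all intervals (s, e] with a weighted CUSUM contrast.

    The absolute contrast of an interval against the global mean is
    divided by its length to the power `alpha`.  Brute force O(n^2),
    intended only to illustrate the weighting trade-off.
    Returns the maximising value and interval.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    csum = np.concatenate([[0.0], np.cumsum(x)])
    best, best_iv = -np.inf, None
    for s in range(n):
        for e in range(s + min_len, n + 1):
            m = e - s
            # deviation of the interval sum from its expectation
            # under a globally constant mean
            contrast = abs((csum[e] - csum[s]) - m * csum[n] / n)
            stat = contrast / m ** alpha
            if stat > best:
                best, best_iv = stat, (s, e)
    return best, best_iv
```

On a sequence with a single jump, the maximising interval aligns with one of the two constant segments.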
Submitted 10 November, 2025;
originally announced November 2025.
-
TWIN: Two window inspection for online change point detection
Authors:
Patrick Bastian,
Tim Kutta
Abstract:
We propose a new class of sequential change point tests, both for changes in the mean parameter and in the overall distribution function. The methodology builds on a two-window inspection scheme (TWIN), which aggregates data into symmetric samples and applies strong weighting to enhance statistical performance. The detector yields logarithmic rather than polynomial detection delays, representing a substantial reduction compared to state-of-the-art alternatives. Delays remain short, even for late changes, where existing methods perform worst. Moreover, the new procedure also attains higher power than current methods across broad classes of local alternatives. For mean changes, we further introduce a self-normalized version of the detector that automatically cancels out temporal dependence, eliminating the need to estimate nuisance parameters. The advantages of our approach are supported by asymptotic theory, simulations and an application to monitoring COVID-19 data. Here, structural breaks associated with new virus variants are detected almost immediately by our new procedures. This indicates potential value for the real-time monitoring of future epidemics.
Mathematically, our approach is underpinned by new exponential moment bounds for the global modulus of continuity of the partial sum process, which may be of independent interest beyond change point testing.
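The two-window idea can be sketched in a few lines. The following monitor is a simplified stand-in for the TWIN detector, not the paper's construction: the window size `h`, critical value `threshold`, and known noise scale `sigma` are all assumptions of this toy version.

```python
import numpy as np

def twin_monitor(stream, h=20, threshold=3.0, sigma=1.0):
    """Toy two-window monitor: at each time t >= 2*h, compare the mean
    of the last h points with the mean of the h points before them.
    Under i.i.d. N(0, sigma^2) noise the standardised difference is
    approximately N(0, 1), so `threshold` plays the role of a critical
    value.  Returns the first alarm time, or None if no alarm is raised."""
    x = np.asarray(stream, dtype=float)
    for t in range(2 * h, len(x) + 1):
        recent = x[t - h:t].mean()
        before = x[t - 2 * h:t - h].mean()
        # difference of two h-point means has variance 2 * sigma^2 / h
        z = (recent - before) / (sigma * np.sqrt(2.0 / h))
        if abs(z) > threshold:
            return t
    return None
```

Because both windows slide with the current time, late changes are inspected with the same window geometry as early ones, which is the intuition behind short delays even for late changes.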
Submitted 13 October, 2025;
originally announced October 2025.
-
Monitoring Violations of Differential Privacy over Time
Authors:
Önder Askin,
Tim Kutta,
Holger Dette
Abstract:
Auditing differential privacy has emerged as an important area of research that supports the design of privacy-preserving mechanisms. Privacy audits help to obtain empirical estimates of the privacy parameter, to expose flawed implementations of algorithms and to compare practical with theoretical privacy guarantees. In this work, we investigate an unexplored facet of privacy auditing: the sustained auditing of a mechanism that can go through changes during its development or deployment. Monitoring the privacy of algorithms over time comes with specific challenges. Running state-of-the-art (static) auditors repeatedly requires excessive sampling efforts, while the reliability of such methods deteriorates over time without proper adjustments. To overcome these obstacles, we present a new monitoring procedure that extracts information from the entire deployment history of the algorithm. This allows us to reduce sampling efforts, while sustaining reliable outcomes of our auditor. We derive formal guarantees with regard to the soundness of our methods and evaluate their performance for important mechanisms from the literature. Our theoretical findings and experiments demonstrate the efficacy of our approach.
Submitted 24 September, 2025;
originally announced September 2025.
-
Monitoring Time Series for Relevant Changes
Authors:
Patrick Bastian,
Tim Kutta,
Rupsa Basu,
Holger Dette
Abstract:
We consider the problem of sequentially testing for changes in the mean parameter of a time series, compared to a benchmark period. Most tests in the literature focus on the null hypothesis of a constant mean versus the alternative of a single change at an unknown time. Yet in many applications it is unrealistic that no change occurs at all, or that after one change the time series remains stationary forever. We introduce a new setup, modeling the sequence of means as a piecewise constant function with arbitrarily many changes. Instead of testing for a change, we ask whether the evolving sequence of means, say $(\mu_n)_{n \geq 1}$, stays within a narrow corridor around its initial value, that is, $\mu_n \in [\mu_1 - \Delta, \mu_1 + \Delta]$ for all $n \ge 1$. Combining elements from multiple change point detection with a Hölder-type monitoring procedure, we develop a new online monitoring tool. A key challenge in both construction and proof of validity is that the risk of committing a type-I error after any time $n$ fundamentally depends on the unknown future of the time series. Simulations support our theoretical results and we present two real-world applications: (1) healthcare monitoring, with a focus on blood glucose tracking, and (2) political consensus analysis via citizen opinion polls.
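A naive version of the corridor question can be illustrated as follows. This sketch only checks a rolling mean against the corridor; unlike the paper's procedure, it offers no uniform-in-time control of the type-I error, and the window size `m` and corridor width `delta` are assumptions of the illustration.

```python
import numpy as np

def corridor_monitor(x, m=25, delta=0.5):
    """Naive corridor monitor: estimate the initial mean from the first
    m observations, then flag the first time the rolling mean of the
    most recent m observations leaves [mu1 - delta, mu1 + delta].
    Returns the first alarm time, or None."""
    x = np.asarray(x, dtype=float)
    mu1 = x[:m].mean()
    for t in range(2 * m, len(x) + 1):
        rolling = x[t - m:t].mean()
        if abs(rolling - mu1) > delta:
            return t
    return None
```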
Submitted 1 September, 2025;
originally announced September 2025.
-
Monitoring for a Phase Transition in a Time Series of Wigner Matrices
Authors:
Nina Dörnemann,
Piotr Kokoszka,
Tim Kutta,
Sunmin Lee
Abstract:
We develop methodology and theory for the detection of a phase transition in a time series of high-dimensional random matrices. In the model we study, at each time point $t = 1,2,\ldots$, we observe a deformed Wigner matrix $\mathbf{M}_t$, where the unobservable deformation represents a latent signal. This signal is detectable only in the supercritical regime, and our objective is to detect the transition to this regime in real time, as new matrix-valued observations arrive. Our approach is based on a partial sum process of extremal eigenvalues of $\mathbf{M}_t$, and its theoretical analysis combines state-of-the-art tools from random matrix theory and Gaussian approximations. The resulting detector is self-normalized, which ensures appropriate scaling for convergence and a pivotal limit, without any additional parameter estimation. Simulations show excellent performance for varying dimensions. Applications to pollution monitoring and social interactions in primates illustrate the usefulness of our approach.
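The setup can be simulated in a few lines. The sketch below is not the paper's detector: the deformation strength `theta`, change time `spike_from`, and the crude self-normalisation of the CUSUM are all assumptions of this illustration. For `theta > 1` the top eigenvalue of a rank-one deformed Wigner matrix separates from the bulk edge (the supercritical regime), which makes the partial sums of the top eigenvalues informative.

```python
import numpy as np

def top_eig_sequence(T=60, n=50, spike_from=30, theta=2.5, rng=None):
    """Simulate top eigenvalues of n x n Wigner matrices with bulk edge
    near 2; after time `spike_from`, add a rank-one deformation
    theta * v v^T, so the top eigenvalue moves to about theta + 1/theta."""
    rng = np.random.default_rng(rng)
    v = np.ones(n) / np.sqrt(n)
    lams = []
    for t in range(T):
        a = rng.normal(size=(n, n))
        w = (a + a.T) / np.sqrt(2 * n)   # symmetric Wigner scaling
        if t >= spike_from:
            w += theta * np.outer(v, v)
        lams.append(np.linalg.eigvalsh(w)[-1])  # largest eigenvalue
    return np.array(lams)

def self_normalized_cusum(y):
    """CUSUM of the eigenvalue sequence divided by an empirical scale of
    its increments, a crude stand-in for self-normalisation: no separate
    long-run variance estimate is plugged in."""
    n = len(y)
    k = np.arange(1, n + 1)
    cs = np.cumsum(y)
    contrast = np.abs(cs - k * cs[-1] / n)
    scale = np.sqrt(n) * np.std(np.diff(y)) + 1e-12
    return contrast.max() / scale
```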
Submitted 7 July, 2025;
originally announced July 2025.
-
Prokhorov Metric Convergence of the Partial Sum Process for Reconstructed Functional Data
Authors:
Tim Kutta,
Piotr Kokoszka
Abstract:
Motivated by applications in functional data analysis, we study the partial sum process of sparsely observed, random functions. A key novelty of our analysis is a set of bounds on the distributional distance between the limit Brownian motion and the entire partial sum process in the function space. To measure the distance between distributions, we employ the Prokhorov and Wasserstein metrics. We show that these bounds have important probabilistic implications, including strong invariance principles and new couplings between the partial sums and their Gaussian limits. Our results are formulated for weakly dependent, nonstationary time series in the Banach space of d-dimensional, continuous functions. Mathematically, our approach rests on a new, two-step proof strategy: First, using entropy bounds from empirical process theory, we replace the function-valued partial sum process by a high-dimensional discretization. Second, using Gaussian approximations for weakly dependent, high-dimensional vectors, we obtain bounds on the distance. As a statistical application of our coupling results, we validate an open-ended monitoring scheme for sparse functional data. Existing probabilistic tools were not appropriate for this task.
Submitted 26 June, 2025;
originally announced June 2025.
-
General-Purpose $f$-DP Estimation and Auditing in a Black-Box Setting
Authors:
Önder Askin,
Holger Dette,
Martin Dunsche,
Tim Kutta,
Yun Lu,
Yu Wei,
Vassilis Zikas
Abstract:
In this paper we propose new methods to statistically assess $f$-Differential Privacy ($f$-DP), a recent refinement of differential privacy (DP) that remedies certain weaknesses of standard DP (including tightness under algorithmic composition). A challenge when deploying differentially private mechanisms is that DP is hard to validate, especially in the black-box setting. This has led to numerous empirical methods for auditing standard DP, while $f$-DP remains less explored. We introduce new black-box methods for $f$-DP that, unlike existing approaches for this privacy notion, do not require prior knowledge of the investigated algorithm. Our procedure yields a complete estimate of the $f$-DP trade-off curve, with theoretical guarantees of convergence. Additionally, we propose an efficient auditing method that empirically detects $f$-DP violations with statistical certainty, merging techniques from non-parametric estimation and optimal classification theory. Through experiments on a range of DP mechanisms, we demonstrate the effectiveness of our estimation and auditing procedures.
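A minimal black-box estimate of a trade-off curve can be obtained from output samples alone. The sketch below is generic, not the paper's estimator: it uses one-sided threshold tests, which are optimal only when the likelihood ratio of the mechanism's outputs is monotone (as for the Laplace mechanism); in general, the result is only an upper bound on the true curve, and it comes without the convergence guarantees developed in the paper.

```python
import numpy as np

def empirical_tradeoff(samples0, samples1, grid_size=400):
    """Estimate the trade-off curve alpha -> beta(alpha) of a 1-D
    mechanism from samples on two neighbouring inputs, using threshold
    tests 'reject H0 if output > c'.  alpha is the empirical type-I
    error, beta the empirical type-II error; the curve is returned
    sorted by alpha."""
    pooled = np.concatenate([samples0, samples1])
    thresholds = np.quantile(pooled, np.linspace(0.0, 1.0, grid_size))
    alpha = np.array([(samples0 > c).mean() for c in thresholds])
    beta = np.array([(samples1 <= c).mean() for c in thresholds])
    order = np.argsort(alpha)
    return alpha[order], beta[order]
```

For the Laplace mechanism with scale 1 on adjacent inputs 0 and 1 (so epsilon = 1), the true trade-off curve at alpha = 1/2 equals e^{-1}/2, which the empirical curve recovers closely.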
Submitted 13 June, 2025; v1 submitted 10 February, 2025;
originally announced February 2025.
-
Testing separability for continuous functional data
Authors:
Holger Dette,
Gauthier Dierickx,
Tim Kutta
Abstract:
Analyzing the covariance structure of data is a fundamental task of statistics. While this task is simple for low-dimensional observations, it becomes challenging for more intricate objects, such as multivariate functions. Here, the covariance can be so complex that even storing a non-parametric estimate is impractical, and structural assumptions are necessary to tame the model. One popular assumption for space-time data is separability of the covariance into purely spatial and temporal factors. In this paper, we present a new test for separability in the context of dependent functional time series. While most of the related work studies functional data in a Hilbert space of square integrable functions, we model the observations as objects in the space of continuous functions equipped with the supremum norm. We argue that this (mathematically challenging) setup enhances interpretability for users and is more in line with practical preprocessing.
Our test statistic measures the maximal deviation between the estimated covariance kernel and a separable approximation. Critical values are obtained by a non-standard multiplier bootstrap for dependent data. We prove the statistical validity of our approach and demonstrate its practicability in a simulation study and a data example.
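The flavour of the statistic, a maximal deviation between the estimated covariance and a separable approximation, can be illustrated on a discretised space-time grid. This is a finite-grid sketch, not the paper's supremum-norm statistic, and the separable approximation via rescaled partial traces is one standard choice among several.

```python
import numpy as np

def sup_separability_deviation(X):
    """X has shape (N, S, T): N replications of a function on an S x T
    space-time grid.  Estimate the full covariance kernel
    c[s, t, u, v] = Cov(X(s, t), X(u, v)) and compare it, in the max
    norm over the grid, with a separable product built from rescaled
    partial traces."""
    N, S, T = X.shape
    Xc = X - X.mean(axis=0)
    # full empirical covariance kernel, shape (S, T, S, T)
    c = np.einsum('nst,nuv->stuv', Xc, Xc) / N
    # partial traces: sum the kernel over the diagonal of one argument
    c1 = np.einsum('stut->su', c)   # spatial factor, shape (S, S)
    c2 = np.einsum('stsv->tv', c)   # temporal factor, shape (T, T)
    trace = np.einsum('stst->', c)  # total trace, for rescaling
    sep = np.einsum('su,tv->stuv', c1, c2) / trace
    return np.abs(c - sep).max()
```

For data with a genuinely separable covariance the deviation shrinks with the sample size, while non-separable structures leave a deviation of fixed order.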
Submitted 11 January, 2023;
originally announced January 2023.
-
Validating Approximate Slope Homogeneity in Large Panels
Authors:
Tim Kutta,
Holger Dette
Abstract:
Statistical inference for large data panels is omnipresent in modern economic applications. An important benefit of panel analysis is the possibility to reduce noise and thus to guarantee stable inference by intersectional pooling. However, it is well known that pooling can lead to a biased analysis if individual heterogeneity is too strong. In classical linear panel models, this trade-off concerns the homogeneity of slope parameters, and a large body of tests has been developed to validate this assumption. Yet, such tests can detect inconsiderable deviations from slope homogeneity, discouraging pooling, even when practically beneficial. In order to permit a more pragmatic analysis, which allows pooling when individual heterogeneity is sufficiently small, we present in this paper the concept of approximate slope homogeneity. We develop an asymptotic level $\alpha$ test for this hypothesis that is uniformly consistent against classes of local alternatives. In contrast to existing methods, which focus on exact slope homogeneity and are usually sensitive to dependence in the data, the proposed test statistic is (asymptotically) pivotal and applicable under simultaneous intersectional and temporal dependence. Moreover, it can accommodate the realistic case of panels with large intersections. A simulation study and a data example underline the usefulness of our approach.
Submitted 13 December, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
-
Multivariate Mean Comparison under Differential Privacy
Authors:
Martin Dunsche,
Tim Kutta,
Holger Dette
Abstract:
The comparison of multivariate population means is a central task of statistical inference. While statistical theory provides a variety of analysis tools, they usually do not protect individuals' privacy. This knowledge can create incentives for participants in a study to conceal their true data (especially for outliers), which might result in a distorted analysis. In this paper we address this problem by developing a hypothesis test for multivariate mean comparisons that guarantees differential privacy to users. The test statistic is based on the popular Hotelling's $t^2$-statistic, which has a natural interpretation in terms of the Mahalanobis distance. In order to control the type-I error, we present a bootstrap algorithm under differential privacy that provably yields a reliable test decision. In an empirical study we demonstrate the applicability of this approach.
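The classical statistic at the core of the test is easy to write down, and a toy privatised variant shows where noise enters. The second function below is only an illustration of the idea, not the paper's mechanism: it privatises the difference of means via the Gaussian mechanism under an assumed coordinate bound, while the covariance and the bootstrap calibration (both handled in the paper) are left non-private here.

```python
import numpy as np

def hotelling_t2(x, y):
    """Classical two-sample Hotelling T^2 statistic (non-private)."""
    n1, n2 = len(x), len(y)
    d = x.mean(axis=0) - y.mean(axis=0)
    s_pooled = ((n1 - 1) * np.cov(x, rowvar=False)
                + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    return n1 * n2 / (n1 + n2) * d @ np.linalg.solve(s_pooled, d)

def noisy_hotelling_t2(x, y, epsilon=1.0, delta=1e-5, bound=1.0, rng=None):
    """Toy variant: perturb the difference of sample means with Gaussian
    noise calibrated to its L2 sensitivity 2 * bound * sqrt(p) / min(n1, n2),
    assuming every coordinate of the data lies in [-bound, bound]."""
    rng = np.random.default_rng(rng)
    n1, n2, p = len(x), len(y), x.shape[1]
    sens = 2 * bound * np.sqrt(p) / min(n1, n2)
    sd = sens * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    d = x.mean(axis=0) - y.mean(axis=0) + rng.normal(0.0, sd, size=p)
    s_pooled = ((n1 - 1) * np.cov(x, rowvar=False)
                + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    return n1 * n2 / (n1 + n2) * d @ np.linalg.solve(s_pooled, d)
```

For moderate sample sizes the mean-level noise is small, so a genuine location shift still produces a large statistic.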
Submitted 15 October, 2021;
originally announced October 2021.
-
Statistical Quantification of Differential Privacy: A Local Approach
Authors:
Önder Askin,
Tim Kutta,
Holger Dette
Abstract:
In this work, we introduce a new approach for statistical quantification of differential privacy in a black box setting. We present estimators and confidence intervals for the optimal privacy parameter of a randomized algorithm $A$, as well as other key variables (such as the "data-centric privacy level"). Our estimators are based on a local characterization of privacy and in contrast to the related literature avoid the process of "event selection" - a major obstacle to privacy validation. This makes our methods easy to implement and user-friendly. We show fast convergence rates of the estimators and asymptotic validity of the confidence intervals. An experimental study of various algorithms confirms the efficacy of our approach.
Submitted 2 May, 2022; v1 submitted 21 August, 2021;
originally announced August 2021.
-
Statistical inference for the slope parameter in functional linear regression
Authors:
Tim Kutta,
Gauthier Dierickx,
Holger Dette
Abstract:
In this paper we consider the linear regression model $Y =S X+\varepsilon $ with functional regressors and responses. We develop new inference tools to quantify deviations of the true slope $S$ from a hypothesized operator $S_0$ with respect to the Hilbert--Schmidt norm $\| S- S_0\|^2$, as well as the prediction error $\mathbb{E} \| S X - S_0 X \|^2$. Our analysis is applicable to functional time series and based on asymptotically pivotal statistics. This makes it particularly user friendly, because it avoids the choice of tuning parameters inherent in long-run variance estimation or bootstrap of dependent data. We also discuss two-sample problems as well as change point detection. Finite sample properties are investigated by means of a simulation study. Mathematically our approach is based on a sequential version of the popular spectral cut-off estimator $\hat S_N$ for $S$. It is well-known that the $L^2$-minimax rates in the functional regression model, both in estimation and prediction, are substantially slower than $1/\sqrt{N}$ (where $N$ denotes the sample size) and that standard estimators for $S$ do not converge weakly to non-degenerate limits. However, we demonstrate that simple plug-in estimators - such as $\| \hat S_N - S_0 \|^2$ for $\| S - S_0 \|^2$ - are $\sqrt{N}$-consistent and their sequential versions satisfy weak invariance principles. These results are based on the smoothing effect of $L^2$-norms and established by a new proof technique, the "smoothness shift", which has potential applications in other statistical inverse problems.
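A discretised stand-in for the spectral cut-off estimator and the plug-in distance can be sketched as follows. Here functional observations are replaced by vectors on a grid, so this is only a finite-dimensional illustration of the estimator's shape, not the sequential, infinite-dimensional construction analysed in the paper; the cut-off level `k` and the eigenvalue decay are assumptions of the example.

```python
import numpy as np

def spectral_cutoff_estimate(X, Y, k):
    """Spectral cut-off estimator of the slope in Y = S X + eps, in a
    discretised setting where X, Y are (N, d) arrays standing in for
    functional observations.  The covariance of X is inverted only on
    its top-k eigendirections, regularising the inverse problem."""
    N = len(X)
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    cx = Xc.T @ Xc / N          # empirical covariance of X
    cyx = Yc.T @ Xc / N         # empirical cross-covariance
    vals, vecs = np.linalg.eigh(cx)
    vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]  # top-k spectrum
    cx_pinv = vecs @ np.diag(1.0 / vals) @ vecs.T      # inverse on top-k space
    return cyx @ cx_pinv

def hs_distance_sq(s_hat, s0):
    """Plug-in estimator of the squared Hilbert-Schmidt distance."""
    return float(np.sum((s_hat - s0) ** 2))
```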
Submitted 16 August, 2021;
originally announced August 2021.
-
Quantifying deviations from separability in space-time functional processes
Authors:
Holger Dette,
Gauthier Dierickx,
Tim Kutta
Abstract:
The estimation of covariance operators of spatio-temporal data is in many applications only computationally feasible under simplifying assumptions, such as separability of the covariance into strictly temporal and spatial factors. Powerful tests for this assumption have been proposed in the literature. However, as real-world systems such as climate data are notoriously inseparable, validating this assumption by statistical tests seems inherently questionable. In this paper we present an alternative approach: By virtue of separability measures, we quantify how strongly the data's covariance operator diverges from a separable approximation. Confidence intervals localize these measures with statistical guarantees. This method provides users with a flexible tool, to weigh the computational gains of a separable model against the associated increase in bias. As separable approximations we consider the established methods of partial traces and partial products, and develop weak convergence principles for the corresponding estimators. Moreover, we also prove such results for estimators of optimal, separable approximations, which are arguably of most interest in applications. In particular we present for the first time statistical inference for this object, which has been confined to estimation previously. All methods proposed in this paper require neither computationally expensive resampling procedures nor the estimation of nuisance parameters. A simulation study underlines the advantages of our approach and its applicability is demonstrated by the investigation of German annual temperature data.
Submitted 26 March, 2020;
originally announced March 2020.
-
Detecting structural breaks in eigensystems of functional time series
Authors:
Holger Dette,
Tim Kutta
Abstract:
Detecting structural changes in functional data is a prominent topic in the statistical literature. However, not all trends in the data are important in applications, but only those of large enough influence. In this paper we address the problem of identifying relevant changes in the eigenfunctions and eigenvalues of covariance kernels of $L^2[0,1]$-valued time series. Using self-normalization techniques, we derive pivotal, asymptotically consistent tests for relevant changes in these characteristics of the second order structure and investigate their finite sample properties in a simulation study. The applicability of our approach is demonstrated analyzing German annual temperature data.
Submitted 18 November, 2019;
originally announced November 2019.
-
The empirical process of residuals from an inverse regression
Authors:
Tim Kutta,
Nicolai Bissantz,
Justin Chown,
Holger Dette
Abstract:
In this paper we investigate an indirect regression model characterized by the Radon transformation. This model is useful for recovery of medical images obtained by computed tomography scans. The indirect regression function is estimated using a series estimator motivated by a spectral cut-off technique. Further, we investigate the empirical process of residuals from this regression, and show that it satisfies a functional central limit theorem.
Submitted 9 February, 2019;
originally announced February 2019.