Delay-Penalty Comparison for Sequential Testing and Quickest Detection in State-Dependent Diffusion Models

1 Introduction

Sequential methods determine endogenously when accumulated data justify a terminal action, balancing the cost of further observation against the quality of the eventual decision [wald2004sequential, wald1948optimum, rincon2025sequential, wang2025analysis, chow1971great, griffith2021statistics, silva2020optimal, silva2015continuous, wang2025analysis1, fischer2026improving]. Two problems organize much of the field. In sequential testing an observer watches a process whose law is governed by one of two simple hypotheses and must choose, as a function of the data, both a stopping time and a terminal decision so as to trade sampling cost against the probabilities of a wrong decision [shiryaev2025optimal, pabbaraju2026simple, liu2025bidirectional]. In quickest detection, or the disorder problem, the law of the observation changes at an unobservable time, and the observer raises an alarm to trade the frequency of false alarms against the delay incurred in detecting a true change [shiryaev1963optimum, naha2023quickest, snow2024quickest, sha2025quickest, liang2025global]. Minimax counterparts of the detection problem replace the prior on the change time by a worst-case criterion and lead to the CUSUM and Shiryaev–Roberts procedures [page1954continuous, banerjee2024minimax, xie2022minimax, wang2026algebraic, fromont2023minimax, yang2024sequential, huselitz2026online, lorden1971procedures, tosun2023robust, yu2024network, moustakides1986optimal, pollak1985optimal, wang2026damage, polunchenko2018comparative, pollak2009optimality]. The present paper is Bayesian and optimal-stopping based.

Both problems arise across applied domains. In quantitative finance a regime shift in a return or volatility process must be detected promptly while controlling false alarms; in statistical process control a manufacturing stream must be monitored for a shift in mean; in structural health monitoring, surveillance, and intrusion detection a sensor stream must be screened for the onset of an anomaly; and in epidemiological monitoring an incidence series must be watched for the start of an outbreak. In each case the natural observation model is a continuously sampled diffusion, the cost of a missed or delayed detection is problem specific and frequently nonlinear, and the practitioner needs to understand how the rule—and in particular the alarm threshold—moves when the penalty structure changes. That comparative question, rather than the explicit solution of any single model, is the focus of this paper.

By filtering, both problems reduce to fully observed optimal stopping for a Markovian sufficient statistic, after which the value function solves a variational inequality and the optimal rule is a first-exit time from a continuation region [liptser1977statistics, peskir2006optimal, yu2026pattern, peskir2000sequential, epstein2022optimal, ekstrom2022multi, ankirchner2020bayesian, wang2026breakdown, shiryaev2025optimal]. For Brownian observations with constant drift alternatives the sufficient statistic is a one-dimensional posterior diffusion and the rules are explicit thresholds. The situation changes when the diffusion coefficients are state dependent: the signal-to-noise ratio then depends on the current observation, the posterior equation no longer closes in the posterior coordinate alone, and the sufficient statistic becomes multidimensional. This is the regime we study, and it is the regime in which the comparative-statics question is least understood.

Contribution.

This paper makes two related contributions. First, we give a unified formulation of sequential testing and Bayesian quickest detection for state-dependent diffusion observations. The formulation makes explicit that, unless the signal-to-noise ratio is constant or otherwise reducible, the posterior probability is not a closed Markov state; the closed sufficient statistic is the augmented process $(\Pi_{t},X_{t})$ , or equivalently $(\Phi_{t},X_{t})$ , with an explicit degenerate generator. Second, and more importantly, we prove a delay-penalty comparison theorem for the resulting optimal stopping problems. For a fixed terminal cost, increasing the running delay penalty increases the value function, shrinks the continuation region, and induces earlier stopping. Whenever the stopping region is known to be one-sided in the posterior coordinate, this gives a monotone ordering of the alarm boundaries. This comparative-static result applies directly to linear delay costs and extends to nonlinear marginal delay penalties after adding the appropriate penalty-memory state. A worked Shiryaev example computes the alarm threshold numerically by a finite-difference solution of the variational inequality and exhibits the predicted monotonicity.

Organization.

Section 2 reviews the related literature. Section 3 develops the filtering reductions for both problems and proves the closed Markov-state result. Section 4 states the generic optimal stopping problem and the variational inequality in complementarity form. Section 5 proves the delay-penalty comparison theorem and its linear and nonlinear specializations and gives the sampling-cost analogue for testing. Section 6 gives the free-boundary interpretation and the martingale verification. Section 7 is the worked Shiryaev example with the numerical method described in full. Section 8 relates the framework to CUSUM and Shiryaev–Roberts procedures, and Section 9 concludes.

2 Literature Review

Relation to existing literature.

The literature relevant to the present work lies at the intersection of sequential analysis, quickest detection, filtering, and optimal stopping.

On the sequential-analysis side, the foundations were laid by Wald and coauthors through the theory of sequential testing and optimal stopping of experiments [wald2004sequential, wald1948optimum, yu2026rigorous, chow1971great, griffith2021statistics]. In parallel, Shiryaev initiated the Bayesian disorder problem, in which an unobservable change point must be detected as rapidly as possible while controlling false alarms [shiryaev1963optimum]. These developments evolved into the modern theory of quickest detection, encompassing both Bayesian and minimax formulations, with comprehensive treatments given by Shiryaev’s monographs and retrospective survey and by the unified treatment of Poor and Hadjiliadis and others [shiryaev2025optimal, poor2009quickest, shiryaev2010quickest, yu2026beyond, shiryaev2009stochastic, pollak1985diffusion, pollak1985optimal, moustakides1986optimal, shiryaev2019stochastic, cai2026optimal]. The Bayesian formulation is naturally expressed as a Markov optimal stopping problem after filtering, while minimax formulations lead to procedures such as CUSUM and Shiryaev–Roberts [page1954continuous, wang2025multi, lorden1971procedures, moustakides1986optimal, pollak1985optimal, pollak2009optimality, polunchenko2018comparative, gao2022rolling]. The field now provides a unified framework for applications ranging from engineering and finance to surveillance and epidemiology.

A second strand of literature concerns diffusion observations and state-dependent signal structures. For Brownian models with constant signal-to-noise ratio, filtering reduces the problem to a one-dimensional posterior diffusion and optimal rules are characterized by scalar thresholds. The situation becomes substantially more difficult when the signal-to-noise ratio depends on the current state of the observed diffusion. In this setting, the posterior probability alone generally fails to form a closed Markov state. A line of research initiated by Gapeev and Shiryaev [gapeev2011sequential, gapeev2013bayesian] developed sequential testing and quickest detection formulations for diffusion processes with state-dependent coefficients. Subsequent analyses by Johnson and Peskir [johnson2017quickest, yu2026from, johnson2018sequential] revealed the rich boundary structure that may arise even in special cases such as Bessel-process observations. Most recently, Ernst and Peskir [ernst2024gapeev] resolved the Gapeev–Shiryaev conjecture by proving that monotonicity of the signal-to-noise ratio implies monotonicity of the associated optimal stopping boundaries.

A related body of work studies multidimensional and multi-source detection problems. When observations are collected from multiple sensors or coupled systems, the filtering state becomes vector valued and the stopping region is typically a hypersurface rather than a scalar threshold. Such models arise in distributed surveillance, sensor networks, and structural monitoring, and provide natural examples where finite-dimensional sufficient statistics remain available but the geometry of the stopping rule becomes substantially more complex [ludkovski2012bayesian, zhang2014quickest, kurt2018multisensor, konev2017quickest, didi2024active, dayanik2016sequential].

Methodologically, the present paper belongs to the optimal-stopping and free-boundary literature [peskir2006optimal]. After filtering, both sequential testing and Bayesian quickest detection reduce to variational inequalities for Markov sufficient statistics, with optimal rules represented as first-entry times into stopping regions [peskir2000sequential, ekstrom2022multi, epstein2022optimal, ankirchner2020bayesian]. Much of the existing literature focuses on deriving explicit solutions, characterizing free boundaries, or establishing structural properties of stopping regions for a fixed penalty specification. In diffusion-based optimal stopping problems, the principal questions have traditionally been existence, regularity, smooth fit, and stochastic control of optimal boundaries [peskir2019continuity, arkin2009variational, oshima2006optimal, oksendal2019applied].

The present paper addresses a different question. We do not seek a new explicit solution of a particular diffusion stopping problem. Instead, we study how the optimal rule changes when the delay-penalty structure changes. Our object is therefore the comparative-statics map

\text{penalty profile}\longmapsto\text{value function}\longmapsto\text{continuation region}\longmapsto\text{stopping boundary}.

Once a filtering reduction and optimal-stopping formulation are available, we show that larger marginal delay penalties increase the value function, shrink continuation regions, induce earlier stopping, and, whenever a one-sided boundary representation is known, produce a monotone ordering of alarm boundaries. Thus the contribution of the paper is not a new boundary formula but a structural comparison principle that applies across a broad class of diffusion-based sequential testing and quickest-detection models.

3 Filtering Reductions for Diffusion Sequential Problems

Throughout, $(\Omega,\mathcal{F},\mathbb{F},\mathbb{P})$ carries a standard Brownian motion $B=(B_{t})_{t\geq 0}$ , the observation is a one-dimensional diffusion $X=(X_{t})_{t\geq 0}$ on a state space $I\subseteq\mathbb{R}$ , and $\mathbb{F}^{X}=(\mathcal{F}^{X}_{t})_{t\geq 0}$ is the augmented right-continuous observation filtration. We write $\mathbb{P}_{i}:=\mathbb{P}(\,\cdot\mid\theta=i)$ for the conditional laws under the two hypotheses and $\mathbb{E}_{i},\mathbb{E}_{\pi}$ for the corresponding expectations.

3.1 Sequential testing

Under hypothesis $H_{i}$ , $i\in\{0,1\}$ , the observation solves

\,\mathrm{d}X_{t}=\mu_{i}(X_{t})\,\mathrm{d}t+\sigma(X_{t})\,\mathrm{d}B_{t},\qquad X_{0}=x_{0},

(1)

with $\mu_{0}\not\equiv\mu_{1}$ and $\sigma>0$ . The hidden hypothesis $\theta\in\{0,1\}$ has prior $\pi:=\mathbb{P}(\theta=1)\in(0,1)$ . A testing rule is a pair $(\tau,d)$ consisting of an $\mathbb{F}^{X}$ -stopping time $\tau$ and an $\mathcal{F}^{X}_{\tau}$ -measurable terminal decision $d\in\{0,1\}$ . With unit sampling cost per unit time and error costs $a,b>0$ for the two types of error, the Bayes risk is

R_{\pi}(\tau,d)=\mathbb{E}_{\pi}[\tau]+a\,\mathbb{P}_{\pi}(d=1,\theta=0)+b\,\mathbb{P}_{\pi}(d=0,\theta=1).

(2)

Let $\Pi_{t}:=\mathbb{P}_{\pi}(\theta=1\mid\mathcal{F}^{X}_{t})$ be the posterior probability of $H_{1}$ . For a fixed stopping time, the terminal-error part of (2) is minimized by deciding $H_{1}$ when its posterior cost is smaller, that is, $d^{\ast}=\mathbf{1}_{\{\Pi_{\tau}\geq p^{\dagger}\}}$ with $p^{\dagger}=a/(a+b)$ , and the resulting conditional terminal cost is

\mathbb{E}_{\pi}\!\big[a\,\mathbf{1}_{\{d^{\ast}=1,\theta=0\}}+b\,\mathbf{1}_{\{d^{\ast}=0,\theta=1\}}\mid\mathcal{F}^{X}_{\tau}\big]=\min\{b\Pi_{\tau},\,a(1-\Pi_{\tau})\}=:M(\Pi_{\tau}),

(3)

where $M$ is concave and piecewise linear with apex at $p^{\dagger}$ . Substituting $d^{\ast}$ collapses (2) to the optimal stopping problem

V(\pi,x_{0})=\inf_{\tau}\,\mathbb{E}_{\pi,x_{0}}\big[\tau+M(\Pi_{\tau})\big].

(4)
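The passage from (2) to (4) can be checked at a single posterior value. The following Python sketch evaluates the indifference point $p^{\dagger}$, the decision $d^{\ast}$, and the terminal cost $M$ of (3). The function names and the sample costs used in any call are illustrative assumptions, not notation from the paper.

```python
def indifference_point(a: float, b: float) -> float:
    """Posterior level p_dagger = a / (a + b) at which both decisions cost the same."""
    return a / (a + b)


def bayes_decision(p: float, a: float, b: float) -> int:
    """Decision d* of Section 3.1: choose H1 (return 1) iff p >= p_dagger."""
    return 1 if p >= indifference_point(a, b) else 0


def expected_error_cost(p: float, d: int, a: float, b: float) -> float:
    """Posterior expected error cost of deciding d when P(theta = 1 | data) = p.

    Deciding H1 is wrong with probability 1 - p (cost a);
    deciding H0 is wrong with probability p (cost b).
    """
    return a * (1.0 - p) if d == 1 else b * p


def terminal_cost(p: float, a: float, b: float) -> float:
    """Terminal cost M(p) = min{b p, a (1 - p)} from (3)."""
    return min(b * p, a * (1.0 - p))
```

For any posterior value `p`, `expected_error_cost(p, bayes_decision(p, a, b), a, b)` should agree with `terminal_cost(p, a, b)`, which is the content of (3).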

Under $\mathbb{P}_{1}\ll\mathbb{P}_{0}$ on $\mathcal{F}^{X}_{t}$ , Girsanov’s theorem gives the likelihood ratio

L_{t}=\left.\frac{\,\mathrm{d}\mathbb{P}_{1}}{\,\mathrm{d}\mathbb{P}_{0}}\right|_{\mathcal{F}^{X}_{t}}=\exp\!\left\{\int_{0}^{t}\frac{\mu_{1}-\mu_{0}}{\sigma^{2}}(X_{s})\,\mathrm{d}X_{s}-\tfrac{1}{2}\int_{0}^{t}\frac{\mu_{1}^{2}-\mu_{0}^{2}}{\sigma^{2}}(X_{s})\,\mathrm{d}s\right\}.

(5)

Introducing the signal-to-noise ratio

\vartheta(x):=\frac{\mu_{1}(x)-\mu_{0}(x)}{\sigma(x)},

(6)

one has $\,\mathrm{d}L_{t}/L_{t}=\vartheta(X_{t})\,\mathrm{d}B_{t}^{0}$ under $\mathbb{P}_{0}$ , where $B^{0}$ is the $\mathbb{P}_{0}$ -driving Brownian motion, so $L$ is a $\mathbb{P}_{0}$ -martingale. The posterior odds are $\Phi_{t}:=\Pi_{t}/(1-\Pi_{t})=\tfrac{\pi}{1-\pi}L_{t}$ . The innovation process

\bar{B}_{t}=\int_{0}^{t}\frac{1}{\sigma(X_{s})}\Big(\,\mathrm{d}X_{s}-\big[\mu_{0}(X_{s})+(\mu_{1}-\mu_{0})(X_{s})\Pi_{s}\big]\,\mathrm{d}s\Big)

(7)

is a standard $\mathbb{F}^{X}$ -Brownian motion [liptser1977statistics], and the Kushner–Stratonovich equation for the two-valued hidden variable gives the posterior diffusion together with the observation in innovation form,

\,\mathrm{d}\Pi_{t}=\vartheta(X_{t})\,\Pi_{t}(1-\Pi_{t})\,\mathrm{d}\bar{B}_{t},\qquad\,\mathrm{d}X_{t}=\big[\mu_{0}(X_{t})+(\mu_{1}-\mu_{0})(X_{t})\Pi_{t}\big]\,\mathrm{d}t+\sigma(X_{t})\,\mathrm{d}\bar{B}_{t}.

(8)

The odds process is the smooth image $\Phi=\Pi/(1-\Pi)$ of $\Pi$ under the bijection $p\mapsto p/(1-p)$ of $(0,1)$ onto $(0,\infty)$ ; we therefore use $(\Pi,X)$ and $(\Phi,X)$ interchangeably as state descriptors and do not record a separate stochastic differential for $\Phi$ , which carries an Itô correction relative to (8).
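A minimal sketch of how the posterior can be computed from a sampled observation path, assuming a left-point (Itô) discretization of the log-likelihood ratio in (5) and the relation $\Phi_{t}=\tfrac{\pi}{1-\pi}L_{t}$. Working through the log-odds keeps the discrete posterior inside $(0,1)$, which a direct Euler step on (8) does not guarantee. The function name `posterior_path` and its arguments are hypothetical, and the discretization is only first-order accurate in the step size.

```python
import math
from typing import Callable, List, Sequence


def posterior_path(
    xs: Sequence[float],
    dt: float,
    prior: float,
    mu0: Callable[[float], float],
    mu1: Callable[[float], float],
    sigma: Callable[[float], float],
) -> List[float]:
    """Posterior P(theta = 1 | path) on the sampling grid, via the odds Phi = prior odds * L."""
    log_odds = math.log(prior / (1.0 - prior))
    out = [prior]
    for x_now, x_next in zip(xs[:-1], xs[1:]):
        s2 = sigma(x_now) ** 2
        m0, m1 = mu0(x_now), mu1(x_now)
        # Left-point increment of log L_t from (5):
        # (mu1 - mu0)/sigma^2 dX - (1/2)(mu1^2 - mu0^2)/sigma^2 dt
        log_odds += (m1 - m0) / s2 * (x_next - x_now)
        log_odds -= 0.5 * (m1 * m1 - m0 * m0) / s2 * dt
        # Numerically stable map from log-odds to probability
        if log_odds >= 0.0:
            out.append(1.0 / (1.0 + math.exp(-log_odds)))
        else:
            e = math.exp(log_odds)
            out.append(e / (1.0 + e))
    return out
```

The drift and volatility are passed as callables so that a state-dependent signal-to-noise ratio needs no change to the routine.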

3.2 Bayesian quickest detection

Now the drift switches at an unobservable change time $\theta\geq 0$ :

\,\mathrm{d}X_{t}=\mu_{0}(X_{t})\,\mathrm{d}t+\sigma(X_{t})\,\mathrm{d}B_{t}\ \ (t<\theta),\qquad\,\mathrm{d}X_{t}=\mu_{1}(X_{t})\,\mathrm{d}t+\sigma(X_{t})\,\mathrm{d}B_{t}\ \ (t\geq\theta),

(9)

with the standard prior placing an atom at the origin and an exponential tail,

\mathbb{P}(\theta=0)=\pi,\qquad\mathbb{P}(\theta>t\mid\theta>0)=e^{-\lambda t},\quad\lambda>0.

(10)

For an alarm time $\tau$ the linear-delay Bayes risk weighs the probability of a false alarm against the expected detection delay,

R_{\pi}(\tau)=\mathbb{P}_{\pi}(\tau<\theta)+c\,\mathbb{E}_{\pi}\big[(\tau-\theta)^{+}\big],\qquad c>0.

(11)

Let $\Pi_{t}:=\mathbb{P}_{\pi}(\theta\leq t\mid\mathcal{F}^{X}_{t})$ . The false-alarm probability is $\mathbb{P}_{\pi}(\tau<\theta)=\mathbb{E}_{\pi}[\mathbb{P}_{\pi}(\theta>\tau\mid\mathcal{F}^{X}_{\tau})]=\mathbb{E}_{\pi}[1-\Pi_{\tau}]$ , and, by Fubini and the optional projection,

\mathbb{E}_{\pi}\big[(\tau-\theta)^{+}\big]=\mathbb{E}_{\pi}\!\left[\int_{0}^{\tau}\mathbf{1}_{\{\theta\leq s\}}\,\mathrm{d}s\right]=\mathbb{E}_{\pi}\!\left[\int_{0}^{\tau}\mathbb{P}_{\pi}(\theta\leq s\mid\mathcal{F}^{X}_{s})\,\mathrm{d}s\right]=\mathbb{E}_{\pi}\!\left[\int_{0}^{\tau}\Pi_{s}\,\mathrm{d}s\right].

(12)

Hence (11) becomes the optimal stopping problem

V(\pi)=\inf_{\tau}\,\mathbb{E}_{\pi}\!\left[(1-\Pi_{\tau})+c\int_{0}^{\tau}\Pi_{s}\,\mathrm{d}s\right],

(13)

with running cost $f(p)=cp$ and terminal cost $G(p)=1-p$ . The analytically convenient Shiryaev (weighted likelihood-ratio) statistic is the posterior odds, which admits the explicit representation

\Phi_{t}:=\frac{\Pi_{t}}{1-\Pi_{t}}=\frac{\pi}{1-\pi}\,e^{\lambda t}L_{t}+\lambda\int_{0}^{t}e^{\lambda(t-s)}\,\frac{L_{t}}{L_{s}}\,\mathrm{d}s,

(14)

where $L_{t}/L_{s}$ is the post-change-to-pre-change likelihood ratio over $[s,t]$ formed from (5). The filtering equation for the posterior, with the compensator $\lambda(1-\Pi_{t})$ of $\mathbf{1}_{\{\theta\leq t\}}$ induced by (10), is

\,\mathrm{d}\Pi_{t}=\lambda(1-\Pi_{t})\,\mathrm{d}t+\vartheta(X_{t})\,\Pi_{t}(1-\Pi_{t})\,\mathrm{d}\bar{B}_{t},

(15)

with $X$ as in (8). As above, $(\Phi,X)$ is the equivalent state under the bijection $p\mapsto p/(1-p)$ .
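The representation (14) suggests a one-step recursion for the odds on a time grid. The sketch below assumes the first-order update $\Phi_{t+\Delta}\approx e^{\lambda\Delta}\,(L_{t+\Delta}/L_{t})\,(\Phi_{t}+\lambda\Delta)$, obtained by freezing the integrand of (14) over one step; it is an illustration under that assumption, not the paper's numerical method. The names `shiryaev_odds` and `odds_to_posterior` are hypothetical, and the log-likelihood increments are assumed to be supplied by a discretization of (5).

```python
import math
from typing import Iterable, List


def shiryaev_odds(
    log_lr_increments: Iterable[float], dt: float, lam: float, prior: float
) -> List[float]:
    """Approximate the posterior odds Phi_t of (14) on a grid of step dt.

    log_lr_increments[k] is the increment of log L over step k.
    Update: Phi <- exp(lam * dt + dlogL) * (Phi + lam * dt), first order in dt.
    """
    phi = prior / (1.0 - prior)
    path = [phi]
    for dlog in log_lr_increments:
        phi = math.exp(lam * dt + dlog) * (phi + lam * dt)
        path.append(phi)
    return path


def odds_to_posterior(phi: float) -> float:
    """Inverse of the bijection p -> p / (1 - p)."""
    return phi / (1.0 + phi)
```

With uninformative increments (all zero) the recursion should track the prior-only solution $\Phi_{t}=e^{\lambda t}/(1-\pi)-1$, which follows from (14) with $L\equiv 1$ and provides a simple consistency check.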

3.3 Closed Markov state and generator

The reductions (4) and (13) are optimal stopping problems driven by the posterior. Whether the posterior is by itself a closed Markov state depends on the signal-to-noise ratio.

Assumption 1.

$\mu_{0},\mu_{1},\sigma$ are locally Lipschitz on $I$ , $\sigma>0$ on $I$ , and (1) admits weakly unique nonexplosive solutions under $H_{0}$ and $H_{1}$ . Moreover, for each finite $T>0$ ,

\mathbb{E}_{i}\!\left[\exp\Big\{\tfrac{1}{2}\int_{0}^{T}\vartheta^{2}(X_{s})\,\mathrm{d}s\Big\}\right]<\infty\quad(i=0,1),

so that (5) is a true $\mathbb{P}_{0}$ -martingale on finite horizons.

Theorem 2 (Closed Markov state for diffusion observations).

Under Assumption 1, the pair $(\Pi_{t},X_{t})_{t\geq 0}$ is a time-homogeneous $\mathbb{F}^{X}$ -Markov sufficient statistic for the sequential decision problem, in both the testing and quickest-detection settings. Its generator on $w\in C^{2}((0,1)\times I)$ is

(\mathscr{L}^{T}w)(p,x)=\tfrac{1}{2}\vartheta^{2}(x)\,p^{2}(1-p)^{2}w_{pp}+\vartheta(x)\sigma(x)\,p(1-p)\,w_{px}+\tfrac{1}{2}\sigma^{2}(x)\,w_{xx}+\big[\mu_{0}(x)+(\mu_{1}-\mu_{0})(x)\,p\big]\,w_{x}

(16)

in the testing case, and $\mathscr{L}^{D}=\mathscr{L}^{T}+\lambda(1-p)\,\partial_{p}$ in the quickest-detection case. If $\vartheta$ is constant, the posterior coordinate has closed one-dimensional Markov dynamics and the problem projects onto $\Pi$ alone, with generator

(\mathscr{L}_{0}w)(p)=\lambda(1-p)\,w^{\prime}(p)+\tfrac{1}{2}\vartheta^{2}p^{2}(1-p)^{2}w^{\prime\prime}(p)

(17)

(omitting the $\lambda$ -drift in the testing case). If $\vartheta$ is state dependent, the posterior SDE is not closed in $\Pi$ alone. Thus $(\Pi,X)$ provides the natural closed Markov realization of the filtering state. We do not attempt to characterize exceptional projection cases in which the posterior marginal may nevertheless be Markov.

Proof.

Assumption 1 is Novikov’s criterion, so (5) is a true martingale and Girsanov’s theorem yields the odds representations of Sections 3.1–3.2. The innovation theorem [liptser1977statistics] gives the $\mathbb{F}^{X}$ -Brownian motion $\bar{B}$ of (7), and the Kushner–Stratonovich equation for the two-valued hidden variable produces (8) and, with the compensator $\lambda(1-\Pi_{t})$ of $\mathbf{1}_{\{\theta\leq t\}}$ under the prior (10), (15). Writing $X$ in innovation form gives the joint dynamics, with quadratic covariation

\,\mathrm{d}\langle\Pi,X\rangle_{t}=\vartheta(X_{t})\,\sigma(X_{t})\,\Pi_{t}(1-\Pi_{t})\,\mathrm{d}t.

(18)

Applying Itô’s formula to $w(\Pi_{t},X_{t})$ and collecting drift terms yields (16) and, with the extra posterior drift, $\mathscr{L}^{D}$ ; the cross term in (16) is exactly (18). All coefficients are time-independent functions of the current value $(\Pi_{t},X_{t})$ , so $(\Pi,X)$ is a time-homogeneous Markov process; sufficiency for the decision problem is inherited from the posterior being a sufficient statistic for $\theta$ . If $\vartheta$ is constant and $w$ depends on $p$ only, (16) collapses to (17) and the marginal law of $\Pi$ is determined by $\Pi$ alone. When $\vartheta$ is state dependent, the coefficient of the posterior martingale term in (8) and (15) depends on $X_{t}$ , so the posterior equation is not autonomous in $\Pi_{t}$ . The augmented process supplies a closed Markov state; possible exceptional Markovian projections are outside the scope of the present comparison result. ∎

Remark 3 (On the cross term and projection).

The cross term $w_{px}$ in (16) reflects the common innovation noise driving both the posterior and the observation. When $\vartheta$ is constant the posterior coordinate has closed one-dimensional Markov dynamics and the stopping problem can be projected onto $\Pi$ alone; the cross term persists only in the redundant two-dimensional representation $(\Pi,X)$ . State dependence of $\vartheta$ removes the projection and makes $(\Pi,X)$ the operative state.

Remark 4 (Degeneracy and regularity).

The diffusion matrix in (16),

\begin{pmatrix}\vartheta^{2}p^{2}(1-p)^{2}&\vartheta\sigma p(1-p)\\[2.0pt] \vartheta\sigma p(1-p)&\sigma^{2}\end{pmatrix},

has determinant zero, so $(\Pi,X)$ diffuses along a single direction in $(p,x)$ -space: both coordinates are driven by the one innovation $\bar{B}$ . The operator is therefore degenerate elliptic rather than uniformly elliptic. Hypoellipticity nonetheless holds under Hörmander-type conditions on $(\vartheta,\sigma,\mu_{i})$ , which underlies the regularity and continuity of the resulting two-dimensional stopping boundaries [peskir2019continuity, ernst2024gapeev].
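The degeneracy can be made explicit by a rank-one factorization. In the worked identity below, the vector $v$ is notation introduced here only for illustration; it collects the two loadings on the single innovation $\bar{B}$ read off from (8).

```latex
\begin{pmatrix}
\vartheta^{2}p^{2}(1-p)^{2} & \vartheta\sigma\,p(1-p)\\[2pt]
\vartheta\sigma\,p(1-p) & \sigma^{2}
\end{pmatrix}
= v\,v^{\top},
\qquad
v=\begin{pmatrix}\vartheta(x)\,p(1-p)\\ \sigma(x)\end{pmatrix},
\qquad
\det\big(v\,v^{\top}\big)
=\vartheta^{2}\sigma^{2}p^{2}(1-p)^{2}-\big(\vartheta\sigma\,p(1-p)\big)^{2}=0 .
```

Since $\sigma>0$, the matrix has rank exactly one at every point, so the second-order part of (16) acts only along the direction $v$.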

Example 5 (State-dependent signal-to-noise: Bessel dimension).

With $\sigma\equiv 1$ and $\mu_{i}(x)=(d_{i}-1)/(2x)$ on $I=(0,\infty)$ , the process $X$ is a Bessel process of dimension $d_{i}$ under $H_{i}$ , and $\vartheta(x)=(d_{1}-d_{0})/(2x)$ is state dependent. The likelihood ratio (5) becomes

L_{t}=\exp\!\Big\{\tfrac{d_{1}-d_{0}}{2}\int_{0}^{t}X_{s}^{-1}\,\mathrm{d}X_{s}-\tfrac{(d_{1}-d_{0})(d_{1}+d_{0}-2)}{8}\int_{0}^{t}X_{s}^{-2}\,\mathrm{d}s\Big\},

and does not yield a closed one-dimensional posterior equation; the augmented pair $(\Pi,X)$ is the closed state. The resulting two-dimensional stopping problem admits an analytic characterization in terms of special functions and the associated free-boundary conditions [johnson2017quickest, johnson2018sequential].
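The coefficients in the displayed likelihood ratio follow from (5) by direct substitution. The short computation below, with $\sigma\equiv 1$ and $\mu_{i}(x)=(d_{i}-1)/(2x)$ as in the example, is included only as a check of the constants.

```latex
\frac{\mu_{1}-\mu_{0}}{\sigma^{2}}(x)=\frac{d_{1}-d_{0}}{2x},
\qquad
\frac{1}{2}\,\frac{\mu_{1}^{2}-\mu_{0}^{2}}{\sigma^{2}}(x)
=\frac{(d_{1}-1)^{2}-(d_{0}-1)^{2}}{8x^{2}}
=\frac{(d_{1}-d_{0})(d_{1}+d_{0}-2)}{8x^{2}} .
```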

4 Generic Optimal Stopping Formulation

The reductions above are instances of a single optimal stopping problem. Let $Y=(Y_{t})_{t\geq 0}$ be a time-homogeneous Markov process on a state space $\mathcal{S}$ with generator $\mathscr{L}$ (for example $Y=\Pi$ , $(\Pi,X)$ , or $(\Phi,X)$ ). Given a running cost $f\geq 0$ and a terminal cost $G$ , set

V(y)=\inf_{\tau}\,\mathbb{E}_{y}\!\left[\int_{0}^{\tau}f(Y_{s})\,\mathrm{d}s+G(Y_{\tau})\right],\qquad y\in\mathcal{S},

(19)

with continuation and stopping regions

\mathcal{C}=\{y:V(y)<G(y)\},\qquad\mathcal{D}=\{y:V(y)=G(y)\}.

(20)

For sequential testing $f\equiv 1$ and $G=M$ of (3); for quickest detection $f(p)=cp$ and $G(p)=1-p$ of (13).

Standing assumptions.

We assume throughout the comparison results that, for the running and terminal costs under consideration, the value function (19) is finite on $\mathcal{S}$ and the first-entry time $\tau^{\ast}=\inf\{t\geq 0:Y_{t}\in\mathcal{D}\}$ is optimal in (19). These are mild and standard under, for instance, lower semicontinuity of $G$ , continuity of $f$ , and a moment or transience condition ensuring finiteness; they hold in the diffusion models considered here [peskir2006optimal].

The dynamic programming principle then gives

V\leq G,\qquad\mathscr{L}V+f=0\ \text{ on }\mathcal{C},\qquad\mathscr{L}V+f\geq 0\ \text{ on }\mathcal{D},

(21)

or, equivalently, the variational inequality

\max\big\{-\big(\mathscr{L}V+f\big),\ V-G\big\}=0\qquad\text{on }\mathcal{S},

(22)

which is in turn equivalent to the complementarity system

V\leq G,\qquad\mathscr{L}V+f\geq 0,\qquad(G-V)\,(\mathscr{L}V+f)=0.

(23)

Both arguments of the maximum in (22) are nonpositive, which makes the sign convention transparent: on $\mathcal{C}$ one has $\mathscr{L}V+f=0$ , and on $\mathcal{D}$ one has $\mathscr{L}V+f\geq 0$ , equivalently $\mathscr{L}G+f\geq 0$ . (Footnote 1: The compact form $\min\{\mathscr{L}V+f,\,G-V\}=0$ is equivalent to (22)–(23); we adopt the maximum/complementarity form because both of its arguments are manifestly nonpositive and the equality $\mathscr{L}V+f=0$ on $\mathcal{C}$ is then immediate.) Interpretations are in the classical, Sobolev, or viscosity sense according to the regularity of $V$ . The optimal rule is $\tau^{\ast}=\inf\{t:Y_{t}\in\mathcal{D}\}$ .

5 Delay-Penalty Comparison

The central result orders sequential rules by their running cost. It is purely comparative and does not require solving (22).

Theorem 6 (Delay-penalty ordering of sequential rules).

Let $Y$ be a Markov process on $\mathcal{S}$ with generator $\mathscr{L}$ , fix a common terminal cost $G$ , and for $i=1,2$ let $f_{i}\geq 0$ be measurable running costs with value functions $V_{i}$ , regions $\mathcal{C}_{i},\mathcal{D}_{i}$ as in (19)–(20), and optimal first-entry times $\tau_{i}^{\ast}=\inf\{t:Y_{t}\in\mathcal{D}_{i}\}$ . Assume $f_{1}\geq f_{2}$ pointwise on $\mathcal{S}$ . Then:

(i) (Value) $V_{1}\geq V_{2}$ on $\mathcal{S}$ .

(ii) (Regions and stopping times) $\mathcal{C}_{1}\subseteq\mathcal{C}_{2}$ , $\mathcal{D}_{2}\subseteq\mathcal{D}_{1}$ , and $\tau_{1}^{\ast}\leq\tau_{2}^{\ast}$ $\mathbb{P}_{y}$ -almost surely for every $y$ . Apart from the standing assumptions that the value functions are finite and that the displayed first-entry times are optimal, no monotonicity, smoothness, or one-sidedness of the stopping set is needed.

(iii) (Boundaries) If, in addition, each stopping region is one-sided in the posterior coordinate, $\mathcal{D}_{i}=\{(p,x)\in\mathcal{S}:p\geq b_{i}(x)\}$ for boundary functions $b_{i}:I\to[0,1]$ (the structure established under a monotone signal-to-noise condition in [gapeev2011sequential, gapeev2013bayesian, ernst2024gapeev]), then $b_{1}(x)\leq b_{2}(x)$ for all $x\in I$ .

Proof.

(i) For each admissible $\tau$ , pathwise $f_{1}\geq f_{2}\geq 0$ gives $\int_{0}^{\tau}f_{1}(Y_{s})\,\mathrm{d}s\geq\int_{0}^{\tau}f_{2}(Y_{s})\,\mathrm{d}s$ , hence $\mathbb{E}_{y}[\int_{0}^{\tau}f_{1}\,\mathrm{d}s+G(Y_{\tau})]\geq\mathbb{E}_{y}[\int_{0}^{\tau}f_{2}\,\mathrm{d}s+G(Y_{\tau})]$ ; taking the infimum over $\tau$ yields $V_{1}(y)\geq V_{2}(y)$ .

(ii) Choosing $\tau\equiv 0$ shows $V_{i}\leq G$ . If $y\in\mathcal{C}_{1}$ , i.e. $V_{1}(y)<G(y)$ , then by (i) $V_{2}(y)\leq V_{1}(y)<G(y)$ , so $y\in\mathcal{C}_{2}$ ; thus $\mathcal{C}_{1}\subseteq\mathcal{C}_{2}$ and, complementarily, $\mathcal{D}_{2}\subseteq\mathcal{D}_{1}$ . Since $\tau_{1}^{\ast}$ and $\tau_{2}^{\ast}$ are first-entry times of the same process $Y$ into $\mathcal{D}_{1}\supseteq\mathcal{D}_{2}$ , any entry of $Y$ into $\mathcal{D}_{2}$ already lies in $\mathcal{D}_{1}$ ; hence $\tau_{1}^{\ast}\leq\tau_{2}^{\ast}$ almost surely.

(iii) Under the one-sided representation, $\mathcal{D}_{2}\subseteq\mathcal{D}_{1}$ reads $\{p\geq b_{2}(x)\}\subseteq\{p\geq b_{1}(x)\}$ for each fixed $x$ , which holds if and only if $b_{1}(x)\leq b_{2}(x)$ . ∎

The monotonicity of the optimal stopping boundary with respect to the marginal delay penalty is illustrated in Figure 1. Under a more stringent delay penalty (i.e., $c_{1}>c_{2}$ ), the decision-maker is less willing to wait, and the continuation region shrinks. Consequently, the optimal threshold shifts (weakly) downward, yielding $b_{c_{1}}(x)\leq b_{c_{2}}(x)$ for all $x\in I$ .

Figure 1: Illustration of the optimal stopping boundaries in the $(X_{t},p_{t})$ state space. The optimal policy partitions the space into a continuation region $\mathcal{C}_{c_{1}}$ and a stopping region $\mathcal{D}_{c_{1}}$ , separated by the boundary $b_{c_{1}}(x)$ . A sample trajectory of the augmented process is shown, which starts at $(X_{0},p_{0})$ and triggers an alarm at the optimal stopping time $\tau^{*}$ upon hitting the boundary. Additionally, the figure demonstrates the comparative statics: a higher marginal delay penalty ( $c_{1}>c_{2}$ ) weakly lowers the optimal stopping threshold, so that $b_{c_{1}}(x)\leq b_{c_{2}}(x)$ .

Remark 7 (Comparative-statics reading).

Parts (i)–(ii) hold under only the standing assumptions and deliver the operational message: a uniformly larger marginal delay penalty makes continuation less attractive everywhere and triggers (weakly) earlier alarms, hence shorter expected delay at the cost of more false alarms. Part (iii) translates this into boundary geometry, but only once the stopping region is known to be one-sided, a property that need not hold for arbitrary state-dependent diffusions and is exactly what the monotone signal-to-noise results secure. We do not establish the one-sided structure here; we invoke it from [gapeev2011sequential, gapeev2013bayesian, ernst2024gapeev].

5.1 Linear delay cost

The classical disorder problem has $f(p)=cp$ and $G(p)=1-p$ on the common state $Y$ . Theorem 6 with $f_{i}=c_{i}\,p$ specializes as follows.

Corollary 8 (Linear delay cost).

Let $c_{1}\geq c_{2}>0$ in the Bayesian disorder problem with common state $Y$ and terminal cost $G(p)=1-p$ . Then

V_{c_{1}}\geq V_{c_{2}},\qquad\mathcal{C}_{c_{1}}\subseteq\mathcal{C}_{c_{2}},\qquad\tau_{c_{1}}^{\ast}\leq\tau_{c_{2}}^{\ast}\quad\mathbb{P}_{y}\text{-a.s.}

If the stopping set has the form $\mathcal{D}_{c}=\{(p,x):p\geq b_{c}(x)\}$ , then $b_{c_{1}}(x)\leq b_{c_{2}}(x)$ for all $x\in I$ . In particular, in the constant-SNR case the single threshold satisfies $p^{\ast}(c_{1})\leq p^{\ast}(c_{2})$ : a larger delay cost rate implies an earlier alarm.
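The threshold ordering in the constant-SNR case can be observed numerically. The sketch below is not the finite-difference method of Section 7; it assumes instead a Markov chain approximation of the generator (17) on a uniform grid in $p$ (upwind in the drift $\lambda(1-p)$), solved by Gauss-Seidel value iteration started from the obstacle $G$. The function name `solve_shiryaev`, the grid size, the tolerance, and the default parameter values are all illustrative assumptions.

```python
from typing import List, Tuple


def solve_shiryaev(
    c: float,
    lam: float = 1.0,
    snr: float = 1.0,
    n: int = 100,
    tol: float = 1e-10,
    max_sweeps: int = 100_000,
) -> Tuple[List[float], List[float], float]:
    """Approximate V and the alarm threshold for f(p) = c p, G(p) = 1 - p."""
    h = 1.0 / n
    grid = [i * h for i in range(n + 1)]
    G = [1.0 - p for p in grid]       # terminal cost; G(1) = 0, so p = 1 always stops
    V = G[:]                          # start from the obstacle and iterate downward
    # Local time step and up/down probabilities of the approximating chain.
    coef = []
    for p in grid[:-1]:
        a = (snr * p * (1.0 - p)) ** 2   # squared diffusion coefficient in (17)
        b = lam * (1.0 - p)              # drift in (17), positive for p < 1
        q = a + h * b
        coef.append((h * h / q, (0.5 * a + h * b) / q, 0.5 * a / q))
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(n - 1, -1, -1):   # sweep from high p to low p
            dt, up, dn = coef[i]
            below = V[i - 1] if i > 0 else 0.0   # dn is zero at p = 0
            cont = c * grid[i] * dt + up * V[i + 1] + dn * below
            new = min(G[i], cont)        # stop if continuing is not cheaper
            change = max(change, abs(new - V[i]))
            V[i] = new
        if change < tol:
            break
    # Smallest grid point at which stopping is optimal (V equals G there).
    threshold = next(p for p, v, g in zip(grid, V, G) if v == g)
    return grid, V, threshold
```

Calling `solve_shiryaev(c=4.0)` and `solve_shiryaev(c=0.5)` and comparing the returned thresholds gives a grid-level illustration of $p^{\ast}(c_{1})\leq p^{\ast}(c_{2})$; the resolution of the threshold is limited to the grid spacing $1/n$.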

5.2 Sampling cost in sequential testing

The same principle applies on the testing side, where the running cost is the sampling cost. Scaling the sampling rate to $\kappa>0$ replaces (4) by $V_{\kappa}(\pi,x)=\inf_{\tau}\mathbb{E}[\kappa\tau+M(\Pi_{\tau})]$ , i.e. $f\equiv\kappa$ with common terminal cost $G=M$ .

Corollary 9 (Sampling cost).

Let $\kappa_{1}\geq\kappa_{2}>0$ in the sequential testing problem with common terminal cost $G=M$ . Then $V_{\kappa_{1}}\geq V_{\kappa_{2}}$ , $\mathcal{C}_{\kappa_{1}}\subseteq\mathcal{C}_{\kappa_{2}}$ , and $\tau_{\kappa_{1}}^{\ast}\leq\tau_{\kappa_{2}}^{\ast}$ almost surely: a more expensive observation stream induces an earlier terminal decision and a narrower continuation band around the indifference point $p^{\dagger}$ .

5.3 Nonlinear marginal delay penalties

For a general delay profile the risk is $R_{\pi}(\tau)=\mathbb{P}_{\pi}(\tau<\theta)+c\,\mathbb{E}_{\pi}[g((\tau-\theta)^{+})]$ with $g\geq 0$ nondecreasing and $g(0)=0$ . The running cost is then no longer a function of $\Pi_{t}$ alone, because the marginal delay cost at time $t$ depends on the unobserved elapsed post-change duration $t-\theta$ . The next proposition isolates the relevant statistic.

Proposition 10 (Marginal-cost augmentation).

Let $g$ be absolutely continuous, nondecreasing, with $g(0)=0$ , and suppose

\mathbb{E}_{\pi}\!\left[\int_{0}^{\tau}g^{\prime}(t-\theta)\,\mathbf{1}_{\{\theta\leq t\}}\,\mathrm{d}t\right]<\infty

for the stopping times $\tau$ under consideration. Then

\mathbb{E}_{\pi}\big[g((\tau-\theta)^{+})\big]=\mathbb{E}_{\pi}\!\left[\int_{0}^{\tau}\mathbb{E}_{\pi}\big[g^{\prime}(t-\theta)\,\mathbf{1}_{\{\theta\leq t\}}\mid\mathcal{F}^{X}_{t}\big]\,\mathrm{d}t\right].

(24)

Consequently the nonlinear-delay disorder problem is the optimal stopping problem (19) with terminal cost $G(p)=1-p$ and running cost

f_{t}=c\,\Psi_{t},\qquad\Psi_{t}=\mathbb{E}_{\pi}\big[g^{\prime}(t-\theta)\,\mathbf{1}_{\{\theta\leq t\}}\mid\mathcal{F}^{X}_{t}\big].

(25)

Proof.

By absolute continuity and $g(0)=0$ , $g((\tau-\theta)^{+})=\int_{0}^{(\tau-\theta)^{+}}g^{\prime}(u)\,\mathrm{d}u=\int_{0}^{\tau}g^{\prime}(t-\theta)\mathbf{1}_{\{\theta\leq t\}}\,\mathrm{d}t$ pathwise. The integrability hypothesis legitimizes taking $\mathbb{E}_{\pi}$ and applying Fubini together with the optional projection of the integrand onto $\mathbb{F}^{X}$ (valid since $\tau$ is an $\mathbb{F}^{X}$ -stopping time), which gives (24) and hence the running-cost representation (25). ∎

Whether $f_{t}=c\Psi_{t}$ yields a finite-dimensional Markov stopping problem depends on $g$ and is not automatic. Three cases are representative.

(a) Linear delay, $g(u)=u$ . Then $g^{\prime}\equiv 1$ and $\Psi_{t}=\mathbb{P}_{\pi}(\theta\leq t\mid\mathcal{F}^{X}_{t})=\Pi_{t}$ , recovering $f=c\Pi_{t}$ on the state of Theorem 2.

(b) Exponential delay, $g(u)=(e^{\beta u}-1)/\beta$ with $\beta>0$ . Then $g^{\prime}(u)=e^{\beta u}$ and the process $\Psi_{t}$ can be represented through a weighted likelihood-ratio statistic, $\Psi_{t}=e^{\beta t}\,\mathbb{E}_{\pi}[e^{-\beta\theta}\mathbf{1}_{\{\theta\leq t\}}\mid\mathcal{F}^{X}_{t}]$ . In state-dependent diffusion models this statistic must still be combined with the observation state $X_{t}$ to obtain a closed Markov state; the representation is not a universal dimension reduction, and the resulting alarm boundary is generally observation-dependent [gapeev2013bayesian].

(c) General $g$ . The statistic $\Psi_{t}$ need not admit a finite-dimensional filter, and the state must be augmented with accumulated-penalty information for the stopping problem to be Markovian.

In every case in which a common Markov state $Y$ carries the statistics for two penalties $g_{1},g_{2}$ , the pointwise ordering of marginal penalties $g_{1}^{\prime}\geq g_{2}^{\prime}$ transfers, via (25) and the monotonicity of conditional expectation, to $f_{1}\geq f_{2}$ . Theorem 6 then applies and yields the corresponding ordering of value functions, continuation regions, stopping times, and, where the one-sided structure holds, alarm boundaries. In the exponential case $g^{\prime}(u)=e^{\beta u}\geq 1$ for $u\geq 0$ , and $g^{\prime}$ is nondecreasing in $\beta$ . On a common Markov state, every $\beta>0$ therefore gives a pointwise-larger marginal penalty than the linear benchmark $g^{\prime}\equiv 1$ and an alarm that is no later, and a larger $\beta$ strengthens the effect.
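The transfer from an ordering of marginal penalties to an ordering of running costs can be illustrated on a toy posterior. The sketch below assumes a hypothetical discrete posterior for $\theta$ given $\mathcal{F}^{X}_{t}$ (atoms and weights invented for illustration, not derived from any filter in the paper) and evaluates $\Psi_{t}$ of (25) for the linear and exponential profiles. It also checks that case (a) returns the posterior mass $\Pi_{t}$ and that case (b) agrees with the weighted representation.

```python
import math

def psi(t, atoms, weights, g_prime):
    """Psi_t = E[g'(t - theta) 1{theta <= t} | F_t] under a discrete posterior for theta."""
    return sum(w * g_prime(t - s) for s, w in zip(atoms, weights) if s <= t)

# Hypothetical posterior for the change time theta (not taken from the paper).
atoms = [0.5, 1.0, 2.0, 4.0]
weights = [0.1, 0.2, 0.3, 0.4]
t, beta = 2.5, 0.7

psi_linear = psi(t, atoms, weights, lambda u: 1.0)               # case (a)
psi_exp = psi(t, atoms, weights, lambda u: math.exp(beta * u))   # case (b)

# Case (a): Psi_t is the posterior probability that the change has occurred.
posterior_mass = sum(w for s, w in zip(atoms, weights) if s <= t)

# Case (b): Psi_t = exp(beta*t) * E[exp(-beta*theta) 1{theta <= t} | F_t].
weighted_form = math.exp(beta * t) * sum(
    w * math.exp(-beta * s) for s, w in zip(atoms, weights) if s <= t
)

print(f"linear: {psi_linear:.4f}, exponential: {psi_exp:.4f}")
```

Because the same posterior weights multiply both marginal penalties, the inequality $e^{\beta u}\geq 1$ carries over term by term, which is the monotonicity-of-conditional-expectation step in miniature.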

6 Free-Boundary Interpretation and Verification

On the Markov state $Y=(\Pi,X)$ (or its one-dimensional reduction), (22) is a free-boundary problem: $\mathscr{L}V+f=0$ on $\mathcal{C}$ , $V=G$ on $\mathcal{D}$ , with matching conditions across the free boundary $\partial\mathcal{C}$ . The continuous-fit condition $V|_{\partial\mathcal{C}}=G|_{\partial\mathcal{C}}$ always holds; the smooth-fit condition $\nabla V|_{\partial\mathcal{C}}=\nabla G|_{\partial\mathcal{C}}$ holds when the boundary point is probabilistically regular for the interior of $\mathcal{D}$ and the diffusion is nondegenerate there. For regular one-dimensional diffusions smooth fit is standard [peskir2006optimal]. In the present degenerate two-dimensional setting it may fail where the diffusion coefficient $\vartheta(x)\,p(1-p)$ vanishes (at $p\in\{0,1\}$ or where $\vartheta(x)=0$ ) or where the boundary is otherwise irregular; there only continuous fit is available, and the boundary’s continuity is itself a delicate question [peskir2019continuity].

When $\vartheta$ is constant the free-boundary problem reduces to the ordinary differential equation $\mathscr{L}_{0}V+f=0$ on the continuation interval, with $\mathscr{L}_{0}$ of (17), and the boundary is a single threshold determined by smooth fit; this is the setting of Section 7. When $\vartheta$ is state dependent, the operator is degenerate elliptic in $(p,x)$ and $\partial\mathcal{C}$ is a curve. Consistent with the two-boundary structure found in [gapeev2011sequential], the alarm is then the first exit of the posterior from a region bounded by observation-dependent (stochastic) boundaries. Explicit solutions exist only in special cases; otherwise the boundary is characterized by systems of nonlinear integral equations arising from the change-of-variable formula with local time on curves, or is computed numerically.

A candidate solution of (22) is confirmed optimal by martingale verification.

Proposition 11 (Verification).

Let $\widehat{V}$ be continuous on $\mathcal{S}$ , of polynomial growth, $C^{1}$ across $\partial\widehat{\mathcal{C}}$ , and $C^{2}$ on the interiors of $\widehat{\mathcal{C}}=\{\widehat{V}<G\}$ and $\widehat{\mathcal{D}}=\{\widehat{V}=G\}$ , and suppose

\mathscr{L}\widehat{V}+f\geq 0\ \text{on }\mathcal{S},\qquad\widehat{V}\leq G\ \text{on }\mathcal{S},\qquad\mathscr{L}\widehat{V}+f=0\ \text{on }\widehat{\mathcal{C}}.

Suppose moreover that for every admissible $\tau$ the local martingale $M_{\,\cdot\,}$ in (26) below, stopped along a localizing sequence $\tau_{n}\uparrow\infty$ , is uniformly integrable in the limit (e.g. a square-integrability or sublinear-growth condition on $\nabla\widehat{V}\cdot\sigma(Y)$ ). Then $\widehat{V}=V$ , and $\tau^{\ast}=\inf\{t:Y_{t}\in\widehat{\mathcal{D}}\}$ is optimal whenever $\mathbb{E}_{y}[\tau^{\ast}]<\infty$ .

Proof.

For an admissible $\tau$ and a localizing sequence $\tau_{n}\uparrow\infty$ , the Itô–Tanaka formula applied to $\widehat{V}(Y_{\cdot})$ gives

\widehat{V}(Y_{\tau\wedge\tau_{n}})+\int_{0}^{\tau\wedge\tau_{n}}f(Y_{s})\,\mathrm{d}s=\widehat{V}(y)+\int_{0}^{\tau\wedge\tau_{n}}(\mathscr{L}\widehat{V}+f)(Y_{s})\,\mathrm{d}s+M_{\tau\wedge\tau_{n}},

(26)

where $M$ is a local martingale. The $C^{1}$ (smooth-fit) hypothesis across $\partial\widehat{\mathcal{C}}$ ensures that the local-time term on the free boundary vanishes: such a term is generated by a jump of the first derivatives $\nabla\widehat{V}$ across the boundary, not by the jump of the second derivatives that $\widehat{V}$ is allowed to have there. Where only continuous fit holds, the local-time term is nonnegative and is retained in the inequality below without affecting its direction. Since $\mathscr{L}\widehat{V}+f\geq 0$ , taking expectations and letting $n\to\infty$ (using the growth and uniform-integrability hypotheses so that $\mathbb{E}_{y}[M_{\tau\wedge\tau_{n}}]\to 0$ ) gives $\widehat{V}(y)\leq\mathbb{E}_{y}[\int_{0}^{\tau}f\,\mathrm{d}s+\widehat{V}(Y_{\tau})]\leq\mathbb{E}_{y}[\int_{0}^{\tau}f\,\mathrm{d}s+G(Y_{\tau})]$ , hence $\widehat{V}\leq V$ . For $\tau=\tau^{\ast}$ the integrand $\mathscr{L}\widehat{V}+f$ vanishes on $\widehat{\mathcal{C}}$ and $\widehat{V}(Y_{\tau^{\ast}})=G(Y_{\tau^{\ast}})$ , so the inequalities are equalities and $\widehat{V}(y)=\mathbb{E}_{y}[\int_{0}^{\tau^{\ast}}f\,\mathrm{d}s+G(Y_{\tau^{\ast}})]\geq V(y)$ . Thus $\widehat{V}=V$ and $\tau^{\ast}$ is optimal. ∎

7 Worked Example: Threshold Monotonicity in the Shiryaev Diffusion Model

We illustrate Corollary 8 in the constant-SNR quickest-detection model, where the posterior is a closed one-dimensional diffusion

\,\mathrm{d}\Pi_{t}=\lambda(1-\Pi_{t})\,\mathrm{d}t+\rho\,\Pi_{t}(1-\Pi_{t})\,\mathrm{d}\bar{B}_{t},\qquad\rho:=\vartheta=\text{const}.

(27)

With running cost $f(p)=cp$ and terminal cost $G(p)=1-p$ , the value function solves the variational inequality (22) with the one-dimensional generator (17). On the continuation region the equation $\mathscr{L}_{0}V+cp=0$ reads

\tfrac{1}{2}\rho^{2}p^{2}(1-p)^{2}\,V^{\prime\prime}(p)+\lambda(1-p)\,V^{\prime}(p)+c\,p=0,

(28)

and on the stopping region $V=G$ , where (23) requires $\mathscr{L}_{0}G+cp\geq 0$ . Since $G^{\prime}=-1$ and $G^{\prime\prime}=0$ , this reads $cp-\lambda(1-p)\geq 0$ , i.e. $p\geq\lambda/(c+\lambda)$ .

Numerical method.

We solve the obstacle problem (23) numerically rather than relying on a closed entrance-boundary shooting condition. The interval $[0,1]$ is discretized on a uniform grid of spacing $\Delta p$ . The degenerate diffusion coefficient $\tfrac{1}{2}\rho^{2}p^{2}(1-p)^{2}$ is evaluated at the grid points and discretized by central differences; the drift term $\lambda(1-p)\geq 0$ is upwinded (forward difference), so that the negative of the discrete generator $\mathscr{L}_{h}$ is an $M$ -matrix and the scheme is monotone. At the degenerate left end $p=0$ , where the diffusion coefficient vanishes, the drift is $\lambda>0$ , and $f=0$ , the upwinded operator supplies its own discrete relation $V_{0}=V_{1}$ , so no entrance-boundary derivative condition is imposed by hand; at $p=1$ we set $V=G=0$ . The obstacle problem is then solved by projected (policy) iteration: each sweep performs a Gauss–Seidel/successive-overrelaxation update of $\mathscr{L}_{h}V+f=0$ followed by the projection $V\leftarrow\min\{V,G\}$ onto the obstacle, and sweeps are repeated until the active set stabilizes and the update falls below $10^{-11}$ . The reported thresholds are stable under halving of $\Delta p$ (grids of $500,1000,2000,4000$ points agree to the displayed digits) and were cross-checked against an entrance-boundary shooting solution of (28); the two methods agree.
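A minimal sketch of this discretization is given below, under stated assumptions. It keeps the central-difference diffusion term, the forward-differenced drift, the relation $V_{0}=V_{1}$ at $p=0$ , and $V=G=0$ at $p=1$ . For brevity it replaces the Gauss–Seidel/SOR sweeps by a direct linear solve on the current continuation set (a Howard-type policy iteration), so it is a swapped-in solver rather than the one used for the reported values. The grid size `n=400`, the tolerances, the dense matrix assembly, and the function names `solve_obstacle` and `threshold` are choices made for this sketch, and it is not claimed to reproduce the reported digits.

```python
import numpy as np

def solve_obstacle(c, lam=0.05, rho=1.0, n=400):
    """Solve min(L_h V + f, G - V) = 0 on a uniform grid of [0, 1].

    Returns the grid p, the value V, and the boolean stopping set.
    """
    p = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    diff = 0.5 * rho**2 * p**2 * (1.0 - p) ** 2 / h**2   # central-difference weight
    drift = lam * (1.0 - p) / h                          # forward (upwind) weight
    f = c * p                                            # running cost c*p
    G = 1.0 - p                                          # obstacle (terminal cost)

    # Discrete generator L_h: tridiagonal, assembled densely for simplicity.
    L = np.zeros((n + 1, n + 1))
    i = np.arange(1, n)
    L[i, i - 1] = diff[i]
    L[i, i] = -2.0 * diff[i] - drift[i]
    L[i, i + 1] = diff[i] + drift[i]
    L[0, 0], L[0, 1] = -drift[0], drift[0]   # p = 0: drift only, i.e. V_0 = V_1
    # Row n stays zero: p = 1 is always a stopping node with V = G = 0.

    identity = np.eye(n + 1)
    stop = np.ones(n + 1, dtype=bool)        # start from "stop everywhere"
    for _ in range(n + 2):
        # Policy evaluation: V = G on stop nodes, L_h V + f = 0 elsewhere.
        A = np.where(stop[:, None], identity, L)
        rhs = np.where(stop, G, -f)
        V = np.linalg.solve(A, rhs)
        # Policy improvement: keep stopping only where L_h V + f >= 0,
        # and start stopping wherever the continued value exceeds the obstacle.
        residual = L @ V + f
        new_stop = np.where(stop, residual >= -1e-9, V > G + 1e-10)
        new_stop[n] = True
        if np.array_equal(new_stop, stop):
            break
        stop = new_stop
    return p, V, stop

def threshold(c, **kwargs):
    """Smallest grid point in the stopping set."""
    p, _, stop = solve_obstacle(c, **kwargs)
    return p[np.argmax(stop)]

if __name__ == "__main__":
    for c in (0.5, 1.0, 2.0, 5.0):
        print(f"c = {c}: discrete threshold ~ {threshold(c):.4f}")
```

Starting from the stop-everywhere policy makes the iterates decrease towards the solution, so the stopping set can only shrink and the loop ends after at most one pass per grid node. The discrete threshold is resolved only to within one grid spacing, so any comparison with tabulated values should allow for that.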

Results.

Table 1 reports the optimal posterior threshold $p^{\ast}(c)$ for $\lambda=0.05$ and $\rho=1.0$ , together with the implied likelihood-ratio threshold $\Phi^{\ast}=p^{\ast}/(1-p^{\ast})$ . The threshold decreases monotonically as the delay cost rate $c$ increases. One checks directly that the computed thresholds satisfy $p^{\ast}\geq\lambda/(c+\lambda)$ , so the candidate solves the variational inequality and is optimal by Proposition 11. Figure 2 shows the value functions peeling away from the obstacle—continuation regions shrinking as $c$ grows—and the monotone curve $c\mapsto p^{\ast}(c)$ . We emphasize that the monotonicity observed in the computed thresholds is not used as evidence for Corollary 8; it only illustrates the theorem, which is proved independently in Section 5.

$\lambda$	$\rho$	$c$	$p^{\ast}(c)$	$\Phi^{\ast}=p^{\ast}/(1-p^{\ast})$
0.05	1.0	0.5	$0.1735$	$0.2099$
0.05	1.0	1.0	$0.0705$	$0.0759$
0.05	1.0	2.0	$0.0303$	$0.0313$
0.05	1.0	5.0	$0.0109$	$0.0110$

Table 1: Monotone decrease of the Shiryaev alarm threshold as the delay cost rate $c$ increases, illustrating Corollary 8 ( $p^{\ast}(c_{1})\leq p^{\ast}(c_{2})$ when $c_{1}\geq c_{2}$ ). Values from the finite-difference solution of the variational inequality, stable under halving of the grid spacing.
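The two consistency checks mentioned in the text can be reproduced directly from the tabulated values. The short script below copies the entries of Table 1, recomputes the lower bound $\lambda/(c+\lambda)$ and the odds $p^{\ast}/(1-p^{\ast})$ , and prints them side by side; the helper names `stopping_lower_bound` and `odds` are introduced here for illustration only.

```python
# Values copied from Table 1 (lambda = 0.05, rho = 1.0).
lam = 0.05
table = [
    # (c, p_star, phi_star)
    (0.5, 0.1735, 0.2099),
    (1.0, 0.0705, 0.0759),
    (2.0, 0.0303, 0.0313),
    (5.0, 0.0109, 0.0110),
]

def stopping_lower_bound(c, lam):
    """Smallest p at which L_0 G + c p = c p - lam (1 - p) is nonnegative."""
    return lam / (c + lam)

def odds(p):
    """Likelihood-ratio form p / (1 - p) of a posterior threshold."""
    return p / (1.0 - p)

for c, p_star, phi_star in table:
    bound = stopping_lower_bound(c, lam)
    print(
        f"c = {c}: bound = {bound:.4f}, p* = {p_star:.4f}, "
        f"p*/(1-p*) = {odds(p_star):.4f}, tabulated = {phi_star:.4f}"
    )
```

The recomputed odds agree with the tabulated $\Phi^{\ast}$ up to the rounding of $p^{\ast}$ to four decimals, and each threshold lies above its lower bound.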

Interpretation.

The comparative-statics content is direct. A higher per-unit delay cost makes the observer less willing to wait, so the alarm is raised at a lower posterior probability of disorder: the threshold $p^{\ast}(c)$ , and with it the likelihood-ratio threshold $\Phi^{\ast}(c)$ , decreases in $c$ . By Theorem 6(ii) the associated alarm times are ordered pathwise, $\tau^{\ast}_{c_{1}}\leq\tau^{\ast}_{c_{2}}$ for $c_{1}\geq c_{2}$ , so a costlier delay yields uniformly earlier alarms; the price is a higher false-alarm probability, since stopping at a lower posterior is more often premature. The same qualitative picture persists for state-dependent $\vartheta$ , where $p^{\ast}$ is replaced by an observation-dependent boundary $b_{c}(x)$ ordered as in Theorem 6(iii).
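The pathwise ordering of alarm times can be seen on a single simulated trajectory. The sketch below is an illustration under assumptions: it discretizes (27) by an Euler–Maruyama step driven by a simulated standard Brownian motion in place of the innovation process $\bar{B}$ , clips the path to $[0,1]$ , starts from $\Pi_{0}=0$ , and uses an arbitrary step size, horizon, and seed. The thresholds are the $p^{\ast}(c)$ values of Table 1.

```python
import math
import random

def simulate_posterior(lam=0.05, rho=1.0, pi0=0.0, horizon=200.0, dt=0.01, seed=0):
    """Euler-Maruyama path of dPi = lam (1 - Pi) dt + rho Pi (1 - Pi) dB, clipped to [0, 1]."""
    rng = random.Random(seed)
    n = int(horizon / dt)
    path = [pi0]
    for _ in range(n):
        pi = path[-1]
        dB = rng.gauss(0.0, math.sqrt(dt))
        pi = pi + lam * (1.0 - pi) * dt + rho * pi * (1.0 - pi) * dB
        path.append(min(1.0, max(0.0, pi)))
    return path

def alarm_time(path, threshold, dt=0.01):
    """First grid time at which the path is at or above the threshold (inf if never)."""
    for k, pi in enumerate(path):
        if pi >= threshold:
            return k * dt
    return math.inf

# Thresholds p*(c) from Table 1, keyed by the delay cost rate c.
thresholds = {0.5: 0.1735, 1.0: 0.0705, 2.0: 0.0303, 5.0: 0.0109}
path = simulate_posterior()
alarms = {c: alarm_time(path, p_star) for c, p_star in thresholds.items()}
for c in sorted(alarms):
    print(f"c = {c}: alarm at t = {alarms[c]}")
```

On any one path the alarm for a larger $c$ can never come after the alarm for a smaller $c$ , because the path must cross the lower threshold before it can reach the higher one; the simulation only makes this ordering visible and is not evidence for the theorem.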

8 Relation to CUSUM and Shiryaev–Roberts Procedures

The present comparison result is Bayesian and optimal-stopping based. It is therefore closest to the Shiryaev procedure and its limiting Shiryaev–Roberts forms [shiryaev1963optimum, pollak1985optimal, pollak2009optimality], and it is not a minimax optimality statement for CUSUM [lorden1971procedures, moustakides1986optimal]. Nevertheless, the same state-dependence issue appears in the minimax formulations: when the likelihood increments $\,\mathrm{d}\log L_{t}=-\tfrac{1}{2}\vartheta^{2}(X_{t})\,\mathrm{d}t+\vartheta(X_{t})\,\mathrm{d}\bar{B}_{t}$ depend on the current diffusion state, the CUSUM and Shiryaev–Roberts statistics are not closed one-dimensional Markov processes unless the observation state $X_{t}$ is included, and exactly characterized rules then involve the joint process $(\,\cdot\,,X)$ . A comparison principle for the minimax thresholds analogous to Theorem 6 would require monotonicity of the worst-case detection delay in the penalty parameters and is left for future work.

9 Conclusion

We have organized sequential testing and Bayesian quickest detection for state-dependent diffusion observations around two facts. The first is a closed Markov-state reduction identifying $(\Pi,X)$ as the sufficient statistic whenever the signal-to-noise ratio is state dependent, with an explicit degenerate generator. The second, and the methodological core of the paper, is a delay-penalty comparison theorem: uniformly larger marginal delay costs raise the value, shrink the continuation region, order the stopping times pathwise, and—under a one-sided boundary representation—lower the alarm boundary. The same principle gives a sampling-cost comparison for sequential testing. A constant-SNR Shiryaev example, solved through the variational inequality, exhibits the predicted threshold monotonicity. The comparison is deliberately structural rather than constructive: it presumes a stopping formulation and, for the boundary statement, the one-sided structure secured by monotone signal-to-noise conditions [ernst2024gapeev], and it does not provide new explicit boundary solutions. Natural extensions include multi-source and multi-hypothesis detection, where the posterior lives on a simplex and the stopping regions are separated by hypersurfaces [dayanik2008multisource]; nonlinear penalties whose marginal statistic (25) requires genuine state augmentation; and minimax analogues of the comparison principle. In each case the comparison continues to apply whenever a common Markov state and terminal cost are available.
