License: CC BY 4.0
arXiv:2606.23882v1 [math.ST] 22 Jun 2026

Order restricted estimation of the parameter functions in an additive hazard model

Dragi Anevski (dragi@maths.lth.se), Center for Mathematical Sciences, Lund University
and
ElBatoul Manel Merai (manel-elbatoul.merai@doc.umc.edu.dz), Department of Mathematics,
Constantine 1 Brothers Mentouri University
Abstract

In this paper we propose estimators of the parameter functions in an Aalen additive hazard regression model. The estimators are the individual, componentwise $l^{2}$ projections, on the space of monotone functions, of the naive estimators resulting from the ordinary least squares estimator in the Aalen additive hazard model. We provide pointwise limit distribution results for the resulting estimators, which exhibit an $n^{-1/3}$ rate of convergence and have the Chernoff distribution as the limit distribution.

1 Introduction

In this paper we suggest estimators for the parameter functions in an Aalen additive hazard model, in a survival analysis setting, under the assumption of right-censored data and independent censoring.

Assuming that the time to the event of interest, $T$, is a continuous positive random variable with cumulative distribution function $F$ and hazard function $h(t)=F^{\prime}(t)/(1-F(t))$, a possible model for $h$ that incorporates covariates is the Aalen additive hazard model

\[
h(t) = \beta_{0}(t)+\beta_{1}(t)z_{1}+\ldots+\beta_{p}(t)z_{p}, \tag{1}
\]

where one supposes that $\beta_{0},\ldots,\beta_{p}$ are (unknown) functions, and $z_{1},\ldots,z_{p}$ are given covariates. If the parameter vector of functions $\beta=(\beta_{0},\ldots,\beta_{p})$ is completely unspecified, the common approach to estimating $\beta$ starts from the observation that it is not possible to provide a nonparametric estimator of it directly. Instead one estimates the vector of integrated functions $B=(B_{0},\ldots,B_{p})$, where $B_{k}(t)=\int_{0}^{t}\beta_{k}(u)\,du$ for each $k=0,\ldots,p$, cf. [1].

The disadvantage of providing estimators of $B$ instead of $\beta$ is that $B_{k}(t)$ gives the total effect of covariate $z_{k}$ summed (i.e. integrated) over all times $u\in[0,t]$, whereas $\beta_{k}(t)$ gives the effect of the covariate $z_{k}$ at the time $t$. Clearly $\beta_{k}(t)$ is more informative, and it is potentially more interesting, e.g. for the clinician who wants to describe the effect of covariate values (e.g. LDL cholesterol) on the conditional probability of experiencing the event of interest (e.g. heart attack) at time $t$, conditional on not having experienced it before time $t$, by the interpretation of the hazard as

\[
h(t)\,dt = P(T\leq t+dt \mid T>t).
\]

One possibility is to use kernel estimators to obtain an estimator of $\beta$ from the estimator of $B$. However, kernel estimators are somewhat ad hoc, and in particular necessitate the choice of a bandwidth.

In this paper we suggest an approach that provides an order-restricted nonparametric estimator of $\beta$. One advantage of this estimator is that it is data-adaptive, in that it uses an implicit bandwidth given by the data. We are furthermore able to provide limit distributions for the suggested estimator. The limit distribution is the Chernoff distribution, which commonly appears in order-restricted nonparametric inference.

There are some previous results on this problem. In [5] the basic estimator in the Aalen model is monotonised by a method that is different from ours, and the resulting estimator is shown to be asymptotically equivalent to the standard estimator. In [6] a slightly different additive hazard model is used, which does not seem to include the Aalen model; an order restricted least squares estimator is proposed, and mainly computational issues are treated.

To our knowledge, both our estimator and our limit distribution results are new.

The paper is organised as follows. In Section 2 we introduce the probabilistic model for the data, as well as the inference problem that we will treat. The model gives rise to a system of stochastic differential equations, and we review the common and well known least squares estimator $\hat{B}$ of the integrals $B=(B_{0},\ldots,B_{p})$ of the unknown parameter functions $\beta=(\beta_{0},\ldots,\beta_{p})$. The least squares solution will serve as a starting estimator for the order restricted estimator that we present next.

Then, in Section 3 we present the component-wise least squares projection of the naive estimator arising from the starting estimator presented in Section 2, on the space of decreasing functions. These projections can be written as the derivative of the function $S(\hat{B}_{k})$, where $S$ is the least concave majorant map.

Next, in Section 4, we derive the main results of this paper, which are the limit distributions of the estimators. We start by writing $\hat{B}$ as a sum of the unknown $B$ and a stochastic process $v_{n}$. We furthermore rescale and localise the estimator $\hat{B}$, which gives rise to a rescaled deterministic term $g_{n}$ and a rescaled stochastic term $\tilde{v}_{n}$. In Theorem 1 we derive the process limit distribution of the $(p+1)$-dimensional rescaled process $\tilde{v}_{n}$, which converges to a Gaussian stochastic process $\tilde{v}$ with a certain covariance structure. In Corollary 1 we state the resulting component-wise limit distributions for the individual processes $\tilde{v}_{k,n}$, for every $k=0,\ldots,p$.

Next, in Lemma 1 we prove a bound on the tail of the process $\tilde{v}_{k,n}$ which ensures that, when applying the least concave majorant map $S$ to the process $g_{k,n}+\tilde{v}_{k,n}$, the tail behaviour of that process will not affect the application of the map $S$ around the origin. In Lemma 2 we state the analogous bound for the tail of the limit process $\tilde{v}_{k}$, which ensures the same thing for the limit process $-s^{2}+\tilde{v}(s)$, where in fact $-s^{2}$ is proportional to the uniform limit of $g_{k,n}(s)$.

Then in Theorem 2 we state one of the two main results of this paper, namely that the integral $\tilde{B}_{k}$ of the proposed estimators converges to a limit random variable, as

\[
n^{2/3}c(t_{0})\big(\tilde{B}_{k}(t_{0})-B_{k}(t_{0})\big) \stackrel{d}{\to} S(-s^{2}+w(s))(0),
\]

with $w(s)$ a two-sided Brownian motion. The constant $c(t_{0})$ is specified in Theorem 2.

Next, in Theorem 3, we state the second main result of the paper, namely that the proposed estimator $\tilde{\beta}_{k}$ converges to a limit random variable, as

\[
n^{1/3}c(t_{0})\big(\tilde{\beta}_{k}(t_{0})-\beta_{k}(t_{0})\big) \stackrel{d}{\to} S(-s^{2}+w(s))^{\prime}(0).
\]

We note that the rate is $n^{1/3}$ and that the limit distribution $S(-s^{2}+w(s))^{\prime}(0)$ is (proportional to) the Chernoff distribution, both of which are common in nonparametric order restricted inference.
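To give a feeling for the limit variable $S(-s^{2}+w(s))^{\prime}(0)$, the following minimal Python sketch simulates it on a grid. The truncation level `c`, the grid size `m`, and the function names are illustrative assumptions that do not come from the paper, and the discretised, truncated majorant only approximates the functional defined on the whole real line.

```python
import numpy as np

def upper_hull(x, y):
    """Indices of the vertices of the least concave majorant of (x_j, y_j), x increasing."""
    hull = []
    for j in range(len(x)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # Vertex b is dropped if it lies on or below the chord from a to j.
            if (x[b] - x[a]) * (y[j] - y[a]) - (y[b] - y[a]) * (x[j] - x[a]) >= 0:
                hull.pop()
            else:
                break
        hull.append(j)
    return np.array(hull)

def lcm_slope_at_zero(rng, c=3.0, m=1000):
    """One approximate draw of the left-hand slope at 0 of S(-s^2 + w(s)) on [-c, c]."""
    h = c / m
    s = np.linspace(-c, c, 2 * m + 1)                   # s[m] is the origin
    right = np.cumsum(rng.normal(0.0, np.sqrt(h), m))   # w on (0, c]
    left = np.cumsum(rng.normal(0.0, np.sqrt(h), m))    # w on [-c, 0), built outwards from 0
    w = np.concatenate([left[::-1], [0.0], right])      # two-sided Brownian motion, w(0) = 0
    y = -s**2 + w
    hull = upper_hull(s, y)
    b = np.searchsorted(hull, m)                        # first hull vertex at or right of 0
    lo, hi = hull[b - 1], hull[b]
    return (y[hi] - y[lo]) / (s[hi] - s[lo])

rng = np.random.default_rng(0)
draws = np.array([lcm_slope_at_zero(rng) for _ in range(200)])
print(draws.mean(), draws.std())
```

A histogram of `draws` gives a rough picture of the shape of the limit distribution; the accuracy depends on the assumed values of `c` and `m`.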

Finally in Section 5 we discuss the derived results.

2 The survival analysis model setting

Let $T\geq 0$ be a positive continuous random variable with an unknown distribution function $F$. We assume that $T$ models the time to an event. We assume no left-truncation of the data and standard right-censoring, i.e. that we observe the minimum $t_{i}=\min(T_{i},C_{i})$ of the time $T_{i}$ and a censoring time $C_{i}$, together with an indicator for the time being exact, $\delta_{i}=1\{T_{i}\leq C_{i}\}$.

Introduce the individual counting processes $N_{i}(t)=1\{t_{i}\leq t,\delta_{i}=1\}$, for which one has the stochastic differential equation

\[
dN_{i}(t) = Y_{i}(t)h(t)\,dt+dM_{i}(t), \tag{2}
\]

for $i=1,\ldots,n$, where $h(t)$ is the individual hazard function, which exists when the distribution of $T$ is absolutely continuous and is then given by $h(t)=-\frac{d}{dt}\log(1-F(t))$, where $Y_{i}(t)=1\{t_{i}\geq t\}$ is the left-continuous indicator process for the individual being at risk at time $t-$, and where $M_{i}$ is the individual martingale, i.e. it satisfies $E(dM_{i}(t)|{\cal F}_{t})=0$. Here the family of $\sigma$-algebras $\{{\cal F}_{t},t\geq 0\}$ is a filtration storing the information available at times $t$.

The $\sigma$-algebra generated by the information depends on the amount and type of information that is available about the observed times and about the covariates. Starting with the model (2) satisfying $E(dM_{i}(t)|{\cal F}_{t})=0$, one needs to establish that $E(dM_{i}(t)|{\cal G}_{t})=0$, where ${\cal G}_{t}$ is the $\sigma$-algebra generated by the observed and available information at time $t$. The concepts of noninformative and independent censoring, as well as the innovation theorem, are used to establish this link. We do not make these assumptions explicit here, since they are of no relevance to us, and refer the reader to a standard reference such as [1]. In the sequel we will assume that ${\cal F}_{t}$ is a filtration containing the information available at time $t$, and that $E(dM_{i}(t)|{\cal F}_{t})=0$ holds, so that $E(dN_{i}(t)|{\cal F}_{t})=Y_{i}(t)h(t)dt$.

A basic inference problem in survival analysis is assessing the effect of group indicators or continuous covariate measurements on the distribution of the time to an event. It is then necessary to assume a model for the distribution function that depends on covariates $z_{1},\ldots,z_{p}$. This can be done using various models. One standard model is the Aalen additive hazard model

\[
h(t) = \beta_{0}(t)+\beta_{1}(t)z_{1}+\ldots+\beta_{p}(t)z_{p}, \tag{3}
\]

where $\beta_{0},\ldots,\beta_{p}$ are (unknown) functions. Thus $\beta=(\beta_{0},\ldots,\beta_{p})$ is the unknown parameter vector of functions and $E^{p+1}$ is the parameter space, where $E=\{g:[0,\infty)\to{\mathbb R},\ \int g<\infty\}$ is the set of integrable functions on $[0,\infty)$.

The value of $\beta_{k}(t)$ describes the instantaneous effect of covariate $z_{k}$ on the hazard function $h(t)$. The standard approach for estimating the $\beta_{k}$'s is to first acknowledge that they cannot be estimated directly. Rather one estimates their integrals $B_{k}(t)=\int_{0}^{t}\beta_{k}(u)\,du$. In fact, one can write (2) for the Aalen model as

\[
dN_{i}(t) = Y_{i}(t)\left\{dB_{0}(t)+dB_{1}(t)z_{1i}+\ldots+dB_{p}(t)z_{pi}\right\}+dM_{i}(t), \tag{4}
\]

for $i=1,\ldots,n$, where $Y_{i}(t)$ is the individual at-risk process and $M_{i}$ is a continuous time martingale for individual $i$. The $n$ equations (4) can be written in matrix form as

\[
dN(t) = Y(t)\,dB(t)+dM(t),
\]

where $B(t)=(B_{0}(t),\ldots,B_{p}(t))^{T}$ is the vector of unknown functions, and $Y(t)$ is an $n\times(p+1)$ matrix whose $i$'th row is $Y_{i}(t)(1,z_{1i},\ldots,z_{pi})$.

If $J(t)$ is the (predictable) indicator that $Y(t)$ has full rank, and

\[
Y^{-}(t) = (Y(t)^{T}Y(t))^{-1}Y(t)^{T}
\]

is a (generalised) inverse, then $B$ can be estimated by Aalen's ordinary least squares solution

\[
\hat{B}(t) = \int_{0}^{t}J(u)Y^{-}(u)\,dN(u). \tag{5}
\]
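As a hedged illustration of (5), the sketch below computes $\hat{B}$ at the ordered event times. The function name `aalen_ols`, its arguments, and the restriction to time-fixed covariates are assumptions made for this example and are not part of the paper; the pseudo-inverse coincides with $Y^{-}(t)$ whenever $Y(t)$ has full column rank.

```python
import numpy as np

def aalen_ols(times, events, Z):
    """Sketch of Aalen's least squares estimator B_hat at the ordered event times.

    times  : (n,) observed times t_i = min(T_i, C_i)
    events : (n,) indicators delta_i (1 = exact, 0 = censored)
    Z      : (n, p) covariates, assumed time-fixed in this sketch
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    Z = np.asarray(Z, dtype=float)
    n, p = Z.shape
    X = np.column_stack([np.ones(n), Z])           # rows (1, z_1i, ..., z_pi)
    event_times = np.unique(times[events == 1])    # sorted distinct event times
    B = np.zeros(p + 1)
    path = []
    for t in event_times:
        at_risk = (times >= t).astype(float)       # Y_i(t) = 1{t_i >= t}
        Y = X * at_risk[:, None]                   # design matrix Y(t)
        if np.linalg.matrix_rank(Y) == p + 1:      # J(t) = 1
            dN = ((times == t) & (events == 1)).astype(float)
            B = B + np.linalg.pinv(Y) @ dN         # increment Y^-(t) dN(t)
        path.append(B.copy())
    return event_times, np.array(path)

# Toy data: four individuals, one binary covariate, last observation censored.
event_times, B_hat = aalen_ols([1, 2, 3, 4], [1, 1, 1, 0], [[0], [1], [0], [1]])
print(event_times)
print(B_hat)
```

With a single binary covariate the model is saturated, so in this toy example the first column of `B_hat` is the Nelson-Aalen estimate in the group with covariate value 0, which gives a simple sanity check.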

The interpretation of a value of the integral $B_{k}(t)$ is less intuitive than the interpretation of the value of $\beta_{k}(t)$, since $\beta_{k}(t)$ is the instantaneous effect, at time $t$, of the covariate $z_{k}$ on the total hazard $h(t)$ at time $t$. Thus one would really like to get an estimate of $\beta_{k}$, and this is not possible to do directly. One possibility for estimation of $\beta_{k}$ itself would be kernel smoothing, with the drawback that this is a slightly ad hoc method, with a bandwidth that the user has to specify. Thus it is neither automated nor data adaptive.

An alternative for estimation of $\beta_{k}$, which does not necessitate bandwidth choices and is data adaptive, is to use estimation under some nonparametric restrictions. In this paper we suggest estimating $B$ under the assumption that each $\beta_{k}$ is a nonincreasing function, i.e. a function that is (not necessarily strictly) decreasing. This is a nonparametric order restricted inference problem.

3 The order restricted estimator

We define the order restricted estimators of the $\beta_{k}$'s as the least squares projection of the increments of the components of the Aalen estimator $\hat{B}$ on the space of monotone functions. Thus let $\hat{B}_{k}$ be the $k$'th component of $\hat{B}$, for $k=0,\ldots,p$, and suppose that $\hat{B}_{k}$ (which is a step function) has $L_{k}$ incremental steps, at points $t_{1},\ldots,t_{L_{k}}$. Thus $(\Delta_{1}\hat{B}_{k},\ldots,\Delta_{L_{k}}\hat{B}_{k})$ is the vector of increments, where $\Delta_{j}\hat{B}_{k}=\hat{B}_{k}(t_{j})-\hat{B}_{k}(t_{j-1})$. Then we may introduce the naive vector of estimates $\hat{\beta}^{(k)}=(\hat{\beta}_{1}^{(k)},\ldots,\hat{\beta}_{L_{k}}^{(k)})$, where $\hat{\beta}_{j}^{(k)}={\Delta_{j}\hat{B}_{k}}/{\Delta_{j}t}$ with $\Delta_{j}t=t_{j}-t_{j-1}$, and with the convention $t_{0}=0$.

For any positive integer $L$, let ${\cal R}_{L}=\{\gamma\in{\mathbb R}^{L}:\gamma_{1}\geq\ldots\geq\gamma_{L}\}$ be the set of real vectors that have non-increasing coordinates. We then define the isotonic regression $\tilde{\beta}^{(k)}$ of $\hat{\beta}^{(k)}$ as

\[
\tilde{\beta}^{(k)} = \mathrm{argmin}_{\gamma\in{\cal R}_{L_{k}}}\sum_{i=1}^{L_{k}}(\hat{\beta}_{i}^{(k)}-\gamma_{i})^{2}. \tag{6}
\]

Finally we define the order restricted estimator $\tilde{\beta}_{k}$ as the constant interpolation of the vector $\tilde{\beta}^{(k)}$,

\[
\tilde{\beta}_{k}(s) = \sum_{t_{i}\leq s}\tilde{\beta}^{(k)}_{i}\Delta_{i}t+\tilde{\beta}^{(k)}_{j}(s-t_{j}),
\]

where $j=\sup\{i:t_{i}\leq s\}$ is the index of the largest $t_{i}\leq s$. Standard theory for isotonic regression shows that the vector $\tilde{\beta}^{(k)}$, and therefore $\tilde{\beta}_{k}(s)$, exists. Furthermore, a geometric characterisation of the solution $\tilde{\beta}_{k}(s)$ is given by

\[
\tilde{\beta}_{k}(t) = \frac{d}{dt}\big(S(\hat{B}_{k}(t))\big), \tag{7}
\]

where $S$ is the least concave majorant map and $d/dt$ denotes the left hand derivative. We note also that the corresponding cumulative function

\[
\tilde{B}_{k}(t) = S(\hat{B}_{k}(t)) \tag{8}
\]

is an order restricted estimator of the cumulative function $B_{k}$, and that it is concave.
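The geometric characterisation in (7) and (8) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the names `lcm_vertices` and `order_restricted` are hypothetical, and the majorant is taken over the points $(t_{j},\hat{B}_{k}(t_{j}))$ together with $(t_{0},0)$, i.e. over the piecewise linear interpolant of $\hat{B}_{k}$ at its jump points.

```python
import numpy as np

def lcm_vertices(t, b):
    """Indices of the vertices of the least concave majorant of (t_j, b_j), t increasing."""
    hull = []
    for j in range(len(t)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            # Drop i1 if it lies on or below the chord from i0 to the new point j.
            if (t[i1] - t[i0]) * (b[j] - b[i0]) - (b[i1] - b[i0]) * (t[j] - t[i0]) >= 0:
                hull.pop()
            else:
                break
        hull.append(j)
    return hull

def order_restricted(t_jumps, B_hat):
    """Return (B_tilde at t_1..t_L, beta_tilde on the intervals (t_{j-1}, t_j])."""
    t = np.concatenate([[0.0], np.asarray(t_jumps, dtype=float)])   # convention t_0 = 0
    b = np.concatenate([[0.0], np.asarray(B_hat, dtype=float)])     # B_hat(t_0) = 0
    hull = lcm_vertices(t, b)
    B_tilde = np.interp(t, t[hull], b[hull])      # majorant evaluated at t_0, ..., t_L
    beta_tilde = np.diff(B_tilde) / np.diff(t)    # left-hand slopes, nonincreasing
    return B_tilde[1:], beta_tilde

# Naive slopes here are 1, 0.5, 2, 0.5; the first three are pooled by the majorant.
B_tilde, beta_tilde = order_restricted([1.0, 2.0, 3.0, 4.0], [1.0, 1.5, 3.5, 4.0])
print(B_tilde)
print(beta_tilde)
```

The stack-based scan visits each point once, so the cost is linear in the number of jump points $L_{k}$.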

4 The limit distribution results for the estimators

We first note that, since $dN(u)=Y(u)\,dB(u)+dM(u)$ and $Y^{-}(u)Y(u)$ is the identity matrix when $J(u)=1$, the $(p+1)$-dimensional vector of estimators $\hat{B}$ in (5) can be written as

\[
\hat{B}(t) = \int_{0}^{t}J(u)Y^{-}(u)\,dN(u) = \int_{0}^{t}J(u)\beta(u)\,du+\int_{0}^{t}J(u)Y^{-}(u)\,dM(u),
\]

where $J(t)$ is the (predictable) indicator that $Y(t)$ has full rank, where

\[
Y^{-}(t) = (Y(t)^{T}Y(t))^{-1}Y(t)^{T}
\]

is a (generalised) inverse, and where $M(t)$ is an $n$-vector of locally square-integrable martingales.

We center the estimator $\hat{B}$ at $B$, and define the process part of the estimator as

\[
v_{n}(t) = \hat{B}(t)-B(t) = \int_{0}^{t}J(u)Y^{-}(u)\,dM(u)+\int_{0}^{t}(J(u)-1)\beta(u)\,du,
\]

to get

\[
\hat{B}(t) = B(t)+v_{n}(t),
\]

and note that this is a slight adaptation of the partition/centering used in [4]. In fact, we have written the preliminary estimator $\hat{B}(t)$ as a sum of a deterministic part $B(t)$ and a stochastic part $v_{n}(t)$. The final order restricted estimator is obtained as a coordinate-wise isotonic regression of the increments of the preliminary estimator $\hat{B}(t)$, as defined in (6).

Therefore the local rescaling should be defined coordinate-wise, and it is in fact enough to study the coordinate-wise partition

\[
\hat{B}_{k}(t) = B_{k}(t)+v_{k,n}(t)
\]

and the component-wise rescaling

\[
\tilde{v}_{k,n}(s) = d_{n}^{-2}\big(v_{k,n}(t_{0}+sd_{n})-v_{k,n}(t_{0})\big),
\]

and to establish limit properties for the rescaled $\tilde{v}_{k,n}$, for the results that we will develop here. In particular we want to establish local limit distribution results as well as certain truncation properties for the rescaled process $\tilde{v}_{k,n}$. We are however able to establish the limit distributions for the full vector valued rescaled process, and since that result may be of independent interest, we will state it. The coordinate-wise property will then be a corollary of that result.

We will make frequent reference to the Cramér-Wold device for processes: that a $d$-dimensional stochastic process $W_{n}$ converges in distribution to a $d$-dimensional Gaussian process $W$ is equivalent to the weak convergence of the one-dimensional process $\alpha_{1}W_{1,n}+\ldots+\alpha_{d}W_{d,n}$ to $\alpha_{1}W_{1}+\ldots+\alpha_{d}W_{d}$, for every choice of $\alpha_{1},\ldots,\alpha_{d}$.

In fact we are going to adapt to our setting the proof of Theorem VII of [1], which states that the $(p+1)$-dimensional process $v_{n}$ converges to a Gaussian process, say $v$, with a certain covariance structure. This implies, by the Cramér-Wold device and since a Gaussian process is determined by its expectation and covariance function, that the $k$'th coordinate $v_{k,n}$ will converge to a Gaussian process $v_{k}$ with a covariance structure determined from the covariance structure of the full process $v$. One could therefore rescale the $k$'th coordinate process $v_{k,n}$ and establish limit distributions for that process. As already mentioned, we will instead rescale the full process and invoke the Cramér-Wold device subsequently.

Thus let us define the full rescaled process part

\[
\tilde{v}_{n}(s)=d_{n}^{-2}\big(v_{n}(t_{0}+sd_{n})-v_{n}(t_{0})\big).
\]

For the local limit distribution results we will adapt the proof of Theorem VII.4.1 of [1] to our setting, and we will establish the limit distribution result under the same assumptions as those in Theorem VII.4.1. Thus we define, for $j,k,l=0,1,\ldots,p$, the functions

\[
\begin{aligned}
R^{(1)}_{j}(t) &= \sum_{i=1}^{n}Y_{i}(t)Z_{ij}(t),\\
R^{(2)}_{jk}(t) &= \sum_{i=1}^{n}Y_{ij}(t)Y_{ik}(t),\\
R^{(3)}_{jkl}(t) &= \sum_{i=1}^{n}Y_{ij}(t)Y_{ik}(t)Y_{il}(t).
\end{aligned}
\]

Here $Y_{ij}(t)=Y_{i}(t)Z_{ij}(t)$ denotes the $(i,j)$ entry of the matrix $Y(t)$. Let $0<s^{\prime}<\infty$ be arbitrary, so that $[0,s^{\prime}]$ is an arbitrary compact interval.

Assumption 1

For all $j,k,l=0,1,\ldots,p$, there exist continuous functions $r^{(1)}_{j},r^{(2)}_{jk},r^{(3)}_{jkl}$ such that, as $n\to\infty$,

\[
\begin{aligned}
\sup_{s\in[0,s^{\prime}]}\left|\frac{1}{n}R^{(1)}_{j}(s)-r^{(1)}_{j}(s)\right| &\xrightarrow{P}0,\\
\sup_{s\in[0,s^{\prime}]}\left|\frac{1}{n}R^{(2)}_{jk}(s)-r^{(2)}_{jk}(s)\right| &\xrightarrow{P}0,\\
\sup_{s\in[0,s^{\prime}]}\left|\frac{1}{n}R^{(3)}_{jkl}(s)-r^{(3)}_{jkl}(s)\right| &\xrightarrow{P}0.
\end{aligned}
\]

Assumption 2

For all $j=0,1,\ldots,p$,

\[
\frac{1}{\sqrt{n}}\sup_{i=1,\dots,n}\sup_{s\in[0,s^{\prime}]}|Y_{ij}(s)|\xrightarrow{P}0.
\]

Assumption 3

For all $s\in[0,s^{\prime}]$, the matrix $r^{(2)}(s)=\big(r^{(2)}_{jk}(s)\big)$ is nonsingular.

Theorem 1

Suppose that Assumptions 1-3 hold. Then

\[
\tilde{v}_{n}(s) \stackrel{d}{\to} \tilde{v}(s)
\]

on $D^{p+1}(-c,c)$, as $n\to\infty$, where $\tilde{v}$ is a mean zero Gaussian process with covariance structure

\[
\mathrm{Cov}(\tilde{v}_{j}(s^{\prime}),\tilde{v}_{k}(s^{\prime\prime})) = \sigma_{j,k}\,\min(s^{\prime},s^{\prime\prime}),
\]

where

\[
\sigma_{j,k} = \sum_{g,l,m=0}^{p}(r^{(2)}(t_{0}))^{-1}_{jl}(r^{(2)}(t_{0}))^{-1}_{km}r^{(3)}_{lmg}(t_{0})\beta_{g}(t_{0}).
\]
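For readers who want to evaluate the constants $\sigma_{j,k}$ numerically, the triple sum is a tensor contraction. The sketch below is only an illustration: the function name `sigma_matrix` and the input arrays `r2`, `r3`, `beta` (standing for $r^{(2)}(t_{0})$, $r^{(3)}(t_{0})$ and $\beta(t_{0})$) are assumptions, and the numbers in the example are made up.

```python
import numpy as np

def sigma_matrix(r2, r3, beta):
    """sigma[j, k] = sum_{g,l,m} (r2^{-1})[j, l] (r2^{-1})[k, m] r3[l, m, g] beta[g]."""
    r2_inv = np.linalg.inv(np.asarray(r2, dtype=float))
    return np.einsum("jl,km,lmg,g->jk", r2_inv, r2_inv,
                     np.asarray(r3, dtype=float), np.asarray(beta, dtype=float))

# Scalar case p = 0 with made-up values: (1/2) * (1/2) * 3 * 4.
sigma_scalar = sigma_matrix([[2.0]], [[[3.0]]], [4.0])

# Case p = 1 with r2 equal to the identity, so sigma[j, k] = sum_g r3[j, k, g] beta[g].
sigma_id = sigma_matrix(np.eye(2), np.arange(8.0).reshape(2, 2, 2), [1.0, 2.0])
print(sigma_scalar, sigma_id)
```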

Proof. Defining the matrix

\[
R^{(2)}(t) := \sum_{i=1}^{n}Y_{i}(t)Y_{i}^{T}(t) = Y(t)^{T}Y(t),
\]

the second statement of Assumption 1 implies

\[
\sup_{s\in[0,s^{\prime}]}\left\|\frac{1}{n}R^{(2)}(s)-r^{(2)}(s)\right\| \stackrel{P}{\to} 0,
\]

where $r^{(2)}$ is defined in Assumption 3, and with $\|\cdot\|$ denoting the Euclidean (matrix) norm on ${\mathbb R}^{p+1}\times{\mathbb R}^{p+1}$. By Assumption 3 the matrix $r^{(2)}$ is invertible, and the inverse $(\frac{1}{n}R^{(2)}(s))^{-1}$ is well defined when $J(s)=1$, and for those $s$ it converges to $(r^{(2)}(s))^{-1}$ by the continuous mapping theorem, since matrix inversion is a continuous map (under the supnorm matrix metric). Thus, since $Y^{-}(t)=(Y(t)^{T}Y(t))^{-1}Y(t)^{T}=\frac{1}{n}\big(\tfrac{1}{n}R^{(2)}(t)\big)^{-1}Y(t)^{T}$, we may partition the process part as

\[
\begin{aligned}
v_{n}(t) ={}& \frac{1}{n}\int_{0}^{t}J(u)\Big[\big(\tfrac{1}{n}R^{(2)}(u)\big)^{-1}-\big(r^{(2)}(u)\big)^{-1}\Big]Y^{T}(u)\,dM(u)\\
&+\frac{1}{n}\int_{0}^{t}J(u)\big(r^{(2)}(u)\big)^{-1}Y^{T}(u)\,dM(u)+\int_{0}^{t}(J(u)-1)\beta(u)\,du.
\end{aligned}
\tag{9}
\]

Recall the definition of the rescaled process

\[
\tilde{v}_{n}(s)=d_{n}^{-2}\big(v_{n}(t_{0}+sd_{n})-v_{n}(t_{0})\big),
\]

for $s\in[-c,c]$, and note that it entails that $\tilde{v}_{n}(s)$ is the sum of the three integrals in (9), with the integrals going from $t_{0}$ to $t_{0}+sd_{n}$, and all multiplied by $d_{n}^{-2}$. We may now use a change of variables inside the integrals: for $u\in[t_{0},t_{0}+sd_{n}]$ and $s$ fixed we write $u=t_{0}+s^{\prime}d_{n}$ and let $s^{\prime}\in[0,s]$ vary, so that $du=d_{n}ds^{\prime}$, and we obtain

\[
\begin{aligned}
\tilde{v}_{n}(s) ={}& \frac{d_{n}^{-2}}{\sqrt{n}}\frac{1}{\sqrt{n}}\int_{0}^{s}J(t_{0}+s^{\prime}d_{n})\Big(\big(\tfrac{1}{n}R^{(2)}(t_{0}+s^{\prime}d_{n})\big)^{-1}-\big(r^{(2)}(t_{0}+s^{\prime}d_{n})\big)^{-1}\Big)\\
&\cdot Y^{T}(t_{0}+s^{\prime}d_{n})\,dM(t_{0}+s^{\prime}d_{n})\\
&+\frac{d_{n}^{-2}}{\sqrt{n}}\left[\frac{1}{\sqrt{n}}\int_{0}^{s}J(t_{0}+s^{\prime}d_{n})(r^{(2)}(t_{0}+s^{\prime}d_{n}))^{-1}Y^{T}(t_{0}+s^{\prime}d_{n})\,dM(t_{0}+s^{\prime}d_{n})\right]\\
&+\frac{d_{n}^{-2}}{\sqrt{n}}\left[\sqrt{n}\int_{0}^{s}(J(t_{0}+s^{\prime}d_{n})-1)\beta(t_{0}+s^{\prime}d_{n})\,d_{n}\,ds^{\prime}\right]\\
=:{}& \tilde{v}_{n}^{(1)}(s)+\tilde{v}_{n}^{(2)}(s)+\tilde{v}_{n}^{(3)}(s).
\end{aligned}
\]

We will now treat the three terms $\tilde{v}_{n}^{(1)},\tilde{v}_{n}^{(2)},\tilde{v}_{n}^{(3)}$ above separately, and show that the first and last vanish asymptotically, while the second, $\tilde{v}_{n}^{(2)}$, gives rise to the asymptotic distribution.

(i): $\tilde{v}_{n}^{(1)}$ vanishes asymptotically.

If we write the $j$'th component of the first term $\tilde{v}_{n}^{(1)}$ as $\tilde{v}_{n,j}^{(1)}$, we get

\[
\begin{aligned}
\tilde{v}_{n,j}^{(1)}(s^{\prime}) ={}& \frac{d_{n}^{-2}}{\sqrt{n}}\frac{1}{\sqrt{n}}\int_{0}^{s^{\prime}}J(t_{0}+sd_{n})\sum_{i=1}^{n}\sum_{l=0}^{p}\Big[\big(\tfrac{1}{n}R^{(2)}(t_{0}+sd_{n})\big)^{-1}\\
&-\big(r^{(2)}(t_{0}+sd_{n})\big)^{-1}\Big]_{jl}Y_{il}(t_{0}+sd_{n})\,dM_{i}(t_{0}+sd_{n}),
\end{aligned}
\]

for arbitrary but fixed $j=0,\ldots,p$. Therefore the predictable variation process becomes

\[
\begin{aligned}
\langle\tilde{v}_{n,j}^{(1)}\rangle(s^{\prime}) ={}& \frac{d_{n}^{-4}}{n}\frac{1}{n}\int_{0}^{s^{\prime}}J(t_{0}+sd_{n})\sum_{i=1}^{n}\Bigg\{\sum_{l=0}^{p}\Big(\big(\tfrac{1}{n}R^{(2)}(t_{0}+sd_{n})\big)^{-1}\\
&-\big(r^{(2)}(t_{0}+sd_{n})\big)^{-1}\Big)_{jl}\cdot Y_{il}(t_{0}+sd_{n})\Bigg\}^{2}d\langle M_{i}\rangle(t_{0}+sd_{n})\\
={}& \frac{d_{n}^{-4}}{n}\frac{1}{n}\int_{0}^{s^{\prime}}J(t_{0}+sd_{n})\sum_{i=1}^{n}\Bigg\{\sum_{l=0}^{p}\Big(\big(\tfrac{1}{n}R^{(2)}(t_{0}+sd_{n})\big)^{-1}\\
&-\big(r^{(2)}(t_{0}+sd_{n})\big)^{-1}\Big)_{jl}\cdot Y_{il}(t_{0}+sd_{n})\Bigg\}^{2}d_{n}\lambda_{i}(t_{0}+sd_{n})\,ds,
\end{aligned}
\]

since $d\langle M_{i}\rangle(t_{0}+sd_{n})=d_{n}\lambda_{i}(t_{0}+sd_{n})\,ds$, and since $d\langle M_{i},M_{j}\rangle=0$ for $i\neq j$. From the above we see that the local rescaling rate $d_{n}=n^{-\alpha}$ is determined by the condition

\[
\frac{d_{n}^{-4}}{n}\cdot d_{n} = \frac{d_{n}^{-3}}{n}=1
\]

and thus we must have $d_{n}=n^{-1/3}$.
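As a short worked step, the exponent can be read off by substituting $d_{n}=n^{-\alpha}$ into the balance condition; nothing beyond the condition stated above is assumed.

```latex
\[
\frac{d_{n}^{-4}}{n}\cdot d_{n}
  = n^{4\alpha}\cdot n^{-1}\cdot n^{-\alpha}
  = n^{3\alpha-1}
  = 1
  \quad\Longleftrightarrow\quad
  \alpha=\tfrac{1}{3}.
\]
```

With this choice the normalisations are $d_{n}^{-1}=n^{1/3}$ and $d_{n}^{-2}=n^{2/3}$, which are the rates appearing in the limit results for $\tilde{\beta}_{k}$ and $\tilde{B}_{k}$ respectively.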

By Assumption 1 we have that

\[
\sup_{s\in[0,s^{\prime}]}\left\|\frac{1}{n}R^{(2)}(t_{0}+sd_{n})-r^{(2)}(t_{0}+sd_{n})\right\| \stackrel{P}{\rightarrow} 0
\]

and by Assumption 3, $r^{(2)}(t_{0})$ is nonsingular. Then, since on the set of points $t$ where $J(t)=1$ the inverse $R^{(2)}(t)^{-1}$ exists, and since the matrix inverse map is a continuous map (under the supnorm metric), the continuous mapping theorem gives, for $j$ fixed and for every $l=0,\ldots,p$,

\[
\sup_{s\in[0,s^{\prime}]}\left|J(t_{0}+sd_{n})\Big((\tfrac{1}{n}R^{(2)}(t_{0}+sd_{n}))^{-1}-(r^{(2)}(t_{0}+sd_{n}))^{-1}\Big)_{jl}\right| \stackrel{P}{\rightarrow} 0.
\]

Therefore, there are random variables $C_{0}^{(n)},\ldots,C_{p}^{(n)}$ such that

\[
\begin{aligned}
&\sup_{s\in[0,s^{\prime}]}\left|J(t_{0}+sd_{n})\sum_{l=0}^{p}\Big((\tfrac{1}{n}R^{(2)}(t_{0}+sd_{n}))^{-1}-(r^{(2)}(t_{0}+sd_{n}))^{-1}\Big)_{jl}Y_{il}(t_{0}+sd_{n})\right|\\
&\qquad\leq \sum_{l=0}^{p}C_{l}^{(n)}Y_{il}(t_{0}+sd_{n}),
\end{aligned}
\]

and such that $C_{l}^{(n)}=o_{P}(1)$, for all $l=0,\ldots,p$.

Thus, for $d_{n}=n^{-1/3}$,

\[
\langle\tilde{v}_{n,j}^{(1)}\rangle(s^{\prime}) \leq \int_{0}^{s^{\prime}}J(t_{0}+sd_{n})\sum_{l,k=0}^{p}C_{l}^{(n)}C_{k}^{(n)}\frac{1}{n}\sum_{i=1}^{n}Y_{il}(t_{0}+sd_{n})Y_{ik}(t_{0}+sd_{n})\lambda_{i}(t_{0}+sd_{n})\,ds.
\]

From the second line of Assumption 1, we have

\[
P\left(\sup_{s\in[0,s^{\prime}]}\Big|\frac{1}{n}\sum_{i=1}^{n}Y_{il}(t_{0}+sd_{n})Y_{ik}(t_{0}+sd_{n})-r^{(2)}_{kl}(t_{0}+sd_{n})\Big|>\varepsilon\right)\to 0,
\]

and thus

\[
\langle\tilde{v}_{n,j}^{(1)}\rangle(s^{\prime})\xrightarrow{P}0.
\]

Finally, by Lenglart's inequality, for all $\varepsilon,\eta>0$,

\[
P\left(\sup_{s\in[0,s^{\prime}]}|\tilde{v}_{n,j}^{(1)}(s)|>\varepsilon\right) \leq \frac{\eta}{\varepsilon^{2}}+P\big(\langle\tilde{v}_{n,j}^{(1)}\rangle(s^{\prime})>\eta\big),
\]

which shows that $\tilde{v}_{n,j}^{(1)}$ converges in probability to zero, uniformly on $[0,s^{\prime}]$, as claimed.

(ii): $\tilde{v}_{n}^{(2)}$ gives the limiting behaviour.

We derive the asymptotic normality of $\tilde{v}_{n}^{(2)}$ by first establishing the limit in probability of the predictable covariation processes of $\tilde{v}_{n}^{(2)}$ and then by checking the Lindeberg condition for $\tilde{v}_{n}^{(2)}$.

Let $\tilde{v}_{n,j}^{(2)}$ denote the $j$'th component of $\tilde{v}_{n}^{(2)}$, for $j=0,\ldots,p$. Thus

\[
\tilde{v}_{n,j}^{(2)}(s^{\prime}) = \frac{d_{n}^{-2}}{\sqrt{n}}\frac{1}{\sqrt{n}}\int_{0}^{s^{\prime}}\sum_{i=1}^{n}\sum_{l=0}^{p}(r^{(2)}(t_{0}+sd_{n}))^{-1}_{jl}Y_{il}(t_{0}+sd_{n})\,dM_{i}(t_{0}+sd_{n}). \tag{10}
\]

We first establish the asymptotic limit of the quadratic covariation processes. For $d_{n}=n^{-1/3}$, the predictable quadratic covariation between components $j$ and $k$ is

\[
\begin{aligned}
\langle\tilde{v}_{n,j}^{(2)},\tilde{v}_{n,k}^{(2)}\rangle(s^{\prime}) ={}& \frac{d_{n}^{-4}}{n}\frac{1}{n}\sum_{i=1}^{n}\int_{0}^{s^{\prime}}\sum_{l,m=0}^{p}(r^{(2)}(t_{0}+sd_{n}))^{-1}_{jl}(r^{(2)}(t_{0}+sd_{n}))^{-1}_{km}\\
&\cdot Y_{il}(t_{0}+sd_{n})Y_{im}(t_{0}+sd_{n})\,d\langle M_{i}\rangle(t_{0}+sd_{n})\\
={}& \frac{1}{n}\sum_{i=1}^{n}\int_{0}^{s^{\prime}}\sum_{l,m=0}^{p}(r^{(2)}(t_{0}+sd_{n}))^{-1}_{jl}(r^{(2)}(t_{0}+sd_{n}))^{-1}_{km}\\
&\cdot Y_{il}(t_{0}+sd_{n})Y_{im}(t_{0}+sd_{n})\lambda_{i}(t_{0}+sd_{n})\,ds.
\end{aligned}
\]

Since

λi(t0+sdn)=g=0pβg(t0+sdn)Zig(t0+sdn)Yi(t0+sdn),\displaystyle\lambda_{i}(t_{0}+sd_{n})=\sum_{g=0}^{p}\beta_{g}(t_{0}+sd_{n})Z_{ig}(t_{0}+sd_{n})Y_{i}(t_{0}+sd_{n}),

this becomes

v~n,j(2),v~n,k(2)(s)\displaystyle\langle\tilde{v}_{n,j}^{(2)},\tilde{v}^{(2)}_{n,k}\rangle(s^{\prime}) =\displaystyle= 1ni=1n0sl,m=0p(r(2)(t0+sdn))jl1(r(2)(t0+sdn))km1\displaystyle\frac{1}{n}\sum_{i=1}^{n}\int_{0}^{s^{\prime}}\sum_{l,m=0}^{p}(r^{(2)}(t_{0}+sd_{n}))^{-1}_{jl}(r^{(2)}(t_{0}+sd_{n}))^{-1}_{km}
Yil(t0+sdn)Yim(t0+sdn)g=0pβg(t0+sdn)Yig(t0+sdn)ds.\displaystyle\cdot Y_{il}(t_{0}+sd_{n})Y_{im}(t_{0}+sd_{n})\sum_{g=0}^{p}\beta_{g}(t_{0}+sd_{n})Y_{ig}(t_{0}+sd_{n})ds.

Since, by Assumption 2,

sups[0,s]|1ni=1nYilYimYig(t0+sdn)rlmg(3)(t0+sdn)|\displaystyle\sup_{s\in[0,s^{\prime}]}|\frac{1}{n}\sum_{i=1}^{n}Y_{il}Y_{im}Y_{ig}(t_{0}+sd_{n})-r^{(3)}_{lmg}(t_{0}+sd_{n})| P\displaystyle\stackrel{{\scriptstyle P}}{{\to}} 0,\displaystyle 0,

we obtain

\[
\langle\tilde{v}_{n,j}^{(2)},\tilde{v}_{n,k}^{(2)}\rangle(s') \stackrel{P}{\to} \int_0^{s'}\sum_{g,l,m=0}^{p}\big(r^{(2)}(t_0)\big)^{-1}_{jl}\big(r^{(2)}(t_0)\big)^{-1}_{km}\,r^{(3)}_{lmg}(t_0)\,\beta_g(t_0)\,ds.
\]

The result also shows that the asymptotic covariance of $\tilde{v}_{n,j}^{(2)}$ and $\tilde{v}_{n,k}^{(2)}$ is given by the right hand side of the above expression. Similarly, if we let $\tilde{v}^{(2)}$ denote the process obtained as the limit in distribution of $\tilde{v}_n^{(2)}$, then, using the conditional independence of the increments of the martingale difference sequence, we can show that for $s',s''>0$,

\[
\mathrm{Cov}\big(\tilde{v}_j^{(2)}(s'),\tilde{v}_k^{(2)}(s'')\big) = \int_0^{\min(s',s'')}\sum_{g,l,m=0}^{p}\big(r^{(2)}(t_0)\big)^{-1}_{jl}\big(r^{(2)}(t_0)\big)^{-1}_{km}\,r^{(3)}_{lmg}(t_0)\,\beta_g(t_0)\,ds. \tag{11}
\]

We next verify the Lindeberg condition for $\tilde{v}_n^{(2)}$, in order to establish the asymptotic normality. With the choice $d_n=n^{-1/3}$, and with $\tilde{v}_{n,j}^{(2)}$ the $j$'th component, defined in (10), we have

\[
\begin{aligned}
&\sum_{j=1}^{r_n}\mathbb{E}\Big[\big(\tilde{v}_{n,j}^{(2)}(s')\big)^2\cdot\mathbf{1}_{\{|\tilde{v}_{n,j}^{(2)}(s')|>\varepsilon\}}\Big] \\
&\quad\leq \mathbb{E}\bigg[\bigg(\frac{1}{n}\sum_{i=1}^{n}\int_0^{s'}\sum_{l=0}^{p}\big(r^{(2)}(t_0+sd_n)\big)^{-1}_{jl}\,Y_{il}(t_0+sd_n)\bigg)^{2}\lambda_i(t_0+sd_n)\,ds\cdot\mathbf{1}_{\{|\tilde{v}_{n,j}^{(2)}(s')|>\varepsilon\}}\bigg] \\
&\quad\leq \sum_{j=1}^{r_n}\frac{1}{n}\sum_{i=1}^{n}\int_0^{s'}\sum_{l=0}^{p}\sum_{k=0}^{p}\sup_{s\in[0,s']}\Big|\big(r^{(2)}(t_0+sd_n)\big)^{-1}_{jl}\Big|\,\sup_{s\in[0,s']}\Big|\big(r^{(2)}(t_0+sd_n)\big)^{-1}_{jk}\Big| \\
&\qquad\cdot\sup_{i=1,\ldots,n,\,s\in[0,s']}\big|Y_{il}(t_0+sd_n)\big|\,\lambda_i(t_0+sd_n)\,ds\times\mathbb{E}\Big(\mathbf{1}_{\{|\tilde{v}_{n,j}^{(2)}(s')|>\varepsilon\}}\Big),
\end{aligned}
\]

where the last inequality follows by expanding the square and the triangle inequality.

Under Assumption 1, the entries of $r^{(2)}$ are continuous functions on the compact set $[0,s']$, and thus they are bounded.

Furthermore, from Chebyshev’s inequality we get,

\[
\begin{aligned}
&P\bigg(\bigg|\frac{d_n^{-2}}{\sqrt{n}}\,\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\int_0^{s'}\sum_{l=0}^{p}\big(r^{(2)}(t_0+sd_n)\big)^{-1}_{jl}\,Y_{il}(t_0+sd_n)\,dM_i(t_0+sd_n)\bigg|>\varepsilon\bigg) \\
&\quad\leq \varepsilon^{-2}\,\mathbb{E}\bigg[\frac{1}{n}\sum_{i=1}^{n}\int_0^{s'}\bigg(\sum_{l=0}^{p}\big(r^{(2)}(t_0+sd_n)\big)^{-1}_{jl}\,Y_{il}(t_0+sd_n)\bigg)^{2}\lambda_i(t_0+sd_n)\,ds\bigg].
\end{aligned}
\]

This completes the verification of the Lindeberg condition, by the uniform convergence in probability stated in Assumption 2.

$(iii)$: The term $\tilde{v}_n^{(3)}$ is asymptotically negligible.
From the definition of $\tilde{v}_n^{(3)}$, we see that

\[
\tilde{v}_n^{(3)}(s') = d_n^{-1}\int_0^{s'}\big(J(t_0+sd_n)-1\big)\beta(t_0+sd_n)\,ds.
\]

Recall that the process $J(t_0+sd_n)$ is the indicator that $Y^{T}(t_0+sd_n)$ has full rank. Define the set

\[
E_n = \bigg\{\sup_{s\in[0,s']}\Big\|\tfrac{1}{n}R^{(2)}(t_0+sd_n)-r^{(2)}(t_0+sd_n)\Big\|<\varepsilon\bigg\}.
\]

We have established that $r^{(2)}(t)$ is nonsingular. Hence, for $\varepsilon$ small enough, $\tfrac{1}{n}R^{(2)}(t_0+sd_n)$ is invertible on $E_n$, and therefore $J(t_0+sd_n)=1$ for all $s\in[0,s']$ on $E_n$.

Thus, with the choice $d_n=n^{-1/3}$,

\[
\begin{aligned}
&P\bigg(\sup_{s\in[0,s']}\bigg|d_n^{-1}\int_0^{s'}\big(J(t_0+sd_n)-1\big)\beta(t_0+sd_n)\,ds\bigg|>\varepsilon\bigg) \\
&\quad= P\bigg(\bigg\{\sup_{s\in[0,s']}\bigg|d_n^{-1}\int_0^{s'}\big(J(t_0+sd_n)-1\big)\beta(t_0+sd_n)\,ds\bigg|>\varepsilon\bigg\}\cap E_n\bigg) \\
&\qquad+ P\bigg(\bigg\{\sup_{s\in[0,s']}\bigg|d_n^{-1}\int_0^{s'}\big(J(t_0+sd_n)-1\big)\beta(t_0+sd_n)\,ds\bigg|>\varepsilon\bigg\}\cap E_n^{c}\bigg) \\
&\quad\leq P(E_n^{c}) \\
&\quad\to 0,
\end{aligned}
\]

where the inequality holds because the first term vanishes, since $J(t_0+sd_n)=1$ on $E_n$. Consequently,

\[
P\bigg(\sup_{s\in[0,s']}|\tilde{v}_n^{(3)}|>\varepsilon\bigg) \longrightarrow 0,
\]

as $n\to\infty$, i.e. $\tilde{v}_n^{(3)}$ is asymptotically negligible.

\Box

The coordinate-wise result now follows by the Cramér-Wold device. Recall that

\[
\tilde{v}_{k,n}(s) = d_n^{-2}\big(v_{k,n}(t_0+sd_n)-v_{k,n}(t_0)\big),
\]

is the coordinate-wise rescaled process, for $k=0,\ldots,p$.

Corollary 1

Suppose that Assumptions 1-3 hold. Then, for any $k=0,\ldots,p$,

\[
\tilde{v}_{k,n}(s) \stackrel{d}{\to} \tilde{v}_k(s),
\]

on $D(-c,c)$, as $n\to\infty$, where $\tilde{v}_k$ is a mean zero Gaussian process with covariance structure

\[
\mathrm{Cov}\big(\tilde{v}_k(s'),\tilde{v}_k(s'')\big) = \sigma_k^2\,\min(s',s''),
\]

where

\[
\sigma_k^2 = \sum_{g,l,m=0}^{p}\big(r^{(2)}(t_0)\big)^{-1}_{kl}\big(r^{(2)}(t_0)\big)^{-1}_{km}\,r^{(3)}_{lmg}(t_0)\,\beta_g(t_0).
\]

Proof. The result follows from the previous theorem and the Cramér-Wold device with the choice of coefficients $\alpha_k=1$, and $\alpha_j=0$ for $j\neq k$. \Box

The corollary thus establishes Assumption A1 of [4] for $\tilde{v}_{k,n}$.
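As an illustration of how the variance constant $\sigma_k^2$ in Corollary 1 is assembled, the following minimal sketch evaluates the triple sum for given inputs. The function name `sigma2_k` and the numerical values are hypothetical; the inverse of $r^{(2)}(t_0)$, the array $r^{(3)}(t_0)$ and the values $\beta_g(t_0)$ are assumed to be supplied by the user.

```python
def sigma2_k(k, r2_inv, r3, beta):
    """Evaluate the triple sum defining sigma_k^2 at a fixed time point t0.

    r2_inv[k][l] : entries of the inverse of r^(2)(t0)  (assumed given)
    r3[l][m][g]  : entries of r^(3)(t0)                 (assumed given)
    beta[g]      : the values beta_g(t0)                (assumed given)
    """
    dim = len(beta)  # dim = p + 1
    total = 0.0
    for l in range(dim):
        for m in range(dim):
            for g in range(dim):
                total += r2_inv[k][l] * r2_inv[k][m] * r3[l][m][g] * beta[g]
    return total


# Sanity check with p = 0, assuming the only regressor is the at-risk
# indicator, so that r^(2)(t0) = r^(3)(t0) = y (hypothetical value below).
y, h = 0.8, 0.5  # limiting at-risk proportion and hazard value at t0
value = sigma2_k(0, [[1.0 / y]], [[[y]]], [h])
```

In this special case the sum collapses to $\beta_0(t_0)/y$, which is the form one would expect for a model without covariates.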

Note 1

We note that the limit process $\tilde{v}_k$ can be identified with a (two-sided) Brownian motion with covariance structure $\mathrm{Cov}(\tilde{v}_k(s'),\tilde{v}_k(s''))=\sigma_k^2\,\min(s',s'')$, with $\sigma_k^2$ defined above. \Box

Next we define the $k$'th coordinate of the rescaled deterministic part

\[
\begin{aligned}
g_{k,n}(s) &= d_n^{-2}\bigg(\int_{t_0}^{t_0+sd_n}Y^{-1}(u)Y(u)\,dB(u)-Y^{-1}(t_0)Y(t_0)\,dB(t_0)\,sd_n\bigg)_k \\
&= d_n^{-2}\bigg(\int_{t_0}^{t_0+sd_n}dB_k(u)-\beta_k(t_0)\,sd_n\bigg).
\end{aligned}
\]

We will first show that $g_{k,n}$ satisfies Assumption A2 of [4], i.e. that for every finite $c>0$ there is an $A_k<0$ such that

\[
\sup_{|s|\leq c}\big|g_{k,n}(s)-A_ks^2\big| \to 0 \tag{12}
\]

as $n\to\infty$. It is elementary to see that this holds if $\beta_k$ is differentiable with $\beta_k'<0$ in a neighbourhood around $t_0$, with $A_k=\beta_k'(t_0)/2$: a second order Taylor expansion of $B_k$ around $t_0$ gives $B_k(t_0+sd_n)-B_k(t_0)-\beta_k(t_0)sd_n=\tfrac{1}{2}\beta_k'(t_0)s^2d_n^2+o(d_n^2)$, uniformly in $|s|\leq c$.
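The convergence in Assumption A2 can be checked numerically. The sketch below is only an illustration under assumed inputs: it takes the hypothetical decreasing function $\beta_k(t)=e^{-t}$, with integral $B_k(t)=1-e^{-t}$, and the limit constant $\beta_k'(t_0)/2$ obtained from a second order Taylor expansion of $B_k$, and evaluates the supremum distance between $g_{k,n}$ and the limiting parabola over a grid in $[-c,c]$.

```python
import math


def g_kn(s, n, t0, beta, B):
    """Rescaled deterministic part with d_n = n^(-1/3)."""
    d_n = n ** (-1.0 / 3.0)
    return (B(t0 + s * d_n) - B(t0) - beta(t0) * s * d_n) / d_n ** 2


# Hypothetical example functions (not from the paper).
beta = lambda t: math.exp(-t)          # decreasing parameter function
B = lambda t: 1.0 - math.exp(-t)       # its integral from 0 to t
beta_prime = lambda t: -math.exp(-t)   # derivative of beta

t0, c = 1.0, 2.0
A_k = 0.5 * beta_prime(t0)             # Taylor coefficient of s^2
grid = [-c + 0.1 * i for i in range(41)]


def sup_error(n):
    """Supremum over the grid of |g_{k,n}(s) - A_k s^2|."""
    return max(abs(g_kn(s, n, t0, beta, B) - A_k * s * s) for s in grid)
```

Evaluating `sup_error` for increasing `n` shows the distance shrinking at the rate of $d_n$, as the remainder term of the expansion suggests.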

Next we want to establish Proposition 1 of [4], from which Assumptions A3 and A4 of that paper will follow.

Lemma 1

Suppose that Assumptions 1, 2 and 3 hold and that $\beta_k$ is differentiable with $\beta_k'<0$ in a neighbourhood around $t_0$. Then Proposition 1 in [4] holds for $\tilde{v}_{k,n}$ and $g_{k,n}$, i.e. for all $\varepsilon,\delta>0$ there is a $\tau=\tau(\varepsilon,\delta)<\infty$ such that

\[
\limsup_{n\to\infty}\mathbb{P}\bigg(\sup_{|s|\geq\tau}\bigg|\frac{\tilde{v}_{k,n}(s)}{g_{k,n}(s)}\bigg|>\varepsilon\bigg) < \delta.
\]

Proof. We show the result by first bounding $g_{k,n}$, and then using that bound, together with Doob's and Chebyshev's inequalities, the Ito isometry and properties of $\tilde{v}_{k,n}$, to prove the full result.

$(i)$ Bounding $g_{k,n}(s)$: We have shown that $g_{k,n}$ satisfies Assumption A2 in [4], i.e. that (12) holds. In particular, for every $\tau>0$ and $0<\varepsilon<\frac{1}{2}|A_k|\tau^2$ there is a finite $n_0=n_0(\varepsilon)$ such that, for $s=\pm\tau$ and all $n\geq n_0$,

\[
g_{k,n}(\pm\tau) \leq A_k\tau^2+\varepsilon.
\]

Since $g_{k,n}(0)=0$ and $g_{k,n}$ is concave, we have that

\[
\begin{aligned}
g_{k,n}(s) &\leq \frac{g_{k,n}(\pm\tau)}{\tau}\,|s| \\
&\leq \frac{A_k\tau^2+\varepsilon}{\tau}\,|s| \\
&\leq \frac{1}{2}A_k\tau\,|s|,
\end{aligned}
\]

for all $|s|\geq\tau$ and all $n\geq n_0$, where the sign in $\pm\tau$ is that of $s$. Since $A_k<0$, we have thus established that for all $|s|\geq\tau$ and all $n\geq n_0$,

\[
|g_{k,n}(s)| \geq \frac{1}{2}|A_k|\,\tau\,|s|. \tag{13}
\]

$(ii)$ Bounding $\tilde{v}_{k,n}(s)$: The proof is similar to that of the corresponding result in [4] for rescaled partial sum processes and empirical processes, where one partitions $\{|s|\geq\tau\}$ into intervals, bounds the process at the boundaries of those intervals, and uses a modulus of continuity for the process on the intervals. We, however, use the fact that we have martingales to our advantage: Doob's maximal $L^2$ inequality bounds the maximum over an interval by the value at its endpoint, and the Ito isometry then bounds that value.

Thus we partition the tail set $\{|s|\geq\tau\}$ into dyadic intervals, by

\[
\{|s|\geq\tau\} = \cup_{j=0}^{\infty}B_j,
\]

with $B_j=\{s:2^j\tau\leq|s|\leq2^{j+1}\tau\}$, for $j=0,1,2,\ldots$. Then for $|s|\geq\tau$, and using the bound (13), we get

\[
\frac{|\tilde{v}_{k,n}(s)|}{|g_{k,n}(s)|} \leq \frac{2}{|A_k|}\,\frac{|\tilde{v}_{k,n}(s)|}{\tau|s|}.
\]

Therefore

\[
\begin{aligned}
\mathbb{P}\bigg(\sup_{|s|\geq\tau}\frac{|\tilde{v}_{k,n}(s)|}{|g_{k,n}(s)|}>\epsilon\bigg)
&\leq \mathbb{P}\bigg(\sup_{|s|\geq\tau}\frac{|\tilde{v}_{k,n}(s)|}{|s|}>\frac{|A_k|}{2}\epsilon\tau\bigg) \\
&\leq \mathbb{P}\bigg(\cup_{j=0}^{\infty}\bigg\{\sup_{s\in B_j}\frac{|\tilde{v}_{k,n}(s)|}{|s|}>\frac{|A_k|}{2}\epsilon\tau\bigg\}\bigg) \\
&\leq \sum_{j=0}^{\infty}\mathbb{P}\bigg(\sup_{s\in B_j}\frac{|\tilde{v}_{k,n}(s)|}{|s|}>\frac{|A_k|}{2}\epsilon\tau\bigg) \\
&\leq \sum_{j=0}^{\infty}\mathbb{P}\bigg(\sup_{s\in B_j}|\tilde{v}_{k,n}(s)|>\epsilon\frac{|A_k|}{2}\tau^2 2^j\bigg),
\end{aligned} \tag{14}
\]

where the last inequality follows since on $B_j$ we have $|s|\geq2^j\tau$.

We now bound the individual terms in the above sum, by

\[
\begin{aligned}
\mathbb{P}\bigg(\sup_{s\in B_j}|\tilde{v}_{k,n}(s)|>\epsilon|A_k|\tau^2 2^{j-1}\bigg)
&\leq \frac{\mathbb{E}\big(\sup_{s\in B_j}\tilde{v}_{k,n}^2(s)\big)}{(\epsilon A_k\tau^2 2^{j-1})^2} \\
&\leq \frac{4\,\mathbb{E}\big(\tilde{v}_{k,n}^2(2^{j+1}\tau)\big)}{(\epsilon A_k\tau^2 2^{j-1})^2},
\end{aligned} \tag{15}
\]

where the first inequality follows by Chebyshev's inequality and the second by Doob's maximal $L^2$ inequality. By the Ito isometry

\[
\mathbb{E}\big(\tilde{v}_{k,n}^2(2^{j+1}\tau)\big) = d_n^{-4}\bigg(\int_{t_0}^{t_0+2^{j+1}\tau d_n}\|Y^{-}(u)\|^2\,d\langle M\rangle(u)\bigg)_k,
\]

which is bounded, by $C<\infty$ say, since $Y^{-}$ is bounded in probability and $M$ is a square-integrable martingale.

Thus, from (14) and (15), we get

\[
\mathbb{P}\bigg(\sup_{|s|\geq\tau}\frac{|\tilde{v}_{k,n}(s)|}{|g_{k,n}(s)|}>\epsilon\bigg) \leq \frac{64\,C}{(\epsilon A_k\tau^2)^2}\sum_{j=0}^{\infty}2^{-j} < \delta,
\]

where the last inequality follows by choosing $\tau=\tau(\epsilon,\delta)$ large enough for fixed $\epsilon,\delta>0$, and $n\geq n_0$. \Box
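To make the last step concrete, the following sketch computes how large $\tau$ has to be for the final bound to fall below $\delta$. It is an illustration only: it treats $C$ as a given constant, as in the proof, uses that the geometric sum equals 2, and the function names and numerical values are hypothetical.

```python
def tail_bound(tau, eps, A_k, C):
    """Right hand side of the final bound: 64 C / (eps A_k tau^2)^2 times sum_j 2^(-j) = 2."""
    return 128.0 * C / (eps * A_k * tau ** 2) ** 2


def tau_threshold(eps, delta, A_k, C):
    """Value of tau at which the bound equals delta; any larger tau gives a bound below delta."""
    return (128.0 * C / (delta * (eps * A_k) ** 2)) ** 0.25


# Hypothetical values, for illustration only.
eps, delta, A_k, C = 0.1, 0.05, -0.5, 1.0
tau_star = tau_threshold(eps, delta, A_k, C)
```

Since the bound decreases like $\tau^{-4}$, a modest increase of $\tau$ beyond `tau_star` already gives a comfortable margin.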

Finally we establish the tail behaviour of the components of the limit process $\tilde{v}$, i.e. we prove that $\tilde{v}_k$ satisfies Assumption A5 in [4].

Lemma 2

Suppose that Assumptions 1, 2 and 3 hold and that $\beta_k$ is differentiable with $\beta_k'<0$ in a neighbourhood around $t_0$, for a fixed $k=0,1,\ldots,p$. Then the component $\tilde{v}_k$ of the limit process $\tilde{v}$ satisfies Assumption A5 in [4], i.e. for every $\epsilon,\delta>0$ there is a $\tau=\tau(\epsilon,\delta)<\infty$ such that

\[
\mathbb{P}\bigg(\sup_{|s|\geq\tau}\frac{|\tilde{v}_k(s)|}{s^2}>\epsilon\bigg) < \delta.
\]

Proof. The proof is a straightforward adaptation of the methods in the proof of Lemma 1, with the use of Doob's and Chebyshev's inequalities and the Ito isometry. \Box

We are next able to state a limit distribution result for the order restricted estimator $\tilde{B}_k$, defined in (8), of the cumulative function $B_k$.

Theorem 2

Suppose that Assumptions 1, 2 and 3 hold and that $\beta_k$ is differentiable with $\beta_k'<0$ in a neighbourhood around $t_0$. Then

\[
n^{2/3}c(t_0)\big(\tilde{B}_k(t_0)-B_k(t_0)\big) \stackrel{d}{\to} S(-s^2+w(s))(0),
\]

as $n\to\infty$, where

\[
c(t_0) = 2^{-1/3}|\beta_k'(t_0)|^{1/3}(\sigma_k^2)^{-2/3},
\]

and $w$ is a standard two-sided Brownian motion.

Proof. Since we have established that Assumptions A1-A5 in [4] hold, we have that

\[
n^{2/3}\big[\tilde{B}_k(t_0)-B_k(t_0)\big] \stackrel{d}{\to} S(A_ks^2+\tilde{v}_k(s))(0),
\]

as $n\to\infty$, as a consequence of Theorem 1 in [4].

Furthermore, since $\tilde{v}_k$ is a two-sided Brownian motion, with $\tilde{v}_k(0)=0$ and covariance $\mathrm{Cov}(\tilde{v}_k(s),\tilde{v}_k(s'))=\sigma_k^2\min(s,s')$, we have that $\tilde{v}_k(s)\stackrel{d}{=}(\sigma_k^2)^{1/2}\,w(s)$, with $w$ a standard (two-sided) Brownian motion. In fact, we can simplify the expression for the limit distribution further, by the change of variable $s=\gamma u$ with $\gamma>0$, to obtain

\[
\begin{aligned}
A_ks^2+\tilde{v}_k(s) &= A_k\gamma^2u^2+\tilde{v}_k(\gamma u) \\
&\stackrel{d}{=} A_k\gamma^2u^2+\gamma^{1/2}(\sigma_k^2)^{1/2}\,w(u) \\
&= (\sigma_k^2)^{2/3}(-A_k)^{-1/3}\big[-u^2+w(u)\big],
\end{aligned}
\]

where the second equality follows by the self similarity of Brownian motion, and the third by choosing $\gamma$ so that $-A_k\gamma^2=\gamma^{1/2}(\sigma_k^2)^{1/2}$, i.e. with $\gamma=(\sigma_k^2)^{1/3}(-A_k)^{-2/3}$.

Next, we use that $S(cg(u))=cS(g(u))$ for any function $g$ and any constant $c>0$, by properties of the least concave majorant $S$, cf. e.g. Lemma A1 in [4] (noting the typo in formula (74) in [4]; the constant $a$ must be positive), to establish that

\[
S(A_ks^2+\tilde{v}_k(s)) \stackrel{d}{=} (\sigma_k^2)^{2/3}(-A_k)^{-1/3}\,S(-s^2+w(s)).
\]

Finally, noting that $-A_k=|\beta_k'(t_0)|/2$ proves the formula for $c(t_0)$, and ends the proof of the theorem.

\Box
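The algebra behind the normalising constant of Theorem 2 can be verified numerically. The sketch below is an illustration with hypothetical input values: it takes $\gamma>0$ as the positive solution of $-A_k\gamma^2=\gamma^{1/2}(\sigma_k^2)^{1/2}$, with $-A_k=|\beta_k'(t_0)|/2$, and checks that the common factor is $(\sigma_k^2)^{2/3}(-A_k)^{-1/3}$, whose reciprocal is $c(t_0)$.

```python
def rescaling_constants(sigma2, beta_prime):
    """Constants of the change of variable s = gamma * u.

    sigma2     : the variance constant sigma_k^2 (assumed given, positive)
    beta_prime : the derivative beta_k'(t0)      (assumed given, negative)
    """
    A = 0.5 * beta_prime                                  # A_k < 0
    gamma = sigma2 ** (1.0 / 3.0) * (-A) ** (-2.0 / 3.0)  # positive solution for gamma
    scale = sigma2 ** (2.0 / 3.0) * (-A) ** (-1.0 / 3.0)  # common factor after rescaling
    c = 1.0 / scale                                       # normalising constant c(t0)
    return A, gamma, scale, c


# Hypothetical values, for illustration only.
sigma2, beta_prime = 0.625, -0.4
A, gamma, scale, c = rescaling_constants(sigma2, beta_prime)
```

Both sides of the defining equation for $\gamma$ then agree with `scale`, and `c` agrees with the closed form stated in Theorem 2.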

In order to state the final limit distribution, for the solution $\tilde{\beta}_k$, we need to study the limit process $y(s)=-s^2+w(s)$, and show that it satisfies the assumptions of Proposition 2 in [4], with the appropriate analogous statements for the least concave majorant, and thus that Assumption A6 in [4] holds for $y(s)$. However, this has in fact already been established for the process $y(s)$ in [4]. Thus we have the following theorem.

Theorem 3

Suppose that Assumptions 1, 2 and 3 hold and that $\beta_k$ is differentiable with $\beta_k'<0$ in a neighbourhood around $t_0$. Then

\[
n^{1/3}c(t_0)\big(\tilde{\beta}_k(t_0)-\beta_k(t_0)\big) \stackrel{d}{\to} S(-s^2+w(s))'(0),
\]

as $n\to\infty$, where

\[
c(t_0) = 2^{-1/3}|\beta_k'(t_0)|^{1/3}(\sigma_k^2)^{-4/3},
\]

and $w$ is a standard two-sided Brownian motion.

Proof. Since we have established that Assumptions A1-A6 in [4] hold, from Theorem 2 in [4] it follows that

\[
n^{1/3}\big[\tilde{\beta}_k(t_0)-\beta_k(t_0)\big] \stackrel{d}{\to} S(A_ks^2+\tilde{v}_k(s))'(0),
\]

as $n\to\infty$. Rescaling and the self similarity of Brownian motion, as in the proof of Theorem 2, then give the statement of the theorem. \Box

Note 2

The limit distribution $S(-s^2+w(s))'(0)$ is a version of the Chernoff distribution $\mathrm{argmax}_{s\in\mathbb{R}}(-s^2+w(s))$, which arises in many cases of nonparametric order restricted inference.
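The limit variable can be approximated by simulation. The following sketch is not part of the derivation above: it discretises a two-sided Brownian motion on a truncated interval $[-c,c]$, computes the least concave majorant of $-s^2+w(s)$ on the grid with an upper convex hull scan, and returns its left slope at zero. The function names, the truncation level `c` and the grid size `m` are assumptions made for illustration.

```python
import random


def least_concave_majorant(xs, ys):
    """Values at xs of the least concave majorant of the points (xs[i], ys[i]).

    xs must be strictly increasing. Uses a monotone scan for the upper hull.
    """
    hull = []  # indices of the upper hull vertices
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross >= 0:   # vertex b lies on or below the chord from a to i
                hull.pop()
            else:
                break
        hull.append(i)
    out = [float(y) for y in ys]
    for a, b in zip(hull, hull[1:]):  # linear interpolation between hull vertices
        slope = (ys[b] - ys[a]) / (xs[b] - xs[a])
        for i in range(a, b + 1):
            out[i] = ys[a] + slope * (xs[i] - xs[a])
    return out


def simulate_y(rng, c=3.0, m=300):
    """Grid on [-c, c] and a discretised path of y(s) = -s^2 + w(s), with w(0) = 0."""
    h = c / m
    xs = [(i - m) * h for i in range(2 * m + 1)]
    w = [0.0] * (2 * m + 1)
    for i in range(m + 1, 2 * m + 1):   # right branch of the two-sided motion
        w[i] = w[i - 1] + rng.gauss(0.0, h ** 0.5)
    for i in range(m - 1, -1, -1):      # independent left branch
        w[i] = w[i + 1] + rng.gauss(0.0, h ** 0.5)
    return xs, [-x * x + wi for x, wi in zip(xs, w)]


def draw_slope_at_zero(rng, c=3.0, m=300):
    """One approximate draw of the left derivative at 0 of S(-s^2 + w(s))."""
    xs, ys = simulate_y(rng, c, m)
    lcm = least_concave_majorant(xs, ys)
    return (lcm[m] - lcm[m - 1]) / (xs[m] - xs[m - 1])
```

Repeated calls to `draw_slope_at_zero` with a seeded `random.Random` instance give a Monte Carlo sample from which quantiles of the limit distribution could be approximated; the truncation to $[-c,c]$ and the grid spacing both introduce approximation error.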

5 Discussion

In this paper we have derived limit distributions for the coordinate-wise least squares projection of a naive estimator on the space of decreasing functions. The results are derived using a general approach presented in [4], and the main work in this paper has been to establish the conditions required in [4] for the conclusions of that paper to hold. This gives us our two main results, Theorems 2 and 3. The conditions under which we are able to establish these results are the conditions required in [1] for the derivation of limit distributions of the starting estimator $\hat{B}$; thus we do not need to demand more than is demanded in [1].

One of the main vehicles for this is our Theorem 1, which derives the limit distribution for the rescaled process $\tilde{v}_n$. We note that the result in Theorem 1 is in fact stronger than necessary for our needs, and that we only need its consequence, Corollary 1.

6 Acknowledgments

The research of DA is partially supported by the Swedish Research Council (SRC). DA gratefully acknowledges the SRC’s support.

References

  • [1] Andersen, P. K., Borgan, Ø., Gill, R. D. and Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer Series in Statistics.
  • [2] Robertson, T., Wright, F. T. and Dykstra, R. L. (1988). Order Restricted Statistical Inference. John Wiley & Sons, Ltd., Chichester.
  • [3] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge University Press, New York.
  • [4] Anevski, D. and Hössjer, O. (2006). A general asymptotic scheme for inference under order restrictions. Annals of Statistics, 34(4): 1874-1930.
  • [5] Huang, Y. (2017). Restoration of monotonicity respecting in dynamic regression. Journal of the American Statistical Association, 112(518): 613-622.
  • [6] Chung, Y., Ivanova, A. and Fine, J. P. (2024). Shape restricted additive hazards models: Monotone, unimodal, and U-shaped hazard functions. Statistics in Medicine, 43: 1671-1687.