Strengthened Chernoff-type variance bounds*

G. Afendras† and N. Papadatos‡

Department of Mathematics, Section of Statistics and O.R., University of Athens,
Panepistemiopolis, 157 84 Athens, Greece.



Abstract: Let X be an absolutely continuous random variable from 
the integrated Pearson family and assume that X has finite moments 
of any order. Using some properties of the associated orthonormal
polynomial system we provide a class of strengthened Chernoff-type 
variance bounds. 



AMS 2000 subject classification: Primary 60E15.

Key words and phrases: Integrated Pearson family; Rodrigues
polynomials; Derivatives; Chernoff-type variance bounds.

1 Introduction

Let $Z$ be a standard normal random variable and $g:\mathbb{R}\to\mathbb{R}$ any absolutely continuous
function with derivative $g'$ such that $\mathbb{E}(g'(Z))^2<\infty$. Chernoff (1981), using Hermite
polynomials, proved that

$$\mathrm{Var}\,g(Z)\le\mathbb{E}(g'(Z))^2; \tag{1.1}$$

see, also, Nash (1958) and Brascamp and Lieb (1976). In (1.1) the equality holds if and
only if $g$ is a polynomial of degree at most one, that is, a linear function. This inequality plays
an important role in the isoperimetric problem, as well as in several areas of probability
and statistics. It has been extended and generalized by many authors, including [13], [10],
[8], [19], [11], [23], [18], [17], [22], [21], [24], [25], [1]. On the other hand, Cacoullos
(1982) showed the inequality

$$\mathrm{Var}\,g(Z)\ge\mathbb{E}^2g'(Z), \tag{1.2}$$

in which the equality again holds if and only if $g$ is linear.

*Work partially supported by the University of Athens Research Grant 70/4/5637.
†e-mail: g_afendras@math.uoa.gr
‡Corresponding author. e-mail: npapadat@math.uoa.gr, url: users.uoa.gr/~npapadat/

In this article we provide improvements on Chernoff's bound. In particular, an application
of the main result (Theorem 3.1) to $Z$ yields, for $n=1$, the inequality

$$\mathrm{Var}\,g(Z)\le\tfrac12\,\mathbb{E}^2g'(Z)+\tfrac12\,\mathbb{E}(g'(Z))^2, \tag{1.3}$$

in which the equality holds if and only if $g$ is a polynomial of degree at most two. In view
of (1.2) it is clear that the upper bound in (1.3) improves the one given in (1.1) and, in fact,
it is strictly better, unless $g$ is linear.
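
As a quick sanity check of (1.1)-(1.3), the following sketch evaluates the three bounds exactly for a polynomial $g$, using the standard normal moments $\mathbb{E}Z^{2m}=(2m-1)!!$. The helper names (`normal_moment`, `chernoff_bounds`) and the coefficient-list representation of polynomials are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

def normal_moment(r):
    """E[Z^r] for Z ~ N(0,1): zero for odd r, (r-1)!! for even r."""
    if r % 2 == 1:
        return Fraction(0)
    out = Fraction(1)
    for j in range(1, r, 2):
        out *= j
    return out

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [i * c for i, c in enumerate(p)][1:]

def expect(p):
    """E[p(Z)] for Z ~ N(0,1)."""
    return sum((c * normal_moment(i) for i, c in enumerate(p)), Fraction(0))

def chernoff_bounds(g):
    """Return (Var g(Z), lower bound (1.2), strengthened bound (1.3), Chernoff bound (1.1))."""
    dg = poly_diff(g)
    var = expect(poly_mul(g, g)) - expect(g) ** 2
    lower = expect(dg) ** 2              # E^2 g'(Z)
    upper = expect(poly_mul(dg, dg))     # E (g'(Z))^2
    return var, lower, (lower + upper) / 2, upper

# g(x) = x^3, given by its coefficients in increasing degree
print(chernoff_bounds([0, 0, 0, 1]))
```

For $g(x)=x^3$ the variance is 15, which lies between the lower bound 9 and the strengthened bound 18, while Chernoff's bound is 27. For a quadratic such as $g(x)=x+x^2$ the strengthened bound is attained.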

Similar bounds are valid for all distributions that will be studied in the sequel, namely,
Beta, Gamma and Normal. The main result applies to any Pearson (more precisely, integrated
Pearson) random variable possessing moments of any order. Hence, Theorem 3.1
also improves the bounds for Beta random variables, given by [24], [25]. The integrated
Pearson distributions are defined as follows, [18], [3], [1], [2]:

Definition 1.1 (Integrated Pearson Family). Let $X$ be an absolutely continuous random
variable with density $f$ and finite mean $\mu=\mathbb{E}X$. We say that $X$ (or its density $f$) belongs
to the integrated Pearson family if there exists a quadratic polynomial $q(x)=\delta x^2+\beta x+\gamma$
with $\delta,\beta,\gamma\in\mathbb{R}$, $|\delta|+|\beta|+|\gamma|>0$, such that

$$\int_{-\infty}^{x}(\mu-t)f(t)\,dt=q(x)f(x)\quad\text{for all } x\in\mathbb{R}. \tag{1.4}$$

This fact will be denoted by

$$X\sim\mathrm{IP}(\mu;q)\ \text{ or }\ f\sim\mathrm{IP}(\mu;q)\ \text{ or, more explicitly, }\ X \text{ or } f\sim\mathrm{IP}(\mu;\delta,\beta,\gamma). \tag{1.5}$$

In the sequel, whenever we claim that $X$ or $f\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$, it will be understood
that the density $f$ has been chosen in $C^\infty(\alpha,\omega)$ and is vanishing outside $(\alpha,\omega)$, where
$(\alpha,\omega):=(\operatorname{ess\,inf}(X),\operatorname{ess\,sup}(X))$ is the interval support of $X$; see [2], Proposition 2.1.
Consider an arbitrary real polynomial $q$ with $\deg(q)\le 2$ such that the set $S^+(q):=\{x:q(x)>0\}$
is nonempty. It can be shown that for any $\mu\in S^+(q)$ (i.e., with $q(\mu)>0$), there
exists a unique (up to equality in distribution) random variable $X$ with mean $\mu$ such that
its density $f$ satisfies (1.4); see [2], Section 2.
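
To make the defining identity (1.4) concrete, the sketch below checks it numerically for three familiar members of the family: the exponential law ($\mu=1$, $q(x)=x$), the uniform law on $(0,1)$ ($\mu=1/2$, $q(x)=x(1-x)/2$) and the standard normal ($\mu=0$, $q(x)=1$). The function names, the Simpson quadrature and the truncation of the lower limit at $-12$ for the normal case are assumptions made only for this illustration.

```python
import math

def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def ip_residual(f, q, mu, lower, x):
    """Left side minus right side of (1.4) at x; `lower` stands in for the lower limit."""
    lhs = simpson(lambda t: (mu - t) * f(t), lower, x)
    return lhs - q(x) * f(x)

# Exponential(1) = Gamma(1, 1): mu = 1, q(x) = x, support (0, infinity).
res_exp = ip_residual(lambda t: math.exp(-t), lambda x: x, 1.0, 0.0, 2.5)
# Uniform(0, 1) = Beta(1, 1): mu = 1/2, q(x) = x(1 - x)/2.
res_unif = ip_residual(lambda t: 1.0, lambda x: x * (1 - x) / 2, 0.5, 0.0, 0.3)
# Standard normal: mu = 0, q(x) = 1; the lower limit is truncated at -12 (assumption).
phi = lambda t: math.exp(-t * t / 2) / math.sqrt(2 * math.pi)
res_norm = ip_residual(phi, lambda x: 1.0, 0.0, -12.0, 0.7)
print(res_exp, res_unif, res_norm)
```

All three residuals should vanish up to quadrature error, which is what (1.4) asserts pointwise.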

Many commonly used continuous distributions are members of the integrated Pearson
family, e.g., Normal, Beta, Gamma, Negative Gamma, Pareto (with $a>1$), Reciprocal
Gamma (with $a>1$), $F_{n,m}$ (with $m>2$) and $t_n$ (with $n>1$) distributions, including their
location-scale families and their negatives; see Table 2.1 in [2] for a complete description.
The proof of the main result is based on specific properties of the associated orthogonal
polynomials that can be found in [2]. For easy reference, all required results are
reviewed in Appendix A.

2 Preliminaries

The following definition will be used in the sequel.

Definition 2.1 (cf. [1], p. 3629). Assume that $X\sim\mathrm{IP}(\mu;q)$ and denote by $q(x)=\delta x^2+\beta x+\gamma$
its quadratic polynomial. Let $(\alpha,\omega)$ be the support of $X$ and fix an integer $n\in\{1,2,\ldots\}$.
We shall denote by $\mathcal{H}^n(X)$ the class of functions $g:(\alpha,\omega)\to\mathbb{R}$ satisfying the
following two properties:

H1: For each $k\in\{0,1,\ldots,n-1\}$, $g^{(k)}$ (with $g^{(0)}=g$) is an absolutely continuous function
with a.s. derivative $g^{(k+1)}$. That is, $g\in C^{n-1}(\alpha,\omega)$ and the function $g^{(n-1)}:(\alpha,\omega)\to\mathbb{R}$, with

$$g^{(n-1)}(x):=\frac{d^{n-1}g(x)}{dx^{n-1}},$$

is absolutely continuous in $(\alpha,\omega)$ with a.s. derivative $g^{(n)}$ such that

$$g^{(n-1)}(y)-g^{(n-1)}(x)=\int_x^y g^{(n)}(t)\,dt\quad\text{for every compact interval } [x,y]\subseteq(\alpha,\omega).$$

H2: $\mathbb{E}q^n(X)(g^{(n)}(X))^2<\infty$.

Also, we denote by $\mathcal{H}^0(X)$ and $\mathcal{H}^\infty(X)$ the following classes of functions:

$$\mathcal{H}^0(X):=L^2(\mathbb{R},X)=\{g:(\alpha,\omega)\to\mathbb{R},\ \text{Borel measurable, such that } \mathrm{Var}\,g(X)<\infty\};$$

$$\mathcal{H}^\infty(X):=\bigcap_{n=0}^{\infty}\mathcal{H}^n(X)=\{g\in C^\infty(\alpha,\omega):\mathbb{E}q^n(X)(g^{(n)}(X))^2<\infty\ \text{for all } n=0,1,\ldots\}.$$

It is clear that $\mathbb{E}^2q^n(X)|g^{(n)}(X)|\le\mathbb{E}q^n(X)\,\mathbb{E}q^n(X)(g^{(n)}(X))^2<\infty$, provided $\mathbb{E}|X|^{2n}<\infty$
(equivalently, $\delta<1/(2n-1)$; see Lemma A.1). On the other hand, under suitable
moment conditions on $X$, the assumption H2 implies that $\mathbb{E}q^i(X)(g^{(i)}(X))^2<\infty$ for all
$i\in\{0,1,\ldots,n\}$. In particular, if all moments exist (equivalently, if $\delta\le 0$), then

$$L^2(\mathbb{R},X)=\mathcal{H}^0(X)\supseteq\mathcal{H}^1(X)\supseteq\mathcal{H}^2(X)\supseteq\cdots\supseteq\mathcal{H}^\infty(X),$$

i.e., $\mathcal{H}^n(X)=\bigcap_{i=0}^{n}\mathcal{H}^i(X)$ for all $n$. In order to verify this fact we first show a lemma.

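Lemma 2.1. If $X\sim\mathrm{IP}(\mu;q)$ with support $(\alpha,\omega)$ and $g:(\alpha,\omega)\to\mathbb{R}$ is an absolutely
continuous function with a.s. derivative $g'$ such that $\mathbb{E}q(X)(g'(X))^2<\infty$ then $\mathbb{E}g^2(X)<\infty$.

Proof. Observe that $g^2(X)\le 2g^2(\mu)+2(g(X)-g(\mu))^2$. Since $\mu\in(\alpha,\omega)$,

$$\begin{aligned}
\mathbb{E}(g(X)-g(\mu))^2&=\int_\alpha^\mu f(x)\Big(\int_x^\mu g'(t)\,dt\Big)^2dx+\int_\mu^\omega f(x)\Big(\int_\mu^x g'(t)\,dt\Big)^2dx\\
&\le\int_\alpha^\mu f(x)(\mu-x)\int_x^\mu(g'(t))^2\,dt\,dx+\int_\mu^\omega f(x)(x-\mu)\int_\mu^x(g'(t))^2\,dt\,dx\\
&=\mathbb{E}q(X)(g'(X))^2,
\end{aligned}$$

where the inequality follows from the Cauchy-Schwarz inequality and the last equality from an interchange of the order of integration together with (1.4); cf. Lemma 3.1 in [22]. □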

Corollary 2.1. If $X\sim\mathrm{IP}(\mu;q)$, $\mathbb{E}|X|^{2n-1}<\infty$ and $g\in\mathcal{H}^n(X)$ for some fixed $n\in\{1,2,\ldots\}$
then $\mathbb{E}q^i(X)(g^{(i)}(X))^2<\infty$ for all $i\in\{0,1,\ldots,n\}$. In particular, $\mathrm{Var}\,g(X)<\infty$,
that is, $g\in L^2(\mathbb{R},X)$.

Proof. According to Theorem A.3, the assumptions on $X$ enable us to define the random
variables $X_k$ with densities

$$f_k(x)=\frac{q^k(x)f(x)}{\mathbb{E}q^k(X)},\qquad \alpha<x<\omega,\quad k=0,1,\ldots,n-1,$$

where $(\alpha,\omega)$ is the support of $X$ (and of each $X_k$). If $q(x)=\delta x^2+\beta x+\gamma$ is the quadratic
of $X$ then $X_k\sim\mathrm{IP}(\mu_k;q_k)$ with mean $\mu_k$ and quadratic $q_k$ given by

$$\mu_k=\frac{\mu+k\beta}{1-2k\delta},\qquad q_k(x)=\frac{\delta x^2+\beta x+\gamma}{1-2k\delta}=\frac{q(x)}{1-2k\delta},\qquad k=0,1,\ldots,n-1.$$

Set $\tilde g=g^{(n-1)}$, $\tilde\mu=\mu_{n-1}$, $\tilde q=q_{n-1}$, $\tilde X=X_{n-1}$ and observe that $\tilde X\sim\mathrm{IP}(\tilde\mu;\tilde q)$ and

$$\mathbb{E}\tilde q(\tilde X)(\tilde g'(\tilde X))^2=\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2}{(1-(2n-2)\delta)\,\mathbb{E}q^{n-1}(X)}<\infty,$$

because $g\in\mathcal{H}^n(X)$, so that the numerator is finite. [In view of Lemma A.1, $\mathbb{E}|X|^{2n-1}<\infty$
implies the inequality $(2n-2)\delta<1$; moreover, $\deg(q^{n-1})\le 2n-2$ shows that
$0<\mathbb{E}q^{n-1}(X)<\infty$.] An application of Lemma 2.1 to $\tilde g$, $\tilde X$ shows that $\mathbb{E}\tilde g^2(\tilde X)<\infty$, and thus,

$$\mathbb{E}q^{n-1}(X)(g^{(n-1)}(X))^2=\mathbb{E}\tilde g^2(\tilde X)\,\mathbb{E}q^{n-1}(X)<\infty.$$

Hence, $g\in\mathcal{H}^{n-1}(X)$. Continuing inductively the result follows. □
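
The parameter map $(\mu;\delta,\beta,\gamma)\mapsto(\mu_k;\delta_k,\beta_k,\gamma_k)$ used in this proof can be checked on the Beta family. A Beta$(a,b)$ variable has $\mu=a/(a+b)$ and $q(x)=x(1-x)/(a+b)$, and the density proportional to $q^kf$ is that of Beta$(a+k,b+k)$, so the map should send the parameters of Beta$(a,b)$ to those of Beta$(a+k,b+k)$. The sketch below verifies this with exact rational arithmetic; `tilted_parameters` and `beta_ip` are hypothetical helper names introduced for illustration.

```python
from fractions import Fraction

def tilted_parameters(mu, delta, beta, gamma, k):
    """IP parameters (mu_k, delta_k, beta_k, gamma_k) of X_k, whose density is prop. to q^k f."""
    scale = 1 - 2 * k * delta
    if scale <= 0:
        raise ValueError("the map requires 1 - 2*k*delta > 0")
    return ((mu + k * beta) / scale, delta / scale, beta / scale, gamma / scale)

def beta_ip(a, b):
    """IP parameters of Beta(a, b): mu = a/(a+b), q(x) = x(1-x)/(a+b)."""
    s = Fraction(a + b)
    return (Fraction(a) / s, -1 / s, 1 / s, Fraction(0))

# Beta(2, 3) tilted by q^2 should coincide with Beta(4, 5).
print(tilted_parameters(*beta_ip(2, 3), 2), beta_ip(4, 5))
```

The same function reproduces $\mu_k$ and $q_k=q/(1-2k\delta)$ for any member of the family, as long as $1-2k\delta>0$.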

Turn now to the case where $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$ with $\delta\le 0$. It follows that all moments
exist and, moreover, the moment generating function of $X$ is finite in a neighborhood
of zero (see [2], Table 2.1, types 1-3). Then, it is well-known that the orthonormalized
polynomial system $\{\phi_k\}_{k=0}^{\infty}$, given by (A.6) (with $n=\infty$), is complete in $L^2(\mathbb{R},X)$; see,
e.g., [7], [3]; see also Remark A.3, below. Consider a function $g\in\mathcal{H}^n(X)$ for some fixed
$n\in\{1,2,\ldots\}$. Since $\mathcal{H}^n(X)\subseteq L^2(\mathbb{R},X)$, $g$ can be expanded as

$$g(x)\sim\sum_{k=0}^{\infty}\alpha_k\phi_k(x), \tag{2.1}$$

where $\alpha_k=\mathbb{E}\phi_k(X)g(X)$ are the Fourier coefficients of $g$. The series converges in the
norm of $L^2(\mathbb{R},X)$, that is, $\mathbb{E}[g(X)-\sum_{k=0}^{N}\alpha_k\phi_k(X)]^2\to 0$ as $N\to\infty$. Parseval's identity
shows that

$$\mathrm{Var}\,g(X)=\sum_{k=1}^{\infty}\alpha_k^2,\qquad g\in L^2(\mathbb{R},X). \tag{2.2}$$

On the other hand, since $g\in\mathcal{H}^n(X)$, (A.8) yields the expression

$$\alpha_k=\frac{\mathbb{E}q^k(X)g^{(k)}(X)}{\sqrt{k!\,c_k(\delta)\,\mathbb{E}q^k(X)}}\quad\text{for } k=1,2,\ldots,n,$$

where $c_k(\delta)=\prod_{j=k-1}^{2k-2}(1-j\delta)$, see (A.3), and $\mathbb{E}q^k(X)$ is given explicitly in (A.9). Thus,
in the particular case where $g\in\mathcal{H}^n(X)$, (2.2) produces the equivalent formula

$$\mathrm{Var}\,g(X)=\sum_{k=1}^{n}\frac{\mathbb{E}^2q^k(X)g^{(k)}(X)}{k!\,c_k(\delta)\,\mathbb{E}q^k(X)}+\sum_{k=n+1}^{\infty}\alpha_k^2,\qquad g\in\mathcal{H}^n(X). \tag{2.3}$$

Formally, one can differentiate term by term ($n$ times) the series (2.1) to get, in view
of Theorem A.5, the expansion

$$g^{(n)}(x)\sim\sum_{k=0}^{\infty}\alpha_{k+n}\phi_{k+n}^{(n)}(x)=\sum_{k=0}^{\infty}\nu_k^{(n)}\alpha_{k+n}\phi_{k,n}(x). \tag{2.4}$$

The constants $\nu_k^{(n)}=\nu_k^{(n)}(\mu;q)$ are given by (A.18) and $\{\phi_{k,n}\}_{k=0}^{\infty}$ is the orthonormal
polynomial system (with $\mathrm{lead}(\phi_{k,n})>0$) corresponding to $X_n$ with density
$f_n=q^nf/\mathbb{E}q^n(X)$; $\phi_{k,n}$ is a (positive) scalar multiple of the polynomial $P_{k,n}$ given in (A.16).
Now, if the expansion (2.4) were indeed correct in the $L^2(\mathbb{R},X_n)$-sense, then the completeness
of the system $\{\phi_{k,n}\}_{k=0}^{\infty}$ in $L^2(\mathbb{R},X_n)$ would result in the corresponding Parseval
identity:

$$\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2}{\mathbb{E}q^n(X)}=\mathbb{E}(g^{(n)}(X_n))^2=\sum_{k=0}^{\infty}\big(\nu_k^{(n)}\big)^2\alpha_{k+n}^2,\qquad g\in\mathcal{H}^n(X). \tag{2.5}$$

Finally, from (A.18) we have

$$\big(\nu_k^{(n)}\big)^2=\frac{(k+n)!}{k!\,\mathbb{E}q^n(X)}\prod_{j=k+n-1}^{k+2n-2}(1-j\delta).$$

A combination of the last equation with (2.5) yields the identity

$$\mathbb{E}q^n(X)(g^{(n)}(X))^2=\sum_{k=0}^{\infty}\frac{(k+n)!\prod_{j=k+n-1}^{k+2n-2}(1-j\delta)}{k!}\,\alpha_{k+n}^2=\sum_{k=n}^{\infty}\frac{k!\prod_{j=k-1}^{k+n-2}(1-j\delta)}{(k-n)!}\,\alpha_k^2. \tag{2.6}$$

This must be correct for all $g\in\mathcal{H}^n(X)$, provided that expansion (2.4) is valid. However,
the above arguments are heuristic; they are not sufficient even to conclude convergence of
the series (2.6) or (2.5). Notice that the same technicality appeared in Chernoff's (1981)
proof, although in this case the polynomials are the well-known Hermite (with derivatives
again Hermite, i.e., orthogonal to the same weight function, the normal density). Chernoff
overcame this difficulty by applying Weierstrass (uniform) approximations to $g$ in compact
intervals.

In the sequel we shall make the above arguments rigorous by applying a different
technique, in the spirit of Sturm-Liouville theory. In fact, we shall show more, namely,
that an initial segment of the Fourier coefficients for the $n$-th derivative of $g$, suggested by
(2.4), can be derived for any $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$ having a sufficient number of moments.
This result holds even if $\delta>0$, noting that if $\delta>0$ then $X$ possesses only a finite number
of moments. Specifically, the following result, which may have some interest in itself,
holds true.



Lemma 2.2. Assume that $X$ has density $f$, support $(\alpha,\omega)$, $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$ and $\mathbb{E}|X|^{2N}<\infty$
for some $N\ge 1$, i.e., $\delta<\frac{1}{2N-1}$. Let $\{\phi_k\}_{k=0}^{N}\subset L^2(\mathbb{R},X)$ be the orthonormal polynomial
system associated with $X$ (standardized by $\mathrm{lead}(\phi_k)>0$). Then, for every $x\in(\alpha,\omega)$,

$$q(x)f(x)\phi_k'(x)=-\lambda_k(\delta)\int_\alpha^x\phi_k(y)f(y)\,dy=\lambda_k(\delta)\int_x^\omega\phi_k(y)f(y)\,dy,\qquad k=1,2,\ldots,N, \tag{2.7}$$

where $\lambda_k(\delta):=k(1-(k-1)\delta)$. Moreover, if $g\in\mathcal{H}^n(X)$ for some $n\in\{1,2,\ldots,N\}$ then

$$\mathbb{E}\phi_{k,n}(X_n)g^{(n)}(X_n)=\nu_k^{(n)}\,\mathbb{E}\phi_{k+n}(X)g(X),\qquad k=0,1,\ldots,N-n, \tag{2.8}$$

where $X_n$ has density $f_n=q^nf/\mathbb{E}q^n(X)$,

$$\nu_k^{(n)}=\sqrt{\frac{(k+n)!\prod_{j=k+n-1}^{k+2n-2}(1-j\delta)}{k!\,\mathbb{E}q^n(X)}}$$

is given by (A.18) and $\{\phi_{k,n}\}_{k=0}^{N-n}\subset L^2(\mathbb{R},X_n)$ is the orthonormal polynomial system
corresponding to $X_n$, standardized by $\mathrm{lead}(\phi_{k,n})>0$.

Proof. From (1.4) it follows that

$$\frac{f'(x)}{f(x)}=\frac{\mu-x-q'(x)}{q(x)}=\frac{-(1+2\delta)x+(\mu-\beta)}{\delta x^2+\beta x+\gamma},\qquad \alpha<x<\omega.$$

Since each $\phi_k$ is a scalar multiple of the Rodrigues-type polynomial $h_k=D^k[q^kf]/f$ (because
$P_k=(-1)^kh_k$), Theorem 1 of Diaconis and Zabell (1991) (see, also, eq. (4.4) in [2])
implies that

$$[q(x)f(x)\phi_k'(x)]'=-\lambda_k(\delta)\phi_k(x)f(x),\qquad \alpha<x<\omega,\quad k=1,2,\ldots,N. \tag{2.9}$$

Fix $t$ and $x$ with $\alpha<t<x<\omega$ and integrate (2.9) over the interval $[t,x]$ to get

$$-\lambda_k(\delta)\int_t^x\phi_k(y)f(y)\,dy=q(x)f(x)\phi_k'(x)-q(t)f(t)\phi_k'(t);$$

thus, taking limits as $t\searrow\alpha$ we see that the l.h.s. converges to $-\lambda_k(\delta)\int_\alpha^x\phi_k(y)f(y)\,dy$, by
dominated convergence, while the r.h.s. tends to $q(x)f(x)\phi_k'(x)$ because, by Lemma A.2,
$\lim_{t\searrow\alpha}q(t)f(t)h(t)=0$ for any polynomial $h$ with $\deg(h)\le 2N-1$. This verifies the
first equality in (2.7), while the second one is obvious since $\mathbb{E}\phi_k(X)=0$ (because $\phi_k$ is
orthogonal to $\phi_0\equiv 1$).
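
Since $f'/f=(\mu-x-q')/q$, the left side of (2.9) expands to $f\,[q\phi_k''+(\mu-x)\phi_k']$, so (2.9) is equivalent to the polynomial identity $q\phi_k''+(\mu-x)\phi_k'=-\lambda_k(\delta)\phi_k$. The sketch below checks this identity for two classical cases: the Hermite polynomial $x^3-3x$ (standard normal, $\lambda_3(0)=3$) and the shifted Legendre polynomial $6x^2-6x+1$ (uniform on $(0,1)$, $\delta=-1/2$, $\lambda_2=3$). The function names and the coefficient-list representation are assumptions of this illustration, and normalising constants are ignored because the identity is linear in $\phi_k$.

```python
from fractions import Fraction

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists [c0, c1, ...]."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [i * c for i, c in enumerate(p)][1:]

def lam(k, delta):
    """Eigenvalue lambda_k(delta) = k (1 - (k - 1) delta) from Lemma 2.2."""
    return k * (1 - (k - 1) * delta)

def sl_residual(phi, k, mu, delta, beta, gamma):
    """Coefficients of q*phi'' + (mu - x)*phi' + lambda_k(delta)*phi; all zero if (2.9) holds."""
    q = [gamma, beta, delta]
    d1 = poly_diff(phi)
    d2 = poly_diff(d1)
    lhs = poly_add(poly_mul(q, d2), poly_mul([mu, -1], d1))
    return poly_add(lhs, [lam(k, delta) * c for c in phi])

# Hermite polynomial x^3 - 3x for N(0, 1) = IP(0; 0, 0, 1)
res_hermite = sl_residual([0, -3, 0, 1], 3, 0, 0, 0, 1)
# Shifted Legendre polynomial 6x^2 - 6x + 1 for U(0, 1) = IP(1/2; -1/2, 1/2, 0)
half = Fraction(1, 2)
res_legendre = sl_residual([1, -6, 6], 2, half, -half, half, 0)
print(res_hermite, res_legendre)
```

Both residual lists consist of zeros, which matches $\lambda_k(\delta)=k(1-(k-1)\delta)$ in these two cases.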

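Fix now an integer $k\in\{0,1,\ldots,N-1\}$. Observing that $\deg(q(x)x^{2k})\le 2k+2\le 2N$
we have $\mathbb{E}(X_1^k)^2=\mathbb{E}q(X)X^{2k}/\mathbb{E}q(X)<\infty$ and, thus, the Rodrigues-type polynomial $P_{k,1}$
belongs to $L^2(\mathbb{R},X_1)$. By Corollary 2.1, $\mathbb{E}(g'(X_1))^2$ is also finite. Indeed, $n\le N$ implies
that $\mathbb{E}|X|^{2n-1}<\infty$ so that $g\in\mathcal{H}^n(X)\subseteq\mathcal{H}^1(X)$ and, therefore,

$$\mathbb{E}(g'(X_1))^2=\frac{\mathbb{E}q(X)(g'(X))^2}{\mathbb{E}q(X)}<\infty,$$

by the fact that $g\in\mathcal{H}^1(X)$. Hence, the Fourier coefficient of $g'$ with respect to $\phi_{k,1}$,
$\mathbb{E}\phi_{k,1}(X_1)g'(X_1)$, is well-defined (and finite):

$$\mathbb{E}^2|\phi_{k,1}(X_1)g'(X_1)|\le\mathbb{E}(\phi_{k,1}(X_1))^2\,\mathbb{E}(g'(X_1))^2=\mathbb{E}(g'(X_1))^2<\infty.$$

Let $\rho_1<\rho_2<\cdots<\rho_m$ be the distinct roots of $\phi_{k+1}$ that lie in the interval $(\alpha,\omega)$.
Clearly, $1\le m\le k+1$ because $\mathbb{E}\phi_{k+1}(X)=0$ and $\deg(\phi_{k+1})=k+1$. Fix now a number
$\rho\in[\rho_1,\rho_m]\subset(\alpha,\omega)$. From (A.19) we see that $\phi_{k,1}(x)=\phi_{k+1}'(x)/\nu_k^{(1)}$ where
$\nu_k^{(1)}=\sqrt{(k+1)(1-k\delta)/\mathbb{E}q(X)}$. Therefore, using (2.7), we have

$$\begin{aligned}
\mathbb{E}\phi_{k,1}(X_1)g'(X_1)&=\frac{1}{\nu_k^{(1)}\mathbb{E}q(X)}\int_\alpha^\omega g'(x)q(x)f(x)\phi_{k+1}'(x)\,dx\\
&=\frac{\lambda_{k+1}(\delta)}{\nu_k^{(1)}\mathbb{E}q(X)}\left[-\int_\alpha^\rho g'(x)\int_\alpha^x f(y)\phi_{k+1}(y)\,dy\,dx+\int_\rho^\omega g'(x)\int_x^\omega f(y)\phi_{k+1}(y)\,dy\,dx\right].
\end{aligned}$$

Observing that

$$\frac{\lambda_{k+1}(\delta)}{\nu_k^{(1)}\mathbb{E}q(X)}=\frac{(k+1)(1-k\delta)}{\mathbb{E}q(X)\sqrt{(k+1)(1-k\delta)/\mathbb{E}q(X)}}=\nu_k^{(1)},$$

the preceding equation can be rewritten as

$$\mathbb{E}\phi_{k,1}(X_1)g'(X_1)=\nu_k^{(1)}(I_2-I_1), \tag{2.10}$$

where

$$I_1:=\int_\alpha^\rho g'(x)\int_\alpha^x f(y)\phi_{k+1}(y)\,dy\,dx,\qquad I_2:=\int_\rho^\omega g'(x)\int_x^\omega f(y)\phi_{k+1}(y)\,dy\,dx. \tag{2.11}$$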

Now, we wish to change the order of integration in both integrals $I_1$ and $I_2$. To this end,
for $I_2$ it suffices to show that

$$I_2^*:=\int_\rho^\omega|g'(x)|\int_x^\omega f(y)|\phi_{k+1}(y)|\,dy\,dx<\infty. \tag{2.12}$$

Similarly, for $I_1$ it suffices to show that $I_1^*:=\int_\alpha^\rho|g'(x)|\int_\alpha^x f(y)|\phi_{k+1}(y)|\,dy\,dx<\infty$. We now
proceed to verify (2.12). Write $I_2^*=I_{21}^*+I_{22}^*$ where

$$I_{21}^*:=\int_\rho^{\rho_m}|g'(x)|\int_x^\omega f(y)|\phi_{k+1}(y)|\,dy\,dx,\qquad I_{22}^*:=\int_{\rho_m}^\omega|g'(x)|\int_x^\omega f(y)|\phi_{k+1}(y)|\,dy\,dx.$$



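Since the polynomial $\phi_{k+1}$ does not change sign in the interval $(\rho_m,\omega)$, we can define the
constant $\pi$ as

$$\pi:=\mathrm{sign}(\phi_{k+1}(x))\in\{-1,1\},\qquad \rho_m<x<\omega.$$

Then, $\pi\phi_{k+1}(x)=|\phi_{k+1}(x)|$ holds for all $x\in(\rho_m,\omega)$ and from (2.7) we get

$$\begin{aligned}
I_{22}^*&=\pi\int_{\rho_m}^\omega|g'(x)|\int_x^\omega f(y)\phi_{k+1}(y)\,dy\,dx=\frac{\pi}{\lambda_{k+1}(\delta)}\int_{\rho_m}^\omega|g'(x)|q(x)f(x)\phi_{k+1}'(x)\,dx\\
&\le\frac{1}{\lambda_{k+1}(\delta)}\int_{\rho_m}^\omega|g'(x)|q(x)f(x)|\phi_{k+1}'(x)|\,dx\le\frac{1}{\lambda_{k+1}(\delta)}\int_\alpha^\omega|g'(x)|q(x)f(x)|\phi_{k+1}'(x)|\,dx\\
&=\frac{1}{\nu_k^{(1)}}\,\mathbb{E}|\phi_{k,1}(X_1)g'(X_1)|<\infty.
\end{aligned}$$

This shows that $I_{22}^*<\infty$. On the other hand, the function $x\mapsto q(x)f(x)$ is strictly positive
and continuous for $x$ in the compact interval $[\rho,\rho_m]\subset(\alpha,\omega)$, so that
$\theta:=\min\{q(x)f(x):\rho\le x\le\rho_m\}>0$. Then, from the fact that $g\in\mathcal{H}^1(X)$, we get

$$\int_\rho^{\rho_m}|g'(x)|\,dx\le\frac1\theta\int_\rho^{\rho_m}q(x)f(x)|g'(x)|\,dx\le\frac1\theta\,\mathbb{E}q(X)|g'(X)|\le\frac1\theta\sqrt{\mathbb{E}q(X)\,\mathbb{E}q(X)(g'(X))^2}<\infty.$$

Moreover, for any $u_1,u_2$ with $\alpha\le u_1\le u_2\le\omega$ it is readily seen that

$$\int_{u_1}^{u_2}|\phi_{k+1}(y)|f(y)\,dy\le\int_\alpha^\omega|\phi_{k+1}(y)|f(y)\,dy=\mathbb{E}|\phi_{k+1}(X)|:=M_{k+1}<\infty.$$

Combining the above we conclude that

$$I_{21}^*=\int_\rho^{\rho_m}|g'(x)|\int_x^\omega f(y)|\phi_{k+1}(y)|\,dy\,dx\le M_{k+1}\int_\rho^{\rho_m}|g'(x)|\,dx<\infty.$$

Therefore, $I_2^*=I_{21}^*+I_{22}^*<\infty$ and (2.12) follows. Using similar arguments it is shown that
$I_1^*<\infty$. Thus, we can indeed interchange the order of integration in both integrals $I_1$ and
$I_2$ of (2.11). It follows that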

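$$I_2=\int_\rho^\omega f(y)\phi_{k+1}(y)\int_\rho^y g'(x)\,dx\,dy=\int_\rho^\omega f(y)\phi_{k+1}(y)g(y)\,dy-g(\rho)\int_\rho^\omega f(y)\phi_{k+1}(y)\,dy$$

and, similarly,

$$I_1=g(\rho)\int_\alpha^\rho f(y)\phi_{k+1}(y)\,dy-\int_\alpha^\rho f(y)\phi_{k+1}(y)g(y)\,dy.$$

Taking into account the fact that $\int_\alpha^\omega f(y)\phi_{k+1}(y)\,dy=\mathbb{E}\phi_{k+1}(X)=0$, we get

$$I_2-I_1=\int_\alpha^\omega f(y)\phi_{k+1}(y)g(y)\,dy-g(\rho)\int_\alpha^\omega f(y)\phi_{k+1}(y)\,dy=\mathbb{E}\phi_{k+1}(X)g(X).$$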


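Finally, from (2.10) we conclude that

$$\mathbb{E}\phi_{k,1}(X_1)g'(X_1)=\sqrt{\frac{(k+1)(1-k\delta)}{\mathbb{E}q(X)}}\;\mathbb{E}\phi_{k+1}(X)g(X),\qquad k=0,1,\ldots,N-1. \tag{2.13}$$

So far we have shown that $g\in\mathcal{H}^n(X)$ and $\mathbb{E}|X|^{2N}<\infty$ for some $N\ge n$ imply that
$g\in\mathcal{H}^1(X)$ and that (2.13) is fulfilled. Assume now that for some $i\in\{1,2,\ldots,n-1\}$ we
have shown that $g\in\mathcal{H}^i(X)$ and that for every $k\in\{0,1,\ldots,N-i\}$,

$$\mathbb{E}\phi_{k,i}(X_i)g^{(i)}(X_i)=\sqrt{\frac{(k+i)!\prod_{j=k+i-1}^{k+2i-2}(1-j\delta)}{k!\,\mathbb{E}q^i(X)}}\;\mathbb{E}\phi_{k+i}(X)g(X). \tag{2.14}$$

Clearly we can apply (2.13) for $\tilde g=g^{(i)}$, $\tilde X=X_i$ and for $k=0,1,\ldots,\tilde N-1$, provided that
$\mathbb{E}|X_i|^{2\tilde N}<\infty$. Observing that $\mathbb{E}|X_i|^{2\tilde N}=\mathbb{E}q^i(X)|X|^{2\tilde N}/\mathbb{E}q^i(X)$, it follows that $\tilde N=N-i$ is a suitable
choice. Therefore, for $k=0,1,\ldots,N-i-1$, (2.13) yields

$$\mathbb{E}\phi_{k,i+1}(X_{i+1})g^{(i+1)}(X_{i+1})=\sqrt{\frac{(k+1)(1-k\delta_i)}{\mathbb{E}q_i(X_i)}}\;\mathbb{E}\phi_{k+1,i}(X_i)g^{(i)}(X_i),$$

where $\delta_i=\frac{\delta}{1-2i\delta}$, $q_i(x)=\frac{q(x)}{1-2i\delta}$ (see Theorem A.3) and, thus,

$$\mathbb{E}q_i(X_i)=\frac{\mathbb{E}q(X_i)}{1-2i\delta}=\frac{\mathbb{E}q^{i+1}(X)}{(1-2i\delta)\,\mathbb{E}q^i(X)}.$$

Finally, calculating $\mathbb{E}\phi_{k+1,i}(X_i)g^{(i)}(X_i)$ from (2.14) (for $k=0,1,\ldots,N-i-1$), and noting that $(1-k\delta_i)(1-2i\delta)=1-(k+2i)\delta$, we see that

$$\mathbb{E}\phi_{k,i+1}(X_{i+1})g^{(i+1)}(X_{i+1})=\sqrt{\frac{(k+i+1)!\prod_{j=k+i}^{k+2i}(1-j\delta)}{k!\,\mathbb{E}q^{i+1}(X)}}\;\mathbb{E}\phi_{k+i+1}(X)g(X),$$

which verifies the inductive step and shows that (2.14) holds for all $i\in\{1,2,\ldots,n\}$.
Letting $i=n$ in (2.14) completes the proof. □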

3 The strengthened inequality

In the present section we deal with the first three types of the integrated Pearson system,
corresponding to $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$ with $\delta\le 0$. These are the well-known Normal,
Gamma and Beta random variables and their affine transformations; see [2], Table 2.1.
In this case the orthonormal polynomial system $\{\phi_k\}_{k=0}^{\infty}$ is complete in $L^2(\mathbb{R},X)$ and,
therefore, the following result holds.



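Lemma 3.1. If $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$ with $\delta\le 0$, then

$$\mathrm{Var}\,g(X)=\sum_{k=1}^{\infty}\alpha_k^2\quad\text{for any } g\in L^2(\mathbb{R},X), \tag{3.1}$$

where

$$\alpha_k=\mathbb{E}\phi_k(X)g(X),\qquad k=0,1,2,\ldots, \tag{3.2}$$

are the Fourier coefficients of $g$ with respect to the orthonormal polynomial system $\{\phi_k\}_{k=0}^{\infty}$.
If, furthermore, $g\in\mathcal{H}^n(X)$ for some $n\in\{1,2,\ldots\}$, then

$$\alpha_k=\mathbb{E}\phi_k(X)g(X)=\frac{\mathbb{E}q^k(X)g^{(k)}(X)}{\sqrt{k!\,\mathbb{E}q^k(X)\prod_{j=k-1}^{2k-2}(1-j\delta)}},\qquad k=1,2,\ldots,n, \tag{3.3}$$

and

$$\mathbb{E}q^n(X)(g^{(n)}(X))^2=\sum_{k=n}^{\infty}\frac{k!}{(k-n)!}\prod_{j=k-1}^{k+n-2}(1-j\delta)\,\alpha_k^2, \tag{3.4}$$

with $\alpha_k$ given by (3.2).

Proof. (3.1) is the well-known Parseval's identity. Also, if $g\in\mathcal{H}^n(X)$ then, by Corollary
2.1, $g\in\mathcal{H}^k(X)$ for all $k\in\{0,1,\ldots,n\}$. Therefore, the Cauchy-Schwarz inequality shows
that $\mathbb{E}^2q^k(X)|g^{(k)}(X)|\le\mathbb{E}q^k(X)\,\mathbb{E}q^k(X)(g^{(k)}(X))^2<\infty$. Hence, (3.3) follows from (A.4)
(see Theorem A.2) and the fact that the polynomials $P_k(x):=(-1)^kD^k[q^k(x)f(x)]/f(x)$
are related to $\phi_k$ by $P_k(x)=\phi_k(x)\sqrt{k!\,\mathbb{E}q^k(X)\prod_{j=k-1}^{2k-2}(1-j\delta)}$ for all $k\in\{1,2,\ldots\}$. Moreover,
by Lemma 2.2 we have that for any $g\in\mathcal{H}^n(X)$, the Fourier coefficients
$\alpha_k=\mathbb{E}\phi_k(X)g(X)$ (of $g$ with respect to $X$) and the Fourier coefficients $\alpha_k^{(n)}:=\mathbb{E}\phi_{k,n}(X_n)g^{(n)}(X_n)$
of $g^{(n)}$ with respect to $X_n$ are related through

$$\alpha_k^{(n)}=\sqrt{\frac{(k+n)!\prod_{j=k+n-1}^{k+2n-2}(1-j\delta)}{k!\,\mathbb{E}q^n(X)}}\;\alpha_{k+n},\qquad k=0,1,2,\ldots,$$

where $\mathbb{E}q^n(X)$ is given explicitly by (A.9). Finally, Theorem A.3 asserts that
$X_n\sim\mathrm{IP}(\mu_n;\delta_n,\beta_n,\gamma_n)$ with $\delta_n=\frac{\delta}{1-2n\delta}\le 0$.
Hence, $\delta_n\le 0$ guarantees that the corresponding orthonormal polynomial system $\{\phi_{k,n}\}_{k=0}^{\infty}$
is complete in $L^2(\mathbb{R},X_n)$. Since $g\in\mathcal{H}^n(X)$, $g^{(n)}\in L^2(\mathbb{R},X_n)$ and, by Parseval's identity,

$$\mathbb{E}(g^{(n)}(X_n))^2=\sum_{k=0}^{\infty}\big(\alpha_k^{(n)}\big)^2=\frac{1}{\mathbb{E}q^n(X)}\sum_{k=n}^{\infty}\frac{k!}{(k-n)!}\prod_{j=k-1}^{k+n-2}(1-j\delta)\,\alpha_k^2$$

(thus, the series converges). Observing that

$$\mathbb{E}(g^{(n)}(X_n))^2=\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2}{\mathbb{E}q^n(X)},$$

(3.4) is deduced and the proof is complete. □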



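We are now in a position to state and prove the main result of the paper.

Theorem 3.1. If $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$ with $\delta\le 0$ and if $g\in\mathcal{H}^n(X)$ for some $n\in\{1,2,\ldots\}$
then

$$\mathrm{Var}\,g(X)\le\sum_{k=1}^{n}\frac{\mathbb{E}^2q^k(X)g^{(k)}(X)}{k!\,\mathbb{E}q^k(X)\prod_{j=k-1}^{2k-2}(1-j\delta)}+\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2-\frac{1}{\mathbb{E}q^n(X)}\mathbb{E}^2q^n(X)g^{(n)}(X)}{(n+1)!\prod_{j=n}^{2n-1}(1-j\delta)}, \tag{3.5}$$

with equality if and only if $g$ is a polynomial of degree at most $n+1$. In particular, if $\sigma^2=\mathrm{Var}\,X$
and $g$ is absolutely continuous with a.s. derivative $g'$ such that $\mathbb{E}q(X)(g'(X))^2<\infty$
(that is, $g\in\mathcal{H}^1(X)$) then

$$\mathrm{Var}\,g(X)\le\Big(1-\frac{1}{2(1-\delta)}\Big)\frac{1}{\sigma^2}\,\mathbb{E}^2q(X)g'(X)+\frac{1}{2(1-\delta)}\,\mathbb{E}q(X)(g'(X))^2, \tag{3.6}$$

with equality if and only if $g$ is a polynomial of degree at most two.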

Three examples of (3.6) are as follows:

Example 3.1. If $X\sim N(\mu,\sigma^2)=\mathrm{IP}(\mu;0,0,\sigma^2)$ then $\delta=0$, $q(x)=\sigma^2$ and we obtain
the inequality

$$\mathrm{Var}\,g(X)\le\tfrac12\sigma^2\,\mathbb{E}^2g'(X)+\tfrac12\sigma^2\,\mathbb{E}(g'(X))^2, \tag{3.7}$$

in which the equality holds if and only if $g$ is a polynomial of degree at most two. Chernoff's
upper bound, $\mathrm{Var}\,g(X)\le\sigma^2\mathbb{E}(g'(X))^2$, is strictly weaker than (3.7) since, obviously,
$\mathbb{E}^2g'(X)\le\mathbb{E}(g'(X))^2$, and the equality holds if and only if $g$ is linear. It should be
noted that $\sigma^2\mathbb{E}^2g'(X)$ is, actually, a lower bound for $\mathrm{Var}\,g(X)$; see, e.g., [10].

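Example 3.2. If $X\sim\Gamma(a,\lambda)=\mathrm{IP}(a/\lambda;0,1/\lambda,0)$ then $\delta=0$, $q(x)=x/\lambda$, $\sigma^2=a/\lambda^2$
and we obtain the inequality

$$\mathrm{Var}\,g(X)\le\frac{1}{2a}\,\mathbb{E}^2Xg'(X)+\frac{1}{2\lambda}\,\mathbb{E}X(g'(X))^2, \tag{3.8}$$

in which the equality holds if and only if $g$ is a polynomial of degree at most two.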
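
The Gamma case lends itself to an exact check, because $\mathbb{E}X^r=a(a+1)\cdots(a+r-1)/\lambda^r$ for a Gamma variable with shape $a$ and rate $\lambda$. The sketch below evaluates, for a polynomial $g$, the variance, the lower bound $\sigma^{-2}\mathbb{E}^2q(X)g'(X)$, the bound $\frac{1}{2a}\mathbb{E}^2Xg'(X)+\frac{1}{2\lambda}\mathbb{E}X(g'(X))^2$ and the Chernoff-type bound $\mathbb{E}q(X)(g'(X))^2$. Function names and the coefficient-list representation are illustrative assumptions.

```python
from fractions import Fraction

def gamma_moment(r, a, lam):
    """E[X^r] for X ~ Gamma(shape a, rate lam): a(a+1)...(a+r-1) / lam^r."""
    out = Fraction(1)
    for j in range(r):
        out *= Fraction(a + j) / Fraction(lam)
    return out

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, c in enumerate(p):
        for j, d in enumerate(q):
            out[i + j] += c * d
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [i * c for i, c in enumerate(p)][1:]

def gamma_bounds(g, a, lam):
    """Return (Var g(X), lower bound, strengthened bound, Chernoff-type bound)."""
    def expect(p):
        return sum((c * gamma_moment(i, a, lam) for i, c in enumerate(p)), Fraction(0))
    dg = poly_diff(g)
    x_dg = poly_mul([0, 1], dg)      # x g'(x)
    x_dg2 = poly_mul(x_dg, dg)       # x (g'(x))^2
    var = expect(poly_mul(g, g)) - expect(g) ** 2
    lower = expect(x_dg) ** 2 / a    # E^2[q g'] / sigma^2 with q = x/lam, sigma^2 = a/lam^2
    chernoff = expect(x_dg2) / lam   # E[q (g')^2]
    strengthened = expect(x_dg) ** 2 / (2 * a) + expect(x_dg2) / (2 * lam)
    return var, lower, strengthened, chernoff

# g(x) = x^3 with shape a = 2 and rate lam = 1
print(gamma_bounds([0, 0, 0, 1], 2, 1))
```

For $g(x)=x^3$, $a=2$, $\lambda=1$ the four values are 4464, 2592, 4536 and 6480, in increasing order as expected except that the variance sits between the lower and the strengthened bound. For $g(x)=x^2$ the strengthened bound coincides with the variance.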
Example 3.3. If $X\sim B(a,b)=\mathrm{IP}\big(\frac{a}{a+b};\frac{-1}{a+b},\frac{1}{a+b},0\big)$ then $\delta=\frac{-1}{a+b}$, $q(x)=\frac{x(1-x)}{a+b}$,
$\sigma^2=\frac{ab}{(a+b)^2(a+b+1)}$ and we obtain the inequality

$$\mathrm{Var}\,g(X)\le\frac{a+b+2}{2ab}\,\mathbb{E}^2X(1-X)g'(X)+\frac{1}{2(a+b+1)}\,\mathbb{E}X(1-X)(g'(X))^2, \tag{3.9}$$

in which the equality holds if and only if $g$ is a polynomial of degree at most two. In the
particular case where $a=b=1$, $X=U$ is uniformly distributed over the interval $(0,1)$
and (3.9) yields an improvement of Polya's inequality (see, e.g., [4]). Indeed, we get

$$\int_0^1g^2(x)\,dx-\Big(\int_0^1g(x)\,dx\Big)^2\le 2\Big(\int_0^1x(1-x)g'(x)\,dx\Big)^2+\frac16\int_0^1x(1-x)(g'(x))^2\,dx,$$

and the upper bound is smaller than Polya's bound because, by the Cauchy-Schwarz inequality,

$$\Big(\int_0^1x(1-x)g'(x)\,dx\Big)^2\le\int_0^1x(1-x)\,dx\int_0^1x(1-x)(g'(x))^2\,dx=\frac16\int_0^1x(1-x)(g'(x))^2\,dx.$$
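
For the uniform case everything reduces to integrals of polynomials over $(0,1)$, so the comparison can be carried out exactly. The sketch below returns the variance of $g(U)$, the bound $2(\int_0^1x(1-x)g')^2+\frac16\int_0^1x(1-x)(g')^2$ and the quantity $\frac12\int_0^1x(1-x)(g')^2$, which is what the Chernoff-type bound $\mathbb{E}q(U)(g'(U))^2$ equals when $q(x)=x(1-x)/2$. The function names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, c in enumerate(p):
        for j, d in enumerate(q):
            out[i + j] += c * d
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [i * c for i, c in enumerate(p)][1:]

def integral01(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum((Fraction(c) / (i + 1) for i, c in enumerate(p)), Fraction(0))

def uniform_bounds(g):
    """Return (Var g(U), strengthened bound, (1/2) * int x(1-x) g'(x)^2 dx)."""
    dg = poly_diff(g)
    w = [0, 1, -1]                                 # x(1 - x)
    a = integral01(poly_mul(w, dg))                # int x(1-x) g'
    b = integral01(poly_mul(poly_mul(w, dg), dg))  # int x(1-x) (g')^2
    var = integral01(poly_mul(g, g)) - integral01(g) ** 2
    return var, 2 * a ** 2 + b / 6, b / 2

# g(x) = x^3
print(uniform_bounds([0, 0, 0, 1]))
```

For $g(x)=x^3$ this gives $9/112\le 113/1400\le 3/28$, and for $g(x)=x^2$ the first two values coincide at $4/45$.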

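Remark 3.1. In [11], [22] it was shown that $\mathrm{Var}\,g(X)\le\mathbb{E}q(X)(g'(X))^2$; the equality in
this Chernoff-type variance bound is attained only by linear functions $g$. Also, in [10], [22]
it was shown that $\mathrm{Var}\,g(X)\ge\frac{1}{\sigma^2}\mathbb{E}^2q(X)g'(X)$, in which the equality characterizes again
the linear functions. We observe that the upper bound in (3.6) is a convex combination
of the preceding lower and upper bounds and, thus, smaller than the Chernoff-type upper
bound, $\mathbb{E}q(X)(g'(X))^2$. Also, the last term in the upper bound (3.5) can be rewritten as

$$\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2-\frac{1}{\mathbb{E}q^n(X)}\mathbb{E}^2q^n(X)g^{(n)}(X)}{(n+1)!\prod_{j=n}^{2n-1}(1-j\delta)}=\frac{\mathbb{E}q^n(X)}{(n+1)!\prod_{j=n}^{2n-1}(1-j\delta)}\,\mathrm{Var}\,g^{(n)}(X_n).$$

Thus, we can apply the Chernoff-type upper bound to $\mathrm{Var}\,g^{(n)}(X_n)$, provided that
$g^{(n)}\in\mathcal{H}^1(X_n)$. Recall that $g^{(n)}\in\mathcal{H}^1(X_n)$ means that $g^{(n)}$ is absolutely continuous with a.s.
derivative $g^{(n+1)}$ such that $\mathbb{E}q_n(X_n)(g^{(n+1)}(X_n))^2<\infty$. Since $X_n\sim f_n=q^nf/\mathbb{E}q^n(X)$,
$\delta\le 0$ and $q_n(x)=q(x)/(1-2n\delta)$, the preceding requirement is equivalent to

$$\frac{\mathbb{E}q^{n+1}(X)(g^{(n+1)}(X))^2}{(1-2n\delta)\,\mathbb{E}q^n(X)}<\infty;$$

thus, $g^{(n)}\in\mathcal{H}^1(X_n)$ if and only if $g\in\mathcal{H}^{n+1}(X)$. Therefore, if $g\in\mathcal{H}^{n+1}(X)$ then we
have

$$\mathrm{Var}\,g^{(n)}(X_n)\le\mathbb{E}q_n(X_n)(g^{(n+1)}(X_n))^2=\frac{\mathbb{E}q^{n+1}(X)(g^{(n+1)}(X))^2}{(1-2n\delta)\,\mathbb{E}q^n(X)},$$

with equality if and only if $g^{(n)}$ is linear, that is, $g$ is a polynomial of degree at most $n+1$.
The preceding inequality shows that for any $g\in\mathcal{H}^{n+1}(X)$,

$$\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2-\frac{1}{\mathbb{E}q^n(X)}\mathbb{E}^2q^n(X)g^{(n)}(X)}{(n+1)!\prod_{j=n}^{2n-1}(1-j\delta)}\le\frac{\mathbb{E}q^{n+1}(X)(g^{(n+1)}(X))^2}{(n+1)!\prod_{j=n}^{2n}(1-j\delta)},$$

with equality only for polynomial $g$ of degree at most $n+1$. Combining the upper bound
in (3.5) with the last displayed inequality we obtain (writing $n$ in place of $n+1$) the weaker bound

$$\mathrm{Var}\,g(X)\le\sum_{k=1}^{n-1}\frac{\mathbb{E}^2q^k(X)g^{(k)}(X)}{k!\,\mathbb{E}q^k(X)\prod_{j=k-1}^{2k-2}(1-j\delta)}+\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2}{n!\prod_{j=n-1}^{2n-2}(1-j\delta)}, \tag{3.10}$$

which holds for any $g\in\mathcal{H}^n(X)$, and the equality is attained if and only if $g$ is a polynomial
of degree at most $n$. For $n=1$ this is the Chernoff-type variance bound. Also, for
$X\sim B(a,b)$, (3.10) has been shown by Wei and Zhang (2009), using Jacobi polynomials.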



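Proof of Theorem 3.1. From (3.1) and (3.3),

$$\mathrm{Var}\,g(X)=\sum_{k=1}^{n}\frac{\mathbb{E}^2q^k(X)g^{(k)}(X)}{k!\,\mathbb{E}q^k(X)\prod_{j=k-1}^{2k-2}(1-j\delta)}+\sum_{k=n+1}^{\infty}\alpha_k^2, \tag{3.11}$$

with $\alpha_k$ given by (3.2). Also, from (3.3) with $k=n$,

$$\frac{1}{\mathbb{E}q^n(X)}\,\mathbb{E}^2q^n(X)g^{(n)}(X)=n!\prod_{j=n-1}^{2n-2}(1-j\delta)\,\alpha_n^2.$$

Thus, in view of (3.4),

$$\mathbb{E}q^n(X)(g^{(n)}(X))^2-\frac{1}{\mathbb{E}q^n(X)}\,\mathbb{E}^2q^n(X)g^{(n)}(X)=\sum_{k=n+1}^{\infty}\frac{k!}{(k-n)!}\prod_{j=k-1}^{k+n-2}(1-j\delta)\,\alpha_k^2.$$

Therefore,

$$\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2-\frac{1}{\mathbb{E}q^n(X)}\mathbb{E}^2q^n(X)g^{(n)}(X)}{(n+1)!\prod_{j=n}^{2n-1}(1-j\delta)}=\sum_{k=n+1}^{\infty}\frac{k!\prod_{j=k-1}^{k+n-2}(1-j\delta)}{(k-n)!\,(n+1)!\prod_{j=n}^{2n-1}(1-j\delta)}\,\alpha_k^2=\alpha_{n+1}^2+\sum_{k=n+2}^{\infty}\theta_k\alpha_k^2,$$

where

$$\theta_k:=\frac{1}{n+1}\binom{k}{n}\frac{\prod_{j=k-1}^{k+n-2}(1-j\delta)}{\prod_{j=n}^{2n-1}(1-j\delta)}.$$

The sequence $\{\theta_k\}_{k=n+2}^{\infty}$ is nondecreasing in $k$. Indeed, since $\delta\le 0$, we have

$$1\le 1-\delta\le 1-2\delta\le 1-3\delta\le\cdots$$

and thus, $k\mapsto\prod_{j=k-1}^{k+n-2}(1-j\delta)$ is nondecreasing in $k$ and positive (for each $k$ the product
contains $n$ positive factors). Also,

$$\frac{1}{n+1}\binom{k}{n}$$

is, obviously, positive and nondecreasing in $k$. Thus, for every $k\ge n+2$,

$$\theta_k\ge\theta_{n+2}=\Big(1+\frac n2\Big)\Big(1-\frac{n\delta}{1-n\delta}\Big)>1,$$

because $1+n/2>1$ and $1-n\delta/(1-n\delta)\ge 1$ (since $\delta\le 0$). It follows that

$$\frac{\mathbb{E}q^n(X)(g^{(n)}(X))^2-\frac{1}{\mathbb{E}q^n(X)}\mathbb{E}^2q^n(X)g^{(n)}(X)}{(n+1)!\prod_{j=n}^{2n-1}(1-j\delta)}\ge\alpha_{n+1}^2+\alpha_{n+2}^2+\cdots, \tag{3.12}$$

with equality if and only if $\alpha_{n+2}=\alpha_{n+3}=\cdots=0$, that is, if and only if $g$ is a polynomial
of degree at most $n+1$. A combination of (3.11) and (3.12) completes the proof. □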
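
The monotonicity argument for the weights can be illustrated numerically. The sketch below assumes that the weights in the proof are $\theta_k=\frac{1}{n+1}\binom{k}{n}\prod_{j=k-1}^{k+n-2}(1-j\delta)\big/\prod_{j=n}^{2n-1}(1-j\delta)$, which is the form obtained by dividing the coefficients of (3.4) by $(n+1)!\prod_{j=n}^{2n-1}(1-j\delta)$. It checks that $\theta_{n+1}=1$, that $\theta_{n+2}=(1+n/2)(1-n\delta/(1-n\delta))$ and that the sequence is nondecreasing for a few values of $\delta\le 0$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def prod_range(lo, hi, delta):
    """Product of (1 - j*delta) for j = lo, ..., hi (an empty product equals 1)."""
    out = Fraction(1)
    for j in range(lo, hi + 1):
        out *= 1 - j * delta
    return out

def theta(k, n, delta):
    """Weight of alpha_k^2 (k >= n+1) after normalising by (n+1)! prod_{j=n}^{2n-1}(1 - j*delta)."""
    num = comb(k, n) * prod_range(k - 1, k + n - 2, delta)
    den = (n + 1) * prod_range(n, 2 * n - 1, delta)
    return num / den

def theta_first(n, delta):
    """Closed form of theta_{n+2} used in the proof."""
    return (1 + Fraction(n, 2)) * (1 - n * delta / (1 - n * delta))

n, delta = 2, Fraction(-1, 2)
print([theta(k, n, delta) for k in range(n + 1, n + 6)])
```

With $n=2$ and $\delta=-1/2$ the first weights are $1, 3, \ldots$, so every coefficient beyond $\alpha_{n+1}^2$ is multiplied by a factor strictly larger than one, which is what forces equality only for polynomials of degree at most $n+1$.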



Remark 3.2. The upper bound in (3.5) is meaningful (it is nonnegative and makes sense)
even for $0<\delta<\frac{1}{2n-1}$, in which case $\mathbb{E}|X|^{2n}<\infty$. Also, since $x^{n+1}\in L^2(\mathbb{R},X)$ if and only
if $\delta<\frac{1}{2n+1}$, it would be desirable to show the validity of (3.5) at least when $0<\delta<\frac{1}{2n+1}$.
For example, we have tried, without success, to prove (3.6) when $0<\delta<\frac13$. In contrast
to the corresponding Chernoff-type bound, which can be shown directly (without Fourier
expansions; see, e.g., [13]; cf. Lemma 2.1, above), it seems that the completeness of the
corresponding orthonormal polynomial system in $L^2(\mathbb{R},X)$ plays a crucial role in proving
(3.6).



A Appendix

Proposition A.1 ([2], Proposition 2.1). Let $X\sim\mathrm{IP}(\mu;q)$ and set $(\alpha,\omega):=(\operatorname{ess\,inf}(X),\operatorname{ess\,sup}(X))$. Then,
there is a version $f$ of the density of $X$ such that

(i) $f(x)$ is strictly positive for $x$ in $(\alpha,\omega)$ and zero otherwise, i.e., $\{x:f(x)>0\}=(\alpha,\omega)$;

(ii) $f\in C^\infty(\alpha,\omega)$, that is, $f$ has derivatives of any order in $(\alpha,\omega)$;

(iii) $X$ is a (usual) Pearson random variable supported in $(\alpha,\omega)$, that is, $f'(x)/f(x)=p_1(x)/q(x)$, $x\in(\alpha,\omega)$, where $p_1(x)=\mu-x-q'(x)$ is a polynomial of degree at most one;

(iv) $q(x)=\delta x^2+\beta x+\gamma>0$ for all $x\in(\alpha,\omega)$;

(v) if $\alpha>-\infty$ then $q(\alpha)=0$ and, similarly, if $\omega<+\infty$ then $q(\omega)=0$;

(vi) for any $\theta,c\in\mathbb{R}$ with $\theta\ne 0$, the random variable $\tilde X:=\theta X+c\sim\mathrm{IP}(\tilde\mu;\tilde q)$ with $\tilde\mu=\theta\mu+c$ and
$\tilde q(x)=\theta^2q((x-c)/\theta)$.

Lemma A.1 ([2], Corollary 2.2). Assume that $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$.
(i) If $\delta\le 0$ then $\mathbb{E}|X|^\theta<\infty$ for any $\theta\in[0,\infty)$.
(ii) If $\delta>0$ then $\mathbb{E}|X|^\theta<\infty$ for any $\theta\in[0,1+1/\delta)$, while $\mathbb{E}|X|^{1+1/\delta}=\infty$.

Lemma A.2 ([2], Lemma 2.1). If $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)=\mathrm{IP}(\mu;q)$ has support $(\alpha,\omega)$ and $\mathbb{E}|X|^n<\infty$ for some
$n\ge 1$ (equivalently, $\delta<1/(n-1)$) then for any polynomial $Q_{n-1}$ of degree at most $n-1$,

$$\lim_{x\nearrow\omega}q(x)f(x)Q_{n-1}(x)=\lim_{x\searrow\alpha}q(x)f(x)Q_{n-1}(x)=0. \tag{A.1}$$

Theorem A.1 ([16], p. 401; [6], pp. 99-100; [15], p. 295; [2], Theorem 4.1). Assume that $f$ is the density
of a random variable $X\sim\mathrm{IP}(\mu;q)=\mathrm{IP}(\mu;\delta,\beta,\gamma)$ with support $(\alpha,\omega)$. Then, the functions $P_k:(\alpha,\omega)\to\mathbb{R}$
with

$$P_k(x):=\frac{(-1)^k}{f(x)}\frac{d^k}{dx^k}\big[q^k(x)f(x)\big],\qquad \alpha<x<\omega,\quad k=0,1,2,\ldots \tag{A.2}$$

are (Rodrigues-type) polynomials with

$$\deg(P_k)\le k\quad\text{and}\quad\mathrm{lead}(P_k)=\prod_{j=k-1}^{2k-2}(1-j\delta):=c_k(\delta),\qquad k=0,1,2,\ldots, \tag{A.3}$$

where $\mathrm{lead}(P_k)$ is the coefficient of $x^k$ in $P_k(x)$. Here $c_0(\delta):=1$, i.e., an empty product should be treated as one.



Theorem A.2 ([3], pp. 515-516; [2], Theorem 5.1). Let $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)=\mathrm{IP}(\mu;q)$ with density $f$ and
support $(\alpha,\omega)$. Assume that $X$ has $2k$ finite moments for some fixed $k\in\{1,2,\ldots\}$. Let $g:(\alpha,\omega)\to\mathbb{R}$ be
any function such that $g\in C^{k-1}(\alpha,\omega)$, and assume that the function

$$g^{(k-1)}(x):=\frac{d^{k-1}g(x)}{dx^{k-1}}$$

is absolutely continuous in $(\alpha,\omega)$ with a.s. derivative $g^{(k)}$. If $\mathbb{E}q^k(X)|g^{(k)}(X)|<\infty$ then $\mathbb{E}|P_k(X)g(X)|<\infty$,
where $P_k$ is the polynomial defined by (A.2) of Theorem A.1, and the following covariance identity holds:

$$\mathbb{E}P_k(X)g(X)=\mathbb{E}q^k(X)g^{(k)}(X). \tag{A.4}$$

It should be noted that when we claim that $h:(\alpha,\omega)\to\mathbb{R}$ is an absolutely continuous function with a.s.
derivative $h'$ we mean that there exists a Borel measurable function $h':(\alpha,\omega)\to\mathbb{R}$ such that $h'$ is integrable
in every finite subinterval $[x,y]$ of $(\alpha,\omega)$, and

$$\int_x^y h'(t)\,dt=h(y)-h(x)\quad\text{for all compact intervals } [x,y]\subseteq(\alpha,\omega).$$

Corollary A.1 ([3], eq. (3.5), p. 516; [2], Corollary 5.1). Let $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)=\mathrm{IP}(\mu;q)$. Assume
that for some $n\in\{1,2,\ldots\}$, $\mathbb{E}|X|^{2n}<\infty$ or, equivalently, $\delta<1/(2n-1)$. Then, the polynomials defined by
(A.2) of Theorem A.1 satisfy the orthogonality condition

$$\mathbb{E}[P_k(X)P_m(X)]=\delta_{k,m}\,k!\,\mathbb{E}q^k(X)\prod_{j=k-1}^{2k-2}(1-j\delta)=\delta_{k,m}\,k!\,c_k(\delta)\,\mathbb{E}q^k(X),\qquad k,m\in\{0,1,\ldots,n\}, \tag{A.5}$$

where $\delta_{k,m}$ is Kronecker's delta and where an empty product should be treated as one.



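Remark A.1. The orthogonality of $P_k$ and $P_m$, $k\ne m$, $k,m\in\{0,1,\ldots,n\}$, remains valid even if
$\delta\in\big[\frac{1}{2n-1},\frac{1}{2n-2}\big)$; in this case, however, $P_n\notin L^2(\mathbb{R},X)$ since $\mathrm{lead}(P_n)>0$ and $\mathbb{E}|X|^{2n}=\infty$.

Remark A.2. In view of Lemma A.1, the assumption $\mathbb{E}|X|^{2n}<\infty$ is equivalent to the condition $\delta<\frac{1}{2n-1}$.
Therefore, for each $k\in\{1,\ldots,n\}$ and for all $j\in\{k-1,\ldots,2k-2\}$ we have $1-j\delta>0$ because

$$\{k-1,\ldots,2k-2\}\subseteq\{0,1,\ldots,2n-2\}.$$

Thus, $c_k(\delta)>0$. Since $\mathbb{P}[q(X)>0]=1$, $\deg(q)\le 2$ and $\mathbb{E}|X|^{2n}<\infty$ we conclude that $0<\mathbb{E}q^k(X)<\infty$ for
all $k\in\{0,1,\ldots,n\}$. It follows that the set $\{\phi_0,\phi_1,\ldots,\phi_n\}\subset L^2(\mathbb{R},X)$, where

$$\phi_k(x):=\frac{P_k(x)}{(k!\,c_k(\delta)\,\mathbb{E}q^k(X))^{1/2}}=\frac{P_k(x)}{\big(k!\,\mathbb{E}q^k(X)\prod_{j=k-1}^{2k-2}(1-j\delta)\big)^{1/2}},\qquad k=0,1,\ldots,n, \tag{A.6}$$

is an orthonormal basis of all polynomials with degree at most $n$. By (A.3), the leading coefficient of $\phi_k$ is

$$\mathrm{lead}(\phi_k)=\Big(\frac{c_k(\delta)}{k!\,\mathbb{E}q^k(X)}\Big)^{1/2}=\Big(\frac{\prod_{j=k-1}^{2k-2}(1-j\delta)}{k!\,\mathbb{E}q^k(X)}\Big)^{1/2}>0,\qquad k=0,1,\ldots,n. \tag{A.7}$$

The orthonormal system $\{\phi_k\}_{k=0}^{n}$ is characterized by the fact that $\deg(\phi_k)=k$ and $\mathrm{lead}(\phi_k)>0$ for each $k$.

Remark A.3. The identity (A.4) enables a convenient calculation of the Fourier coefficients of any (smooth
enough) function $g$ with $\mathrm{Var}\,g(X)<\infty$. More precisely, if $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)=\mathrm{IP}(\mu;q)$ and $\mathbb{E}|X|^{2n}<\infty$ for
some $n\ge 1$ then the Fourier coefficients of $g$, $\alpha_k=\mathbb{E}\phi_k(X)g(X)$, are given by $\alpha_0=\mathbb{E}g(X)$ and

$$\alpha_k=\frac{\mathbb{E}q^k(X)g^{(k)}(X)}{(k!\,c_k(\delta)\,\mathbb{E}q^k(X))^{1/2}},\qquad k=1,2,\ldots,n, \tag{A.8}$$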


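provided that $g$ is smooth enough so that $\mathbb{E}q^k(X)|g^{(k)}(X)|<\infty$ for $k\in\{1,2,\ldots,n\}$; cf. [3], Theorem 5.1(a).
Here $c_k(\delta)$ is given by (A.3) and for any $k\in\{1,\ldots,n\}$ (see [2], Corollary 5.3)

$$\mathbb{E}q^k(X)=\frac{\prod_{j=0}^{k-1}(1-2j\delta)}{\prod_{j=0}^{k-1}(1-(2j+1)\delta)}\prod_{j=0}^{k-1}q\Big(\frac{\mu+j\beta}{1-2j\delta}\Big). \tag{A.9}$$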
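
A product formula for $\mathbb{E}q^k(X)$ can be obtained by iterating Theorem A.3 together with $\mathrm{Var}\,X=\mathbb{E}q(X)=q(\mu)/(1-\delta)$: each step contributes the factor $(1-2j\delta)\,q(\mu_j)/(1-(2j+1)\delta)$ with $\mu_j=(\mu+j\beta)/(1-2j\delta)$. The sketch below assumes that (A.9) has this product form and compares it with moments that are known in closed form for the Gamma, uniform and normal laws. The function name `eq_power` is hypothetical, and inputs are expected to be `Fraction` values so that the arithmetic stays exact.

```python
from fractions import Fraction

def eq_power(k, mu, delta, beta, gamma):
    """E[q(X)^k] for X ~ IP(mu; delta, beta, gamma), via the assumed product formula."""
    out = Fraction(1)
    for j in range(k):
        scale = 1 - 2 * j * delta                 # 1 - 2 j delta
        mu_j = (mu + j * beta) / scale            # mean of X_j
        q_mu_j = delta * mu_j ** 2 + beta * mu_j + gamma
        out *= scale * q_mu_j / (1 - (2 * j + 1) * delta)
    return out

F = Fraction
# Gamma(a = 2, lam = 1) = IP(2; 0, 1, 0): E q^3 = E X^3 = 2 * 3 * 4
gamma_case = eq_power(3, F(2), F(0), F(1), F(0))
# Uniform(0, 1) = IP(1/2; -1/2, 1/2, 0): E q^2 = E[X^2 (1 - X)^2] / 4 = 1/120
uniform_case = eq_power(2, F(1, 2), F(-1, 2), F(1, 2), F(0))
# Normal with variance 2 = IP(0; 0, 0, 2): E q^3 = 2^3
normal_case = eq_power(3, F(0), F(0), F(0), F(2))
print(gamma_case, uniform_case, normal_case)
```

The three values agree with the direct computations $24$, $1/120$ and $8$, which supports the assumed form in these cases.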

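In the particular case where $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$ and $\delta\le 0$ (i.e., if $X$ is of Normal, Gamma or Beta-type), it
follows that $\mathbb{E}|X|^n<\infty$ for all $n$. Moreover, there exists an $\epsilon>0$ such that $\mathbb{E}e^{tX}<\infty$ for $|t|<\epsilon$ (see types
1-3 of Table 2.1 in [2]). Hence, the polynomials $\{\phi_k\}_{k=0}^{\infty}$, given by (A.6) (with $n=\infty$), form a complete
orthonormal system in $L^2(\mathbb{R},X)$; see, e.g., [7], [3]. Therefore, the Fourier coefficients are easily obtained
for any smooth enough function $g$ such that $\mathrm{Var}\,g(X)<\infty$ and $\mathbb{E}q^k(X)|g^{(k)}(X)|<\infty$ for all $k\ge 1$. Indeed,
in this case we have

$$\alpha_k=\mathbb{E}\phi_k(X)g(X)=\frac{\mathbb{E}q^k(X)g^{(k)}(X)}{(k!\,c_k(\delta)\,\mathbb{E}q^k(X))^{1/2}},\qquad k=1,2,\ldots,$$

where $\mathbb{E}q^k(X)$ is as in (A.9). Thus, by Parseval's identity, the variance of $g$ equals ([3], Theorem 5.1(a))

$$\mathrm{Var}\,g(X)=\sum_{k=1}^{\infty}\frac{\mathbb{E}^2q^k(X)g^{(k)}(X)}{k!\,c_k(\delta)\,\mathbb{E}q^k(X)},$$

with $\mathbb{E}q^k(X)$ given by (A.9) and $c_k(\delta)$ by (A.3).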



Theorem A.3 ([2], Theorem 5.2). Let $X$ be a random variable with density $f\sim\mathrm{IP}(\mu;q)=\mathrm{IP}(\mu;\delta,\beta,\gamma)$
supported in $(\alpha,\omega)$. Furthermore, assume that $\mathbb{E}|X|^{2n+1}<\infty$ (i.e., $\delta<\frac{1}{2n}$) for some $n\in\{0,1,\ldots\}$. Define
the random variable $X_k$ with density $f_k$ given by

$$f_k(x):=\frac{q^k(x)f(x)}{\mathbb{E}q^k(X)},\qquad \alpha<x<\omega,\quad k=0,1,\ldots,n. \tag{A.12}$$

Then, $f_k\sim\mathrm{IP}(\mu_k;q_k)$ with (the same) support $(\alpha,\omega)$,

$$\mu_k=\frac{\mu+k\beta}{1-2k\delta}\quad\text{and}\quad q_k(x)=\frac{q(x)}{1-2k\delta},\qquad \alpha<x<\omega,\quad k=0,1,\ldots,n. \tag{A.13}$$



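Theorem A.4 ([2], Theorem 5.3; cf. [5], p. 207). If $X\sim\mathrm{IP}(\mu;\delta,\beta,\gamma)$ with support $(\alpha,\omega)$ and $\mathbb{E}|X|^{2n}<\infty$
for some $n\ge 1$ (i.e., $\delta<\frac{1}{2n-1}$) then for any $m\in\{1,2,\ldots,n\}$,

$$P_{k+m}^{(m)}(x)=C_k^{(m)}(\delta)\,P_{k,m}(x),\qquad \alpha<x<\omega,\quad k=0,1,\ldots,n-m, \tag{A.14}$$

where

$$C_k^{(m)}(\delta):=\frac{(k+m)!}{k!}(1-2m\delta)^k\prod_{j=k+m-1}^{k+2m-2}(1-j\delta). \tag{A.15}$$

Here, $P_k$ are the polynomials given by (A.2) associated with $f$, and $P_{k,m}$ are the corresponding Rodrigues
polynomials of (A.2), associated with the density $f_m(x)=\frac{q^m(x)f(x)}{\mathbb{E}q^m(X)}$, $\alpha<x<\omega$, of the random variable
$X_m\sim\mathrm{IP}(\mu_m;q_m)$ defined in Theorem A.3, i.e.,

$$P_{k,m}(x):=\frac{(-1)^k}{f_m(x)}\frac{d^k}{dx^k}\big[q_m^k(x)f_m(x)\big]=\frac{(-1)^k}{(1-2m\delta)^kq^m(x)f(x)}\frac{d^k}{dx^k}\big[q^{k+m}(x)f(x)\big], \tag{A.16}$$

$\alpha<x<\omega$, $k=0,1,\ldots,n-m$.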



Strengthened Chernoff-Type Variance Bounds 17 



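Theorem A.5 ([2], Corollary 5.4). Let $X \sim \mathrm{IP}(\mu; \delta, \beta, \gamma) = \mathrm{IP}(\mu; q)$ and assume that $\mathbb{E}|X|^{2n} < \infty$ for some fixed $n \geq 1$ (i.e. $\delta < \frac{1}{2n-1}$). Let $\{\phi_k\}_{k=0}^{n}$ be the orthonormal polynomials associated with $X$, with $\mathrm{lead}(\phi_k) > 0$; see (A.6), (A.7). Fix a number $m \in \{0, 1, \ldots, n\}$, and consider the corresponding orthonormal polynomials $\{\phi_{k,m}\}_{k=0}^{n-m}$, with $\mathrm{lead}(\phi_{k,m}) > 0$, associated with $X_m \sim f_m = q^m f / \mathbb{E}\,q^m(X)$. Then,

$$\phi_{k+m}^{(m)}(x) = \nu_k^{(m)} \phi_{k,m}(x), \quad k = 0, 1, \ldots, n - m, \tag{A.17}$$

where the constants $\nu_k^{(m)} = \nu_k^{(m)}(\mu; q) > 0$ are given by

$$\nu_k^{(m)} = \nu_k^{(m)}(\mu; q) := \left\{ \frac{(k+m)!}{k!} \cdot \frac{\prod_{j=k+m-1}^{k+2m-2} (1 - j\delta)}{\mathbb{E}\,q^m(X)} \right\}^{1/2}, \tag{A.18}$$

with $\mathbb{E}\,q^m(X)$ as in (A.9) with $m$ in place of $k$. In particular, setting $\sigma^2 = \mathrm{Var}\,X = \mathbb{E}\,q(X)$ we have

$$\phi'_{k+1}(x) = \nu_k^{(1)} \phi_{k,1}(x) = \frac{\sqrt{(k+1)(1 - k\delta)}}{\sigma}\, \phi_{k,1}(x), \quad k = 0, 1, \ldots, n - 1. \tag{A.19}$$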
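The constant in (A.18) can be cross-checked on the Beta family without taking square roots. The sketch below assumes X ~ Beta(a, b) with μ = a/(a + b) and q(x) = x(1 - x)/(a + b), builds the Rodrigues polynomials by the recursion d/dx [q^j f p] = q^(j-1) f [((j - 1) q' + μ - x) p + q p'], normalizes them with exact Beta moments, and reads off the square of ν from the ratio of leading coefficients in (A.17). It then compares the result with (k+m)!/k! times prod_{j=k+m-1}^{k+2m-2} (1 - jδ), divided by E q^m(X). All names are hypothetical, and the comparison presupposes that (A.17) holds.

```python
from fractions import Fraction
from math import factorial

def padd(p, r):
    n = max(len(p), len(r))
    return [(p[i] if i < len(p) else 0) + (r[i] if i < len(r) else 0) for i in range(n)]

def pmul(p, r):
    out = [Fraction(0)] * (len(p) + len(r) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(r):
            out[i + j] += a * b
    return out

def pder(p):
    return [i * c for i, c in enumerate(p)][1:] or [Fraction(0)]

def rodrigues(k, mu, delta, beta, gamma):
    """Coefficients (lowest degree first) of P_k, trailing zeros removed."""
    q, dq = [gamma, beta, delta], [beta, 2 * delta]
    p = [Fraction(1)]
    for j in range(k, 0, -1):
        lin = padd([(j - 1) * c for c in dq], [mu, Fraction(-1)])
        p = padd(pmul(lin, p), pmul(q, pder(p)))
    p = [(-1) ** k * c for c in p]
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def rising(x, n):
    out = Fraction(1)
    for i in range(n):
        out *= x + i
    return out

def beta_mean(p, a, b):
    """E p(X) for X ~ Beta(a, b)."""
    return sum(c * rising(a, r) / rising(a + b, r) for r, c in enumerate(p))

def nu_squared_pair(k, m, a, b):
    """Return (nu^2 from leading coefficients and norms, nu^2 from (A.18))."""
    a, b = Fraction(a), Fraction(b)
    s = a + b
    mu, delta, beta, gamma = a / s, -1 / s, 1 / s, Fraction(0)
    t = 1 - 2 * m * delta
    p = rodrigues(k + m, mu, delta, beta, gamma)               # P_{k+m} for X
    p_m = rodrigues(k, (mu + m * beta) / t, delta / t, beta / t, gamma / t)
    norm2 = beta_mean(pmul(p, p), a, b)                        # E P_{k+m}(X)^2
    norm2_m = beta_mean(pmul(p_m, p_m), a + m, b + m)          # X_m ~ Beta(a+m, b+m)
    lead = Fraction(factorial(k + m), factorial(k)) * p[-1]    # lead of m-th derivative
    from_norms = lead ** 2 / norm2 * norm2_m / p_m[-1] ** 2
    q_pow = [Fraction(1)]
    for _ in range(m):
        q_pow = pmul(q_pow, [gamma, beta, delta])
    from_formula = Fraction(factorial(k + m), factorial(k))
    for j in range(k + m - 1, k + 2 * m - 1):
        from_formula *= 1 - j * delta
    return from_norms, from_formula / beta_mean(q_pow, a, b)
```

For the uniform law (a = b = 1) with k = m = 1 both entries equal 36, in agreement with (A.19): σ² = 1/12 and (k + 1)(1 - kδ) = 3.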



References 

[1] Afendras, G. and Papadatos, N. (2011). On matrix variance inequalities. J. Statist. Plann.
Inference 141 3628-3631.

[2] Afendras, G. and Papadatos, N. (2012). Integrated Pearson family and orthogonality of the
Rodrigues polynomials: A review including new results and an alternative classification of the Pearson
system. Submitted for publication. arXiv:1205.2903v1

[3] Afendras, G., Papadatos, N. and Papathanasiou, V. (2011). An extended Stein- 
type covariance identity for the Pearson family, with applications to lower variance bounds. 
Bernoulli 17(2) 507-529. 

[4] Arnold, B.C. and Brockett, P.L. (1988). Variance bounds using a theorem of Polya. 
Statist. Probab. Lett. 6 321-326. MR0933290 

[5] Beale, F.S. (1937). On the polynomials related to Pearson's differential equation. Ann. 
Math. Statist. 8 206-223. 

[6] Beale, F.S. (1941). On a certain class of orthogonal polynomials. Ann. Math. Statist. 12 
97-103. MR0003852 

[7] Berg, C. and Christensen, J.P.R. (1981). Density questions in the classical theory of 
moments. Ann. Inst. Fourier (Grenoble) 31 99-114. MR0638619 

[8] Borovkov, A.A. and Utev, S.A. (1983). On an inequality and on the related characterization
of the normal distribution. Teor. Veroyatnost. i Primenen. 28(2) 209-218.

[9] Brascamp, H. and Lieb, E. (1976). On extensions of the Brunn-Minkowski and
Prekopa-Leindler theorems, including inequalities for log concave functions, and with an application
to the diffusion equation. J. Functional Analysis 22(4) 366-389. MR0450480

[10] Cacoullos, T. (1982). On upper and lower bounds for the variance of a function of a
random variable. Ann. Probab. 10 799-809. MR0659549

[11] Cacoullos, T. and Papathanasiou, V. (1985). On upper bounds for the variance of
functions of random variables. Statist. Probab. Lett. 3 175-184. MR0801687

[12] Cacoullos, T. and Papathanasiou, V. (1989). Characterizations of distributions by
variance bounds. Statist. Probab. Lett. 7 351-356. MR1001133






[13] Chen, L.H.Y. (1982). An inequality for the multivariate normal distribution. J. Multivariate
Anal. 12 306-315. MR0661566

[14] Chernoff, H. (1981). A note on an inequality involving the normal distribution. Ann.
Probab. 9 533-535. MR0614640

[15] Diaconis, P. and Zabell, S. (1991). Closed form summation for classical distributions:
variations on a theme of De Moivre. Statist. Science 6 284-302. MR1144242

[16] Hildebrandt, E.H. (1931). Systems of polynomials connected with the Charlier expansions
and the Pearson differential and difference equations. Ann. Math. Statist. 1 379-439.

[17] Houdré, C. and Kagan, A. (1995). Variance inequalities for functions of Gaussian variables.
J. Theoret. Probab. 8 23-30. MR1308667

[18] Johnson, R.W. (1993). A note on variance bounds for a function of a Pearson variate.
Statist. Decisions 11 273-278. MR1257861

[19] Klaassen, C.A.J. (1985). On an inequality of Chernoff. Ann. Probab. 13(3) 966-974.

[20] Nash, J. (1958). Continuity of solutions of parabolic and elliptic equations. Amer. J. Math.
80 931-954. MR0100158

[21] Olkin, I. and Shepp, L. (2005). A matrix variance inequality. J. Statist. Plann. Inference
130 351-358.

[22] Papadatos, N. and Papathanasiou, V. (2001). Unified variance bounds and a Stein-type
identity. In: Probability and Statistical Models with Applications (Ch.A. Charalambides,
M.V. Koutras and N. Balakrishnan, Eds.), Chapman & Hall/CRC, New York, pp. 87-100.

[23] Papathanasiou, V. (1988). Variance bounds by a generalization of the Cauchy-Schwarz 
inequality. Statist. Probab. Lett. 7 29-33. MR0996849 

[24] Prakasa Rao, B.L.S. (2006). Matrix variance inequalities for multivariate distributions.
Statistical Methodology 3 416-430. MR2252395

[25] Wei, Z. and Zhang, X. (2009). Covariance matrix inequalities for functions of Beta ran- 
dom variables. Statist. Probab. Lett. 79 873-879. MR2509476