Variance of a function of a random variable


33

Let's say we have a random variable $X$ with known mean and variance. The question is: what is the variance of $f(X)$ for a given function $f$? The only general method I know of is the delta method, but it gives only an approximation. Right now I'm interested in $f(x)=\sqrt{x}$, but it would also be nice to know some general methods.

29.12.2010
I have done some calculations with Taylor series, but I'm not sure whether they are correct, so I would be glad if someone could confirm them.

First we need to approximate $E[f(X)]$:
$$E[f(X)]\approx E\left[f(\mu)+f'(\mu)(X-\mu)+\frac{1}{2}f''(\mu)(X-\mu)^2\right]=f(\mu)+\frac{1}{2}f''(\mu)\operatorname{Var}[X]$$
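
A quick numerical check of this mean approximation can be sketched as follows. It is only an illustration: the choice $X\sim\text{Gamma}(k,\theta)$ and the parameter values are assumptions made here because $E[\sqrt{X}]=\sqrt{\theta}\,\Gamma(k+\tfrac12)/\Gamma(k)$ is known in closed form, and the helper names are hypothetical.

```python
import math

def mean_second_order(f, d2f, mu, var):
    """Second-order Taylor approximation: E[f(X)] ~ f(mu) + 0.5 * f''(mu) * Var[X]."""
    return f(mu) + 0.5 * d2f(mu) * var

def gamma_sqrt_mean(k, theta):
    """Exact E[sqrt(X)] for X ~ Gamma(shape=k, scale=theta)."""
    return math.sqrt(theta) * math.exp(math.lgamma(k + 0.5) - math.lgamma(k))

# Illustrative parameters (assumed, not taken from the question).
k, theta = 10.0, 1.0
mu, var = k * theta, k * theta**2

# f(x) = sqrt(x), so f''(x) = -1/4 * x^(-3/2).
approx_mean = mean_second_order(math.sqrt, lambda x: -0.25 * x**-1.5, mu, var)
exact_mean = gamma_sqrt_mean(k, theta)
naive_mean = math.sqrt(mu)  # zeroth-order guess f(mu)

print("exact:", exact_mean)
print("second-order approximation:", approx_mean)
print("f(mu) only:", naive_mean)
```

For these parameters the second-order correction moves the estimate much closer to the exact value than $f(\mu)$ alone, although nothing here says how it behaves for distributions concentrated near zero.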

Now we can approximate $D^2[f(X)]$:
$$E\left[(f(X)-E[f(X)])^2\right]\approx E\left[\left(f(\mu)+f'(\mu)(X-\mu)+\frac{1}{2}f''(\mu)(X-\mu)^2-E[f(X)]\right)^2\right]$$

Using the approximation of $E[f(X)]$, we know that $f(\mu)-E[f(X)]\approx-\frac{1}{2}f''(\mu)\operatorname{Var}[X]$.

Using this we get:
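$$D^2[f(X)]\approx\frac{1}{4}f''(\mu)^2\operatorname{Var}[X]^2-\frac{1}{2}f''(\mu)^2\operatorname{Var}[X]^2+f'(\mu)^2\operatorname{Var}[X]+\frac{1}{4}f''(\mu)^2E[(X-\mu)^4]+f'(\mu)f''(\mu)E[(X-\mu)^3]$$
$$D^2[f(X)]\approx\frac{1}{4}f''(\mu)^2\left[D^4X-(D^2X)^2\right]+f'(\mu)^2D^2X+f'(\mu)f''(\mu)D^3X$$
Here $D^kX=E[(X-\mu)^k]$ denotes the $k$-th central moment of $X$.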
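
One way to sanity-check the coefficients is to note that squaring $f'(\mu)(X-\mu)+\frac{1}{2}f''(\mu)\left[(X-\mu)^2-\operatorname{Var}[X]\right]$ gives a cross term $2\cdot f'(\mu)\cdot\frac{1}{2}f''(\mu)=f'(\mu)f''(\mu)$ in front of the third central moment. For a quadratic $f$ the second-order expansion is exact, so the formula should then reproduce the true variance. The sketch below assumes $f(x)=x^2$ and $X\sim\text{Exponential}(1)$ purely for illustration; `var_second_order` is a hypothetical helper name.

```python
def var_second_order(d1, d2, m2, m3, m4):
    """Variance of f(X) from the squared second-order Taylor expansion.

    d1, d2     : f'(mu) and f''(mu)
    m2, m3, m4 : second, third and fourth central moments of X
    """
    return d1**2 * m2 + d1 * d2 * m3 + 0.25 * d2**2 * (m4 - m2**2)

# X ~ Exponential(rate 1): mean 1, central moments m2 = 1, m3 = 2, m4 = 9.
mu, m2, m3, m4 = 1.0, 1.0, 2.0, 9.0

# f(x) = x^2, so f'(mu) = 2*mu and f''(mu) = 2.
d1, d2 = 2.0 * mu, 2.0
approx_var = var_second_order(d1, d2, m2, m3, m4)

# Exact value: Var(X^2) = E[X^4] - (E[X^2])^2 = 4! - (2!)^2.
exact_var = 24.0 - 2.0**2

print(approx_var, exact_var)
```

Because the distribution is skewed, this check is sensitive to the coefficient of the third-moment term, which a symmetric example would not be.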


The delta method is used for asymptotic distributions. You cannot use it when you have only one random variable.
mpiktas

@mpiktas: Actually, I don't know much about the delta method; I've just read something on Wikipedia. This is a quotation from the wiki: "The delta method uses second-order Taylor expansions to approximate the variance of a function of one or more random variables".
Tomek Tarczynski

It seems Wikipedia has exactly what you want: en.wikipedia.org/wiki/…. I will re-edit my answer; it seems that I underestimated the Taylor expansion.
mpiktas

Tomek, if you disagree with the edits that were made (not by me), you can always change them again, or roll them back, or just point out the differences and ask for clarification.
Glen_b -Reinstate Monica

@Glen_b: I agree with them: E(X-mu) = 0 doesn't imply that E[(X-mu)^3] = 0.
Tomek Tarczynski

Answers:


33

Update

I underestimated Taylor expansions. They actually work. I had assumed that the integral of the remainder term could be unbounded, but with a little work it can be shown that this is not the case.

The Taylor expansion works for functions on a bounded closed interval. For random variables with finite variance, Chebyshev's inequality gives

$$P(|X-EX|>c)\le\frac{\operatorname{Var}(X)}{c^2}$$

So for any $\varepsilon>0$ we can find a large enough $c$ so that

$$P(X\in[EX-c,EX+c])=P(|X-EX|\le c)\ge 1-\varepsilon$$
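
As a small illustration of how $c$ can be chosen, Chebyshev's inequality in its standard form $P(|X-EX|>c)\le\operatorname{Var}(X)/c^2$ suggests $c=\sqrt{\operatorname{Var}(X)/\varepsilon}$. The sketch below uses $X\sim\text{Exponential}(1)$ as an assumed example, because its tail probability is known exactly; `chebyshev_radius` is a hypothetical helper name.

```python
import math

def chebyshev_radius(var, eps):
    """Smallest c with var / c**2 <= eps, so that P(|X - EX| > c) <= eps."""
    return math.sqrt(var / eps)

# Assumed example: X ~ Exponential(rate 1), so EX = 1 and Var(X) = 1.
eps = 0.01
c = chebyshev_radius(1.0, eps)

# For c >= 1 the event |X - 1| > c reduces to X > 1 + c (X is non-negative),
# so the exact tail probability is exp(-(1 + c)).
exact_tail = math.exp(-(1.0 + c))

print("c =", c)
print("Chebyshev bound:", 1.0 / c**2, "exact tail:", exact_tail)
```

The bound is usually very conservative, which is harmless here: the argument only needs some finite $c$ to exist.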

First let us estimate $Ef(X)$. We have

$$Ef(X)=\int_{|x-EX|\le c}f(x)\,dF(x)+\int_{|x-EX|>c}f(x)\,dF(x)$$
where $F(x)$ is the distribution function of $X$.

Since the domain of the first integral is the interval $[EX-c,EX+c]$, which is a bounded closed interval, we can apply the Taylor expansion:

$$f(x)=f(EX)+f'(EX)(x-EX)+\frac{f''(EX)}{2}(x-EX)^2+\frac{f'''(\alpha)}{3!}(x-EX)^3$$
where $\alpha\in[EX-c,EX+c]$ (in general depending on $x$), and the equality holds for all $x\in[EX-c,EX+c]$. I took only 4 terms in the Taylor expansion, but in general we can take as many as we like, as long as the function $f$ is smooth enough.
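
The Lagrange form of the remainder implies $|f(x)-T_2(x)|\le\max|f'''|\cdot|x-EX|^3/3!$ on the interval, where $T_2$ is the second-order Taylor polynomial. A minimal numerical sketch for $f(x)=\sqrt{x}$ follows; the values $EX=4$ and $c=3$ are assumptions chosen so that the interval stays away from zero, and the function names are hypothetical.

```python
import math

def taylor2_sqrt(x, mu):
    """Second-order Taylor polynomial of sqrt around mu."""
    d1 = 0.5 * mu**-0.5      # f'(mu)
    d2 = -0.25 * mu**-1.5    # f''(mu)
    return math.sqrt(mu) + d1 * (x - mu) + 0.5 * d2 * (x - mu)**2

def d3_sqrt(x):
    """Third derivative of sqrt: (3/8) * x^(-5/2)."""
    return 0.375 * x**-2.5

mu, c = 4.0, 3.0  # assumed values; the interval is [1, 7]

# |f'''| is decreasing on (0, inf), so its maximum over [mu - c, mu + c]
# is attained at the left endpoint.
d3_max = d3_sqrt(mu - c)

grid = [mu - c + 2.0 * c * i / 200 for i in range(201)]
worst_slack = min(
    d3_max / 6.0 * abs(x - mu)**3 - abs(math.sqrt(x) - taylor2_sqrt(x, mu))
    for x in grid
)

print("smallest (bound - actual error) on the grid:", worst_slack)
print("f''' at 1.0 vs 0.01:", d3_sqrt(1.0), d3_sqrt(0.01))
```

The last line shows how quickly the third derivative grows as the left endpoint approaches zero, which is where the bound stops being useful.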

Substituting this formula into the previous one, we get

$$Ef(X)=\int_{|x-EX|\le c}\left[f(EX)+f'(EX)(x-EX)+\frac{f''(EX)}{2}(x-EX)^2\right]dF(x)+\int_{|x-EX|\le c}\frac{f'''(\alpha)}{3!}(x-EX)^3\,dF(x)+\int_{|x-EX|>c}f(x)\,dF(x)$$
Now we can increase the domain of the integration to get the following formula

$$Ef(X)=f(EX)+\frac{f''(EX)}{2}E(X-EX)^2+R_3$$
where
$$R_3=\int_{|x-EX|\le c}\frac{f'''(\alpha)}{3!}(x-EX)^3\,dF(x)+\int_{|x-EX|>c}\left(f(x)-f(EX)-f'(EX)(x-EX)-\frac{f''(EX)}{2}(x-EX)^2\right)dF(x)$$
Under some moment conditions we can show that the second term of this remainder is of the order of $P(|X-EX|>c)$, which is small. Unfortunately the first term remains, so the quality of the approximation depends on $E(X-EX)^3$ and on the behaviour of the third derivative of $f$ on bounded intervals. Such an approximation should work best for random variables with $E(X-EX)^3=0$.

Now for the variance we can use the Taylor approximation for $f(X)$, subtract the formula for $Ef(X)$ and square the difference. Then

$$E(f(X)-Ef(X))^2=(f'(EX))^2\operatorname{Var}(X)+T_3$$

where $T_3$ involves the moments $E(X-EX)^k$ for $k=4,5,6$. We can also arrive at this formula using only a first-order Taylor expansion, i.e. using only the first and second derivatives. The error term would be similar.

Another way is to expand $f^2(x)$:

$$f^2(x)=f^2(EX)+2f(EX)f'(EX)(x-EX)+\left[(f'(EX))^2+f(EX)f''(EX)\right](x-EX)^2+\frac{(f^2)'''(\beta)}{3!}(x-EX)^3$$

Similarly, we then get

$$Ef^2(X)=f^2(EX)+\left[(f'(EX))^2+f(EX)f''(EX)\right]\operatorname{Var}(X)+\tilde{R}_3$$
where $\tilde{R}_3$ is similar to $R_3$.

The formula for the variance then becomes

$$\operatorname{Var}(f(X))=[f'(EX)]^2\operatorname{Var}(X)-\frac{[f''(EX)]^2}{4}\operatorname{Var}^2(X)+\tilde{T}_3$$
where $\tilde{T}_3$ involves only third moments and above.
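
To get a feel for the size of the terms, the two leading terms of this variance formula can be compared with an exact value. The sketch below is an illustration under assumed inputs: $f(x)=\sqrt{x}$ and $X\sim\text{Gamma}(k,\theta)$ with $k=10$, $\theta=1$, chosen because $\operatorname{Var}(\sqrt{X})=EX-(E\sqrt{X})^2$ is then available in closed form. The remainder $\tilde{T}_3$ is simply dropped.

```python
import math

# Illustrative Gamma parameters (assumed, not from the answer).
k, theta = 10.0, 1.0
mean_x, var_x = k * theta, k * theta**2

# Derivatives of f(x) = sqrt(x) evaluated at EX.
d1 = 0.5 * mean_x**-0.5
d2 = -0.25 * mean_x**-1.5

# Leading term only, then both explicit terms of the formula (remainder dropped).
one_term = d1**2 * var_x
two_term = one_term - 0.25 * d2**2 * var_x**2

# Exact variance: Var(sqrt(X)) = E[X] - (E[sqrt(X)])^2.
exact_mean_sqrt = math.sqrt(theta) * math.exp(math.lgamma(k + 0.5) - math.lgamma(k))
exact_var = mean_x - exact_mean_sqrt**2

print("exact:", exact_var)
print("f'(EX)^2 Var(X):", one_term)
print("with the Var^2 correction:", two_term)
```

In this particular example the correction term moves the estimate towards the exact value, but since the dropped remainder contains third-moment terms this should not be read as a general guarantee.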

I don't need to know the exact value of the variance; an approximation should work for me.
Tomek Tarczynski

Indeed, the approximate formula for E[f(X)] in the OP is often used in risk analysis in economics, finance and insurance.
Raskolnikov

@Raskolnikov, yes, but it contradicts my admittedly stale knowledge of Taylor expansions. Clearly the remainder term must be taken into account. If the random variable is bounded, then there is no problem, since polynomials approximate continuous functions on a bounded interval uniformly. But we are dealing with unbounded random variables. Of course, for a normal random variable we can say that it is effectively bounded, but in the general case some nasty surprises can arise, or not. I will fix my answer when I have a clear answer.
mpiktas

@Tomek Tarczynski, the third derivative of $\sqrt{x}$ goes to zero quite quickly for large $x$, but is unbounded near zero. So if you picked a uniform distribution with support close to zero, the remainder term could get large.
mpiktas

Note that in your link the equality is approximate. In this answer all the equations are exact. Furthermore, for the variance, note that the first derivative is evaluated at $EX$, not at $x$. Also, I never stated that this will not work for $\sqrt{x}$, only that for $\sqrt{x}$ the approximate formula might have a huge error if the domain of $X$ is close to zero.
mpiktas

8

Knowing the first two moments of X (mean and variance) is not enough if the function f(x) is arbitrary (nonlinear), not only for computing the variance of the transformed variable Y, but also for its mean. To see this, and perhaps to attack your problem, you can assume that your transformation function has a Taylor expansion around the mean of X and work from there.
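
A small discrete example makes this concrete. The two distributions below are made up for illustration: they share the same mean and variance, yet the mean and variance of $\sqrt{X}$ differ between them. The helper `mean_and_var` is a hypothetical name.

```python
import math

def mean_and_var(values, probs, g=lambda x: x):
    """Mean and variance of g(X) for a discrete X given by values and probs."""
    mean = sum(p * g(v) for v, p in zip(values, probs))
    var = sum(p * (g(v) - mean)**2 for v, p in zip(values, probs))
    return mean, var

# Two made-up distributions, both with mean 2 and variance 1.
r = math.sqrt(2.0)
two_point = ([1.0, 3.0], [0.5, 0.5])
three_point = ([2.0 - r, 2.0, 2.0 + r], [0.25, 0.5, 0.25])

for name, (values, probs) in [("two-point", two_point), ("three-point", three_point)]:
    m, v = mean_and_var(values, probs)
    ms, vs = mean_and_var(values, probs, math.sqrt)
    print(name, "mean:", m, "var:", v, "E[sqrt X]:", ms, "Var[sqrt X]:", vs)
```

Any formula that uses only the mean and variance of X would return the same number for both distributions, so it cannot be exact for both.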

Licensed under cc by-sa 3.0 with attribution required.