With the information given by @Glen_b, I was able to find the answer. Using the same notation as the question,

$$P(Z_k \leq x) = \sum_{j=0}^{k+1} \binom{k+1}{j} (-1)^j (1 - jx)_+^k,$$

where $a_+ = a$ if $a > 0$ and $0$ otherwise. I also give the expectation and the asymptotic convergence to the Gumbel (NB: not Beta) distribution:

$$E(Z_k) = \frac{1}{k+1} \sum_{i=1}^{k+1} \frac{1}{i} \sim \frac{\log(k+1)}{k+1}, \qquad P(Z_k \leq x) \sim \exp\left(-e^{-(k+1)x + \log(k+1)}\right).$$
The material of the proofs is taken from several publications linked in the references. They are somewhat lengthy, but straightforward.
1. Proof of the exact distribution
Let $(U_1, \ldots, U_k)$ be IID uniform random variables on the interval $(0,1)$. Sorting them gives the $k$ order statistics, denoted $(U_{(1)}, \ldots, U_{(k)})$. The uniform spacings are defined as $\Delta_i = U_{(i)} - U_{(i-1)}$, with $U_{(0)} = 0$ and $U_{(k+1)} = 1$. The ordered spacings are the corresponding order statistics $\Delta_{(1)} \leq \ldots \leq \Delta_{(k+1)}$. The variable of interest is $\Delta_{(k+1)}$.
For fixed $x \in (0,1)$, we define the indicator variable $\mathbf{1}_i = \mathbf{1}\{\Delta_i > x\}$. By symmetry, the random vector $(\mathbf{1}_1, \ldots, \mathbf{1}_{k+1})$ is exchangeable, so the joint distribution of a subset of size $j$ is the same as the joint distribution of the first $j$. By expanding the product, we thus obtain

$$P(\Delta_{(k+1)} \leq x) = E\left(\prod_{i=1}^{k+1} (1 - \mathbf{1}_i)\right) = 1 + \sum_{j=1}^{k+1} \binom{k+1}{j} (-1)^j E\left(\prod_{i=1}^{j} \mathbf{1}_i\right).$$

We will now prove that $E\left(\prod_{i=1}^{j} \mathbf{1}_i\right) = (1 - jx)_+^k$, which will establish the distribution given above. We prove this for $j = 2$, as the general case is proved similarly.

$$E\left(\prod_{i=1}^{2} \mathbf{1}_i\right) = P(\Delta_1 > x \cap \Delta_2 > x) = P(\Delta_1 > x)\, P(\Delta_2 > x \mid \Delta_1 > x).$$

If $\Delta_1 > x$, the $k$ breakpoints are in the interval $(x, 1)$. Conditionally on this event, the breakpoints are still exchangeable, so the probability that the distance between the second and the first breakpoint is greater than $x$ is the same as the probability that the distance between the first breakpoint and the left barrier (at position $x$) is greater than $x$. So

$$P(\Delta_2 > x \mid \Delta_1 > x) = P(\text{all points are in } (2x, 1) \mid \text{all points are in } (x, 1)),$$

so

$$P(\Delta_2 > x \cap \Delta_1 > x) = P(\text{all points are in } (2x, 1)) = (1 - 2x)_+^k.$$
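As a sanity check on the exact distribution, here is a minimal Python sketch that evaluates the inclusion-exclusion formula and compares it with a Monte Carlo estimate. The function names, the sample size and the seed are my own choices for illustration and are not part of the cited proofs.

```python
import math
import random

def largest_spacing_cdf(x, k):
    """P(largest of the k+1 spacings <= x), from the inclusion-exclusion sum."""
    total = 0.0
    for j in range(k + 2):
        base = 1.0 - j * x
        if base <= 0.0:
            break  # (1 - jx)_+ is zero for this j and every larger j
        total += math.comb(k + 1, j) * (-1) ** j * base ** k
    return total

def simulate_largest_spacing(k, rng):
    """Break the unit stick at k uniform points and return the longest piece."""
    points = sorted(rng.random() for _ in range(k))
    edges = [0.0] + points + [1.0]
    return max(b - a for a, b in zip(edges, edges[1:]))

def empirical_cdf(x, k, n_samples=20000, seed=0):
    """Monte Carlo estimate of the same probability (illustrative defaults)."""
    rng = random.Random(seed)
    hits = sum(simulate_largest_spacing(k, rng) <= x for _ in range(n_samples))
    return hits / n_samples

if __name__ == "__main__":
    k, x = 5, 0.4
    print(largest_spacing_cdf(x, k), empirical_cdf(x, k))
```

The sum alternates in sign with large binomial coefficients, so in floating point it can lose accuracy when $k$ is large and $x$ is small; exact rational arithmetic avoids this at some cost in speed.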
2. Expectation
For a random variable $X$ taking values in $[0, 1]$, we have

$$E(X) = \int_0^1 P(X > x)\, dx = 1 - \int_0^1 P(X \leq x)\, dx.$$
Integrating the distribution of $\Delta_{(k+1)}$, we obtain

$$E(\Delta_{(k+1)}) = \frac{1}{k+1} \sum_{j=1}^{k+1} \binom{k+1}{j} \frac{(-1)^{j+1}}{j} = \frac{1}{k+1} \sum_{j=1}^{k+1} \frac{1}{j}.$$
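The last equality is a classic representation of the harmonic numbers $H_i = 1 + \frac{1}{2} + \ldots + \frac{1}{i}$, which we demonstrate below.

$$H_{k+1} = \int_0^1 \left(1 + x + \ldots + x^k\right) dx = \int_0^1 \frac{1 - x^{k+1}}{1 - x}\, dx.$$

With the change of variable $u = 1 - x$ and expanding the binomial $(1 - u)^{k+1}$, we obtain

$$H_{k+1} = \int_0^1 \sum_{j=1}^{k+1} \binom{k+1}{j} (-1)^{j+1} u^{j-1}\, du = \sum_{j=1}^{k+1} \binom{k+1}{j} \frac{(-1)^{j+1}}{j}.$$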
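The harmonic-number identity used above can be checked exactly for small values with rational arithmetic. This is only a sketch, with function names chosen for illustration.

```python
from fractions import Fraction
from math import comb

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def alternating_binomial_sum(n):
    """sum_{j=1}^{n} C(n, j) (-1)^(j+1) / j as an exact fraction."""
    return sum(Fraction(comb(n, j) * (-1) ** (j + 1), j) for j in range(1, n + 1))

def expected_largest_spacing(k):
    """E(largest spacing) = H_{k+1} / (k+1) for k breakpoints."""
    return harmonic(k + 1) / (k + 1)

if __name__ == "__main__":
    for n in range(1, 8):
        print(n, harmonic(n), alternating_binomial_sum(n))
    print(expected_largest_spacing(1))  # one breakpoint: E max(U, 1-U)
```

For a single breakpoint the expected longest piece is $E\max(U, 1-U) = 3/4$, which agrees with $H_2 / 2$.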
3. Alternative construction of uniform spacings
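In order to obtain the asymptotic distribution of the largest fragment, we will need a classical construction of uniform spacings as exponential variables divided by their sum. The probability density of the order statistics $(U_{(1)}, \ldots, U_{(k)})$ is

$$f_{U_{(1)}, \ldots, U_{(k)}}(u_{(1)}, \ldots, u_{(k)}) = k!, \qquad 0 \leq u_{(1)} \leq \ldots \leq u_{(k)} \leq 1.$$

If we denote the uniform spacings $\Delta_i = U_{(i)} - U_{(i-1)}$, with $U_{(0)} = 0$, we obtain

$$f_{\Delta_1, \ldots, \Delta_k}(\delta_1, \ldots, \delta_k) = k!, \qquad \delta_i \geq 0, \quad \delta_1 + \ldots + \delta_k \leq 1.$$

By defining $U_{(k+1)} = 1$, we thus obtain

$$f_{\Delta_1, \ldots, \Delta_{k+1}}(\delta_1, \ldots, \delta_{k+1}) = k!, \qquad \delta_i \geq 0, \quad \delta_1 + \ldots + \delta_{k+1} = 1.$$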
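Now, let $(X_1, \ldots, X_{k+1})$ be IID exponential random variables with mean 1, and let $S = X_1 + \ldots + X_{k+1}$. With a simple change of variable, we can see that

$$f_{X_1, \ldots, X_k, S}(x_1, \ldots, x_k, s) = e^{-s}, \qquad x_i \geq 0, \quad x_1 + \ldots + x_k \leq s.$$

Define $Y_i = X_i / S$, such that by a change of variable we obtain

$$f_{Y_1, \ldots, Y_k, S}(y_1, \ldots, y_k, s) = s^k e^{-s}, \qquad y_i \geq 0, \quad y_1 + \ldots + y_k \leq 1, \quad s \geq 0.$$

Integrating this density with respect to $s$, we thus obtain

$$f_{Y_1, \ldots, Y_k}(y_1, \ldots, y_k) = \int_0^\infty s^k e^{-s}\, ds = k!, \qquad 0 \leq y_1 + \ldots + y_k \leq 1,$$

and thus

$$f_{Y_1, \ldots, Y_{k+1}}(y_1, \ldots, y_{k+1}) = k!, \qquad y_1 + \ldots + y_{k+1} = 1.$$

So the joint distribution of $k+1$ uniform spacings on the interval $(0,1)$ is the same as the joint distribution of $k+1$ exponential random variables divided by their sum. We come to the following equivalence in distribution:

$$\Delta_{(k+1)} \equiv \frac{X_{(k+1)}}{X_1 + \ldots + X_{k+1}}.$$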
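The equivalence can be illustrated by simulation: the largest spacing built from sorted uniforms and the largest normalized exponential should have the same mean, namely the harmonic sum divided by $k+1$ from the expectation section. The sketch below assumes nothing beyond the standard library; the helper names and sample sizes are illustrative.

```python
import random

def largest_spacing_uniform(k, rng):
    """Largest gap between 0, k sorted uniform points, and 1."""
    points = sorted(rng.random() for _ in range(k))
    edges = [0.0] + points + [1.0]
    return max(b - a for a, b in zip(edges, edges[1:]))

def largest_spacing_exponential(k, rng):
    """Largest of k+1 mean-1 exponentials divided by their sum."""
    xs = [rng.expovariate(1.0) for _ in range(k + 1)]
    return max(xs) / sum(xs)

def sample_mean(sampler, k, n_samples=20000, seed=0):
    """Average of n_samples draws from the given sampler."""
    rng = random.Random(seed)
    return sum(sampler(k, rng) for _ in range(n_samples)) / n_samples

if __name__ == "__main__":
    k = 4
    target = sum(1 / i for i in range(1, k + 2)) / (k + 1)
    print(sample_mean(largest_spacing_uniform, k),
          sample_mean(largest_spacing_exponential, k),
          target)
```

Comparing means is only a weak check of equality in distribution; comparing empirical quantiles of the two samplers would be a stricter test.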
4. Asymptotic distribution
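Using the equivalence above, we obtain

$$P\left((k+1)\Delta_{(k+1)} - \log(k+1) \leq x\right) = P\left(X_{(k+1)} \leq (x + \log(k+1)) \frac{X_1 + \ldots + X_{k+1}}{k+1}\right) = P\left(X_{(k+1)} - \log(k+1) \leq x + (x + \log(k+1))\, T_{k+1}\right),$$

where $T_{k+1} = \frac{X_1 + \ldots + X_{k+1}}{k+1} - 1$. The term $(x + \log(k+1))\, T_{k+1}$ vanishes in probability because $E(T_{k+1}) = 0$ and $\mathrm{Var}(\log(k+1)\, T_{k+1}) = \frac{(\log(k+1))^2}{k+1} \to 0$. Asymptotically, the distribution is therefore the same as that of $X_{(k+1)} - \log(k+1)$. Because the $X_i$ are IID, we have

$$P\left(X_{(k+1)} - \log(k+1) \leq x\right) = P\left(X_1 \leq x + \log(k+1)\right)^{k+1} = \left(1 - e^{-x - \log(k+1)}\right)^{k+1} = \left(1 - \frac{e^{-x}}{k+1}\right)^{k+1} \sim \exp\{-e^{-x}\}.$$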
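To get a feel for how quickly the limit takes hold, the exact distribution can be compared numerically with the Gumbel approximation. The sketch below uses exact rational arithmetic for the alternating sum to avoid cancellation errors; the grid and the function names are assumptions made for illustration.

```python
import math
from fractions import Fraction

def exact_cdf(x, k):
    """Exact P(largest spacing <= x), summed in rational arithmetic."""
    x = Fraction(x)
    total = Fraction(0)
    for j in range(k + 2):
        base = 1 - j * x
        if base <= 0:
            break  # (1 - jx)_+ vanishes from here on
        total += math.comb(k + 1, j) * (-1) ** j * base ** k
    return float(total)

def gumbel_cdf(x, k):
    """Asymptotic approximation exp(-exp(-(k+1)x + log(k+1)))."""
    n = k + 1
    return math.exp(-math.exp(-(n * x - math.log(n))))

def max_abs_gap(k, grid_size=200):
    """Largest absolute difference between the two CDFs on a regular grid."""
    xs = [Fraction(i, grid_size) for i in range(1, grid_size)]
    return max(abs(exact_cdf(x, k) - gumbel_cdf(float(x), k)) for x in xs)

if __name__ == "__main__":
    for k in (10, 20, 50):
        print(k, max_abs_gap(k))
```

The gap is measured only on the grid points, so it is a lower bound on the true maximal difference between the two distribution functions.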
5. Graphical overview
The plot below shows the distribution of the largest fragment for different values of $k$. For $k = 10, 20, 50$, I have also overlaid the asymptotic Gumbel distribution (thin line). The Gumbel is a very bad approximation for small values of $k$, so I omit it there to avoid overloading the picture. The Gumbel approximation is good from $k \approx 50$.
6. References
The proofs above are taken from references 2 and 3. The cited literature contains many more results, such as the distribution of the ordered spacings of any rank, their limit distribution and some alternative constructions of the ordered uniform spacings. The key references are not easily accessible, so I also provide links to the full text.
1. Bairamov et al. (2010) Limit results for ordered uniform spacings, Stat. Papers, 51:1, pp. 227-240
2. Holst (1980) On the lengths of the pieces of a stick broken at random, J. Appl. Prob., 17, pp. 623-634
3. Pyke (1965) Spacings, JRSS(B), 27:3, pp. 395-449
4. Renyi (1953) On the theory of order statistics, Acta Math. Hung., 4, pp. 191-231