I wondered about this too. The first explanation isn't bad, but here are my two cents.
First of all, perplexity has nothing to do with characterizing how often you guess something right. It has more to do with characterizing the complexity of a stochastic sequence.
We're looking at a quantity, $2^{-\sum_x p(x)\log_2 p(x)}$.
Let's first cancel out the log and the exponentiation.
$$2^{-\sum_x p(x)\log_2 p(x)} = \frac{1}{\prod_x p(x)^{p(x)}}$$
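As a quick numerical sanity check of that cancellation, here is a minimal sketch that computes the quantity both ways. The function names and the example distribution `p` are my own choices for illustration, not anything standard.

```python
import math

def perplexity_from_entropy(p):
    """2 ** H(p), with the entropy H measured in bits."""
    entropy_bits = -sum(px * math.log2(px) for px in p if px > 0)
    return 2 ** entropy_bits

def perplexity_from_product(p):
    """1 / prod_x p(x) ** p(x), the form with log and exponent cancelled."""
    product = 1.0
    for px in p:
        if px > 0:  # zero-probability states contribute a factor of 1
            product *= px ** px
    return 1.0 / product

# Example distribution (an assumption, chosen so the entropy is exactly 1.75 bits).
p = [0.5, 0.25, 0.125, 0.125]
a = perplexity_from_entropy(p)
b = perplexity_from_product(p)
print(a, b)
```

Both values should agree up to floating-point rounding, and for this particular `p` both equal $2^{1.75}$.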
I think it's worth pointing out that perplexity does not depend on the base you use to define entropy, as long as you exponentiate in that same base. In this sense, perplexity is a less arbitrary measure than entropy, whose value changes with the choice of base.
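The base invariance is easy to check numerically. The sketch below is only an illustration: `perplexity(p, base)` is a hypothetical helper and the distribution `p` is arbitrary. It measures entropy in bits, nats, and base-10 units, then exponentiates each in its matching base.

```python
import math

def perplexity(p, base):
    """base ** H_base(p): entropy taken in `base`, then exponentiated in the same base."""
    entropy = -sum(px * math.log(px, base) for px in p if px > 0)
    return base ** entropy

# Arbitrary example distribution (an assumption for illustration).
p = [0.7, 0.2, 0.1]
results = [perplexity(p, base) for base in (2, math.e, 10)]
for base, value in zip((2, math.e, 10), results):
    print(base, value)
```

The three printed perplexities should match up to rounding, even though the three entropy values they are built from differ.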
Relationship to Dice
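Take a fair coin first. Its perplexity is
$$\frac{1}{\left(\frac{1}{2}\right)^{\frac{1}{2}} \times \left(\frac{1}{2}\right)^{\frac{1}{2}}} = 2$$
Now what happens when we look at a fair $N$-sided die? Perplexity is
$$\frac{1}{\left(\left(\frac{1}{N}\right)^{\frac{1}{N}}\right)^{N}} = N$$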
So perplexity represents the number of sides of a fair die that, when rolled, produces a sequence with the same entropy as your given probability distribution.
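To make the "equivalent fair die" reading concrete, here is a small sketch. The helper name `perplexity` and the loaded die are made up for illustration. A fair die with `n` sides should come out at `n`, and a loaded six-sided die should come out at fewer than six effective sides.

```python
import math

def perplexity(p):
    """2 ** (entropy in bits) of the distribution p."""
    entropy_bits = -sum(px * math.log2(px) for px in p if px > 0)
    return 2 ** entropy_bits

# Fair dice: perplexity equals the number of sides.
for n in (2, 6, 20):
    fair = [1.0 / n] * n
    print(n, perplexity(fair))

# A loaded six-sided die (hypothetical): one face comes up half the time.
loaded = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]
print(perplexity(loaded))
```

For the loaded die the product form gives $1/(0.5^{0.5} \cdot 0.1^{0.5}) = \sqrt{20} \approx 4.47$, so it is about as unpredictable as a fair die with four and a half sides.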
Number of States
OK, so now that we have an intuitive definition of perplexity, let's take a quick look at how it is affected by the number of states in a model. Let's start with a probability distribution over $N$ states, and create a new probability distribution over $N+1$ states such that the likelihood ratios among the original $N$ states remain the same and the new state has probability $\epsilon$. In the case of starting with a fair $N$-sided die, we might imagine creating a new $(N+1)$-sided die such that the new side gets rolled with probability $\epsilon$ and the original $N$ sides are rolled with equal likelihood. So in the case of an arbitrary original probability distribution, if the probability of each state $x$ is given by $p_x$, the new probability of each of the original $N$ states will be $p'_x = p_x(1-\epsilon)$, and the new perplexity will be given by:
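$$\frac{1}{\epsilon^{\epsilon} \prod_{x}^{N} {p'_x}^{p'_x}} = \frac{1}{\epsilon^{\epsilon} \prod_{x}^{N} \left(p_x(1-\epsilon)\right)^{p_x(1-\epsilon)}} = \frac{1}{\epsilon^{\epsilon} \prod_{x}^{N} p_x^{p_x(1-\epsilon)} (1-\epsilon)^{p_x(1-\epsilon)}} = \frac{1}{\epsilon^{\epsilon} (1-\epsilon)^{(1-\epsilon)} \prod_{x}^{N} p_x^{p_x(1-\epsilon)}}$$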
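In the limit as $\epsilon \to 0$, both $\epsilon^{\epsilon}$ and $(1-\epsilon)^{(1-\epsilon)}$ tend to 1, so this quantity approaches
$$\frac{1}{\prod_{x}^{N} p_x^{p_x}}$$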
So as you make rolling one side of the die increasingly unlikely, the perplexity ends up looking as though that side doesn't exist.
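The limit can also be watched numerically. This is a rough sketch under my own naming: `add_rare_state` is a hypothetical helper that scales the original probabilities by `1 - eps` and appends a state with probability `eps`, and the starting distribution `p` is arbitrary.

```python
import math

def perplexity(p):
    """2 ** (entropy in bits) of the distribution p."""
    entropy_bits = -sum(px * math.log2(px) for px in p if px > 0)
    return 2 ** entropy_bits

def add_rare_state(p, eps):
    """Scale the original states by (1 - eps) and append a new state with probability eps."""
    return [px * (1 - eps) for px in p] + [eps]

# Arbitrary three-state starting distribution (an assumption for illustration).
p = [0.5, 0.3, 0.2]
base_ppl = perplexity(p)

# Gap between the (N+1)-state perplexity and the original N-state perplexity.
gaps = []
for eps in (1e-1, 1e-2, 1e-4, 1e-8):
    ppl = perplexity(add_rare_state(p, eps))
    gaps.append(ppl - base_ppl)
    print(eps, ppl, base_ppl)
```

As `eps` shrinks, the gap to the original perplexity shrinks with it, which is the numerical version of the extra side "not existing".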