This is quite an old question with some very good answers; however, I think it can benefit from a new answer that addresses a more pragmatic perspective.
When should one not permit a fixed effect to vary across levels of a random effect?
I won't address the issues already described in the other answers; instead, I will refer to the now-famous (though I would rather say "infamous") paper by Barr et al (2013), often just referred to as "Keep it maximal":
Barr, D.J., Levy, R., Scheepers, C. and Tily, H.J., 2013. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of memory and language, 68(3), pp.255-278.
In this paper the authors argue that all fixed effects should be allowed to vary across levels of the grouping factors (random intercepts). Their argument is quite compelling: in essence, not allowing them to vary imposes constraints on the model. This is well described in the other answers. However, there are potentially serious problems with this approach, which are described by Bates et al (2015):
Bates, D., Kliegl, R., Vasishth, S. and Baayen, H., 2015. Parsimonious mixed models. arXiv preprint arXiv:1506.04967
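To make the distinction concrete, here is a minimal sketch using the syntax of the lme4 package for R (discussed further below). The data frame d and the variables y, x and subject are hypothetical, not taken from either paper.

```r
library(lme4)

## Random intercepts only: the effect of x is constrained to be the
## same for every subject.
m_intercept <- lmer(y ~ x + (1 | subject), data = d)

## "Maximal" in the sense of Barr et al (2013): the effect of x is
## allowed to vary across subjects (a random slope), together with a
## correlation between the random intercepts and slopes.
m_maximal <- lmer(y ~ x + (1 + x | subject), data = d)
```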
It is worth noting here that Bates is the primary author of the lme4 package for fitting mixed models in R, which is probably the most widely used package for such models. Bates et al note that in many real-world applications the data simply won't support a maximal random effects structure, often because there are too few observations in each cluster for the relevant variables. This can manifest itself in models that fail to converge or that have a singular random effects structure. The large number of questions on this site about such models attests to that. They also note that Barr et al used a relatively simple simulation with "well-behaved" random effects as the basis for their paper. Instead, Bates et al suggest the following approach:
We proposed (1) to use PCA to determine the dimensionality of the variance-covariance matrix of the random-effect structure, (2) to initially constrain correlation parameters to zero, especially when an initial attempt to fit a maximal model does not converge, and (3) to drop non-significant variance components and their associated correlation parameters from the model
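As a rough sketch of what that workflow can look like in lme4 (again using the hypothetical d, y, x and subject from above; this is an illustration, not a mechanical recipe):

```r
library(lme4)

## (1) Fit the maximal model and check how much random-effects structure
##     the data actually support.
m_max <- lmer(y ~ x + (1 + x | subject), data = d)
isSingular(m_max)      # TRUE suggests the structure is too complex
summary(rePCA(m_max))  # PCA of the random-effects covariance matrix;
                       # components with near-zero variance indicate
                       # over-parameterisation

## (2) Constrain the intercept-slope correlation to zero
##     (double-bar syntax; x is assumed to be numeric here).
m_zerocorr <- lmer(y ~ x + (1 + x || subject), data = d)

## (3) Drop variance components that are not supported by the data and
##     compare the nested models without refitting by maximum likelihood.
m_reduced <- lmer(y ~ x + (1 | subject), data = d)
anova(m_reduced, m_zerocorr, refit = FALSE)
```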
In the same paper, they also note:
Importantly, failure to converge is not due to defects of the estimation algorithm, but is a straightforward consequence of attempting to fit a model that is too complex to be properly supported by the data.
And:
maximal models are not necessary to protect against anti-conservative conclusions. This protection is fully provided by comprehensive models that are guided by realistic expectations about the complexity that the data can support. In statistics, as elsewhere in science, parsimony is a virtue, not a vice.
Bates et al (2015)
From a more applied perspective, a further consideration is whether the data-generating process, that is, the biological/physical/chemical theory that underlies the data, should guide the analyst towards specifying the random effects structure.