

Notes from a life lived on the frontier of Cosmetrics, the science.
We show that forms of Bayesian learning that are often applied to classification problems can be *statistically inconsistent* when the model is wrong. More precisely, we present a family ('model') M of probability distributions, a distribution P outside M, and a Bayesian prior distribution on M, such that:

- M contains a distribution Q within a small distance \delta from P. Nevertheless:
- when data are sampled according to P, then, no matter how many data are observed, the Bayesian posterior puts nearly all its mass on distributions that are at a distance from P much larger than \delta.

The classifier based on the Bayesian posterior can perform substantially worse than random guessing, no matter how many data are observed, even though the classifier based on Q performs much better than random guessing. The result holds for a variety of distance functions, including the KL (relative entropy) divergence.

Misspecification themes are usually overlooked in the Bayesian statistical literature, and it could hardly be otherwise: as long as one is concerned with actions to be taken, and these actions are chosen in accordance with the likelihood principle, misspecification is nothing more than a nuisance. Clearly the paper above adds a (welcome) frequentist perspective.
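For contrast with the paper's negative result, here is a minimal sketch (my own toy example, not the paper's construction) of the *textbook* behavior under misspecification: when the true distribution P lies outside the model M, the posterior typically concentrates on the element of M closest to P in KL divergence. The model family, parameter grid, and prior below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# True distribution P: Bernoulli(0.3), deliberately placed outside the model.
p_true = 0.3
n = 2000
data = rng.binomial(1, p_true, size=n)

# Misspecified model M: Bernoulli(theta) with theta restricted to [0.6, 0.9],
# so P itself is not in M. Uniform prior on a finite grid of thetas.
thetas = np.linspace(0.6, 0.9, 61)
log_prior = np.full(len(thetas), -np.log(len(thetas)))

# Posterior over the grid via the Bernoulli log-likelihood.
k = data.sum()
log_lik = k * np.log(thetas) + (n - k) * np.log(1 - thetas)
log_post = log_prior + log_lik
log_post -= log_post.max()          # stabilize before exponentiating
post = np.exp(log_post)
post /= post.sum()

# The posterior mode sits at theta = 0.6, the KL-closest point of M to P.
theta_map = thetas[np.argmax(post)]
print(theta_map)
```

In this simple i.i.d. setting the posterior lands on the KL projection of P onto M, which is the benign outcome one might naively expect; the point of the paper is that in classification settings this can fail, with the posterior piling mass on distributions much farther from P than the best available Q.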