The information divergence of a probability measure $P$ from an exponential family $\mathcal{E}$ over a finite set is defined as the infimum of the divergences of $P$ from $Q$, subject to $Q\in\mathcal{E}$. All directional derivatives of the divergence from $\mathcal{E}$ are found explicitly. To this end, the behaviour of the conjugate of a log-Laplace transform on the boundary of its domain is analysed. The first-order conditions for $P$ to be a maximizer of the divergence from $\mathcal{E}$ are presented, including new ones for the case when $P$~is not projectable to $\mathcal{E}$.
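A minimal rendering of the definition stated above, assuming the standard Kullback--Leibler divergence over a finite ground set $Z$ (the notation $D(P\,\|\,\mathcal{E})$ is introduced here for illustration only):
\[
  D(P\,\|\,Q) \;=\; \sum_{z\in Z} P(z)\,\log\frac{P(z)}{Q(z)},
  \qquad
  D(P\,\|\,\mathcal{E}) \;=\; \inf_{Q\in\mathcal{E}} D(P\,\|\,Q).
\]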
Keywords: Kullback--Leibler divergence; relative entropy; exponential family; information projection; log-Laplace transform; cumulant generating function; directional derivatives; first order optimality conditions; convex functions; polytopes
AMS: 94A17; 62B10; 60A10; 52A20