Approximating the logistic response

The central challenge of variational methods is usually computing expectations of log probabilities. In the case of the RTM, this is $\mathbb{E}[\log p(y \mid z, z')] = y\,\mathbb{E}[x] - \mathbb{E}[\log(1 + \exp(x))]$, where $x = \eta^T (z \circ z') + \nu$.

The first term is linear and so is easy enough; the second is the problematic one. One approach is to use a Taylor approximation. The issue then becomes choosing the point around which to center the approximation. The log partition function above really has two regimes: for very negative $x$, $\log(1 + \exp(x)) \approx 0$, but for large positive $x$, $\log(1 + \exp(x)) \approx x$. The solution that the delta method uses is to center it at the mean $\mu = \mathbb{E}[x]$. But does this give us any real guarantee that we wouldn't do better by centering it elsewhere?
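One way to make the question concrete, offered as a sketch rather than a full answer: write $f(x) = \log(1 + \exp(x))$ and $s(x) = 1/(1 + \exp(-x))$ (both names are introduced here only for convenience). Then $f'(x) = s(x)$ and $f''(x) = s(x)(1 - s(x))$, which always lies in $(0, 1/4]$. Expanding around $\mu$ with a Lagrange remainder gives

$$f(x) = f(\mu) + s(\mu)(x - \mu) + \tfrac{1}{2} f''(\xi)(x - \mu)^2$$

for some $\xi$ between $x$ and $\mu$. Taking expectations, the linear term vanishes precisely because the expansion is centered at the mean, which leaves

$$0 \le \mathbb{E}[\log(1 + \exp(x))] - \log(1 + \exp(\mu)) \le \tfrac{1}{8}\mathrm{Var}[x].$$

So the first-order approximation at the mean is a lower bound (this is just Jensen's inequality), and its error is controlled by the variance of $x$. That does not rule out some other center doing better for a particular distribution; it only says the error at the mean is small whenever the variance is.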

I couldn’t really answer this question analytically, so I decided to experiment. I sampled $x$ using settings typical of the corpora I look at. It turns out that the first-order approximation at the mean is really good, because the variance of $z$ is really low when you have enough words.
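For anyone who wants to try this kind of check, here is a minimal sketch in Python. It is not the experiment described above: the number of topics, the values of $\eta$ and $\nu$, the Dirichlet topic proportions, and the choice to model each $z$ as the fraction of a document's words assigned to each topic are all assumptions made up for illustration. The function `compare` (a hypothetical name) returns a Monte Carlo estimate of $\mathbb{E}[\log(1 + \exp(x))]$, the first-order approximation $\log(1 + \exp(\mu))$, a second-order version that adds half the curvature at $\mu$ times the variance, and the sample variance of $x$.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)


def compare(n_words, n_topics=10, n_samples=20000, seed=0):
    """Compare approximations of E[log(1 + exp(x))] for simulated x.

    All settings below are illustrative assumptions, not values from the post.
    """
    rng = np.random.default_rng(seed)
    eta = rng.normal(0.0, 3.0, size=n_topics)        # assumed regression weights
    nu = -1.0                                        # assumed intercept
    theta1 = rng.dirichlet(np.full(n_topics, 0.5))   # assumed topic proportions, doc 1
    theta2 = rng.dirichlet(np.full(n_topics, 0.5))   # assumed topic proportions, doc 2

    # z: fraction of each document's words assigned to each topic.
    z1 = rng.multinomial(n_words, theta1, size=n_samples) / n_words
    z2 = rng.multinomial(n_words, theta2, size=n_samples) / n_words

    # x = eta^T (z o z') + nu, one value per simulated document pair.
    x = (z1 * z2) @ eta + nu

    mu, var = x.mean(), x.var()
    monte_carlo = softplus(x).mean()                 # stand-in for the exact expectation
    first_order = softplus(mu)                       # approximation centered at the mean
    s = 1.0 / (1.0 + np.exp(-mu))
    second_order = first_order + 0.5 * s * (1.0 - s) * var
    return monte_carlo, first_order, second_order, var


if __name__ == "__main__":
    for n_words in (10, 100, 1000):
        mc, first, second, var = compare(n_words)
        print(f"words={n_words:5d}  var={var:.2e}  "
              f"MC={mc:.6f}  first={first:.6f}  second={second:.6f}")
```

Under these assumptions the variance of $x$ shrinks as the number of words grows, and the gap between the Monte Carlo estimate and the first-order value shrinks with it.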

That of course brings up another question: why does doing the “correct” thing ($\psi_\sigma$) not work as well as the “incorrect” approximation ($\psi_e$)?
