# Daily Archives: November 24, 2008

## Another Wrinkle…

So, one more wrinkle to add to the pile. I was wondering why a “better” method should fare worse, and all the explorations below have only confirmed that it does. Well, there’s one other way (other than the E-step and the M-step) in which the two differ: prediction. The link prediction routines are different, and link prediction is also where the real performance differences show up. In the case of $\psi_e$, we treat the model as a mixed-membership model. This means that no additional approximations are needed to compute the quantity of interest, $\mathbb{E}[p(y_{ij} | z_i, z_j)]$. That is to say, rather than calculating $\mathbb{E}[\log p(y_{ij} | z_i, z_j)]$, as one does for the ELBO during inference, and then exponentiating, we calculate the desired marginal directly (which is easy since we treat the covariates as indicators).

A different approach is used for $\psi_\sigma$. There, we compute the expected log likelihood, as in the ELBO, and then exponentiate. We can do this by applying a first-order approximation, which basically linearizes the term and lets us move the expectation around freely. How much is lost by this?

Instead of answering this question directly, I ask another question: how much is gained by doing the right thing on $\psi_e$? I rewrote that computation to better mirror what we were doing in the $\psi_\sigma$ case. Answer: $e > \sigma > e'$, where $e'$ is my shorthand for $\psi_e$ with the incorrect prediction scheme. So we see that all of the gains that you get by going with $\psi_e$ evaporate when we change how prediction is done. This indicates that maybe the real culprit is how we predict using $\psi_\sigma$.
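
To make the difference between the two prediction schemes concrete, here is a minimal sketch on a toy example. It is not the code used for the experiments above: the membership vectors `phi_i` and `phi_j`, the table `link_prob`, and both function names are hypothetical, and the sketch assumes that $z_i$ and $z_j$ are independent indicator variables. The first function computes $\mathbb{E}[p(y_{ij} | z_i, z_j)]$ directly; the second computes $\mathbb{E}[\log p(y_{ij} | z_i, z_j)]$ and then exponentiates.

```python
import math


def exact_marginal(phi_i, phi_j, link_prob):
    """E[p(y=1 | z_i, z_j)], summing over every pair of indicator values."""
    return sum(
        phi_i[k] * phi_j[l] * link_prob[k][l]
        for k in range(len(phi_i))
        for l in range(len(phi_j))
    )


def exp_expected_log(phi_i, phi_j, link_prob):
    """exp(E[log p(y=1 | z_i, z_j)]): the ELBO-style term, exponentiated."""
    expected_log = sum(
        phi_i[k] * phi_j[l] * math.log(link_prob[k][l])
        for k in range(len(phi_i))
        for l in range(len(phi_j))
    )
    return math.exp(expected_log)


# Hypothetical two-group setup (values chosen for illustration only).
phi_i = [0.7, 0.3]            # assumed membership distribution for node i
phi_j = [0.4, 0.6]            # assumed membership distribution for node j
link_prob = [[0.90, 0.05],    # assumed p(y=1 | z_i=k, z_j=l)
             [0.05, 0.80]]

exact = exact_marginal(phi_i, phi_j, link_prob)
approx = exp_expected_log(phi_i, phi_j, link_prob)
print(f"direct marginal:        {exact:.4f}")
print(f"exp of expected log:    {approx:.4f}")
```

By Jensen's inequality the second quantity can never exceed the first, and the two coincide only when the memberships put all their mass on pairs with the same link probability (for example, a single group each). How large the gap is in a real model depends on the fitted parameters, so this toy case says nothing about the size of the effect in the experiments.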