So, one more wrinkle to add to the pile. I was wondering why a “better” method should fare worse, and all the explorations below have only confirmed that it does. Well, there is one other way (besides the E-step and the M-step) in which the two differ: prediction. The link-prediction routines are different, and link prediction is also where the real performance differences lie. In the case of , we treat the model as a mixed-membership model. This means that no additional approximations are needed to compute the quantity of interest
. That is to say, rather than calculating
as one does for the ELBO during inference and then exponentiating, we calculate the desired marginal directly (which is easy, since we treat the covariates as indicators).
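As a concrete sketch of what computing the marginal directly looks like, here is a toy mixed-membership calculation. The names `theta_i`, `theta_j`, `B` and the numbers are illustrative assumptions, not taken from the model above: with indicator covariates, the marginal link probability is just a finite sum over community-pair assignments, so no approximation is required.

```python
import numpy as np

def link_prob_exact(theta_i, theta_j, B):
    """Exact marginal probability of a link between nodes i and j:
        p(y_ij = 1) = sum_{k,l} theta_i[k] * theta_j[l] * B[k, l]
    The covariates are indicators, so the marginal is a finite sum
    over community-pair assignments -- no approximation needed."""
    return theta_i @ B @ theta_j

# Illustrative memberships and block matrix (made-up numbers).
theta_i = np.array([0.7, 0.3])
theta_j = np.array([0.2, 0.8])
B = np.array([[0.9, 0.1],
              [0.1, 0.8]])
print(link_prob_exact(theta_i, theta_j, B))  # ~0.38
```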
A different approach is used for . There, we compute the expected log-likelihood, as in the ELBO, and then exponentiate. We can do this by applying a first-order approximation, which linearizes the term and lets us exchange the expectation and the logarithm. How much is lost by doing so?
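To make the gap concrete, here is a toy comparison of the two prediction rules (the numbers are illustrative assumptions, not outputs of either model): taking the expectation of the likelihood directly versus exponentiating the expected log-likelihood. By Jensen's inequality the latter is a lower bound, so the first-order route systematically underestimates the link probability.

```python
import numpy as np

# Per-assignment link probabilities B[k, l] flattened over (k, l),
# and their posterior weights theta_i[k] * theta_j[l].
# (Illustrative numbers, not taken from the models above.)
p = np.array([0.9, 0.1, 0.1, 0.8])
w = np.outer([0.7, 0.3], [0.2, 0.8]).ravel()

exact = np.dot(w, p)                   # E[p]: the true marginal
approx = np.exp(np.dot(w, np.log(p)))  # exp(E[log p]): the first-order route

# Jensen's inequality: exp(E[log p]) <= E[p], so the approximate
# rule underestimates the link probability whenever p varies
# across assignments.
print(exact, approx)
```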
Instead of answering this question directly, I ask another: how much is gained by doing the right thing on ? I rewrote that computation to better mirror what we were doing in the
case. Answer:
where
is my shorthand for
with the incorrect prediction scheme. So we see that all of the gains from going with
evaporate when we change how prediction is done. This suggests that the real culprit may be how we predict using
…