Daily Archives: September 2, 2009

R LDA package minor update: 1.0.1

Dave gently reminded me that properly assessing convergence of our models is important and that just running a sampler for N iterations is unsatisfactory. I agree wholeheartedly. As a first step, the collapsed Gibbs sampler in the R LDA package can now optionally report the log likelihood (to within a constant). For example, we can rerun the model fit in demo(lda) but with an extra flag set:

result <- lda.collapsed.gibbs.sampler(cora.documents,
                                      K,  ## Num clusters
                                      cora.vocab,
                                      25,  ## Num iterations
                                      0.1,
                                      0.1,
                                      compute.log.likelihood=TRUE)

Using the now-available variable result$log.likelihoods, we can plot the progress of the sampler versus iteration:
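As a rough sketch of how that plot could be produced: the two helpers below, log_likelihood_trace and plot_log_likelihood, are hypothetical names of my own and not part of the lda package. I am also assuming that result$log.likelihoods holds one value per iteration, either as a plain vector or as a matrix with one column per iteration (the layout may differ between package versions), so check str(result$log.likelihoods) on your own fit first.

```r
## Hypothetical helper (not part of the lda package): pull a single
## log likelihood trace out of a fitted model.
log_likelihood_trace <- function(result, row = 1) {
  ll <- result$log.likelihoods
  if (is.null(ll)) {
    stop("no log likelihoods found; refit with compute.log.likelihood=TRUE")
  }
  ## Assumption: this is either a plain vector or a matrix with one
  ## column per iteration, depending on the package version.
  if (is.matrix(ll)) ll[row, ] else as.numeric(ll)
}

## Hypothetical helper: draw the trace against iteration number and
## return it invisibly for further inspection.
plot_log_likelihood <- function(result, row = 1) {
  trace <- log_likelihood_trace(result, row)
  plot(seq_along(trace), trace, type = "l",
       xlab = "Iteration",
       ylab = "Log likelihood (up to a constant)")
  invisible(trace)
}
```

With the model fit above, plot_log_likelihood(result) should draw the curve. If the trace is still climbing at the last iteration, 25 iterations is probably too few.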

[Figure: log likelihood as a function of iteration]
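Eyeballing a curve like this is only a first step. One crude way to put a number on "has it flattened out?" is sketched below. The function has_leveled_off, its window and tol parameters, and the rule itself are my own assumptions for illustration, not anything the lda package provides, and this is not a substitute for a proper convergence diagnostic.

```r
## Hypothetical heuristic (an assumption, not from the lda package):
## compare the mean of the last `window` log likelihood values with the
## mean of the `window` values before them, and report TRUE when the
## relative change is at most `tol`.
has_leveled_off <- function(trace, window = 5, tol = 1e-3) {
  n <- length(trace)
  ## Too few iterations to compare two full windows.
  if (n < 2 * window) return(FALSE)
  recent <- mean(trace[(n - window + 1):n])
  previous <- mean(trace[(n - 2 * window + 1):(n - window)])
  abs(recent - previous) <= tol * abs(previous)
}
```

Here trace is a plain numeric vector of per-iteration log likelihoods. A FALSE result suggests running the sampler longer; a TRUE result only says the trace is flat, not that the chain has mixed.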

Grab it while it’s hot: http://www.cs.princeton.edu/~jcone/lda_1.0.1.tar.gz.

