R LDA package updated to version 1.2 and an ideal-point model for political blogs

I’ve been on a bit of an R tear lately. Today you should see a new version of the R lda package. This version has lots of fixes, including a working mmsb demo with the latest version of ggplot2, corrected RTM code, improved likelihood reporting, better documentation, and much more. Grab it from CRAN today! Special thanks to the following people for bug reports/feature requests (sorry if I forgot anyone):

  • Edo Airoldi
  • Jordan Boyd-Graber
  • Khalid El-Arini
  • Roger Levy
  • Solomon Messing
  • Joerg Reichardt

One of the new features is a method to make sLDA predictions on response variables conditioned on documents. In the demo accompanying the package, I fit an sLDA model to a corpus of political blogs tagged as being either liberal or conservative. With this fitted model, I can now use the new predict method to predict the political bent of each of the blogs within a continuous space. The density plot of these predictions is given below, broken down by the original conservative/liberal label (color of shading).

I like how there’s some bimodality within each label: a moderate group and a more extreme group. The model also predicts a heavy tail of super-conservative blogs. There is a really notable bump down around -3. I dunno if this represents reality; it’s probably worthwhile to do more extensive model checking.
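
For anyone who wants to reproduce a plot like this, here is a rough sketch rather than the exact demo code. It assumes the `slda.em` and `slda.predict` functions and the bundled `poliblog.*` datasets from the CRAN lda package, plus a recent ggplot2. The number of topics, the hyperparameters, and the iteration counts are illustrative assumptions, and argument names may differ between package versions.

```r
# Sketch only: assumes the CRAN 'lda' package and its bundled poliblog data.
library(lda)
library(ggplot2)

set.seed(8675309)

data(poliblog.documents)  # corpus in the lda package's document format
data(poliblog.vocab)      # character vector of vocabulary terms
data(poliblog.ratings)    # one numeric liberal/conservative label per document

num.topics <- 10  # assumed value, chosen for illustration

# Random +/-1 starting values for the per-topic regression coefficients.
params <- sample(c(-1, 1), num.topics, replace = TRUE)

# Fit sLDA with the (rescaled) labels as the response variable.
fit <- slda.em(documents = poliblog.documents,
               K = num.topics,
               vocab = poliblog.vocab,
               num.e.iterations = 10,
               num.m.iterations = 4,
               alpha = 1.0, eta = 0.1,
               annotations = poliblog.ratings / 100,
               params = params,
               variance = 0.25,
               logistic = FALSE,
               method = "sLDA")

# Predict a continuous response for every document from the fitted model.
predictions <- slda.predict(poliblog.documents,
                            fit$topics, fit$model,
                            alpha = 1.0, eta = 0.1)

plot.df <- data.frame(prediction = as.numeric(predictions),
                      label = factor(poliblog.ratings))

# Overlaid densities of the predictions, shaded by the original label.
p <- ggplot(plot.df, aes(x = prediction, fill = label)) +
  geom_density(alpha = 0.5) +
  labs(x = "predicted rating", y = "density")
print(p)
```

Because the sampler is stochastic, the exact shape of the densities will vary from run to run and with the seed, so this should not be expected to match the figure exactly.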

3 responses to “R LDA package updated to version 1.2 and an ideal-point model for political blogs”

  1. Pingback: R LDA package updated to version 1.2 and an ideal-point model for … | Politics Blog

  2. Hi Jonathan,

    Sorry to bother you here in the comment section, but I need some help. If there’s a better way to contact you, please tell me. However, here is my “little” problem:

    I am using LDA to discover topics (what a surprise 🙂 ). Now my goal is to create a plot of, let’s say, the K most frequent topics (a simple histogram of the distribution of topics across ALL my documents). It should look like your demo plot, which you constructed with topic.proportions.df, but it should cover all the documents. Maybe you could help me put together valid ggplot2 code, since as you can see I am really a “noob” at this R programming. I love it though, and I will get into it more.

    Thanks in advance.

    Best regards,

    Kai

    P.S.: excuse my bad English (-;

  3. Farshad

    I want to use the MMSB model in the lda package with a held-out strategy like the one in this paper: http://www.cs.purdue.edu/homes/alanqi/papers/YanXuQi-MVGP-UAI2011.pdf
    Can someone tell me how to manipulate the input data to hide some nodes?
