Monthly Archives: March 2009

Some simple attempts at feature selection

Not too long ago, John Langford stopped by and gave a fascinating talk. There were a lot of takeaways from the talk, but here’s one that really got my noodle going: a lot of times we get really high-dimensional … Continue reading

6 Comments

Filed under Uncategorized

Optimization instead of inference

You know, I’ve always taken it for granted that what we want to do is probabilistic inference, but lately I’ve been thinking more about what we really want and how to get there. To illustrate my point, consider our dear … Continue reading

6 Comments

Randomness makes parallelization interesting

A friend of mine recently posed a problem which at first blush seems simple but turns out to be quite interesting. Suppose you have a program which takes time and which is perfectly parallelizable; that is, I can … Continue reading

5 Comments

Another perspective on link probability functions

When deciding whether a link between two documents should exist, we need to define a function of the covariates, which we call a link probability function. We have looked at many candidate functions but concentrated on two such functions: Recently, … Continue reading

A generative model of binary data

We are often faced with data sets whose elements are vectors of binary data. For example, a tag corpus has, for each document, a binary vector as long as the tag vocabulary, whose elements indicate tag presence or absence. A corpus of … Continue reading
