Audioscrobbler is a really cool data set from a few years ago. Back then, Audioscrobbler had not yet been rolled into last.fm, but it had about the same functionality as it does now. Basically, it’s a little plugin for iTunes et al. that keeps track of all the artists you listen to. The listening habits of several thousand people were collected and distributed under a Creative Commons license.
After some normalization/cleanup, we end up with, for each user, the set of artists that user listens to.
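To make that concrete, here is a minimal sketch of how such per-user artist sets could be turned into a binary users-by-artists matrix. The function name `build_listen_matrix` and the toy data are made up for illustration; the post does not describe the actual cleanup pipeline.

```python
import numpy as np

def build_listen_matrix(user_artists):
    """Turn {user: set of artists} into a binary users-by-artists matrix.

    Hypothetical helper for illustration; not the original preprocessing.
    """
    # Fix a column order by sorting the artist names.
    artists = sorted({a for listened in user_artists.values() for a in listened})
    index = {a: j for j, a in enumerate(artists)}
    X = np.zeros((len(user_artists), len(artists)), dtype=np.int8)
    for i, listened in enumerate(user_artists.values()):
        for a in listened:
            X[i, index[a]] = 1  # user i listens to artist a
    return X, artists

# Toy data, invented for this example only.
toy = {
    "u1": {"Metallica", "Slayer"},
    "u2": {"Metallica", "Iron Maiden"},
    "u3": {"Bright Eyes"},
}
X, artists = build_listen_matrix(toy)
```

Each row of `X` is one user and each column one artist, which is the shape a binary pairwise model wants as input.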
This is the sort of co-occurrence statistic which Ising models are good at capturing. The Ising model contains a matrix of parameters which indicate the correlations between artists; that is, how much more (or less) likely a given user is to listen to both artists than you would expect from each artist's popularity alone.
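As a rough sketch of what that parameter matrix does, assume a 0/1 encoding (1 = the user listens to the artist), a symmetric matrix `W` with zero diagonal, and a per-artist bias `b`. These names and the encoding are assumptions for illustration, not necessarily the parameterization used for the results here.

```python
import numpy as np

def ising_log_potential(x, W, b):
    """Unnormalized log-probability of a binary vector x.

    Assumes W is symmetric with a zero diagonal, so the factor 0.5
    counts each artist pair exactly once.
    """
    x = np.asarray(x, dtype=float)
    return float(b @ x + 0.5 * x @ W @ x)

def conditional_prob(j, x, W, b):
    """P(x_j = 1 | all other entries of x) under the same model."""
    x = np.asarray(x, dtype=float)
    # Field on artist j from its bias plus every *other* active artist.
    field = b[j] + W[j] @ x - W[j, j] * x[j]
    return 1.0 / (1.0 + np.exp(-field))

# Two artists that are individually unpopular but strongly coupled.
W = np.array([[0.0, 2.0], [2.0, 0.0]])
b = np.array([-1.0, -1.0])
p_alone = conditional_prob(1, [0, 0], W, b)   # other artist not played
p_paired = conditional_prob(1, [1, 0], W, b)  # other artist played
```

A positive `W[i, j]` raises the odds of listening to artist `j` once artist `i` is in the user's set, which is the "tug" the rest of the post talks about.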
Because this is a rather high-dimensional problem, we can employ some L1 + L2 penalization; what we end up learning is a relatively sparse parameter matrix that is often easier to interpret.
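The post does not spell out how the penalty is applied, so the following is only a sketch of one standard form: an elastic-net style penalty `lam1 * |W| + 0.5 * lam2 * W^2` together with its proximal step. The names `lam1`, `lam2`, and `step` are assumptions. It shows why the L1 part produces exact zeros (sparsity) while the L2 part just shrinks what survives.

```python
import numpy as np

def elastic_net_penalty(W, lam1, lam2):
    """L1 + L2 penalty on the parameters (assumed form)."""
    return lam1 * np.abs(W).sum() + 0.5 * lam2 * (W ** 2).sum()

def prox_elastic_net(W, step, lam1, lam2):
    """Proximal step for the penalty above.

    Soft-thresholding (the L1 part) sets small entries exactly to zero;
    the division (the L2 part) shrinks the remaining entries.
    """
    shrunk = np.sign(W) * np.maximum(np.abs(W) - step * lam1, 0.0)
    return shrunk / (1.0 + step * lam2)

w = np.array([0.05, -0.3, 1.2])
w_new = prox_elastic_net(w, step=1.0, lam1=0.1, lam2=1.0)
```

In a proximal-gradient fit, a step like this would follow each gradient step on the (pseudo-)likelihood; the weak 0.05 entry is zeroed out, which is where the sparse, easier-to-read matrix comes from.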
With some magic (cough cough) we can learn this parameter matrix fairly quickly. I thought I’d post some of the correlations between artists here for your {be/a}musement.
Now the actual parameter matrix covers several thousand artists. Here, I’m selecting the 10 artists with the highest total correlations. You might say that these are the artists which tug most fiercely on other artists (the most cliquey artists, if you will). For each of these 10 artists, I show the 5 most highly correlated artists.
The results make pretty good sense; it’s actually kind of disturbing how predictable people’s musical tastes are. And for some reason the main cliques at the top of the list are all either metal bands or the sort of indie bands likely to populate OC soundtracks =). I should point out that if you go further down the list you eventually find a few other cliques such as trip hop (Portishead, Massive Attack, Lamb, Tricky, et al. [note to self: how cool would “et al.” be as a band name?]), 80s rock with remarkable staying power (Aerosmith, Bon Jovi, Guns N’ Roses), wuss rock (Counting Crows, DMB, Goo Goo Dolls), and just plain bad music (3DD, Hoobastank, Staind, Nickelback).
Artist… | …is correlated with | | | | |
Metallica | Iron Maiden | Megadeth | Pantera | Slayer | Nightwish |
In Flames | Dark Tranquillity | Soilwork | Children of Bodom | Arch Enemy | Dimmu Borgir |
The Arcade Fire | The Fiery Furnaces | Broken Social Scene | The Go! Team | Bloc Party | Stars |
Nightwish | Within Temptation | Sonata Arctica | Blind Guardian | Stratovarius | Therion |
Rammstein | Nightwish | Apocalyptica | KoЯn | Marilyn Manson | Metallica |
Belle and Sebastian | The Magnetic Fields | Neutral Milk Hotel | Yo La Tengo | Elliott Smith | Camera Obscura |
Iron Maiden | Judas Priest | Iced Earth | Helloween | Manowar | Bruce Dickinson |
Elliott Smith | Iron & Wine | The Decemberists | Bright Eyes | Sufjan Stevens | Belle and Sebastian |
Bright Eyes | Rilo Kiley | Death Cab for Cutie | Desaparecidos | Cursive | The Good Life |
Death Cab for Cutie | The Postal Service | Bright Eyes | The Shins | Rilo Kiley | Cursive |