What would we do without the train/test paradigm? It gives us a convenient way of saying “Aha, I’m better than X et al. by 5%!” But rarely do we actually get data which has been partitioned into train/test by a higher power; instead we get a bunch of labeled data which we have to split [...]
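For concreteness, here is a minimal sketch of the do-it-yourself split described above: shuffle the labeled data and cut it in two. The function name `train_test_split`, the 80/20 ratio, the fixed seed, and the toy `labeled` list are all illustrative assumptions, not anything prescribed here, and a uniform random shuffle is only one of several ways one might choose to split.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and cut it into (train, test) lists.

    Hypothetical helper: the 20% test fraction and the seed are arbitrary
    choices made for illustration.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(examples)              # copy, so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy labeled data: (input, label) pairs, made up for this example.
labeled = [(f"doc{i}", i % 2) for i in range(10)]
train, test = train_test_split(labeled, test_fraction=0.2, seed=0)
```

Fixing the seed matters more than it looks: a different shuffle gives a different test set, so two people comparing numbers on "the same data" only agree if they also agree on the split.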