Getting scores on holdout data

As described in an earlier blog post, there will be a special issue of Socius devoted to the Fragile Families Challenge. We think that the articles in this special issue would benefit from reporting their scores on both the leaderboard data and the holdout data. However, we don’t want to release the holdout data on August 1 because that could lead to non-transparent reporting of results. Therefore, beginning on August 1, we will do a controlled release of the scores on the holdout data. Here’s how it will work:

  • All models for the special issue must be submitted by August 1.
  • Between August 1 and October 1 you can complete a web form requesting scores on the holdout data for a list of the models. We will send you those scores.
  • You must report all the scores you request in your manuscript or the supporting online material. This requirement is meant to prevent selective reporting of especially good results.

We realize that this procedure is a bit cumbersome, but we think the extra step is worthwhile to ensure the most transparent reporting of results possible.

Submit your request for scores here.

Matthew Salganik is a Professor of Sociology at Princeton University. He is also the author of the forthcoming book Bit by Bit: Social Research in the Digital Age (http://www.bitbybitbook.com). You can learn more about his research at http://www.princeton.edu/~mjs3.


Steve McKay - August 2, 2017

Could you release the hold-out scores for ‘baseline’, the mean prediction, as perhaps >1 person might want to use them? No problem if not, I can request …

Ian Lundberg - August 10, 2017

Good suggestion. When we reveal the hold-out scores, we’ll include “baseline” in the reported scores as though that was a real participant. We need to finish building our ensemble model before we open the holdout data, so we expect to make these scores available sometime in the next few weeks.

