Our Blog

Upload your contribution

This post will walk you through the steps to prepare your files for submission and upload them to the submission platform. The organizer of your group (i.e. your professor or TA) will provide a link to the submission platform.

1. Save your predictions as prediction.csv.

This file should be structured the same way as the “prediction.csv” file provided as part of your data bundle.

This file should have 4,242 rows: one for each case, covering both the training cases and the held-out test cases.

We are asking you to make predictions for all 4,242 cases, which include both the training cases from train.csv and the held-out test cases. We would prefer that you not simply copy the training cases from train.csv to prediction.csv. Instead, please submit the predictions that come out of your model. This way, we can compare performance on the training and test sets, to see whether those who fit closely to the training set perform more poorly on the test set (see our blog post on overfitting). Your scores will be determined on the basis of test observations alone, so your predictions for the cases included in train.csv will not affect your score.

Some observations are truly missing: we do not have the true answer for these cases because respondents did not complete the interview or did not answer the question. This is true for both the training and the test sets. Your predictions for these cases will not affect your scores. We are asking you to make predictions for missing cases because it is possible that we will find those respondents sometime in the future and uncover the truth. It will be scientifically interesting to know how well the community model was able to predict these outcomes, which even the survey staff did not know at the time of the Challenge.

This file should have 7 columns: one for the ID number and one for each of the 6 outcomes. They should be named:

challengeID, gpa, grit, materialHardship, eviction, layoff, jobTraining
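
One way to check this structure before uploading is a short script. The Python sketch below is only an illustration, not part of the official instructions: the function name check_prediction_file is hypothetical, and it assumes the file has a header row, that column order does not matter, and that each challengeID should appear only once.

```python
import csv

# Column names and row count as stated in the instructions above.
EXPECTED_COLUMNS = ["challengeID", "gpa", "grit", "materialHardship",
                    "eviction", "layoff", "jobTraining"]
EXPECTED_ROWS = 4242


def check_prediction_file(path):
    """Return a list of problems found in the file (empty if none were found)."""
    problems = []
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        columns = reader.fieldnames or []  # None if the file is empty
        rows = list(reader)                # blank lines are skipped

    missing = [c for c in EXPECTED_COLUMNS if c not in columns]
    extra = [c for c in columns if c not in EXPECTED_COLUMNS]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")

    if len(rows) != EXPECTED_ROWS:
        problems.append(f"expected {EXPECTED_ROWS} rows, found {len(rows)}")

    # Assumption: one row per case means no challengeID is repeated.
    if "challengeID" in columns:
        ids = [row["challengeID"] for row in rows]
        if len(set(ids)) != len(ids):
            problems.append("duplicate challengeID values")

    return problems
```

Calling check_prediction_file("prediction.csv") and printing the result gives a quick list of anything that looks off; an empty list means the file passed these particular checks, not that the platform is guaranteed to accept it.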

The top of the file will look like this (numbers here are random). challengeID numbers can be in any order.
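
As an illustration of that layout, here is a minimal Python sketch that writes a file with the required header and one row per case. It is not the organizers' code. The helper names write_predictions and make_placeholder_predictions are hypothetical, the placeholder values are random numbers rather than real predictions, and the assumption that challengeID runs from 1 to 4,242 should be checked against the prediction.csv in your data bundle.

```python
import csv
import random

# Column names as listed in the instructions above.
COLUMNS = ["challengeID", "gpa", "grit", "materialHardship",
           "eviction", "layoff", "jobTraining"]


def write_predictions(predictions, path="prediction.csv"):
    """Write one row per case.

    `predictions` maps each challengeID to a dict of the 6 outcomes.
    A misspelled outcome name raises ValueError instead of being
    silently written as an extra column.
    """
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        for challenge_id, outcomes in predictions.items():
            writer.writerow({"challengeID": challenge_id, **outcomes})


def make_placeholder_predictions(n_cases=4242, seed=0):
    """Random stand-in values; replace these with your model's output.

    Assumption: IDs run from 1 to n_cases.
    """
    rng = random.Random(seed)
    return {
        challenge_id: {name: round(rng.random(), 3) for name in COLUMNS[1:]}
        for challenge_id in range(1, n_cases + 1)
    }
```

Running write_predictions(make_placeholder_predictions()) produces a prediction.csv whose first line is the header and whose remaining 4,242 lines each hold an ID followed by six numbers. In a real submission, the dictionary would be filled from your model rather than from a random number generator.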

 

2. Save your code.

3. Create a narrative explanation of your study. This should be saved in a file called “narrative” and can be a text file, PDF, or Word document.

At the top of this narrative explanation, tell us the names and email addresses of everyone on the team that produced the submission, or just your own if you worked alone, in the format:

Homer Simpson,
homer@gmail.com

Marge Simpson,
msimpson@gmail.com

Then, tell us how you developed the submission. This might include your process for preparing the data for analysis, the methods you used in the analysis, how you chose the submission you settled on, things you learned, etc.

4. Zip all the files together in one folder.

It is important that the files be zipped in a folder with no sub-directories. Instructions are different for Mac and Windows.

On Mac, highlight all of the individual files.

Right click and choose “Compress 3 items”.

On Windows, highlight all of the individual files.

Right click and choose “Send to -> Compressed (zipped) folder”.
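
If you prefer to build the archive from a script, the same flat layout can be produced with Python's standard zipfile module. This is a sketch under assumptions, not an official tool: the function name zip_flat, the archive name submission.zip, and the example file names other than prediction.csv are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_flat(files, archive="submission.zip"):
    """Zip the given files at the top level of the archive (no sub-directories)."""
    paths = [Path(f) for f in files]
    names = [p.name for p in paths]
    # Two files with the same base name would collide once directories are dropped.
    if len(set(names)) != len(names):
        raise ValueError("files must have distinct names")

    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            # arcname=path.name drops any directory part of the path,
            # so every file lands at the root of the archive.
            zf.write(path, arcname=path.name)
    return archive
```

For example, zip_flat(["prediction.csv", "my_code.R", "narrative.txt"]) would create submission.zip containing exactly those three files at the top level, wherever they are stored on disk.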

5. Upload the zipped folder to the submission site. The organizers of your specific instance of the Fragile Families Challenge (i.e. your professor or TA) will provide the link.

Click the “Participate” tab at the top, then the “Submit / View Results” tab on the left. Click the “Submit” button to upload your submission.

6. Wait for the platform to evaluate your submission.

Click “Refresh status” next to your latest submission to view its updated status and see results when they are ready. If the evaluation succeeds, you will automatically be placed on the leaderboard when it finishes.

About Matt Salganik

Matthew Salganik is a Professor of Sociology at Princeton University. He is also the author of Bit by Bit: Social Research in the Digital Age (http://www.bitbybitbook.com). You can learn more about his research at http://www.princeton.edu/~mjs3.
