Data
Topic Overview
The Challenge
The dataset for BOW DataFest 2026 is provided by the American Statistical Association and will remain confidential until the opening ceremony on Friday, April 10 at 7:00 PM.
Past DataFest challenges have included data from diverse sources such as dating apps, online job platforms, and public health organizations. The datasets are large (typically 1-3 GB), messy, and designed to mimic the complexity of real-world data analysis projects.
Your mission is to explore the data, identify interesting patterns or insights, formulate research questions, and share your findings in a compelling 5-minute video presentation, followed by live questions from the judges.
Dataset Access
Receiving the Data
All registered teams will receive access to the dataset during the opening ceremony on Friday, April 10 at 7:00 PM. The data will be distributed via USB drives or a secure download link.
Data Description
A detailed data dictionary and schema will be provided alongside the dataset. This documentation will explain variable definitions, data collection methods, and any known limitations or quirks in the data.
Rules & Submission Guidelines
Competition Rules
Team composition:
Teams must have 2-5 members. All team members must be currently enrolled undergraduate students at Babson, Olin, or Wellesley.
Original work:
All analysis and presentation materials must be the original work of your team, completed during the competition weekend.
Attendance & competition hours:
While as many of your team members as possible should be present at the introduction at 6:15 PM on Friday, you do not have to be present for the entire duration of the competition. Team members can come and go as they please, but all work must be done during competition hours (i.e., no all-nighters!).
Friendly competition & support:
This is a friendly competition. We encourage you to collaborate and help other teams when they run into issues you know how to solve. There will be a Slack channel for posting questions, and faculty and mentors will be available throughout the weekend to help answer them.
Software:
There are no limitations on what software you use.
External resources allowed:
You may use any software, libraries, online documentation, or reference materials. However, you may not consult with people outside the event (e.g., family, friends, or professors not serving as mentors).
Confidentiality:
The dataset is confidential. You may not share it with anyone outside the competition or post it online.
Presentation format:
Each team must prepare a 5-minute video presentation, followed by live questions from the judges. Submission details will be confirmed and communicated during the opening ceremony.
Judging criteria:
Presentations will be evaluated on clarity of research question, appropriateness of analysis methods, quality of visualizations, insight and creativity, and overall presentation quality.
Submission Process
Detailed submission instructions, including file formats and upload links, will be provided during the opening ceremony and posted in your team workspace.