---
layout: page
title: Graduate Project
nav_order: 3
description: Specifications for the grad project for Data 200.
markdown: kramdown
---

Graduate Project

{:.no_toc}

  • TOC {:toc}

Introduction

The graduate project is offered only to students enrolled in Data C200, CS C200A, or Data 200S. Other students are welcome to explore the questions and datasets in the project for personal learning, but their work will not be graded or counted towards their final grades.

The purpose of the project is to give students experience in both open-ended data science analysis and research in general.

Teamwork

You must work in groups of two or three students. To give everyone experience in collaborating on a data science project, individual projects are not allowed. Everyone in the same group will receive the same grade, except in exceptional circumstances.

Milestones and Grading Breakdown

| Milestones | Deadline (11:59 PM Pacific) | Event | Deliverables | Submission Link | Grading Weight |
| --- | --- | --- | --- | --- | --- |
| Milestone 1 | March 3 | Group Formation + Research Proposal | Project Proposal Form | Google Form{:target="_blank"} | 5% |
| Milestone 2 | March 17 | EDA | EDA Write-Up + Notebook | EDA Gradescope{:target="_blank"} | 10% |
| Milestone 3 | March 30 | Mandatory Check-In | Progress Report + Meeting Booking | Gradescope{:target="_blank"} | 10% |
| Milestone 4 | April 14 | Project Report First Draft | Final Report Draft Write-Up + Notebook | Gradescope{:target="_blank"} | 20% |
| Milestone 5 | April 21 | External Peer-Review | External Peer Review | Gradescope{:target="_blank"} | 7% |
| Final Submission | April 28 | Final Project Report | Final Project Report + Presentation Video | Project Report PDF Gradescope{:target="_blank"}<br>Project Report Code Gradescope{:target="_blank"}<br>CV Predictions Gradescope{:target="_blank"}<br>NLP Predictions Gradescope{:target="_blank"} | 42% |
| Weekly Internal Peer Reviews | Every Monday After M2 | Internal Peer-Review | Internal Peer Review | [Gradescope] (Please refer to corresponding link each week) | 6% |

Detailed expectations for each milestone listed above can be found in the "Milestone" section of each project topic (Computer Vision or Natural Language Processing), linked under Project Topics below. Please refer to the section for your chosen topic for specific requirements and guidelines.

In addition to these milestones, you will fill out weekly internal peer reviews, each accounting for 1% of your grade (6 internal reviews in total). Internal reviews show how each member of the group is contributing to the project and how tasks are distributed among members. They are graded on completion and submitted via Gradescope each week.
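As a quick sanity check on the weights in the table above, the following Python sketch confirms that they sum to 100% and shows one way the components could combine into a single project grade. The function name `project_grade` and the assumption that each component is scored as a fraction between 0 and 1 and combined linearly are illustrative only; the course's actual grade computation is not specified here.

```python
# Grading weights in percent, copied from the milestone table above.
WEIGHTS = {
    "Milestone 1": 5,
    "Milestone 2": 10,
    "Milestone 3": 10,
    "Milestone 4": 20,
    "Milestone 5": 7,
    "Final Submission": 42,
    "Weekly Internal Peer Reviews": 6,
}

# The listed weights should account for the whole project grade.
assert sum(WEIGHTS.values()) == 100


def project_grade(scores):
    """Return a weighted project grade in percent (hypothetical helper).

    `scores` maps a component name to the fraction of credit earned,
    between 0 and 1. Components missing from `scores` count as 0.
    Combining components linearly is an assumption of this sketch.
    """
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())


if __name__ == "__main__":
    # Example: full credit everywhere except 90% on the final submission.
    example = {name: 1.0 for name in WEIGHTS}
    example["Final Submission"] = 0.9
    print(f"Project grade: {project_grade(example):.1f}%")
```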

Late Policy

  • No Extensions for Milestones: Milestones must be submitted on time. No extensions or late submissions are permitted, because the milestones feed into the peer review process.
  • Final Report and Presentation Video: Late submissions incur a 10% penalty per day, for up to two days. Lateness is rounded up to the next full day (e.g., 2 minutes late counts as 1 day late).
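
The late-day arithmetic for the final report can be sketched in Python as follows. This is a minimal illustration, not the course's grading code: the helper names are hypothetical, and it assumes that the 10% is deducted from full credit per day and that submissions more than two days late are not accepted. The year in the example deadline is also an assumption.

```python
import math
from datetime import datetime, timedelta

DAILY_PENALTY = 0.10  # 10% per day late, from the policy above
MAX_LATE_DAYS = 2     # late window stated in the policy above
SECONDS_PER_DAY = 24 * 60 * 60


def days_late(deadline, submitted):
    """Whole days late, counting any partial day as a full day."""
    seconds = (submitted - deadline).total_seconds()
    if seconds <= 0:
        return 0
    return math.ceil(seconds / SECONDS_PER_DAY)


def late_multiplier(deadline, submitted):
    """Fraction of credit kept after the late penalty.

    Returns None when the submission falls outside the two-day window;
    treating that case as "not accepted" is an assumption of this sketch.
    """
    late = days_late(deadline, submitted)
    if late > MAX_LATE_DAYS:
        return None
    return 1.0 - DAILY_PENALTY * late


if __name__ == "__main__":
    # April 28, 11:59 PM deadline from the table; the year is assumed.
    deadline = datetime(2025, 4, 28, 23, 59)
    submitted = deadline + timedelta(minutes=2)
    print(days_late(deadline, submitted), late_multiplier(deadline, submitted))
```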

Accessing Datasets

All of the provided datasets are in the DataHub directory shared/course/data100-shared-readwrite/sp25_grad_project_data, and you can access them directly from DataHub. To work on the project locally, download the dataset files for each topic by right-clicking a file in JupyterLab and selecting "Copy Download Link". If you choose to train more complex models, DataHub might not have enough hardware resources or memory, in which case you can use Google Colab{:target="_blank"} or your local machine. If you would like to use Google Colab, see this link{:target="_blank"} to get started.
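
To see which files are available before loading anything, a notebook cell along these lines may help. It is a sketch under stated assumptions: the directory is taken from the text above, but resolving it relative to your DataHub home directory is a guess, and `list_dataset_files` is a hypothetical helper rather than part of any course library.

```python
from pathlib import Path

# Directory named in the text above. Resolving it relative to the home
# directory is an assumption; adjust the prefix if your setup differs.
DATA_DIR = Path.home() / "shared/course/data100-shared-readwrite/sp25_grad_project_data"


def list_dataset_files(root):
    """Return (relative path, size in MB) for every file under `root`.

    Results are sorted by path. An empty list is returned when `root`
    does not exist, e.g. when running outside DataHub.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size / 1e6)
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    for name, size_mb in list_dataset_files(DATA_DIR):
        print(f"{size_mb:10.2f} MB  {name}")
```

Checking file sizes first gives a rough idea of whether a dataset will fit in DataHub memory or whether Colab or a local machine is the better choice.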

Project Topics

Please choose one of the following projects (CV or NLP) and its associated datasets to work on. You are expected to complete both Task A and Task B for your chosen dataset. Use the links below for the details of each project.

Project 1: Computer Vision{:target="_blank"}

Project 2: Natural Language Processing{:target="_blank"}