Materials We Provide
| Material | Description | Link |
| --- | --- | --- |
| Option 1: IMDB Starter Code | Project Prompts and Description | Here |
| Option 1: IMDB Dataset | IMDB Dataset | Here |
| Option 1: IMDB Solution Code | Sample solutions for project questions (Instructors Only) | Here |
| Option 2: Chipotle Starter Code | Project Prompts and Description | Here |
| Option 2: Chipotle Dataset | Dataset File | Here |
| Option 2: Chipotle Solution Code | Sample solutions for project questions (Instructors Only) | Here |
Note: Instructors should withhold project solutions until students have submitted their drafts.
For this project, you will conduct basic exploratory data analysis, practicing your data analysis skills while becoming comfortable with Python (Pandas not required).
For this project, we have provided two options. Students should choose one of the following options, then complete all of the required sections for the option they've chosen:
Option 1: Best for New Programmers
Using your new Python skills, complete a series of guided prompts exploring the top-rated movies on IMDB. IMDB stands for "the Internet Movie Database," an online collection of film information and reviews.
In these exercises, students will be looking to answer such questions as:
- What is the average rating per genre?
- How many different actors are in a movie?
The IMDB dataset provided was created from data scraped from the [Internet Movie Database website](https://www.imdb.com). The dataset describes top-ranking movies, including: title, data, duration, content rating, headlining actors, and ranking.
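As a rough illustration of the kind of question above, here is a minimal sketch of "average rating per genre" using only the standard library. The column names (`title`, `genre`, `star_rating`) and the three toy rows are assumptions made up for this example; check the provided dataset for its actual headers and values.

```python
import csv
import io
from collections import defaultdict

# Toy rows invented for illustration only; the real file's
# column names and contents may differ.
SAMPLE_CSV = """title,genre,star_rating
Movie A,Drama,9.0
Movie B,Drama,8.0
Movie C,Comedy,7.5
"""


def average_rating_per_genre(csv_file):
    """Return {genre: mean rating} for rows read from an open CSV file."""
    totals = defaultdict(float)  # sum of ratings seen for each genre
    counts = defaultdict(int)    # number of movies seen for each genre
    for row in csv.DictReader(csv_file):
        totals[row["genre"]] += float(row["star_rating"])
        counts[row["genre"]] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}


averages = average_rating_per_genre(io.StringIO(SAMPLE_CSV))
print(averages)
```

To try this on a real file, pass an open file handle (for example `open(path, newline="")`) in place of the `io.StringIO` wrapper.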
Option 2: Best for Intermediate Programmers
Using Python, conduct some exploratory data analysis on Chipotle's order data. You will be looking to answer such questions as:
- How many orders are being made?
- What is the average price per order?
- How many different ingredients are there?
The Chipotle data set is taken from "The Upshot" column in The New York Times. It was chosen because the data is from a familiar source representing real-world consumer transaction data - plus their guacamole is delicious.
This dataset was analyzed in-depth by data scientists from the New York Times. We have modified our questions based on their analysis, but we encourage students not to review their analysis until after they have made their own attempt.
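One possible way to approach the order questions without Pandas is sketched below. It assumes a tab-separated file with hypothetical columns `order_id` and `item_price`, that prices are stored as strings such as `$2.50`, and that each price already covers the full quantity on that line. The rows are toy values invented for the example, so verify every one of these assumptions against the actual dataset.

```python
import csv
import io

# Toy rows invented for illustration; real column names and values may differ.
SAMPLE_ROWS = [
    ["order_id", "quantity", "item_name", "item_price"],
    ["1", "1", "Chips", "$2.50"],
    ["1", "1", "Burrito", "$8.50"],
    ["2", "2", "Soda", "$3.00"],
]
SAMPLE_TSV = "\n".join("\t".join(row) for row in SAMPLE_ROWS)


def order_totals(tsv_file):
    """Return {order_id: total price} for rows read from an open TSV file."""
    totals = {}
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        # Assumption: item_price is the line total, written with a leading "$".
        price = float(row["item_price"].lstrip("$"))
        totals[row["order_id"]] = totals.get(row["order_id"], 0.0) + price
    return totals


totals = order_totals(io.StringIO(SAMPLE_TSV))
number_of_orders = len(totals)  # one dictionary key per distinct order
average_price = sum(totals.values()) / number_of_orders
print(number_of_orders, average_price)
```

Grouping by order first matters here: averaging over rows would give the average price per line item, not per order.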
In a Jupyter Notebook, create working solutions for all of the required questions for the Option you've chosen. Your notebook should include:
Text for each question, copied and pasted from the starter code provided.
A working solution to each problem.
- Do not include test, practice, or broken code (unless you were unable to create a working solution).
Comments for all of your code.
- In your comments, describe any assumptions you made in order to solve these problems.
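For a sense of what a commented solution with stated assumptions might look like, here is a small sketch of a single notebook cell. The storage format of the actors field and the sample names are hypothetical, chosen only to show the commenting style; your own assumptions should reflect what you actually find in the data.

```python
import ast

# Question (copy the exact wording from the starter code):
# "How many different actors are in a movie?"


def count_distinct_actors(actors_field):
    """Count the unique actor names in one movie's actors field."""
    # Assumption: the field is a string holding a Python-style list of names,
    # e.g. "['Actor A', 'Actor B']". Adjust the parsing if the data differs.
    names = ast.literal_eval(actors_field)
    # Assumption: names that differ only by case or surrounding spaces
    # refer to the same person, so normalize before de-duplicating.
    return len({name.strip().lower() for name in names})


example_count = count_distinct_actors("['Actor A', 'Actor B', 'actor a ']")
print(example_count)
```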
Optional: After completing the required portions, try your hand at the other option or complete the bonus sections for an additional challenge!
For all projects, requirements will be evaluated on the simple point scale below. Additionally, instructors will provide you with feedback on required portions of your project.
| Score | Expectations |
| --- | --- |
| 1 | Does not meet expectations. |
| 2 | Meets expectations, good job! |
| 3 | Surpasses our wildest expectations! |
Note: Scores of 2 mean that a requirement has been completely fulfilled, while 3 is typically reserved for bonus objectives.
Your instructor will explain how to submit your assignment. Typically, this is done either by:
- Creating a repository in your GitHub profile, hosting your materials, and sharing a link with your instructor. [or]
- Forking the project repository, adding your solutions, and submitting a pull request back to the relevant repo.