
Exploratory Data Analysis in Python

Unit Project

Materials We Provide

Item | Description | Link
--- | --- | ---
Option 1: IMDB Starter Code | Project Prompts and Description | Here
Option 1: IMDB Dataset | IMDB Dataset | Here
Option 1: IMDB Solution Code | Sample solutions for project questions (Instructors Only) | Here
Option 2: Chipotle Starter Code | Project Prompts and Description | Here
Option 2: Chipotle Dataset | Dataset File | Here
Option 2: Chipotle Solution Code | Sample solutions for project questions (Instructors Only) | Here

Note: Instructors should withhold project solutions until students have submitted their drafts.


Project Objectives

In this project, you will conduct basic exploratory data analysis, practicing your data analysis skills while becoming comfortable with Python (Pandas is not required).

We have provided two options. Choose one of them, then complete all of the required sections for the option you've chosen:

Option 1: Best for New Programmers

Using your new Python skills, complete a series of guided prompts exploring the top-rated movies on IMDB. IMDB stands for "the Internet Movie Database," an online collection of film information and reviews.

In these exercises, students will be looking to answer such questions as:

  • What is the average rating per genre?
  • How many different actors are in a movie?

The IMDB dataset provided was created from data scraped from the Internet Movie Database website. The dataset describes top-ranking movies, including: title, data, duration, content rating, headlining actors, and ranking.
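As a rough illustration of the first question, the sketch below computes an average rating per genre using only the standard library. The column names `genre` and `star_rating` and the sample rows are assumptions made for this example; check the header of the dataset you were given and adjust accordingly.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows for illustration only.
# The real dataset's column names and values may differ.
SAMPLE_CSV = """title,genre,star_rating
Movie A,Drama,9.0
Movie B,Drama,8.0
Movie C,Crime,8.5
"""


def average_rating_per_genre(csv_file):
    """Return {genre: mean rating} for rows read from an open CSV file."""
    totals = defaultdict(float)  # sum of ratings seen for each genre
    counts = defaultdict(int)    # number of movies seen for each genre
    for row in csv.DictReader(csv_file):
        totals[row["genre"]] += float(row["star_rating"])
        counts[row["genre"]] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}


averages = average_rating_per_genre(io.StringIO(SAMPLE_CSV))
for genre, mean in sorted(averages.items()):
    print(f"{genre}: {mean:.2f}")
```

To run this against a real file, pass an open file object instead of the `io.StringIO` wrapper, for example `open(path, newline="")`, where `path` points to your copy of the dataset.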

Option 2: Best for Intermediate Programmers

Using Python, conduct some exploratory data analysis on Chipotle's order data. You will be looking to answer such questions as:

  • How many orders are being made?
  • What is the average price per order?
  • How many different ingredients are there?

The Chipotle dataset is taken from "The Upshot" column in The New York Times. It was chosen because the data comes from a familiar source and represents real-world consumer transaction data. Plus, their guacamole is delicious.

This dataset was analyzed in depth by data scientists from The New York Times. We have modified our questions based on their analysis, but we encourage students not to review that analysis until after they have made their own attempt.
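One possible way to approach the order-count and average-price questions in plain Python is sketched below. It assumes a tab-separated file with hypothetical columns `order_id` and `item_price`, that prices are written like `$8.00`, and that each `item_price` already covers the full quantity on that line. None of these details are confirmed here, so verify them against the actual file before relying on the result.

```python
import csv
import io

# Hypothetical tab-separated sample for illustration only.
# Real column names and the price format may differ.
SAMPLE_TSV = (
    "order_id\tquantity\titem_name\titem_price\n"
    "1\t1\tChips\t$2.00\n"
    "1\t1\tChicken Bowl\t$8.00\n"
    "2\t2\tSteak Burrito\t$18.00\n"
)


def order_totals(tsv_file):
    """Return {order_id: total price}, summing every line item in an order."""
    totals = {}
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        # Assumption: item_price is the line total and starts with a dollar sign.
        price = float(row["item_price"].lstrip("$"))
        totals[row["order_id"]] = totals.get(row["order_id"], 0.0) + price
    return totals


totals = order_totals(io.StringIO(SAMPLE_TSV))
num_orders = len(totals)
average_price = sum(totals.values()) / num_orders
print(f"Orders: {num_orders}")
print(f"Average price per order: ${average_price:.2f}")
```

Grouping by order before averaging matters here: an order can span several rows, so averaging `item_price` across rows would answer a different question (average price per line item).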

Project Requirements

In a Jupyter Notebook, create working solutions for all of the required questions for the Option you've chosen. Your notebook should include:

  1. Text for each question, copied and pasted from the starter code provided.

  2. A working solution to each problem.

    • Do not include test, practice, or broken code (unless you were unable to create a working solution).

  3. Comments for all of your code.

    • In your comments, describe any assumptions you made in order to solve these problems.

  4. Optional: After completing the required portions, try your hand at the other option or complete the bonus sections for an additional challenge!
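To make these requirements concrete, here is a sketch of what a single notebook cell might look like. The question wording, the sample values, and the variable names are all hypothetical; the point is only the shape: the question text first, then stated assumptions, then a commented working solution.

```python
# Question (hypothetical wording): How many movies run longer than 120 minutes?

# Assumptions:
#   - durations are whole minutes stored as strings
#   - an empty string means the duration is missing, so that row is skipped
durations = ["142", "", "95", "175"]  # made-up sample values

# Keep only the rows that have a value, converting each one to an integer.
known_durations = [int(d) for d in durations if d]

# Count the movies whose running time exceeds two hours.
long_count = sum(1 for minutes in known_durations if minutes > 120)

print(f"Movies longer than 120 minutes: {long_count}")
```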


For all projects, requirements will be evaluated on a simple point scale of 0 to 3. Additionally, instructors will provide you with feedback on required portions of your project.

Score | Expectations
--- | ---
0 | Incomplete.
1 | Does not meet expectations.
2 | Meets expectations, good job!
3 | Surpasses our wildest expectations!

Note: A score of 2 means that a requirement has been completely fulfilled, while a score of 3 is typically reserved for bonus objectives.


Your instructor will explain how to submit your assignment. Typically, this is done either by:

  • Creating a repository in your GitHub profile, hosting your materials, and sharing a link with your instructor, or
  • Forking the project repository, adding your solutions, and submitting a pull request back to the relevant repo.