title: API Lab
type: lab
duration: 1:25
creator:
  name: Francesco Mosconi
  city: SF

API Lab

Introduction

In this lab we will retrieve data from APIs and use it to solve two problems:

  1. build a decision tree regressor that estimates the quality of a wine
  2. analyze the top 250 movies on IMDB

This lab combines what you have learned about decision trees with what you have learned about APIs. We will start by using Sheetsu, a free service that turns any spreadsheet into an API. Then we will move on to scraping data from the Internet Movie Database (IMDB) and use the data we obtain to investigate top-grossing movies.
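To make the idea of a spreadsheet-backed API concrete, here is a minimal sketch using only the Python standard library. It assumes the service returns the sheet as a JSON array with one object per row and accepts a new row as a JSON body in a POST request. The endpoint URL, the helper names, and the column names in the usage note are placeholders for illustration, not values taken from the lab.

```python
import json
from urllib.request import Request, urlopen

# Placeholder endpoint (an assumption): replace it with the URL of your own sheet API.
API_URL = "https://example.com/your-sheet-api"


def parse_rows(payload):
    """Decode a JSON array of row objects into a list of dicts."""
    rows = json.loads(payload)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")
    return rows


def build_post_request(url, row):
    """Build (but do not send) a POST request carrying one new row as JSON."""
    body = json.dumps(row).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_rows(url):
    """GET the sheet and return its rows. This performs a network call."""
    with urlopen(url) as response:
        return parse_rows(response.read().decode("utf-8"))
```

For example, `parse_rows('[{"quality": "5", "pH": "3.51"}]')` returns a list with one dict per spreadsheet row, and `urlopen(build_post_request(API_URL, {"quality": "6"}))` would send a new row. Parsing is kept separate from fetching so that it can be tried without network access.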

Instructor notes:

  1. Check that the spreadsheet is available at https://docs.google.com/spreadsheets/d/1mZ3Otr5AV4v8WwvLOAvWf3VLxDa-IeJ1AVTEuavJqeo/

Exercise

Requirements

  1. Get Data From Sheetsu
  2. Post Data to Sheetsu
  3. Munge data
    • explore missing data
    • perform EDA
  4. Extract Features and Train Model
  5. IMDB Movies EDA
    1. Get top movies
    2. Get top movies data
    3. Get grossing data
    4. Munge data:
      • explore missing
      • use correct column types
    5. Vectorize text
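The munging and vectorizing steps above can be sketched in plain Python as follows. This is an illustration under assumptions, not the lab's solution: the helper names are hypothetical, the sample values in the usage note are made up, and in the lab itself you would more likely reach for a data-frame library and a ready-made text vectorizer.

```python
from collections import Counter


def count_missing(rows, missing=("", None)):
    """Count, per column, how many rows hold an empty string or None."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts.setdefault(key, 0)
            if value in missing:
                counts[key] += 1
    return counts


def to_float(value):
    """Convert text such as '$1,234.5' to a float; return None if impossible."""
    if value is None:
        return None
    text = str(value).replace("$", "").replace(",", "").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def bag_of_words(texts):
    """Return a sorted vocabulary and one word-count vector per text."""
    vocab = sorted({word for text in texts for word in text.lower().split()})
    matrix = []
    for text in texts:
        counts = Counter(text.lower().split())
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix
```

With made-up rows such as `[{"title": "A", "gross": "$1,000"}, {"title": "B", "gross": ""}]`, `count_missing` reports one missing `gross` value, and `to_float("$1,000")` gives `1000.0`. Returning `None` for unparseable values keeps missing data visible instead of silently turning it into zero.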

Bonus:

  • Final question: what is the relationship between top actors and a movie's gross?

Starter code

Starter Code

Solution Code

Useful Links