9012b62 Jan 10, 2019

KNN & Classification

Unit 3: Required


Materials We Provide

| Topic    | Description                                 | Link |
|----------|---------------------------------------------|------|
| Lesson   | K-Nearest Neighbors with Scikit-Learn       | Here |
| Solution | Solution code for lesson prompts            | Here |
| Data     | 2015 Season Statistics for ~500 NBA Players | Here |
|          | The Iris Dataset (Flowers)                  | Here |
| Practice | Two sample activities to practice KNN       | Here |
| Slides   | Sample slide deck for lesson topic (PPTX)   | Here |

This lesson uses two datasets: the Iris dataset and the NBA player statistics dataset. The Iris dataset is simple enough for students to build their own rules-based model by hand, and its features are easy to visualize for KNN. The NBA dataset produces a clear curve of model performance against K, which makes it well suited to demonstrating how to choose K.


Learning Objectives

After this lesson, students should be able to:

  • Apply the KNN model to the Iris dataset.
  • Implement scikit-learn's KNN model.
  • Assess the fit of a KNN model using scikit-learn.
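
As a rough illustration of the last two objectives, the following minimal sketch fits scikit-learn's `KNeighborsClassifier` with K=1 and scores it on a held-out test set. It is not the lesson's solution code. It assumes the copy of the Iris data bundled with scikit-learn (`load_iris`) rather than the lesson's own data file, and the 25% test size and `random_state=42` are arbitrary choices made for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: use the Iris data bundled with scikit-learn
# (150 flowers, 4 numeric features, 3 species).
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing. stratify=y keeps the species
# proportions roughly the same in the training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# K=1: each prediction copies the label of the single closest training row.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

# Compare accuracy on rows the model has seen with rows it has not.
train_accuracy = accuracy_score(y_train, knn.predict(X_train))
test_accuracy = accuracy_score(y_test, knn.predict(X_test))

print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
```

With K=1, every training row is its own nearest neighbor, so training accuracy says little about how well the model generalizes. The test accuracy is the more honest measure of fit.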

Student Requirements

Before this lesson, students should already be able to:

  • Load, explore, and manipulate data using Pandas
  • Create simple visualizations with Matplotlib
  • Interpret statistical information from box and scatter plots
  • Describe the statistical meaning of an "error"

Lesson Outline

TOTAL (170 min)

  • Learning Objectives (5 min)
  • Overview of the Iris Data Set (10 min)
    • Terminology
  • Exercise: "Human Learning" With Iris Data (60 min)
  • Human Learning on the Iris Data Set (10 min)
  • K-Nearest Neighbors (KNN) Classification (30 min)
    • Using the Train/Test Split Procedure (K=1)
  • Tuning a KNN Model (30 min)
    • What Happens If We View the Accuracy of our Training Data?
    • Training Error Versus Testing Error
  • Standardizing Features (15 min)
    • Use StandardScaler to Standardize Our Data
  • Comparing KNN With Other Models (10 min)
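
The tuning and standardizing steps in the outline can be sketched roughly as follows. This is a hedged illustration, not the lesson notebook's code. It assumes the Iris data bundled with scikit-learn instead of the NBA file, an arbitrary K range of 1 to 30, and a hypothetical helper named `error_rates` introduced only for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled Iris data stands in for the lesson's datasets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)


def error_rates(k, scale=False):
    """Hypothetical helper: return (training error, testing error) for K=k."""
    knn = KNeighborsClassifier(n_neighbors=k)
    # The pipeline fits StandardScaler on the training rows only, so the
    # test rows do not influence the standardization.
    model = make_pipeline(StandardScaler(), knn) if scale else knn
    model.fit(X_train, y_train)
    # score() returns accuracy, so error is 1 - accuracy.
    return 1 - model.score(X_train, y_train), 1 - model.score(X_test, y_test)


# Training error versus testing error for each candidate K.
k_values = range(1, 31)
results = {k: error_rates(k) for k in k_values}

# min() returns the first minimum, so ties go to the smallest K.
best_k = min(results, key=lambda k: results[k][1])

for k in (1, 5, 15, 30):
    train_err, test_err = results[k]
    print(f"K={k:2d}  train error={train_err:.3f}  test error={test_err:.3f}")
print(f"lowest testing error at K={best_k}")

# Same K, with features standardized to zero mean and unit variance.
scaled_train_err, scaled_test_err = error_rates(best_k, scale=True)
print(f"K={best_k} with StandardScaler: test error={scaled_test_err:.3f}")
```

Choosing K from the test set, as done here for brevity, gives an optimistic estimate of performance. A separate validation set or cross-validation is the safer choice. Standardizing matters most when features are on very different scales, which is more likely with the NBA statistics than with the Iris measurements.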

Additional Resources

For more information on this topic, check out the following resources: