Unit 3: Required
Materials We Provide
|Lesson||K-Nearest Neighbors with Scikit-Learn||Here|
|Solution||Solution code for lesson prompts||Here|
|Data||2015 Season Statistics for ~500 NBA Players||Here|
|Data||The Iris Dataset (Flowers)||Here|
|Practice||Two sample activities to practice KNN||Here|
|Slides||Sample slide deck for lesson topic (PPTX)||Here|
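This lesson uses two datasets: the Iris dataset and the NBA player statistics dataset. The Iris dataset is small enough that students can build their own rules-based model by hand, and its classes are easy to visualize when introducing KNN. The NBA dataset produces a clear, well-behaved curve when model performance is plotted against K, which makes it a good fit for demonstrating how to choose K.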
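To make the rules-based idea concrete, here is a minimal sketch of a hand-written classifier for Iris. It assumes scikit-learn's bundled copy of the dataset (`load_iris`) rather than the lesson's own data file. The function name `classify_by_rules` and both thresholds are illustrative choices, not values taken from the lesson.

```python
import numpy as np
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled Iris data.
# Column order: sepal length, sepal width, petal length, petal width (cm).
iris = load_iris()
X, y = iris.data, iris.target


def classify_by_rules(row):
    """Hypothetical hand-written rules; thresholds chosen by eyeballing plots."""
    petal_length, petal_width = row[2], row[3]
    if petal_length < 2.5:
        return 0  # setosa
    if petal_width < 1.75:
        return 1  # versicolor
    return 2  # virginica


# Apply the rules to every flower and compare against the true labels.
predictions = np.array([classify_by_rules(row) for row in X])
rule_accuracy = (predictions == y).mean()
print(f"Rules-based accuracy on all {len(y)} flowers: {rule_accuracy:.2f}")
```

Students' own rules will differ. The point is that a few thresholds on petal measurements already separate the species fairly well, which gives a baseline to compare KNN against.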
After this lesson, students should be able to:
- Use the KNN model on the Iris dataset.
- Implement scikit-learn's KNN model.
- Assess the fit of a KNN model using scikit-learn.
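As a rough illustration of these objectives, the sketch below fits scikit-learn's `KNeighborsClassifier` on Iris and checks its accuracy on held-out data. The 25% test size, the `random_state` value, and the use of `load_iris` in place of the lesson's data file are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's bundled Iris data stands in for the lesson file.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows so the model is scored on unseen flowers.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# K=1: each test flower takes the label of its single nearest training flower.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

y_pred = knn.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"Testing accuracy with K=1: {test_accuracy:.2f}")
```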
Before this lesson, students should already be able to:
- Load, explore, and manipulate data using pandas.
- Create simple visualizations with Matplotlib.
- Interpret statistical information from box and scatter plots.
- Describe the statistical meaning of an "error".
TOTAL (170 min)
- Learning Objectives (5 min)
- Overview of the Iris Data Set (10 min)
- Exercise: "Human Learning" With Iris Data (60 min)
- Human Learning on the Iris Data Set (10 min)
- K-Nearest Neighbors (KNN) Classification (30 min)
  - Using the Train/Test Split Procedure (K=1)
- Tuning a KNN Model (30 min)
  - What Happens If We View the Accuracy of Our Training Data?
  - Training Error Versus Testing Error
- Standardizing Features (15 min)
  - Use StandardScaler to Standardize Our Data
- Comparing KNN With Other Models (10 min)
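The tuning and standardization steps in this outline might look roughly like the sketch below. It standardizes the features with `StandardScaler`, then records training and testing error for a range of K values. The Iris data from `load_iris`, the 30% test split, the `random_state`, and the K range of 1 to 20 are assumptions for illustration. The lesson itself draws its K curve from the NBA data, which is not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: Iris stands in for the lesson's datasets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit the scaler on the training rows only, then apply it to both sets,
# so no information from the test set leaks into the preprocessing.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# Record error (1 - accuracy) on both sets for each candidate K.
k_values = list(range(1, 21))
train_errors, test_errors = [], []
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train_std, y_train)
    train_errors.append(1 - knn.score(X_train_std, y_train))
    test_errors.append(1 - knn.score(X_test_std, y_test))

# Pick the K with the lowest testing error (first one wins on ties).
best_k = k_values[int(np.argmin(test_errors))]
print(f"Lowest testing error at K={best_k}")
```

At K=1 each training point is its own nearest neighbor, so training error is typically near zero regardless of how well the model generalizes. That is why the testing-error curve is the more informative one when choosing K.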
For more information on this topic, check out the following resources: