Intuit Data Science - Spring 2019 Course Info
.gitignore
LICENSE.md
README.md
ds-installation-guide.pdf
git-instructions.md
intro-data-science.pdf

Intuit + GA: Data Science Course

Welcome to Data Science!

  1. Master Schedule
  2. Course Overview
  3. Your Team
  4. Projects
  5. Tech Requirements
  6. Student Expectations
  7. Python Practice Resources

Master Schedule

Fill this out after each class: Post-class exit ticket!

Office Hours:

  • Tuesdays 6-7pm (remote)

  • Thursdays 6-7pm (remote)

  • Book a time with Amit on Tuesday or Thursday

  • Tuesdays 6-10 pm (remote)

  • Sundays 12-6 pm (remote)

  • Book a time with Ed Salinas on Tuesday or Sunday

  • Fridays 12-1pm (in classroom)

You need to book a timeslot if you plan to attend remote office hours. A calendar booking link will be posted in Slack, and you will receive an invite to join a Zoom call at your scheduled time. You might like to download the Zoom client ahead of time at https://zoom.us/download#client_4meeting

Unit 1: Fundamentals

Week  Date (Fridays)  Class Topic
1     Feb. 1          Welcome to Data Science
                      ISL: Ch. 2.1
2     Feb. 8          Your Development Environment
3     Feb. 15         Python Foundations
4     Feb. 22         FLEX - Recommendation Systems
                      Unit 1 Quiz

Unit 2: Working with Data

Week  Date (Fridays)  Class Topic
5     Mar. 1          Exploratory Data Analysis in Pandas
                      Milestone: Unit 1 Project DUE - 11:59pm
6     Mar. 8          Experiments & Hypothesis Testing
7     Mar. 15         Data Visualization in Python
8     Mar. 22         Statistics in Python
                      Milestone: Unit 2 Project DUE
9     Mar. 29         FLEX - Intro to Natural Language Processing
                      Unit 2 Quiz
                      Milestone: Final Project: Project Proposal DUE

Unit 3: Data Science Modeling

Week  Date (Fridays)  Class Topic
10    Apr. 5          Linear Regression
                      ISL: Ch. 3.1-3, 3.5, 6.1, 6.2
11    Apr. 12         Train-Test Split & Bias-Variance
                      ISL: Ch. 2.2
12    Apr. 19         KNN / Classification
13    Apr. 26         Logistic Regression
                      ISL: Ch. 4.1-3
14    May 3           FLEX - Decision Trees, Random Forests, Bagging & Boosting
                      ISL: Ch. 8
                      Milestone: Final Project: Initial EDA DUE
                      Milestone: Unit 3 Project DUE

Unit 4: Data Science Applications

Week  Date (Fridays)  Class Topic
15    May 10          Unsupervised Learning (K-Means, Hierarchical)
                      Milestone: Final submission date for EDA and Project 3.
16    May 17          PCA & Anomaly Detection
                      Unit 3 Quiz
17    May 24          Intro to Time Series
18    May 31          Intro to Neural Networks
19    Jun. 7          FLEX
                      Unit 4 Quiz
                      Milestone: Final Project: Notebook Progress DUE

Capstone Project

Week  Date (Fridays)  Class Topic
20    Jun. 14         Capstone Preparation (Office Hours only)
21    Jun. 21         Capstone Preparation (Office Hours only)
22    Jun. 28         Capstone Preparation (Office Hours only)
23    Jul. 5          Capstone Preparation (Office Hours only)
24    Jul. 12         Final Project Presentations
                      Milestone: Final Project Presentation DUE

ISL: James, Gareth et al. "An Introduction to Statistical Learning." [PDF]


Your Instructional Team

Instructor:

Assistants:

Support Team:

Course Overview

Welcome to our Data Science Fundamentals course! We're building a global community of lifelong learners who are excited about using data to solve real business problems.

In this program, you will learn to use Python to explore datasets, build regression models, and communicate data-driven insights. Specifically, you will learn how to:

  • Define common approaches and considerations that data scientists use to solve real-world problems.
  • Perform exploratory data analysis with powerful programmatic tools in Python.
  • Build and refine basic regression and time series models to predict patterns from datasets.
  • Communicate data-driven insights to peers and stakeholders in order to inform business decisions.

What You Will Learn

Statistical Analysis with Python

  • Perform visual and statistical analysis on data using Python and its associated libraries and tools.

Data-Driven Decision-Making

  • Define and determine the trade-offs involving feature selection, model accuracy, and data quality.

Data Science Modeling Techniques

  • Explore supervised learning techniques, focusing primarily on linear and logistic regression.

Visualizations & Presentations

  • Create visualizations and interactive notebooks to present to business stakeholders.

Project Structure

This course will ask you to complete a series of projects in order to practice and apply the skills covered in class.

Unit Projects

At the end of each unit, you'll work on short structured projects. These activities will test your understanding of each unit’s most important concepts with in-class practice and instructor support.

For those of you who want to go above and beyond, we’ve also included stretch options, bonus activities, and other opportunities for further reading and practice.

Final Project

You'll also complete a final project, asking you to apply your skills to a business problem of your choice.

The capstone is an opportunity for you to demonstrate your new skills and tackle a pressing issue relevant to your team, division, or organization. You’ll create a hypothesis, analyze internal data, and generate a working model, prototype, solution, or recommendation.

You will get structured guidance and designated time to work throughout the course. Final project deliverables include:

  • Proposal: Describe your chosen problem and identify relevant data (while confirming you have access).
  • Brief: Share a summary of your initial analysis and next steps in order to get assistance from your instructional team.
  • Report: Submit a cleanly formatted Jupyter notebook (or other files) documenting your code and process for technical/peer stakeholders.
  • Presentation: Present a summary of your business problem, approach, and recommendation to an audience of non-technical executive stakeholders.

Project Breakdown

  1. Unit Project 1: Python Coding
  2. Unit Project 2: Exploratory Data Analysis
  3. Unit Project 3: Modeling
  4. Final Project: Solve an Intuit Business Problem
    • Part 1: Proposal & Dataset
    • Part 2: Initial EDA
    • Part 3: Solution Prototype
    • Part 4: Presentation

Technology Requirements

See the data science installation guide.

Please note: the curriculum materials for this course are written in Python 3.6.
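
If you want to confirm your interpreter before the first class, a quick check like the one below may help. It is only a sketch: the installation guide remains the authoritative reference, the helper name `meets_requirement` is made up for illustration, and treating 3.6 as a minimum (rather than an exact requirement) is an assumption based on the note above.

```python
import sys

# Assumption: 3.6 is treated as the minimum version, based on the note that
# the curriculum materials are written in Python 3.6.
REQUIRED = (3, 6)


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True if the (major, minor) version is at least `required`."""
    return tuple(version_info[:2]) >= required


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if meets_requirement():
        print("Python {} should work with the course materials.".format(current))
    else:
        print("Python {} is older than 3.6; see the installation guide.".format(current))
```

Save this as a file and run it with the same `python` command you plan to use for the course, so that it reports on the right interpreter.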


Expectations

  1. Be on time
  2. Be willing to ask questions
  3. Be willing to ask stupid questions
  4. This is not a competition. We are each climbing our own mountain, with the shared goal of learning.
  5. Before asking others, first ask yourself: "Have I done all I can to answer this question? Am I truly stuck?" When you do ask, have them shepherd you to your own conclusion rather than just give you the answer. This helps both you and the student doing the teaching in the long run.
  6. You get out what you put in. No one else can do it for you, and you will learn as much as you are willing to.
  7. Be humble - no one here is an expert at everything

Road to Success

  • The emotional cycle of change: This course is fast and covers a lot of material. There will be times when you may feel discouraged or overwhelmed, but don't give up - this is natural (and part of the design). By the end of the course, you'll feel more confident in your ability to define problems, analyze data, and prototype solutions.
  • Student learning responsibility: Our lessons cover topic foundations, but there is always more to learn! You are responsible for your learning experience - but don't get overwhelmed! Instead, just make sure you follow along, practice as much as possible, and ask questions.
  • GA requirements: Show up. Be on time. Participate. Submit your projects. Allow yourself to struggle. Read the docs. Have fun!
  • Q/A.


Python Practice Resources

If you have been enjoying the practice problems at the beginning of some classes, these resources offer many more: