title  duration  creator  

Intro to Pandas & Time Series Data 
1:25 

Intro to Time Series
Week 9  Lesson 1.3
LEARNING OBJECTIVES
After this lesson, you will be able to:
 Understand what time series analysis is and what it is used for
 Use Pandas to model and manipulate a Time Series
 Explain the functionality afforded to the DateTime object
STUDENT PREWORK
Before this lesson, you should already be able to:
 Load data into a Pandas DataFrame
 Access data in a DataFrame object
 Use Pandas' built-in descriptive statistics functions
INSTRUCTOR PREP
Before this lesson, instructors will need to:
 Review the GOOG data set from week 2 for familiarity
 Review both .ipynb notebooks. You will be live-coding the Introduction, Demo, and part of the Discussion sections in these notebooks. Feel free to diverge from the provided solution code; these are just suggestions. Also, feel free to add more exercises. You know your students best!
LESSON GUIDE
TIMING  TYPE  TOPIC 

5 min  Opening  What Is Time Series Analysis? 
15 min  Introduction  The DateTime Object 
20 min  Demo  Time Series In Pandas 
15 min  Discussion  Date Ranges and Frequencies 
25 min  Independent Practice  Manipulating a Time Series 
5 min  Conclusion  Recapitulation 
What Is Time Series Analysis? (5 mins)
 Statistical modeling of time-ordered data observations
 Two main goals:
 Identifying the underlying mechanisms represented by the sequence of observations
 Forecasting: predicting the future values of a variable described in the time series
 Examining multiple time series to model dynamic relationships
Instructor Note: Have the students list possible business uses for time series analysis, e.g. financial analysis and forecasting, retail inventory planning, CDC predictions, neuroscience, signal processing, etc.
Check: Recall the np.correlate function from Week 2, which we used to analyse the relationship between GOOG and AAPL stocks.
The DateTime Object (15 mins)
As our data will be ordered by time, we will need a powerful library for dealing with timestamps. Luckily, Python's standard library provides the datetime module, which gives us both simple and complex methods of manipulating dates and times. The cornerstone of the datetime module is the DateTime object, a container representing a time that is either aware or naive. Aware DateTimes carry time zone and daylight saving time information; naive DateTimes do not.
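To make the aware/naive distinction concrete, here is a minimal sketch using only the standard library. Attaching UTC is just one illustrative choice of time zone; the variable names are hypothetical.

```python
from datetime import datetime, timezone

# A naive datetime carries no time zone information.
naive = datetime(2016, 3, 5, 23, 31, 1)

# An aware datetime carries a tzinfo; UTC is attached here for illustration.
aware = datetime(2016, 3, 5, 23, 31, 1, tzinfo=timezone.utc)

print(naive.tzinfo)       # None
print(aware.tzinfo)       # UTC
print(aware.utcoffset())  # 0:00:00

# Ordering comparisons between naive and aware datetimes raise TypeError.
try:
    naive < aware
except TypeError as exc:
    print("cannot compare:", exc)
```

The practical consequence is that a data set should use one kind or the other consistently, since the two cannot be ordered or subtracted against each other.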
Let's check out the DateTime Documentation.
from datetime import datetime
# Time this lesson plan was written
lesson_date = datetime(2016, 3, 5, 23, 31, 1, 844089)
The DateTime object exposes its components as attributes. Let's try some!
lesson_date.day
lesson_date.month
lesson_date.year
lesson_date.hour
NOTE: See Reference A below for all components that can be extracted from a DateTime object.
We can also use a timedelta object to shift a DateTime object. Here's an example:
from datetime import timedelta
offset = timedelta(days=1, seconds=20)
offset.days
offset.seconds
offset.microseconds
now = datetime.now()
now
now + offset
now - offset
Code: Open the datetime.ipynb notebook and complete the 4 exercises
Time Series In Pandas (20 mins)
Let's switch over to the timeseries.ipynb notebook, and I'll walk you through loading a time series into Pandas. We'll also go over applying the DateTime functionality to the time series.
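As a rough preview of what the notebook covers, the sketch below loads a tiny time series and reads date components off its index. The CSV text, column names, and values are made up for illustration; they are not the GOOG data set.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a real price file; values are made up.
csv_text = """Date,Close
2016-03-01,10.0
2016-03-02,11.0
2016-03-03,10.5
2016-03-04,12.0
"""

# parse_dates converts the Date column to datetimes; index_col makes it the index.
prices = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# The resulting DatetimeIndex exposes the components listed in Reference A.
print(prices.index.dayofweek.tolist())  # Monday=0 ... Sunday=6

# Date strings can be used to slice rows by label.
print(prices.loc["2016-03-02":"2016-03-03"])
```

Putting the dates in the index, rather than leaving them as an ordinary column, is what enables the label-based date slicing shown on the last line.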
Date Ranges and Frequencies (15 mins)
Using the Pandas documentation, take a few minutes to read about the asfreq and resample methods.
Instructor's Note: Give the students a few minutes to read about these methods. Have a brief discussion about the implications of both.
Let's go back to our timeseries.ipynb notebook and implement the two functions to get a better idea of what they do.
Note that asfreq gives us a method keyword argument. Backfill, or bfill, propagates the next valid observation backward. In other words, it uses the first value succeeding a range of unknown indices to fill in the unknowns. Conversely, pad, or ffill, propagates the last valid observation forward, using the value preceding a range of unknown indices to fill in the unknowns.
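A small sketch can make the two fill methods concrete. The series below is hypothetical (observed every other day, with made-up values), so converting it to daily frequency leaves gaps to fill.

```python
import pandas as pd

# Hypothetical series observed every other day; values are made up.
idx = pd.date_range("2016-03-01", periods=3, freq="2D")
s = pd.Series([1.0, 2.0, 3.0], index=idx)

# Converting to daily frequency introduces dates with no observation (NaN).
daily = s.asfreq("D")

# ffill/pad copies the previous known value forward into each gap.
daily_ffill = s.asfreq("D", method="ffill")

# bfill/backfill copies the next known value backward into each gap.
daily_bfill = s.asfreq("D", method="bfill")

print(pd.DataFrame({"raw": daily, "ffill": daily_ffill, "bfill": daily_bfill}))
```

Comparing the three columns side by side shows which neighbouring observation each method borrows from.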
Now, let's discuss the following points:
 What does asfreq do?
 What does resample do?
 What is the difference?
 When would we want to use each?
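One way to seed the discussion is to downsample the same series with both methods. This is a sketch on a hypothetical six-day series with made-up values; mean is just one of several aggregations that could follow resample.

```python
import pandas as pd

# Hypothetical daily series; values are made up.
idx = pd.date_range("2016-03-01", periods=6, freq="D")
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# asfreq selects the observation that falls exactly on each new timestamp
# and discards everything in between.
picked = s.asfreq("2D")

# resample groups the observations into two-day bins, and an aggregation
# (mean here) then summarizes every value in each bin.
averaged = s.resample("2D").mean()

print(pd.DataFrame({"asfreq": picked, "resample_mean": averaged}))
```

In this sketch asfreq behaves like reindexing to a new frequency, while resample behaves like a time-based group-by that needs an aggregation step.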
We can also create our own date ranges using a built-in function, date_range. The periods and freq keyword arguments grant the user fine-grained control over the resulting values. To reset the time component of the generated dates to midnight, use the normalize=True argument.
NOTE: See Reference B below for all of the possible time offset aliases.
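The following sketch shows the date_range keyword arguments in action. The dates are arbitrary illustrative choices, and the variable names are hypothetical.

```python
import pandas as pd

# Five consecutive calendar days starting from a given date.
days = pd.date_range(start="2016-03-05", periods=5, freq="D")

# Business days between two endpoints (weekends are skipped).
bdays = pd.date_range(start="2016-03-04", end="2016-03-09", freq="B")

# normalize=True snaps the start timestamp to midnight before generating dates.
midnights = pd.date_range(
    start="2016-03-05 23:31:01", periods=3, freq="D", normalize=True
)

print(days)
print(bdays)
print(midnights)
```

Note that a range is specified either by start and periods or by start and end; the freq alias then determines the step between values.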
We are also given a Period object, which can be used to represent a time interval. The Period object consists of a start time and an end time, and can be created by providing a start time and a given frequency.
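A brief sketch of the Period object, assuming a monthly frequency for illustration; the variable names are hypothetical.

```python
import pandas as pd

# A Period covering the whole of March 2016; freq="M" means a monthly span.
march = pd.Period("2016-03", freq="M")

print(march.start_time)  # first instant of the month
print(march.end_time)    # last instant of the month

# Adding an integer shifts the period by that many frequency units.
april = march + 1
print(april)

# period_range builds a sequence of consecutive periods.
quarter = pd.period_range(start="2016-01", periods=3, freq="M")
print(quarter)
```

Unlike a single timestamp, a Period stands for the entire interval, which is why it has both a start_time and an end_time.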
Manipulating a Time Series (25 mins)
Let's break up into groups and look at the different ways we can manipulate our time series.
Try the following to mutate df_goog to represent a daily, weekly, and monthly granularity.
When you have data on a daily level, use the Period and date_range functionalities to practice retrieving data from a DataFrame for a given range or frequency.
asfreq
resample
Period
date_range
BONUS:
 Create a new DataFrame with the daily change for each column in df_goog (hint: you'll need to reset the index to a daily timeframe)
 Apply models studied previously to gauge the relationship between a random sampling of columns from df_goog
 Create an Aware DateTime object with Boston's UTC offset.
Recapitulation (5 mins)
 Recap the objects and methods discussed
 Discuss how these techniques will help with the Kaggle challenge
 Repeat the importance of reading the documentation (does it do what you think it does, are you reinventing the wheel, etc.)
ADDITIONAL RESOURCES
Reference
A) Time/Date components that can be accessed from a DateTime object (source)
Alias  Description 

year  The year of the datetime 
month  The month of the datetime 
day  The days of the datetime 
hour  The hour of the datetime 
minute  The minutes of the datetime 
second  The seconds of the datetime 
microsecond  The microseconds of the datetime 
nanosecond  The nanoseconds of the datetime 
date  Returns datetime.date 
time  Returns datetime.time 
dayofyear  The ordinal day of year 
weekofyear  The week ordinal of the year 
week  The week ordinal of the year 
dayofweek  The day of the week with Monday=0, Sunday=6 
weekday  The day of the week with Monday=0, Sunday=6 
quarter  Quarter of the date: Jan-Mar = 1, Apr-Jun = 2, etc. 
days_in_month  The number of days in the month of the datetime 
is_month_start  Logical indicating if first day of month (defined by frequency) 
is_month_end  Logical indicating if last day of month (defined by frequency) 
is_quarter_start  Logical indicating if first day of quarter (defined by frequency) 
is_quarter_end  Logical indicating if last day of quarter (defined by frequency) 
is_year_start  Logical indicating if first day of year (defined by frequency) 
is_year_end  Logical indicating if last day of year (defined by frequency) 
B) Time offset aliases (source)
Alias  Description 

B  business day frequency 
C  custom business day frequency (experimental) 
D  calendar day frequency 
W  weekly frequency 
M  month end frequency 
BM  business month end frequency 
CBM  custom business month end frequency 
MS  month start frequency 
BMS  business month start frequency 
CBMS  custom business month start frequency 
Q  quarter end frequency 
BQ  business quarter end frequency 
QS  quarter start frequency 
BQS  business quarter start frequency 
A  year end frequency 
BA  business year end frequency 
AS  year start frequency 
BAS  business year start frequency 
BH  business hour frequency 
H  hourly frequency 
T, min  minutely frequency 
S  secondly frequency 
L, ms  milliseconds 
U, us  microseconds 
N  nanoseconds 