Starting from the basics matters to me: I want to understand how to handle simple data and get real, hands-on contact with the data and the code.
I found this tutorial very useful for these purposes. Using Pandas to read data from sources such as Yahoo and Google, and matplotlib to build a simple chart, is key to taking the first steps. The best part of the tutorial is the comments explaining each step.
Problems I have found
Yahoo has closed the API that enabled Pandas to retrieve data. My colleague has found a workaround to keep retrieving the data. The solution consists of:
- Install the fix_yahoo_finance library with pip.
- Add these 2 lines to the code:
import fix_yahoo_finance as yf
yf.pdr_override() # <== that’s all it takes 🙂
For me this video is a must-see, as Jev Kuznetsov explains everything from scratch.
- He shows the example using IPython Notebook.
- He takes a strategy from Quantified Strategies.
- Jev documents different topics here.
The first code I ran was this one:
from pandas_datareader import data
import pandas as pd
import matplotlib.pyplot as plt
# Define the instruments to download: AT&T (T), Verizon (VZ) and SPY, an ETF that tracks the S&P 500.
tickers = ['T', 'VZ', 'SPY']
# Define which online source to use
data_source = 'google'
# We would like all available data from 2015-01-01 until 2017-10-10.
start_date = '2015-01-01'
end_date = '2017-10-10'
# Use pandas_datareader.data.DataReader to load the desired data. As simple as that.
panel_data = data.DataReader(tickers, data_source, start_date, end_date)
# Get just the closing prices. This returns a Pandas DataFrame
# indexed by date, with one column per ticker.
close = panel_data['Close']
# Get all weekdays between 2015-01-01 and 2017-10-10
all_weekdays = pd.date_range(start=start_date, end=end_date, freq='B')
# How do we align the existing prices in close with our new set of dates?
# All we need to do is reindex close using all_weekdays as the new index
close = close.reindex(all_weekdays)
#print(close.head(-100))
print(close.describe())
# Fill the gaps created by reindexing (e.g. market holidays) with the last known price
close = close.ffill()
# Get the VZ and SPY time series. Each one is a Pandas Series object indexed by date.
vz = close['VZ']
spy = close['SPY']
# Calculate the 20-day and 100-day moving averages of the closing prices
short_rolling_vz = vz.rolling(window=20).mean()
long_rolling_vz = vz.rolling(window=100).mean()
# Plot everything by leveraging the very powerful matplotlib package
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(vz.index, vz, label='VZ')
#ax.plot(spy.index, spy, label='SPY')
ax.plot(short_rolling_vz.index, short_rolling_vz, label='20 days rolling')
ax.plot(long_rolling_vz.index, long_rolling_vz, label='100 days rolling')
ax.set_xlabel('Date')
ax.set_ylabel('Closing price ($)')
ax.legend()
plt.show()
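To see what the reindex, forward-fill and rolling-mean steps actually do, without depending on any online data source, here is a minimal sketch on a handful of made-up prices. The dates and values are hypothetical, chosen only so that two weekdays are missing, and the 3-day window is an assumption to keep the example small (the code above uses 20 and 100 days).

```python
import pandas as pd

# Hypothetical closing prices; Wednesday 2017-01-04 and Monday 2017-01-09
# are deliberately missing, as they would be on a market holiday.
raw_dates = pd.to_datetime(
    ['2017-01-02', '2017-01-03', '2017-01-05', '2017-01-06', '2017-01-10']
)
prices = pd.Series([10.0, 11.0, 13.0, 14.0, 16.0], index=raw_dates)

# Every business day (Monday to Friday) in the range, whether or not we have a price
all_weekdays = pd.date_range(start='2017-01-02', end='2017-01-10', freq='B')

# Reindexing keeps the known prices and inserts NaN for the missing weekdays
reindexed = prices.reindex(all_weekdays)

# Forward fill copies the last known price into each gap
filled = reindexed.ffill()

# A 3-day moving average; the first window - 1 values are NaN
# because a full window is not yet available
rolling = filled.rolling(window=3).mean()

print(pd.DataFrame({'reindexed': reindexed, 'filled': filled, 'rolling': rolling}))
```

The printed table makes it easy to check each step by eye: the gaps appear after reindexing, disappear after the forward fill, and the moving average only starts once three prices are available.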