The Web today provides a wealth of financial information for free. Web giants such as Google or Yahoo! have comprehensive financial data offerings. Although the quality of the data sometimes does not fulfill professional requirements, for example with regard to the handling of stock splits, such data is well suited to illustrate the “financial power” of
pandas.
To this end, we will use the pandas built-in function DataReader to retrieve stock price
data from Yahoo! Finance, analyze the data, and generate different plots of it.[25] The required function is stored in a submodule of pandas:
In [47]: import pandas.io.data as web
At the time of this writing, pandas supports the following data sources:
Yahoo! Finance (yahoo)
Google Finance (google)
St. Louis FED (fred)
Kenneth French’s data library (famafrench)
World Bank (via pandas.io.wb)
We can retrieve stock price information for the German DAX index, for example, from Yahoo! Finance with a single line of code:
In [48]: DAX = web.DataReader(name=‘^GDAXI’, data_source=‘yahoo’, start=‘2000-1-1’)
DAX.info()
Out[48]: <class ‘pandas.core.frame.DataFrame’>
DatetimeIndex: 3760 entries, 2000-01-03 00:00:00 to 2014-09-26 00:00:00 Data columns (total 6 columns):
Open 3760 non-null float64 High 3760 non-null float64 Low 3760 non-null float64 Close 3760 non-null float64 Volume 3760 non-null int64 Adj Close 3760 non-null float64 dtypes: float64(5), int64(1)
Table 6-5 presents the parameters that the DataReader function takes.
Table 6-5. Parameters of DataReader function
Parameter Format Description
name String Name of data set — generally, the ticker symbol
data_sourceE.g., “yahoo” Data source
start String/ datetime/NoneLeft boundary of range (default “ 2010/1/1“)
The tail method provides us with the five last rows of the data set:
In [49]: DAX.tail()
Out[49]: Open High Low Close Volume Adj Close Date 2014-09-22 9748.53 9812.77 9735.69 9749.54 73981000 9749.54 2014-09-23 9713.40 9719.66 9589.03 9595.03 88196000 9595.03 2014-09-24 9598.77 9669.45 9534.77 9661.97 85850600 9661.97 2014-09-25 9644.36 9718.11 9482.54 9510.01 97697000 9510.01 2014-09-26 9500.55 9545.34 9454.88 9490.55 83499600 9490.55
To get a better overview of the index’s history, a plot is again generated easily with the
plot method (cf. Figure 6-3):
In [50]: DAX[‘Close’].plot(figsize=(8, 5))
Figure 6-3. Historical DAX index levels
Retrieving data and visualizing it is one thing. Implementing more complex analytics tasks is another. Like NumPyndarrays, pandas allows for vectorized mathematical operations on
whole, and even complex, DataFrame objects. Take the log returns based on the daily
closing prices as an example. Adding a column with the respective information could be achieved with the following code, which first generates a new, empty column and then iterates over all indexes to calculate the single log return values step by step:
In [51]: %%time
DAX[‘Ret_Loop’] = 0.0
for i in range(1, len(DAX)):
DAX[‘Ret_Loop’][i] = np.log(DAX[‘Close’][i] / DAX[‘Close’][i - 1]) Out[51]: CPU times: user 452 ms, sys: 12 ms, total: 464 ms Wall time: 449 ms
In [52]: DAX[[‘Close’, ‘Ret_Loop’]].tail() Out[52]: Close Ret_Loop Date 2014-09-22 9749.54 -0.005087 2014-09-23 9595.03 -0.015975 2014-09-24 9661.97 0.006952 2014-09-25 9510.01 -0.015853 2014-09-26 9490.55 -0.002048
Alternatively, you can use vectorized code to reach the same result without looping. To this end, the shift method is useful; it shifts Series or whole DataFrame objects relative
to their index, forward as well as backward. To accomplish our goal, we need to shift the
In [53]: %time DAX[‘Return’] = np.log(DAX[‘Close’] / DAX[‘Close’].shift(1)) Out[53]: CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 1.52 ms
In [54]: DAX[[‘Close’, ‘Ret_Loop’, ‘Return’]].tail() Out[54]: Close Ret_Loop Return Date 2014-09-22 9749.54 -0.005087 -0.005087 2014-09-23 9595.03 -0.015975 -0.015975 2014-09-24 9661.97 0.006952 0.006952 2014-09-25 9510.01 -0.015853 -0.015853 2014-09-26 9490.55 -0.002048 -0.002048
This not only provides the same results with more compact and readable code, but also is the much faster alternative.
VECTORIZATION WITH DATAFRAMES
In general, you can use the same vectorization approaches with pandasDataFrame objects as you would whenever
you could do such an operation with two NumPyndarray objects containing the same data.
One column with the log return data is enough for our purposes, so we can delete the other one:
In [55]: del DAX[‘Ret_Loop’]
Now let us have a look at the newly generated return data. Figure 6-4 illustrates two stylized facts of equity returns:
Volatility clustering
Volatility is not constant over time; there are periods of high volatility (both highly
positive and negative returns) as well as periods of low volatility. Leverage effect
Generally, volatility and stock market returns are negatively correlated; when markets come down volatility rises, and vice versa.
Here is the code that generates this plot:
In [56]: DAX[[‘Close’, ‘Return’]].plot(subplots=True, style=‘b’, figsize=(8, 5))
Figure 6-4. The DAX index and daily log returns
traders might be more interested in moving averages, or so-called trends. A moving average is easily calculated with the rolling_mean function of pandas (there are other
“rolling” functions as well, like rolling_max, rolling_min, and rolling_corr):
In [57]: DAX[‘42d’] = pd.rolling_mean(DAX[‘Close’], window=42) DAX[‘252d’] = pd.rolling_mean(DAX[‘Close’], window=252)
In [58]: DAX[[‘Close’, ‘42d’, ‘252d’]].tail()
Out[58]: Close 42d 252d Date 2014-09-22 9749.54 9464.947143 9429.476468 2014-09-23 9595.03 9463.780952 9433.168651 2014-09-24 9661.97 9465.300000 9437.122381 2014-09-25 9510.01 9461.880476 9440.479167 2014-09-26 9490.55 9459.425000 9443.769008
A typical stock price chart with the two trends included then looks like Figure 6-5: In [59]: DAX[[‘Close’, ‘42d’, ‘252d’]].plot(figsize=(8, 5))
Figure 6-5. The DAX index and moving averages
Returning to the more options trader-like perspective, the moving historical standard deviation of the log returns — i.e. the moving historical volatility — might be more of interest:
In [60]: import math
DAX[‘Mov_Vol’] = pd.rolling_std(DAX[‘Return’],
window=252) * math.sqrt(252) # moving annual volatility
Figure 6-6 further supports the hypothesis of the leverage effect by clearly showing that the historical moving volatility tends to increase when markets come down, and to decrease when they rise:
In [61]: DAX[[‘Close’, ‘Mov_Vol’, ‘Return’]].plot(subplots=True, style=‘b’, figsize=(8, 7))