Not many fields in finance really benefit from visualization in three dimensions. One application area, however, is volatility surfaces, which show implied
volatilities simultaneously for a number of times-to-maturity and strikes. In what follows, we artificially generate a plot that resembles a volatility surface. To this end, we consider:
Strike values between 50 and 150
Times-to-maturity between 0.5 and 2.5 years
This provides our two-dimensional coordinate system. We can use NumPy’s meshgrid
function to generate such a system out of two one-dimensional ndarray objects:
In [32]: strike = np.linspace(50, 150, 24)
         ttm = np.linspace(0.5, 2.5, 24)
         strike, ttm = np.meshgrid(strike, ttm)
This transforms both 1D arrays into 2D arrays, repeating the original axis values as often
as needed:
In [33]: strike[:2]
Out[33]: array([[  50.        ,   54.34782609,   58.69565217,   63.04347826,
                   67.39130435,   71.73913043,   76.08695652,   80.43478261,
                   84.7826087 ,   89.13043478,   93.47826087,   97.82608696,
                  102.17391304,  106.52173913,  110.86956522,  115.2173913 ,
                  119.56521739,  123.91304348,  128.26086957,  132.60869565,
                  136.95652174,  141.30434783,  145.65217391,  150.        ],
                [  50.        ,   54.34782609,   58.69565217,   63.04347826,
                   67.39130435,   71.73913043,   76.08695652,   80.43478261,
                   84.7826087 ,   89.13043478,   93.47826087,   97.82608696,
                  102.17391304,  106.52173913,  110.86956522,  115.2173913 ,
                  119.56521739,  123.91304348,  128.26086957,  132.60869565,
                  136.95652174,  141.30434783,  145.65217391,  150.        ]])
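To see on a small scale what meshgrid does, the following minimal sketch uses two short axis vectors. The values and the names x, y, X, and Y are illustrative assumptions, not those of the example above:

```python
import numpy as np

# two short 1D axes (illustrative values only)
x = np.array([50.0, 100.0, 150.0])  # e.g., strikes
y = np.array([0.5, 2.5])            # e.g., times-to-maturity

X, Y = np.meshgrid(x, y)

print(X.shape, Y.shape)  # both have shape (len(y), len(x))
print(X)                 # every row repeats x
print(Y)                 # every column repeats y
```

Both results have one row per element of the second input and one column per element of the first, which is why any element-wise expression in X and Y is evaluated on the full grid.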
Now, given the new ndarray objects, we generate the fake implied volatilities by a simple,
scaled quadratic function:
In [34]: iv = (strike - 100) ** 2 / (100 * strike) / ttm
# generate fake implied volatilities
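As a quick sanity check of this formula, the sketch below wraps it in a helper function and evaluates a few points by hand. The function name fake_iv is a hypothetical choice for illustration only; the formula itself is the one used above:

```python
import numpy as np

def fake_iv(strike, ttm):
    """Scaled quadratic in strike, divided by time-to-maturity (hypothetical helper)."""
    return (strike - 100) ** 2 / (100 * strike) / ttm

print(fake_iv(100.0, 1.0))  # zero at a strike of 100
print(fake_iv(50.0, 0.5))   # largest value on the grid used above
print(fake_iv(50.0, 2.5))   # same strike, longer maturity: smaller value

# the same function works element-wise on meshgrid output
K, T = np.meshgrid(np.linspace(50, 150, 24), np.linspace(0.5, 2.5, 24))
print(fake_iv(K, T).shape)
```

The values fall toward zero as the strike approaches 100 and shrink with increasing time-to-maturity, which produces the smile-like shape of the surface.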
The plot resulting from the following code is shown in Figure 5-23:
In [35]: from mpl_toolkits.mplot3d import Axes3D
         fig = plt.figure(figsize=(9, 6))
         ax = fig.add_subplot(111, projection='3d')  # create the 3D axes
         surf = ax.plot_surface(strike, ttm, iv, rstride=2, cstride=2,
                                cmap=plt.cm.coolwarm, linewidth=0.5,
                                antialiased=True)
         ax.set_xlabel('strike')
         ax.set_ylabel('time-to-maturity')
         ax.set_zlabel('implied volatility')
         fig.colorbar(surf, shrink=0.5, aspect=5)
Figure 5-23. 3D surface plot for (fake) implied volatilities
Table 5-7 provides a description of the different parameters the plot_surface function
can take.
Table 5-7. Parameters for plot_surface
Parameter    Description
X, Y, Z      Data values as 2D arrays
rstride      Array row stride (step size)
cstride      Array column stride (step size)
color        Color of the surface patches
cmap         A colormap for the surface patches
facecolors   Face colors for the individual patches
norm         An instance of Normalize to map values to colors
vmin         Minimum value to map
shade        Whether to shade the face colors
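The following sketch illustrates how some of these parameters interact. It is a minimal example on made-up data, not the volatility surface from above, and it assumes a non-interactive backend (Agg) so that it runs without a display. Here vmin, together with its counterpart vmax, fixes the range of values that the colormap covers:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np

# small illustrative grid (not the data from the example above)
x, y = np.meshgrid(np.linspace(-1, 1, 20), np.linspace(-1, 1, 20))
z = x ** 2 + y ** 2

fig = plt.figure(figsize=(6, 4))
ax = fig.add_subplot(111, projection="3d")
# map only the value range [0, 1] to colors; larger values get the top color
surf = ax.plot_surface(x, y, z, rstride=1, cstride=1,
                       cmap="coolwarm", vmin=0.0, vmax=1.0)
fig.colorbar(surf, shrink=0.5, aspect=5)
print(surf.get_clim())  # the color limits actually in use
```

With rstride=1 and cstride=1 every grid cell becomes its own patch; larger strides, as in the example above, draw fewer and coarser patches.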
As with two-dimensional plots, the line style can be replaced by single points or, as in what follows, single triangles. Figure 5-24 plots the same data as a 3D scatter plot, now from a different viewing angle, which is set with the view_init method:
In [36]: fig = plt.figure(figsize=(8, 5))
         ax = fig.add_subplot(111, projection='3d')
         ax.view_init(30, 60)
         ax.scatter(strike, ttm, iv, zdir='z', s=25, c='b', marker='^')
         ax.set_xlabel('strike')
         ax.set_ylabel('time-to-maturity')
         ax.set_zlabel('implied volatility')
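The two positional arguments of view_init are the elevation and the azimuth of the camera, both in degrees. A minimal sketch, again assuming the non-interactive Agg backend, shows that the values end up as attributes of the axes object:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot(111, projection="3d")

# elevation: angle above the x-y plane; azimuth: rotation about the z axis
ax.view_init(elev=30, azim=60)
print(ax.elev, ax.azim)
```

Passing the angles by keyword, as here, is equivalent to the positional call ax.view_init(30, 60) and makes the meaning of each number explicit.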
Conclusions
matplotlib can be considered both the benchmark and the workhorse when it comes to
data visualization in Python. It is tightly integrated with NumPy and the basic functionality
is easily and conveniently accessed. On the other hand, matplotlib is a rather
powerful library with a somewhat complex API, which makes it impossible to give a complete overview of all its capabilities in this chapter.
This chapter introduces the basic functions of matplotlib for 2D and 3D plotting useful
in most financial contexts. Other chapters provide further examples of how to use this fundamental library for visualization.
Further Reading
The major resources for matplotlib can be found on the Web:
The home page of matplotlib is, of course, the best starting point: http://matplotlib.org.
There’s a gallery with many useful examples: http://matplotlib.org/gallery.html.
A tutorial for 2D plotting is found here: http://matplotlib.org/users/pyplot_tutorial.html.
Another one for 3D plotting is here: http://matplotlib.org/mpl_toolkits/mplot3d/tutorial.html.
It has become something of a standard routine to consult the gallery, look there for an appropriate visualization example, and start with the corresponding example code. Using, for example, IPython Notebook, only a single command is required to get started
once you have found the right example.
Chapter 6. Financial Time Series
The only reason for time is so that everything doesn’t happen at once.
— Albert Einstein
One of the most important types of data one encounters in finance is financial time series, that is, data indexed by date and/or time. For example, prices of stocks represent financial time series data. Similarly, the USD-EUR exchange rate represents a financial time series; the exchange rate is quoted in brief intervals of time, and a collection of such quotes is then a time series of exchange rates.
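In pandas terms, such a time series can be sketched as a Series object whose index consists of dates. The quotes below are made-up numbers for illustration, not market data, and the name quotes is an arbitrary choice:

```python
import pandas as pd

# five consecutive business days starting on a Monday (illustrative dates)
dates = pd.date_range("2024-01-01", periods=5, freq="B")

# made-up exchange rate quotes, one per date
quotes = pd.Series([1.10, 1.11, 1.09, 1.12, 1.13], index=dates, name="EUR/USD")

print(quotes)
print(quotes.loc[pd.Timestamp("2024-01-03")])  # look up a quote by its date
```

The date index is what distinguishes a time series from a plain array of numbers: values are selected and aligned by time stamp rather than by position.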
There is no financial discipline that gets by without considering time an important factor. This is much the same as in physics and other sciences. The major tool for working with time series data in Python is the library pandas. Wes McKinney, the main author of pandas, started developing the library when working as an analyst at AQR Capital
Management, a large hedge fund. It is safe to say that pandas has been designed from the
ground up to work with financial time series. As this chapter demonstrates, the main inspiration for the fundamental classes, such as the DataFrame and Series classes, is
drawn from the R statistical analysis language, which without doubt has a strength in that
kind of modeling and analysis.
The chapter is mainly based on a couple of examples drawn from a financial context. It proceeds along the following lines:
First and second steps
We start exploring the capabilities of pandas by using very simple and small data sets; we then proceed by using a NumPy ndarray object and transforming this to a DataFrame object. As we go, basic analytics and visualization capabilities are illustrated.
Data from the Web
pandas allows us to conveniently retrieve data from the Web (e.g., from Yahoo! Finance) and to analyze such data in many ways.
Using data from CSV files
Comma-separated value (CSV) files represent a global standard for the exchange of financial time series data; pandas makes reading data from such files an efficient task. Using data for two indices, we implement a regression analysis with pandas.
High-frequency data
In recent years, available financial data has increasingly shifted from daily quotes to tick data. Daily tick data volumes for a stock price regularly surpass those volumes of daily data collected over 30 years.[24]
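As a preview of the first of these steps, the following minimal sketch transforms a NumPy ndarray object into a DataFrame object with a date index. The random numbers, the column names, and the dates are assumptions chosen purely for illustration:

```python
import numpy as np
import pandas as pd

# illustrative data: 6 observations of 2 series of standard normal numbers
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 2))

# attach a daily date index and column labels (hypothetical names)
index = pd.date_range("2024-01-31", periods=6, freq="D")
df = pd.DataFrame(a, index=index, columns=["No1", "No2"])

print(df.mean())             # basic analytics, column by column
print(df.cumsum().tail(2))   # cumulative sums, last two rows
```

The DataFrame keeps the numerical values of the array unchanged and adds labels for rows and columns, which is what later steps such as plotting or regression build on.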
All financial time series data contains date and/or time information, by definition. Appendix C provides an overview of how to handle such data with Python, NumPy, and pandas as well as of how to convert typical date-time object types into each other.
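To give a flavor of such conversions, the following sketch moves one and the same point in time between the date-time types of Python, NumPy, and pandas. The date itself is an arbitrary illustrative choice:

```python
import datetime as dt
import numpy as np
import pandas as pd

d = dt.datetime(2024, 3, 15, 9, 30)  # standard library datetime object

ts = pd.Timestamp(d)        # Python datetime -> pandas Timestamp
d64 = np.datetime64(d)      # Python datetime -> NumPy datetime64
back = ts.to_pydatetime()   # pandas Timestamp -> Python datetime
ts2 = pd.Timestamp(d64)     # NumPy datetime64 -> pandas Timestamp

print(ts, d64, back, ts2)
```

All four objects represent the same instant; which type is most convenient depends on whether the surrounding code works with standard Python objects, ndarray objects, or pandas objects.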