Implementing the Kalman filter on stock data.

Daryl
Dec 11, 2020

In yet another exploratory post, we attempt to understand and implement the Kalman filter on time series data, namely on the analysis of share price fluctuations.

In an earlier post, I covered a method via the Moving Average Convergence Divergence (MACD) indicator, as a means of uncovering trading opportunities.

This time, we shall go beyond the simple arithmetic of the MACD indicator and implement a more mathematically rigorous approach via the Kalman filter.

I covered the Fourier Transform in an earlier post as well, should the reader take an interest.

Like the Fourier Transform, the Kalman filter is another extremely useful tool developed by scientists and engineers that has found use in the analysis of financial markets.

A brief overview of the mathematical logic

Similar to the MACD, the Kalman filter on time series operates on the principle that more recent data should have a greater bearing on the calculation than earlier data.
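To make the recency weighting concrete, here is a minimal sketch, assuming a scalar filter whose gain has settled to a constant value. The gain of 0.1 and the function names are purely illustrative and not taken from this post. With a constant gain K, each update is new = old + K * (observation - old), so an observation made j steps ago ends up with weight K * (1 - K)**j.

```python
import numpy as np

def constant_gain_weights(gain, n_obs):
    """Weights a constant-gain scalar filter places on the last n_obs observations.

    Index 0 is the most recent observation, index 1 the one before it, and so on.
    """
    ages = np.arange(n_obs)
    return gain * (1.0 - gain) ** ages

def constant_gain_filter(observations, gain, initial_estimate):
    """Run the recursion estimate += gain * (observation - estimate)."""
    estimate = initial_estimate
    for z in observations:
        estimate = estimate + gain * (z - estimate)
    return estimate

# Hypothetical gain, chosen only for illustration
weights = constant_gain_weights(gain=0.1, n_obs=5)
print(weights)  # weights shrink geometrically with the age of the observation
```

A real Kalman filter computes its gain from the noise covariances at every step rather than fixing it, but the geometric decay of the weights is the same idea.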

However, a full understanding of the Kalman Filter does require a comprehensive understanding of statistics and linear algebra.

Namely, linear state estimation, covariance matrices, and stochastic processes.

Credits: Wikipedia

We shall not go into the math in too much detail here, as there are already multiple articles and videos on Medium, MathWorks, and YouTube that cover it thoroughly.

However, I will recommend the following series of videos provided by MathWorks, the creators of MATLAB, that use the analogy of the sensor readings of a car.

What is the output of the Kalman Filter

In the analogy of our car, we consider the physical attributes related to the motion of the car.

The speed of a car, the direction it is heading, and other physical attributes of the car are given by the Optimal State Estimate.

In this specific case of our car, it is a two-dimensional vector of position and velocity, where velocity is the time derivative of position.

Optimal State Estimate

It is a multidimensional vector quantity that specifies the relevant information, where the subscript k refers to the time step.

Credits: Mathworks

These estimates are also evolving in time, leading to the estimated value being continuously updated at each time step.

Credits: Mathworks
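As a rough illustration of the car analogy, the sketch below runs a textbook linear Kalman filter on a two-dimensional state of position and velocity. Everything specific in it is an assumption made for the example: the constant-velocity model, the time step, the noise levels, the simulated measurements, and the helper name kalman_step. It is not the MathWorks implementation.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a linear Kalman filter."""
    # Predict: push the state and its covariance through the motion model
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: correct the prediction with the new measurement
    innovation = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

dt = 1.0                                   # assumed time step
F = np.array([[1.0, dt], [0.0, 1.0]])      # position += velocity * dt
H = np.array([[1.0, 0.0]])                 # only position is measured
Q = 1e-4 * np.eye(2)                       # assumed process noise
R = np.array([[4.0]])                      # assumed measurement noise variance

# Simulated car moving at 2 units per step, with noisy position readings
rng = np.random.default_rng(0)
true_positions = 2.0 * dt * np.arange(1, 101)
measurements = true_positions + rng.normal(0.0, 2.0, size=100)

x = np.array([0.0, 0.0])                   # initial guess: at rest at the origin
P = 10.0 * np.eye(2)                       # large initial uncertainty
for z in measurements:
    x, P = kalman_step(x, P, np.array([z]), F, H, Q, R)

print("estimated position and velocity:", x)
```

Although only position is measured, the filter also recovers an estimate of velocity, because the motion model ties the two together at every time step.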

Here, we shall attempt an implementation of it on stock data via Python and see what it looks like.

Subsequent code samples are executed via Jupyter Lab

# Importing dependencies
from pykalman import KalmanFilter
import numpy as np
import pandas as pd
import yfinance as yf
from numpy import poly1d  # poly1d is a NumPy function; importing it from scipy is deprecated
from datetime import datetime
import matplotlib.pyplot as plt
%matplotlib inline

Next, we import the time series data, namely a stock price history. Any stock will do.

As with similar posts from before, we use Tesla.

ticker = yf.Ticker('TSLA')
# auto_adjust=False keeps a separate 'Adj Close' column in the result
tsla_df = ticker.history(period='max', auto_adjust=False)
tsla_df['Adj Close'].plot(title='TSLA stock price ($)')

We confine our analysis to the period from 2014 to 2019, as the price looks fairly stable over that range.

tsla_df = yf.download('TSLA',
                      start='2014-01-01',
                      end='2019-12-31',
                      auto_adjust=False,  # keep the 'Adj Close' column
                      progress=False)
tsla_df.head()

As before, we restrict our analysis to the adjusted close.

df = tsla_df[['Adj Close']]
df.head()

Plotting the graph of adjusted close against time

Next, we implement the Kalman filter to help identify buying and selling opportunities.

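# Random-walk model: the hidden state is the underlying price level
kf = KalmanFilter(transition_matrices=[1],
                  observation_matrices=[1],
                  initial_state_mean=0,
                  initial_state_covariance=1,
                  observation_covariance=1,
                  transition_covariance=0.0001)
# 1-D price array (squeeze guards against a single-column 2-D result)
prices = df['Adj Close'].to_numpy().squeeze()
# Filtered state means and covariances at each time step
mean, cov = kf.filter(prices)
mean, std = mean.squeeze(), np.std(cov.squeeze())
# Plot the deviation of the price from the Kalman-filtered mean
plt.figure(figsize=(12,6))
plt.plot(prices - mean, 'red', lw=1.5)
plt.title("Kalman filtered price fluctuation")
plt.ylabel("Deviation from the mean ($)")
plt.xlabel("Days")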

Instead of just looking at the raw price, we have transformed the data into the price's deviation from its filtered mean, plotted against time.

Instead of just reporting the deviation from the static mean, the Kalman filtered data measures the deviation from the time-evolving mean.
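To see what a filter configured this way computes, here is a from-scratch sketch of the standard scalar predict/update recursion for a random-walk model, run on a synthetic drifting series rather than real TSLA data. The function name scalar_kalman_filter and the synthetic prices are illustrative assumptions, and pykalman's internals may differ in detail. The default arguments simply mirror the values passed to KalmanFilter earlier.

```python
import numpy as np

def scalar_kalman_filter(observations, initial_mean=0.0, initial_var=1.0,
                         obs_var=1.0, trans_var=1e-4):
    """Filtered means of a scalar random-walk (local level) Kalman filter."""
    means = np.empty(len(observations))
    mean, var = initial_mean, initial_var
    for t, z in enumerate(observations):
        if t > 0:
            # Predict: a random walk keeps the mean and inflates the variance
            var = var + trans_var
        # Update: blend the prediction with the new observation
        gain = var / (var + obs_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        means[t] = mean
    return means

# Hypothetical price series: steady upward drift plus noise
rng = np.random.default_rng(42)
prices = 100.0 + 0.1 * np.arange(500) + rng.normal(0.0, 1.0, size=500)

kalman_mean = scalar_kalman_filter(prices, initial_mean=prices[0])
dev_from_static = prices - prices.mean()   # deviation from one fixed average
dev_from_kalman = prices - kalman_mean     # deviation from the evolving mean

print("mean |deviation| from static mean:", np.abs(dev_from_static).mean())
print("mean |deviation| from Kalman mean:", np.abs(dev_from_kalman).mean())
```

One design note: with an initial mean of 0 and an initial covariance of 1, the first filtered value sits at roughly half the first price, so the earliest deviations mostly reflect the filter warming up. Seeding the filter with the first observation, as the sketch does, is one way to avoid that.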

For more aggressively managed portfolios, this may prove to be a more useful metric.

And that shall be all for today!

Daryl

Graduated with a Physics degree, I write about physics, coding and quantitative finance.