Autocorrelation
Autocorrelation is the correlation of a dataset with itself, typically between values separated by a time lag, and it can indicate a trend.
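For a given time series with known mean and standard deviation, we can define the autocorrelation for times s and t using the expected value operator as follows:

$$R(s, t) = \frac{E[(X_t - \mu_t)(X_s - \mu_s)]}{\sigma_t \sigma_s}$$

Here X_t is the value of the series at time t, and \mu_t and \sigma_t are its mean and standard deviation at that time.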
This is, in essence, the formula for correlation applied to a time series and the same time series lagged.
For example, with a lag of one period, we can check whether the previous value influences the current value. For that to hold, the autocorrelation at that lag has to be fairly high.
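As a minimal sketch of the lag-one check described above, the following correlates a series with a copy of itself shifted by one period. The helper name `lag_autocorr` and the two synthetic inputs (white noise and a random walk) are assumptions made for illustration; they are not part of the sunspots example.

```python
import numpy as np

def lag_autocorr(x, lag=1):
    """Pearson correlation between x[t] and x[t - lag] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if not 0 < lag < len(x):
        raise ValueError("lag must be between 1 and len(x) - 1")
    # Pair each value with the value `lag` periods earlier
    return np.corrcoef(x[lag:], x[:-lag])[0, 1]

rng = np.random.default_rng(0)
noise = rng.normal(size=500)   # independent draws: previous value tells us little
walk = np.cumsum(noise)        # random walk: each value builds on the previous one

r_noise = lag_autocorr(noise)
r_walk = lag_autocorr(walk)
print(f"white noise lag-1: {r_noise:.3f}")
print(f"random walk lag-1: {r_walk:.3f}")
```

The random walk should show a much higher lag-one value than the white noise, which is what we would expect when the previous value strongly influences the current one.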
In the previous chapter, Chapter 6, Data Visualization, we already used a pandas function that plots autocorrelation. In this example, we will use the NumPy correlate() function to calculate the actual autocorrelation values for the sunspots cycle. Because correlate() returns raw sums of products, we need to normalize the values we receive so that the autocorrelation at lag zero equals 1. Apply the NumPy correlate() function as follows:
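import numpy as np

# data is assumed to hold the sunspot series as a 1-D array
y = data - np.mean(data)       # center the series
norm = np.sum(y ** 2)          # zero-lag sum of squares, used to normalize
correlated = np.correlate(y, y, mode='full')/norm
...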
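To see what the normalized output looks like end to end, here is a self-contained sketch that applies the same steps to a synthetic series. The noisy sine wave with a period of 11 samples is only a stand-in assumed for illustration; it is not the sunspot data used in this chapter, and the names `acf` and `best_lag` are likewise hypothetical.

```python
import numpy as np

# Synthetic stand-in for the sunspot series: a noisy cycle with period 11
rng = np.random.default_rng(42)
t = np.arange(220)
data = 50 + 40 * np.sin(2 * np.pi * t / 11) + rng.normal(scale=5, size=t.size)

# Same steps as in the text: center, then normalize by the zero-lag value
y = data - np.mean(data)
norm = np.sum(y ** 2)
correlated = np.correlate(y, y, mode='full') / norm

# mode='full' returns 2*N - 1 values for lags -(N-1)..(N-1);
# lag 0 sits at index N - 1, so keep only the non-negative lags
acf = correlated[len(y) - 1:]

# Strongest lag among lags 1..20 (lag 0 is skipped because it is always 1)
best_lag = 1 + int(np.argmax(acf[1:21]))
print("autocorrelation at lag 0:", acf[0])
print("strongest lag in 1..20:", best_lag)
```

Slicing off the negative lags is a design choice made here for readability: the autocorrelation of a real series is symmetric around lag zero, so the second half of the array carries all the information.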