ARIMA Video Lecture Transcript

This transcript was automatically generated, so there may be discrepancies between the video and the text.

Hi, everybody. Welcome back. In this video, we're going to learn more about time series forecasting with an ARIMA model. So let me go ahead and share my Jupyter notebook and we can get started, okay. So we're going to learn one last forecasting model called ARIMA. And in this notebook specifically, we'll touch on autoregressive models, which we have not talked about yet. Then we will combine autoregressive models with moving average models and differencing, we'll introduce the ARIMA model, we'll mention why stationarity is important, and then we'll demonstrate how to fit an ARIMA model in Python.

So ARIMA models consist of three components, and we've touched on the first of these, moving average models, in our averaging and smoothing notebook. The component we have not discussed yet is autoregressive models. So what is an autoregressive model? Well, this is a model in which you regress onto previous observations of the time series itself. In an ordinary regression model, you would use features from other data sources; so for instance, to predict temperature you might regress upon humidity as well as time. In a solely autoregressive model, you're only going to regress temperature on previous observations of the temperature.

So specifically, what are we saying? If I have a time series denoted by Y_t, then the autoregressive model of order p is

    Y_t = alpha_1 * Y_{t-1} + alpha_2 * Y_{t-2} + ... + alpha_p * Y_{t-p} + epsilon_t,

where all the alpha_i's are parameters that you will need to fit, and epsilon_t is some random noise. So we sometimes denote this as an AR(p) process, so autoregressive of order p. In the case when p is equal to 1, the model would reduce to Y_t = alpha_1 * Y_{t-1} + epsilon_t.
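As a minimal sketch of what an AR(p) process looks like in code, here is a NumPy simulation of an AR(2) series. The coefficients 0.6 and -0.2 are arbitrary illustrative choices (picked so the process stays stationary), not values from the lecture:

```python
import numpy as np

rng = np.random.default_rng(42)

# AR(2): Y_t = a1 * Y_{t-1} + a2 * Y_{t-2} + eps_t
a1, a2 = 0.6, -0.2                    # illustrative coefficients
n = 500
eps = rng.normal(0.0, 1.0, size=n)    # i.i.d. noise with mean zero

y = np.zeros(n)
for t in range(2, n):
    # each value is a regression on the two previous observations, plus noise
    y[t] = a1 * y[t - 1] + a2 * y[t - 2] + eps[t]
```

With a2 set to 0 this reduces to the AR(1) case, where each value depends only on the single previous observation, matching the Markov property described next.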
And this is a pretty common process in probability theory called a Markov process, where your state at the next time step is only dependent upon your current state. So the value at Y_t is only dependent upon the value at Y_{t-1}. We're not going to explicitly demonstrate how to fit an autoregressive model in Python, because it is actually just a special case of the ARIMA model, which we're going to talk about more later on.

So if autoregressive models are one part, and moving average models are another part, then there's an ARMA model, which is one step away from being an ARIMA. An ARMA model combines autoregressive (AR) and moving average (MA) models, so AR plus MA, that's ARMA. And remember, for the statistical model underlying a moving average process of order q, we have

    Y_t = beta_0 * epsilon_t + beta_1 * epsilon_{t-1} + ... + beta_q * epsilon_{t-q},

where the epsilon_t's here are a sequence of independent, identically distributed random variables with mean zero and a set variance. Okay, so this is the statistical model that underlies the moving average forecast that we saw, not above, but in a previous notebook.

So combining the AR(p) with the MA(q) gives an ARMA(p, q) process. And what do we mean by that? Well, specifically, the statistical model underlying an ARMA process is

    Y_t = alpha_1 * Y_{t-1} + alpha_2 * Y_{t-2} + ... + alpha_p * Y_{t-p} + beta_0 * epsilon_t + beta_1 * epsilon_{t-1} + ... + beta_q * epsilon_{t-q}.

So from this we can see that if p is equal to 0, we would have an MA(q) process, right? Because if p is 0, the autoregressive part disappears. And on the other hand, if q is equal to 0, we would recover an AR(p) process.
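To see how the two parts combine, here is a small NumPy simulation of an ARMA(1, 1) process. The coefficients are again illustrative assumptions, not values from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# ARMA(1,1): Y_t = a1 * Y_{t-1} + b0 * eps_t + b1 * eps_{t-1}
a1 = 0.5                  # illustrative AR coefficient
b0, b1 = 1.0, 0.4         # illustrative MA coefficients
n = 500
eps = rng.normal(0.0, 1.0, size=n)   # i.i.d. noise with mean zero

y = np.zeros(n)
for t in range(1, n):
    # autoregressive term on the previous value, plus moving average terms on the noise
    y[t] = a1 * y[t - 1] + b0 * eps[t] + b1 * eps[t - 1]
```

Setting a1 = 0 leaves only the MA(1) terms, and setting b1 = 0 leaves an AR(1) process, mirroring the p = 0 and q = 0 special cases described above.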
So if q is equal to 0, all the moving average terms would go away and we would be left with the autoregressive process. Now, we should note that an ARMA(p, q) process is a stationary time series. So for instance, if our assumptions hold, that the epsilon_t's are a sequence of independent, identically distributed random variables, then it can be shown that this ARMA process is a stationary time series. And so if, for instance, in practice you're dealing with a time series which is not stationary, this type of model will not give you a good fit or a good forecast in the long run. So this is an extremely handy model to be able to use, but as we've discussed in previous videos and notebooks, not every time series is stationary. So what can we do? That's the idea for the ARIMA model.

So in our stationarity and autocorrelation video slash notebook, we talked about how differencing can be used to produce a stationary time series from a non-stationary one, and differencing is the third component of the autoregressive integrated moving average model, or ARIMA model. So we talked about the two modeling components, which are the AR and the MA processes, the autoregression and the moving averages. The third component of an ARIMA model is this differencing feature. So remember, we showed this before by doing a first differencing on the Google stock closing price data set, and that first differencing took a non-stationary data set and turned it into a stationary data set. The idea for an ARIMA model is that you first perform this differencing procedure, either once, twice, three, it turns out d times, to the original time series, and then you fit an ARMA model onto the differenced series. So you take in your original time series, okay. Then you go ahead and.
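The differencing step described above can be sketched with NumPy. The random-walk series here is a made-up stand-in for something like the stock closing prices mentioned in the lecture:

```python
import numpy as np

rng = np.random.default_rng(1)

# A non-stationary series: a random walk with drift (stand-in for a price series)
steps = rng.normal(loc=0.1, scale=1.0, size=300)
price = np.cumsum(steps)

# d = 1: first differencing removes the trend, leaving roughly stationary increments
diff1 = np.diff(price)

# Differencing is invertible: the first observation plus the cumulative sum of
# the differences recovers the original series exactly
reconstructed = np.concatenate(([price[0]], price[0] + np.cumsum(diff1)))
```

In practice you would typically hand the undifferenced series to a library routine such as statsmodels' ARIMA model with order=(p, d, q), which applies the d rounds of differencing internally before fitting the ARMA part.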