Averaging and Smoothing II
Video Lecture Transcript
This transcript was automatically generated, so there may be discrepancies between the video and the text.

Hi, everybody. Welcome back. In this video, we continue to learn about averaging and smoothing forecasts, and we're going to focus more on the smoothing part of this title. Let me go ahead and go back to my Jupyter notebook.

So where we left off in the last video was with weighted averaging models and forecasts. Now we're going to talk about exponential smoothing forecasts. In order to define these types of forecasts, we're going to see three types: again for non-seasonal, non-trend data, then for trend data, then for seasonal data. We're going to have to reintroduce the hat notation that we've seen in the past. Remember, we usually take the hat to denote the thing that's being estimated, so in this setting we'll let the hat denote the forecast according to the model at time t, okay?

So the first model we're going to see for smoothing is called simple exponential smoothing. Here we define the forecast (or estimate) at time t in the following way: we take alpha times the observed value at the previous time step, plus (1 - alpha) times the forecasted value at the previous time step. In symbols,

    y-hat_t = alpha * y_{t-1} + (1 - alpha) * y-hat_{t-1}.

So that's within the training set. Then, going into the future after observation n, you would take alpha times the actual value at n plus (1 - alpha) times the forecasted value at n, where that forecast is given by the formula above:

    y-hat_{n+1} = alpha * y_n + (1 - alpha) * y-hat_n.

So alpha is between zero and one, and it's a hyperparameter that you can select by hand, or you could use an algorithm.
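The update rule above is simple enough to write out by hand. Here is a minimal sketch in plain Python; the series y and the choice alpha = 0.3 are made-up examples, and initializing the first forecast at the first observation is a common convention (the lecture doesn't specify one):

```python
def simple_exp_smoothing(y, alpha):
    """Return one-step-ahead forecasts y-hat_1 ... y-hat_{n+1}."""
    y_hat = [y[0]]  # convention: initialize the first forecast at y_1
    for t in range(len(y)):
        # y-hat_{t+1} = alpha * y_t + (1 - alpha) * y-hat_t
        y_hat.append(alpha * y[t] + (1 - alpha) * y_hat[-1])
    return y_hat

y = [10.0, 12.0, 11.0, 13.0, 12.5]   # made-up observations
forecasts = simple_exp_smoothing(y, alpha=0.3)
print(forecasts[-1])  # forecast for the step after the last observation
```

Note that the loop runs one step past the data: the final entry is the forecast for time n + 1, built from the last observed value and the last fitted forecast.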
For instance, you could do what's called maximum likelihood to fit it according to the best fit on the data set, or you could do cross-validation to find the value that gives you the best generalization error.

So there are two ways to think about simple exponential smoothing that might help you process what the model is doing. One way is to rearrange the formula: the forecast at time t + 1 is equal to the forecast at time t (which you get from distributing out the 1 - alpha) plus alpha times the actual minus the predicted at time t,

    y-hat_{t+1} = y-hat_t + alpha * (y_t - y-hat_t).

So in some sense, you can think of this as a little bit of an adjustment of the naive forecast from the baselines notebook: you're taking the predicted value at time t and then adding on a fraction of the error term. That's one way to think about it; it's kind of a twist on the naive forecast, or the random walk model.

Another way to think about it is as a weighted average that includes all the previous observations. If you work this out recursively, going back to time step one, you can see that the forecast at time step t + 1 is

    y-hat_{t+1} = alpha * [ y_t + (1 - alpha) * y_{t-1} + (1 - alpha)^2 * y_{t-2} + ... + (1 - alpha)^{t-1} * y_1 ].

This is a geometric sum, and that's actually why it's called exponential smoothing: the coefficients lie on an exponential (really, geometric) curve. In essence it might be better to call it geometric smoothing, but this is what they called it. So when we write it out like this, we can think of it as a weighted sum that includes all of the prior points.

And this gives us a way to find an optimal value of alpha, where "optimal" may mean different things depending on what you're trying to do.
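We can verify numerically that the recursive form and the expanded geometric-weight form agree. The series and alpha below are made-up examples; note that a small extra term appears in the expansion, carrying the leftover weight (1 - alpha)^t on the initial forecast (which the recursion here initializes at y_1):

```python
alpha = 0.4
y = [3.0, 5.0, 4.0, 6.0, 5.5]  # made-up observations

# Recursive form: y-hat_{t+1} = alpha * y_t + (1 - alpha) * y-hat_t
y_hat = y[0]  # initialize the first forecast at y_1
for obs in y:
    y_hat = alpha * obs + (1 - alpha) * y_hat

# Expanded form: alpha * sum over k of (1 - alpha)^k * y_{t-k},
# plus the initialization's leftover weight (1 - alpha)^t * y_1
t = len(y)
expanded = sum(alpha * (1 - alpha) ** k * y[t - 1 - k] for k in range(t))
expanded += (1 - alpha) ** t * y[0]

print(abs(y_hat - expanded))  # ~0: the two forms agree
```

Written this way, you can see directly that recent observations get the largest weights and older ones are discounted geometrically.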
Maybe you're trying to find the best model that fits a training set, or just a set of data; or maybe you're trying to find the best value of alpha for making forecasts and predictions. So that's what we mean by choosing optimal. By contrast, let's say that we wanted to find unique weights for each observation. That would be very difficult: we would have t weights to try and tune, which could be expensive.

In Python, you can implement simple exponential smoothing using the SimpleExpSmoothing model in statsmodels. Before we go ahead and show you how to do that, let's do a quick check that you have it installed. If you can run these two pieces of code, so the import runs and you can print out your version, then you're good to go. You may have some issues if your version is earlier than mine; when I wrote this, I had 0.13.1. You may also have some issues if your version is later than mine. But for the most part it should run, and if you have statsmodels, whether earlier or later, and you're getting errors, it might be useful to dive into the documentation and see why it's not working, or just upgrade, or I guess in some sense downgrade. Anyway, if you need help installing with pip or conda, depending on what you use, you can go ahead and click on this link and they give you the code to install it. Okay.

So simple exponential smoothing can be implemented with the SimpleExpSmoothing model type in statsmodels. An extra part that we're maybe not used to seeing is that it's stored within the time series analysis API of statsmodels, so we have to import SimpleExpSmoothing from statsmodels.tsa.api.

And now we're going to see how you make the model. The first thing to know is that you're actually going to both make the model and fit the model in the same step.
So we call SimpleExpSmoothing, and then you put in the data set that you're going to train on; so you put in GOOG_train.closing_price.values. And then, let's see, is there anything else you have to do? Nope. And then the next thing you do is call .fit(). So think of this as if we were to.