XGBoost Video Lecture Transcript

This transcript was automatically generated, so there may be discrepancies between the video and the text.

Hi, everybody. Welcome back. In this video we're gonna touch on ensemble learning with XGBoost. This is a nice Python package that implements the gradient boosting algorithm in a highly efficient way. Let me go ahead and share the Jupyter notebook and we'll get started. So we're gonna introduce XGBoost as a package and point to the package installation process; this one isn't necessarily as easy to install as some of our other packages have been. We'll discuss what XGBoost is and why we might use it over sklearn. We'll show you how to implement gradient boosting in XGBoost, in particular regression. Essentially, we're just gonna remake some of the stuff we covered in the gradient boosting notebook. And then we'll demonstrate how XGBoost implements early stopping, which is easier than the way we had to do it in sklearn. ...

So in the previous notebook we learned what gradient boosting actually is. As a reminder, we can think of it as iteratively training weak learners, where each next weak learner trains to predict the residuals of the current weak learner. So for instance, in a regression problem, the first weak learner is maybe a decision stump regressor that tries to predict the data; then we calculate the residuals, and the second weak learner tries to predict those residuals. And our prediction at any step is just the sum of the predictions from all of the weak learners up to that point. So we learned how to do this in sklearn with GradientBoostingRegressor, or whatever the actual name is.

So what the heck is XGBoost, and why would I use it? XGBoost stands for extreme gradient boosting. It is a particular Python package that gets utilized a lot in winning data science competitions, and maybe this is why its popularity has increased in recent years ("recent" having 2022, when this was recorded, as a reference point). XGBoost is a package that implements gradient boosting in a way that's different from the way sklearn implements it. The way XGBoost fits a gradient boosting model is quicker: it builds the decision trees in a faster fashion, and it does this in a way that can also be parallelized, so you can run it with parallel processing, which also makes it faster. Essentially, the people behind XGBoost came up with an algorithm for fitting these decision trees (they don't have to be stumps; they can be two or three levels deep) that's much faster and that can be parallelized, so you can just make it run faster than the gradient boosting regressor in sklearn. That's why it's popular.

I said we'd talk about installation. As a quick note, you likely don't have this installed on your computer yet unless you've used XGBoost before. When I installed XGBoost on my machine (I'm running a MacBook, so an Apple laptop), it didn't work at first; I had to install an extra bit of software onto my MacBook that is not directly related to Python. After I did that, it worked. Just follow the instructions: if you use pip or conda, you can check out the links here on how to install it and see if that works. You may also run into issues if you're running a MacBook with an M1 chip; the standard installation instructions may not work for you, so simply running conda install xgboost may not work. You may want to perform a web search to find the relevant instructions for getting it to work on your M1 machine, if you have one.
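As a quick sanity check after installing (this snippet isn't from the notebook, just a common pattern), you can make sure the package imports and see which version you got:

    import xgboost as xgb

    # if this prints a version number, the installation worked
    print(xgb.__version__)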
OK. So with all that in mind, we now know how to install it, hopefully, or at least we know where to go to install it, and we know why people might want to use XGBoost. So at this point I'm assuming you have XGBoost installed, and we're gonna show you how to implement regression in XGBoost. ...

OK. So we're gonna go back to this problem where we have y, which as we've defined it is just x squared, and we have a feature x, and we're gonna build some regression trees on this. There are multiple ways to do this, but one way that's very similar to what we learned how to do in sklearn is to make a gradient boosting regressor in XGBoost using XGBRegressor. The notation and syntax for XGBRegressor has been written in a way that emulates the way you build a gradient boosting regressor in sklearn, so this is a good starting point, and if you want to learn more about XGBoost you can always go into the documentation.

So we're gonna start here. What we're gonna do is create an XGBRegressor object, and essentially we're gonna mimic that graphic we had where we demonstrated the different learning rates and how they impact the fit. So we're gonna call xgboost dot... and you know, here we imported the package; we could have also just imported XGBRegressor directly, but I didn't. XGBRegressor: we set a learning rate, which in the top one will be 0.1, and then we'll go down and change it in the second one, xgb_reg_2. We will set a max depth of one (by default the max depth is not one; I believe it's something larger), and then we'll set the number of estimators to just be 10. OK. And now we'll go down here and change our learning rate.

Now, to fit: as I said, they wrote XGBRegressor specifically to be incredibly similar to sklearn, so we just call .fit with X.reshape(-1, 1) and then y. ... OK. And you're seeing this just like we would have seen if this was an sklearn object, so this means everything was fit correctly; this is just what was printed out when we called .fit for the highlighted code. And then here's what we got. And here we can also see how you predict. Predicting is just the same as in sklearn: it's .predict. And so these graphics should look somewhat similar; they're probably not exactly the same as the ones in notebook number six on gradient boosting, but they should be pretty close.
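Here's a minimal sketch of the kind of code being described, assuming toy data roughly like the notebook's (the exact data generation and variable names are my guesses):

    import numpy as np
    import xgboost as xgb

    # toy regression data: y is x squared
    X = np.linspace(-1, 1, 100)
    y = X ** 2

    # a gradient boosting regressor made of 10 stumps with learning rate 0.1
    xgb_reg = xgb.XGBRegressor(n_estimators=10, max_depth=1, learning_rate=0.1)

    # fitting and predicting mirror the sklearn interface
    xgb_reg.fit(X.reshape(-1, 1), y)
    y_pred = xgb_reg.predict(X.reshape(-1, 1))

A second regressor with a different learning rate (say 1) could be fit the same way to reproduce the learning rate comparison graphic.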
So a nice feature of XGBoost's fit method, in comparison to sklearn's, is that if you feed it a validation set, you can get the performance on that validation set as soon as you've fitted something. So what we're gonna do is generate a validation set, because remember this data is randomly generated; I don't need to do a train test split, I can just go out and get more data any time I want. OK.

So if we feed in this validation set, X_val and y_val, when we go to fit the regressor, then it will return the performance at each step of the training process. Remember, we incrementally train a bunch of weak learners, so at each step we have a regressor, and we can see how that regressor performs at that step, which can allow us to see when we should stop training the model: when was the model the best? So what we do here is define an XGBRegressor with 500 estimators, a learning rate of 0.1, and a maximum depth of one, and then we're gonna fit the model. This should look familiar: X.reshape(-1, 1), comma, y. And now the new argument here is eval_set. The eval_set argument is where you put the validation set: you put it in a list, and this list can hold multiple validation sets. So if I had two validation sets, this would be a list with two entries; I only have a single validation set, so it's a list with one entry. And each entry of the list should be stored in a tuple: the first entry of the tuple should be the features, and the second entry should be the output. OK.

So now when I call fit, a bunch of stuff gets printed out, and what you see printed is the validation set performance metric; for regression the default is the root mean squared error. So this, for instance, was the RMSE with just a single decision stump, this is the RMSE with two decision stumps, three decision stumps, and we can track it as it goes all the way to 499 (rather than 500, because it's Python and counts from zero). And then we can also ask: how can I access this? It's nice that it was printed out, but I was told I could access it; how do I do that? You call xgb_reg, which is the variable we stored this in, and it has a method called evals_result. What gets returned is a dictionary, and within that dictionary is another dictionary where we can access things like the RMSE, which is what we want. And so here we can see xgb_reg, and then what is the key we need? We need 'validation_0'. ... OK. Let's just copy and paste... OK, this is what it was; I forgot evals_result. OK. And so we can see here now we have a dictionary ... and that dictionary has 'rmse' in it, and if we scroll, that's the only thing in it. So now all we have to do is access the RMSE with that key, and now we have the list of values. OK.

And so once I have this, I can plot the RMSE as a function of the number of trees I've added, which is what I do with this code chunk. We can see the root mean squared error on the vertical axis and the number of weak learners on the horizontal axis, and I've plotted the minimum. The number of weak learners I needed to reach the minimum root mean squared error is somewhere between 200 and 300. So this is another instance where you can see I had to train something like 250 more weak learners than I needed once I hit the minimum here. So it might be nice to do some sort of early stopping.
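Before moving on to early stopping, here's a rough sketch of the validation tracking just described (variable names like X_val and the plotting details are my own assumptions, not necessarily what the notebook uses):

    import numpy as np
    import xgboost as xgb
    import matplotlib.pyplot as plt

    # training and validation data, roughly like the notebook: y is x squared
    X = np.linspace(-1, 1, 100)
    y = X ** 2
    X_val = np.linspace(-1, 1, 50)
    y_val = X_val ** 2

    xgb_reg = xgb.XGBRegressor(n_estimators=500, max_depth=1, learning_rate=0.1)

    # eval_set is a list of (features, output) tuples; here just one validation set
    xgb_reg.fit(X.reshape(-1, 1), y,
                eval_set=[(X_val.reshape(-1, 1), y_val)])

    # evals_result() returns nested dictionaries: {'validation_0': {'rmse': [...]}}
    rmse_history = xgb_reg.evals_result()['validation_0']['rmse']

    # plot the validation RMSE against the number of weak learners
    plt.plot(range(1, len(rmse_history) + 1), rmse_history)
    plt.xlabel("Number of weak learners")
    plt.ylabel("Validation RMSE")
    plt.show()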
So how can I do early stopping? Well, you just have to put it in as an argument in the fit step. Here I'm gonna redefine the same exact XGBRegressor: 500 weak learners as the default number that would get fit, each with a maximum depth of one and a learning rate of 0.1. So what I would do now is xgb_reg.fit with X.reshape(-1, 1), so this is the same, comma y, and then I want early_stopping_rounds. All I have to do to ensure that early stopping happens is set early_stopping_rounds to the number of times in a row that my error measurement is allowed to not go below the lowest value seen so far. So if I set that equal to 10, this will mimic the exact same thing we did in sklearn's version. And then I can set my eval_set just like before: X_val.reshape(-1, 1) and y_val, and this should be stored in a tuple, so it's a tuple within a list, because you can put in more than a single one; if I wanted to, I could have put in a second validation set. OK.

And so here you can see this didn't go all the way to 500, and when I plot it out we can see why ... because the minimum value occurred at, it's hard to tell exactly since the axis is in increments of 50, but probably around 225, and then 10 steps after that it realized: OK, I didn't get below that, my early_stopping_rounds was set to 10, so I'm gonna be done training and say that I've hit my minimum. OK? ...

And then here's what that looks like: this is what the prediction looks like using that minimum. OK.

All right. So we've really just scratched the surface of what XGBoost can do; we've shown you how you can recreate in XGBoost what we did in sklearn. There are a lot more features that make it nice, so if you wanna learn more, you can check out the practice problems that go along with this notebook, and you can read the XGBoost documentation that I've linked to here. OK. So I hope you enjoyed learning about XGBoost. I enjoyed having you come watch this video, and I can't wait to see you next time. All right. Have a great rest of your day. Bye.