Voter Models Video Lecture Transcript

This transcript was automatically generated, so there may be discrepancies between the video and the text.

Hi, everybody. Welcome back. In this video we're going to learn more about ensemble learning with voter models. Let me go ahead and share my Jupyter notebook.

So in this video we're going to talk about voter models. There's not much more to say than that. We will cast this in the light of classification, so these are going to be voter models that work on classification problems, but the approach also works for regression; we'll mention explicitly how it works for regression later in the notebook.

The idea behind the voter model method is that you have a few different classifiers for whatever classification problem you're working on, and you think they're all pretty good. So for instance, maybe you have a logistic regression model, a k-nearest neighbors model, a support vector machine, and maybe a random forest. So there are four different models. A voting classifier then looks at how each of your classifiers decides to classify a point and goes with the decision that receives the most votes. What this means is that for each observation you feed into these four classifiers, each of them would say, "All right, I think this is a zero," or "I think this is a one"; you count up the number of zeros, you count up the number of ones, and you go with the majority. If there is a tie, it will just randomly choose, sort of by flipping a coin. This works for multiclass models as well. So for instance, if you have classes 0, 1, 2, and 3, it will look for the class with the most votes, and again, if there's a tie, it will just randomly choose the winner from the tied classes. So if 0, 1, 2, and 3 each had one vote, it would just randomly choose between the four.

OK, so we're going to implement this on this data set, which we've been using a lot for our classification problems. We're going to show you how to fit a voting classifier and then show you the decision boundary that results. We're going to fit exactly what we said, so we need to import all of the base classifiers: from sklearn.neighbors we import KNeighborsClassifier (I had to double-check that name), from sklearn.linear_model we import LogisticRegression, from sklearn.svm we import LinearSVC, and the last thing we need is the random forest, so from sklearn.ensemble we import RandomForestClassifier. Now we need the voting classifier: from sklearn.ensemble we import VotingClassifier. And then we're going to use accuracy as a metric, so from sklearn.metrics we import accuracy_score.

Now we will make our base models, meaning that for comparison purposes we're going to make a k-nearest neighbors model, a logistic regression model, a support vector machine model, and a random forest. So knn is equal to KNeighborsClassifier, and let's go with 10 neighbors. log is equal to LogisticRegression. (I'm just checking some names real quick; I call it log further down, so I just want to make sure I'm consistent.) The support vector machine is svm equal to LinearSVC, and let's set C equal to 1. And then finally, for our random forest, rf is equal to RandomForestClassifier with 500 trees, and then we'll set maximum depth equal to 5. Is there anything else I need? Let's also set a random state, just so the results are reproducible.
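As a quick sketch, the imports and base models just described might look like the following in code. The hyperparameter values are the ones from the lecture, while the random_state value is an arbitrary stand-in, since the value in the recording is garbled.

```python
# Base classifiers used in the lecture
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier

# The ensemble model and the accuracy metric
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

# The four base models, with the hyperparameters mentioned above
knn = KNeighborsClassifier(n_neighbors=10)
log = LogisticRegression()
svm = LinearSVC(C=1)
rf = RandomForestClassifier(n_estimators=500,
                            max_depth=5,
                            random_state=216)  # arbitrary seed; the lecture's value is unclear
```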
OK, now we're going to show you how to make a voting classifier. We're going to fit each one of these models individually and show you how each performs on this data set as a comparison point, and then we're going to show you how to make the voting classifier. The syntax is very similar to a pipeline: you first call VotingClassifier and then you input a list of tuples. Each tuple is going to have a string, which is the name of that classifier, as well as the classifier object. The first one we'll put in is the k-nearest neighbors; we want it to be exactly the same model, so we'll copy and paste it in. That's our first classifier. So again, you have a list of classifiers, and each classifier in your list gets a name, so you can access it later if you would like, along with the classifier object. Now that's k-nearest neighbors; the next one we need is logistic regression, then we need our SVM, and then finally we need our random forest. Copy that in, and then close the list because you've filled them all in.

Now I have a loop here. What this loop is going to do is go through, fit each individual classifier, print out the accuracy on the training set for that classifier, and then draw the decision boundary that results. We're going to do this for all four of our base classifiers, and then we'll show you how the voting classifier compares. Once you have the voting classifier object, you fit it and predict with it in exactly the same way you would for any other sklearn model.
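Here's a sketch of the construction and comparison loop just described, using the model objects from the code above. X_train and y_train are placeholder names for the lecture's data set, which isn't shown in the transcript, and the decision-boundary plotting is left out.

```python
# The syntax mirrors a Pipeline: a list of (name, estimator) tuples
voting = VotingClassifier([('knn', knn),
                           ('log', log),
                           ('svm', svm),
                           ('rf', rf)],
                          voting='hard')

# Fit each base model and the ensemble, then compare training accuracies
# (X_train, y_train are stand-ins for the lecture's data set)
for name, clf in [('knn', knn), ('log', log), ('svm', svm),
                  ('rf', rf), ('voting', voting)]:
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_train, clf.predict(X_train)))
```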
OK. So here's the logistic regression, here is the random forest, here is the support vector machine, here is k-nearest neighbors, and then here is the voting classifier, so we can compare the different decision boundaries. There are some here that have perfect accuracy; maybe we can change the settings so it's a little bit more instructive. It's not that you want worse classifiers, but I want to be able to show what's going on with the voting classifier.

OK. So you can see here that individual classifiers may make mistakes on their own. For instance, it looks like the logistic regression boundary is a little bit above the actual boundary. The random forest has the sort of boxy boundaries that its base decision trees have, which cause it to incorrectly classify some points. Now, we could have fixed this by increasing the depth, which I changed just for illustrative purposes for the voting classifier; in practice, I would have gone with a random forest that is better than this one. But here we're getting some boundaries that aren't as good, maybe due to overfitting on the data set. Then we have the support vector classifier, which seems pretty good, except that, similar to the logistic regression, the line appears to go a little bit above. And then k-nearest neighbors misses this corner set, because these three triangles are too close to the rest of the blue circles and far away from the remaining triangles.

And this is the idea with the voting classifier. The way k-nearest neighbors is incorrect is different from the way the random forest is incorrect, which is different from the way the logistic regression is incorrect. If each of the voting classifier's base classifiers is wrong in different enough ways, the voting classifier should be able to overlap enough on the things that are right to outperform any single base classifier. I think a good illustration of this is that k-nearest neighbors gets these three triangles wrong but for the most part gets everything else right, while all the other classifiers get those three triangles correct. So when it comes time for voting, three of the four are correct, meaning that the voting classifier will be correct down there. And that's the idea with the voting classifier: if you have models that are independent enough from one another and make mistakes in unique ways, those mistakes get covered up when you bring all of them together to vote on each individual observation.

One thing you might be wondering: when I made this, there was this argument voting='hard', which I didn't say anything about when I fit the model above. So why did I do that? Well, voting='hard' is exactly the prediction method we just talked about: there's literally a counting of hands. How many of you think this is a one? How many of you think this is a zero? This is called hard voting. It's voting in the way you might think of it, like taking a straw poll and asking how many people think it should be a zero and how many think it should be a one.

There's also the option to set voting='soft', and this is called soft voting. For this type of voting classifier, you make predictions according to the probabilities assigned by each base classifier. It's soft because you're considering the nuance of, well, maybe I think this observation has a 52% chance of being a one, versus another observation that I think has a 99% chance of being a one, and so forth. For soft voting, the prediction is determined by choosing the class for which the classifiers' probabilities, summed together, are largest:

$$\hat{y} = \arg\max_{c} \sum_{b=1}^{B} P_b(y = c \mid X),$$

where each $P_b(y = c \mid X)$ is classifier $b$'s probability that the observation is of class $c$ given the features, and capital $B$ is the number of classifiers. So for instance, if this sum across all four classifiers is largest for class one, then class one is what you'd go with. I sum the probabilities up across all the different classifiers and choose the class for which the sum is largest. That's soft voting.

I'll also point out that you can do a sort of weighted voting by giving weights to each classifier. Maybe you would want to weight each classifier by its accuracy on some training or validation set; that's also an option, so it doesn't have to be uniform voting. If it were weighted voting, maybe one of the classifiers gets a weight of 2 while the others have a weight of 1, and then the one with a weight of 2 has twice the input of any of the other classifiers.
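To make the soft and weighted options concrete, here's a sketch building on the objects above. One caveat the lecture doesn't hit: soft voting requires every base model to implement predict_proba, which LinearSVC does not, so this sketch swaps in SVC with probability=True. The weights are just an illustration of giving the random forest twice the say.

```python
from sklearn.svm import SVC

# LinearSVC has no predict_proba, so soft voting needs a probabilistic SVM
svm_soft = SVC(C=1, probability=True)

soft_voting = VotingClassifier([('knn', knn),
                                ('log', log),
                                ('svm', svm_soft),
                                ('rf', rf)],
                               voting='soft',
                               weights=[1, 1, 1, 2])  # rf gets twice the input
soft_voting.fit(X_train, y_train)
```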
For soft voting, the same thing goes for weights: you just multiply the probabilities from each classifier by that classifier's weight, as in the sketch above. So that's how it works for classification.

I said I would explain what to do for regression. Voter models for regression work in a very similar way, but instead of a vote like "I think it should be 2.4" versus "I think it should be 3.6," your voting is just taking an average of the predictions. For any particular observation that you want a prediction for, you get the regression predictions from all the base regressors and then take the arithmetic mean of those. So for instance, if you had a model that said 1.2, a model that said 1.4, and a model that said 1.6, you would add those up and take the arithmetic mean, which should end up being 1.4, assuming I can still do arithmetic on the fly.

Now, the big key here is that when you do a voter model for regression, you shouldn't just fit four or five slightly different linear regression models. So maybe you're thinking, OK, I'll make a linear regression where I use features one and two, a separate linear regression where I use features two and three, and a third linear regression where I use features one and three. That's not what you should do here. You need to have distinct regression models. What I mean by that is maybe one of them is a linear regression, but then maybe the second one is a k-nearest neighbors regression, maybe the third one is a support vector regression, and then maybe the fourth one is a random forest regression or a gradient boosting regression. The idea is that you want your models to be different from one another and independent of one another, so they aren't making the same types of mistakes. That way you can hopefully get that overlap where one makes a mistake in a unique way that the others do not, and when you combine them together, the mistake gets averaged out. If you want to implement a voter model for regression, you can use the VotingRegressor. It works in much the same way as the VotingClassifier: you input a list of regressors.

One thing I want to demonstrate before we sign off on this video: we gave these things names, right? What did I call it? I think I called it just voting. Yeah, voting. So I believe we should be able to access them just like a pipeline. Here's a nice tip that maybe I haven't shown before: if you want to know what you can access from a variable, you can call dir() on it, and it shows you all the methods and attributes the variable has. For voting, that shows things like fit, transform, get_params, predict, and named_estimators. If we call voting.named_estimators_, we can access the individual estimators that we've stored in there; for instance, we could get the predictions from one of them. We would have to fit it first, but we can access the individual classifiers stored within voting by calling named_estimators_ and then indexing with the string we used to label the classifier when we defined it.
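As a closing sketch of those last two points, here's what a VotingRegressor built from distinct regressor types might look like, plus the named_estimators_ lookup on the voting classifier fit earlier. The regressor hyperparameters here are illustrative choices, not values from the lecture.

```python
from sklearn.ensemble import VotingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

# A voter model for regression: distinct model types, and the
# prediction is the arithmetic mean of the base regressors' predictions.
# Fit and predict work just like the classifier version.
reg = VotingRegressor([('lr', LinearRegression()),
                       ('knn', KNeighborsRegressor(n_neighbors=10)),
                       ('rf', RandomForestRegressor(n_estimators=500))])

# dir(voting) lists everything you can access on the voting classifier;
# after fitting, named_estimators_ holds the fitted base models by name
knn_fitted = voting.named_estimators_['knn']
print(knn_fitted.predict(X_train[:5]))  # predictions from just the KNN piece
```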
All right, so that's it for this video. You now know about voter models for classification, since we've made one, and theoretically you should also know how to make one for regression, but maybe you'd need to try it out before you can be sure. I hope you enjoyed learning about voter models. I enjoyed having you learn about voter models with me, and I hope to see you in the next video. Have a great rest of your day. Bye.