What is Clustering
Video Lecture Transcript
This transcript was automatically generated, so there may be discrepancies between the video and the text.

Hi, everybody. Welcome back. In this video, we're going to introduce the clustering subsection of unsupervised learning by talking about what clustering is. We're going to briefly define what a clustering problem is. In this notebook we just define it, and then we'll move on to actual clustering algorithms in a different notebook and video. So let me go ahead and clear my kernel, and we'll be ready to go.

In other videos, we've talked about a process known as dimension reduction. We used things like PCA and t-SNE. There are other techniques, but those are the two that we covered, where you take a high-dimensional data set and turn it into a lower-dimensional data set. Another unsupervised learning task that we're going to talk about is called clustering. In clustering, we look to identify groupings of similar data points in otherwise unlabeled data, and we're going to keep labeling this data X.

The idea is that you think there may be natural groupings of the data points in your data set, or you're trying to see whether there are. In a marketing setting, maybe this refers to different market segments: people who have similar interests will group together, and therefore you can send different types of advertisements to the different groups, which will hopefully work better for them. Similarly, maybe you're working for some sort of app company or hardware company, where you're selling products to people or you have a product that people are using. You can do this sort of segmentation or clustering to try to identify different types of product users: people who use your product, but maybe for different reasons or in different ways.
For instance, we're going to see an example of what a clustering algorithm might do. Let's say we have some data given by X that looks like this. This is what the data looks like in your data set: just an unlabeled series of points. In this example we have just two features, x1 and x2, but in practice you'll often have many more. You might see here that there appear to be two groups forming: one in this blob in the upper right, and one in this blob in the lower left. A clustering algorithm might go through this data and then produce something like this. However the algorithm works, maybe cluster one is these points down here, identified as the red circles, and then it outputs cluster two as these blue plus symbols up above. This is a standard output from a clustering algorithm: it takes in your points and then, in some way (we'll see two specific ways in a later video and notebook), produces clusterings using the data points as inputs to the algorithm.

As I said, in this section we will learn two clustering algorithms, and we'll talk about those in other videos and notebooks. I hope I gave you a good idea of what clustering is all about and why we might be interested in using it in real-world applications, whether in research, industry, or personal projects. I hope to see you in those other videos, where you can learn more about the two clustering algorithms. All right. Have a great rest of your day. Bye.
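
As a rough illustration of the two-blob example described in the video, here is a minimal sketch in plain Python. The video does not say which algorithm produced the two clusters, so this sketch assumes a simple two-center, k-means-style procedure as a stand-in. The function names `make_two_blobs` and `two_means`, the blob locations, the spread, and the seed are all hypothetical choices made for illustration, not the notebook's actual data or method.

```python
import random


def make_two_blobs(n_per_blob=50, seed=0):
    """Generate unlabeled 2-D points: one blob in the lower left, one in the upper right.

    The centers (-2, -2) and (2, 2) and the spread 0.5 are assumptions for this sketch.
    """
    rng = random.Random(seed)
    lower_left = [(rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)) for _ in range(n_per_blob)]
    upper_right = [(rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)) for _ in range(n_per_blob)]
    return lower_left + upper_right


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest_center(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda k: squared_distance(point, centers[k]))


def two_means(points, n_iter=20):
    """Group points into two clusters by alternating assignment and center updates.

    This is a k-means-style stand-in with k fixed at 2, not the lecture's algorithm.
    """
    # Deterministic start: the points with the smallest and largest x1 + x2.
    centers = [min(points, key=sum), max(points, key=sum)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [nearest_center(p, centers) for p in points]
        # Update step: each center moves to the mean of its assigned points.
        for k in range(2):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centers


X = make_two_blobs()
labels, centers = two_means(X)
print("cluster sizes:", labels.count(0), labels.count(1))
print("cluster centers:", centers)
```

The input `X` carries no labels at all; the cluster assignments in `labels` come only from the positions of the points, which is the sense in which clustering is unsupervised.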