Data Cleaning Video Lecture Transcript

This transcript was automatically generated, so there may be discrepancies between the video and the text.

Hi, everybody. Welcome back. In this short video, we'll do a brief introduction to data cleaning, otherwise known as data preprocessing. So this is sort of a needless Jupyter notebook, but maybe it's nice to focus your eyes on something other than the black background.

So in this series of notebooks, stored under the cleaning folder in the lecture's repository, we're gonna have an introduction to data preprocessing. A lot of the times that we've worked on things, maybe we've done something like one-hot encoding, or multiplied some columns together to get a square or an interaction term if we're working on regression. As you continue to go through and learn more model types, as well as other types of data science techniques, you're gonna need to know more and more about data preprocessing as things become more complicated. So the brief little asides we have in the notebooks about the different techniques aren't gonna be enough, and sometimes a technique will warrant having its own Jupyter notebook in this cleaning folder.

So the stuff in this cleaning folder is mostly about data preprocessing, which is the act of cleaning our data, or preparing it for the steps of being fit by a model or used for some sort of prediction. We'll do things like scaling data and handling missing data with imputation. We'll also learn about pipelines, which are nice, seamless ways for you to take a data set from a data frame or NumPy array through all the steps, like cleaning it with the scaling, or maybe making one-hot encodings, or multiplying columns together and making polynomial transformations. All of this will be covered in this cleaning folder, and cleaning is just another word for data preprocessing.
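
To make a few of these ideas concrete (imputation, scaling, and chaining steps together like a pipeline), here is a minimal plain-Python sketch. The function names and the sample data are hypothetical and are not taken from the lecture notebooks, which may well rely on a library for these steps. The sketch also assumes a single numeric column with at least one observed value and at least two distinct values.

```python
from statistics import mean, pstdev


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [x for x in column if x is not None]
    fill = mean(observed)
    return [fill if x is None else x for x in column]


def standard_scale(column):
    """Shift a column to mean 0 and rescale it to unit population standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)  # assumption: the column is not constant, so sigma > 0
    return [(x - mu) / sigma for x in column]


def run_pipeline(column, steps):
    """Apply each preprocessing step in order, feeding the output of one into the next."""
    for step in steps:
        column = step(column)
    return column


# Hypothetical sample data: one numeric column with a missing value.
heights = [150.0, None, 170.0, 160.0]

# Impute first, then scale, so the scaler never sees a missing value.
cleaned = run_pipeline(heights, [impute_mean, standard_scale])
print(cleaned)
```

The order of the steps matters here: the scaling step computes a mean and standard deviation, so it needs the missing value to be filled in first. Bundling the steps into one ordered list is the basic idea behind a pipeline.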
So if you watch these videos, I hope you enjoy learning about the different cleaning steps in the data science process. It's a lot of what you do in the real world when you're working with data and data science. All right. I hope you enjoyed this video, and I hope you enjoy learning about cleaning, or data preprocessing. Bye.