Plotting Tips Video Lecture Transcript

This transcript was automatically generated, so there may be discrepancies between the video and the text.

Hi, everybody. Welcome back. In this video, we're going to talk about more presentation tips and tricks, focusing mainly on plotting. Bad plots are a good way to sink a great presentation, so in this video I'm going to try to give you some tips and tricks to improve presentation plots using Python. Some of this will touch on Python plotting that maybe you weren't familiar with before; the rest will share tips that I think are good for making nice, clean presentation plots that are as easily readable for your audience as possible.

One of the first things that I think is really important is having good axis and tick labels. I've seen a lot of presentations in the past where presenters put up a plot but don't label their axes or their ticks, making it almost impossible for me to understand what information is being conveyed. This is particularly an issue if you're going through a quick presentation where there's not much time spent on any one slide; you want to make sure that your axis and tick labels are as clear as possible. Let's demonstrate what we mean using the iris data set. Here, I've plotted sepal width on the horizontal and sepal length on the vertical. So this is what it looks like. But notice that we don't have a vertical axis label or a horizontal axis label. If I were to put this in a presentation, how would anybody know what I was talking about if they're just looking at the image? Maybe they didn't hear me say that the vertical axis was sepal length and that the horizontal axis was sepal width. So in general, you should label your plot's horizontal and vertical axes so that your audience knows what it is they're looking at.
You can do that with matplotlib just by calling plt.xlabel and putting in a string for the x label, and you can make sure the labels are pretty large by changing the font size. So here's xlabel, and then ylabel gives the vertical axis. So this is an improvement. But just looking on my laptop screen, I can read that this is an 8 and this is a 7.5; think in terms of the audience that is going to be consuming your slides while you're giving your presentation, for whom this may be pretty small. If you're doing an in-person presentation, think of the person at the back of the room. And if you're doing an asynchronous, pre-recorded presentation, people may not have the same quality laptop screen that you have, or maybe they're focusing on your face. I don't know. But in general I think it's important to have tick labels as big as you can reasonably get. One way in matplotlib that you can increase the tick labels is to call plt.xticks and just put in a fontsize argument, and then the same with plt.yticks. So now we have a nice clear label for each axis, and our tick labels are relatively large. I would say this is a vast improvement over the one that we started with, which had small tick labels and no axis labels.

I want to point out that this was when we created the figure using plt.figure. Sometimes you may have plots that are subplots, and maybe you created them with plt.subplots. If you use plt.subplots, what you're probably going to want to use to change the axis labels, the ticks, and the tick labels are these arguments here. It works close to the same way. They're slightly different, but I'll leave it to you to read through the documentation I've provided here and figure it out for your particular plots.
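As a rough sketch of the calls described above (not the lecture notebook's actual code), the example below labels both axes and enlarges the tick labels, first in the plt.figure style and then on an Axes object from plt.subplots. The data are made-up stand-in values rather than the real iris measurements, the font sizes are arbitrary choices, and `ax.tick_params` is just one of several ways to resize tick labels on an Axes; it may not be the one the lecture pointed to.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-in values (assumption): NOT the real iris measurements
rng = np.random.default_rng(0)
sepal_width = rng.normal(3.0, 0.4, size=50)
sepal_length = rng.normal(5.8, 0.8, size=50)

# Style 1: figure created with plt.figure, labeled through pyplot functions
plt.figure(figsize=(10, 8))
plt.scatter(sepal_width, sepal_length)
plt.xlabel("Sepal Width", fontsize=18)   # horizontal axis label
plt.ylabel("Sepal Length", fontsize=18)  # vertical axis label
plt.xticks(fontsize=14)                  # enlarge the existing x tick labels
plt.yticks(fontsize=14)                  # enlarge the existing y tick labels
pyplot_axes = plt.gca()                  # keep a handle for inspection

# Style 2: figure created with plt.subplots, labeled through Axes methods
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
ax.scatter(sepal_width, sepal_length)
ax.set_xlabel("Sepal Width", fontsize=18)
ax.set_ylabel("Sepal Length", fontsize=18)
ax.tick_params(axis="both", labelsize=14)  # tick label size on both axes
```

The two styles produce the same kind of plot; the Axes methods are the ones that still work when a figure holds several subplots, since each subplot has its own Axes object.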
There are also a lot of good Stack Exchange posts talking about how to change different things in matplotlib plots. So if there's a question that you have and you can't find the answer easily in the documentation, I encourage you to just do a web search, and I'm sure you'll find what you need.

The next piece of advice I would give is that when you have something you're trying to plot and a legend would be useful, you should include an informative legend. Let's look at an example, again with our iris data. Remember the iris data: it looks at three different types of iris and gives four measurements for each. Here, we've pictured two. I've colored the points according to their iris class, and I've provided a legend in this example. Now, you and I, who have looked at this data set before, maybe know what y = 0, y = 1, and y = 2 mean. But the audience member who might be seeing this data set for the first time doesn't know what y = 0, y = 1, or y = 2 means. So when I say informative legends, I mean informative for the people consuming your plot. Don't put in labels like 1, 2, 3 if those labels have an actual meaning in terms of the problem. If they stand for the type of iris, you should put in the name of the iris instead of y = 0. In this example, you could do this by including a label argument when you call scatter or plot, and then when you call legend, it will come up. So up here, we've made it so that setosa is when y is 0, versicolor is when y is 1, and virginica is when y is 2. And then when I call my legend, this is now a legend that is informative to somebody who doesn't know the data set but is listening to me talk about how we made a classifier for different types of irises.
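A minimal sketch of the label-then-legend pattern just described, assuming made-up stand-in data in place of the real iris measurements. The mapping from class codes to names follows the transcript (0 is setosa, 1 is versicolor, 2 is virginica); the variable names and font sizes are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Class codes and the names an audience would actually recognize
species_names = {0: "setosa", 1: "versicolor", 2: "virginica"}

# Made-up stand-in values (assumption): NOT the real iris measurements
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 30)
sepal_width = rng.normal(3.0, 0.4, size=y.size)
sepal_length = rng.normal(5.8, 0.8, size=y.size)

fig, ax = plt.subplots(figsize=(10, 8))
for code, name in species_names.items():
    mask = y == code
    # label= is the text the legend will show for this group of points
    ax.scatter(sepal_width[mask], sepal_length[mask], label=name)

ax.set_xlabel("Sepal Width", fontsize=18)
ax.set_ylabel("Sepal Length", fontsize=18)
legend = ax.legend(fontsize=14)  # picks up the label= strings from above

legend_labels = [text.get_text() for text in legend.get_texts()]
```

Plotting each class in its own scatter call is what lets each group carry its own label; matplotlib also assigns each call the next color in its default cycle.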
And here, in the plot, are those different types of irises in the data. So someone would be able to process: OK, setosa is a different type of iris from versicolor and from virginica. But if you're just showing them abstract things like y = 0, y = 1, y = 2, that's not as directly relatable to the problem that you're working on.

I also want to point out that here the legend went up into the upper right-hand corner on its own. But let's say that, when you're looking at it, this feels like a bad position. You can always change the position of your legend using the loc argument inside of legend. If you're interested in learning more about that, I've provided the link to the documentation here, where you can figure out how to change the location of your legend using this argument.

The next piece of advice I want to give is that you should use your colors and your markers appropriately. Here we've used colors to convey extra information, namely the type of the iris. But what if the type of iris was already conveyed by another piece of information in the plot? This is a personal preference, but I think it's a general preference that a lot of people hold: when you're using a color or a different marker or line type, make sure that it's conveying information that is not already present in the plot. In this example, the type of the iris was not present in the earlier version of the plot, where all of the points were the same color. So the addition of color allows us to convey additional information to the audience that would not be present without that color. However, let's consider a different plot, this swarm plot here. This swarm plot has the type of iris on the horizontal axis and the sepal width on the vertical axis.
And so each point here represents one of our observations and gives the sepal width for that observation, and it's categorized according to the class of the iris. Now, here we have different colors for the iris classes, just like we did above. But the key difference for this plot is that the horizontal axis already provides that information for us. By looking at the horizontal axis, we can tell that all of these are a 0, which is setosa (we'll change the labels on this later), all of these are a 1, and all of these are a 2. So it might be preferable to not repeat the information and instead give these a uniform color, because the color here is repetitive: it doesn't add any information that the horizontal axis doesn't already convey.

Now I will say, and this might be a slight contradiction to what I just said about repetitive information, there can be instances where you do want to repeat information in this sense. If you're going to encode information with color like we did above, it can be preferable to include both a different color and a different marker. What I mean by this is we'll have blue circles, maybe we'll have orange triangles, and we'll have green X's. So something like this. You might be saying, "Well, Matt, you just told me not to repeat information with my plotting choices, right? Don't use color if it's repeating something already available in the plot." The reason I'm going to say we might want to break that rule is that there are individuals, namely color-blind individuals, who could look at this plot and not tell the difference between the green dots, the orange dots, and the blue dots. By using both color and marker shape (X, triangle, circle), we expand the number of people we're able to convey the meaning of this plot to.
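The blue circles, orange triangles, and green X's described above could be drawn along these lines. This is a sketch under stated assumptions: the data are made-up stand-in values, the `styles` dictionary is a hypothetical helper for pairing each class with a color and a marker, and the legend position passed through the `loc` argument mentioned earlier is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical pairing of each iris type with BOTH a color and a marker shape,
# so the classes stay distinguishable even if the colors are not
styles = {
    "setosa": {"color": "tab:blue", "marker": "o"},        # blue circles
    "versicolor": {"color": "tab:orange", "marker": "^"},  # orange triangles
    "virginica": {"color": "tab:green", "marker": "x"},    # green X's
}

# Made-up stand-in values (assumption): NOT the real iris measurements
rng = np.random.default_rng(0)
fig, ax = plt.subplots(figsize=(10, 8))
for name, style in styles.items():
    sepal_width = rng.normal(3.0, 0.4, size=30)
    sepal_length = rng.normal(5.8, 0.8, size=30)
    ax.scatter(sepal_width, sepal_length, s=60, label=name, **style)

ax.set_xlabel("Sepal Width", fontsize=18)
ax.set_ylabel("Sepal Length", fontsize=18)
# loc moves the legend; "upper left" here is only an example position
legend = ax.legend(fontsize=14, loc="upper left")

legend_labels = [text.get_text() for text in legend.get_texts()]
```

The legend shows each marker shape next to its name, so the key remains readable without relying on color.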
People who are color blind and can't see the difference between green, orange, and blue can look at this plot and see: OK, I might not be able to see the colors, but I can see that virginica are X's, versicolor are triangles, and setosa are circles. So if it's possible, you may want to use both color and marker or line style. If it's not possible, that's OK. Sometimes you'll have too many different types of things to change the marker for all of them. But I think it's preferable to try to be accommodating to people who can't tell the difference between the different colors when you can. So I would say that this is an appropriate instance where you can use more than one plotting choice to convey the same meaning, if it's allowing you to increase the number of people who will be able to look at your plot and understand it.

If you're interested in all the different matplotlib colors, you can come here, and it gives you all the different named colors. One that I always find pretty fun is Dodger blue. I'm neutral on the Dodgers, but I just think it's funny that they made it into matplotlib. For different scatter plot marker types, you can come here; they show you what string to input for each marker type. And for different line styles, you can come here, and they tell you the different line styles as well.

My next piece of advice is to not just make a plot and stick to the defaults. Sometimes that's OK; other times it's not. Let's go back to that swarm plot example that I told you we'd come back to. If we look at the defaults here, the default horizontal tick labels are using those non-informative 0, 1, and 2 as opposed to the names setosa, versicolor, and virginica. It's also using a horizontal axis label that is not as human readable as we might like it to be.
The vertical axis label is pretty small and hard to read, and the tick labels could also be increased. And we already talked about the colors; maybe we want to change them to be the same color. We can do all of that, and here we've recreated the plot. Now the labels are larger, so they're more readable in terms of size, but the words are also more readable. The tick labels have become more informative about the iris type, and we changed the colors, which again is a personal preference of mine. If you didn't want to change the colors, I guess you don't have to; I just think it's best not to be repetitive with the information you're conveying in this sense. So really the key takeaway for this tip is: don't just make the plot using a default function from seaborn or matplotlib and then keep all of the default values. Make some of these nice little changes. Just making quick changes like adding in labels, changing the x tick labels, and changing the font size can make your plot much better and more easily read by your audience.

The final piece of advice I'm going to give is that you should save large, high-resolution plots. A lot of people, and I've done this before when I just want to do a quick thing, will take a quick screenshot of the image, copy and paste it, and then scale it up so it becomes bigger in whatever presentation software they're using. I would say it's preferable to save a larger image to your computer and then move that image into the software instead. When you take a smaller image and scale it up, it's going to stretch and become blurry, because it was saved at a lower resolution and you're enlarging it, which is going to cause blurriness and distortions when you try to present.
So it's preferable to save a large, high-resolution version of your image, drag that large image into the software, and then scale it down, because taking a high-resolution image and making it smaller will retain the details better than taking a low-resolution image and making it bigger. So how can I do that in matplotlib? I'm going to show you with this image. The first thing you can do is increase the size of the figure with the figsize argument. Notice that before it was 10 by 8; now I've doubled it in both dimensions to make it 20 by 16. It's OK to make it large, because we're just going to scale it down when we put it in our presentation. Another thing that we can do is save the image not as a JPEG or a PNG but as a higher resolution format like a PDF file or an EPS image. How can we do that? When you call plt.savefig, you put in the name of the image, but now as a PDF or EPS, and then you include this extra argument that gives the format you're using. If you're using a PDF format, you put in format="pdf". If you're using an EPS format, you put in format="eps".

I will point out that in my version that I'm about to show, you're not going to see a lot of white space. If you just run this code chunk and then look at the images you've saved, you'll notice that there's a ton of white space on the border. You may want to try out the tight_layout function, which will trim some of that white space so that the plot itself takes up the entire saved file. I'll show you what I mean by showing you the ones that were saved with tight_layout.

OK, I've opened that file, so let me go ahead and show it to you now. Here is the larger image that was saved as a PDF file.
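A minimal sketch of the saving steps described above: a 20 by 16 figure, tight_layout, and plt.savefig with an explicit format. The plot itself is only a jittered scatter standing in for the lecture's swarm plot, drawn from made-up values, and the file names and temporary output directory are assumptions chosen so the example does not write into your working folder. One fact worth knowing is that PDF and EPS are vector formats, which is why plots saved this way stay sharp when rescaled.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-in values (assumption): NOT the real iris measurements
rng = np.random.default_rng(0)
iris_class = np.repeat([0, 1, 2], 30)
sepal_width = rng.normal(3.0, 0.4, size=iris_class.size)
jitter = rng.uniform(-0.15, 0.15, size=iris_class.size)

# Large figure: double the earlier 10 x 8 size in both dimensions
fig, ax = plt.subplots(figsize=(20, 16))
ax.scatter(iris_class + jitter, sepal_width, color="tab:blue", s=120)

# Replace the uninformative 0, 1, 2 tick labels with the iris names
ax.set_xticks([0, 1, 2])
ax.set_xticklabels(["setosa", "versicolor", "virginica"], fontsize=24)
ax.tick_params(axis="y", labelsize=24)
ax.set_xlabel("Iris Type", fontsize=28)
ax.set_ylabel("Sepal Width", fontsize=28)

plt.tight_layout()  # trim excess white space around the plot

# Hypothetical output location and file names
out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, "sepal_width_by_type.pdf")
eps_path = os.path.join(out_dir, "sepal_width_by_type.eps")
plt.savefig(pdf_path, format="pdf")
plt.savefig(eps_path, format="eps")
```

In practice you would replace the temporary directory with a folder next to your slides and then drag the saved file into the presentation software.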
So we can see the version that was saved by running the code that has plt.tight_layout. You can see that the plot goes to the edges of the image. It's nice and large. There are large labels; the tick labels I could increase if I'd like to. But now when I put this in my PowerPoint, I can scale it down and it will retain this nice image quality, as opposed to taking a smaller, lower-resolution image and scaling it up. OK, so let me go back to my Jupyter notebook.

And that's really it. Also, if you want more fine-grained control, say you have a subplot that you'd like to adjust the white space for, you can call subplots_adjust. It's similar to tight_layout but offers you more control over how much white space there is on the left, the right, the top, the bottom, and in between subplots. Here's a link to that documentation.

So these are some good rules of thumb for presentation plots. This is by no means a definitive list of all of the plotting rules, but I think these address some of the most common mistakes that people make. Hopefully this video helped you so that you won't make those mistakes when you're putting together your presentation plots. If you're interested in learning more about data visualization best practices, I think a good starting point is this book by Alberto Cairo, How Charts Lie: Getting Smarter about Visual Information. It's a nice book; I like it quite a bit. You can support a fellow academic researcher by purchasing his book, and if you purchase it at bookshop.org, you'll also be supporting independent bookstores as opposed to Amazon. So you can feel good about yourself for doing that.

OK, so I hope you enjoyed this video. I always enjoy talking to people about data visualization, and I enjoyed talking to you about it, so I hope you enjoyed watching it.
Have a great rest of your day. Bye.