Python and APIs Video Lecture Transcript

This transcript was automatically generated, so there may be discrepancies between the video and the text.

11:43:29 Hi, everybody! Welcome back. In this video we continue to look at data collection by talking about Python and how it interacts with APIs. 11:43:39 So now that we're in our Jupyter notebook, let's get started.

11:43:42 So we now have an idea of how to scrape HTML code using Beautiful Soup, which is great, but sometimes there are websites or apps that are either impossible to scrape with Beautiful Soup alone, or it's just way too difficult and not worth the time of scraping with 11:44:01 Beautiful Soup. So, for instance, a lot of social media websites (which may be a bad example given recent changes in APIs that we'll talk about a little bit later) 11:44:12 are difficult to scrape with just Beautiful Soup, so we might want to use an API for that.

11:44:18 So in this notebook we'll introduce the idea of what an API is. 11:44:23 We'll discuss Python wrappers for APIs and give an example of a Python wrapper for an API, and then at the end we'll point you to a list of nice packages that provide Python 11:44:34 wrappers for APIs.

So what is an API? 11:44:36 One way to think of an API that I think is useful is as something like a waiter. 11:44:41 So on one end of any restaurant experience there's you, the customer, who would like to order food, and on the other end there's the chef, who has the food and just needs to know what to cook. 11:44:53 So the API serves the purpose of a waiter who will come to you 11:44:57 and figure out what you would like from the chef.
The waiter will then go to the chef, tell the chef what you want in a language that he understands, and then bring back what the chef prepares to your table. And so that's sort of what an API does. There's either a web application or some sort of 11:45:12 phone app or some sort of server that you would like information or data from. You, with code, will tell the API: hey, 11:45:19 I would like this information. So, for instance, I would like all Twitter posts in the past month that mention influenza. Then the API will take that request, 11:45:34 go to Twitter's servers, and bring back the relevant data, 11:45:38 assuming you have access. So another role of the API is to make sure that you are allowed to access the data you're asking for. So that's sort of what's happening: 11:45:46 it takes your request, interprets it in a language that the web application can understand, gets the app's reply, and then brings you the response in whatever language it's speaking to you in. In our instance that would be Python; in other instances 11:46:04 it's web browser languages like HTML and JavaScript, and that sort of thing.

11:46:10 So using APIs is really useful as an alternative to just scraping the raw HTML code. Trying to scrape something like Twitter in the way I described would be very difficult with just HTML, so an API can be useful: it makes your scraping job 11:46:28 a lot easier.

So some of the things that we're going to be using are not actually the API directly, 11:46:35 but instead a Python wrapper for an API. A Python wrapper for an API is just a Python package for interacting with an API. Often APIs are not 11:46:45 written in Python; they're written in some other kind of programming language.
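To make the waiter picture a little more concrete, here is a minimal sketch of the translation step a wrapper performs. Everything in it is hypothetical and only for illustration: the class name ToyWrapper, the address api.example.com, and the parameter names q and days do not come from any real service. It only builds the request URL and never contacts a server.

```python
from urllib.parse import urlencode


class ToyWrapper:
    """Hypothetical wrapper: turns Python arguments into the URL a web API might expect."""

    def __init__(self, base_url):
        self.base_url = base_url

    def build_request_url(self, endpoint, **params):
        # Translate Python keyword arguments into a URL query string.
        return f"{self.base_url}/{endpoint}?{urlencode(params)}"


# The "order" is written in Python; the wrapper rephrases it for the server.
wrapper = ToyWrapper("https://api.example.com")
url = wrapper.build_request_url("search", q="influenza", days=30)
print(url)
```

A real wrapper would go on to send this request, check that you have access, and convert the reply back into Python objects such as lists and dictionaries.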
11:46:48 So with a Python wrapper, someone on their own has gone through and written code that allows you to write in Python, and then in the background it gets converted into whatever language it needs to be in to submit the request to the API. Then it gets back the API's response to that request and translates 11:47:06 it back into Python. A couple of popular ones are Spotipy, which is a Python wrapper for Spotify's APIs; 11:47:15 PRAW, which is the one for Reddit's; 11:47:19 and, for the New York Times, pynytimes. There are more, which we'll touch on later in the notebook.

11:47:26 So as an example, we're going to use the Python wrapper for the New York Times 11:47:30 API to scrape some data. If you'd like to code along and get this to run, you'll need to install that package, which you can do by following the instructions at this link. 11:47:41 This notebook was written and recorded with version 0.8.0 11:47:47 of the API wrapper. If your version is a little bit newer or a little bit older, it should be okay. 11:47:52 But again, if there are any differences, they're probably due to differences in version.

11:47:58 Almost all APIs are going to require you to have something known as a developer key. Developer keys are the way that APIs keep track of what sort of data you have access to and what limits you have on collecting that data. A lot of APIs will limit you to only 11:48:16 collecting so much data within a given time span, or so much data at a given speed. 11:48:21 If you'd like to follow along, 11:48:25 there are instructions pasted here on how to get started and get a New York Times application developer key. After you work through those steps, 11:48:34 there's also a file called my underscore API underscore info dot. 11:48:40 What you'll need to do is go into that file and edit the function get underscore 11:48:47 New York Times underscore key so that it returns 11:48:49 the string of your API key instead of the string 11:48:53 "your key here". I've done that and stored that in a file called Mat underscore API underscore.

11:49:00 So why did I do it this way? Because you never want to share your API key with someone else. 11:49:05 Think of your API key as like your Social Security card or your driver's license: 11:49:11 any form of identification that tells the world it is you. 11:49:14 If you give somebody else your API key and they go do something terrible with it, 11:49:20 you're the one who's seen as having done something terrible, right? 11:49:23 It's sort of like identity theft. 11:49:25 So your API key is unique to you, and you want to make sure you're the only one using it. 11:49:31 Otherwise someone else might use it with malicious intent. With that being said, a common way to do this is to write a function, store it in a Python script, and then import that function. That way 11:49:44 you're the only one seeing it. 11:49:55 You can also configure your laptop or your server to have the key built in as a variable within whatever environment you're in, 11:50:02 and you can just call the environment. For me, what I've found works best is this function approach and then importing the function.

11:50:02 But all that being said, it's time to use the API. 11:50:04 So from pynytimes we're going to import NYTAPI. 11:50:12 This is the API connection, so this is going to be what allows us to connect to the API and then request data. 11:50:21 So now I'm going to import my get New York Times key function 11:50:26 and then store that in a variable called nytapi.
11:50:29 Or sorry, not store that in a variable. The next thing I'm going to do is create that connection and store that connection in a variable. 11:50:39 So I call NYTAPI and pass in my key, which is just what's returned by get New York Times key, 11:50:47 and then I'm going to set the argument parse_dates equal to True. We'll see why I want that in a little bit.

11:50:55 So we're going to use this API to search for articles about basketball that were posted by the New York Times from March 1, 2023 to April 19, 2023. 11:51:07 To do the date part of it, I'm going to need the datetime package, which is how Python handles dates, and we'll see that in a second. 11:51:15 So in order to search, you do nytapi.article_search. 11:51:24 Then you want to put in query, which will be the string that you're searching for. For us 11:51:29 it's the word basketball. And then however many results you want; 11:51:35 I'm just going to choose 30. 11:51:38 And then I want the dates that I want to search between, 11:51:41 so dates, and the input there is a dictionary with a start date, which is going to be March 1, 2023, and an end date, which I'm going to make April 19.

11:51:59 Okay. So it took a little bit because it had to send the request and then receive the response. 11:52:05 And what we can see returned is a dictionary of objects. 11:52:09 So here we have this dictionary, or rather a list, whose first entry is itself a dictionary, and it's printing those out. Apparently there were a lot of articles about basketball in that time 11:52:24 range. Okay, and then we can look at the 11:52:29 first article, like I said. So we could do the title, 11:52:37 maybe no title. We could do the abstract. I'm just trying to see what is available to us. 11:52:45 What else is in here? If we want to see what else is in this dictionary, 11:52:48 we could just do results at 0, dot keys.
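The search just described can be sketched roughly as follows. This is not the notebook's actual code. The dates keys "begin" and "end" follow my recollection of the pynytimes README and may differ between versions, and the names search_basketball and StandInClient are made up here so that the sketch runs without a key or a network connection.

```python
import datetime


def search_basketball(client, n_results=30):
    """Ask an article-search client for basketball articles in the lecture's date range."""
    return client.article_search(
        query="basketball",
        results=n_results,
        dates={
            "begin": datetime.datetime(2023, 3, 1),   # assumed key name
            "end": datetime.datetime(2023, 4, 19),    # assumed key name
        },
    )


class StandInClient:
    """Hypothetical stand-in for the real connection; returns one canned article."""

    def article_search(self, **kwargs):
        self.last_request = kwargs
        return [{"abstract": "A made-up abstract.", "print_page": "1"}]


# With the real package and a key, the client would instead be created as
#   from pynytimes import NYTAPI
#   client = NYTAPI("your key here", parse_dates=True)
client = StandInClient()
results = search_basketball(client)

# The result is a list of dictionaries, so inspect the first one's keys.
print(results[0].keys())
```

Passing the client in as an argument is only a convenience for this sketch: it lets the same function be tried against the stand-in first and the real connection later.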
11:52:57 Among the keys we see print_section, print_page, and so forth.

Okay, so as an exercise 11:53:01 to get some practice, go to the documentation page here and search for the book reviews endpoint. Here we used the article search endpoint of the API. Using the book reviews endpoint, try to write the code to find all book reviews posted for books by Maya Angelou. 11:53:21 So see if you can do that, and when you're ready you can 11:53:25 come back to watch me do it. Okay, so we're going to do results is equal to nytapi 11:53:32 dot book_reviews, and then author is equal to Maya Angelou. 11:53:44 And then, back in the code, I will look at the results. 11:53:48 Okay. So we've got these 3 book reviews of Maya 11:53:53 Angelou.

11:53:55 So there's a list of API wrappers that you can find here. 11:54:00 I don't know if it's comprehensive, but it's a nice list to get started with. 11:54:05 I will also make a note that, due, I think, in part to the proliferation of large language 11:54:12 models like ChatGPT, which have used API wrappers and scraping to collect a lot of training data, 11:54:21 a lot of these APIs are now starting to be put behind 11:54:24 sort of a paywall. The Twitter API used to have a free level. 11:54:29 Reddit has recently announced that it's going to start charging people for its API. 11:54:33 So be mindful: using these APIs may cost money. 11:54:38 Not everything is going to be free data for you. If you use these APIs, some of them are going to start to cost money soon, if not already.

11:54:46 All right. So you now know how to use a Python wrapper for APIs. 11:54:51 You're aware of what they are, and you have a resource that shows you a list of them. 11:54:54 So collecting data using an API might be worth looking into for your data science boot camp project, 11:55:00 now that you're familiar at least with the general process of doing it. All right.
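As a recap of the general process, here is a rough sketch that combines the two remaining pieces: keeping the key out of the notebook and calling the book reviews endpoint. It uses the environment-variable option mentioned earlier rather than the imported-function approach. The variable name NYT_API_KEY and the helpers get_api_key and find_reviews are assumptions made for this sketch, not names from the notebook.

```python
import os


def get_api_key(env_var="NYT_API_KEY"):
    """Hypothetical helper: read the key from an environment variable, not from the code."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key


def find_reviews(client, author):
    """Ask a book-reviews client for every review of books by the given author."""
    return client.book_reviews(author=author)


# With pynytimes installed and the variable set, usage would look like:
#   from pynytimes import NYTAPI
#   nyt = NYTAPI(get_api_key(), parse_dates=True)
#   reviews = find_reviews(nyt, "Maya Angelou")
#   print(len(reviews))
```

Either way of storing the key serves the same purpose: the key itself never appears in a notebook you might share.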
11:55:07 I hope you enjoyed learning about Python and APIs.