[MUSIC] In many programs, I'm not going to just generate everything directly inside the program. I want to operate on some data that I get from outside the program. Well, how do you do that? Usually you do that by accessing files. So in this lecture we're going to talk about how you read files in Python. Now, I want to point out that CodeSkulptor runs in a web browser, so it cannot read files directly off your hard drive. This would be a security breach [LAUGH] if you could do that from a webpage. So we're going to read files over the network. In Python, this works pretty much the same way: the way you operate on network files and local files is very similar. In fact, many of the functions are identical. So we're going to talk today about reading files over the network in CodeSkulptor. In order to read files, we're going to need two modules in CodeSkulptor. The first is urllib2, and this is a standard Python module that allows you to access files over the network. The second is codeskulptor, which is obviously CodeSkulptor specific, and this is where we've encapsulated the utility functions that you need that are specific to CodeSkulptor. All right, so here's the file that I want to access: examples_files_dracula.txt. [LAUGH] Very ominous, huh? [LAUGH] Very appropriate depending on when you're watching this video. Okay, so I first need to convert that file name into a URL. Now, you can do this manually and figure out where I'm storing my files, or you can use the utility function in the codeskulptor module, file2url. It takes a file name of the form above. So if I give it a file name like that, for a file I'm storing in my online storage, it converts it to a URL as follows. All right. Now I actually want to open it, so I want to get a file object. We're going to call it netfile, just to let you realize that it is a file that's coming over the network.
Okay, and I use urllib2, and there is a function in the urllib2 module called urlopen. What this does is it takes a URL and it opens that file. All right, so let's run this. Well, that was exciting [LAUGH], pretty much nothing happened. That should be expected, I didn't print anything. Let's print netfile and see what happens. Okay. Fantastic. It's an object. And in fact it's a netfile object. That's not very useful. What I really want is the data out of this file, so I'm going to read the file. One of the ways that I can read a file in Python is to call the read method. So, I've got my netfile. I'm going to read it, and let's print out that data. Okay, run. And here we go, we have a paragraph from Dracula. Perhaps you recognize it and perhaps you don't. If you don't, maybe you need to read more. All right. So, that gives me the entire file. That could be useful. If the file is really big, I now have a really big string, basically. So what is this data? It's a string. Let me show you that: print type of data. Hey, it is a string. Now, often, this is not really what I want to do. What I really want to do is read the file line by line. I can do that in Python very easily. So I say for line in netfile.readlines, print line. All right, let's see what happens here! Okay, now I'm printing it out line by line, and you'll notice there are sort of these extra linefeeds. Each line that I get from readlines still has its linefeed on the end, and print adds another one, so I end up with two. A kind of interesting choice that Python makes. Okay, I could actually strip that out in a variety of ways. Let's do it this way, okay? And I'm not going to remind you what that does. I want you to think about it and go back and figure it out if you don't know. If I do that, it looks exactly the same, and I can prove that to you.
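To make the difference between read and readlines concrete, here is a minimal sketch. It is not the code from the lecture. It assumes desktop Python 3 (so print is a function), it uses io.StringIO as a stand-in for the netfile object, and SAMPLE_TEXT is made-up text rather than the Dracula paragraph. Stripping with rstrip is just one of the several ways to drop the trailing linefeed, not necessarily the one used on screen.

```python
import io

# Hypothetical stand-in text; the lecture's file holds a paragraph from Dracula.
SAMPLE_TEXT = "first line\nsecond line\nthird line\n"

def read_whole(fileobj):
    """Return the entire contents as one string, the way read() does."""
    return fileobj.read()

def read_stripped_lines(fileobj):
    """Return the lines from readlines() with the trailing linefeed removed."""
    return [line.rstrip("\n") for line in fileobj.readlines()]

# io.StringIO behaves like an open text file, so it stands in for netfile here.
data = read_whole(io.StringIO(SAMPLE_TEXT))
raw_lines = io.StringIO(SAMPLE_TEXT).readlines()
clean_lines = read_stripped_lines(io.StringIO(SAMPLE_TEXT))

print(type(data))       # read() hands back a single string
for line in clean_lines:
    print(line)         # print supplies the only linefeed now
```

Printing the entries of raw_lines instead of clean_lines would bring back the extra blank lines, because each raw entry still ends in its own linefeed.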
I'm printing the file out line by line by doing that. Okay, I have each line separate. So, between read, which gives me the entire file contents, and readlines, which allows me to look at the file contents line by line, I have two interesting and useful ways of accessing files over the network. Okay, after seeing that, I suspect that there are a bunch of you out there trying to open all kinds of URLs who are confused about why it doesn't work. CodeSkulptor is running inside the web browser, and I haven't restricted in any way the set of files CodeSkulptor can access. However, the web browser restricts that, right? It is a big security hole if the JavaScript that's running in your webpage can go off and access all kinds of files all over the web. So there are restrictions on what can actually be done. You can open any of the files that I give you. The files that are off in my storage are all accessible to CodeSkulptor. And it is also possible to create publicly accessible files in Dropbox, for instance, that you could access as well. But you're not going to be able to access arbitrary webpages using that interface. And this, again, is a limitation of the security policies in the web browser. And maybe you don't want to view it as a limitation, it's actually protecting you [LAUGH]. It is not a particular limitation of Python or CodeSkulptor. All right, so your ability to access files in this way using urllib2 and CodeSkulptor may be limited depending on the quality of your Internet connection. So I want to talk a little bit about how you might do this on the desktop. First of all, you can do this on the desktop almost identically, right? The first thing you have to understand is, where is this URL? I can simply print it. If I do this, it prints out the URL. You can take that, copy and paste it into a web browser, and you'll actually get the data directly.
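For the desktop case, here is a rough sketch of opening a URL outside CodeSkulptor. It is an illustration under assumptions, not the lecture's code: it targets Python 3, where urlopen lives in urllib.request rather than urllib2, and to stay runnable without a network connection it points a file:// URL at a small temporary file instead of the real storage URL. The helper name fetch_text and the sample contents are invented for the example.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlopen  # Python 3 home of urllib2's urlopen

def fetch_text(url):
    """Open a URL and return its contents as text (assumes UTF-8 data)."""
    with urlopen(url) as netfile:
        # In Python 3, read() on this object returns bytes, so decode them.
        return netfile.read().decode("utf-8")

# Stand-in for a remote file: write a small local file and build a
# file:// URL for it, so the example works offline.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("line one\nline two\n")
    tmp_path = tmp.name

url = Path(tmp_path).as_uri()
text = fetch_text(url)
os.remove(tmp_path)  # clean up the temporary stand-in file

print(url)
print(text)
```

Swapping in the URL printed by CodeSkulptor for the file:// URL should work the same way, provided the machine can reach that address.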
Okay, and you could then copy and paste that into a file on your local machine if you wanted. Or you could use this URL directly on the desktop and continue to use urllib2. Just use that URL and you'll end up being able to directly access the file from the desktop. Now, suppose I want to dissociate myself from the internet here and be able to run my program and read from files directly on the desktop. Okay, we can get rid of these things here, comment them out. Again, I would now want to copy and paste this file locally so that I don't have to get it over the network. And then all I really have to do is, and I'll call it netfile just for consistency although it's not a netfile anymore, netfile equals open of the file name, and you want to open it for reading, so you give it the mode. Once you do this, everything below there will just work exactly the same. So, it really is pretty much the same whether I open a file over the network in Python or open a file locally on my hard drive in Python. You might, though, have some troubles here. Now, this is not going to work in CodeSkulptor. Let me show you, right. You're going to get an error. There's no way to just call open in CodeSkulptor. And you might also run into problems if you place this file in the wrong location. Depending on what development environment you're using on your local machine, you have to make sure that the file is placed in a directory that Python can actually access. And there are just too many different development environments out there for me to give you guidance on how to do that. You're going to have to use your favorite search engine and figure out where you should put the file. But otherwise, I want to make the point here again that whether you read files over the network or read them locally, it's pretty much exactly the same.
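Here is a small sketch of the local version, again assuming desktop Python 3 rather than CodeSkulptor. The file name sample.txt, its contents, and the helper collect_lines are hypothetical. The example creates its own file in a temporary directory so that the location problem described above cannot bite, then shows what happens when the file is not where Python looks for it.

```python
import os
import shutil
import tempfile

def collect_lines(fileobj):
    """Return each line without its trailing linefeed; works on any file object."""
    return [line.rstrip("\n") for line in fileobj.readlines()]

# Create a local copy of some data (hypothetical name and contents).
workdir = tempfile.mkdtemp()
filename = os.path.join(workdir, "sample.txt")
with open(filename, "w") as out:
    out.write("alpha\nbeta\n")

# Open for reading by passing the mode "r"; the rest matches the network case.
localfile = open(filename, "r")
lines = collect_lines(localfile)
localfile.close()

# A file in the wrong place: open raises an error instead of returning a file.
missing = os.path.join(workdir, "not_here.txt")
try:
    open(missing, "r")
    found = True
except FileNotFoundError:
    found = False

shutil.rmtree(workdir)  # remove the temporary directory
print(lines)
print(found)
```

If a script reports that a file cannot be found, printing os.getcwd() is one quick way to see which directory Python is treating as the starting point for relative file names.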
So now you don't have to generate all the data from your program as it runs. Instead, we can read in files and access data that way. And we've seen that once we have a file object, we can call the read or readlines methods and access the data as a string. I encourage you to read the documentation to better understand how these methods work. I want you to remember, though, that in CodeSkulptor we're restricted to accessing things from URLs, so we're downloading them over the network. But in Python on the desktop, you can also read files off your hard drive, and the interface is pretty much exactly the same. You just don't call urlopen, right? You're just opening a file with a different interface. But once you have that file, you're still going to use it exactly the same way. Now we can write some more interesting programs, because we don't have to process the exact same data every time. We can open different files each time and do different things with that data. All right?