Manipulating files is an essential aspect of scripting in Python, and luckily for us, the process isn’t complicated. The built-in
open function is the preferred method for reading files of any type, and probably all you’ll ever need to use. Let’s first demonstrate how to use this method on a simple text file.
For clarity, let’s first write the contents of our text file in a standard text editor (MS Notepad in this example). When opened in the editor it will look like this (note the empty trailing line):
To open our file with Python, we first have to know the path to the file. In this example the file path will be relative to your current working directory, so we won’t need to type the full path into the interpreter.
>>> tf = 'textfile.txt'
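As a small sketch of what "relative to the current working directory" means, the snippet below uses the standard os module to show the full path that Python would resolve for our file name. The names cwd and full_path are only illustrative and do not appear elsewhere in this tutorial.

```python
import os

tf = 'textfile.txt'

# A relative path is resolved against the current working directory.
cwd = os.getcwd()
full_path = os.path.abspath(tf)

print(cwd)        # the directory Python is currently working in
print(full_path)  # that directory with 'textfile.txt' appended
```

If the two don't line up with where your file actually lives, either change directory before starting the interpreter or pass the full path to open instead.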
Open a File in Python
Using this variable as the first argument of the
open function, we’ll have our file saved as an object.
>>> f = open(tf)
Read a File with Python
When we reference our file-object
f, Python tells us the status (open or closed), the name, and the mode, as well as some info we don’t need (about the memory it’s using on our machine).
We already knew the name, and we haven’t closed it so we know it’s open, but the mode deserves special attention. Our file
f is in mode r for read. Specifically, this means we can only read data from the file, not edit or write new data to the file (it’s also in
t mode for
text, though it doesn’t say this explicitly; it’s the default mode, as is r). Let’s read our text from the file with the read method:
>>> f.read()
'First line of our text.\nSecond line of our text.\n3rd line, one line is trailing.\n'
This doesn’t exactly look like what we typed into the notepad, but it’s how Python reads the raw text data. To get the text as we typed it (without the
\n newline characters), we can print it:
>>> print(_)
First line of our text.
Second line of our text.
3rd line, one line is trailing.
Note how we used the
_ character in the Python IDLE to reference the most recent output instead of using the
read method again. Here’s what happens if we try to call read a second time:
>>> f.read()
''
This happens because read returned the full contents of the file, and the invisible position marker (how Python keeps track of your position in the file) is at the end of the file; there’s nothing left to read.
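To see this behaviour outside the interactive session, here is a minimal self-contained sketch. It assumes a throwaway copy of the example file created in a temporary folder; the names folder, path, first and second are illustrative only.

```python
import os
import tempfile

# Build a throwaway copy of the example file so the sketch is self-contained.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'textfile.txt')
with open(path, 'w') as f:
    f.write('First line of our text.\n'
            'Second line of our text.\n'
            '3rd line, one line is trailing.\n')

f = open(path)
first = f.read()    # the whole file; the position is now at the end
second = f.read()   # nothing left to read, so this is an empty string
f.close()

print(repr(first))   # the raw string, with the \n characters visible
print(first)         # the text as it was typed
print(repr(second))  # an empty string
```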
Partial Reading of Files in Python
Note: You can use an integer argument with
read if you don’t want the full contents of the file; Python will then read at most that many characters (or bytes, if the file is open in binary mode).
To get back to the start of the file (or anywhere else in the file), use the
seek(int) method on
f. By going back to the start you can read the contents from the beginning again with read:
>>> f.seek(0)
>>> # We only read a small chunk of the file, 10 bytes
>>> print(f.read(10))
First line
Also, to find the current position in the file, use the
tell method on
f like so:
>>> f.tell()
10
If you don’t know the size of your file or how much of it you want, you might not find that useful.
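One situation where a sized read does help is working through a file in fixed-size pieces. The following is a rough sketch under the assumption that a chunk size of 10 is acceptable; the file is a temporary stand-in for our example, and the names chunks and start_again are made up for illustration.

```python
import os
import tempfile

# Temporary stand-in for the example file.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'textfile.txt')
text = 'First line of our text.\nSecond line of our text.\n'
with open(path, 'w') as f:
    f.write(text)

chunks = []
with open(path) as f:
    while True:
        chunk = f.read(10)   # at most 10 characters per call
        if not chunk:        # an empty string means end of file
            break
        chunks.append(chunk)

    f.seek(0)                # jump back to the start of the file
    start_again = f.read(10)

print(chunks[0])
print(start_again)
```

The last chunk is usually shorter than the requested size, which is why the loop tests for an empty string rather than for a chunk of a particular length.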
Reading Files Line by Line in Python
What is useful, however, is reading the contents of the file line-by-line. One way we can do this is with the
readline and readlines methods: the first reads one line at a time, the second returns a list of every line in the file. Both have an optional integer argument to indicate how much of the file (how many bytes) to read:
>>> # Make sure we're at the start of the file
>>> f.seek(0)
>>> f.readlines()
['First line of our text.\n', 'Second line of our text.\n', '3rd line, one line is trailing.\n']
>>> # readlines consumed the whole file, so go back to the start again
>>> f.seek(0)
>>> f.readline()
'First line of our text.\n'
>>> f.readline(20)
'Second line of our t'
>>> # Note if int is too large it just reads to the end of the line
>>> f.readline(20)
'ext.\n'
Another option for reading a file line-by-line is treating it as a sequence and looping through it, like so:
>>> f.seek(0)
>>> # Each line keeps its own \n, so print leaves a blank line after it
>>> for line in f:
...     print(line)
...
First line of our text.

Second line of our text.

3rd line, one line is trailing.
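Because every line read this way still ends with its newline character, a common variation is to strip it while looping. Here is a short sketch, again using a temporary stand-in for the example file; enumerate is used only to number the lines, and the list name stripped is illustrative.

```python
import os
import tempfile

# Temporary stand-in for the example file.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'textfile.txt')
with open(path, 'w') as f:
    f.write('First line of our text.\n'
            'Second line of our text.\n'
            '3rd line, one line is trailing.\n')

stripped = []
with open(path) as f:
    for number, line in enumerate(f, start=1):
        text = line.rstrip('\n')   # drop the newline each line carries
        stripped.append(text)
        print(number, text)
```

Looping over the file object reads one line at a time, so this approach also suits files too large to load with read or readlines.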
Python File Writing Modes
That covers the basic reading methods for files. Before looking at writing methods, we’ll briefly examine the other modes of file-objects returned with open.
We already know mode r, but there are also the w and a modes (which stand for write and append, respectively). In addition to these there are the options + and b. The + option added to a mode opens the file for updating, in other words for both reading and writing.
With this option it might seem like there’s no difference between an r+ mode and w+ mode, but there’s a very important difference between the two: in w mode, the file is automatically truncated, meaning its entire contents are erased. Even in w+ mode the file is emptied as soon as it’s opened, so be careful. Alternatively, you can truncate the open file yourself with the truncate method.
If you want to write to the end of the file, just use append mode (with + if you also want to read from it).
The b option indicates to open the file as a binary file (instead of the text mode default). Use this whenever you have data in the file that is not regular text (e.g. when opening an image file).
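The differences between these modes are easiest to see side by side. The sketch below works on a scratch file in a temporary folder (the name modes.txt and the variable names are hypothetical) and records what each mode finds when it opens the file.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, 'modes.txt')

with open(path, 'w') as f:
    f.write('original\n')

# r+ : read and write, the existing contents are kept
with open(path, 'r+') as f:
    kept = f.read()

# a+ : writes go to the end of the file; seek back to read everything
with open(path, 'a+') as f:
    f.write('appended\n')
    f.seek(0)
    after_append = f.read()

# w+ : the file is truncated the moment it is opened
with open(path, 'w+') as f:
    after_truncate = f.read()

# b : data is written and read as bytes rather than text
with open(path, 'wb') as f:
    f.write(b'\x00\x01\x02')
with open(path, 'rb') as f:
    raw = f.read()

print(repr(kept), repr(after_append), repr(after_truncate), raw)
```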
Now let’s look at writing to our file. We’ll use a+ mode so we don’t erase what we have. First let’s close our file
f and open a new one, f2:
>>> # It's important to close the file to release its resources
>>> f.close()
>>> f2 = open(tf, 'a+')
We can see that our
f file is now closed, meaning it no longer holds on to system resources, and we can’t read from or write to it.
Note: If you don’t want to have to call
close explicitly on the file, you can use a
with statement to open the file. The
with statement will close the file automatically:
>>> # f remains open only within the 'with' block
>>> with open(tf) as f:
...     print(f.read())
...
First line of our text.
Second line of our text.
3rd line, one line is trailing.
>>> # This attribute tells us if the file is closed
>>> f.closed
True
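A further benefit of the with statement is that the file is closed even when something goes wrong inside the block. The sketch below illustrates this with a deliberately raised ValueError, which stands in for any failure during processing; the temporary file and the names first and read_failed are assumptions made for the example.

```python
import os
import tempfile

# Temporary stand-in for the example file.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'textfile.txt')
with open(path, 'w') as f:
    f.write('First line of our text.\n')

try:
    with open(path) as f:
        first = f.readline()
        # Simulated failure while working with the file
        raise ValueError('something went wrong while processing')
except ValueError:
    pass

print(f.closed)  # the file was closed despite the exception

# A closed file refuses further reads
try:
    f.read()
    read_failed = False
except ValueError:
    read_failed = True

print(read_failed)
```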
With f2, let’s write to the end of the file. We’re already in append mode so we can just call write:
f2.write('Our 4th line, with write()\n')
Writing Multiple Lines to a File in Python
With this we’ve written to our file, and we can also write multiple lines with the
writelines method, which will write a sequence (e.g., a list) of strings to the file as lines:
f2.writelines(['And a fifth', 'And also a sixth.'])
f2.close()
Note: The name
writelines is a misnomer, as it does not write newline characters to the end of each string in the sequence automatically, as we’ll see.
Ok, now we’ve written our text and we’ve closed
f2 so the changes we’ve made should be seen in the file when we open it in our text editor:
We can see the
writelines method didn’t separate our fifth and sixth lines for us, so keep that in mind.
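One way to get each string onto its own line is to add the newline yourself before handing the strings to writelines. This is a minimal sketch rather than the only approach; it uses a temporary file that already holds our fourth line, and the names lines and contents are illustrative.

```python
import os
import tempfile

# Temporary file that already contains the line written with write()
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'textfile.txt')
with open(path, 'w') as f:
    f.write('Our 4th line, with write()\n')

lines = ['And a fifth', 'And also a sixth.']

with open(path, 'a') as f:
    # writelines accepts any iterable of strings, so append '\n' to each one
    f.writelines(line + '\n' for line in lines)

with open(path) as f:
    contents = f.read()

print(contents)
```

Joining the strings first and calling write once, as in f.write('\n'.join(lines) + '\n'), produces the same result.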
Now that you have a good starting point, get scripting and discover what you can do when reading and writing files in Python — and don’t forget to utilize all of the extensive formatting methods Python has for strings!