
Python Succinctly®
by Jason Cannon


CHAPTER 8

File I/O


In previous chapters you learned how to use the built-in input() function to accept standard input from the keyboard, and how to send data to standard output (the screen) using the print() function. Standard input and output work well for certain types of applications, but you will often need a place to store the data your program generates, and a way to retrieve that data later. One of the most common places to store data is a file. You can read input from a file and write output to a file, just as you can read input from a keyboard and display output on a screen.

To open a file, use the built-in open() function. The pattern for this is open(path_to_file). The path_to_file can be either an absolute or a relative path, and it includes the file name. An absolute path contains the entire path beginning at the root of the file system, be that / on Mac or Linux, or a drive letter on Windows. Examples of absolute paths are /var/log/messages and C:\Log\Messages\data.txt. A relative path, however, comprises just the file name or a portion of the path, and is resolved starting at the current working directory. An example of a relative path is log/messages. If the current working directory is /var, it refers to /var/log/messages.

Forward slashes as a directory separator will be familiar to most of us, even those who have never worked on a Unix or Unix-like operating system. Windows uses backslashes as its directory separator, but Python accepts forward slashes even when running on Windows. For instance, C:/Users/david/Documents/python-notes.txt is a valid absolute path within Python. Also, Documents/python-notes.txt is a valid relative path.
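To make the distinction between absolute and relative paths concrete, here is a minimal sketch using the standard posixpath and ntpath modules, which apply Unix and Windows path rules regardless of the system the code runs on. The paths are the examples from the text; the variable names are only for illustration.

```python
import ntpath      # Windows path rules, usable on any platform
import posixpath   # Unix-style path rules, usable on any platform

# Absolute paths start at the root of the file system.
unix_absolute = posixpath.isabs('/var/log/messages')
windows_absolute = ntpath.isabs('C:\\Log\\Messages\\data.txt')

# A relative path is resolved against the current working directory.
relative = posixpath.isabs('log/messages')
resolved = posixpath.join('/var', 'log/messages')

# Forward slashes in a Windows path are treated as directory separators.
normalized = ntpath.normpath('C:/Users/david/Documents/python-notes.txt')

print(unix_absolute, windows_absolute, relative)
print(resolved)
print(normalized)
```

In everyday code you would normally use os.path, which picks the right set of rules for the system the program is running on.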

The open() function will return a file object, which is sometimes referred to as a stream object. This can be used to perform operations on the file passed to the open() function. To read the entire file in at once, use the read() method on the file object. The read() method returns a string comprising the file's contents. The following code listing is an example.

Code Listing 229

hosts = open('/etc/hosts')

hosts_file_contents = hosts.read()

print(hosts_file_contents)

Output:

Code Listing 230

127.0.0.1 localhost

In order to modify the previous example to work on a Windows system, set the hosts variable to C:/Windows/System32/drivers/etc/hosts.

Code Listing 231

hosts = open('C:/Windows/System32/drivers/etc/hosts')

File Position

Whenever a file is read, Python keeps track of your current position within that file. After read() returns the entire file, the current position is at the end of the file. If you call read() again, an empty string is returned, since there is no more data at your current position. To change the current file position, use the seek() method and pass in a byte offset. Offsets count from zero: seek(0) goes back to the beginning of the file, and seek(5) skips the first five bytes so that the next read starts at the sixth byte. In many cases the Nth byte corresponds to the Nth character in the file, but not always. In UTF-8 encoded files, any character outside the ASCII range takes more than one byte. You will encounter this with Japanese, Korean, or Chinese text, for example. To determine your current position in the file, use the tell() method.

Code Listing 232

hosts = open('/etc/hosts')

print('Current position: {}'.format(hosts.tell()))

print(hosts.read())

print('Current position: {}'.format(hosts.tell()))

print(hosts.read())

hosts.seek(0)

print('Current position: {}'.format(hosts.tell()))

print(hosts.read())

Output:

Code Listing 233

Current position: 0

127.0.0.1 localhost

Current position: 20

Current position: 0

127.0.0.1 localhost

The read() method accepts an optional size: the number of characters to read when the file is opened in text mode, or the number of bytes when it is opened in binary mode. The following example demonstrates reading the first three characters of the hosts file. In this case, the first three characters correspond with the first three bytes.

Code Listing 234

hosts = open('/etc/hosts')

print(hosts.read(3))

print(hosts.tell())

Output:

Code Listing 235

127

3
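The earlier note that the Nth byte is not always the Nth character can be checked directly. The following is a minimal sketch, assuming a UTF-8 file created just for the demonstration; the sample text and variable names are hypothetical.

```python
import os
import tempfile

# Hypothetical sample text: two Japanese characters followed by ASCII.
text = '日本 ok'

# Write the text to a temporary file using UTF-8.
with tempfile.NamedTemporaryFile('w', encoding='utf-8', suffix='.txt',
                                 delete=False) as tmp:
    tmp.write(text)
    path = tmp.name

# Text mode: read(2) returns two characters.
with open(path, encoding='utf-8') as f:
    first_two_chars = f.read(2)

# Binary mode: read(2) returns two bytes, less than one full character here.
with open(path, 'rb') as f:
    first_two_bytes = f.read(2)
    f.seek(0)
    total_bytes = len(f.read())

os.remove(path)

print(len(text), total_bytes)
print(first_two_chars)
print(first_two_bytes)
```

The string has five characters but occupies nine bytes on disk, because each of the two Japanese characters is encoded as three bytes in UTF-8.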

Closing a File

It is good practice to close a file once you are finished with it. If your Python application opens too many files during its execution, you could run into a "too many open files" error. To close a file, use the close() method on the file object.

Code Listing 236

hosts = open('/etc/hosts')

hosts_file_contents = hosts.read()

print(hosts_file_contents)

hosts.close()

Output:

Code Listing 237

127.0.0.1 localhost

Note that each file object has a closed attribute that returns True if the file is closed and False if it is not. You can make use of this attribute to ensure that a file is indeed closed.

Code Listing 238

hosts = open('/etc/hosts')

hosts_file_contents = hosts.read()

print('File closed? {}'.format(hosts.closed))

if not hosts.closed:

    hosts.close()

print('File closed? {}'.format(hosts.closed))

Output:

Code Listing 239

File closed? False

File closed? True

Automatically Closing a File

To automatically close a file, use the with statement. The pattern is with open(file_path) as file_object_variable_name: followed directly by a code block. Whenever the code block finishes, Python automatically closes the file. The file is also closed if the code block is interrupted for any reason, including an exception.

Code Listing 240

print('Started reading the file.')

with open('/etc/hosts') as hosts:

    print('File closed? {}'.format(hosts.closed))

    print(hosts.read())

print('Finished reading the file.')

print('File closed? {}'.format(hosts.closed))

Output:

Code Listing 241

Started reading the file.

File closed? False

127.0.0.1 localhost

Finished reading the file.

File closed? True
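The claim that the file is closed even when the block is interrupted can be verified with a small experiment. This is a sketch only: the temporary file and the deliberately raised ValueError are stand-ins invented for the demonstration.

```python
import os
import tempfile

# Create a small temporary file to read from (hypothetical sample data).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('127.0.0.1 localhost\n')
    path = tmp.name

try:
    with open(path) as hosts:
        hosts.read()
        # Simulate a failure in the middle of the block.
        raise ValueError('something went wrong')
except ValueError:
    pass

# The with statement closed the file before the exception left the block.
closed_after_error = hosts.closed
print('File closed? {}'.format(closed_after_error))

os.remove(path)
```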

Reading a File One Line at a Time

To read a file one line at a time, use a for loop. The pattern is for line_variable in file_object_variable: directly followed by a code block.

Code Listing 242

with open('file.txt') as the_file:

    for line in the_file:

        print(line)

Output:

Code Listing 243

This is the first line of the file.

Here is the second line.

Finally!  This is the third and last line!

The contents of file.txt:

Code Listing 244

This is the first line of the file.

Here is the second line.

Finally!  This is the third and last line!

The output contains a blank line after each line of the file. This is because the line variable holds the complete line from the file, including the new line character at its end, and print() then adds a new line of its own. To remove any trailing white space, including new line and carriage return characters, use the rstrip() string method.

Code Listing 245

with open('file.txt') as the_file:

    for line in the_file:

        print(line.rstrip())

Output:

Code Listing 246

This is the first line of the file.

Here is the second line.

Finally!  This is the third and last line!
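To see the line ending that each line carries, it can help to collect the lines in a list and print the list, which shows the \n explicitly. A minimal sketch, using a temporary file with made-up contents in place of file.txt:

```python
import os
import tempfile

# Create a temporary stand-in for file.txt (hypothetical sample data).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n')
    path = tmp.name

raw_lines = []
clean_lines = []
with open(path) as the_file:
    for line in the_file:
        raw_lines.append(line)             # still ends with '\n'
        clean_lines.append(line.rstrip())  # trailing white space removed

os.remove(path)

print(raw_lines)
print(clean_lines)
```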

File Modes

Whenever you open a file you have the option of specifying a mode. The pattern is open(path_to_file, mode). So far in this book we have been relying on the default file mode of r, which opens a file in read-only mode. If you want to write to a file, clearing any of its current contents, use the w mode. If you want to create a new file and write to it, use the x mode; if the file already exists, an exception is raised. Using the x mode prevents you from accidentally overwriting existing files. If you want to keep the contents of an existing file and append additional data to it, use the a mode. With both the w and a modes, if the file does not already exist, it will be created. If you want to both read from and write to the same file, add + to the mode, as in r+ or w+.

Table 4: File Modes

Mode    Description
r       Open for reading (default).
w       Open for writing, truncating the file first.
x       Create a new file and open it for writing.
a       Open for writing, appending to the end of the file if it exists.
b       Binary mode.
t       Text mode (default).
+       Open a disk file for updating (reading and writing).

Keep in mind that you can also specify if the file you are working with is a text file or a binary file. By default, all files are opened as text files unless you directly specify otherwise. Simply append a t or b to one of the read or write modes. For example, to open a file for reading in binary mode, use rb. To append to a binary file, use ab.

Also keep in mind that while text files contain strings, binary files contain a series of bytes. Put simply, text files are readable by humans, while binary files are not. Examples of binary files include images, videos, and compressed files.

To look into the current mode of a file, examine the mode attribute on a file object.

Code Listing 247

with open('file.txt') as the_file:

    print(the_file.mode)

Output:

Code Listing 248

r
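The differences between the x, a, w, and r+ modes are easiest to see side by side. The following sketch works in a temporary directory so that no real files are touched; the file name modes.txt and the text written to it are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'modes.txt')  # hypothetical file name

    # x creates the file; it must not exist yet.
    with open(path, 'x') as f:
        f.write('first\n')

    # A second x on the same path fails instead of overwriting.
    try:
        open(path, 'x')
        x_failed = False
    except FileExistsError:
        x_failed = True

    # a keeps the existing contents and adds to the end.
    with open(path, 'a') as f:
        f.write('second\n')
    with open(path) as f:
        after_append = f.read()

    # w truncates the file before writing.
    with open(path, 'w') as f:
        f.write('replaced\n')
    with open(path) as f:
        after_write = f.read()

    # r+ opens an existing file for both reading and writing.
    with open(path, 'r+') as f:
        existing = f.read()   # position is now at the end of the file
        f.write('added\n')
    with open(path) as f:
        after_update = f.read()

print(x_failed)
print(repr(after_append))
print(repr(after_write))
print(repr(after_update))
```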

Writing to a File

Now that you are familiar with the different file modes, let's try writing some data to a file. This is as simple as calling the write() method on the file object and then supplying the text you wish to write to that file.

Code Listing 249

with open('file2.txt', 'w') as the_file:

    the_file.write('This text will be written to the file.')

    the_file.write('Here is some more text.')

with open('file2.txt') as the_file:

    print(the_file.read())

Output:

Code Listing 250

This text will be written to the file.Here is some more text.

However, the output may not be exactly what you expected. The write() method writes exactly what you supply and nothing more. In the previous example no carriage return or line feed was provided, so all the text ended up on the same line. The \r sequence represents the carriage return character and \n represents a new line. Let's work through the example again, this time ending each line with a new line character.

Code Listing 251

with open('file2.txt', 'w') as the_file:

    the_file.write('This text will be written to the file.\n')

    the_file.write('Here is some more text.\n')

with open('file2.txt') as the_file:

    print(the_file.read())

Output:

Code Listing 252

This text will be written to the file.

Here is some more text.

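Unix-style line endings consist of the \n character alone. Mac and Linux files use this type of line ending. Windows-style line endings consist of \r\n. Note that in text mode Python by default translates each \n you write into the platform's native line ending, so writing \n is normally all you need.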
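Because write() adds no line ending of its own, a common pattern when saving a list of items is to append \n to each one as it is written. The following is a minimal sketch under that assumption; the entries list is made-up data, and the file is created in a temporary directory.

```python
import os
import tempfile

entries = ['alpha', 'beta', 'gamma']  # hypothetical data to store

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'file2.txt')

    with open(path, 'w') as the_file:
        for entry in entries:
            # write() adds nothing on its own, so append the new line here.
            the_file.write(entry + '\n')

    with open(path) as the_file:
        contents = the_file.read()

print(contents.splitlines())
```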

Binary Files

The key thing to remember when dealing with binary files is that you are working with bytes, not characters. When a file is opened in binary mode, the argument to read() is the number of bytes to read, and the data comes back as bytes rather than as a string. When a file is opened as a text file, the argument to read() is the number of characters.

Code Listing 253

with open('pig.jpg', 'rb') as pig_picture:

    pig_picture.seek(2)

    pig_picture.read(4)

    print(pig_picture.tell())

    print(pig_picture.mode)

Output:

Code Listing 254

6

rb
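The same seek, read, and tell sequence can be tried without an image file. This sketch writes ten known byte values to a temporary file (an assumption made purely for the demonstration) so that the bytes returned by read() are easy to recognize.

```python
import os
import tempfile

# Write ten known bytes (values 0 through 9) to a temporary file.
with tempfile.NamedTemporaryFile('wb', delete=False) as tmp:
    tmp.write(bytes(range(10)))
    path = tmp.name

with open(path, 'rb') as binary_file:
    binary_file.seek(2)            # skip the first two bytes
    chunk = binary_file.read(4)    # read the next four bytes
    position = binary_file.tell()  # 2 skipped + 4 read

os.remove(path)

print(type(chunk))
print(list(chunk))
print(position)
```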

Exceptions

Whenever you work with anything that exists outside of your program, you greatly increase the chance of errors and exceptions. Working with files falls squarely into this category. For example, a file you are attempting to write to may be read-only, or a file you are attempting to read from may not be available. In a previous chapter we briefly examined the try/except block. The following example shows how it can be put to use.

Code Listing 255

# Open a file and assign its contents to a variable.

# If the file cannot be read, use an empty string instead.

try:

    with open('contacts.txt') as contacts_file:

        contacts = contacts_file.read()

except OSError:

    contacts = ''



print(len(contacts))

Output:

Code Listing 256

3

If the file could not be read, the output would be:

Code Listing 257

0
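When different failures call for different responses, the built-in FileNotFoundError and PermissionError exceptions can be caught separately. The sketch below is one possible arrangement, not the only one; read_contacts is a hypothetical helper, and the missing file is simulated with a path inside an empty temporary directory.

```python
import os
import tempfile

def read_contacts(path):
    """Return the file's contents, or an empty string if it cannot be read."""
    try:
        with open(path) as contacts_file:
            return contacts_file.read()
    except FileNotFoundError:
        print('{} does not exist.'.format(path))
    except PermissionError:
        print('No permission to read {}.'.format(path))
    return ''

# The temporary directory is empty, so contacts.txt is guaranteed to be missing.
with tempfile.TemporaryDirectory() as folder:
    missing = read_contacts(os.path.join(folder, 'contacts.txt'))

print(len(missing))
```

Both exceptions are subclasses of OSError, so a single except OSError clause would also catch them when the distinction does not matter.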

Review

Use the built-in open() function to open a file. The pattern is open(path_to_file, mode).

If mode is not supplied when opening a file it will default to read-only.

Forward slashes can be used as directory separators, even when you are using Windows.

Using the read() file object method will return the entire contents of the file as a string.

Use the close() file object method to close a file.

Use the with statement to automatically close a file. The pattern is with open(file_path) as file_object_variable_name: directly followed by a code block.

Use a for loop to read a file one line at a time. The pattern is for line_variable in file_object_variable:.

Use the rstrip() string method to remove any trailing white space.

Write data to a file using the write() file object method.

The read() file object method accepts the number of bytes to read when a file is opened in binary mode. When a file is opened in text mode, which is the default, read() will accept characters.

In the majority of cases a character will be one byte in length, but keep in mind that this does not hold true in every situation.

Always plan for exceptions when you are working with files. Use try/except blocks.

Exercises

Line Numbers

Try creating a program that opens file.txt. Read each line of the file and then prepend it with a line number.

Sample output:

Code Listing 258

1: This is the first line of the file.

2: Here is the second line.

3: Finally!  This is the third and last line!

Solution

Code Listing 259

with open('file.txt') as file:

    line_number = 1

    for line in file:

        print('{}: {}'.format(line_number, line.rstrip()))

        line_number += 1
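As an alternative to keeping the counter by hand, the built-in enumerate() function can supply the line number. This is a sketch of that variation rather than a replacement for the solution above; it uses a temporary file with two sample lines in place of file.txt.

```python
import os
import tempfile

# Create a temporary stand-in for file.txt (sample data for the demonstration).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('This is the first line of the file.\nHere is the second line.\n')
    path = tmp.name

numbered = []
with open(path) as the_file:
    # start=1 makes the numbering begin at 1 instead of 0.
    for line_number, line in enumerate(the_file, start=1):
        numbered.append('{}: {}'.format(line_number, line.rstrip()))

os.remove(path)

for entry in numbered:
    print(entry)
```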

Alphabetize

Try reading the contents of animals.txt and from there create a file named animals-sorted.txt that is sorted alphabetically.

The contents of animals.txt:

Code Listing 260

toad

lion

seal

fox

owl

whale

elk

Once the program has been executed the contents of animals-sorted.txt should be:

Code Listing 261

elk

fox

lion

owl

seal

toad

whale

Solution

Code Listing 262

unsorted_file_name = 'animals.txt'

sorted_file_name = 'animals-sorted.txt'

animals = []

try:

    with open(unsorted_file_name) as animals_file:

        for line in animals_file:

            # Strip the line ending so a missing final new line cannot merge entries.

            animals.append(line.rstrip())

    animals.sort()

except OSError:

    print('Could not open {}.'.format(unsorted_file_name))

try:

    with open(sorted_file_name, 'w') as animals_sorted_file:

        for animal in animals:

            animals_sorted_file.write(animal + '\n')

except OSError:

    print('Could not open {}.'.format(sorted_file_name))

Resources

Core tools for working with streams: https://docs.python.org/3/library/io.html

Handling Exceptions: https://wiki.python.org/moin/HandlingExceptions

open() documentation: https://docs.python.org/3/library/functions.html#open
