9.1. Readings: files

9.1.1. Why are files useful?

Until now, all the data we have dealt with were created on the fly and resided in memory only while the program was running. If we close the browser window and reopen it to run the program again, the data are recreated from scratch on the computer that runs the code. If creating the data takes a long time, repeating that work on every run is inefficient. Instead, we can create the data once, save them in a file, and simply read them from the file whenever we need them.

Moreover, we usually write programs to perform specific operations on potentially any input provided, e.g., so that we can perform the same operation on many different inputs (as opposed to data that are hardcoded into our program). We also usually need to share and store our data. For these reasons, files are used to conveniently store data in portable formats that can be shared and/or input to different programs.

From Python’s perspective, a file is nothing more than a sequence of elements, and the type of the elements depends on how the file is opened. In a text file the elements are characters, while in images, videos, etc. (files opened in binary mode) the elements are bytes. Like all sequences, files are indexed starting from 0. When a read reaches the end of the file, Python signals this by returning an empty result, which is how the reading methods know that no content is left.
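To make the "sequence of elements" idea concrete, here is a minimal sketch. It assumes a scratch file called demo.txt inside a temporary folder (both hypothetical, chosen so that nothing on your disk is touched) and uses the binary read mode 'rb', which is not covered further in this section.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary folder that is deleted afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.txt')

    with open(demo_path, mode='w', encoding='utf-8') as file_object:
        file_object.write('Food')

    # Text mode: the elements of the sequence are characters.
    with open(demo_path, mode='r', encoding='utf-8') as file_object:
        first_char = file_object.read(1)   # the element at index 0
        rest = file_object.read()          # everything that is left
        at_end = file_object.read()        # nothing is left to read

    # Binary mode: the elements of the sequence are bytes.
    with open(demo_path, mode='rb') as file_object:
        raw_bytes = file_object.read()

print(repr(first_char))  # 'F'
print(repr(rest))        # 'ood'
print(repr(at_end))      # '' (an empty string means the end of the file was reached)
print(raw_bytes[0])      # 70, the byte value that encodes 'F'
```

Note that each call to read() continues from where the previous one stopped, which is why the third call returns an empty string.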

9.1.2. Writing data to a file

In this section we will see how we can create a file and write content to it. The general syntax is:

with open(file_name, mode='w') as object_name:
    object_name.write(data_to_be_written)

The with statement opens the file specified by the file_name argument of the open() function and guarantees that the file is closed again when the indented block finishes, even if an error occurs inside it. Note that opening a file in Python does not by itself lock it, so other programs may still be able to access the same file. The mode='w' argument tells the open() function why the program is accessing the file. The available modes are listed below:

Table 9.1 Mode types

| Mode | Description |
| --- | --- |
| r | (read) opens a file for reading. The file must already exist |
| w | (write) opens a file for writing. If the file does not exist it is created; if it already exists, its contents are overwritten |
| a | (append) opens a file for writing, but if the file already exists its contents are kept and new content is added at the end. Otherwise it behaves like the w mode |
| x | (exclusive creation) creates the file from scratch and raises an error if the file already exists |
| r+ | opens an existing file for both reading and writing. Existing content is kept, but writing starts at the beginning of the file and overwrites whatever is there, character by character |
| w+ | opens a file for both reading and writing, but any existing content is deleted first |
| a+ | opens a file for both reading and appending; new content is always added at the end of the file |
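The differences between the modes are easiest to see by trying them on the same file. The following is a minimal sketch, assuming a hypothetical scratch file modes.txt in a temporary folder; the variable names are ours and only serve the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'modes.txt')  # hypothetical scratch file

    # 'w' creates the file; a second 'w' replaces what was written before.
    with open(demo_path, mode='w') as file_object:
        file_object.write('first\n')
    with open(demo_path, mode='w') as file_object:
        file_object.write('second\n')

    # 'a' keeps the existing content and adds to the end.
    with open(demo_path, mode='a') as file_object:
        file_object.write('third\n')
    with open(demo_path, mode='r') as file_object:
        after_w_and_a = file_object.read()

    # 'x' refuses to touch a file that already exists.
    try:
        with open(demo_path, mode='x') as file_object:
            file_object.write('never written\n')
        x_failed = False
    except FileExistsError:
        x_failed = True

    # 'r+' keeps the content but writes over it from the beginning.
    with open(demo_path, mode='r+') as file_object:
        file_object.write('SECOND')
    with open(demo_path, mode='r') as file_object:
        after_r_plus = file_object.read()

print(repr(after_w_and_a))  # 'second\nthird\n'
print(x_failed)             # True
print(repr(after_r_plus))   # 'SECOND\nthird\n'
```

In the r+ step the new text has the same length as the text it replaces, which is why the rest of the file is left intact.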

Each time we deal with a file, be it for reading or writing purposes, the interpreter will assign a file object to it, to handle all the underlying tasks like opening the file and actually reading/writing from/to it. In the above syntax we refer to this object as object_name. Then, in the case of writing to a file we use the write() method of the object_name to write the actual content to the file. Let us look at an example:

path = '../../files/'
with open(path + 'file.txt', mode='w') as file_object:
    file_object.write('Welcome to Food Science!\n')

The above code will create the file.txt file in the files folder, which is located two directories above the one where this notebook is located (each ../ in the path moves one directory up). The files folder itself must already exist, because open() creates files but not folders. After the file is created, the file object that manages access to it writes the sentence Welcome to Food Science!, followed by a newline character (\n), into the file.
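If the target folder might be missing, it can be created before the file is opened. The sketch below is one way to do this, assuming a temporary folder stands in for the notebook's real directory layout; os.makedirs() and os.path.join() are standard library functions, while the folder and file names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    folder = os.path.join(tmp_dir, 'files')       # this folder does not exist yet
    file_path = os.path.join(folder, 'file.txt')

    # Opening a file inside a missing folder fails, even in 'w' mode.
    try:
        with open(file_path, mode='w') as file_object:
            file_object.write('Welcome to Food Science!\n')
        missing_folder_error = False
    except FileNotFoundError:
        missing_folder_error = True

    # Create the folder first (exist_ok=True avoids an error if it is already there).
    os.makedirs(folder, exist_ok=True)
    with open(file_path, mode='w') as file_object:
        file_object.write('Welcome to Food Science!\n')

    file_exists = os.path.exists(file_path)

print(missing_folder_error)  # True
print(file_exists)           # True
```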

Important

If the file that we are trying to open in w mode already exists, then its contents will be overwritten by the new content. If you would like to append to the file then the a mode should be used instead.

When the code inside the with block finishes executing, the file is automatically closed: any content still waiting to be written is saved to disk and the resource is released, so other programs that might want to access it are free to do so. It is very important to release resources as soon as possible in order not to create bottlenecks.
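We can check that the file really is released by inspecting the closed attribute of the file object. This is a minimal sketch, assuming a hypothetical scratch file release.txt in a temporary folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'release.txt')  # hypothetical scratch file

    with open(demo_path, mode='w') as file_object:
        file_object.write('Welcome to Food Science!\n')
        closed_inside = file_object.closed   # still open inside the block

    closed_after = file_object.closed        # closed as soon as the block ends

    # Using the file object after the block is an error.
    try:
        file_object.write('too late')
        write_error = None
    except ValueError as error:
        write_error = str(error)

print(closed_inside)  # False
print(closed_after)   # True
print(write_error)
```

The variable file_object still exists after the block, but it can no longer be used to read or write; the file has to be opened again.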

If later in the program we would like to add some more text to the file, we need to open it in append mode (a) in order not to overwrite anything:

with open(path + 'file.txt', mode='a') as file_object:
    file_object.write('Are you excited to learn more about food science and programming?')

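If you open the file, you will see that the new content did not overwrite the previous message. It appears on a new line because the first message ended with the newline character \n; the write() method does not add line breaks on its own.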

Warning

There is no direct, built-in method for modifying content in the middle of a file. If you try to overwrite part of a file and the new and old content do not have the same length, you will most likely create a mess: leftover pieces of the old content remain, or content that follows is overwritten. For updates, it is better to read the content, change it in memory, and write everything to the file again from the beginning.
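The read, change, and rewrite approach described in the warning could look like the following sketch. It assumes a hypothetical scratch copy of file.txt in a temporary folder and uses the string method replace() to make the change; any other way of building the new text would work just as well.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'file.txt')  # hypothetical scratch file

    with open(demo_path, mode='w') as file_object:
        file_object.write('Welcome to Food Science!\n')

    # Step 1: read the whole content into memory.
    with open(demo_path, mode='r') as file_object:
        old_text = file_object.read()

    # Step 2: change the content in memory (the lengths may differ freely).
    new_text = old_text.replace('Food Science', 'Food Science and Programming')

    # Step 3: write everything back from the beginning with the w mode.
    with open(demo_path, mode='w') as file_object:
        file_object.write(new_text)

    with open(demo_path, mode='r') as file_object:
        updated_text = file_object.read()

print(updated_text)  # Welcome to Food Science and Programming!
```

This is fine for small files. For files that are too large to hold in memory, a different strategy is needed, which is outside the scope of this sketch.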

9.1.3. Reading data from a file

Reading a file is similar to writing or appending to a file. The only things we need to change are the value of the mode parameter, which becomes r to indicate that we want the resource for reading purposes, and the method we call: the read() method of the file_object reads the contents of the file. Let us read the contents of the file.txt that we wrote in the previous section.

with open(path + 'file.txt', mode='r') as file_object:
    text = file_object.read()

print(text)
Welcome to Food Science!
Are you excited to learn more about food science and programming?

As shown above, the read() method reads the entire content of the file, which we assigned to the text variable. We can also read the lines of the file separately and get a list of strings instead, where each line is a separate string, using the readlines() method. Each string keeps the newline character \n that ends its line:

with open(path + 'file.txt', mode='r') as file_object:
    list_of_lines = file_object.readlines()

print(list_of_lines)
['Welcome to Food Science!\n', 'Are you excited to learn more about food science and programming?']
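A file object can also be used directly in a for loop, which hands us one line at a time without building the whole list first. The sketch below is an illustration of that idea rather than part of the example above: it recreates the two lines in a hypothetical scratch file inside a temporary folder and strips the trailing newline characters with the string method rstrip().

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'file.txt')  # hypothetical scratch file

    # Recreate the two lines written earlier in this section.
    with open(demo_path, mode='w') as file_object:
        file_object.write('Welcome to Food Science!\n')
        file_object.write('Are you excited to learn more about food science and programming?')

    clean_lines = []
    with open(demo_path, mode='r') as file_object:
        for line in file_object:                    # one line per iteration
            clean_lines.append(line.rstrip('\n'))   # drop the trailing newline, if any

print(clean_lines)
# ['Welcome to Food Science!', 'Are you excited to learn more about food science and programming?']
```

Looping over the file is usually preferable for large files, because only one line needs to be held in memory at a time.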