File Handling in Python

By Dhiraj, 20 March, 2019

In this tutorial, we will discuss handling files and resources in Python. We will write sample code for different file operations such as read, write and append. We will also see how Python eases reading large text files and images, and we will use context managers so that files are always closed and resources do not leak.

Open File in Python

We call the built-in function open() to open a file in Python. It takes several arguments, of which only the path to the file is required, and returns a file object whose type depends on the mode. Below is the signature of open():

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

In the above signature, file is the path to the file (or the integer file descriptor of an already open file).

mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r' which means open for reading in text mode. We will discuss other modes later.

buffering is an optional integer used to set the buffering policy. Binary files are buffered in fixed-size chunks.

encoding is the name of the encoding used to decode or encode the file. It should only be used in text mode. If it is omitted, a platform-dependent default is used.

errors is an optional string that specifies how encoding and decoding errors are to be handled. It should not be used in binary mode.

newline controls how universal newlines mode works. It applies only to text mode.

If closefd is False and a file descriptor rather than a path was given, the underlying file descriptor will be kept open when the file is closed.
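To make the encoding and errors parameters concrete, here is a minimal sketch. It assumes a throwaway file in a temporary directory (the name sample.txt is hypothetical), writes text as Latin-1, and then reads it back as UTF-8 with two different error handlers.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')

# Write 'café' encoded as Latin-1: the 'é' is stored as the single byte 0xE9
with open(path, mode='wt', encoding='latin-1') as f:
    f.write('café')

# Reading with the wrong encoding and the default errors='strict' raises
try:
    with open(path, mode='rt', encoding='utf-8') as f:
        f.read()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='replace' substitutes U+FFFD for bytes that cannot be decoded
with open(path, mode='rt', encoding='utf-8', errors='replace') as f:
    replaced = f.read()

print(strict_failed)
print(repr(replaced))
```

Passing the encoding explicitly on both the write and the read side is what avoids this kind of mismatch in practice.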

Python file open modes

Character   Meaning
'r'         open for reading (default)
'w'         open for writing, truncating the file first
'x'         create a new file and open it for writing
'a'         open for writing, appending to the end of the file if it exists
'b'         binary mode
't'         text mode (default)
'+'         open a disk file for updating (reading and writing)

Example
f = open('test.txt', mode='wt', encoding='utf-8')

Here, w means write and t means text. Every mode string must contain exactly one of 'r', 'w', 'x' or 'a'; 'b', 't' and '+' are optional modifiers.
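The less common modes are easiest to understand by trying them. The sketch below assumes a scratch file named modes.txt in a temporary directory (a hypothetical name) and exercises 'x', 'r+' and a mode string that lacks a base mode.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

# 'x' creates the file and fails if it already exists
with open(path, mode='xt', encoding='utf-8') as f:
    f.write('first')

try:
    open(path, mode='xt', encoding='utf-8')
    second_create_failed = False
except FileExistsError:
    second_create_failed = True

# 'r+' opens an existing file for reading and writing without truncating it
with open(path, mode='r+', encoding='utf-8') as f:
    before = f.read()    # reads everything, leaving the position at the end
    f.write(' second')   # so this write lands after the existing text
    f.seek(0)
    after = f.read()

# A mode string without 'r', 'w', 'x' or 'a' is rejected
try:
    open(path, mode='t')
    bad_mode_rejected = False
except ValueError:
    bad_mode_rejected = True

print(second_create_failed, before, after, bad_mode_rejected)
```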

Writing to File in Python

The write() method writes to a file. It takes the text to write as its argument and returns the number of characters (code points) written, not the number of bytes. Always remember to call close() when you are done with the file so that buffered data is flushed to disk and the file handle is released. Below is an example.

f = open('test.txt', mode='wt', encoding='utf-8')

f.write('Hello There! ')
f.write('I am learning Python \n')
f.write('What are you learning?')

f.close()

The above snippet creates the file if it does not exist and writes the given text to it. If the file already exists, it is truncated first and then the new text is written. If we do not want to overwrite the text that is already present in the file, we can use the 'a' mode, meaning append.
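The difference between 'w' and 'a' can be seen in a short sketch. It assumes a scratch file named append.txt in a temporary directory (a hypothetical name).

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'append.txt')

with open(path, mode='wt', encoding='utf-8') as f:
    f.write('line one\n')

# Opening with 'w' again truncates the file, so 'line one' is lost
with open(path, mode='wt', encoding='utf-8') as f:
    f.write('line two\n')

# Opening with 'a' keeps the existing text and adds to the end
with open(path, mode='at', encoding='utf-8') as f:
    f.write('line three\n')

with open(path, mode='rt', encoding='utf-8') as f:
    content = f.read()

print(content, end='')
```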

While reading or writing a file, we can call seek(0) at any time to move the file position back to the start of the file. The next write then starts at the beginning of the file and overwrites the text already there. It does not replace the whole file; it overwrites only as many characters as the new text needs.

f = open('test.txt', mode='wt', encoding='utf-8')

f.write('Hello There! ')
f.write('I am learning Python \n')
f.seek(0)               # move back to the start of the file
f.write('overridden ')  # overwrites the first 11 characters

f.close()

When the above code runs, 'Hello There' is overwritten by 'overridden ' because seek(0) moved the file position to the start of the file. The first line of the file becomes 'overridden ! I am learning Python '.

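Instead of calling write() repeatedly, we can use writelines() to write a sequence of strings in one call. It does not add line separators, so each string must already end with '\n' if it should become a separate line.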
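A minimal sketch of writelines(), assuming a scratch file named lines.txt in a temporary directory (a hypothetical name). It shows that the newline has to be supplied by the caller.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'lines.txt')

lines = ['first', 'second', 'third']

# Without explicit '\n' the strings are written back to back
with open(path, mode='wt', encoding='utf-8') as f:
    f.writelines(lines)
with open(path, mode='rt', encoding='utf-8') as f:
    joined = f.read()

# Appending '\n' to each string produces one line per item
with open(path, mode='wt', encoding='utf-8') as f:
    f.writelines(line + '\n' for line in lines)
with open(path, mode='rt', encoding='utf-8') as f:
    separated = f.read()

print(repr(joined))
print(repr(separated))
```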

Reading Text File in Python

The read() method reads from a file. It accepts an optional number of characters (or bytes, in binary mode) to read and returns text for a text file or bytes for a binary file. Without an argument, it reads from the current position to the end of the file.

Example
f = open('test.txt', mode='rt', encoding='utf-8')
chunk_text = f.read(12)
print(chunk_text)
print('*******************')
text = f.read()
print(text)

f.close()

The first call, read(12), reads only the first 12 characters of the file. The following read() call returns the rest of the text, starting from where the previous read stopped.

Output

Hello There!
*******************
 I am learning Python 
What are you learning?

It is advisable to read a file in chunks if you are not sure about its size or you know that it is large.

We can also read a text file line by line using readline() or readlines(). readlines() returns a list of all the lines in the file, whereas readline() reads one line at a time. In both cases each line keeps its trailing '\n', except possibly the last line of the file. At the end of the file, readline() returns an empty string.
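The following sketch compares the two methods. It assumes a scratch copy of the tutorial's test.txt content written to a temporary directory, so that it does not depend on any existing file.

```python
import os
import tempfile

# Scratch copy of the text written earlier in this tutorial (path is hypothetical)
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with open(path, mode='wt', encoding='utf-8') as f:
    f.write('Hello There! I am learning Python \nWhat are you learning?')

# readline() returns one line per call and '' once the end is reached
with open(path, mode='rt', encoding='utf-8') as f:
    first = f.readline()
    second = f.readline()
    at_end = f.readline()

# readlines() returns every line in a single list
with open(path, mode='rt', encoding='utf-8') as f:
    all_lines = f.readlines()

print(repr(first))
print(repr(second))
print(repr(at_end))
print(all_lines)
```

Note that the last line has no trailing '\n' because the file itself does not end with one.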

Reading Files as Iterator

readlines() loads every line into memory at once, which makes it unsuitable for large files. readline() reads one line at a time, but it needs a manual loop that keeps calling it until it returns an empty string. A simpler option is to use the file object as an iterator: a for loop over the file reads one line at a time until the end of the file. Below is an example.

f = open('test.txt', mode='rt', encoding='utf-8')

for line in f:
    print(line, end='')
    
f.close()

Using Context Managers while Reading File

Python provides the with statement, which works with context managers to guarantee resource cleanup after I/O operations.

All the examples shown above call close() after completing the read or write operation. But what if an exception occurs and close() is never executed? In that case, the file handle can stay open and buffered data may not be written to disk. We can avoid this by calling close() inside a finally block, but Python provides a cleaner way to achieve this using context managers.

with open('test.txt', mode='rt', encoding='utf-8') as f:
    for line in f:
        print(line, end='')


Though the file object f is still accessible after the with block, we can't perform any file operations on it there because the file is closed when the with block ends.
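Both points can be checked with a small sketch. It assumes a scratch file named closed.txt in a temporary directory (a hypothetical name), shows that reading after the with block fails, and then spells out the try/finally form that the with statement replaces.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'closed.txt')
with open(path, mode='wt', encoding='utf-8') as f:
    f.write('some text')

with open(path, mode='rt', encoding='utf-8') as f:
    text = f.read()

# The name f still exists here, but the file behind it is closed
closed_after_with = f.closed
try:
    f.read()
    read_failed = False
except ValueError:
    read_failed = True

# Roughly the same guarantee written by hand with try/finally
g = open(path, mode='rt', encoding='utf-8')
try:
    text_again = g.read()
finally:
    g.close()  # runs even if read() raises

print(closed_after_with, read_failed, g.closed)
```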

Copying One File Content to Another File in Python

We can iterate over the source file line by line and write each line to another file to create a copy of a text file in Python. Below is a sample implementation.

with open('test.txt', mode='rt') as f:
    with open('test1.txt', mode="wt") as g:
        for line in f:
            g.write(line)
            

Reading and Writing Binary Files or Images in Python

The interface for reading and writing an image or any other binary file is very similar to that for text files. We only need to open the file in binary mode.

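with open('image.PNG', mode='rb') as f:
    with open('image1.PNG', mode='wb') as g:
        # In binary mode, iteration splits the data at b'\n' bytes;
        # the pieces are written back unchanged, so the copy is exact
        for line in f:
            g.write(line)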
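Binary data has no meaningful lines, so a piece produced by line iteration can be arbitrarily long. A common alternative is to copy fixed-size chunks. The sketch below is one way to do that; the file names, the random test data and the CHUNK_SIZE value are assumptions made for illustration.

```python
import os
import tempfile
from functools import partial

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune it for your workload

# Hypothetical scratch files standing in for image.PNG and image1.PNG
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'source.bin')
dst = os.path.join(workdir, 'copy.bin')

with open(src, mode='wb') as f:
    f.write(os.urandom(200000))  # random bytes as stand-in binary content

chunks_written = 0
with open(src, mode='rb') as f, open(dst, mode='wb') as g:
    # iter() keeps calling f.read(CHUNK_SIZE) until it returns b'' at end of file
    for chunk in iter(partial(f.read, CHUNK_SIZE), b''):
        g.write(chunk)
        chunks_written += 1

with open(src, mode='rb') as f, open(dst, mode='rb') as g:
    identical = f.read() == g.read()

print(chunks_written, identical)
```

For a plain file copy, the standard library function shutil.copyfile() does the same job in a single call.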

Reading Large File in Python

Because memory is limited, it is recommended to read large files in chunks. To do so, we can call read() with a size inside a while loop so that only one chunk of the file is held in memory at a time.

with open('test.txt', mode='rt') as f:
    text = f.read(100)  # read the first 100 characters; the position advances past them

    while text:  # read() returns '' at the end of the file
        print(text, end='')  # end='' avoids inserting a newline between chunks
        text = f.read(100)  # read the next chunk of up to 100 characters

Conclusion

In this tutorial, we discussed handling files and resources in Python. We wrote sample code for different file operations such as read, write and append. We also saw how Python eases reading large text files and images, and we used context managers so that files are always closed and resources do not leak.

Further Reading on Python

1. Python Data Types

2. Different Python Operators

3. Python Flow Control

4. Python Functions

5. Python Classes
