
Working with text files is the most common form of file handling in Python. Basic reading and writing covers the "how"; working with text files well also means understanding how to manipulate strings, handle newlines, and navigate through the data efficiently.

1. Theoretical Overview

The Nature of Text Files

Text files consist of sequences of characters structured into lines. In Python, when you open a file in text mode (the default 't' mode), Python automatically handles:

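  • Encoding: Bytes on disk are decoded into strings using the encoding you pass to open(). If you omit it, Python falls back to a platform-dependent default (often UTF-8 on Linux and macOS, but frequently a legacy code page such as cp1252 on Windows), so passing encoding="utf-8" explicitly is the safer habit.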

  • Newline Translation: When reading, Python maps OS-specific line endings (\r\n on Windows, \n on Linux and macOS) to a plain \n for consistency. When writing, each \n is translated to the platform's own line ending.
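
The newline translation described above can be observed directly. The following is a minimal sketch, not a definitive recipe: the temporary file name is an assumption made for illustration, and the file is written in binary mode only so that it is guaranteed to contain Windows-style endings on any OS.

```python
import os
import tempfile

# Hypothetical scratch file; the name is an assumption for this example.
path = os.path.join(tempfile.mkdtemp(), "windows_style.txt")

# Write raw bytes so the file really contains \r\n line endings.
with open(path, "wb") as raw:
    raw.write(b"alpha\r\nbeta\r\n")

# Default text mode: \r\n is translated to \n while reading.
with open(path, "r", encoding="utf-8") as f:
    translated = f.read()

# newline="" switches translation off, so the original endings survive.
with open(path, "r", encoding="utf-8", newline="") as f:
    untranslated = f.read()

print(repr(translated))    # 'alpha\nbeta\n'
print(repr(untranslated))  # 'alpha\r\nbeta\r\n'
```

Passing newline="" is mainly useful when the exact line endings matter, for example when handing the file object to the csv module.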

Key Methods for Text Manipulation

When working with text files, you often choose between reading the file as one giant string or as a list of individual lines:

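  1. read(size): Reads up to size characters. With no argument, it reads the rest of the file as one string.

  2. readline(): Reads one line at a time, including its trailing \n, and returns an empty string at the end of the file.

  3. readlines(): Reads the rest of the file and returns a list of strings, where each string is a line that still ends with its \n.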
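
As a rough illustration of how the three methods share one file position, the sketch below calls them in sequence on a small sample file. The file name and contents are made up for the example.

```python
import os
import tempfile

# Hypothetical three-line sample file.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("one\ntwo\nthree\n")

with open(path, "r", encoding="utf-8") as f:
    first_three = f.read(3)      # 'one'
    rest_of_line = f.readline()  # '\n' (what is left of line 1)
    next_line = f.readline()     # 'two\n'
    remaining = f.readlines()    # ['three\n']
    at_eof = f.readline()        # '' signals end of file

print(repr(first_three), repr(rest_of_line), repr(next_line))
print(remaining, repr(at_eof))
```

Each call picks up where the previous one stopped, which is why readlines() returns only the last line here.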

The "Cursor" (File Pointer)

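Think of the file pointer as a blinking cursor in a text editor. When you read 10 characters, the cursor moves past those 10 characters, and the next read starts from the new position. To go back to the start, use file.seek(0). In text mode, file.tell() reports the position as an opaque number (usually a byte offset rather than a character count), so the only safe arguments to seek() are 0 and values previously returned by tell().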

2. Code Implementation

A. Writing and Appending Text

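You can use write() for a single string or writelines() to write a list of strings at once. Neither method adds line endings, so each string must already end in \n if you want separate lines.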

Python
lines_of_text = [
    "First line of data\n",
    "Second line of data\n",
    "Third line of data\n"
]

# Writing a list of lines
with open("notes.txt", "w") as file:
    file.writelines(lines_of_text)

# Appending a single line later
with open("notes.txt", "a") as file:
    file.write("This line was added later.\n")
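
When the strings do not already end in \n, one possible approach is to add the newline while writing. This is a small sketch with made-up sample data and a hypothetical file name:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "tasks.txt")  # hypothetical file
tasks = ["buy milk", "call Sam", "file taxes"]        # sample data, no trailing \n

with open(path, "w", encoding="utf-8") as f:
    # writelines() writes each string as-is, so append "\n" ourselves.
    f.writelines(task + "\n" for task in tasks)

with open(path, "r", encoding="utf-8") as f:
    content = f.read()

print(repr(content))  # 'buy milk\ncall Sam\nfile taxes\n'
```

writelines() accepts any iterable of strings, so the generator expression avoids building a second list in memory.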

B. Advanced Reading Techniques

Iterating over the file object in a for loop is the most memory-efficient way to process large text files, because it reads one line at a time instead of loading the whole file into RAM.

Python
# Efficiently processing large files
with open("notes.txt", "r") as file:
    for line_number, line in enumerate(file, 1):
        # .strip() removes leading and trailing whitespace, including the final \n
        clean_line = line.strip()
        print(f"Line {line_number}: {clean_line}")
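
Line-by-line iteration only saves memory when the lines themselves are reasonably short. For a file with very long lines (or no newlines at all), a related option is to read fixed-size chunks with read(size). The helper name count_characters and the chunk size below are assumptions for this sketch, not part of any library:

```python
import os
import tempfile

def count_characters(path, chunk_size=4096):
    """Count characters while holding at most one chunk in memory."""
    total = 0
    with open(path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_size)  # at most chunk_size characters
            if not chunk:               # empty string means end of file
                break
            total += len(chunk)
    return total

# Hypothetical demo file: one long line with no newlines at all.
demo_path = os.path.join(tempfile.mkdtemp(), "one_long_line.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("x" * 10_000)

print(count_characters(demo_path))  # 10000
```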

C. Moving the Pointer (seek and tell)

If you need to re-read a file without closing and re-opening it, move the pointer back with seek(); tell() reports where the pointer currently is.

Python
with open("notes.txt", "r") as file:
    print(f"Current Position: {file.tell()}") # Starts at 0
    data = file.read(5)
    print(f"Read 5 chars: {data}")
    print(f"New Position: {file.tell()}") # 5 here, because the first five characters are plain ASCII
    
    # Go back to the start
    file.seek(0)
    print("Back to start:", file.read(5))
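
seek() is not limited to jumping back to 0. A value returned by tell() can serve as a bookmark to return to later. The sketch below uses a made-up file containing non-ASCII characters, where the bookmark should be treated as an opaque value rather than a character count:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bookmark.txt")  # hypothetical file
with open(path, "w", encoding="utf-8") as f:
    f.write("héllo wörld\nsecond line\n")

with open(path, "r", encoding="utf-8") as f:
    first = f.readline()
    bookmark = f.tell()          # opaque position; pass it only to seek()
    second = f.readline()

    f.seek(bookmark)             # jump back to the start of line 2
    second_again = f.readline()

print(repr(second), repr(second_again))  # both 'second line\n'
```

Note that readline() is used here instead of a for loop: calling tell() while iterating over the file object with for raises an OSError in text mode.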

3. Best Practices for Text Data

  • Newlines: Add \n yourself when using write() or writelines(); neither adds it for you.

  • Encoding: Use open("file.txt", "r", encoding="utf-8") to avoid errors with special symbols.

  • Stripping: Use .rstrip("\n") to drop only the trailing newline when reading lines, or .strip() when leading and trailing whitespace should go as well.
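
To see why the encoding advice matters, the following sketch writes a string with non-ASCII characters as UTF-8 and then reads it back twice: once with the matching encoding and once with a deliberately wrong one. The file name and sample text are assumptions for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "menu.txt")  # hypothetical file
text = "café ☕ naïve\n"

with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Matching encoding: the text comes back unchanged.
with open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()

# Mismatched encoding: decoding fails on the first non-ASCII byte.
try:
    with open(path, "r", encoding="ascii") as f:
        f.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(round_trip == text)  # True
print(decode_failed)       # True
```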