Files

  • Concept: The open() function returns a file object. This object exposes attributes that describe the file’s current state.
  • Methods and attributes:
    • open(file_name [, access_mode][, buffering]): Opens the file and returns a file object. The access mode defaults to 'r' (read).
    • fo.name: The file name that was passed to open().
    • fo.closed: A boolean that is True if the file is closed and False if it is still open.
    • fo.mode: The access mode used to open the file (e.g., 'w', 'r').

Program:

# Open a file ("w" creates foo.txt, or empties it if it already exists)
with open("foo.txt", "w") as fo:
    print("Name of the file:", fo.name)
    print("Closed or not:", fo.closed)
    print("Opening mode:", fo.mode)
# Output:
# Name of the file: foo.txt
# Closed or not: False
# Opening mode: w
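
The attributes above can also be checked after the with block ends. The following is a minimal sketch, assuming a throwaway temporary directory and a hypothetical file name (demo.txt), neither of which comes from the original program. It shows that fo.closed is False inside the block and True once the block has exited.

```python
import os
import tempfile

# Work in a temporary directory so no real file is touched (assumption for this sketch).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    with open(path, "w") as fo:
        inside = fo.closed   # the file is still open here
        mode = fo.mode       # the mode string passed to open()

    after = fo.closed        # the with block has closed the file

print(inside, after, mode)
# Output: False True w
```

The file object still exists after the block, which is why fo.closed can be read, but any read or write on it at that point raises ValueError.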

  • Concept: Writing string data to a file. The behavior depends on the access mode.
  • Methods:
    • "w" (Write Mode): Truncates the file and writes the new content. If the file does not exist, it is created.
    • "a" (Append Mode): Writes at the end of the file, keeping the existing content. If the file does not exist, it is created.
    • fo.write(string): Writes the string to the file. It does not add a newline, so include "\n" where needed.

Program:

# Rewrite / Create
with open("foo.txt", "w") as fo:
    fo.write("Python is a great language.\nYeah, it's great!!\n")
# Append
with open("foo.txt", "a") as fo:
    fo.write("Python is a great language.\nYeah, it's great!!\n")
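
To make the difference between the two modes visible, the sketch below writes, appends, and then rewrites the same file, reading it back after each step. The temporary directory, the file name demo.txt, and the sample strings are assumptions chosen for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    # "w" creates the file and writes the first line.
    with open(path, "w") as fo:
        fo.write("first\n")

    # "a" keeps the existing content and adds to the end.
    with open(path, "a") as fo:
        fo.write("second\n")
    with open(path) as fo:
        after_append = fo.read()

    # "w" again discards everything that was there before.
    with open(path, "w") as fo:
        written = fo.write("third\n")  # write() returns the character count
    with open(path) as fo:
        after_rewrite = fo.read()

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_rewrite))  # 'third\n'
print(written)              # 6
```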

  • Concept: Python provides four common ways to read data from a file object, depending on how much data you need in memory at once.
  • Methods:
    • fo.read(n): Reads up to n characters (or bytes, in binary mode) from the file; fewer are returned if the end of the file is reached first.
    • fo.read(): Reads the entire file content into a single string.
    • for line in fo: Iterates through the file object line by line (highly memory efficient).
    • fo.readlines(): Reads all lines into a single Python list, with each line keeping its trailing newline.

Program:

(Assuming foo.txt contains four lines from the Zen of Python)

# 1. Read up to 10 characters
with open("foo.txt") as fo:
    s = fo.read(10)
    print(s)
# Output: Explicit i
# 2. Read the whole file
with open("foo.txt") as fo:
    s = fo.read()
    print(s)
# Output:
# Explicit is better than implicit.
# Simple is better than complex.
# Complex is better than complicated.
# Flat is better than nested.
# 3. Read the file line by line
with open("foo.txt") as fo:
    for line in fo:
        print(line)
# Output: (each line is followed by a blank line because print() adds its own newline)
# Explicit is better than implicit.
#
# Simple is better than complex.
# ...
# 4. Read lines into a list
with open("foo.txt") as fo:
    lines = fo.readlines()
    print(lines)
# Output: ['Explicit is better than implicit.\n', 'Simple is better than complex.\n', ...]
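
Two follow-ups to the reading methods above may be useful: stripping the trailing newline while iterating, and calling read(n) in a loop to process a file in fixed-size chunks. The sketch below is one possible way to do both. It assumes a temporary copy of foo.txt holding the four Zen of Python lines, and a chunk size of 50 picked arbitrarily.

```python
import os
import tempfile

zen = (
    "Explicit is better than implicit.\n"
    "Simple is better than complex.\n"
    "Complex is better than complicated.\n"
    "Flat is better than nested.\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "foo.txt")  # temporary stand-in for foo.txt
    with open(path, "w") as fo:
        fo.write(zen)

    # Remove the trailing newline so print() does not produce blank lines.
    clean_lines = []
    with open(path) as fo:
        for line in fo:
            clean_lines.append(line.rstrip("\n"))

    # read(n) returns at most n characters and "" at the end of the file.
    chunks = []
    with open(path) as fo:
        while True:
            chunk = fo.read(50)  # arbitrary chunk size for this sketch
            if not chunk:
                break
            chunks.append(chunk)

for line in clean_lines:
    print(line)
```

Joining the chunks back together reproduces the original text, so the loop loses nothing while never holding more than 50 characters per read.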

  • Concept: Bypassing file objects to perform direct filesystem operations using the os module.
  • Methods:
    • os.rename(old_name, new_name): Changes the name of an existing file.
    • os.remove(file_name): Permanently deletes the specified file from the filesystem.

Program:

import os
# Rename a file from test1.txt to test2.txt
os.rename("test1.txt", "test2.txt")
# Delete file test2.txt
os.remove("test2.txt")
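
Both calls raise FileNotFoundError when the target file is missing, so a script usually checks for the file or handles the exception. The following is a minimal sketch of that pattern, assuming a temporary directory so the rename and delete can run without touching real files. The variable names are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    old_path = os.path.join(tmp_dir, "test1.txt")
    new_path = os.path.join(tmp_dir, "test2.txt")

    # Create the file first so there is something to rename.
    with open(old_path, "w") as fo:
        fo.write("temporary data\n")

    os.rename(old_path, new_path)
    renamed = os.path.exists(new_path) and not os.path.exists(old_path)

    os.remove(new_path)
    removed = not os.path.exists(new_path)

    # Removing the same file again fails because it no longer exists.
    try:
        os.remove(new_path)
        second_remove_failed = False
    except FileNotFoundError:
        second_remove_failed = True

print(renamed, removed, second_remove_failed)
# Output: True True True
```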

  • Concept: Combining line-by-line file streaming with the re (Regular Expression) module to extract specific key-value pairs from standard system logs, mirroring standard Bash scripting logic but with Pythonic structure.

Program (Extracting Endpoints and Response Times):

import re
# Sample contents of sys_logs.log:
# [2026-03-06T10:01:02Z] endpoint=/api/v1/users ResponseTime=120ms Code=200 IP=192.168.1.10
# [2026-03-06T10:01:05Z] endpoint=/api/v1/orders Code=201 IP=192.168.1.11
# [2026-03-06T10:01:08Z] endpoint=/api/v1/products ResponseTime=95ms Code=200 IP=192.168.1.12
# [2026-03-06T10:01:11Z] ResponseTime=310ms Code=200 IP=192.168.1.20
log_file = "sys_logs.log"
with open(log_file, "r") as file:
    for line in file:
        # Regex search: look for 'endpoint=' followed by one or more non-whitespace characters
        endpoint_match = re.search(r"endpoint=(\S+)", line)
        # Regex search: look for 'ResponseTime=' followed by digits, ending in 'ms'
        time_match = re.search(r"ResponseTime=([0-9]+)ms", line)
        # Logical condition: only process the line if BOTH patterns found a match
        if endpoint_match and time_match:
            # .group(1) extracts the value captured inside the regex parentheses
            endpoint = endpoint_match.group(1)
            response_time = time_match.group(1)
            print(f"Endpoint={endpoint} ResponseTime={response_time}ms")
# Output:
# Endpoint=/api/v1/users ResponseTime=120ms
# Endpoint=/api/v1/products ResponseTime=95ms
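
Once the values are extracted, they can be converted and compared. The sketch below is one possible extension, not part of the original program: it wraps the two searches in a hypothetical helper named parse_line, converts the response time to an integer, and picks the slowest request. It runs on an in-memory list holding the four sample log lines, so no log file is needed.

```python
import re

# Compile the patterns once, since they are reused for every line.
ENDPOINT_RE = re.compile(r"endpoint=(\S+)")
TIME_RE = re.compile(r"ResponseTime=([0-9]+)ms")


def parse_line(line):
    """Return (endpoint, milliseconds), or None if either field is missing."""
    endpoint_match = ENDPOINT_RE.search(line)
    time_match = TIME_RE.search(line)
    if endpoint_match and time_match:
        return endpoint_match.group(1), int(time_match.group(1))
    return None


# The four sample lines, kept in memory for this sketch.
sample_lines = [
    "[2026-03-06T10:01:02Z] endpoint=/api/v1/users ResponseTime=120ms Code=200 IP=192.168.1.10\n",
    "[2026-03-06T10:01:05Z] endpoint=/api/v1/orders Code=201 IP=192.168.1.11\n",
    "[2026-03-06T10:01:08Z] endpoint=/api/v1/products ResponseTime=95ms Code=200 IP=192.168.1.12\n",
    "[2026-03-06T10:01:11Z] ResponseTime=310ms Code=200 IP=192.168.1.20\n",
]

# Keep only the lines where both fields were found.
parsed = [result for result in map(parse_line, sample_lines) if result is not None]
slowest = max(parsed, key=lambda pair: pair[1])

print(parsed)
# Output: [('/api/v1/users', 120), ('/api/v1/products', 95)]
print(slowest)
# Output: ('/api/v1/users', 120)
```

Because a file object yields lines the same way a list does, parse_line could be applied to each line of an open log file without other changes.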