Files
1. File Initialization & Attributes
- Concept: The open() function returns a file object. This object has built-in attributes that provide metadata about the file's current state.
- Methods:
  - open(file_name [, access_mode][, buffering]): The standard initialization.
  - fo.name: Returns the name of the file.
  - fo.closed: Returns a boolean indicating if the file is currently closed (True) or open (False).
  - fo.mode: Returns the access mode used to open the file (e.g., 'w', 'r').
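The attributes remain readable after the file is closed, which makes it easy to confirm that a with block cleaned up after itself. A minimal sketch, assuming a scratch file in a temporary directory (the path variable and the temporary directory are illustrative choices, not part of the original notes):

```python
import os
import tempfile

# Hypothetical scratch path so the example does not touch a real foo.txt.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")

with open(path, "w") as fo:
    inside = fo.closed   # False: the file is still open inside the block
    mode = fo.mode       # "w": the access mode passed to open()

after = fo.closed        # True: leaving the with block closed the file

print(inside, after, mode)
```

Checking fo.closed after the block is a quick way to see why with is preferred over calling fo.close() by hand.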
Program:
# Open a file
with open("foo.txt", "w") as fo:
    print("Name of the file:", fo.name)
    print("Closed or not :", fo.closed)
    print("Opening mode :", fo.mode)

# Output:
# Name of the file: foo.txt
# Closed or not : False
# Opening mode : w

2. Writing to Files (write())
- Concept: Writing string data to a file. The behavior depends on the access mode.
- Methods:
  - "w" (Write Mode): Overwrites the file with new content, discarding anything it held before. If the file does not exist, it creates a new one.
  - "a" (Append Mode): Adds output to the end of the file and leaves existing content intact. If the file does not exist, it creates a new one.
  - fo.write(string): Writes the string to the file. It does not add a newline, so include "\n" yourself where needed.
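The difference between the two modes is easiest to see by writing twice and reading the result back. A small sketch, assuming a scratch file in a temporary directory (the path and the sample strings are made up for illustration):

```python
import os
import tempfile

# Hypothetical scratch path used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as fo:
    fo.write("first\n")

with open(path, "w") as fo:        # "w" truncates: "first" is discarded
    count = fo.write("second\n")   # write() returns the number of characters written

with open(path, "a") as fo:        # "a" keeps what is already there
    fo.write("third\n")

with open(path) as fo:
    content = fo.read()

print(count)          # 7
print(repr(content))  # 'second\nthird\n'
```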
Program:
# Rewrite / Create
with open("foo.txt", "w") as fo:
    fo.write("Python is a great language.\nYeah, it's great!!\n")

# Append
with open("foo.txt", "a") as fo:
    fo.write("Python is a great language.\nYeah, it's great!!\n")

3. Reading File Data (read())
- Concept: Python provides four ways to read data from a file object, depending on how much data you need in memory at once.
- Methods:
  - fo.read(n): Reads up to n characters (or bytes, in binary mode) from the file. Fewer are returned if the end of the file comes first.
  - fo.read(): Reads the entire file content into a single string.
  - for line in fo: Iterates through the file object line by line (highly memory efficient, since only one line is held at a time).
  - fo.readlines(): Reads all lines and returns them as a single Python list.
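Because read(n) returns at most n characters and an empty string at end of file, it can drive a fixed-size chunk loop. The sketch below shows that pattern, plus stripping the trailing newline when iterating by line. It assumes a two-line scratch file and a chunk size of 16, both chosen only for illustration:

```python
import os
import tempfile

text = "Explicit is better than implicit.\nSimple is better than complex.\n"

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
with open(path, "w") as fo:
    fo.write(text)

# read(n) returns at most n characters; an empty string signals end of file.
chunks = []
with open(path) as fo:
    while True:
        chunk = fo.read(16)
        if not chunk:
            break
        chunks.append(chunk)

# Each line keeps its "\n", so strip it before printing or storing.
with open(path) as fo:
    lines = [line.rstrip("\n") for line in fo]

print(lines)
```

The last chunk is usually shorter than the requested size, which is why code should not assume read(n) always returns exactly n characters.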
Program:
(Assuming foo.txt contains four lines from the Zen of Python)
# 1. Read the first 10 characters
with open("foo.txt") as fo:
    s = fo.read(10)
    print(s)
# Output: Explicit i

# 2. Read the whole file
with open("foo.txt") as fo:
    s = fo.read()
    print(s)
# Output:
# Explicit is better than implicit.
# Simple is better than complex.
# Complex is better than complicated.
# Flat is better than nested.

# 3. Read the file line by line
with open("foo.txt") as fo:
    for line in fo:
        print(line)
# Output: (each line is followed by a blank line, because print() adds its own newline after the line's "\n")
# Explicit is better than implicit.
#
# Simple is better than complex.
# ...

# 4. Read lines into a list
with open("foo.txt") as fo:
    lines = fo.readlines()
    print(lines)
# Output: ['Explicit is better than implicit.\n', 'Simple is better than complex.\n', ...]

4. OS-Level File Handling
- Concept: Bypassing file objects to perform direct filesystem operations using the os module.
- Methods:
  - os.rename(old_name, new_name): Changes the name of an existing file.
  - os.remove(file_name): Permanently deletes the specified file from the filesystem.
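Both calls raise an exception when the target file is missing, so scripts often guard them. A minimal sketch, assuming scratch files in a temporary directory (workdir, old_path, and new_path are illustrative names):

```python
import os
import tempfile

# Hypothetical working directory so no real files are renamed or deleted.
workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "test1.txt")
new_path = os.path.join(workdir, "test2.txt")

with open(old_path, "w") as fo:
    fo.write("temporary\n")

os.rename(old_path, new_path)
renamed = os.path.exists(new_path) and not os.path.exists(old_path)

os.remove(new_path)

# Removing a file that is already gone raises FileNotFoundError.
try:
    os.remove(new_path)
    missing_raised = False
except FileNotFoundError:
    missing_raised = True

print(renamed, missing_raised)
```

Catching FileNotFoundError is generally safer than checking os.path.exists() first, because the file could disappear between the check and the call.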
Program:
import os

# Rename a file from test1.txt to test2.txt
os.rename("test1.txt", "test2.txt")

# Delete file test2.txt
os.remove("test2.txt")

Advanced Application: Log Parsing Engine
- Concept: Combining line-by-line file streaming with the re (regular expression) module to extract specific key-value pairs from system logs, mirroring typical Bash scripting logic in a more structured, Pythonic form.
Program (Extracting Endpoints and Response Times):
import re
# Sample contents of sys_logs.log:
# [2026-03-06T10:01:02Z] endpoint=/api/v1/users ResponseTime=120ms Code=200 IP=192.168.1.10
# [2026-03-06T10:01:05Z] endpoint=/api/v1/orders Code=201 IP=192.168.1.11
# [2026-03-06T10:01:08Z] endpoint=/api/v1/products ResponseTime=95ms Code=200 IP=192.168.1.12
# [2026-03-06T10:01:11Z] ResponseTime=310ms Code=200 IP=192.168.1.20

log_file = "sys_logs.log"

with open(log_file, "r") as file:
    for line in file:
        # Regex search: look for 'endpoint=' followed by one or more non-whitespace characters.
        # (\S+ is used because [^ \\n] in a raw string would also exclude backslash and the letter 'n'.)
        endpoint_match = re.search(r"endpoint=(\S+)", line)

        # Regex search: look for 'ResponseTime=' followed by digits, ending in 'ms'
        time_match = re.search(r"ResponseTime=([0-9]+)ms", line)

        # Logical condition: only process if BOTH extraction patterns found a match
        if endpoint_match and time_match:
            # .group(1) extracts the exact value captured inside the regex parentheses
            endpoint = endpoint_match.group(1)
            response_time = time_match.group(1)

            print(f"Endpoint={endpoint} ResponseTime={response_time}ms")

# Output:
# Endpoint=/api/v1/users ResponseTime=120ms
# Endpoint=/api/v1/products ResponseTime=95ms
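The same two patterns can feed a small aggregation step instead of printing each match. The sketch below is one possible extension, not part of the original program: average_response_times is a hypothetical helper, and the sample lines are the ones shown in the comments above.

```python
import re
from collections import defaultdict

ENDPOINT_RE = re.compile(r"endpoint=(\S+)")
TIME_RE = re.compile(r"ResponseTime=(\d+)ms")


def average_response_times(lines):
    """Return {endpoint: mean response time in ms} for lines that carry both fields."""
    samples = defaultdict(list)
    for line in lines:
        endpoint_match = ENDPOINT_RE.search(line)
        time_match = TIME_RE.search(line)
        if endpoint_match and time_match:
            samples[endpoint_match.group(1)].append(int(time_match.group(1)))
    return {endpoint: sum(times) / len(times) for endpoint, times in samples.items()}


sample_lines = [
    "[2026-03-06T10:01:02Z] endpoint=/api/v1/users ResponseTime=120ms Code=200 IP=192.168.1.10",
    "[2026-03-06T10:01:05Z] endpoint=/api/v1/orders Code=201 IP=192.168.1.11",
    "[2026-03-06T10:01:08Z] endpoint=/api/v1/products ResponseTime=95ms Code=200 IP=192.168.1.12",
    "[2026-03-06T10:01:11Z] ResponseTime=310ms Code=200 IP=192.168.1.20",
]

averages = average_response_times(sample_lines)
print(averages)  # {'/api/v1/users': 120.0, '/api/v1/products': 95.0}
```

Since the helper only iterates over its argument, an open file object can be passed in place of the list, which keeps the line-by-line streaming behavior.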