What is Shell Scripting
Shell scripting is basically writing a series of commands in a text file that your computer’s terminal can run automatically. Think of it like creating a recipe where, instead of cooking steps, you’re telling your computer what tasks to do in order. It’s super handy for automating repetitive stuff like backing up files, managing system tasks, or running multiple programs with just one command. Most people use Bash shell on Linux/Mac, but there’s also PowerShell on Windows - either way, it saves tons of time once you get the hang of it.
How to prepare for a Shell Scripting Interview
Get your basics rock solid - Make sure you’re comfortable with fundamental commands like ls, grep, sed, awk, and pipes. Most interview questions for shell scripting start with these building blocks, so practice combining them in different ways.
Practice writing actual scripts - Don’t just memorize syntax. Write scripts for real tasks like file backups, log parsing, or system monitoring. Interviewers love seeing practical problem-solving skills.
Master the common shell scripting interview questions - You’ll definitely get asked about variables, loops, conditionals, and functions. Practice explaining the difference between $@ and $*, when to use double vs single quotes, and how exit codes work.
Debug like crazy - Learn to use set -x for debugging and understand error handling with trap commands. Being able to troubleshoot broken scripts on the spot is a huge plus.
Know your shell differences - Understand what makes bash different from sh, ksh, or zsh. Some companies are picky about POSIX compliance.
Study real-world scenarios - Look up common automation tasks like monitoring disk space, rotating logs, or deploying applications. These make great discussion points during interviews.
Practice explaining your code - The best script in the world won’t help if you can’t walk through your logic clearly. Practice talking through your problem-solving approach out loud.
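For instance, the $@ versus $* distinction from the tips above can be shown in a few lines (the function names here are just for illustration):

```shell
#!/bin/bash
# "$@" preserves each argument as a separate word;
# "$*" joins all arguments into one string.
print_args() {
  echo "Got $# argument(s)"
}

demo() {
  print_args "$@"   # forwards the arguments individually
  print_args "$*"   # collapses everything into a single argument
}

demo "one two" three
# "$@" reports 2 arguments; "$*" reports 1
```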
Shell Scripting Interview Questions and Answers For Beginners
Starting your journey with shell scripting interviews? Don’t worry - most bash scripting interview questions for beginners focus on practical basics rather than complex theory. We’ll walk through the most common bash shell interview questions that entry-level positions typically cover, helping you understand not just what to answer, but why things work the way they do. These questions come up constantly because they test whether you can actually write useful scripts, not just memorize commands.
Q1: What exactly is a shell script?
Think of it as a text file filled with commands you’d normally type one by one in the terminal. Instead of typing them manually every time, you save them in a file and run them all at once. It’s like creating a macro for your terminal - super helpful when you need to do the same tasks repeatedly.
Q2: How do I create my first shell script?
Open any text editor (nano, vim, or even notepad), type your commands, and save it with a .sh extension. Here’s the simplest example:
bash
#!/bin/bash
echo "My first script!"
Save it as myscript.sh, make it executable with chmod +x myscript.sh, then run with ./myscript.sh.
Q3: Why do some variables have $ and others don’t?
You use $ when you want to get the value from a variable, but not when you’re setting it. Setting: name="John" (no $). Using: echo "Hello $name" (with $). It’s like the difference between writing on a nametag versus reading what’s already written there.
Q4: What are those -f, -d, -e things I see in if statements?
These are test operators that check file conditions. -f checks if something is a regular file, -d checks for directories, -e checks if anything exists at that path. There are tons more: -r for readable, -w for writable, -x for executable. They’re your script’s way of looking before it leaps.
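A short sketch that exercises a few of these operators on a freshly created temp file (mktemp keeps it self-contained):

```shell
#!/bin/bash
# Demonstrate file-test operators on a temp file and its parent directory.
tmp=$(mktemp)
dir=$(dirname "$tmp")

[ -e "$tmp" ] && echo "$tmp exists"
[ -f "$tmp" ] && echo "$tmp is a regular file"
[ -d "$dir" ] && echo "$dir is a directory"
[ -r "$tmp" ] && echo "$tmp is readable"
[ -x "$tmp" ] || echo "$tmp is not executable"   # mktemp files aren't executable

rm -f "$tmp"
```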
Q5: How do I do basic math in bash?
Bash is quirky with math - you need special syntax. Use $((expression)) for arithmetic: result=$((5 + 3)). For decimals, bash can’t handle them natively - you’d need bc: echo "5.5 + 2.3" | bc. Remember, spaces don’t matter inside $(( )), which is unusual for bash.
Q6: What’s this 2>&1 thing I keep seeing?
This redirects error messages to wherever regular output is going. The 2 represents stderr (error messages), 1 represents stdout (normal output), and & tells bash “I mean the file descriptor, not a file named 1”. So command > output.txt 2>&1 sends both regular output and errors to output.txt.
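The behaviour is easy to verify with a throwaway file:

```shell
#!/bin/bash
# Redirect stdout and stderr of a command group into one file.
out=$(mktemp)

{
  echo "normal output"
  echo "error output" >&2
} > "$out" 2>&1

cat "$out"    # both lines ended up in the same file
rm -f "$out"
```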
Q7: How do I check if a command succeeded?
Every command returns an exit code - 0 for success, anything else for failure. Check it right after the command:
bash
cp file1 file2
if [ $? -eq 0 ]; then
  echo "Copy worked!"
else
  echo "Copy failed!"
fi
Q8: What’s the deal with spaces in bash?
Bash is super picky about spaces, especially in conditions. if [ $a = $b ] needs spaces around the brackets and operators. But a=5 can’t have spaces around the =. It’s annoying at first, but you’ll develop muscle memory for it pretty quickly.
Q9: How do I loop through a list of items?
The for loop is your friend here. For a simple list:
bash
for fruit in apple banana orange; do
  echo "I like $fruit"
done
For files: for file in *.txt; do something; done. The loop automatically splits on spaces unless you mess with IFS.
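When filenames can contain spaces, glob-based for loops and find with -print0 are the safe patterns (the *.txt pattern and directory are just examples):

```shell
#!/bin/bash
# Safe iteration over files whose names may contain spaces.
dir=$(mktemp -d)
touch "$dir/notes one.txt" "$dir/notes two.txt"

# Globbing keeps each filename intact (no word splitting)
for file in "$dir"/*.txt; do
  [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
  echo "Found: $file"
done

# find -print0 with read -d '' handles any character, even newlines
find "$dir" -name '*.txt' -print0 |
  while IFS= read -r -d '' file; do
    echo "Processing: $file"
  done

rm -rf "$dir"
```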
Q10: Can I use variables from outside my script?
Yes! These are environment variables. Your script can read system variables like $HOME, $USER, $PATH. To pass your own, either export them first (export MYVAR="value") or set them inline: MYVAR="value" ./myscript.sh. Just remember they’re readable, not writable - changes inside the script won’t affect the outside.
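A quick demonstration of what child processes can and can't see (variable names are just for illustration):

```shell
#!/bin/bash
# Child processes see exported variables, not plain shell variables.
PLAIN="local only"
export SHARED="visible to children"

bash -c 'echo "child sees PLAIN: ${PLAIN:-nothing}"'    # nothing
bash -c 'echo "child sees SHARED: $SHARED"'             # visible to children

# Inline form: set a variable for one command only
ONESHOT="hi" bash -c 'echo "child sees ONESHOT: $ONESHOT"'
echo "parent sees ONESHOT: ${ONESHOT:-nothing}"         # nothing
```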
Q11: What happens if my script crashes midway?
By default, bash keeps going even if commands fail. To make it stop on errors, add set -e near the top. Want to see what’s happening? Add set -x for debug mode. Combine them as set -ex. Just remember these make your script stricter - sometimes you want to handle errors yourself instead.
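The difference is easy to demonstrate by running throwaway child shells, so a failure there doesn't stop the demo itself:

```shell
#!/bin/bash
# Default behaviour: execution continues after a failed command
bash -c 'false; echo "still running"'

# With set -e: the shell exits at the first failure
bash -c 'set -e; false; echo "never reached"' || echo "(child exited with status $?)"

# With set -x: each command is printed (to stderr) before it runs
bash -c 'set -x; echo "traced"'
```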
Q12: How do I get input from users?
The read command is what you need. Basic usage: read username waits for input and stores it. Fancy it up with read -p "What's your name? " username for a prompt. Want a password? Use read -s to hide what they type. Set a timeout with read -t 10 to wait only 10 seconds.
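read also works non-interactively when you feed it input, which makes it easy to experiment with:

```shell
#!/bin/bash
# read can take input from a here-string instead of the keyboard.
read -r username <<< "alice"
echo "Hello, $username"

# -t fails if no input arrives in time. Reading from /dev/null hits
# EOF immediately, so the timeout/EOF branch below is taken.
if read -t 1 -r value < /dev/null; then
  echo "Got: $value"
else
  echo "No input (timeout or EOF)"
fi
```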
Shell Scripting Coding Interview Questions and Answers For Beginners
Ready to tackle actual shell scripting coding interview questions? These hands-on problems test whether you can write working scripts, not just talk about concepts. We’ll cover the classic coding challenges that beginners face in interviews - the ones where you need to write real bash code to solve practical problems. Each solution includes the full script with explanations, so you can understand the logic and adapt it for similar questions.
Q1: Write a script to check if a number is even or odd
This tests your understanding of arithmetic operations and conditionals:
bash
#!/bin/bash
echo "Enter a number: "
read num

# Check if the input is actually a number
if ! [[ "$num" =~ ^[0-9]+$ ]]; then
  echo "Please enter a valid number"
  exit 1
fi

# Use modulo operator to check even/odd
if [ $((num % 2)) -eq 0 ]; then
  echo "$num is even"
else
  echo "$num is odd"
fi
The key here is using the modulo operator (%) inside $(( )). If a number divided by 2 has no remainder, it’s even. The regex check prevents the script from breaking on non-numeric input.
Q2: Create a script that counts files in a directory by extension
This problem tests file handling and associative arrays:
bash
#!/bin/bash
# Check if directory argument provided
if [ $# -eq 0 ]; then
  echo "Usage: $0 <directory>"
  exit 1
fi

directory="$1"

# Check if directory exists
if [ ! -d "$directory" ]; then
  echo "Directory $directory doesn't exist"
  exit 1
fi

# Count files by extension
echo "File count by extension in $directory:"
for file in "$directory"/*; do
  # Skip if no files found
  [ -e "$file" ] || continue
  # Skip directories
  [ -f "$file" ] || continue
  # Extract extension
  extension="${file##*.}"
  # Handle files without extension
  if [ "$extension" = "$file" ]; then
    extension="no_extension"
  fi
  # Counting with an associative array would be ideal, but for
  # beginners, let's use a simpler approach
  echo "$extension"
done | sort | uniq -c | sort -rn
This solution uses parameter expansion (${file##*.}) to extract extensions and pipes the output through sort and uniq to count occurrences. It’s a common pattern in shell scripting - using Unix tools together.
Q3: Write a script to backup files modified in the last 7 days
This tests your knowledge of the find command and date handling:
bash
#!/bin/bash
# Set source and backup directories
source_dir="${1:-.}"   # Use current directory if not specified
backup_dir="backup_$(date +%Y%m%d)"

# Create backup directory
mkdir -p "$backup_dir"
echo "Starting backup of files modified in last 7 days..."

# Find and copy files modified in last 7 days
# (process substitution keeps $count in the parent shell)
count=0
while read -r file; do
  # Skip anything already inside the backup directory
  [[ "$file" == "$backup_dir"/* ]] && continue
  # Create subdirectory structure in backup
  rel_path="${file#$source_dir/}"
  backup_path="$backup_dir/$rel_path"
  backup_subdir="$(dirname "$backup_path")"
  mkdir -p "$backup_subdir"
  # Copy file preserving timestamps
  if cp -p "$file" "$backup_path" 2>/dev/null; then
    ((count++))
    echo "Backed up: $rel_path"
  fi
done < <(find "$source_dir" -type f -mtime -7)
echo "Backup complete! $count files copied to $backup_dir"
The script uses find with -mtime -7 to locate recent files. The while read loop processes filenames with spaces correctly. We preserve the directory structure in the backup and use cp -p to keep timestamps.
Q4: Create a script that monitors disk usage and alerts when it exceeds 80%
This demonstrates system monitoring and parsing command output:
bash
#!/bin/bash
# Set threshold
threshold=80
echo "Checking disk usage..."
echo "========================"

# Get disk usage and process each line
df -h | grep -E '^/dev/' | while read -r line; do
  # Parse the df output
  device=$(echo "$line" | awk '{print $1}')
  usage=$(echo "$line" | awk '{print $5}' | sed 's/%//')
  mount=$(echo "$line" | awk '{print $6}')
  # Check if usage exceeds threshold
  if [ "$usage" -gt "$threshold" ]; then
    echo "WARNING: $mount ($device) is at $usage% capacity!"
    # You could add email notification here:
    # echo "Disk $mount is at $usage%" | mail -s "Disk Alert" admin@example.com
  else
    echo "OK: $mount ($device) is at $usage% capacity"
  fi
done

# Also check if any partition is critically full (>90%)
critical=$(df -h | grep -E '^/dev/' | awk '{print $5}' | sed 's/%//' | sort -n | tail -1)
if [ "$critical" -gt 90 ]; then
  echo ""
  echo "CRITICAL: At least one partition is over 90% full!"
  exit 1
fi
This script parses df output using awk to extract specific fields. The percentage sign is stripped with sed so we can do numeric comparison. Real-world scripts might send emails or write to log files.
Q5: Write a script that validates and processes a CSV file
This tests file parsing and error handling:
bash
#!/bin/bash
# Check if file provided
if [ $# -eq 0 ]; then
  echo "Usage: $0 <csv_file>"
  exit 1
fi

csv_file="$1"

# Check if file exists and is readable
if [ ! -r "$csv_file" ]; then
  echo "Error: Cannot read file $csv_file"
  exit 1
fi

echo "Processing CSV file: $csv_file"
echo "================================"

# Process header
header=$(head -1 "$csv_file")
num_fields=$(echo "$header" | awk -F',' '{print NF}')
echo "Header fields ($num_fields): $header"
echo ""

# Process data lines
# (process substitution keeps the counters in the parent shell)
line_num=1
error_count=0
valid_count=0
while IFS= read -r line; do
  ((line_num++))
  # Check if line has correct number of fields
  current_fields=$(echo "$line" | awk -F',' '{print NF}')
  if [ "$current_fields" -ne "$num_fields" ]; then
    echo "Error on line $line_num: Expected $num_fields fields, got $current_fields"
    ((error_count++))
    continue
  fi
  # Split the line into fields
  IFS=',' read -r field1 field2 field3 remainder <<< "$line"
  # Basic validation - check that required fields are not empty
  if [ -z "$field1" ] || [ -z "$field2" ] || [ -z "$field3" ]; then
    echo "Error on line $line_num: Empty required fields"
    ((error_count++))
    continue
  fi
  # Process valid line (example: just echo it)
  echo "Processing: $field1 | $field2 | $field3"
  ((valid_count++))
done < <(tail -n +2 "$csv_file")

echo ""
echo "Summary: Processed $valid_count valid lines, found $error_count errors"
This script shows proper CSV handling using IFS (Internal Field Separator) and read. It validates the number of fields per line and checks for empty values. The tail -n +2 skips the header for processing.
Shell Scripting Interview Questions and Answers For Intermediate (2-4 years Exp)
Moving beyond the basics, intermediate shell scripting coding interview questions focus on real-world problem solving, performance optimization, and handling complex scenarios you’ve likely encountered in production. We’ll explore questions that test your experience with error handling, process management, and writing maintainable scripts that other developers can understand and modify. These questions reflect the challenges you face when scripts move from personal tools to team resources that need to be reliable and efficient.
Q1: How do you handle errors gracefully in production scripts?
At this level, you need multiple strategies working together. I use trap to catch errors and cleanup, set -euo pipefail to stop on failures, and custom error functions:
#!/bin/bash
set -euo pipefail

# Global error handler
error_exit() {
  echo "Error on line $1: $2" >&2
  cleanup
  exit 1
}

cleanup() {
  # Remove temp files, unlock resources, etc.
  rm -f /tmp/mylockfile
  echo "Cleanup completed"
}

# Set traps for errors and exit
trap 'error_exit $LINENO "Command failed"' ERR
trap cleanup EXIT

# Your actual script logic here

The key is combining these techniques - set -e stops on errors, -u catches undefined variables, -o pipefail catches errors in pipes, and trap ensures cleanup happens no matter what.
Q2: Explain process substitution and when you’d use it
Process substitution <(command) creates a temporary file descriptor that acts like a file. It’s brilliant for comparing outputs or feeding multiple processes:
# Compare two command outputs
diff <(ls dir1) <(ls dir2)

# Feed sorted data to a command expecting a file
join <(sort file1) <(sort file2)

# Multiple inputs to a single command
paste <(cut -d: -f1 /etc/passwd) <(cut -d: -f3 /etc/passwd)
I use it when I need to avoid temporary files or when working with commands that only accept file arguments but I want to give them dynamic data.
Q3: How do you implement proper logging in shell scripts?
Production scripts need structured logging with timestamps, levels, and rotation. Here’s my go-to pattern:
LOG_FILE="/var/log/myscript.log"
LOG_LEVEL=${LOG_LEVEL:-"INFO"}   # Can override via environment

log() {
  local level=$1
  shift
  local message="$*"
  local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
  # Check log level
  case $LOG_LEVEL in
    ERROR) [[ $level =~ ^(ERROR)$ ]] || return ;;
    WARN)  [[ $level =~ ^(ERROR|WARN)$ ]] || return ;;
    INFO)  [[ $level =~ ^(ERROR|WARN|INFO)$ ]] || return ;;
    DEBUG) ;;   # Log everything
  esac
  echo "[$timestamp] [$level] $message" | tee -a "$LOG_FILE"
}

log INFO "Script started"
log ERROR "Connection failed"
log DEBUG "Variable X = $x"
Don’t forget log rotation - either use logrotate or implement size checking in your script.
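A minimal sketch of the in-script size check (path and limit are illustrative; a single rotated copy is assumed):

```shell
#!/bin/bash
# Rotate the log when it exceeds a size limit; keeps one old copy.
LOG_FILE="${TMPDIR:-/tmp}/myscript.log"   # illustrative path
MAX_BYTES=$((1024 * 1024))                # 1 MB limit

rotate_if_needed() {
  [ -f "$LOG_FILE" ] || return 0
  local size
  size=$(wc -c < "$LOG_FILE")   # wc -c is portable; stat flags are not
  if [ "$size" -ge "$MAX_BYTES" ]; then
    mv "$LOG_FILE" "$LOG_FILE.1"
    : > "$LOG_FILE"             # start a fresh, empty log
  fi
}

rotate_if_needed
echo "message" >> "$LOG_FILE"
```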
Q4: How do you handle concurrent script execution?
Preventing race conditions is crucial. I typically use flock for reliable locking:
LOCK_FILE="/var/run/myscript.lock"
exec 200>"$LOCK_FILE"

if ! flock -n 200; then
  echo "Another instance is running. Exiting."
  exit 1
fi

# Alternative: wait for the lock with a timeout instead of failing fast
# if ! flock -w 10 200; then
#   echo "Could not acquire lock after 10 seconds"
#   exit 1
# fi

# Script continues here with exclusive lock

For more complex scenarios, I might use mkdir as an atomic operation or implement a PID file system that checks if the process is actually still running.
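A sketch of the mkdir-based approach mentioned above - mkdir either creates the directory or fails, atomically, so only one process can win (the lock path is illustrative):

```shell
#!/bin/bash
# mkdir-based lock: works even on filesystems without flock support.
LOCK_DIR="${TMPDIR:-/tmp}/myscript.lock.d"   # illustrative path

acquire_lock() {
  if mkdir "$LOCK_DIR" 2>/dev/null; then
    echo $$ > "$LOCK_DIR/pid"   # record owner for stale-lock inspection
    return 0
  fi
  return 1
}

release_lock() {
  rm -rf "$LOCK_DIR"
}

if acquire_lock; then
  echo "Lock acquired by PID $$"
  # ... critical section ...
  release_lock
else
  echo "Another instance holds the lock"
fi
```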
Q5: What’s your approach to parsing complex command line arguments?
For production scripts, I use getopts for short options and manual parsing for long options:
usage() {
  cat << EOF
Usage: $0 [-h] [-v] [-f FILE] [--debug] [--config CONFIG]

Options:
  -h, --help      Show this help
  -v, --verbose   Enable verbose mode
  -f FILE         Input file
  --debug         Enable debug mode
  --config FILE   Configuration file
EOF
}

# Parse arguments
VERBOSE=0
DEBUG=0

while [[ $# -gt 0 ]]; do
  case $1 in
    -h|--help)
      usage
      exit 0
      ;;
    -v|--verbose)
      VERBOSE=1
      shift
      ;;
    -f)
      INPUT_FILE="$2"
      shift 2
      ;;
    --debug)
      DEBUG=1
      set -x   # Enable bash debugging
      shift
      ;;
    --config)
      CONFIG_FILE="$2"
      shift 2
      ;;
    --)
      shift
      break
      ;;
    -*)
      echo "Unknown option: $1" >&2
      usage
      exit 1
      ;;
    *)
      break
      ;;
  esac
done

# Remaining arguments are in $@

Q6: How do you optimize scripts that process large files?
Never load entire files into memory. Instead, stream process them:
# Bad: word-splits the whole file and loads it into memory
for line in $(cat hugefile.txt); do
  process "$line"
done

# Good: streams line by line
while IFS= read -r line; do
  process "$line"
done < hugefile.txt

# Better: parallel processing for CPU-bound tasks
parallel -j 4 process {} < hugefile.txt

# For simple transformations, use tools designed for it
awk '{sum += $3} END {print sum}' hugefile.txt   # Sum third column
sed -i 's/old/new/g' hugefile.txt                # In-place replacement
Also consider splitting large files and processing chunks in parallel, or using tools like split, sort -S for memory limits, and join instead of loading everything into associative arrays.
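The split-and-process idea can be sketched like this; the input is generated sample data and the per-chunk job (a line count) stands in for real processing:

```shell
#!/bin/bash
# Split a big file into chunks, process each chunk in a background
# job, then combine the per-chunk results.
input=$(mktemp)
seq 1 25000 > "$input"

workdir=$(mktemp -d)
split -l 10000 "$input" "$workdir/chunk_"   # chunks of 10000 lines each

for chunk in "$workdir"/chunk_*; do
  wc -l < "$chunk" > "$chunk.out" &   # one background job per chunk
done
wait   # let all chunk jobs finish

results=$(cat "$workdir"/chunk_*.out)
echo "$results"   # line counts per chunk

rm -rf "$workdir"
rm -f "$input"
```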
Q7: Explain your strategy for making scripts portable across different systems
Portability is tricky. I start with POSIX compliance when possible, but pragmatically handle differences:
#!/usr/bin/env bash
# Using env is more portable than hard-coding /bin/bash
# (a comment must not share the shebang line - it would be passed to env)

# Detect OS
if [[ "$OSTYPE" == "linux-gnu"* ]]; then
  OS="linux"
elif [[ "$OSTYPE" == "darwin"* ]]; then
  OS="macos"
elif [[ "$OSTYPE" == "freebsd"* ]]; then
  OS="freebsd"
fi

# Handle command differences
if command -v gsed &> /dev/null; then
  SED=gsed   # GNU sed on Mac
else
  SED=sed
fi

# Check for required commands
for cmd in awk grep "$SED"; do
  if ! command -v "$cmd" &> /dev/null; then
    echo "Required command '$cmd' not found" >&2
    exit 1
  fi
done

# Use portable options
find . -type f -name "*.txt" -print0 | xargs -0 grep "pattern"   # Works everywhere
Q8: How do you implement timeout functionality for long-running commands?
The timeout command is great when available, but here’s a portable solution:
# Using the timeout command (if available)
if command -v timeout &> /dev/null; then
  timeout 30s long_running_command
else
  # Manual implementation
  long_running_command &
  pid=$!
  count=0
  while kill -0 $pid 2>/dev/null && [ $count -lt 30 ]; do
    sleep 1
    ((count++))
  done
  if kill -0 $pid 2>/dev/null; then
    echo "Command timed out after 30 seconds"
    kill -TERM $pid
    sleep 2
    kill -0 $pid 2>/dev/null && kill -KILL $pid
  fi
fi

# Alternative: using read with a timeout for user input
if read -t 10 -p "Enter value (10 sec timeout): " value; then
  echo "You entered: $value"
else
  echo "Timeout or cancelled"
fi
Q9: What’s your approach to debugging complex shell scripts?
Beyond set -x, I use structured debugging:
# Debug function with levels
DEBUG_LEVEL=${DEBUG_LEVEL:-0}

debug() {
  local level=$1
  shift
  if [ "$DEBUG_LEVEL" -ge "$level" ]; then
    echo "[DEBUG$level] $*" >&2
  fi
}

# Trace function execution
trace() {
  local func=$1
  shift
  debug 1 "Entering $func with args: $*"
  local start=$(date +%s.%N)
  "$func" "$@"
  local ret=$?
  local end=$(date +%s.%N)
  local duration=$(echo "$end - $start" | bc)
  debug 1 "Exiting $func (duration: ${duration}s, return: $ret)"
  return $ret
}

# PS4 for better trace output
export PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'

# Conditional debugging
[ "$DEBUG" = "1" ] && set -x
I also use ShellCheck religiously, and for really tough bugs, I’ll use bashdb or add extensive logging.
Q10: How do you manage configuration in shell scripts?
I layer configurations from multiple sources with clear precedence:
# Default configuration
declare -A CONFIG=(
  [db_host]="localhost"
  [db_port]="5432"
  [log_level]="INFO"
)

# Load system config
[ -f /etc/myapp/config ] && source /etc/myapp/config

# Load user config
[ -f ~/.myapp/config ] && source ~/.myapp/config

# Load project config
[ -f ./config ] && source ./config

# Environment variables override everything
[ -n "$MYAPP_DB_HOST" ] && CONFIG[db_host]=$MYAPP_DB_HOST
[ -n "$MYAPP_DB_PORT" ] && CONFIG[db_port]=$MYAPP_DB_PORT

# Validate required settings
for key in db_host db_port; do
  if [ -z "${CONFIG[$key]}" ]; then
    echo "Error: Missing required config: $key" >&2
    exit 1
  fi
done

# Export for child processes if needed
export MYAPP_CONFIG_DB_HOST="${CONFIG[db_host]}"
This gives flexibility while maintaining predictable behavior - defaults, system-wide settings, user preferences, project overrides, and finally environment variables.
Shell Scripting Coding Interview Questions and Answers For Intermediate (2-4 years Exp)
At the intermediate level, shell scripting coding interview questions shift from basic syntax to solving real production challenges - the kind where your script needs to handle thousands of files, recover from failures, and play nice with other systems. We’ll tackle the practical problems that someone with a few years under their belt should handle confidently, focusing on efficiency, reliability, and maintainability. These are the scenarios where your experience shows through in how you structure your solution, not just whether it works.
Q1: Write a script that monitors a log file in real-time and sends alerts for specific error patterns
This tests your ability to handle continuous file monitoring and pattern matching:
#!/bin/bash

# Configuration
LOG_FILE="${1:-/var/log/application.log}"
ALERT_EMAIL="admin@company.com"
ERROR_PATTERNS=("ERROR" "CRITICAL" "FATAL" "OutOfMemory")
ALERT_THRESHOLD=5   # Alert after 5 errors in 60 seconds
WINDOW_SIZE=60

# Check if log file exists
if [ ! -f "$LOG_FILE" ]; then
  echo "Error: Log file $LOG_FILE not found"
  exit 1
fi

# Initialize error tracking
declare -A error_counts
declare -A error_timestamps

# Function to send alert
send_alert() {
  local pattern=$1
  local count=$2
  local message="Alert: $count occurrences of '$pattern' in last $WINDOW_SIZE seconds"
  # In a real scenario, you'd send email or post to Slack
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] $message"
  # Example email command (commented out):
  # echo "$message" | mail -s "Log Alert: $pattern" "$ALERT_EMAIL"
  # Reset counter after alert
  error_counts[$pattern]=0
  error_timestamps[$pattern]=""
}

# Function to clean old timestamps
clean_old_timestamps() {
  local pattern=$1
  local current_time=$(date +%s)
  local new_timestamps=""
  # Keep only timestamps within the window
  for ts in ${error_timestamps[$pattern]}; do
    if [ $((current_time - ts)) -lt $WINDOW_SIZE ]; then
      new_timestamps="$new_timestamps $ts"
    fi
  done
  error_timestamps[$pattern]="$new_timestamps"
  error_counts[$pattern]=$(echo "$new_timestamps" | wc -w)
}

echo "Monitoring $LOG_FILE for error patterns..."
echo "Patterns: ${ERROR_PATTERNS[*]}"
echo "Alert threshold: $ALERT_THRESHOLD errors in $WINDOW_SIZE seconds"
echo "Press Ctrl+C to stop"

# Monitor log file
tail -F "$LOG_FILE" | while read -r line; do
  # Check each pattern
  for pattern in "${ERROR_PATTERNS[@]}"; do
    if [[ "$line" =~ $pattern ]]; then
      current_time=$(date +%s)
      # Add timestamp
      error_timestamps[$pattern]="${error_timestamps[$pattern]} $current_time"
      # Clean old timestamps and update count
      clean_old_timestamps "$pattern"
      # Check if threshold exceeded
      if [ "${error_counts[$pattern]}" -ge "$ALERT_THRESHOLD" ]; then
        send_alert "$pattern" "${error_counts[$pattern]}"
      fi
      # Debug output
      echo "[$(date '+%H:%M:%S')] Found: $pattern (count: ${error_counts[$pattern]})"
    fi
  done
done
The script uses tail -F (capital F) to follow log rotation, tracks errors within a time window, and only alerts when thresholds are exceeded. This prevents alert spam while catching real issues.
Q2: Create a script that performs parallel backup of multiple directories with progress tracking
This demonstrates process management and parallel execution:
#!/bin/bash

# Source directories to backup
SOURCE_DIRS=(
  "/home/user/documents"
  "/home/user/projects"
  "/var/www/html"
  "/etc"
)
BACKUP_ROOT="/backup/$(date +%Y%m%d_%H%M%S)"
MAX_PARALLEL=3
COMPRESSION="gzip"   # or "bzip2", "xz"

# Create backup directory
mkdir -p "$BACKUP_ROOT"

# File to track progress
PROGRESS_FILE="/tmp/backup_progress_$$"
rm -f "$PROGRESS_FILE"
touch "$PROGRESS_FILE"

# Function to backup a single directory
backup_directory() {
  local src_dir=$1
  local dir_name=$(basename "$src_dir")
  local backup_file="$BACKUP_ROOT/${dir_name}.tar.gz"
  local pid=$BASHPID   # $$ would report the parent's PID in a background job
  local pv_file="/tmp/pv_progress_$BASHPID"
  local start_time=$(date +%s)
  echo "$pid:$dir_name:0:STARTING" >> "$PROGRESS_FILE"

  # Calculate directory size for progress
  local total_size=$(du -sb "$src_dir" 2>/dev/null | awk '{print $1}')

  # Create backup with progress monitoring
  tar cf - "$src_dir" 2>/dev/null | \
    pv -s "$total_size" -n 2>"$pv_file" | \
    gzip > "$backup_file" &
  local tar_pid=$!

  # Monitor progress
  while kill -0 $tar_pid 2>/dev/null; do
    if [ -f "$pv_file" ]; then
      progress=$(tail -1 "$pv_file" 2>/dev/null || echo "0")
      echo "$pid:$dir_name:$progress:RUNNING" >> "$PROGRESS_FILE"
    fi
    sleep 1
  done

  # Check if backup succeeded
  wait $tar_pid
  local exit_code=$?
  local end_time=$(date +%s)
  local duration=$((end_time - start_time))
  if [ $exit_code -eq 0 ]; then
    local final_size=$(stat -f%z "$backup_file" 2>/dev/null || stat -c%s "$backup_file")
    echo "$pid:$dir_name:100:COMPLETED:$duration:$final_size" >> "$PROGRESS_FILE"
  else
    echo "$pid:$dir_name:0:FAILED:$duration:0" >> "$PROGRESS_FILE"
  fi
  rm -f "$pv_file"
}

# Function to display progress
show_progress() {
  while true; do
    clear
    echo "Backup Progress - $(date '+%Y-%m-%d %H:%M:%S')"
    echo "=================================================="
    # Read current progress
    while IFS=':' read -r pid dir progress status duration size; do
      case $status in
        STARTING)
          printf "%-30s [%s]\n" "$dir" "Initializing..."
          ;;
        RUNNING)
          # Create progress bar
          bar_length=30
          filled=$((progress * bar_length / 100))
          bar=$(printf '%*s' "$filled" | tr ' ' '=')
          empty=$((bar_length - filled))
          bar="${bar}$(printf '%*s' "$empty" | tr ' ' '-')"
          printf "%-30s [%s] %3d%%\n" "$dir" "$bar" "$progress"
          ;;
        COMPLETED)
          size_mb=$((size / 1024 / 1024))
          printf "%-30s [DONE] %dMB in %ds\n" "$dir" "$size_mb" "$duration"
          ;;
        FAILED)
          printf "%-30s [FAILED]\n" "$dir"
          ;;
      esac
    done < "$PROGRESS_FILE"
    # Check if all backups are done
    if ! grep -qE "RUNNING|STARTING" "$PROGRESS_FILE" 2>/dev/null; then
      echo ""
      echo "All backups completed!"
      break
    fi
    sleep 1
  done
}

# Start progress display in background
show_progress &
progress_pid=$!

# Run backups in parallel with job control
job_count=0
for dir in "${SOURCE_DIRS[@]}"; do
  # Wait if we've hit the parallel limit
  while [ $(jobs -r | wc -l) -ge $MAX_PARALLEL ]; do
    sleep 0.5
  done
  # Start backup job
  backup_directory "$dir" &
  ((job_count++))
done

# Wait for all backup jobs
wait

# Stop progress display
kill $progress_pid 2>/dev/null
show_progress   # Show final status

# Cleanup
rm -f "$PROGRESS_FILE"

# Generate summary report
echo ""
echo "Backup Summary"
echo "=============="
echo "Location: $BACKUP_ROOT"
echo "Total archives: $(ls -1 "$BACKUP_ROOT"/*.tar.gz 2>/dev/null | wc -l)"
echo "Total size: $(du -sh "$BACKUP_ROOT" | awk '{print $1}')"

# List all backups with sizes
ls -lh "$BACKUP_ROOT"/*.tar.gz 2>/dev/null
This script showcases parallel job management, real-time progress tracking with pv, and proper cleanup. It handles multiple directories simultaneously while keeping the user informed.
Q3: Write a script that synchronizes configuration files across multiple servers
This tests your understanding of remote execution and error handling:
#!/bin/bash
# Configuration
CONFIG_DIR="/etc/myapp"
SERVERS=("web01.example.com" "web02.example.com" "web03.example.com")
SSH_USER="deploy"
SSH_KEY="$HOME/.ssh/deploy_key"
BACKUP_BEFORE_SYNC=true
DRY_RUN=false

# Parse command line
while getopts "dnh" opt; do
    case $opt in
        d) DRY_RUN=true ;;
        n) BACKUP_BEFORE_SYNC=false ;;
        h) echo "Usage: $0 [-d] [-n] [-h]"
           echo "  -d  Dry run (show what would be done)"
           echo "  -n  No backup before sync"
           exit 0 ;;
    esac
done
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

# Logging function
log() {
    local level=$1
    shift
    local msg="$*"
    local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
    case $level in
        ERROR)   echo -e "${RED}[$timestamp] ERROR: $msg${NC}" >&2 ;;
        SUCCESS) echo -e "${GREEN}[$timestamp] SUCCESS: $msg${NC}" ;;
        INFO)    echo -e "[$timestamp] INFO: $msg" ;;
        WARN)    echo -e "${YELLOW}[$timestamp] WARN: $msg${NC}" ;;
    esac
    # Also log to file
    echo "[$timestamp] $level: $msg" >> "/var/log/config_sync.log"
}
# Function to check server connectivity
check_server() {
    local server=$1
    ssh -q -o ConnectTimeout=5 -o BatchMode=yes \
        -i "$SSH_KEY" "$SSH_USER@$server" exit 2>/dev/null
}

# Function to backup configs on remote server
backup_remote_configs() {
    local server=$1
    local backup_name="config_backup_$(date +%Y%m%d_%H%M%S).tar.gz"
    log INFO "Creating backup on $server"
    ssh -i "$SSH_KEY" "$SSH_USER@$server" "
        if [ -d '$CONFIG_DIR' ]; then
            sudo tar czf /tmp/$backup_name $CONFIG_DIR 2>/dev/null
            sudo mv /tmp/$backup_name /var/backups/
            echo 'Backup created: /var/backups/$backup_name'
        else
            echo 'Config directory not found, skipping backup'
        fi
    "
}
# Function to sync a single file
sync_file() {
    local server=$1
    local file=$2
    local rel_path=${file#$CONFIG_DIR/}

    # Compare checksums (escape \$1 so awk runs remotely, not locally)
    local local_sum=$(md5sum "$file" | awk '{print $1}')
    local remote_sum=$(ssh -i "$SSH_KEY" "$SSH_USER@$server" \
        "sudo md5sum '$file' 2>/dev/null | awk '{print \$1}'")

    if [ "$local_sum" = "$remote_sum" ]; then
        echo "  ✓ $rel_path (unchanged)"
        return 0
    fi

    if [ "$DRY_RUN" = true ]; then
        echo "  → Would sync: $rel_path"
        return 0
    fi

    # Copy file to a staging location
    if scp -i "$SSH_KEY" "$file" "$SSH_USER@$server:/tmp/$(basename "$file")" >/dev/null 2>&1; then
        # Move to final location with sudo
        if ssh -i "$SSH_KEY" "$SSH_USER@$server" "
            sudo mkdir -p $(dirname "$file")
            sudo mv /tmp/$(basename "$file") $file
            sudo chown root:root $file
            sudo chmod 644 $file
        "; then
            echo "  ✓ $rel_path (updated)"
            return 0
        fi
    fi

    echo "  ✗ $rel_path (failed)"
    return 1
}
# Main sync process
log INFO "Starting configuration sync"
[ "$DRY_RUN" = true ] && log WARN "Running in DRY RUN mode"

# Check all servers first
log INFO "Checking server connectivity..."
available_servers=()
for server in "${SERVERS[@]}"; do
    if check_server "$server"; then
        available_servers+=("$server")
        echo "  ✓ $server"
    else
        log ERROR "Cannot connect to $server"
        echo "  ✗ $server"
    fi
done

if [ ${#available_servers[@]} -eq 0 ]; then
    log ERROR "No servers available for sync"
    exit 1
fi

# Find all config files (the -name tests must be grouped for correct precedence)
config_files=$(find "$CONFIG_DIR" -type f \( -name "*.conf" -o -name "*.yml" -o -name "*.json" \))
file_count=$(echo "$config_files" | wc -l)
log INFO "Found $file_count configuration files to sync"
# Sync to each server
for server in "${available_servers[@]}"; do
    log INFO "Syncing to $server"

    # Backup if requested
    if [ "$BACKUP_BEFORE_SYNC" = true ] && [ "$DRY_RUN" = false ]; then
        backup_remote_configs "$server"
    fi

    # Sync each file
    success_count=0
    fail_count=0
    while IFS= read -r file; do
        if sync_file "$server" "$file"; then
            ((success_count++))
        else
            ((fail_count++))
        fi
    done <<< "$config_files"

    # Reload services if sync succeeded
    if [ $fail_count -eq 0 ] && [ "$DRY_RUN" = false ]; then
        log INFO "Reloading services on $server"
        ssh -i "$SSH_KEY" "$SSH_USER@$server" "
            sudo systemctl reload nginx 2>/dev/null || true
            sudo systemctl reload myapp 2>/dev/null || true
        "
    fi

    # Report for this server
    if [ $fail_count -eq 0 ]; then
        log SUCCESS "$server: $success_count files synced successfully"
    else
        log ERROR "$server: $fail_count files failed (${success_count} succeeded)"
    fi
done

log INFO "Configuration sync completed"
This script handles multiple servers, does checksums to avoid unnecessary transfers, creates backups, and includes proper error handling with dry-run support.
Q4: Create a script that analyzes system performance and generates an HTML report
This shows data collection, processing, and formatted output:
#!/bin/bash
# Output file
REPORT_FILE="system_report_$(hostname)_$(date +%Y%m%d_%H%M%S).html"
TEMP_DIR="/tmp/sysreport_$$"
mkdir -p "$TEMP_DIR"

# Thresholds for warnings
CPU_THRESHOLD=80
MEM_THRESHOLD=85
DISK_THRESHOLD=90

# Function to get CSS color based on value
get_color() {
    local value=$1
    local threshold=$2
    if [ $(echo "$value >= $threshold" | bc) -eq 1 ]; then
        echo "#ff4444"  # Red
    elif [ $(echo "$value >= $threshold * 0.8" | bc) -eq 1 ]; then
        echo "#ff9944"  # Orange
    else
        echo "#44ff44"  # Green
    fi
}
# Collect system information
echo "Collecting system information..."

# Basic info
HOSTNAME=$(hostname)
UPTIME=$(uptime -p)
KERNEL=$(uname -r)
CPU_MODEL=$(grep "model name" /proc/cpuinfo | head -1 | cut -d: -f2)
CPU_CORES=$(nproc)
TOTAL_MEM=$(free -h | awk '/^Mem:/ {print $2}')

# CPU usage (sampled over 5 seconds)
echo "Analyzing CPU usage..."
sar -u 1 5 > "$TEMP_DIR/cpu_stats.txt" 2>/dev/null || {
    # Fallback if sar is not available
    top -b -n 2 -d 5 | grep "Cpu(s)" | tail -1 > "$TEMP_DIR/cpu_stats.txt"
}
# awk exits 0 even with no match, so test for empty output instead of chaining with ||
CPU_USAGE=$(awk '/Average:/ {print 100 - $NF}' "$TEMP_DIR/cpu_stats.txt")
[ -z "$CPU_USAGE" ] && CPU_USAGE=$(grep -o '[0-9.]* id' "$TEMP_DIR/cpu_stats.txt" | awk '{print 100 - $1}')
CPU_USAGE=${CPU_USAGE:-0}

# Memory usage
MEM_STATS=$(free -m | awk '/^Mem:/ {printf "%.1f", ($3/$2) * 100}')
# Disk usage
echo "Analyzing disk usage..."
DISK_DATA=""
while read -r line; do
    device=$(echo "$line" | awk '{print $1}')
    mount=$(echo "$line" | awk '{print $6}')
    usage=$(echo "$line" | awk '{print $5}' | tr -d '%')
    size=$(echo "$line" | awk '{print $2}')
    used=$(echo "$line" | awk '{print $3}')
    avail=$(echo "$line" | awk '{print $4}')
    color=$(get_color "$usage" "$DISK_THRESHOLD")
    DISK_DATA="${DISK_DATA}
<tr>
<td>$device</td>
<td>$mount</td>
<td>$size</td>
<td>$used</td>
<td>$avail</td>
<td style='color: $color; font-weight: bold;'>$usage%</td>
</tr>"
done < <(df -h | grep '^/dev/' | grep -v '/dev/loop')

# Top processes
echo "Analyzing processes..."
TOP_CPU=$(ps aux --sort=-%cpu | head -6 | tail -5)
TOP_MEM=$(ps aux --sort=-%mem | head -6 | tail -5)

# Network statistics
NETSTAT=$(ss -s 2>/dev/null | grep -E "TCP:|UDP:" | head -4)

# Recent system logs (errors/warnings)
RECENT_ERRORS=$(journalctl -p err -n 10 --no-pager 2>/dev/null || \
    dmesg | grep -iE "error|fail|warn" | tail -10)
# Generate HTML report (quoted 'EOF' keeps the static header literal)
cat > "$REPORT_FILE" << 'EOF'
<!DOCTYPE html>
<html>
<head>
<title>System Performance Report</title>
<meta charset="utf-8">
<style>
body {
font-family: Arial, sans-serif;
margin: 20px;
background-color: #f5f5f5;
}
.container {
max-width: 1200px;
margin: 0 auto;
background-color: white;
padding: 20px;
box-shadow: 0 0 10px rgba(0,0,0,0.1);
}
h1, h2 {
color: #333;
border-bottom: 2px solid #4CAF50;
padding-bottom: 10px;
}
.info-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
gap: 20px;
margin: 20px 0;
}
.info-box {
background-color: #f9f9f9;
padding: 15px;
border-radius: 5px;
border-left: 4px solid #4CAF50;
}
table {
width: 100%;
border-collapse: collapse;
margin: 20px 0;
}
th, td {
text-align: left;
padding: 12px;
border-bottom: 1px solid #ddd;
}
th {
background-color: #4CAF50;
color: white;
}
tr:hover {
background-color: #f5f5f5;
}
.metric {
font-size: 24px;
font-weight: bold;
margin: 10px 0;
}
.warning {
background-color: #fff3cd;
border-color: #ffeaa7;
color: #856404;
padding: 10px;
border-radius: 5px;
margin: 10px 0;
}
.timestamp {
text-align: right;
color: #666;
font-style: italic;
}
pre {
background-color: #f4f4f4;
padding: 10px;
border-radius: 5px;
overflow-x: auto;
}
</style>
</head>
<body>
<div class="container">
EOF
# Add content (unquoted EOF so variables expand)
cat >> "$REPORT_FILE" << EOF
<h1>System Performance Report - $HOSTNAME</h1>
<p class="timestamp">Generated on $(date '+%Y-%m-%d %H:%M:%S')</p>
<div class="info-grid">
<div class="info-box">
<strong>Hostname:</strong> $HOSTNAME<br>
<strong>Uptime:</strong> $UPTIME<br>
<strong>Kernel:</strong> $KERNEL
</div>
<div class="info-box">
<strong>CPU Model:</strong> $CPU_MODEL<br>
<strong>CPU Cores:</strong> $CPU_CORES<br>
<strong>Total Memory:</strong> $TOTAL_MEM
</div>
<div class="info-box">
<strong>CPU Usage:</strong>
<div class="metric" style="color: $(get_color $CPU_USAGE $CPU_THRESHOLD)">
${CPU_USAGE}%
</div>
</div>
<div class="info-box">
<strong>Memory Usage:</strong>
<div class="metric" style="color: $(get_color $MEM_STATS $MEM_THRESHOLD)">
${MEM_STATS}%
</div>
</div>
</div>
EOF
# Add warnings if thresholds exceeded
if [ $(echo "$CPU_USAGE >= $CPU_THRESHOLD" | bc) -eq 1 ]; then
    echo '<div class="warning">⚠️ High CPU usage detected!</div>' >> "$REPORT_FILE"
fi
if [ $(echo "$MEM_STATS >= $MEM_THRESHOLD" | bc) -eq 1 ]; then
    echo '<div class="warning">⚠️ High memory usage detected!</div>' >> "$REPORT_FILE"
fi
# Add disk usage table and remaining sections
cat >> "$REPORT_FILE" << EOF
<h2>Disk Usage</h2>
<table>
<tr>
<th>Device</th>
<th>Mount Point</th>
<th>Size</th>
<th>Used</th>
<th>Available</th>
<th>Usage %</th>
</tr>
$DISK_DATA
</table>
<h2>Top CPU Consuming Processes</h2>
<pre>$(echo "$TOP_CPU" | awk '{printf "%-8s %-8s %6s %s\n", $1, $2, $9, $11}')</pre>
<h2>Top Memory Consuming Processes</h2>
<pre>$(echo "$TOP_MEM" | awk '{printf "%-8s %-8s %6s %s\n", $1, $2, $10, $11}')</pre>
<h2>Network Statistics</h2>
<pre>$NETSTAT</pre>
<h2>Recent System Errors/Warnings</h2>
<pre>$(echo "$RECENT_ERRORS" | head -20)</pre>
</div>
</body>
</html>
EOF
# Cleanup
rm -rf "$TEMP_DIR"
echo "Report generated: $REPORT_FILE"

# Optional: open in browser
if command -v xdg-open >/dev/null 2>&1; then
    xdg-open "$REPORT_FILE"
elif command -v open >/dev/null 2>&1; then
    open "$REPORT_FILE"
fi
This creates a professional-looking HTML report with color-coded metrics, responsive design, and comprehensive system analysis.
Q5: Write a script that implements a job queue system with worker processes
This demonstrates advanced process control and inter-process communication:
#!/bin/bash
# Configuration
QUEUE_DIR="/var/spool/job_queue"
WORKERS=4
PID_FILE="/var/run/job_queue.pid"
LOG_FILE="/var/log/job_queue.log"
WORKER_TIMEOUT=300  # 5 minutes max per job

# Create necessary directories
mkdir -p "$QUEUE_DIR"/{pending,processing,completed,failed}

# Logging function
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}
# Signal handlers
cleanup() {
    log "Shutting down job queue system..."
    # Signal all workers to stop
    touch "$QUEUE_DIR/.shutdown"
    # Wait for workers to finish current jobs
    for pid in $(jobs -p); do
        log "Waiting for worker $pid to finish..."
        wait $pid 2>/dev/null
    done
    rm -f "$PID_FILE" "$QUEUE_DIR/.shutdown"
    log "Shutdown complete"
    exit 0
}
trap cleanup SIGINT SIGTERM
# Function to add a job to the queue
add_job() {
    local job_type=$1
    local job_data=$2
    local priority=${3:-5}  # Default priority 5 (1=highest, 9=lowest)
    local job_id=$(date +%s%N)_$$
    local job_file="$QUEUE_DIR/pending/${priority}_${job_id}.job"
    cat > "$job_file" << EOF
JOB_ID=$job_id
JOB_TYPE=$job_type
JOB_DATA=$job_data
SUBMIT_TIME=$(date +%s)
SUBMIT_USER=$USER
STATUS=pending
EOF
    log "Job added: $job_id (type: $job_type, priority: $priority)"
    echo "$job_id"
}
# Worker function
worker_process() {
    local worker_id=$1
    log "Worker $worker_id started (PID: $$)"

    while [ ! -f "$QUEUE_DIR/.shutdown" ]; do
        # Find next job (file names sort by priority, then age)
        local job_file=$(ls -1 "$QUEUE_DIR/pending/"*.job 2>/dev/null | sort | head -1)
        if [ -z "$job_file" ]; then
            # No jobs, wait a bit
            sleep 2
            continue
        fi

        # Try to claim the job (mv is atomic within a filesystem)
        local processing_file="$QUEUE_DIR/processing/$(basename "$job_file")"
        if ! mv "$job_file" "$processing_file" 2>/dev/null; then
            # Another worker got it first
            continue
        fi

        # Load job details
        source "$processing_file"
        log "Worker $worker_id processing job $JOB_ID (type: $JOB_TYPE)"

        # Update job status
        echo "WORKER=$worker_id" >> "$processing_file"
        echo "START_TIME=$(date +%s)" >> "$processing_file"
        echo "STATUS=processing" >> "$processing_file"

        # Execute job with timeout, capturing output
        local job_output="$QUEUE_DIR/processing/${JOB_ID}.out"
        local job_result=0
        case "$JOB_TYPE" in
            compress)
                timeout $WORKER_TIMEOUT tar czf "${JOB_DATA}.tar.gz" "$JOB_DATA" \
                    > "$job_output" 2>&1
                job_result=$?
                ;;
            backup)
                timeout $WORKER_TIMEOUT rsync -av "$JOB_DATA" "/backup/$JOB_DATA" \
                    > "$job_output" 2>&1
                job_result=$?
                ;;
            report)
                timeout $WORKER_TIMEOUT /usr/local/bin/generate_report.sh "$JOB_DATA" \
                    > "$job_output" 2>&1
                job_result=$?
                ;;
            custom)
                # Execute custom command (careful with security!)
                timeout $WORKER_TIMEOUT bash -c "$JOB_DATA" \
                    > "$job_output" 2>&1
                job_result=$?
                ;;
            *)
                echo "Unknown job type: $JOB_TYPE" > "$job_output"
                job_result=1
                ;;
        esac

        # Move to completed or failed
        local end_time=$(date +%s)
        local duration=$((end_time - START_TIME))
        echo "END_TIME=$end_time" >> "$processing_file"
        echo "DURATION=$duration" >> "$processing_file"
        echo "EXIT_CODE=$job_result" >> "$processing_file"

        if [ $job_result -eq 0 ]; then
            echo "STATUS=completed" >> "$processing_file"
            mv "$processing_file" "$QUEUE_DIR/completed/"
            mv "$job_output" "$QUEUE_DIR/completed/"
            log "Worker $worker_id completed job $JOB_ID in ${duration}s"
        else
            echo "STATUS=failed" >> "$processing_file"
            mv "$processing_file" "$QUEUE_DIR/failed/"
            mv "$job_output" "$QUEUE_DIR/failed/"
            log "Worker $worker_id: job $JOB_ID failed (exit code: $job_result)"
        fi
    done

    log "Worker $worker_id stopped"
}
# Status monitoring function
monitor_status() {
    while [ ! -f "$QUEUE_DIR/.shutdown" ]; do
        clear
        echo "Job Queue Status - $(date)"
        echo "================================"

        # Count jobs by status
        local pending=$(ls -1 "$QUEUE_DIR/pending/"*.job 2>/dev/null | wc -l)
        local processing=$(ls -1 "$QUEUE_DIR/processing/"*.job 2>/dev/null | wc -l)
        local completed=$(ls -1 "$QUEUE_DIR/completed/"*.job 2>/dev/null | wc -l)
        local failed=$(ls -1 "$QUEUE_DIR/failed/"*.job 2>/dev/null | wc -l)
        echo "Pending:    $pending"
        echo "Processing: $processing"
        echo "Completed:  $completed"
        echo "Failed:     $failed"
        echo ""

        # Show current processing jobs
        if [ $processing -gt 0 ]; then
            echo "Currently Processing:"
            echo "--------------------"
            for job in "$QUEUE_DIR/processing/"*.job; do
                [ -f "$job" ] || continue
                source "$job"
                local runtime=$(($(date +%s) - START_TIME))
                printf "  Job %s (Worker %d): %s - %ds\n" \
                    "$JOB_ID" "$WORKER" "$JOB_TYPE" "$runtime"
            done
            echo ""
        fi

        # Show worker status
        echo "Workers:"
        echo "--------"
        for i in $(seq 1 $WORKERS); do
            if kill -0 ${WORKER_PIDS[$i]} 2>/dev/null; then
                echo "  Worker $i: Running (PID: ${WORKER_PIDS[$i]})"
            else
                echo "  Worker $i: Stopped"
            fi
        done
        sleep 5
    done
}
# Main execution
if [ "$1" = "add" ]; then
    # Add job mode
    shift
    add_job "$@"
    exit 0
fi

if [ "$1" = "status" ]; then
    # Status mode
    monitor_status
    exit 0
fi

# Start queue manager
log "Starting job queue system with $WORKERS workers"
echo $$ > "$PID_FILE"

# Start workers
declare -a WORKER_PIDS
for i in $(seq 1 $WORKERS); do
    worker_process $i &
    WORKER_PIDS[$i]=$!
done

# Wait for all workers
log "Job queue system running. Press Ctrl+C to stop."
wait
This implements a complete job queue with priority handling, multiple workers, atomic job claiming, timeout protection, and comprehensive logging. It’s the kind of system you’d build to handle background tasks in production.
Shell Scripting Interview Questions and Answers For Experienced (5+ years Exp)
At the senior level, advanced shell scripting interview questions dig deep into architectural decisions, performance at scale, and the wisdom that comes from maintaining production systems through their entire lifecycle. These questions explore how you handle the messy realities of enterprise environments - legacy system integration, cross-platform compatibility nightmares, and scripts that need to run reliably for years. The focus shifts from “can you write it” to “have you lived through the consequences of different approaches” and whether you can design solutions that other engineers can maintain long after you’ve moved on.
Q1: How do you design shell scripts for high-availability environments where failure isn’t an option?
After years of 3 AM calls, I’ve learned that HA scripts need multiple layers of protection. First, I implement circuit breakers - if a script fails repeatedly, it stops trying and alerts instead of hammering a broken system. I use distributed locking across nodes (often with Redis or etcd) to prevent split-brain scenarios. Here’s my approach:
Health checks before actions: Never assume a service is up. Always verify endpoints are responding correctly before sending traffic.
Idempotency everywhere: Every operation should be safe to run multiple times. I use checksums, version checks, and state files to ensure this.
Graceful degradation: If a non-critical component fails, the script continues with reduced functionality rather than failing completely.
Audit trails: Every action gets logged with who/what/when/why, often shipped to a central logging system for correlation.
The key insight I’ve gained is that HA isn’t about preventing failures - it’s about failing gracefully and recovering automatically. I’ve seen too many “clever” scripts that made things worse during outages.
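The circuit-breaker idea is easy to sketch in a few lines. This is a minimal single-node illustration (the `FAIL_FILE` path and `MAX_FAILS` threshold are made up for the example; a real HA setup would back the counter with Redis or etcd as described above):

```shell
#!/bin/bash
# Circuit breaker sketch: stop retrying after repeated failures.
# FAIL_FILE and MAX_FAILS are hypothetical names chosen for this example.
FAIL_FILE="${TMPDIR:-/tmp}/healthcheck.fails"
MAX_FAILS=3

run_guarded() {
    local fails
    fails=$(cat "$FAIL_FILE" 2>/dev/null || echo 0)
    if [ "$fails" -ge "$MAX_FAILS" ]; then
        echo "circuit open: skipping call, alerting instead" >&2
        return 2
    fi
    if "$@"; then
        echo 0 > "$FAIL_FILE"              # success resets the counter
    else
        echo $((fails + 1)) > "$FAIL_FILE" # failure increments it
        return 1
    fi
}

rm -f "$FAIL_FILE"
run_guarded false; run_guarded false; run_guarded false  # three failures trip the breaker
run_guarded true && echo "ran" || echo "blocked"         # circuit is now open
```

Once the breaker trips, even a command that would succeed is skipped until something (a human, or a timed reset) clears the counter.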
Q2: Explain your approach to handling sensitive data in shell scripts
This is where experience really shows. Never, ever put secrets in scripts. I’ve cleaned up too many messes where passwords were in version control. My approach:
Environment variables for local development only, never in production
Secrets management tools like HashiCorp Vault, AWS Secrets Manager, or Kubernetes secrets
Temporary credentials with short TTLs whenever possible
Audit logging for every secret access
I also use set +x before any line that might expose secrets in debug output, and I’m paranoid about temporary files - always created with mktemp and proper permissions, cleaned up in trap handlers. I’ve seen scripts that wrote database passwords to /tmp with world-readable permissions. That’s a career-limiting move.
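A minimal sketch of the temp-file hygiene described above: `umask 077` plus `mktemp` for owner-only permissions, and a `trap` so the file never outlives the script:

```shell
#!/bin/bash
# Sketch: safe temp-file handling for sensitive data.
# mktemp creates the file mode 0600; the trap guarantees cleanup on exit or signal.
set -euo pipefail
umask 077                               # new files readable by owner only
secret_file=$(mktemp)                   # created 0600 under $TMPDIR
trap 'rm -f "$secret_file"' EXIT INT TERM

# ... fetch a credential into the file (placeholder value here) ...
printf '%s' "not-a-real-secret" > "$secret_file"

ls -l "$secret_file" | cut -c1-10       # shows -rw-------
```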
Q3: How do you optimize shell scripts that process millions of records?
The biggest lesson I’ve learned: the shell isn’t always the answer. But when it is, here’s what works:
Streaming over loading: Never load large datasets into memory. Use pipes and process line by line.
GNU Parallel for CPU-bound tasks. It’s incredible how much faster things run with proper parallelization.
Sort/join over nested loops: Unix tools are optimized for this. A sort | join pipeline beats nested loops every time.
Minimize process spawning: That for loop calling sed 1000 times? Rewrite it as a single awk script.
Real example: I once replaced a 6-hour customer data processing script with a sort | join | awk pipeline that ran in 12 minutes. The original developer was reading a 2GB file into an associative array. Sometimes the old Unix philosophy of small, focused tools really shines.
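The `sort | join` pattern is worth internalizing. A toy sketch with two tiny files standing in for the multi-gigabyte inputs:

```shell
#!/bin/bash
# Sketch: replacing a nested lookup loop with sort + join.
# Tiny whitespace-delimited files stand in for the real datasets.
set -euo pipefail
dir=$(mktemp -d); trap 'rm -rf "$dir"' EXIT

printf '%s\n' "101 alice" "102 bob" "103 carol" > "$dir/customers"
printf '%s\n' "102 49.90" "103 12.00"           > "$dir/orders"

# join requires both inputs sorted on the join field
sort "$dir/customers" -o "$dir/customers"
sort "$dir/orders"    -o "$dir/orders"

# One streaming pass instead of an O(n*m) nested loop
join "$dir/customers" "$dir/orders"
```

The same shape scales to millions of lines because `sort` spills to disk and `join` streams both inputs once.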
Q4: Describe your most complex debugging experience with shell scripts
The worst one that comes to mind involved a script that worked perfectly for 3 years, then started randomly failing on Tuesdays. Turned out a log rotation job was creating a race condition, but only when the log file exceeded 2GB, which only happened on our busiest day.
My debugging toolkit has evolved from painful experiences:
strace/dtrace to see actual system calls
bash -x is just the starting point; I often add custom debug functions that can be toggled
Reproducing environments exactly - same shell version, same locale settings, same everything
Binary search debugging - comment out half the script, see if it still fails
The real skill is knowing when to stop debugging and rewrite. I’ve learned that if I’m spending more than a day debugging a complex script, it’s probably too complex and needs to be simplified or rewritten in a proper programming language.
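The toggleable debug function mentioned above can be as simple as this (the `DEBUG` variable name is just a convention, not a standard):

```shell
#!/bin/bash
# Sketch: a debug logger you can toggle per run instead of editing the script.
# Usage: DEBUG=1 ./script.sh
debug() {
    [ "${DEBUG:-0}" = "1" ] && echo "DEBUG[$(date +%H:%M:%S)]: $*" >&2
    return 0   # keep the test's status from tripping set -e callers
}

debug "this prints only when DEBUG=1"
echo "normal output"
```

Debug lines go to stderr so they never contaminate output that other scripts consume.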
Q5: How do you handle shell script deployment and versioning across hundreds of servers?
Configuration management is crucial here. I’ve used Puppet, Ansible, and Chef, but the principles remain the same:
Immutable deployments: Scripts are versioned packages, not edited in place
Canary deployments: Roll out to 1%, then 10%, then 50%, monitoring metrics at each stage
Feature flags: Even in shell scripts, I implement toggles for risky changes
Rollback procedures: Always have a quick way back. I version with symlinks for instant rollback
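The symlink-based rollback in the last point can be sketched like this (the `releases/` layout is illustrative, not a specific deployment standard):

```shell
#!/bin/bash
# Sketch: versioned releases behind a "current" symlink for instant rollback.
set -euo pipefail
base=$(mktemp -d); trap 'rm -rf "$base"' EXIT

mkdir -p "$base/releases/v1.0" "$base/releases/v1.1"
ln -sfn "$base/releases/v1.0" "$base/current"   # deploy v1.0

ln -sfn "$base/releases/v1.1" "$base/current"   # upgrade: one symlink swap
readlink "$base/current"

ln -sfn "$base/releases/v1.0" "$base/current"   # rollback is the same operation
readlink "$base/current"
```

The `-n` flag matters: without it, `ln` would create the new link *inside* the directory the old symlink points at.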
One hard lesson: never trust system package managers alone. I've been burned by different versions of bash, different coreutils implementations, even different versions of basic commands like date. Now I always include compatibility checks and sometimes ship specific binary versions with my scripts.
Q6: What’s your philosophy on when to use shell scripts versus a “real” programming language?
This is the question that separates experienced engineers from script kiddies. My rule of thumb:
Shell scripts are perfect for:
Gluing systems together
System administration tasks
Quick prototypes
Build and deployment automation
But I switch to Python/Go/Ruby when:
The script exceeds 200-300 lines
I need complex data structures
Error handling becomes more complex than the actual logic
I’m parsing anything more complex than simple delimited data
Performance is critical
I’ve maintained 2000-line bash scripts. It’s not fun. The maintenance cost grows exponentially with complexity. Now I’m quick to recognize when bash has served its purpose and it’s time to rewrite.
Q7: How do you ensure shell script security in a zero-trust environment?
Security has evolved way beyond just checking inputs. In zero-trust environments:
No permanent credentials: Everything uses temporary tokens with the minimum required scope
Mutual TLS for script-to-service communication
Signed scripts: We GPG sign critical scripts and verify signatures before execution
Minimal attack surface: Scripts run in containers or VMs with only required tools installed
Security scanning: All scripts go through static analysis tools looking for common vulnerabilities
I’ve also learned to be paranoid about seemingly innocent things. That curl command downloading a script from GitHub? What if someone compromises the repo? Now I verify checksums and use pinned versions for everything.
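A sketch of that checksum-pinning idea. Here the "downloaded" file and its pinned hash are generated locally so the example is self-contained; in practice the pinned hash comes from a trusted release page, not from the download itself:

```shell
#!/bin/bash
# Sketch: verify a fetched script against a pinned sha256 before executing it.
set -euo pipefail
dir=$(mktemp -d); trap 'rm -rf "$dir"' EXIT

printf 'echo hello from verified script\n' > "$dir/tool.sh"
pinned_sha256=$(sha256sum "$dir/tool.sh" | awk '{print $1}')  # normally hard-coded

# Later, after fetching the file: refuse to run on any mismatch
actual=$(sha256sum "$dir/tool.sh" | awk '{print $1}')
if [ "$actual" = "$pinned_sha256" ]; then
    bash "$dir/tool.sh"
else
    echo "checksum mismatch, refusing to execute" >&2
    exit 1
fi
```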
Q8: Describe your approach to making shell scripts cloud-agnostic
After migrating between AWS, GCP, and Azure multiple times, I’ve learned:
Abstract provider-specific calls: Create wrapper functions for cloud operations
Use cloud SDK tools carefully: They change. Always pin versions and have fallbacks
Metadata services are different: Each cloud has its own way. Abstract this early
Storage is never just storage: S3, GCS, and Azure Blob have subtle differences
Example approach:
cloud_provider_detect() {
    if curl -s -f -m 1 http://169.254.169.254/latest/meta-data/instance-id >/dev/null 2>&1; then
        echo "aws"
    elif curl -s -f -m 1 -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/id >/dev/null 2>&1; then
        echo "gcp"
    else
        echo "azure"  # or on-prem; needs more logic
    fi
}
The key is planning for portability from day one. Retrofitting cloud-agnostic behavior is painful.
Q9: How do you handle backwards compatibility when system tools get updated?
This is a constant battle. GNU vs BSD tools, different bash versions, changing command options. My strategies:
Feature detection over version detection: Don’t check if it’s bash 4.3, check if it supports associative arrays
Wrapper functions: Abstract system commands behind functions that handle differences
Comprehensive testing matrix: Test on minimum supported versions of everything
Document requirements explicitly: Every script starts with a requirements block
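Feature detection in practice can look like this sketch: probe for the capability itself instead of parsing $BASH_VERSION:

```shell
#!/bin/bash
# Sketch: feature detection over version detection.
# Probe for associative-array support directly; the probe fails cleanly on bash 3.
if declare -A _probe 2>/dev/null; then
    HAS_ASSOC_ARRAYS=1
else
    HAS_ASSOC_ARRAYS=0
fi
unset _probe

if [ "$HAS_ASSOC_ARRAYS" = "1" ]; then
    declare -A counts
    counts[ok]=1
    echo "using associative arrays"
else
    echo "falling back to a flat-file lookup"
fi
```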
I maintain compatibility libraries for common issues:
# Example: readlink portability
portable_readlink() {
    if readlink -f "$1" 2>/dev/null; then
        return
    elif command -v greadlink >/dev/null; then
        greadlink -f "$1"
    else
        # Fallback implementation
        python -c "import os; print(os.path.realpath('$1'))"
    fi
}
Q10: What are the most important lessons you’ve learned about shell scripting in production?
The scars teach the best lessons:
Observability beats cleverness: A simple script with great logging beats a clever script you can’t debug
Plan for failure from line 1: Not just error handling, but operational failure - what happens when your script runs during a datacenter failover?
Other people will maintain your code: Write for them, not to show off. Comment the “why”, not the “what”
Test the unhappy paths: Everyone tests success. Test what happens when that API returns 500, when the disk is full, when DNS is flaking
Know when to stop: Some problems shouldn’t be solved in bash. Recognizing this saves weeks of pain
The meta-lesson: shell scripting in production is 20% writing code and 80% thinking about what could go wrong. Every senior engineer has war stories about simple scripts causing major outages. The difference is we’ve learned to be paranoid in productive ways.
Shell Scripting Coding Interview Questions and Answers For Experienced (5+ years Exp)
Senior-level advanced shell scripting interview questions go beyond algorithms to test battle-hardened production experience - can you write code that survives server crashes, handles race conditions, and scales across distributed systems? These coding challenges reflect real scenarios you’ve probably debugged at 3 AM: service orchestration failures, data corruption recovery, and the kind of edge cases that only show up after months in production. The solutions here demonstrate not just working code, but the defensive programming and architectural thinking that comes from years of learning things the hard way.
Q1: Write a distributed lock manager for coordinating jobs across multiple servers
This solves the classic problem of preventing duplicate cron jobs in clustered environments:
#!/bin/bash
# Distributed lock implementation using a shared filesystem or Redis.
# Handles stale locks, network partitions, and crash recovery.

LOCK_DIR="/shared/locks"
REDIS_HOST="${REDIS_HOST:-localhost}"
REDIS_PORT="${REDIS_PORT:-6379}"
BACKEND="${LOCK_BACKEND:-filesystem}"  # or "redis"

# Configuration
readonly SCRIPT_NAME=$(basename "$0")
readonly HOSTNAME=$(hostname -f)
readonly PID=$$
readonly LOCK_TIMEOUT=${LOCK_TIMEOUT:-300}  # 5 minutes default
readonly LOCK_RETRY_INTERVAL=2
readonly STALE_LOCK_THRESHOLD=600  # 10 minutes

# Generate unique lock identifier
generate_lock_id() {
    echo "${HOSTNAME}-${PID}-$(date +%s%N)"
}
# Filesystem-based lock implementation
fs_acquire_lock() {
    local lock_name=$1
    local lock_file="$LOCK_DIR/$lock_name.lock"
    local lock_id=$(generate_lock_id)
    local temp_file=$(mktemp)

    # Write lock metadata
    cat > "$temp_file" << EOF
{
"holder": "$lock_id",
"hostname": "$HOSTNAME",
"pid": $PID,
"acquired": $(date +%s),
"timeout": $LOCK_TIMEOUT,
"command": "$0 $*"
}
EOF

    # Try atomic lock creation (hard link fails if the target exists)
    if ln "$temp_file" "$lock_file" 2>/dev/null; then
        rm -f "$temp_file"
        echo "$lock_id"
        return 0
    fi

    # Lock exists, check if stale
    if [ -f "$lock_file" ]; then
        local lock_age=$(($(date +%s) - $(stat -c %Y "$lock_file" 2>/dev/null || stat -f %m "$lock_file")))
        local lock_data=$(cat "$lock_file" 2>/dev/null)
        local lock_pid=$(echo "$lock_data" | grep -o '"pid": [0-9]*' | tr -dc '0-9')
        local lock_host=$(echo "$lock_data" | grep -o '"hostname": "[^"]*"' | cut -d'"' -f4)

        # Check if lock is stale
        if [ $lock_age -gt $STALE_LOCK_THRESHOLD ]; then
            echo "Removing stale lock (age: ${lock_age}s)" >&2
            rm -f "$lock_file"
            fs_acquire_lock "$lock_name"
            return $?
        fi

        # Check if process still exists (same host only)
        if [ "$lock_host" = "$HOSTNAME" ] && [ -n "$lock_pid" ]; then
            if ! kill -0 "$lock_pid" 2>/dev/null; then
                echo "Removing lock from dead process $lock_pid" >&2
                rm -f "$lock_file"
                fs_acquire_lock "$lock_name"
                return $?
            fi
        fi
    fi

    rm -f "$temp_file"
    return 1
}
fs_release_lock() {
    local lock_name=$1
    local lock_id=$2
    local lock_file="$LOCK_DIR/$lock_name.lock"

    # Verify we own the lock before releasing
    if [ -f "$lock_file" ]; then
        local current_holder=$(grep -o '"holder": "[^"]*"' "$lock_file" 2>/dev/null | cut -d'"' -f4)
        if [ "$current_holder" = "$lock_id" ]; then
            rm -f "$lock_file"
            return 0
        else
            echo "Cannot release lock owned by $current_holder" >&2
            return 1
        fi
    fi
    return 0
}
# Redis-based lock implementation
redis_acquire_lock() {
    local lock_name=$1
    local lock_id=$(generate_lock_id)
    local lock_key="dlock:$lock_name"

    # Try to acquire lock with NX (only if not exists) and EX (expiry)
    local result=$(redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" \
        SET "$lock_key" "$lock_id" NX EX "$LOCK_TIMEOUT" 2>/dev/null)

    if [ "$result" = "OK" ]; then
        # Store metadata separately
        redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" >/dev/null 2>&1 <<EOF
HSET "${lock_key}:meta" holder "$lock_id"
HSET "${lock_key}:meta" hostname "$HOSTNAME"
HSET "${lock_key}:meta" pid "$PID"
HSET "${lock_key}:meta" acquired "$(date +%s)"
EXPIRE "${lock_key}:meta" "$LOCK_TIMEOUT"
EOF
        echo "$lock_id"
        return 0
    fi

    # Check if lock is held by a dead process on the same host
    local lock_host=$(redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" \
        HGET "${lock_key}:meta" hostname 2>/dev/null)
    local lock_pid=$(redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" \
        HGET "${lock_key}:meta" pid 2>/dev/null)

    if [ "$lock_host" = "$HOSTNAME" ] && [ -n "$lock_pid" ]; then
        if ! kill -0 "$lock_pid" 2>/dev/null; then
            echo "Removing lock from dead process $lock_pid" >&2
            redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" DEL "$lock_key" "${lock_key}:meta" >/dev/null
            redis_acquire_lock "$lock_name"
            return $?
        fi
    fi
    return 1
}
redis_release_lock() {
    local lock_name=$1
    local lock_id=$2
    local lock_key="dlock:$lock_name"

    # Use a Lua script for an atomic check-and-delete: only the holder
    # whose id still matches may delete the lock
    local script='
if redis.call("GET", KEYS[1]) == ARGV[1] then
    redis.call("DEL", KEYS[1], KEYS[1] .. ":meta")
    return 1
else
    return 0
end'
    redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" \
        EVAL "$script" 1 "$lock_key" "$lock_id" >/dev/null
}
# Main lock interface
acquire_lock() {
    local lock_name=$1
    local max_wait=${2:-0}
    local waited=0

    while true; do
        local lock_id
        case "$BACKEND" in
            filesystem)
                lock_id=$(fs_acquire_lock "$lock_name")
                ;;
            redis)
                lock_id=$(redis_acquire_lock "$lock_name")
                ;;
            *)
                echo "Unknown backend: $BACKEND" >&2
                return 1
                ;;
        esac

        if [ -n "$lock_id" ]; then
            echo "$lock_id"
            return 0
        fi

        if [ "$max_wait" -gt 0 ] && [ "$waited" -ge "$max_wait" ]; then
            return 1
        fi
        sleep "$LOCK_RETRY_INTERVAL"
        waited=$((waited + LOCK_RETRY_INTERVAL))
    done
}
release_lock() {
    local lock_name=$1
    local lock_id=$2
    case "$BACKEND" in
        filesystem)
            fs_release_lock "$lock_name" "$lock_id"
            ;;
        redis)
            redis_release_lock "$lock_name" "$lock_id"
            ;;
    esac
}
# Lock manager with automatic cleanup
with_lock() {
    local lock_name=$1
    shift
    local lock_id

    lock_id=$(acquire_lock "$lock_name" 30)  # Wait up to 30 seconds
    if [ -z "$lock_id" ]; then
        echo "Failed to acquire lock: $lock_name" >&2
        return 1
    fi

    # Set up cleanup trap
    trap "release_lock '$lock_name' '$lock_id'" EXIT INT TERM
    echo "Lock acquired: $lock_name (id: $lock_id)" >&2

    # Execute command
    "$@"
    local exit_code=$?

    # Release lock
    release_lock "$lock_name" "$lock_id"
    trap - EXIT INT TERM
    return $exit_code
}
# Example usage for the interview
if [ "${BASH_SOURCE[0]}" = "${0}" ]; then
    # Demo: critical section protection
    critical_task() {
        echo "Starting critical task on $HOSTNAME..."
        sleep 10
        echo "Critical task completed"
    }

    # Use distributed lock
    with_lock "my-critical-task" critical_task
fi
This implementation handles real-world edge cases such as stale locks from crashed processes and network partitions, and it provides both filesystem and Redis backends for different infrastructure setups.
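When the critical section only needs to be exclusive on a single host, it is worth mentioning in an interview that `flock(1)` from util-linux gives you most of this machinery for free. A minimal sketch (the lock file path and function name are illustrative, not part of the system above):

```shell
#!/bin/bash
# Single-host mutual exclusion with flock(1). Unlike the
# filesystem/Redis backends above, this does NOT coordinate
# across machines -- the kernel lock dies with the process,
# so stale-lock cleanup is automatic.
lockfile=/tmp/my-critical-task.lock

run_exclusively() {
    (
        # Non-blocking attempt on fd 9; fail fast if already held
        flock -n 9 || { echo "lock busy" >&2; return 1; }
        "$@"
    ) 9>"$lockfile"
}

run_exclusively echo "doing exclusive work"
```

The subshell-with-fd pattern means the lock is released the instant the subshell exits, even on a crash, which is exactly the guarantee the stale-lock handling above has to reimplement by hand.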
Q2: Create a self-healing service monitor that automatically restarts failed services with exponential backoff
This handles the complexity of service management in production:
#!/bin/bash
# Self-healing service monitor with intelligent restart logic
# Prevents restart loops and implements a circuit breaker pattern

readonly CONFIG_DIR="/etc/service-monitor"
readonly STATE_DIR="/var/lib/service-monitor"
readonly LOG_FILE="/var/log/service-monitor.log"

# Initialize directories
mkdir -p "$CONFIG_DIR" "$STATE_DIR"

# Service configuration example ($CONFIG_DIR/web-app.json):
# {
#   "name": "web-app",
#   "check_cmd": "curl -sf http://localhost:8080/health",
#   "start_cmd": "systemctl start web-app",
#   "stop_cmd": "systemctl stop web-app",
#   "restart_cmd": "systemctl restart web-app",
#   "max_restarts": 5,
#   "restart_window": 3600,
#   "backoff_base": 2,
#   "max_backoff": 300
# }

# Logging with severity
log() {
    local level=$1
    shift
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$level] $*" | tee -a "$LOG_FILE"

    # Also log to syslog if available
    if command -v logger >/dev/null 2>&1; then
        logger -t "service-monitor" -p "daemon.$level" "$*"
    fi
}
# Load service configuration
load_service_config() {
    local service_name=$1
    local config_file="$CONFIG_DIR/${service_name}.json"

    if [ ! -f "$config_file" ]; then
        log "error" "Configuration not found for service: $service_name"
        return 1
    fi

    # Parse JSON (using jq if available, fallback to python3)
    if command -v jq >/dev/null 2>&1; then
        jq -r 'to_entries | .[] | "\(.key)=\(.value)"' "$config_file"
    else
        python3 - "$config_file" <<'EOF'
import json, sys
with open(sys.argv[1]) as f:
    config = json.load(f)
for k, v in config.items():
    print(f"{k}={v}")
EOF
    fi
}
# Get service state
get_service_state() {
    local service_name=$1
    local state_file="$STATE_DIR/${service_name}.state"

    if [ ! -f "$state_file" ]; then
        # Initialize state
        cat > "$state_file" << EOF
restart_count=0
last_restart=0
consecutive_failures=0
backoff_seconds=1
status=unknown
last_check=0
total_restarts=0
circuit_breaker=closed
EOF
    fi
    source "$state_file"
}
# Update service state
update_service_state() {
    local service_name=$1
    local state_file="$STATE_DIR/${service_name}.state"
    shift

    # Read current state
    local temp_file=$(mktemp)
    cp "$state_file" "$temp_file" 2>/dev/null || touch "$temp_file"

    # Update specified values
    while [ $# -gt 0 ]; do
        local key="${1%%=*}"
        local value="${1#*=}"

        # Update or add key
        if grep -q "^${key}=" "$temp_file"; then
            sed -i "s/^${key}=.*/${key}=${value}/" "$temp_file"
        else
            echo "${key}=${value}" >> "$temp_file"
        fi
        shift
    done

    # Atomic update
    mv "$temp_file" "$state_file"
}
# Calculate exponential backoff
calculate_backoff() {
    local base=$1
    local attempt=$2
    local max_backoff=$3

    local backoff=$((base ** attempt))
    if [ "$backoff" -gt "$max_backoff" ]; then
        backoff=$max_backoff
    fi

    # Add jitter (+/-25%) to prevent a thundering herd;
    # skip it for tiny delays to avoid a modulo-by-zero
    local jitter=$((backoff / 4))
    if [ "$jitter" -gt 0 ]; then
        local random_jitter=$((RANDOM % (jitter * 2) - jitter))
        backoff=$((backoff + random_jitter))
    fi
    echo "$backoff"
}
# Check if service is healthy
check_service_health() {
    local check_cmd=$1
    # Execute health check with timeout
    timeout 30 bash -c "$check_cmd" >/dev/null 2>&1
}
# Restart service with backoff
restart_service() {
    local service_name=$1
    local restart_cmd=$2
    local backoff_base=$3
    local max_backoff=$4

    # Get current state
    get_service_state "$service_name"

    # Calculate backoff
    local backoff=$(calculate_backoff "$backoff_base" "$consecutive_failures" "$max_backoff")
    log "warning" "Restarting $service_name (attempt $((consecutive_failures + 1)), waiting ${backoff}s)"

    # Wait with exponential backoff
    sleep "$backoff"

    # Attempt restart
    if $restart_cmd; then
        log "info" "Successfully restarted $service_name"
        update_service_state "$service_name" \
            "last_restart=$(date +%s)" \
            "backoff_seconds=1" \
            "total_restarts=$((total_restarts + 1))"
        return 0
    else
        log "error" "Failed to restart $service_name"
        return 1
    fi
}
# Main monitoring loop for a service
monitor_service() {
    local service_name=$1

    # Load configuration
    eval "$(load_service_config "$service_name")"

    # Configuration validation
    for required in check_cmd restart_cmd max_restarts restart_window; do
        if [ -z "${!required}" ]; then
            log "error" "Missing required config: $required for $service_name"
            return 1
        fi
    done

    # Set defaults
    backoff_base=${backoff_base:-2}
    max_backoff=${max_backoff:-300}

    log "info" "Starting monitor for $service_name"

    while true; do
        # Get current state
        get_service_state "$service_name"

        # Check if circuit breaker is open
        if [ "$circuit_breaker" = "open" ]; then
            local circuit_break_duration=$(($(date +%s) - last_restart))
            if [ "$circuit_break_duration" -lt 3600 ]; then
                sleep 60
                continue
            else
                # Try to close circuit breaker
                log "info" "Attempting to close circuit breaker for $service_name"
                update_service_state "$service_name" "circuit_breaker=half-open"
            fi
        fi

        # Perform health check
        if check_service_health "$check_cmd"; then
            # Service is healthy
            if [ "$status" != "healthy" ] || [ "$consecutive_failures" -gt 0 ]; then
                log "info" "$service_name is healthy"
                update_service_state "$service_name" \
                    "status=healthy" \
                    "consecutive_failures=0" \
                    "circuit_breaker=closed"
            fi
        else
            # Service is unhealthy
            log "error" "$service_name health check failed"

            # Update failure count
            consecutive_failures=$((consecutive_failures + 1))
            update_service_state "$service_name" \
                "status=unhealthy" \
                "consecutive_failures=$consecutive_failures" \
                "last_check=$(date +%s)"

            # Check restart window
            current_time=$(date +%s)
            window_start=$((current_time - restart_window))

            # Count restarts in window
            recent_restarts=0
            if [ -f "$STATE_DIR/${service_name}.history" ]; then
                recent_restarts=$(awk -v start="$window_start" '$1 > start' \
                    "$STATE_DIR/${service_name}.history" | wc -l)
            fi

            # Check if we should restart
            if [ "$recent_restarts" -ge "$max_restarts" ]; then
                if [ "$circuit_breaker" != "open" ]; then
                    log "error" "Circuit breaker OPEN for $service_name (too many restarts)"
                    update_service_state "$service_name" "circuit_breaker=open"
                    # Send alert
                    send_alert "$service_name" "Circuit breaker opened - manual intervention required"
                fi
            else
                # Attempt restart
                if restart_service "$service_name" "$restart_cmd" "$backoff_base" "$max_backoff"; then
                    # Log restart
                    echo "$(date +%s) restart" >> "$STATE_DIR/${service_name}.history"
                    # Wait before next check to allow the service to stabilize
                    sleep 30
                else
                    # Restart failed; record the failure count
                    update_service_state "$service_name" \
                        "consecutive_failures=$consecutive_failures"
                fi
            fi
        fi

        # Wait before next check
        sleep "${check_interval:-60}"
    done
}
# Send alerts (integrate with your alerting system)
send_alert() {
    local service=$1
    local message=$2

    # Example integrations
    if [ -n "$SLACK_WEBHOOK" ]; then
        curl -X POST "$SLACK_WEBHOOK" \
            -H "Content-Type: application/json" \
            -d "{\"text\": \"Service Monitor Alert: $service - $message\"}" \
            2>/dev/null || true
    fi

    # Email alert
    if command -v mail >/dev/null 2>&1 && [ -n "$ALERT_EMAIL" ]; then
        echo "$message" | mail -s "Service Monitor: $service" "$ALERT_EMAIL"
    fi

    log "alert" "$service: $message"
}
# Main execution
main() {
    # Monitor all configured services
    for config_file in "$CONFIG_DIR"/*.json; do
        [ -f "$config_file" ] || continue
        service_name=$(basename "$config_file" .json)
        monitor_service "$service_name" &
        # Store PID for management
        echo $! > "$STATE_DIR/${service_name}.pid"
    done

    # Wait for all monitors
    wait
}

# Handle signals
trap 'log "info" "Shutting down service monitor"; pkill -P $$; exit 0' TERM INT

if [ "${BASH_SOURCE[0]}" = "${0}" ]; then
    main "$@"
fi
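Interviewers often drill into the backoff math on its own, so it helps to be able to reproduce the core of `calculate_backoff` in a few lines. A standalone sketch (jitter dropped so the output is deterministic; numbers chosen to match the example config above):

```shell
#!/bin/bash
# Capped exponential backoff: base^attempt, clamped to a maximum delay.
backoff() {
    local base=$1 attempt=$2 max=$3
    local delay=$((base ** attempt))
    if [ "$delay" -gt "$max" ]; then
        delay=$max
    fi
    echo "$delay"
}

# Delays for ten consecutive failures with base 2, capped at 300s
for attempt in $(seq 0 9); do
    printf '%ss ' "$(backoff 2 "$attempt" 300)"
done
echo
# prints: 1s 2s 4s 8s 16s 32s 64s 128s 256s 300s
```

The cap matters: without it, attempt 15 with base 2 would already mean waiting over nine hours, which defeats the point of a self-healing monitor.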
Q3: Implement a log aggregation and analysis system that detects anomalies across distributed services
This demonstrates stream processing and pattern recognition over high-volume logs:
#!/bin/bash
# Distributed log analysis system with anomaly detection
# Handles multiple log formats, performs statistical analysis, and alerts on anomalies

readonly AGGREGATOR_PORT=9514
readonly ANALYSIS_INTERVAL=60
readonly BASELINE_WINDOW=3600   # 1 hour
readonly ANOMALY_THRESHOLD=3    # Standard deviations
readonly STATE_DIR="/var/lib/log-analyzer"
readonly PATTERNS_FILE="/etc/log-analyzer/patterns.conf"

mkdir -p "$STATE_DIR"

# Pattern definitions for different log types (Python regex syntax)
declare -A LOG_PATTERNS=(
    ["nginx"]='(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d+) (?P<size>\d+)'
    ["apache"]='(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d+) (?P<size>\d+)'
    ["syslog"]='(?P<time>\S+ \S+ \S+) (?P<host>\S+) (?P<process>[^\[]+)\[(?P<pid>\d+)\]: (?P<message>.*)'
    ["json"]='^\{.*\}$'
)
# Statistical functions
calculate_stats() {
    # Input: newline-separated numbers
    # Output: count mean stddev min max p95
    awk '
    {
        data[NR] = $1
        sum += $1
        if (NR == 1 || $1 < min) min = $1
        if (NR == 1 || $1 > max) max = $1
    }
    END {
        if (NR == 0) {
            print "0 0 0 0 0 0"
            exit
        }
        mean = sum / NR

        # Calculate variance
        for (i = 1; i <= NR; i++) {
            variance += (data[i] - mean) ^ 2
        }
        variance = variance / NR
        stddev = sqrt(variance)

        # Index of the 95th percentile
        n = int(NR * 0.95)
        if (n < 1) n = 1

        # Sort for percentile
        for (i = 1; i <= NR; i++) {
            for (j = i + 1; j <= NR; j++) {
                if (data[i] > data[j]) {
                    temp = data[i]
                    data[i] = data[j]
                    data[j] = temp
                }
            }
        }
        print NR, mean, stddev, min, max, data[n]
    }
    '
}
# Z-score anomaly detection
is_anomaly() {
    local value=$1
    local mean=$2
    local stddev=$3
    local threshold=${4:-$ANOMALY_THRESHOLD}

    # Avoid division by zero
    if (( $(echo "$stddev == 0" | bc -l) )); then
        return 1
    fi

    # Calculate z-score; strip the sign to get its absolute value
    local zscore=$(echo "scale=2; ($value - $mean) / $stddev" | bc -l)
    local abs_zscore=${zscore#-}

    if (( $(echo "$abs_zscore > $threshold" | bc -l) )); then
        return 0  # Is anomaly
    else
        return 1  # Not anomaly
    fi
}
# Parse log line based on format
parse_log_line() {
    local line=$1
    local log_type=$2

    case "$log_type" in
        json)
            # Parse JSON log
            if command -v jq >/dev/null 2>&1; then
                echo "$line" | jq -r \
                    '[(.timestamp // (now | todate)), (.level // "INFO"), (.service // "unknown"), (.message // "")] | @tsv' \
                    2>/dev/null
            else
                # Fallback to python3; pass the line via the environment
                # to avoid shell quoting problems
                LINE="$line" python3 -c '
import json, os, datetime
try:
    data = json.loads(os.environ["LINE"])
    timestamp = data.get("timestamp", datetime.datetime.now().isoformat())
    level = data.get("level", "INFO")
    service = data.get("service", "unknown")
    message = data.get("message", "")
    print(f"{timestamp}\t{level}\t{service}\t{message}")
except ValueError:
    pass
'
            fi
            ;;
        nginx|apache)
            # Parse using the regex for this log type
            LINE="$line" PATTERN="${LOG_PATTERNS[$log_type]}" python3 -c '
import os, re
match = re.match(os.environ["PATTERN"], os.environ["LINE"])
if match:
    data = match.groupdict()
    print("\t".join(data.get(k) or "" for k in ("time", "status", "method", "path")))
'
            ;;
        *)
            # Generic parsing
            echo "$line" | awk '{print $1, $2, $3, $0}'
            ;;
    esac
}
# Aggregate logs from multiple sources
aggregate_logs() {
    local output_file=$1
    local duration=$2

    # Start log receiver
    nc -l -k -p "$AGGREGATOR_PORT" > "$output_file" 2>/dev/null &
    local nc_pid=$!

    # Also collect local logs
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -f --since="$duration seconds ago" >> "$output_file" 2>/dev/null &
        local journal_pid=$!
    fi

    # Collect for the specified duration
    sleep "$duration"

    # Stop collectors
    kill $nc_pid 2>/dev/null
    [ -n "$journal_pid" ] && kill $journal_pid 2>/dev/null

    # Return line count
    wc -l < "$output_file"
}
# Analyze log patterns and detect anomalies
analyze_logs() {
    local log_file=$1
    local analysis_output="$STATE_DIR/analysis_$(date +%s).json"

    # Initialize metrics
    declare -A error_counts
    declare -A response_times
    declare -A status_codes
    declare -A service_errors

    # Process each log line
    while IFS= read -r line; do
        # Detect log type
        local log_type="generic"
        if [[ "$line" =~ ^\{ ]]; then
            log_type="json"
        elif [[ "$line" =~ \"[A-Z]+\ .*HTTP ]]; then
            log_type="nginx"
        fi

        # Parse line
        local parsed=$(parse_log_line "$line" "$log_type")
        [ -z "$parsed" ] && continue

        # Extract fields based on log type
        case "$log_type" in
            json)
                IFS=$'\t' read -r timestamp level service message <<< "$parsed"
                # Count errors by service
                if [[ "$level" =~ ERROR|CRITICAL|FATAL ]]; then
                    ((service_errors["$service"]++))
                fi
                ;;
            nginx|apache)
                IFS=$'\t' read -r timestamp status method path <<< "$parsed"
                # Count status codes
                ((status_codes["$status"]++))
                # Track error rates
                if [[ "$status" =~ ^[45] ]]; then
                    ((error_counts["http_errors"]++))
                fi
                ;;
        esac
    done < "$log_file"

    # Load historical baselines
    local baseline_file="$STATE_DIR/baseline.json"
    if [ -f "$baseline_file" ]; then
        # Compare with baseline (quote the service name for jq, since
        # names like "web-app" are not valid bare identifiers)
        for service in "${!service_errors[@]}"; do
            local current_errors=${service_errors[$service]}
            local baseline_mean=$(jq -r ".services.\"$service\".error_rate.mean // 0" "$baseline_file")
            local baseline_stddev=$(jq -r ".services.\"$service\".error_rate.stddev // 1" "$baseline_file")

            if is_anomaly "$current_errors" "$baseline_mean" "$baseline_stddev"; then
                log "alert" "Anomaly detected: $service error rate is $current_errors (baseline: $baseline_mean ± $baseline_stddev)"
                send_anomaly_alert "$service" "error_rate" "$current_errors" "$baseline_mean" "$baseline_stddev"
            fi
        done
    fi

    # Generate analysis report
    cat > "$analysis_output" << EOF
{
  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "total_lines": $(wc -l < "$log_file"),
  "services": {
EOF

    # Add service metrics
    local first=true
    for service in "${!service_errors[@]}"; do
        [ "$first" = true ] && first=false || echo "," >> "$analysis_output"
        cat >> "$analysis_output" << EOF
    "$service": {
      "error_count": ${service_errors[$service]},
      "error_rate": $(echo "scale=2; ${service_errors[$service]} * 100 / $(wc -l < "$log_file")" | bc)
    }
EOF
    done

    echo -e "\n  },\n  \"status_codes\": {" >> "$analysis_output"

    # Add status code distribution
    first=true
    for status in "${!status_codes[@]}"; do
        [ "$first" = true ] && first=false || echo "," >> "$analysis_output"
        echo -n "    \"$status\": ${status_codes[$status]}" >> "$analysis_output"
    done
    echo -e "\n  }\n}" >> "$analysis_output"

    # Update baseline with exponential moving average
    update_baseline "$analysis_output"
    echo "$analysis_output"
}
# Update baseline metrics
update_baseline() {
    local current_analysis=$1
    local baseline_file="$STATE_DIR/baseline.json"
    local alpha=0.3  # EMA weight for new data

    if [ ! -f "$baseline_file" ]; then
        # Initialize baseline
        cp "$current_analysis" "$baseline_file"
        return
    fi

    # Update baseline using an exponential moving average
    python3 - "$baseline_file" "$current_analysis" "$alpha" << 'EOF'
import json
import sys

baseline_file, current_file, alpha = sys.argv[1:4]
alpha = float(alpha)

with open(baseline_file) as f:
    baseline = json.load(f)
with open(current_file) as f:
    current = json.load(f)

# Update service metrics
for service, metrics in current.get('services', {}).items():
    if service not in baseline.get('services', {}):
        baseline.setdefault('services', {})[service] = {
            'error_rate': {'mean': 0, 'stddev': 1}
        }

    # Update the mean using the EMA
    old_mean = baseline['services'][service]['error_rate']['mean']
    new_value = metrics['error_rate']
    new_mean = alpha * new_value + (1 - alpha) * old_mean

    # Update the variance estimate
    old_var = baseline['services'][service]['error_rate'].get('variance', 1)
    new_var = alpha * ((new_value - new_mean) ** 2) + (1 - alpha) * old_var

    baseline['services'][service]['error_rate'] = {
        'mean': new_mean,
        'stddev': new_var ** 0.5,
        'variance': new_var
    }

# Save updated baseline
with open(baseline_file, 'w') as f:
    json.dump(baseline, f, indent=2)
EOF
}
# Send anomaly alerts
send_anomaly_alert() {
    local service=$1
    local metric=$2
    local current_value=$3
    local baseline_mean=$4
    local baseline_stddev=$5

    local zscore=$(echo "scale=2; ($current_value - $baseline_mean) / $baseline_stddev" | bc -l)

    # Create alert payload
    local alert_data=$(cat << EOF
{
  "service": "$service",
  "metric": "$metric",
  "current_value": $current_value,
  "baseline_mean": $baseline_mean,
  "baseline_stddev": $baseline_stddev,
  "z_score": $zscore,
  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "severity": "$([ "$(echo "$zscore > 5" | bc)" -eq 1 ] && echo "critical" || echo "warning")"
}
EOF
)

    # Send to various channels
    if [ -n "$WEBHOOK_URL" ]; then
        curl -X POST "$WEBHOOK_URL" \
            -H "Content-Type: application/json" \
            -d "$alert_data" \
            2>/dev/null || true
    fi

    # Log alert
    echo "$alert_data" >> "$STATE_DIR/alerts.jsonl"

    # Execute the custom alert handler if one is installed
    if [ -x "/etc/log-analyzer/alert-handler.sh" ]; then
        echo "$alert_data" | /etc/log-analyzer/alert-handler.sh
    fi
}
# Main monitoring loop
main() {
    log "info" "Starting distributed log analyzer"

    while true; do
        local temp_log=$(mktemp)

        # Aggregate logs
        log "info" "Collecting logs for $ANALYSIS_INTERVAL seconds"
        local line_count=$(aggregate_logs "$temp_log" "$ANALYSIS_INTERVAL")
        log "info" "Collected $line_count log lines"

        # Analyze logs
        if [ "$line_count" -gt 0 ]; then
            local analysis_file=$(analyze_logs "$temp_log")
            log "info" "Analysis complete: $analysis_file"

            # Clean up old analysis files (keep last 24 hours)
            find "$STATE_DIR" -name "analysis_*.json" -mtime +1 -delete
        fi

        rm -f "$temp_log"

        # Wait before next cycle
        sleep 10
    done
}

# Logging function
log() {
    local level=$1
    shift
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$level] $*" >&2
}

if [ "${BASH_SOURCE[0]}" = "${0}" ]; then
    main "$@"
fi
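The z-score test is the piece most worth rehearsing in isolation. One self-contained variant of `is_anomaly` (not the version above) swaps `bc` for awk, so it runs even on minimal systems where `bc` is not installed:

```shell
#!/bin/bash
# Standalone z-score check: exit 0 ("anomaly") when the value lies
# more than `threshold` standard deviations from the mean.
zscore_anomaly() {
    local value=$1 mean=$2 stddev=$3 threshold=${4:-3}
    awk -v v="$value" -v m="$mean" -v s="$stddev" -v t="$threshold" '
        BEGIN {
            if (s == 0) exit 1          # no spread: never flag
            z = (v - m) / s
            if (z < 0) z = -z           # absolute value
            exit (z > t) ? 0 : 1        # exit 0 = anomaly
        }'
}

# 42 errors against a baseline of 10 +/- 5 gives z = 6.4, well past 3
if zscore_anomaly 42 10 5; then echo anomaly; else echo normal; fi
```

Using the shell exit status to carry the boolean keeps the caller's code identical to the `bc`-based version: `if zscore_anomaly ...; then alert; fi`.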
Q4: Build a zero-downtime deployment system with automatic rollback on failure
This handles complex deployment orchestration:
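The primitive the whole strategy leans on is the atomic "current" symlink flip: stage a new link beside the old one, then rename it into place, so readers always see either the old release or the new one and never a missing path. A local sketch with throwaway paths (note that `mv -T` is a GNU coreutils flag):

```shell
#!/bin/bash
# Atomic release switch via rename(2). Paths are illustrative.
set -e
deploy_path=$(mktemp -d)
mkdir -p "$deploy_path/releases/v1" "$deploy_path/releases/v2"
ln -s "$deploy_path/releases/v1" "$deploy_path/current"

switch_release() {
    local new=$1
    # Stage the new symlink, then atomically rename it over "current"
    ln -sfn "$new" "$deploy_path/current.tmp"
    mv -T "$deploy_path/current.tmp" "$deploy_path/current"
}

switch_release "$deploy_path/releases/v2"
readlink "$deploy_path/current"
```

This is slightly more careful than a bare `ln -sfn new current`, which briefly unlinks and recreates the target; the staged rename closes even that small window.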
#!/bin/bash
# Zero-downtime deployment system with health checks and automatic rollback
# Supports blue-green and rolling deployments across multiple servers

readonly DEPLOY_DIR="/opt/deployments"
readonly CONFIG_FILE="/etc/deploy/config.json"
readonly STATE_FILE="/var/lib/deploy/state.json"
readonly HEALTH_CHECK_RETRIES=10
readonly HEALTH_CHECK_INTERVAL=5
readonly CANARY_PERCENTAGE=10

# Deployment strategies
declare -A STRATEGIES=(
    ["blue-green"]="deploy_blue_green"
    ["rolling"]="deploy_rolling"
    ["canary"]="deploy_canary"
)

# Initialize directories
mkdir -p "$(dirname "$STATE_FILE")" "$DEPLOY_DIR"
# Atomic state management
update_state() {
    local temp_file=$(mktemp)
    if [ -f "$STATE_FILE" ]; then
        cp "$STATE_FILE" "$temp_file"
    else
        echo "{}" > "$temp_file"
    fi

    # Update state using jq
    local key=$1
    local value=$2
    jq --arg k "$key" --arg v "$value" '.[$k] = $v' "$temp_file" > "${temp_file}.new"
    mv "${temp_file}.new" "$STATE_FILE"
    rm -f "$temp_file"
}

get_state() {
    local key=$1
    if [ -f "$STATE_FILE" ]; then
        jq -r --arg k "$key" '.[$k] // empty' "$STATE_FILE"
    fi
}
# Load balancer management (HAProxy example)
update_load_balancer() {
    local action=$1
    local server=$2
    local backend=${3:-"webservers"}

    case "$action" in
        drain)
            echo "set server $backend/$server state drain" | \
                socat stdio /var/run/haproxy.sock
            ;;
        ready)
            echo "set server $backend/$server state ready" | \
                socat stdio /var/run/haproxy.sock
            ;;
        maint)
            echo "disable server $backend/$server" | \
                socat stdio /var/run/haproxy.sock
            ;;
        enable)
            echo "enable server $backend/$server" | \
                socat stdio /var/run/haproxy.sock
            ;;
    esac
}

# Wait for connection draining
wait_for_drain() {
    local server=$1
    local backend=${2:-"webservers"}
    local max_wait=60
    local waited=0

    log "info" "Waiting for connections to drain from $server"
    while [ $waited -lt $max_wait ]; do
        local conn_count=$(echo "show servers state $backend" | \
            socat stdio /var/run/haproxy.sock | \
            grep "$server" | awk '{print $7}')

        # Treat a missing count as "still draining"
        if [ "${conn_count:-1}" -eq 0 ]; then
            log "info" "Server $server drained successfully"
            return 0
        fi
        sleep 2
        waited=$((waited + 2))
    done

    log "warning" "Timeout waiting for $server to drain (still $conn_count connections)"
    return 1
}
# Health check implementation
perform_health_check() {
    local server=$1
    local health_endpoint=$2
    local expected_response=${3:-"200"}
    local attempt=1

    while [ $attempt -le $HEALTH_CHECK_RETRIES ]; do
        log "debug" "Health check attempt $attempt for $server"

        local response=$(curl -s -o /dev/null -w "%{http_code}" \
            --connect-timeout 5 \
            --max-time 10 \
            "http://$server$health_endpoint")

        if [[ "$response" =~ $expected_response ]]; then
            log "info" "Health check passed for $server"
            return 0
        fi

        log "warning" "Health check failed for $server (got $response, expected $expected_response)"
        sleep $HEALTH_CHECK_INTERVAL
        attempt=$((attempt + 1))
    done

    log "error" "Health check failed for $server after $HEALTH_CHECK_RETRIES attempts"
    return 1
}
# Deploy to single server
deploy_to_server() {
    local server=$1
    local artifact=$2
    local deploy_user=${3:-"deploy"}
    local deploy_path=${4:-"/var/www/app"}

    log "info" "Deploying $artifact to $server"

    # Create deployment directory with timestamp
    local timestamp=$(date +%Y%m%d_%H%M%S)
    local new_release_path="${deploy_path}/releases/${timestamp}"

    # Pre-deployment commands
    ssh "$deploy_user@$server" "mkdir -p ${deploy_path}/releases"

    # Transfer artifact
    if ! scp "$artifact" "$deploy_user@$server:/tmp/deploy_${timestamp}.tar.gz"; then
        log "error" "Failed to transfer artifact to $server"
        return 1
    fi

    # Extract and prepare
    if ! ssh "$deploy_user@$server" "
        set -e
        cd ${deploy_path}/releases
        mkdir -p ${timestamp}
        tar -xzf /tmp/deploy_${timestamp}.tar.gz -C ${timestamp}
        rm -f /tmp/deploy_${timestamp}.tar.gz

        # Run pre-deployment hook if it exists
        if [ -x ${timestamp}/deploy/pre-deploy.sh ]; then
            cd ${timestamp}
            ./deploy/pre-deploy.sh
        fi
    "; then
        log "error" "Failed to prepare deployment on $server"
        return 1
    fi

    # Store current version for rollback
    local current_link=$(ssh "$deploy_user@$server" "readlink ${deploy_path}/current" 2>/dev/null || echo "")
    if [ -n "$current_link" ]; then
        update_state "previous_release_${server}" "$current_link"
    fi

    # Atomic switch
    if ! ssh "$deploy_user@$server" "
        ln -sfn ${new_release_path} ${deploy_path}/current

        # Run post-deployment hook
        if [ -x ${new_release_path}/deploy/post-deploy.sh ]; then
            cd ${new_release_path}
            ./deploy/post-deploy.sh
        fi

        # Reload/restart services
        sudo systemctl reload nginx || true
        sudo systemctl restart app || true
    "; then
        log "error" "Failed to activate deployment on $server"
        return 1
    fi

    log "info" "Successfully deployed to $server"
    return 0
}
# Rollback single server
rollback_server() {
    local server=$1
    local deploy_user=${2:-"deploy"}
    local deploy_path=${3:-"/var/www/app"}

    local previous_release=$(get_state "previous_release_${server}")
    if [ -z "$previous_release" ]; then
        log "error" "No previous release found for $server"
        return 1
    fi

    log "warning" "Rolling back $server to $previous_release"
    ssh "$deploy_user@$server" "
        ln -sfn $previous_release ${deploy_path}/current
        sudo systemctl reload nginx || true
        sudo systemctl restart app || true
    "
}
# Blue-green deployment strategy
deploy_blue_green() {
    local artifact=$1
    local config=$2

    # Parse configuration
    local blue_servers=($(echo "$config" | jq -r '.blue_servers[]'))
    local green_servers=($(echo "$config" | jq -r '.green_servers[]'))
    local health_endpoint=$(echo "$config" | jq -r '.health_check.endpoint // "/health"')

    # Determine which environment is currently live
    local live_env=$(get_state "live_environment")
    local target_env
    local target_servers
    if [ "$live_env" = "blue" ]; then
        target_env="green"
        target_servers=("${green_servers[@]}")
    else
        target_env="blue"
        target_servers=("${blue_servers[@]}")
    fi

    log "info" "Starting blue-green deployment to $target_env environment"
    update_state "deployment_id" "$(date +%s)"
    update_state "deployment_status" "in_progress"

    # Deploy to target environment
    local failed_servers=()
    for server in "${target_servers[@]}"; do
        if ! deploy_to_server "$server" "$artifact"; then
            failed_servers+=("$server")
        fi
    done

    if [ ${#failed_servers[@]} -gt 0 ]; then
        log "error" "Deployment failed on servers: ${failed_servers[*]}"
        update_state "deployment_status" "failed"
        return 1
    fi

    # Health check all target servers
    log "info" "Running health checks on $target_env environment"
    for server in "${target_servers[@]}"; do
        if ! perform_health_check "$server" "$health_endpoint"; then
            log "error" "Health check failed for $server"
            update_state "deployment_status" "failed"
            # No rollback needed in blue-green: traffic hasn't switched yet
            return 1
        fi
    done

    # Switch traffic
    log "info" "Switching traffic to $target_env environment"
    for server in "${target_servers[@]}"; do
        update_load_balancer "enable" "$server"
    done

    # Drain old environment
    local old_servers
    if [ "$target_env" = "blue" ]; then
        old_servers=("${green_servers[@]}")
    else
        old_servers=("${blue_servers[@]}")
    fi
    for server in "${old_servers[@]}"; do
        update_load_balancer "drain" "$server"
    done

    # Wait for draining
    for server in "${old_servers[@]}"; do
        wait_for_drain "$server"
        update_load_balancer "maint" "$server"
    done

    # Update state
    update_state "live_environment" "$target_env"
    update_state "deployment_status" "completed"
    update_state "deployment_end" "$(date +%s)"
    log "info" "Blue-green deployment completed successfully"
}
# Rolling deployment strategy
deploy_rolling() {
    local artifact=$1
    local config=$2

    # Parse configuration
    local servers=($(echo "$config" | jq -r '.servers[]'))
    local batch_size=$(echo "$config" | jq -r '.batch_size // 1')
    local health_endpoint=$(echo "$config" | jq -r '.health_check.endpoint // "/health"')
    local pause_between_batches=$(echo "$config" | jq -r '.pause_seconds // 30')

    log "info" "Starting rolling deployment (batch size: $batch_size)"
    update_state "deployment_id" "$(date +%s)"
    update_state "deployment_status" "in_progress"

    # Process servers in batches
    local total_servers=${#servers[@]}
    local deployed=0
    while [ $deployed -lt $total_servers ]; do
        local batch_servers=()
        local batch_end=$((deployed + batch_size))

        # Get servers for this batch
        for ((i=deployed; i<batch_end && i<total_servers; i++)); do
            batch_servers+=("${servers[$i]}")
        done
        log "info" "Processing batch: ${batch_servers[*]}"

        # Drain servers in batch
        for server in "${batch_servers[@]}"; do
            update_load_balancer "drain" "$server"
        done

        # Wait for all to drain
        for server in "${batch_servers[@]}"; do
            wait_for_drain "$server"
        done

        # Deploy to batch
        local batch_failed=false
        for server in "${batch_servers[@]}"; do
            if ! deploy_to_server "$server" "$artifact"; then
                log "error" "Deployment failed on $server"
                batch_failed=true
                break
            fi

            # Health check immediately
            if ! perform_health_check "$server" "$health_endpoint"; then
                log "error" "Health check failed for $server"
                batch_failed=true
                break
            fi

            # Re-enable in load balancer
            update_load_balancer "ready" "$server"
        done

        if [ "$batch_failed" = true ]; then
            log "error" "Batch deployment failed, initiating rollback"
            # Rollback all deployed servers
            for ((i=0; i<deployed+${#batch_servers[@]}; i++)); do
                rollback_server "${servers[$i]}"
                update_load_balancer "ready" "${servers[$i]}"
            done
            update_state "deployment_status" "failed_rollback_completed"
            return 1
        fi

        deployed=$((deployed + ${#batch_servers[@]}))

        # Pause between batches if not the last batch
        if [ $deployed -lt $total_servers ]; then
            log "info" "Pausing $pause_between_batches seconds before next batch"
            sleep "$pause_between_batches"
        fi
    done

    update_state "deployment_status" "completed"
    update_state "deployment_end" "$(date +%s)"
    log "info" "Rolling deployment completed successfully"
}
# Canary deployment strategy
deploy_canary() {
    local artifact=$1
    local config=$2

    # Parse configuration
    local servers=($(echo "$config" | jq -r '.servers[]'))
    local canary_duration=$(echo "$config" | jq -r '.canary_duration // 300')
    local success_rate_threshold=$(echo "$config" | jq -r '.success_rate_threshold // 99.5')
    local health_endpoint=$(echo "$config" | jq -r '.health_check.endpoint // "/health"')

    # Calculate canary size
    local total_servers=${#servers[@]}
    local canary_count=$((total_servers * CANARY_PERCENTAGE / 100))
    [ $canary_count -lt 1 ] && canary_count=1
    log "info" "Starting canary deployment ($canary_count of $total_servers servers)"

    # Deploy to canary servers
    local canary_servers=("${servers[@]:0:$canary_count}")
    for server in "${canary_servers[@]}"; do
        update_load_balancer "drain" "$server"
        wait_for_drain "$server"
        if ! deploy_to_server "$server" "$artifact"; then
            log "error" "Canary deployment failed on $server"
            rollback_server "$server"
            update_load_balancer "ready" "$server"
            return 1
        fi
        if ! perform_health_check "$server" "$health_endpoint"; then
            log "error" "Canary health check failed for $server"
            rollback_server "$server"
            update_load_balancer "ready" "$server"
            return 1
        fi
        update_load_balancer "ready" "$server"
    done

    # Monitor canary servers
    log "info" "Monitoring canary servers for $canary_duration seconds"
    local start_time=$(date +%s)
    local end_time=$((start_time + canary_duration))
    while [ $(date +%s) -lt $end_time ]; do
        # Check error rates
        local error_rate=$(calculate_error_rate "${canary_servers[@]}")
        local success_rate=$(echo "100 - $error_rate" | bc)
        if (( $(echo "$success_rate < $success_rate_threshold" | bc -l) )); then
            log "error" "Canary success rate ($success_rate%) below threshold ($success_rate_threshold%)"
            # Rollback canary servers
            for server in "${canary_servers[@]}"; do
                update_load_balancer "drain" "$server"
            done
            for server in "${canary_servers[@]}"; do
                wait_for_drain "$server"
                rollback_server "$server"
                update_load_balancer "ready" "$server"
            done
            return 1
        fi
        sleep 10
    done

    log "info" "Canary phase successful, proceeding with full deployment"

    # Deploy to remaining servers using rolling strategy
    local remaining_servers=("${servers[@]:$canary_count}")
    local remaining_config=$(echo "$config" | jq --argjson servers "$(printf '%s\n' "${remaining_servers[@]}" | jq -R . | jq -s .)" '.servers = $servers')
    deploy_rolling "$artifact" "$remaining_config"
}
# Calculate error rate from logs/metrics
calculate_error_rate() {
    local servers=("$@")
    # This would integrate with your metrics system;
    # for demo purposes, return a mock value
    echo "0.5"
}
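To make the mock above more concrete, the rate calculation can be factored out as a small pure function over HTTP status codes; how the codes are collected per server (for example, ssh plus awk over an access log) is left to the caller. The function name and the 5xx convention here are illustrative assumptions, not part of the original script:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read one HTTP status code per line on stdin and
# print the percentage of 5xx responses with one decimal place.
error_rate_from_statuses() {
  awk '{ total++; if ($1 >= 500 && $1 < 600) errors++ }
       END { if (total == 0) print "0"; else printf "%.1f\n", errors * 100 / total }'
}

# Example: one server error out of four requests
printf '200\n201\n503\n200\n' | error_rate_from_statuses
```

A real `calculate_error_rate` would pipe collected status codes from each canary server into this helper instead of returning a constant.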
# Logging
log() {
    local level=$1
    shift
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$level] $*" | tee -a /var/log/deploy.log
}
# Main deployment orchestrator
main() {
    local artifact=$1
    local strategy=${2:-"rolling"}

    if [ ! -f "$artifact" ]; then
        log "error" "Artifact not found: $artifact"
        exit 1
    fi
    if [ ! -f "$CONFIG_FILE" ]; then
        log "error" "Configuration not found: $CONFIG_FILE"
        exit 1
    fi
    local config=$(cat "$CONFIG_FILE")

    # Validate strategy
    if [ -z "${STRATEGIES[$strategy]}" ]; then
        log "error" "Unknown deployment strategy: $strategy"
        exit 1
    fi

    # Execute deployment
    ${STRATEGIES[$strategy]} "$artifact" "$config"
}

if [ "${BASH_SOURCE[0]}" = "${0}" ]; then
    main "$@"
fi
Q5: Create a disaster recovery automation system that handles failover between regions
This demonstrates complex orchestration and state management:
#!/bin/bash
# Multi-region disaster recovery automation system
# Handles database replication, DNS failover, and application state synchronization

readonly DR_CONFIG="/etc/disaster-recovery/config.json"
readonly STATE_DIR="/var/lib/dr-automation"
readonly HEALTH_CHECK_INTERVAL=30
readonly FAILOVER_THRESHOLD=3      # Consecutive failures before failover
readonly RPO_WARNING_THRESHOLD=300 # 5 minutes

mkdir -p "$STATE_DIR"

# Region health tracking
declare -A REGION_HEALTH
declare -A REGION_FAILURES
declare -A LAST_REPLICATION_LAG

# Load configuration
load_config() {
    if [ ! -f "$DR_CONFIG" ]; then
        log "error" "DR configuration not found"
        exit 1
    fi

    # Export configuration as environment variables
    eval "$(jq -r '
        .regions[] |
        "REGION_\(.name | ascii_upcase)_ENDPOINT=\"\(.endpoint)\"",
        "REGION_\(.name | ascii_upcase)_DB_HOST=\"\(.database.host)\"",
        "REGION_\(.name | ascii_upcase)_DB_PORT=\"\(.database.port // 5432)\""
    ' "$DR_CONFIG")"

    # Get all regions
    REGIONS=($(jq -r '.regions[].name' "$DR_CONFIG"))
    PRIMARY_REGION=$(jq -r '.primary_region' "$DR_CONFIG")
}
# Comprehensive health check for a region
check_region_health() {
    local region=$1
    local endpoint_var="REGION_${region^^}_ENDPOINT"
    local endpoint="${!endpoint_var}"
    log "debug" "Checking health of region: $region"

    # Application health check
    local app_health=false
    local http_status=$(curl -s -o /dev/null -w "%{http_code}" \
        --connect-timeout 5 --max-time 10 \
        "https://$endpoint/health" || echo "000")
    if [ "$http_status" = "200" ]; then
        app_health=true
    fi

    # Database health check
    local db_health=false
    local db_host_var="REGION_${region^^}_DB_HOST"
    local db_port_var="REGION_${region^^}_DB_PORT"
    local db_host="${!db_host_var}"
    local db_port="${!db_port_var}"
    if pg_isready -h "$db_host" -p "$db_port" -t 5 >/dev/null 2>&1; then
        db_health=true
    fi

    # Check replication lag if not primary
    local replication_ok=true
    if [ "$region" != "$PRIMARY_REGION" ]; then
        local lag=$(check_replication_lag "$region")
        LAST_REPLICATION_LAG[$region]=$lag
        if [ "$lag" -gt "$RPO_WARNING_THRESHOLD" ]; then
            log "warning" "High replication lag for $region: ${lag}s"
            replication_ok=false
        fi
    fi

    # Overall health status
    if [ "$app_health" = true ] && [ "$db_health" = true ] && [ "$replication_ok" = true ]; then
        REGION_HEALTH[$region]="healthy"
        REGION_FAILURES[$region]=0
        return 0
    else
        REGION_HEALTH[$region]="unhealthy"
        REGION_FAILURES[$region]=$((${REGION_FAILURES[$region]:-0} + 1))
        log "error" "Region $region health check failed - App: $app_health, DB: $db_health, Replication: $replication_ok"
        return 1
    fi
}
# Check database replication lag
check_replication_lag() {
    local region=$1
    local db_host_var="REGION_${region^^}_DB_HOST"
    local db_host="${!db_host_var}"

    # Get replication lag in seconds
    local lag=$(psql -h "$db_host" -U postgres -t -c "
        SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::int
        WHERE pg_is_in_recovery();" 2>/dev/null | tr -d ' ')

    if [ -z "$lag" ] || [ "$lag" = "NULL" ]; then
        echo "0"
    else
        echo "$lag"
    fi
}
# Perform DNS failover
perform_dns_failover() {
    local from_region=$1
    local to_region=$2
    local domain=$(jq -r '.domain' "$DR_CONFIG")
    local dns_provider=$(jq -r '.dns_provider' "$DR_CONFIG")

    log "warning" "Initiating DNS failover from $from_region to $to_region"

    case "$dns_provider" in
        route53)
            # AWS Route53 failover
            local hosted_zone=$(aws route53 list-hosted-zones-by-name \
                --query "HostedZones[?Name=='${domain}.'].Id" \
                --output text | cut -d'/' -f3)
            local to_endpoint_var="REGION_${to_region^^}_ENDPOINT"
            local to_endpoint="${!to_endpoint_var}"

            # Update DNS record
            aws route53 change-resource-record-sets \
                --hosted-zone-id "$hosted_zone" \
                --change-batch "{
                    \"Changes\": [{
                        \"Action\": \"UPSERT\",
                        \"ResourceRecordSet\": {
                            \"Name\": \"${domain}\",
                            \"Type\": \"CNAME\",
                            \"TTL\": 60,
                            \"ResourceRecords\": [{\"Value\": \"${to_endpoint}\"}]
                        }
                    }]
                }"
            ;;
        cloudflare)
            # Cloudflare API
            local zone_id=$(curl -s -X GET "https://api.cloudflare.com/client/v4/zones?name=$domain" \
                -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | \
                jq -r '.result[0].id')
            local record_id=$(curl -s -X GET \
                "https://api.cloudflare.com/client/v4/zones/$zone_id/dns_records?name=$domain" \
                -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | \
                jq -r '.result[0].id')
            local to_endpoint_var="REGION_${to_region^^}_ENDPOINT"
            local to_endpoint="${!to_endpoint_var}"
            curl -X PUT "https://api.cloudflare.com/client/v4/zones/$zone_id/dns_records/$record_id" \
                -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
                -H "Content-Type: application/json" \
                --data "{\"type\":\"CNAME\",\"name\":\"$domain\",\"content\":\"$to_endpoint\",\"ttl\":60}"
            ;;
    esac

    log "info" "DNS failover initiated, propagation may take up to 60 seconds"
}
# Promote standby database to primary
promote_database() {
    local region=$1
    local db_host_var="REGION_${region^^}_DB_HOST"
    local db_host="${!db_host_var}"

    log "warning" "Promoting database in region $region to primary"

    # Execute promotion (PostgreSQL example)
    ssh "postgres@$db_host" "pg_ctl promote -D /var/lib/postgresql/data"

    # Update application configuration to point to new primary
    local app_servers=($(jq -r ".regions[] | select(.name==\"$region\") | .app_servers[]" "$DR_CONFIG"))
    for server in "${app_servers[@]}"; do
        ssh "deploy@$server" "
            sed -i 's/^DB_HOST=.*/DB_HOST=$db_host/' /etc/app/config
            sudo systemctl restart app
        "
    done
}
# Sync application state
sync_application_state() {
    local from_region=$1
    local to_region=$2
    log "info" "Syncing application state from $from_region to $to_region"

    # Sync Redis/cache data
    local from_redis=$(jq -r ".regions[] | select(.name==\"$from_region\") | .redis.host" "$DR_CONFIG")
    local to_redis=$(jq -r ".regions[] | select(.name==\"$to_region\") | .redis.host" "$DR_CONFIG")

    # Create Redis dump and restore
    ssh "redis@$from_redis" "redis-cli BGSAVE && sleep 2"
    ssh "redis@$from_redis" "cat /var/lib/redis/dump.rdb" | \
        ssh "redis@$to_redis" "cat > /var/lib/redis/dump.rdb && redis-cli FLUSHALL && redis-server --loadmodule"

    # Sync session data if using file-based sessions
    local from_servers=($(jq -r ".regions[] | select(.name==\"$from_region\") | .app_servers[]" "$DR_CONFIG"))
    local to_servers=($(jq -r ".regions[] | select(.name==\"$to_region\") | .app_servers[]" "$DR_CONFIG"))

    # Use parallel for efficiency
    printf '%s\n' "${from_servers[@]}" | parallel -j 4 "
        rsync -av
Bash Shell Scripting Interview Questions And Answers
Whether you’re just starting out or have years of experience, bash shell interview questions test your understanding of the shell’s unique features, syntax quirks, and best practices that make bash the go-to choice for system automation. We’ll cover the essential bash-specific concepts that interviewers love to explore - from array manipulation and string operations to process substitution and advanced parameter expansion. These questions focus on bash’s distinctive capabilities that set it apart from basic POSIX shell scripting.
Q1: What’s the difference between [ ] and [[ ]] in bash?
This is a classic that trips up many people. The single bracket [ ] is actually a command (same as test), while double brackets [[ ]] are a bash keyword with enhanced functionality:
[[ ]] supports pattern matching: [[ $string == pa* ]] works, but [ $string == pa* ] doesn’t
[[ ]] handles empty variables better: [[ -z $empty ]] works fine, while [ -z $empty ] needs quotes
[[ ]] supports regex matching with =~ operator
[[ ]] allows && and || inside: [[ $a -eq 1 && $b -eq 2 ]]
The key insight: always use [[ ]] in bash unless you need POSIX compatibility. It’s safer and more powerful.
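A quick runnable sketch of these differences (assuming bash; the unquoted single-bracket test is the classic failure mode):

```shell
#!/usr/bin/env bash
string="particular"
empty=""

# Pattern matching only works inside [[ ]]
if [[ $string == pa* ]]; then
    echo "double brackets: pattern matched"
fi

# Empty variables are safe unquoted inside [[ ]] ...
if [[ -z $empty ]]; then
    echo "double brackets: empty detected"
fi

# ... but [ ] needs the quotes, or the test breaks down
if [ -z "$empty" ]; then
    echo "single brackets: quoted empty detected"
fi

# Regex matching is [[ ]]-only
if [[ $string =~ ^pa.+r$ ]]; then
    echo "regex matched"
fi
```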
Q2: Explain bash arrays and how they differ from regular variables
Bash supports both indexed and associative arrays, which many people don’t fully utilize:
Indexed arrays:
arr=(apple banana cherry)
echo ${arr[0]} # apple
echo ${arr[@]} # all elements
echo ${#arr[@]} # array length (3)
Associative arrays (bash 4+):
declare -A hash
hash[name]="John"
hash[age]=30
echo ${hash[name]} # John
The tricky part is that bash arrays are sparse - you can have arr[0] and arr[10] without arr[1-9]. Also, array indices start at 0, not 1 like in some shells.
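A small runnable illustration of the sparse behavior (requires bash 4+ for the associative array):

```shell
#!/usr/bin/env bash
arr=(apple banana cherry)
arr[10]="kiwi"        # indices 3-9 simply don't exist

echo "${#arr[@]}"     # length counts only set elements: 4
echo "${!arr[@]}"     # the actual indices: 0 1 2 10

# Associative array (bash 4+)
declare -A user
user[name]="John"
user[age]=30
echo "${user[name]} is ${user[age]}"
```

Note that `${#arr[@]}` is the element count, not the highest index plus one, which matters when iterating a sparse array.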
Q3: How does command substitution work and what’s the difference between backticks and $()?
Both capture command output, but $() is strongly preferred:
$() nests easily: echo $(echo $(date))
Backticks require escaping for nesting: echo `echo \`date\``
$() is clearer to read, especially with syntax highlighting
$() handles quotes more predictably
Example showing the difference:
# This works fine
result=$(echo "hello $(echo "world")")
# This is a nightmare with backticks
result=`echo "hello \`echo world\`"`
Q4: What are bash parameter expansions and give some advanced examples?
Parameter expansion is bash’s Swiss Army knife for string manipulation:
${var:-default} - Use default if var is unset/null
${var:=default} - Set var to default if unset/null
${var:?error} - Exit with error if var is unset/null
${var:+alternate} - Use alternate if var is set
Advanced string manipulation:
path="/home/user/documents/file.txt"
echo ${path##*/} # file.txt (basename)
echo ${path%/*} # /home/user/documents (dirname)
echo ${path/documents/archive} # String replacement
echo ${path^^} # Convert to uppercase
echo ${path,,} # Convert to lowercase
The pattern: # removes from beginning, % from end. Double it (##, %%) for greedy matching.
Q5: Explain the different types of expansions in bash and their order
Bash performs expansions in a specific order, and understanding this prevents many bugs:
Brace expansion: {a,b,c} or {1..10}
Tilde expansion: ~ becomes home directory
Parameter/variable expansion: $var or ${var}
Command substitution: $(command) or `command`
Arithmetic expansion: $((2 + 2))
Word splitting: Based on IFS
Pathname expansion: *.txt (globbing)
Why this matters:
# This doesn't work: brace expansion runs before parameter expansion
n=5
echo {1..$n}  # prints {1..5}, not 1 2 3 4 5
# This works: the literal range is brace-expanded directly
echo {1..5}   # prints 1 2 3 4 5
Q6: What is process substitution and when would you use it?
Process substitution <(command) creates a temporary file descriptor that acts like a file:
# Compare outputs of two commands
diff <(ls dir1) <(ls dir2)

# Read from multiple commands simultaneously
while read -u 3 line1 && read -u 4 line2; do
    echo "File1: $line1 | File2: $line2"
done 3< <(cat file1) 4< <(cat file2)

# Use with commands that only accept files
grep "pattern" <(curl -s https://example.com)
It’s incredibly useful when you need to treat command output as a file without creating temporary files. The key limitation: it’s not POSIX, so it’s bash/zsh specific.
Q7: How do you handle signals and traps in bash?
Traps let you intercept signals and run cleanup code:
# Basic trap syntax
trap 'echo "Ctrl+C pressed"' INT
trap 'cleanup_function' EXIT
trap 'echo "Error on line $LINENO"' ERR

# Remove a trap
trap - INT

# List all signals
trap -l

# Useful pattern: cleanup on any exit
cleanup() {
    rm -f /tmp/tempfile.$$
    echo "Cleaned up"
}
trap cleanup EXIT INT TERM
Pro tip: Always use single quotes in trap commands to prevent premature expansion. The ERR trap is especially useful with set -e for debugging.
Q8: Explain bash’s set options and their impact on script behavior
The set command changes bash’s behavior fundamentally:
set -e (errexit): Exit on any command failure
set -u (nounset): Exit on undefined variable
set -o pipefail: Pipe fails if any command fails
set -x (xtrace): Print commands before execution
Best practice combo:
set -euo pipefail # Strict mode
set -E # ERR trap inherits to functions
set -T # DEBUG/RETURN traps inherit to functions
You can also use set +e to temporarily disable, and $- contains current options. The gotcha: set -e doesn’t work in certain contexts like if conditions.
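That gotcha is worth demonstrating. In this sketch, `set -e` is active, yet the failing command inside the function does not abort anything, because errexit is suspended while a command runs as part of an `if` condition:

```shell
#!/usr/bin/env bash
set -e

fails() {
    false              # non-zero status
    echo "still ran"   # reached anyway: errexit is suspended in an if condition
}

if fails; then
    survived="yes"     # fails ended with echo (status 0), so this branch runs
fi
echo "script survived: $survived"
```

The same suspension applies to commands on the left of `&&`/`||` and to negated commands, which is why strict-mode scripts still need explicit error checks in those spots.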
Q9: What are coprocesses in bash and how do you use them?
Coprocesses (bash 4+) let you run background processes with two-way communication:
# Start a coprocess
coproc BC { bc; }

# Send data to the coprocess
echo "2+2" >&${BC[1]}

# Read the result
read result <&${BC[0]}
echo $result # 4

# More complex example with a named coprocess
coproc GREP { grep --line-buffered "error"; }
echo "this is fine" >&${GREP[1]}
echo "error occurred" >&${GREP[1]}
read -t 1 line <&${GREP[0]} && echo "Got: $line"
Coprocesses are useful for maintaining persistent connections to programs like bc, awk, or database clients without the overhead of starting new processes.
Q10: How does bash handle arithmetic and what are the different ways to perform calculations?
Bash offers multiple arithmetic methods, each with different capabilities:
# Arithmetic expansion - integers only
result=$((5 + 3))
((count++)) # C-style increment

# let command
let "result = 5 + 3"

# expr command (external, avoid)
result=$(expr 5 + 3)

# bc for floating point
result=$(echo "scale=2; 5/3" | bc)

# Arithmetic comparison in conditions
if ((5 > 3)); then echo "yes"; fi

# C-style for loop
for ((i=0; i<10; i++)); do
    echo $i
done
Bash only supports integer arithmetic natively. For floating-point, use bc or awk. Note the status mapping of ((…)): a non-zero arithmetic result gives exit status 0 (true), while a result of 0 gives exit status 1, which inverts the usual convention that 0 means success.
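The status mapping, and the post-increment trap it causes, in a quick runnable sketch:

```shell
#!/usr/bin/env bash
(( 5 > 3 )) && echo "non-zero result: exit status 0, i.e. true"
(( 0 )) || echo "zero result: exit status 1, i.e. false"

# Classic trap: count++ evaluates to the OLD value, so the first
# increment from 0 has status 1 even though count is updated
count=0
(( count++ )) || echo "post-increment from 0 returned status 1"
echo "count is now $count"
```

Under `set -e`, that `(( count++ ))` on a zero counter would kill the script; use `(( ++count ))` or append `|| true`.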
Q11: What’s the difference between source, ., and executing a script?
These three methods have crucial differences:
./script.sh - Runs in a subshell, can’t modify parent environment
source script.sh or . script.sh - Runs in current shell, can modify environment
exec script.sh - Replaces current shell process entirely
Example showing the difference:
# In script.sh:
export MY_VAR="hello"

# Method 1: won't see MY_VAR after
./script.sh
echo $MY_VAR # Empty

# Method 2: will see MY_VAR
source script.sh
echo $MY_VAR # hello

# Method 3: never returns to the original shell
exec script.sh
echo "Never printed"
Use source for configuration files, ./ for independent scripts, and exec for wrapper scripts.
Q12: How do you properly handle spaces and special characters in bash?
This is where most scripts break. Key strategies:
# Always quote variables
# Bad:  rm $file
# Good: rm "$file"

# Use arrays for multiple items
files=("file 1.txt" "file 2.txt")
for f in "${files[@]}"; do
    cat "$f"
done

# Read with proper IFS handling
while IFS= read -r line; do
    echo "$line"
done < file.txt

# Handle filenames with newlines
find . -name "*.txt" -print0 | while IFS= read -r -d '' file; do
    echo "Processing: $file"
done

# Safe command construction
cmd=(ls -la "/path with spaces/")
"${cmd[@]}" # Executes correctly
The golden rule: if it’s not a numeric value you’re absolutely sure about, quote it. Use "$@" not $* for arguments, and always use -r with read to prevent backslash interpretation.