Linux Kernel Mastery: Design Principles & Case Studies

Understanding the Linux OOM Killer: A Deep Dive with Practical Testing

The Linux Out-of-Memory (OOM) killer is one of the kernel’s most critical defense mechanisms against system lockups caused by memory exhaustion. When your system runs dangerously low on available RAM, the OOM killer springs into action, making life-or-death decisions about which processes to terminate to keep the system running.

What is the OOM Killer?

The OOM killer is a kernel mechanism that the memory allocator invokes when it cannot allocate memory for essential operations, even after reclaiming everything it can. Rather than allowing the entire system to freeze or crash, it selectively terminates processes to free up memory.

When Does the OOM Killer Activate?

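The OOM killer triggers when the kernel cannot satisfy a memory allocation request even after trying to reclaim memory. In practice this means:

Physical memory (RAM) is nearly exhausted
Swap space is full or unavailable
Page cache and other reclaimable memory cannot be freed to cover the request

Severe swapping (thrashing) often precedes an OOM event, but degraded performance alone does not trigger the OOM killer.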

How Does It Choose Victims?

On current kernels (since the heuristic was rewritten in 2.6.36), the OOM killer uses a deliberately simple badness score that is dominated by memory footprint:

Memory Usage: The score is based on the process’s resident memory, swap usage, and page-table size, so the largest consumers score highest
OOM Score Adjustment: The value in /proc/PID/oom_score_adj is added to bias the score up or down
Process Type: Kernel threads and the init process are never selected
Process Age, Priority, and Children: Older kernels factored in run time, nice value, and child processes; the current heuristic ignores them

The resulting score for any process can be read from /proc/PID/oom_score.
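
To inspect these values for a live process, a minimal C sketch can read the two /proc files involved. The helper names read_proc_long and print_own_oom_scores are illustrative, not part of any library, and the sketch assumes a Linux system with /proc mounted.

```c
#include <stdio.h>

/* Read a single integer from a /proc file such as /proc/self/oom_score.
 * Returns 0 on success and -1 if the file cannot be opened or parsed. */
int read_proc_long(const char *path, long *value) {
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    int ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Print the current score and adjustment of the calling process.
 * Returns 0 on success and -1 if either file could not be read. */
int print_own_oom_scores(void) {
    long score, adj;
    if (read_proc_long("/proc/self/oom_score", &score) != 0) return -1;
    if (read_proc_long("/proc/self/oom_score_adj", &adj) != 0) return -1;
    printf("oom_score=%ld oom_score_adj=%ld\n", score, adj);
    return 0;
}
```

Calling print_own_oom_scores() from main() prints one line for the current process; passing a path such as /proc/1234/oom_score to read_proc_long inspects another process instead.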

OOM Score Adjustment Values

The oom_score_adj file allows fine-tuning of the OOM killer’s target selection:

-1000: Disables OOM killing for the process entirely (OOM_SCORE_ADJ_MIN)
0: Default behavior
+1000: Makes the process the most likely victim (OOM_SCORE_ADJ_MAX)
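
As a hedged illustration of how a program might adjust its own value, the sketch below writes to /proc/self/oom_score_adj and reads the result back. The function names set_own_oom_score_adj and get_own_oom_score_adj are made up for this example. Raising the value is normally allowed for any process, while lowering it below the inherited minimum typically requires CAP_SYS_RESOURCE.

```c
#include <stdio.h>

/* Write an adjustment in [-1000, 1000] to /proc/self/oom_score_adj.
 * Returns 0 on success, -1 on a bad range or an open/write error. */
int set_own_oom_score_adj(int adj) {
    if (adj < -1000 || adj > 1000) return -1;
    FILE *f = fopen("/proc/self/oom_score_adj", "w");
    if (!f) return -1;
    int ok = (fprintf(f, "%d", adj) > 0);
    /* The data reaches the kernel when the stream is flushed on close,
     * so a permission error may only surface here. */
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* Read the current adjustment back. Returns 0 on success, -1 on error. */
int get_own_oom_score_adj(int *adj) {
    FILE *f = fopen("/proc/self/oom_score_adj", "r");
    if (!f) return -1;
    int ok = (fscanf(f, "%d", adj) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

A daemon that must survive memory pressure could call set_own_oom_score_adj(-1000) at startup when running with sufficient privileges, while a disposable worker could call set_own_oom_score_adj(500) to volunteer itself as an early victim.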

Building an OOM Killer Test Program

To understand how the OOM killer works in practice, I’ve developed an enhanced test program that systematically consumes system memory until the OOM killer intervenes. This tool is invaluable for:

Testing OOM killer behavior in controlled environments
Validating system memory limits and configurations
Understanding memory allocation patterns
Debugging memory-related issues

Key Features of the Test Program

Intelligent Memory Allocation: Uses large chunks initially, then adapts to smaller sizes as memory becomes scarce
Memory Pattern Verification: Ensures allocated memory is actually committed to physical RAM
Resource Limit Removal: Raises the process’s memory-related resource limits to unlimited, which usually requires root
Memory Locking: Prevents memory from being swapped, forcing RAM usage
Real-time Monitoring: Provides detailed progress reports and system memory status
Graceful Signal Handling: Reports statistics when terminated by SIGTERM or SIGINT (the OOM killer itself sends SIGKILL, which cannot be caught)
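
The pattern-writing step matters because freshly allocated pages are not backed by physical RAM until they are written. The following sketch, separate from the test program, makes this visible by comparing the VmRSS value in /proc/self/status before and after touching a new anonymous mapping. The names read_vm_rss_kb and measure_touch are hypothetical helpers for this illustration.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Return the resident set size of this process in kB, or -1 on error. */
long read_vm_rss_kb(void) {
    FILE *f = fopen("/proc/self/status", "r");
    if (!f) return -1;
    char line[256];
    long kb = -1;
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "VmRSS: %ld kB", &kb) == 1) break;
    }
    fclose(f);
    return kb;
}

/* Map `bytes` of anonymous memory and record RSS (in kB) before and after
 * writing one byte to every page. Returns 0 on success, -1 on error. */
int measure_touch(size_t bytes, long *before_kb, long *after_kb) {
    void *mem = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    volatile unsigned char *p = mem;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    *before_kb = read_vm_rss_kb();      /* mapping exists, pages untouched */
    for (size_t off = 0; off < bytes; off += page) {
        p[off] = 1;                     /* first write faults the page in */
    }
    *after_kb = read_vm_rss_kb();       /* pages are now resident */

    munmap(mem, bytes);
    return (*before_kb < 0 || *after_kb < 0) ? -1 : 0;
}
```

With a 64 MB mapping, the second reading should exceed the first by roughly the size of the mapping, which is why the test program writes to every chunk instead of relying on malloc alone.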

The Complete Enhanced OOM Killer Test Program

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <signal.h>
#include <errno.h>
#include <time.h>
#include <linux/oom.h>

#define CHUNK_SIZE (1024 * 1024)  // 1MB chunks for faster allocation
#define PATTERN_SIZE 4096         // Pattern size for memory verification
#define MAX_CHUNKS 1000000        // Safety limit
#define PROGRESS_INTERVAL 100     // Report progress every N chunks

// Global variables for cleanup
static void **allocated_chunks = NULL;
static size_t chunk_count = 0;
static size_t total_allocated = 0;

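// Signal handler for SIGTERM/SIGINT. printf() and exit() are not
// async-signal-safe; tolerable in a throwaway test, but avoid in real code.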
void signal_handler(int sig) {
    printf("\nReceived signal %d. Allocated %zu MB before termination.\n", 
           sig, total_allocated / (1024 * 1024));
    exit(EXIT_SUCCESS);
}

// Get current memory info from /proc/meminfo
void print_memory_info() {
    FILE *meminfo = fopen("/proc/meminfo", "r");
    if (!meminfo) return;
    
    char line[256];
    printf("\n=== Memory Status ===\n");
    while (fgets(line, sizeof(line), meminfo)) {
        if (strncmp(line, "MemTotal:", 9) == 0 ||
            strncmp(line, "MemFree:", 8) == 0 ||
            strncmp(line, "MemAvailable:", 13) == 0 ||
            strncmp(line, "Cached:", 7) == 0 ||
            strncmp(line, "SwapTotal:", 10) == 0 ||
            strncmp(line, "SwapFree:", 9) == 0) {
            printf("%s", line);
        }
    }
    printf("====================\n\n");
    fclose(meminfo);
}

// Set OOM score to maximum (most likely to be killed)
int set_oom_score_max() {
    FILE *oom_file = fopen("/proc/self/oom_score_adj", "w");
    if (!oom_file) {
        perror("Failed to open oom_score_adj");
        return 0;
    }
    
    if (fprintf(oom_file, "%d", OOM_SCORE_ADJ_MAX) < 0) {
        perror("Failed to write oom_score_adj");
        fclose(oom_file);
        return 0;
    }
    
    if (fclose(oom_file) != 0) {
        perror("Failed to close oom_score_adj");
        return 0;
    }
    
    printf("OOM score set to maximum (%d)\n", OOM_SCORE_ADJ_MAX);
    return 1;
}

// Remove all memory limits
int remove_memory_limits() {
    const struct rlimit unlimited = {RLIM_INFINITY, RLIM_INFINITY};
    
    struct {
        int resource;
        const char *name;
    } limits[] = {
        {RLIMIT_AS, "RLIMIT_AS"},
        {RLIMIT_DATA, "RLIMIT_DATA"}, 
        {RLIMIT_STACK, "RLIMIT_STACK"},
        {RLIMIT_RSS, "RLIMIT_RSS"},
        {RLIMIT_MEMLOCK, "RLIMIT_MEMLOCK"}
    };
    
    for (size_t i = 0; i < sizeof(limits) / sizeof(limits[0]); i++) {
        if (setrlimit(limits[i].resource, &unlimited) != 0) {
            printf("Warning: Failed to set %s to unlimited: %s\n", 
                   limits[i].name, strerror(errno));
            // Continue anyway - some limits might not be critical
        } else {
            printf("Set %s to unlimited\n", limits[i].name);
        }
    }
    
    return 1;
}

// Lock all memory to prevent swapping
int lock_memory() {
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        printf("Warning: Failed to lock memory: %s\n", strerror(errno));
        printf("Continuing without memory locking...\n");
        return 0;  // Continue anyway
    }
    
    printf("Memory locking enabled (no swap)\n");
    return 1;
}

// Allocate and initialize a memory chunk
void* allocate_chunk(size_t size) {
    void *ptr = malloc(size);
    if (!ptr) return NULL;
    
    // Create a pattern to ensure memory is actually used
    static unsigned char pattern[PATTERN_SIZE];
    static int pattern_initialized = 0;
    
    if (!pattern_initialized) {
        for (int i = 0; i < PATTERN_SIZE; i++) {
            pattern[i] = (unsigned char)(i % 256);
        }
        pattern_initialized = 1;
    }
    
    // Fill the entire chunk with pattern
    unsigned char *byte_ptr = (unsigned char*)ptr;
    for (size_t i = 0; i < size; i += PATTERN_SIZE) {
        size_t copy_size = (size - i < PATTERN_SIZE) ? size - i : PATTERN_SIZE;
        memcpy(byte_ptr + i, pattern, copy_size);
    }
    
    // Sanity-check the first block; the writes above already forced commitment
    if (memcmp(ptr, pattern, (size < PATTERN_SIZE) ? size : PATTERN_SIZE) != 0) {
        printf("Memory verification failed!\n");
        free(ptr);
        return NULL;
    }
    
    return ptr;
}

int main(int argc, char *argv[]) {
    size_t chunk_size = CHUNK_SIZE;
    int verbose = 0;
    
    // Parse command line arguments
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-v") == 0 || strcmp(argv[i], "--verbose") == 0) {
            verbose = 1;
        } else if (strcmp(argv[i], "-h") == 0 || strcmp(argv[i], "--help") == 0) {
            printf("Usage: %s [-v|--verbose] [-h|--help]\n", argv[0]);
            printf("  -v, --verbose: Show detailed progress\n");
            printf("  -h, --help:    Show this help\n");
            return EXIT_SUCCESS;
        }
    }
    
    printf("Enhanced OOM Killer Test Program\n");
    printf("================================\n");
    printf("Chunk size: %zu KB\n", chunk_size / 1024);
    printf("PID: %d\n", getpid());
    
    // Set up signal handlers. SIGKILL cannot be caught, blocked, or ignored,
    // so a kill by the OOM killer ends the process without running a handler.
    signal(SIGTERM, signal_handler);
    signal(SIGINT, signal_handler);
    
    // Show initial memory state
    if (verbose) print_memory_info();
    
    // Configure the process for OOM testing
    if (!remove_memory_limits()) {
        fprintf(stderr, "Failed to remove memory limits\n");
        return EXIT_FAILURE;
    }
    
    lock_memory();  // Continue even if this fails
    
    if (!set_oom_score_max()) {
        fprintf(stderr, "Failed to set OOM score\n");
        return EXIT_FAILURE;
    }
    
    // Allocate array to track chunks (optional, for debugging)
    allocated_chunks = calloc(MAX_CHUNKS, sizeof(void*));
    if (!allocated_chunks) {
        printf("Warning: Cannot track allocated chunks\n");
    }
    
    printf("\nStarting memory allocation...\n");
    
    struct timeval start_time, current_time;
    gettimeofday(&start_time, NULL);
    
    // Main allocation loop
    while (chunk_count < MAX_CHUNKS) {
        void *chunk = allocate_chunk(chunk_size);
        if (!chunk) {
            // Try smaller chunks if large allocation fails
            if (chunk_size > 4096) {
                chunk_size /= 2;
                printf("Allocation failed, reducing chunk size to %zu KB\n", 
                       chunk_size / 1024);
                continue;
            } else {
                printf("Cannot allocate even small chunks. Total: %zu MB\n", 
                       total_allocated / (1024 * 1024));
                break;
            }
        }
        
        if (allocated_chunks) {
            allocated_chunks[chunk_count] = chunk;
        }
        
        chunk_count++;
        total_allocated += chunk_size;
        
        // Progress reporting
        if (chunk_count % PROGRESS_INTERVAL == 0) {
            gettimeofday(&current_time, NULL);
            double elapsed = (current_time.tv_sec - start_time.tv_sec) + 
                           (current_time.tv_usec - start_time.tv_usec) / 1000000.0;
            
            printf("Allocated %zu chunks (%zu MB) in %.2f seconds (%.2f MB/s)\n",
                   chunk_count, total_allocated / (1024 * 1024), 
                   elapsed, (total_allocated / (1024 * 1024)) / elapsed);
            
            if (verbose && chunk_count % (PROGRESS_INTERVAL * 10) == 0) {
                print_memory_info();
            }
        }
    }
    
    printf("\nAllocation complete or failed.\n");
    printf("Total chunks: %zu\n", chunk_count);
    printf("Total memory: %zu MB\n", total_allocated / (1024 * 1024));
    
    if (verbose) print_memory_info();
    
    // Keep the process alive to maintain memory pressure
    printf("Keeping process alive to maintain memory pressure...\n");
    printf("The OOM killer should terminate this process when memory runs low.\n");
    
    while (1) {
        sleep(1);
        // Occasionally touch the memory to keep it active
        if (allocated_chunks && chunk_count > 0) {
            static size_t touch_index = 0;
            if (touch_index < chunk_count) {
                volatile char *ptr = (volatile char*)allocated_chunks[touch_index];
                if (ptr) {
                    *ptr = *ptr; // Touch the memory
                }
                touch_index = (touch_index + 1) % chunk_count;
            }
        }
    }
    
    return EXIT_SUCCESS;
}

Compilation and Usage

To compile and run the OOM killer test program:

# Compile the program
gcc -o oom_test oom_test.c -Wall -O2

# Run with basic output
sudo ./oom_test

# Run with verbose memory monitoring
sudo ./oom_test -v

# Show help
./oom_test -h

What to Expect

When you run this program, you’ll observe:

Initial Setup: The program configures itself as the primary OOM target
Memory Allocation: Rapid consumption of available RAM with progress reports
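System Response: Other processes’ pages are pushed to swap and the system slows down (the test program’s own pages stay in RAM if memory locking succeeded)
OOM Killer Activation: The kernel logs the OOM event and terminates the process with SIGKILL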
System Recovery: Immediate return of freed memory to the system
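
Because the kernel ends the victim with SIGKILL, the victim cannot report its own death, but a parent process can observe it. The sketch below shows one way a wrapper might detect this with fork() and waitpid(). The function run_child_and_get_signal is a hypothetical helper, and simulate_oom_kill merely stands in for the memory hog by sending SIGKILL to itself.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, run child_fn in the child, and return the number of the signal
 * that terminated it, 0 if it exited normally, or -1 on fork/wait failure. */
int run_child_and_get_signal(void (*child_fn)(void)) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        child_fn();
        _exit(0);               /* reached only if child_fn returns */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Stand-in for the memory hog: dies from SIGKILL as an OOM victim would. */
void simulate_oom_kill(void) {
    kill(getpid(), SIGKILL);
}

/* A child that finishes without being signalled. */
void exit_normally(void) {
}
```

A return value equal to SIGKILL is consistent with an OOM kill but does not prove it, since any process with permission can send that signal; the kernel log remains the authoritative record.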

Monitoring OOM Events

You can monitor OOM killer activity using:

# Real-time kernel messages
dmesg -w

# Kernel messages from the systemd journal
journalctl -k -f

# Memory usage monitoring (separate terminal)
watch -n 1 'free -h && echo "=== Top Memory Users ===" && ps aux --sort=-%mem | head -10'
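
For programmatic monitoring, kernels from roughly 4.13 onward expose a cumulative oom_kill counter in /proc/vmstat. The following is a small sketch of reading it from C, assuming that counter is present on the target system; parse_oom_kill_count and read_oom_kill_count are illustrative names.

```c
#include <stdio.h>

/* Scan an open vmstat-style stream for the "oom_kill" line.
 * Returns 0 and stores the counter on success, -1 if the line is absent. */
int parse_oom_kill_count(FILE *f, long *count) {
    char line[256];
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "oom_kill %ld", count) == 1) return 0;
    }
    return -1;
}

/* Read the system-wide OOM kill counter from /proc/vmstat.
 * Returns 0 on success, -1 if the file or the counter is unavailable. */
int read_oom_kill_count(long *count) {
    FILE *f = fopen("/proc/vmstat", "r");
    if (!f) return -1;
    int rc = parse_oom_kill_count(f, count);
    fclose(f);
    return rc;
}
```

Sampling read_oom_kill_count() before and after a test run and comparing the two values gives a simple check that an OOM kill happened in between, without parsing log text.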

Safety Considerations

⚠️ Important Warning: This program is designed to consume all available system memory and trigger the OOM killer. Only use it in:

Test environments or virtual machines
Systems where data loss is acceptable
Controlled scenarios for educational or debugging purposes

Never run this on production systems or machines with important unsaved work.

Practical Applications

This OOM killer test program is valuable for:

System Testing: Validating memory limits and OOM behavior
Performance Tuning: Understanding memory allocation patterns
Container Testing: Testing memory constraints in Docker/Kubernetes
Educational Purposes: Learning about Linux memory management
Debugging: Reproducing memory-related issues

Understanding the Output

The program provides detailed information about:

Memory allocation rates (MB/s)
System memory status from /proc/meminfo
Resource limit modifications
OOM score adjustments
Real-time progress updates

Conclusion

The Linux OOM killer is a sophisticated mechanism that prevents complete system failure during memory exhaustion scenarios. By understanding how it works and testing it in controlled environments, system administrators and developers can better design resilient applications and configure systems for optimal memory management.

This enhanced test program provides a comprehensive tool for exploring OOM killer behavior while maintaining safety through careful monitoring and graceful error handling. Use it wisely to gain deeper insights into Linux memory management and system behavior under extreme conditions.

Remember: with great power comes great responsibility. Always test in safe environments and understand the implications of triggering the OOM killer on your systems.