Understanding the Linux OOM Killer: A Deep Dive with Practical Testing
The Linux Out-of-Memory (OOM) killer is one of the kernel’s most critical defense mechanisms against system lockups caused by memory exhaustion. When your system runs dangerously low on available RAM, the OOM killer springs into action, making life-or-death decisions about which processes to terminate to keep the system running.
What is the OOM Killer?
The OOM killer is a kernel mechanism of last resort. It is invoked when the kernel cannot free enough memory to satisfy an allocation, even after reclaiming caches and swapping out pages. Rather than allowing the entire system to freeze or crash, it selectively terminates processes to free up memory.
When Does the OOM Killer Activate?
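The OOM killer triggers when the kernel cannot satisfy a memory allocation request even after trying to reclaim memory. This typically happens when:
- Physical memory (RAM) is nearly exhausted and little of it is reclaimable
- Swap space is full or unavailable
Severe slowdown from excessive swapping (thrashing) often precedes an OOM kill but does not trigger one by itself; the killer acts only once an allocation actually fails.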
How Does It Choose Victims?
The OOM killer assigns each candidate process a score (readable from /proc/PID/oom_score) and kills the process with the highest one. In current kernels the score depends on only a few factors:
- Memory Usage: The more memory a process holds (resident pages, swapped-out pages and page tables), the higher its score
- OOM Score Adjustment: Manual adjustments via /proc/PID/oom_score_adj shift the score up or down
- Process Type: Kernel threads and the init process are never selected
Older kernels (before the heuristic was rewritten around 2.6.36) also weighed process age, nice value and the memory of child processes. Current kernels ignore these, so a new or low-priority process is not a preferred target unless it also uses a lot of memory.
OOM Score Adjustment Values
The oom_score_adj file allows fine-tuning of the OOM killer’s target selection:
- -1000: Never kill; the process is exempt from OOM selection (OOM_SCORE_ADJ_MIN)
- 0: Default behavior
- +1000: Most likely to be killed first (OOM_SCORE_ADJ_MAX)
Building an OOM Killer Test Program
To understand how the OOM killer works in practice, I’ve developed an enhanced test program that systematically consumes system memory until the OOM killer intervenes. This tool is invaluable for:
- Testing OOM killer behavior in controlled environments
- Validating system memory limits and configurations
- Understanding memory allocation patterns
- Debugging memory-related issues
Key Features of the Test Program
- Intelligent Memory Allocation: Uses large chunks initially, then adapts to smaller sizes as memory becomes scarce
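- Memory Pattern Verification: Writes a pattern to every allocated page so the memory is actually backed by physical RAM, not just reserved address space
- Resource Limit Removal: Raises the process’s memory-related resource limits to unlimited, which generally requires root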
- Memory Locking: Prevents memory from being swapped, forcing RAM usage
- Real-time Monitoring: Provides detailed progress reports and system memory status
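- Graceful Signal Handling: Reports statistics when terminated by SIGTERM or SIGINT (the OOM killer’s SIGKILL cannot be caught, so no report is printed in that case)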
The Complete Enhanced OOM Killer Test Program
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <signal.h>
#include <errno.h>
#include <time.h>
#include <linux/oom.h>
#define CHUNK_SIZE (1024 * 1024) // 1MB chunks for faster allocation
#define PATTERN_SIZE 4096 // Pattern size for memory verification
#define MAX_CHUNKS 1000000 // Safety limit
#define PROGRESS_INTERVAL 100 // Report progress every N chunks
// Global variables for cleanup
static void **allocated_chunks = NULL;
static size_t chunk_count = 0;
static size_t total_allocated = 0;
// Signal handler for graceful shutdown
void signal_handler(int sig) {
printf("\nReceived signal %d. Allocated %zu MB before termination.\n",
sig, total_allocated / (1024 * 1024));
exit(EXIT_SUCCESS);
}
// Get current memory info from /proc/meminfo
void print_memory_info() {
FILE *meminfo = fopen("/proc/meminfo", "r");
if (!meminfo) return;
char line[256];
printf("\n=== Memory Status ===\n");
while (fgets(line, sizeof(line), meminfo)) {
if (strncmp(line, "MemTotal:", 9) == 0 ||
strncmp(line, "MemFree:", 8) == 0 ||
strncmp(line, "MemAvailable:", 13) == 0 ||
strncmp(line, "Cached:", 7) == 0 ||
strncmp(line, "SwapTotal:", 10) == 0 ||
strncmp(line, "SwapFree:", 9) == 0) {
printf("%s", line);
}
}
printf("====================\n\n");
fclose(meminfo);
}
// Set OOM score to maximum (most likely to be killed)
int set_oom_score_max() {
FILE *oom_file = fopen("/proc/self/oom_score_adj", "w");
if (!oom_file) {
perror("Failed to open oom_score_adj");
return 0;
}
if (fprintf(oom_file, "%d", OOM_SCORE_ADJ_MAX) < 0) {
perror("Failed to write oom_score_adj");
fclose(oom_file);
return 0;
}
if (fclose(oom_file) != 0) {
perror("Failed to close oom_score_adj");
return 0;
}
printf("OOM score set to maximum (%d)\n", OOM_SCORE_ADJ_MAX);
return 1;
}
// Remove all memory limits
int remove_memory_limits() {
const struct rlimit unlimited = {RLIM_INFINITY, RLIM_INFINITY};
struct {
int resource;
const char *name;
} limits[] = {
{RLIMIT_AS, "RLIMIT_AS"},
{RLIMIT_DATA, "RLIMIT_DATA"},
{RLIMIT_STACK, "RLIMIT_STACK"},
{RLIMIT_RSS, "RLIMIT_RSS"},
{RLIMIT_MEMLOCK, "RLIMIT_MEMLOCK"}
};
for (size_t i = 0; i < sizeof(limits) / sizeof(limits[0]); i++) {
if (setrlimit(limits[i].resource, &unlimited) != 0) {
printf("Warning: Failed to set %s to unlimited: %s\n",
limits[i].name, strerror(errno));
// Continue anyway - some limits might not be critical
} else {
printf("Set %s to unlimited\n", limits[i].name);
}
}
return 1;
}
// Lock all memory to prevent swapping
int lock_memory() {
if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
printf("Warning: Failed to lock memory: %s\n", strerror(errno));
printf("Continuing without memory locking...\n");
return 0; // Continue anyway
}
printf("Memory locking enabled (no swap)\n");
return 1;
}
// Allocate and initialize a memory chunk
void* allocate_chunk(size_t size) {
void *ptr = malloc(size);
if (!ptr) return NULL;
// Create a pattern to ensure memory is actually used
static unsigned char pattern[PATTERN_SIZE];
static int pattern_initialized = 0;
if (!pattern_initialized) {
for (int i = 0; i < PATTERN_SIZE; i++) {
pattern[i] = (unsigned char)(i % 256);
}
pattern_initialized = 1;
}
// Fill the entire chunk with pattern
unsigned char *byte_ptr = (unsigned char*)ptr;
for (size_t i = 0; i < size; i += PATTERN_SIZE) {
size_t copy_size = (size - i < PATTERN_SIZE) ? size - i : PATTERN_SIZE;
memcpy(byte_ptr + i, pattern, copy_size);
}
// Verify a portion of the memory to force commitment
if (memcmp(ptr, pattern, (size < PATTERN_SIZE) ? size : PATTERN_SIZE) != 0) {
printf("Memory verification failed!\n");
free(ptr);
return NULL;
}
return ptr;
}
int main(int argc, char *argv[]) {
size_t chunk_size = CHUNK_SIZE;
int verbose = 0;
// Parse command line arguments
for (int i = 1; i < argc; i++) {
if (strcmp(argv[i], "-v") == 0 || strcmp(argv[i], "--verbose") == 0) {
verbose = 1;
} else if (strcmp(argv[i], "-h") == 0 || strcmp(argv[i], "--help") == 0) {
printf("Usage: %s [-v|--verbose] [-h|--help]\n", argv[0]);
printf(" -v, --verbose: Show detailed progress\n");
printf(" -h, --help: Show this help\n");
return EXIT_SUCCESS;
}
}
printf("Enhanced OOM Killer Test Program\n");
printf("================================\n");
printf("Chunk size: %zu KB\n", chunk_size / 1024);
printf("PID: %d\n", getpid());
// Set up signal handlers
signal(SIGTERM, signal_handler);
signal(SIGINT, signal_handler);
// SIGKILL cannot be caught or ignored, so no handler is installed for it;
// an OOM kill therefore ends the process without running signal_handler
// Show initial memory state
if (verbose) print_memory_info();
// Configure the process for OOM testing
if (!remove_memory_limits()) {
fprintf(stderr, "Failed to remove memory limits\n");
return EXIT_FAILURE;
}
lock_memory(); // Continue even if this fails
if (!set_oom_score_max()) {
fprintf(stderr, "Failed to set OOM score\n");
return EXIT_FAILURE;
}
// Allocate array to track chunks (optional; used later to keep touching the memory)
allocated_chunks = calloc(MAX_CHUNKS, sizeof(void*));
if (!allocated_chunks) {
printf("Warning: Cannot track allocated chunks\n");
}
printf("\nStarting memory allocation...\n");
struct timeval start_time, current_time;
gettimeofday(&start_time, NULL);
// Main allocation loop
while (chunk_count < MAX_CHUNKS) {
void *chunk = allocate_chunk(chunk_size);
if (!chunk) {
// Try smaller chunks if large allocation fails
if (chunk_size > 4096) {
chunk_size /= 2;
printf("Allocation failed, reducing chunk size to %zu KB\n",
chunk_size / 1024);
continue;
} else {
printf("Cannot allocate even small chunks. Total: %zu MB\n",
total_allocated / (1024 * 1024));
break;
}
}
if (allocated_chunks) {
allocated_chunks[chunk_count] = chunk;
}
chunk_count++;
total_allocated += chunk_size;
// Progress reporting
if (chunk_count % PROGRESS_INTERVAL == 0) {
gettimeofday(&current_time, NULL);
double elapsed = (current_time.tv_sec - start_time.tv_sec) +
(current_time.tv_usec - start_time.tv_usec) / 1000000.0;
printf("Allocated %zu chunks (%zu MB) in %.2f seconds (%.2f MB/s)\n",
chunk_count, total_allocated / (1024 * 1024),
elapsed, (total_allocated / (1024 * 1024)) / elapsed);
if (verbose && chunk_count % (PROGRESS_INTERVAL * 10) == 0) {
print_memory_info();
}
}
}
printf("\nAllocation complete or failed.\n");
printf("Total chunks: %zu\n", chunk_count);
printf("Total memory: %zu MB\n", total_allocated / (1024 * 1024));
if (verbose) print_memory_info();
// Keep the process alive to maintain memory pressure
printf("Keeping process alive to maintain memory pressure...\n");
printf("The OOM killer should terminate this process when memory runs low.\n");
while (1) {
sleep(1);
// Occasionally touch the memory to keep it active
if (allocated_chunks && chunk_count > 0) {
static size_t touch_index = 0;
if (touch_index < chunk_count) {
volatile char *ptr = (volatile char*)allocated_chunks[touch_index];
if (ptr) {
*ptr = *ptr; // Touch the memory
}
touch_index = (touch_index + 1) % chunk_count;
}
}
}
return EXIT_SUCCESS;
}
Compilation and Usage
To compile and run the OOM killer test program:
# Compile the program
gcc -o oom_test oom_test.c -Wall -O2
# Run with basic output
sudo ./oom_test
# Run with verbose memory monitoring
sudo ./oom_test -v
# Show help
./oom_test -h
What to Expect
When you run this program, you’ll observe:
- Initial Setup: The program configures itself as the primary OOM target
- Memory Allocation: Rapid consumption of available RAM with progress reports
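- System Response: System slowdown and, if swap is configured, growing swap usage as other processes’ pages are pushed out (the test program’s own memory stays in RAM when memory locking succeeded)
- OOM Killer Activation: The kernel logs the OOM event and terminates the process with SIGKILL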
- System Recovery: Immediate return of freed memory to the system
Monitoring OOM Events
You can monitor OOM killer activity using:
# Real-time kernel messages
dmesg -w
# System log monitoring
journalctl -f
# Memory usage monitoring (separate terminal)
watch -n 1 'free -h && echo "=== Top Memory Users ===" && ps aux --sort=-%mem | head -10'
Safety Considerations
⚠️ Important Warning: This program is designed to consume all available system memory and trigger the OOM killer. Only use it in:
- Test environments or virtual machines
- Systems where data loss is acceptable
- Controlled scenarios for educational or debugging purposes
Never run this on production systems or machines with important unsaved work.
Practical Applications
This OOM killer test program is valuable for:
- System Testing: Validating memory limits and OOM behavior
- Performance Tuning: Understanding memory allocation patterns
- Container Testing: Testing memory constraints in Docker/Kubernetes
- Educational Purposes: Learning about Linux memory management
- Debugging: Reproducing memory-related issues
Understanding the Output
The program provides detailed information about:
- Memory allocation rates (MB/s)
- System memory status from /proc/meminfo
- Resource limit modifications
- OOM score adjustments
- Real-time progress updates
Conclusion
The Linux OOM killer is a sophisticated mechanism that prevents complete system failure during memory exhaustion scenarios. By understanding how it works and testing it in controlled environments, system administrators and developers can better design resilient applications and configure systems for optimal memory management.
This enhanced test program provides a comprehensive tool for exploring OOM killer behavior while maintaining safety through careful monitoring and graceful error handling. Use it wisely to gain deeper insights into Linux memory management and system behavior under extreme conditions.
Remember: with great power comes great responsibility. Always test in safe environments and understand the implications of triggering the OOM killer on your systems.