About Course
Master Linux Kernel Development Through Real-World Bug-Driven Learning
This is not a traditional theory-based kernel course. You’ll learn Linux kernel internals by analyzing, reproducing, and fixing real production bugs from kernel.org bugzilla. Every concept is taught through actual kernel issues, giving you practical skills that directly translate to professional kernel development.
Core Learning Methodology
Bug-Driven Kernel Mastery
- Learn kernel subsystems by studying real bugs that affected production systems
- Understand why kernel code is designed a certain way by analyzing actual issues
- Develop deep intuition for kernel architecture through hands-on bug analysis
- Build a portfolio of kernel patches as you progress through the course
Complete Bug Resolution Workflow
For every kernel bug covered, you will:
- Analyze – Study the bug report and understand the problem
- Explore – Navigate kernel source code to find relevant subsystems
- Reproduce – Use provided VM images or build custom reproducers
- Instrument – Add logging and tracing to understand execution flow
- Fix – Study the upstream patch and understand the solution
- Test – Validate the fix in isolated environments
- Document – Create visual diagrams and detailed analysis
- Contribute – Learn to submit patches following kernel standards
New bugs are added continuously, and across them you’ll gain expertise in the major kernel areas:
Core Subsystems
- Memory Management (MM subsystem)
- Process Scheduling and Task Management
- Virtual File System (VFS) and File Systems
- Networking Stack (TCP/IP, packet handling)
- Block Layer and Storage
- Device Drivers and Hardware Interaction
- System Calls and Kernel Entry Points
- Locking and Synchronization Mechanisms
Fundamental Concepts
- User space vs kernel space architecture
- Virtual memory and address space management
- Interrupt handling and bottom halves
- Kernel threading and work queues
- Reference counting and object lifecycle
- Error handling and resource management
- Hardware abstraction and platform support
- Boot sequence and initialization
Advanced Topics
- Race conditions and concurrency bugs
- Memory corruption and use-after-free issues
- Performance bottlenecks and optimization
- Security vulnerabilities and hardening
- Compatibility and regression issues
- Cross-architecture considerations
- Kernel configuration and build system
What Will You Learn?
- Learn kernel subsystems by studying real bugs that affected production systems, including:
  - Why certain design decisions were made
  - What edge cases developers must consider
  - How subsystems interact in unexpected ways
  - Common pitfalls in kernel development
  - Best practices validated by production experience
Course Content
VM Setup – Enabling copy paste & Creating Shared Drive
-
DOC
Linux Kernel Issues List
-
Linux Kernel Issues List
-
Bug to start with
Linux kernel – Complete System Call Flow Analysis
Introduction
A Deep Dive into Linux x86-64 System Call Mechanism
Part 1: User Space to Kernel Transition
Step 1: Your C Program (test_syscall.c)
User space invocation (Ring 3)
syscall() wrapper function call
Example: Custom syscall number 463
Step 2: GCC Compilation to Assembly
Compilation process: gcc -S test_syscall.c
Generated assembly analysis
Register argument preparation (RDI, RSI, RDX)
PLT (Procedure Linkage Table) mechanism
Dynamic linking with glibc
Step 3: GOT Resolution Process
PLT.sec entry and GOT (Global Offset Table)
Dynamic symbol resolution
_dl_runtime_resolve operation
Lazy binding mechanism
Finding glibc's syscall() function
Step 4: Glibc syscall() Wrapper
Location: /lib/x86_64-linux-gnu/libc.so.6
Register shuffling for kernel ABI
Argument placement (RAX, RDI, RSI, RDX, R10, R8, R9)
The syscall instruction (0x0f 0x05)
Part 2: Hardware-Level Privilege Transition
Step 5: The syscall Instruction (Hardware Magic)
CPU atomic operations
State preservation (RIP→RCX, RFLAGS→R11)
MSR (Model-Specific Register) configuration
MSR_STAR: Segment selectors
MSR_LSTAR: Kernel entry point
MSR_SYSCALL_MASK: Flags to clear
Privilege switch (Ring 3 → Ring 0)
CPL (Current Privilege Level) change
MSR Configuration Deep Dive
syscall_init() function
wrmsrl() implementation chain
wrmsrl() wrapper
native_write_msr() with tracing
__wrmsr() raw instruction
WRMSR assembly instruction breakdown
Boot-time MSR setup
Part 3: Kernel Entry and Execution
Step 6: Kernel Entry Point (entry_64.S)
File: arch/x86/entry/entry_64.S
entry_SYSCALL_64 function
swapgs instruction (GS register switch)
Stack switching (user→kernel)
Register saving on kernel stack
pt_regs structure creation
Call to C dispatcher: do_syscall_64
Step 7: System Call Dispatcher (common.c)
File: arch/x86/entry/common.c
do_syscall_64() function flow
Stack offset randomization
syscall_enter_from_user_mode()
do_syscall_x64() implementation
Bounds checking and security validation
Step 8: System Call Table Lookup
Generated file: arch/x86/include/generated/asm/syscalls_64.h
Macro expansion mechanism
__SYSCALL() macro processing
Switch-case generation
x64_sys_call() dispatcher
Finding syscall handler: __x64_sys_hello
Step 9: Your Syscall Executes (hello.c)
Custom syscall implementation
SYSCALL_DEFINE0(hello) macro
Kernel logging with pr_info()
Return value to user space
Part 4: Return Journey to User Space
Step 10: Syscall Exit Path
Return value propagation
syscall_exit_to_user_mode() function
__syscall_exit_to_user_mode_work()
Syscall-specific cleanup
Interrupt disabling
Pending work handling
Interrupt Disabling Deep Dive
local_irq_disable() macro
native_irq_disable() implementation
CLI instruction (asm volatile("cli"))
RFLAGS interrupt flag clearing
Why interrupt disabling is critical
Step 11: The sysretq Instruction
x86-64 return mechanism
Hardware operations (atomic)
RIP ← RCX (return address restoration)
RFLAGS ← R11 (flags restoration)
CS ← MSR_STAR[63:48] + 16 (user code segment)
CS.RPL ← 3 (Ring 3 privilege)
SS ← MSR_STAR[63:48] + 8 (user stack segment)
SS.RPL ← 3 (Ring 3 privilege)
CPL ← 3 (user mode switch)
Linux Segment Selectors
Kernel segments: __KERNEL_CS (0x10), __KERNEL_DS (0x18)
User segments: __USER_CS (0x33), __USER_DS (0x2B)
CPL encoding in segment selectors
Ring 0 vs Ring 3 comparison table
MSR_STAR Segment Calculation
MSR_STAR register layout (64-bit)
SYSRET CS base selector
CS = MSR_STAR[63:48] + 16 calculation
SS = MSR_STAR[63:48] + 8 calculation
Example: 0x23 + 16 = 0x33 (__USER_CS)
Part 5: Return to User Space
Step 12: Return to Glibc
Glibc syscall() return point (0x130fdd)
Error checking logic
Comparison with 0xfffffffffffff001
Error code range (-4095 to -1)
errno setting on failure
Success path: direct return
Return instruction to test_syscall.c
Step 13: Complete Flow Verification
Expected output demonstration
Live system proof
Kernel log verification with dmesg
Success confirmation
-
Complete System Call Flow – A Deep Dive into Linux x86-64 System Call Mechanism
Understanding Linux Kernel Modules & Module Programming
Topics Covered
» Building Custom Kernel Modules from Scratch
» How insmod Loads a Module (Step-by-Step Inside the Kernel)
» The Kernel Module Linked List (How Modules Live in Memory)
» How lsmod Reads and Displays All Loaded Modules
» How rmmod Safely Removes a Module
» Writing a Module that Replicates lsmod Output
» Modifying Kernel Source (module.c) to Trace insmod Stages in Real Time
» Real-World Compatibility: Kernel 3.9.0 API Differences
-
What Is a Kernel Module?
Linux Kernel – Analyzing Crash Dump – Ex: 218536
https://bugzilla.kernel.org/show_bug.cgi?id=218536
-
Linux Kernel – Tracing Dmesg
-
Decoding – tcp_rcv_space_adjust+0xbe/0x160
-
Kernel Stack Trace Debugging
-
Full Analysis – divide error in tcp_rcv_space_adjust
Case Study 1 – Huge Page
-
Huge Page Sample – video and text
-
Hugepages in Linux
-
Overview of Huge Pages
Case Study 2 – OOM Killer
OOM Killer
-
OOM Killer Internals
-
Code Changes
-
Reproducer program
-
memcg OOM execution
-
memcg OOM execution – Dmesg
-
memcg OOM execution – Patch
-
memcg OOM execution – SYSTEM OOM vs MEMCG OOM – Side by Side Comparison
-
System wide OOM & memcg oom
Case Study 3 – Memory Management Deep Dive
-
Types of memory hardware or software handled
Case Study 4 – Bug 60665 – TCP Backlog Overflow Fix
Bug Details
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=60665
Summary: int backlog is assigned to unsigned short sk_max_ack_backlog causing overflow
-
Overview – Bug #60665 – TCP Backlog Overflow
-
Reproducing the Bug
-
Bug Reproduced – The Smoking Gun Evidence:
-
Linux Kernel Networking Internals – From Backlog Bug
-
Quick Reference: Logging Locations for Backlog Bug Study
-
Hands-On Tutorial: Adding Logs to Study the Backlog Bug
-
Bug Fix
-
SYN queue, Accept queue and hash table
Case Study 5 – Bug 209949 – swap is not activated
With kernel 5.10-rc1, swap is not activated; swapon says "file is not commited".
-
Dmesg : Bug 209949
-
Bug Analysis
-
VISUAL ANALYSIS: Swap Activation Bug with Diagrams
-
Understanding – FILE HOLES, SPARSE FILES, AND SWAP SECURITY
-
Pressing Enter 1000 times is not a hole
-
Why – File size != only what you wrote
-
WHY ext4 developers migrated to iomap
-
What is the 512-byte or block-sized paradigm?
-
UNDERSTANDING: ext4, bmap, and iomap – The Complete Picture
-
UNDERSTANDING: iomap.type and File Mapping Details
-
How does that iomap.type field actually get set?
-
sudo swapon /home/lakshmi/swapfile
-
logging swap out path – RAM is under pressure & pages get swapped to disk.
Case Study 6 – What Is User Space and Kernel Space? What Actually Makes “User Space” “User Space”?
-
User Space, Kernel Space and Main Memory ( RAM )
-
User vs Kernel Space
-
User Space vs Kernel Space – 2
-
Demo
-
Virtual vs Physical: Linear Split vs Scattered Mapping
-
Virtual vs Physical: Linear Split vs Scattered Mapping – 2
-
Where are CPU_Ring and PTE_US_bit Stored?
-
CPU_Ring and PTE_US_bit – Physical Storage Locations – Visual
-
Who Sets CPL and U/S Bits?
-
Who Sets CPL and U/S Bits? – Simple Visual
-
All Scenarios Where CPL Changes from 3 to 0
-
All CPL 3→0 Scenarios – Quick Reference
-
U/S Bit: Lifecycle and Management – When This Bit Gets Set or Reset
-
Real-World Example: U/S Bit Settings for User Process
-
Why Check U/S Bit When Virtual Address Range Seems Obvious?
Case Study 7 – How does the kernel discover the hardware?
-
Architecture-Dependent Hardware Discovery in Linux : ACPI, Device Tree, and PCI
-
The Complete Boot Sequence & Who Calls What.
-
“ACPI Bus” is a Software Abstraction for Real Hardware
-
What Is Linux Device Model ? (The Unified Model)
-
How Devices Connect To Buses
-
Why Virtual Buses Exist – The Problem They Solve
-
ACPI Sub-System Architecture – A Virtual Bus – Code
-
ACPI Initialization Sequence – Bus First, Devices Second, Drivers Third
-
E820 vs ACPI | What Is E820 and How Is It Linked to ACPI?
-
ACPI Memory Hotplug (The Exception)
-
BIOS Creates, Kernel Reads – ACPI Tables
-
Complete drivers/acpi/ File Breakdown
-
Actual Hardware Layout
-
Relationship Between BIOS & The Kernel
-
E820 alternatives across all major architectures: x86_64, ARM, PowerPC, etc.
-
🔌 PCI Self-Discovery : How It Works
-
ACPI_Spec_6_5_Aug29 PDF
-
ACPI NAMESPACE – EXTRACTION FROM DMESG
Case Study 8 – Core Memory Management Initialization
-
Memory (RAM) Initialization Sequence – In 9 Stages
-
What Is a DIMM Slot (Dual In-line Memory Module Slot), and How Does BIOS Code Loop Over the Slots to Find RAM?
-
Physical Memory ( RAM ) Changes During 9 Stages of Initialization
-
Virtual Address Space – Stage 10 in Detail
-
Page Frame DataBase (struct page) – Stage 9 In Detail
-
Memblock Allocations During – Stage 4 – 7 in Detail
-
Complete list of all page flags
-
Is All RAM Tracked by struct page After Stage 9?
-
What Is initrd (initial ramdisk)? Why Do We Need initrd?
-
What the E820 Map Contains
-
Virtual Memory Setup – Creating Initial Page Table: create_initial_page_tables()
-
Part2 – Virtual Memory Setup – create_initial_page_tables()
-
Memblock Design: How It Tracks Allocations
-
Major Subsystems That Use Memblock
-
THE COMPLETE MEMORY INITIALIZATION SEQUENCE
Case Study 9 – Memblock Allocator Guide Reference Kernel 5.15.0-rc1
-
Memblock boot memory allocator
-
Hardware Discovery one more time
Case Study 10 – PCI Device Discovery in Linux Kernel
-
Linux PCI Device Discovery Flow Linux Kernel
-
PCI Discovery: Complete Function Call Flow ( For this System – ACPI Path)
-
Logs Explanation Line-by-Line Mapping Understanding Custom Logging
-
Changed C Files
-
Complete Linux Boot Sequence: Firmware to PCI Discovery
-
Linux PCI Device Discovery Flow – Detailed Analysis
-
PCI Discovery: Complete Function Call Flow
-
Your Logs Explained: Line-by-Line Mapping – Part 2
-
Complete Linux Boot Flow with File Paths – Part 1
-
Complete Linux Boot Flow with File Paths – Part 2
-
Complete Linux Boot Sequence: Firmware to PCI Discovery
-
Complete Linux Boot Sequence: Firmware to PCI Discovery: V2
-
Linux Boot Flow – File Names Only
-
Understanding PCI0 and PNP0A03:00
-
Understanding Device(PCI0) in BIOS
-
How PNP0A03:00 Enables Discovery of PCI Devices
-
Complete Discovery of Audio Device 00:05.0
-
Why PCI Root Bridge Cannot Be Self-Discovered
Case Study 11 – System.map – Everything You Need to Know About Linux Kernel Symbol Tables
-
Everything You Need to Know About Linux Kernel Symbol Tables
-
What System.map Contains
Case Study 12 – The Complete Linux Kernel Boot Sequence
-
The Complete Linux Kernel Boot Sequence
-
PART 1: OVERVIEW & ARCHITECTURE
-
PART 2: THE 8 INITCALL LEVELS
-
PART 3: TECHNICAL DEEP DIVE
-
PART 4: PRACTICAL GUIDE
Case Study 13 – Networking – Bug 211911: Panic in skb_find_text()
-
Changed Files
Case Study 14 – Understanding Linux Audio Sub System and Audio Driver
-
LINUX AUDIO DRIVER TRACING
Case Study 15 – What is Perf tool – How it works – All about perf
-
Bug 88011 – Perf annotate shows the wrong data when viewing recursive functions and then segfaults
-
Bug 202677 – “perf report” is extremely slow without --no-inline
-
How perf takes samples during perf record and how it compares the sampled addresses to compute the percentages
-
Understanding the Mysampler.c file – It’s our version of Perf
-
How to get the IP (Instruction Pointer) address of the currently executing program – Interview Question