How much can we feasibly strip from a zig binary? Starting from a normal zig program that does absolutely nothing:
pub fn main() void {}
zig build-exe main.zig -target x86_64-linux-gnu
du -hk main
# 2180 main
2180K for a binary that does nothing. Given that the smallest possible executable ELF file is around 80 bytes, 2180K is quite a bit of bloat. What happens when we strip out debug info?
zig build-exe main.zig -target x86_64-linux-gnu -fstrip
du -hk main
# 192 main
We saved 1988K just by stripping out debugging information. However, 192K is still quite far from our 80-byte goal. We are still compiling in Debug mode, so let’s switch to ReleaseSmall (equivalent to -Os for gcc/clang as far as I can tell).
zig build-exe main.zig -target x86_64-linux-gnu -fstrip -OReleaseSmall
du -hk main
# 12 main
Now we’re at 12K! Saved 180K just by switching from Debug to ReleaseSmall. Next step is to enable function and data sections to allow the linker to strip away unreferenced functions or data.
zig build-exe main.zig -target x86_64-linux-gnu -fstrip -OReleaseSmall -ffunction-sections -fdata-sections --gc-sections
du -hk main
# 12 main
…and that did nothing. I guess ReleaseSmall already handles this optimization. Taking a peek at the ELF sections shows quite a few unnecessary sections:
There are 9 section headers, starting at offset 0x2068:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .rodata PROGBITS 00000000010001c8 000001c8
0000000000000954 0000000000000000 AMS 0 0 8
[ 2] .eh_frame_hdr PROGBITS 0000000001000b1c 00000b1c
00000000000000bc 0000000000000000 A 0 0 4
[ 3] .eh_frame PROGBITS 0000000001000bd8 00000bd8
00000000000003d4 0000000000000000 A 0 0 8
[ 4] .text PROGBITS 0000000001001fac 00000fac
0000000000001041 0000000000000000 AX 0 0 4
[ 5] .tbss NOBITS 0000000001002ff0 00001ff0
000000000000000d 0000000000000000 WAT 0 0 8
[ 6] .bss NOBITS 0000000001004000 00002000
0000000000003108 0000000000000000 WA 0 0 4096
[ 7] .comment PROGBITS 0000000000000000 00002000
000000000000001c 0000000000000001 MS 0 0 1
[ 8] .shstrtab STRTAB 0000000000000000 0000201c
0000000000000045 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), l (large), p (processor specific)
.eh_frame and .eh_frame_hdr are generated to provide unwinding information and are not strictly necessary for the binary to run. The .comment section holds useless metadata. .tbss is a section for thread-local storage, which is also unnecessary since the program does not do any threading.
zig build-exe main.zig -target x86_64-freestanding-none -fstrip -OReleaseSmall
# warning(link): unexpected LLD stderr:
# ld.lld: warning: cannot find entry symbol _start; not setting start address
wc -c main
# 472 main
Switching from x86_64-linux-gnu to x86_64-freestanding-none cuts most of the extra cruft from the binary, down to 472 bytes. Looking at the sections now reveals that all but 2 sections have been removed:
There are 3 section headers, starting at offset 0x118:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .comment PROGBITS 0000000000000000 000000e8
000000000000001c 0000000000000001 MS 0 0 1
[ 2] .shstrtab STRTAB 0000000000000000 00000104
0000000000000014 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), l (large), p (processor specific)
But something isn’t quite right. The binary no longer contains any executable code. This is because we have to change our executable’s entrypoint. Now that our platform is freestanding, the entrypoint is _start instead of main.
const syscall1 = @import("std").os.linux.syscall1;

export fn _start() void {
    _ = syscall1(.exit, 0);
}
Our compile command hasn’t changed, and the binary is now slightly larger.
zig build-exe main.zig -target x86_64-freestanding-none -fstrip -OReleaseSmall
wc -c main
# 616 main
But this time the binary contains some executable code:
There are 4 section headers, starting at offset 0x168:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000001001120 00000120
000000000000000b 0000000000000000 AX 0 0 4
[ 2] .comment PROGBITS 0000000000000000 0000012b
000000000000001c 0000000000000001 MS 0 0 1
[ 3] .shstrtab STRTAB 0000000000000000 00000147
000000000000001a 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), l (large), p (processor specific)
Looking at the size of the text section, it only contains 11 bytes of code. Where are the other 605 bytes coming from? Inspecting the ELF further with readelf shows that there are 4 program headers, one per program segment. Each program header takes up 56 bytes of space, for a total of 4 * 56 = 224 bytes.
Elf file type is EXEC (Executable file)
Entry point 0x1001120
There are 4 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000001000040 0x0000000001000040
0x00000000000000e0 0x00000000000000e0 R 0x8
LOAD 0x0000000000000000 0x0000000001000000 0x0000000001000000
0x0000000000000120 0x0000000000000120 R 0x1000
LOAD 0x0000000000000120 0x0000000001001120 0x0000000001001120
0x000000000000000b 0x000000000000000b R E 0x1000
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000001000000 RW 0x0
Section to Segment mapping:
Segment Sections...
00
01
02 .text
03
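As a rough accounting, all 616 bytes can be tallied from the two readelf dumps above. The sketch below assumes the standard ELF64 sizes (64-byte file header, 56-byte program header, 64-byte section header) and infers the 8-byte alignment of the section header table from its reported offset of 0x168. The variable names are illustrative only.

```python
# Sizes fixed by the ELF64 format.
EHDR_SIZE = 64   # ELF file header
PHDR_SIZE = 56   # one program header
SHDR_SIZE = 64   # one section header

def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

# Values read off the readelf output above.
program_headers = 4
section_headers = 4
text_size = 0x0B
comment_size = 0x1C
shstrtab_size = 0x1A

offset = EHDR_SIZE + program_headers * PHDR_SIZE  # 0x120, where .text starts
offset += text_size                               # 0x12b, where .comment starts
offset += comment_size                            # 0x147, where .shstrtab starts
offset += shstrtab_size                           # 0x161, end of section data
shdr_table = align_up(offset, 8)                  # 0x168, section header table
total = shdr_table + section_headers * SHDR_SIZE

print(hex(shdr_table), total)  # 0x168 616
```

Only 11 of those bytes are code. The program headers (224 bytes) and the section headers (256 bytes) account for most of the rest.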
GNU_STACK is completely optional, and only acts as a hint to the Linux kernel. PHDR is similarly unnecessary, and the two LOAD segments can be merged into a single large RWX segment. We cannot directly control the program segments from the command line, so it is time to break out a linker script.
This script creates a single RWX segment that spans all of the executable code and data, cutting down the 4 segments to a single segment.
ENTRY(_start)

PHDRS {
    code PT_LOAD FLAGS(7);
}

SECTIONS {
    . = SIZEOF_HEADERS;
    .text : ALIGN(1) { *(.text.*) }
    .rodata : ALIGN(1) { *(.rodata.*) }
    .data : ALIGN(1) { *(.data.*) }
    .bss : ALIGN(1) { *(.bss.*) }
}
Recompiling with the linker script brings the binary down to 616 - 56 * 3 = 448 bytes.
zig build-exe main.zig -target x86_64-freestanding-none -fstrip -OReleaseSmall -T linker.ld
wc -c main
# 448 main
We return our attention to the section headers in the binary. The Linux kernel completely ignores section headers, so they can be safely removed without affecting the binary. The contents of .comment and .shstrtab can also be stripped since they are not mapped by any program segment.
There are 4 section headers, starting at offset 0xc0:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000078 00000078
000000000000000b 0000000000000000 AX 0 0 4
[ 2] .comment PROGBITS 0000000000000000 00000083
000000000000001c 0000000000000001 MS 0 0 1
[ 3] .shstrtab STRTAB 0000000000000000 0000009f
000000000000001a 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), l (large), p (processor specific)
Here we can take advantage of how the linker lays out the ELF file:
ELF Header
Program segments
Section data (ALLOC)
Section data
Section headers
Sections that are marked as ALLOC are sections that are mapped by a program segment and required for program execution. Because of how the ELF file is laid out, the section headers and non-ALLOC sections all sit in one contiguous block at the end of the file. To strip out the extra metadata, we can cut away any data that comes after the last ALLOC section.
from pwnc.minelf import ELF

elf = ELF(open("main", "rb").read())

# Find the file offset just past the last ALLOC section.
offset = 0
for section in elf.sections:
    if section.flags & elf.Section.Flags.ALLOC != 0:
        offset = section.offset + section.size

# Clear the section header fields and cut everything after that offset.
elf.header.section_offset = 0
elf.header.number_of_sections = 0
elf.header.section_name_table_index = 0
elf.raw_elf_bytes = elf.raw_elf_bytes[:offset]
elf.write("main")
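For readers who do not have pwnc.minelf, the same truncation can be sketched with only the Python standard library. This is a minimal sketch under stated assumptions, not the script used above: it assumes a little-endian ELF64 file, strip_section_data is a hypothetical name, and unlike the script above it skips NOBITS sections (such as .bss) because they occupy no bytes in the file.

```python
import struct

SHF_ALLOC = 0x2   # section is mapped into memory at run time
SHT_NOBITS = 8    # section takes up no space in the file (e.g. .bss)

def strip_section_data(data: bytes) -> bytes:
    """Truncate a little-endian ELF64 image after its last ALLOC section."""
    # e_phoff is at 0x20 and e_shoff at 0x28 in the ELF64 header.
    e_phoff, e_shoff = struct.unpack_from("<QQ", data, 0x20)
    # e_phentsize, e_phnum, e_shentsize, e_shnum start at 0x36.
    e_phentsize, e_phnum, e_shentsize, e_shnum = struct.unpack_from("<HHHH", data, 0x36)

    # Never cut into the file header or the program header table.
    end = max(64, e_phoff + e_phentsize * e_phnum)
    for i in range(e_shnum):
        # Elf64_Shdr: name, type, flags, addr, offset, size, link, info, addralign, entsize
        shdr = struct.unpack_from("<IIQQQQIIQQ", data, e_shoff + i * e_shentsize)
        sh_type, sh_flags, sh_offset, sh_size = shdr[1], shdr[2], shdr[4], shdr[5]
        if sh_flags & SHF_ALLOC and sh_type != SHT_NOBITS:
            end = max(end, sh_offset + sh_size)

    out = bytearray(data[:end])
    struct.pack_into("<Q", out, 0x28, 0)      # e_shoff = 0
    struct.pack_into("<HH", out, 0x3C, 0, 0)  # e_shnum = 0, e_shstrndx = 0
    return bytes(out)
```

Applied to the bytes of the 448-byte binary above, this should cut the file at 0x78 + 0xb = 131 bytes, since .text is the only ALLOC section.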
Compiling and patching now yields a 131 byte binary. Much better.
zig build-exe main.zig -target x86_64-freestanding-none -fstrip -OReleaseSmall -T linker.ld
python3 patch.py
wc -c main
# 131 main
Now we can apply some optimizations to the code in the binary to save a few bytes. The disassembly shows that the function still attempts to return even though the program has already exited by then, and that there is a strange extra stub function at the end.
main: file format elf64-x86-64
Disassembly of section PT_LOAD#0:
0000000000000078 <PT_LOAD#0>:
78: 6a 3c push 60
7a: 58 pop rax
7b: 31 ff xor edi, edi
7d: 0f 05 syscall
7f: c3 ret
80: 31 c0 xor eax, eax
82: c3 ret
Marking the function as noreturn eliminates one of the extraneous ret instructions.
const syscall1 = @import("std").os.linux.syscall1;

export fn _start() noreturn {
    _ = syscall1(.exit, 0);
    unreachable;
}
main: file format elf64-x86-64
Disassembly of section PT_LOAD#0:
0000000000000078 <PT_LOAD#0>:
78: 6a 3c push 60
7a: 58 pop rax
7b: 31 ff xor edi, edi
7d: 0f 05 syscall
7f: 31 c0 xor eax, eax
81: c3 ret
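Switching from syscall1 to syscall0 eliminates xor edi, edi. The exit status is then whatever value happens to be in edi when the process starts, since nothing sets it anymore.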
const syscall0 = @import("std").os.linux.syscall0;

export fn _start() noreturn {
    _ = syscall0(.exit);
    unreachable;
}
main: file format elf64-x86-64
Disassembly of section PT_LOAD#0:
0000000000000078 <PT_LOAD#0>:
78: 6a 3c push 60
7a: 58 pop rax
7b: 0f 05 syscall
7d: 31 c0 xor eax, eax
7f: c3 ret
_start is already marked as noreturn, so where is the xor eax, eax ; ret coming from? We can temporarily recompile with -fno-strip and dump the binary to figure out where the extra instructions are coming from.
main: file format elf64-x86-64
Disassembly of section .text:
0000000000000078 <_start>:
78: 6a 3c push 60
7a: 58 pop rax
7b: 0f 05 syscall
000000000000007d <getauxval>:
7d: 31 c0 xor eax, eax
7f: c3 ret
What is getauxval doing here??? This is a freestanding environment, so auxiliary values shouldn’t be used at all. Since the function is not referenced by anything, adding the -flto compile option to strip out unused functions and data removes the extra code.
zig build-exe main.zig -target x86_64-freestanding-none -fstrip -OReleaseSmall -T linker.ld -flto
python3 patch.py
wc -c main
# 125 main
main: file format elf64-x86-64
Disassembly of section PT_LOAD#0:
0000000000000078 <PT_LOAD#0>:
78: 6a 3c push 60
7a: 58 pop rax
7b: 0f 05 syscall
This is the absolute limit that we can reach without using tricks to overlap the ELF metadata to further shrink the binary.
There is one last change that needs to be made before the binary can run on all Linux systems. Currently the program header maps the binary at address 0x00000078, which would require the Linux kernel to map a page at address 0x00000000.
Elf file type is EXEC (Executable file)
Entry point 0x78
There is 1 program header, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000078 0x0000000000000078 0x0000000000000078
0x0000000000000005 0x0000000000000005 RWE 0x1000
Most Linux distros set the sysctl value vm.mmap_min_addr to a nonzero address to mitigate kernel exploits that take advantage of kernel NULL dereferences. This means that, as the binary is right now, it will not run on most modern Linux distros. To fix this we can update the Python patching script to change the ELF file type from EXEC to DYN. This tells the Linux kernel to choose a base address for the binary instead of using the program segment addresses directly.
from pwnc.minelf import ELF

elf = ELF(open("main", "rb").read())

# Let the kernel pick the base address instead of mapping at 0x78.
elf.header.type = elf.Header.Type.DYN

# Find the file offset just past the last ALLOC section.
offset = 0
for section in elf.sections:
    if section.flags & elf.Section.Flags.ALLOC != 0:
        offset = section.offset + section.size

# Clear the section header fields and cut everything after that offset.
elf.header.section_offset = 0
elf.header.number_of_sections = 0
elf.header.section_name_table_index = 0
elf.raw_elf_bytes = elf.raw_elf_bytes[:offset]
elf.write("main")
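The type change and the reason for it can also be sketched without pwnc.minelf. This is an illustration rather than the script above: mark_as_dyn and below_mmap_min_addr are hypothetical helper names, a little-endian ELF file is assumed, and the default of 65536 for vm.mmap_min_addr is an assumption (a common distro setting; the real value is in /proc/sys/vm/mmap_min_addr).

```python
import struct

ET_EXEC = 2  # executable loaded at the addresses in its program headers
ET_DYN = 3   # position-independent image; the kernel chooses the base

def below_mmap_min_addr(vaddr, mmap_min_addr=65536, page_size=4096):
    """True if the page containing vaddr lies below vm.mmap_min_addr."""
    page_start = vaddr // page_size * page_size
    return page_start < mmap_min_addr

def mark_as_dyn(data: bytes) -> bytes:
    """Return a copy of a little-endian ELF image with e_type set to ET_DYN."""
    out = bytearray(data)
    # e_type is the 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from("<H", out, 16)
    if e_type == ET_EXEC:
        struct.pack_into("<H", out, 16, ET_DYN)
    return bytes(out)

print(below_mmap_min_addr(0x78))       # True: the segment would land in page 0
print(below_mmap_min_addr(0x1001120))  # False: the earlier default load address
```

The first check is why an EXEC file with a segment at 0x78 is rejected under that assumed setting, and the second shows that the address used before the linker script was not affected.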
The final ELF file:
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x78
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 1
Size of section headers: 64 (bytes)
Number of section headers: 0
Section header string table index: 0
There are no sections in this file.
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000078 0x0000000000000078 0x0000000000000078 0x000005 0x000005 RWE 0x1000
There is no dynamic section in this file.
There are no relocations in this file.
No processor specific unwind information to decode
Dynamic symbol information is not available for displaying symbols.
No version information found in this file.
main: file format elf64-x86-64
Disassembly of section PT_LOAD#0:
0000000000000078 <PT_LOAD#0>:
78: 6a 3c push 60
7a: 58 pop rax
7b: 0f 05 syscall
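To see why 125 bytes is the floor without overlapping any metadata, the file can be rebuilt by hand from the field values in the readelf output above: a 64-byte ELF header, one 56-byte program header, and 5 bytes of code. The sketch below uses Python's struct module; it is not claimed to be byte-identical to the binary produced above and has not been run as an executable here.

```python
import struct

CODE = bytes([
    0x6A, 0x3C,  # push 60 (the exit syscall number on x86_64 Linux)
    0x58,        # pop rax
    0x0F, 0x05,  # syscall
])

EHDR_SIZE, PHDR_SIZE = 64, 56
code_offset = EHDR_SIZE + PHDR_SIZE  # 0x78

ehdr = struct.pack(
    "<16sHHIQQQIHHHHHH",
    b"\x7fELF\x02\x01\x01",  # magic, ELF64, little endian, version 1 (zero padded)
    3,            # e_type: DYN
    62,           # e_machine: x86-64
    1,            # e_version
    code_offset,  # e_entry
    EHDR_SIZE,    # e_phoff: program headers follow the ELF header
    0,            # e_shoff: no section headers
    0,            # e_flags
    EHDR_SIZE,    # e_ehsize
    PHDR_SIZE,    # e_phentsize
    1,            # e_phnum
    64,           # e_shentsize
    0,            # e_shnum
    0,            # e_shstrndx
)
phdr = struct.pack(
    "<IIQQQQQQ",
    1,            # p_type: PT_LOAD
    7,            # p_flags: R | W | X
    code_offset,  # p_offset
    code_offset,  # p_vaddr
    code_offset,  # p_paddr
    len(CODE),    # p_filesz
    len(CODE),    # p_memsz
    0x1000,       # p_align
)

image = ehdr + phdr + CODE
print(len(ehdr), len(phdr), len(image))  # 64 56 125
```

Getting below 64 + 56 + 5 bytes means making the headers and the code share bytes, which is the overlapping trick mentioned earlier.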