This content originally appeared on DEV Community and was authored by lostghost
This blog is part of a series.
Getting the program to compile and run is a challenge. Getting it to run correctly is even more of a challenge. You would want to know exactly what the program is doing, and how it’s different from what you intended for it to do. This is known as debugging.
Debugging takes place at various stages of a program’s lifecycle, starting from the programming stage. There are various linters and scanners that go over your source code and identify undesirable functionality. Most of them are intended for use from your editor-turned-IDE, such as Vim. As an example, for the programs ctags and cscope, here’s a video.
After programming and compilation, you get a binary program image, which you can also analyse. Analysis at both source code and binary image stages is called static analysis. We already did static analysis, when going over program segments with readelf. A popular analysis tool is valgrind, although it works by running the program under instrumentation, so strictly speaking it belongs to dynamic analysis.
Now let’s turn to dynamic analysis. Because a userspace process is a virtualized environment, any interaction with the outside world, any side-effect, has to go through the kernel, in the form of a syscall. So to get an idea of what the program is really doing, monitoring system calls is a good idea. strace helps with that.
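To make the "everything goes through a syscall" point concrete, here is a minimal sketch that prints text by invoking the write system call directly, bypassing stdio. The helper name `raw_write` is made up for illustration, and the sketch assumes Linux with a libc that provides the `syscall()` wrapper.

```c
#define _GNU_SOURCE   /* exposes syscall() in <unistd.h> on glibc */
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: writes s to fd by invoking the write system call
 * directly, without going through stdio.
 * Returns the number of bytes written, or -1 on error. */
long raw_write(int fd, const char *s)
{
    assert(s != NULL);
    return syscall(SYS_write, fd, s, strlen(s));
}
```

A program that calls `raw_write(1, "hi\n")` from `main` should show up under strace as a single `write(1, "hi\n", 3)` line.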
Take the same program from the previous blog, compiled statically:
[lostghost1@archlinux c]$ cat main.c
#include <stdio.h>
int main(int argc, char** argv){
    if (argc<2) return 1;
    printf("%s\n",argv[1]);
    return 0;
}
And let’s see what it does:
[lostghost1@archlinux c]$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffeb1ab5720 /* 41 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x405658) = 0
set_tid_address(0x405790) = 1631947
exit_group(1) = ?
+++ exited with 1 +++
First, it execs the program. Then it sets the FS register, used for pointing to thread-local variables. Then it sets the TID (thread ID) address, used for multithreading. Finally, since no argument was given, it exits with code 1.
And now with an argument:
[lostghost1@archlinux c]$ strace ./a.out hello
execve("./a.out", ["./a.out", "hello"], 0x7ffdf3a92938 /* 41 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x405658) = 0
set_tid_address(0x405790) = 1632018
ioctl(1, TIOCGWINSZ, {ws_row=24, ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
writev(1, [{iov_base="hello", iov_len=5}, {iov_base="\n", iov_len=1}], 2hello
) = 6
exit_group(0) = ?
+++ exited with 0 +++
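Besides what we already went over, it asks for the size of the terminal (80×24 in this case), and writes “hello” and a newline to the terminal. In “2hello”, the 2 is the last argument of writev (the number of buffers), and the “hello” right after it is the program’s own output, landing in the middle of the output from strace. It then exits successfully with code 0.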
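The writev line in the trace passes two buffers in one call. As a rough sketch of a call with the same shape (not necessarily how the C library implements printf internally), the hypothetical helper `write_line` below sends a string and a newline as two separate buffers:

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: writes s followed by a newline to fd using a single
 * writev call with two buffers.
 * Returns the total number of bytes written, or -1 on error. */
ssize_t write_line(int fd, const char *s)
{
    assert(s != NULL);
    struct iovec iov[2] = {
        { .iov_base = (void *)s,    .iov_len = strlen(s) },
        { .iov_base = (void *)"\n", .iov_len = 1 },
    };
    return writev(fd, iov, 2);   /* the 2 is the buffer count (iovcnt) */
}
```

Calling `write_line(1, "hello")` should return 6: five bytes for the string plus one for the newline, the same total strace reports.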
Now let’s compile our executable dynamically. To get the full trace, run it the following way:
[lostghost1@archlinux c]$ strace -f /lib/ld-musl-x86_64.so.1 ./a.out hello
execve("/lib/ld-musl-x86_64.so.1", ["/lib/ld-musl-x86_64.so.1", "./a.out", "hello"], 0x7ffd9a5ff938 /* 41 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x740a78ee5b68) = 0
set_tid_address(0x740a78ee5fd0) = 1632581
open("./a.out", O_RDONLY|O_LARGEFILE) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\20\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 20480, PROT_READ, MAP_PRIVATE, 3, 0) = 0x740a78e38000
mmap(0x740a78e39000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0x1000) = 0x740a78e39000
mmap(0x740a78e3a000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED, 3, 0x2000) = 0x740a78e3a000
mmap(0x740a78e3b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x2000) = 0x740a78e3b000
close(3) = 0
brk(NULL) = 0x5555572c6000
brk(0x5555572c8000) = 0x5555572c8000
mmap(0x5555572c6000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x5555572c6000
mprotect(0x740a78ee2000, 4096, PROT_READ) = 0
mprotect(0x740a78e3b000, 4096, PROT_READ) = 0
ioctl(1, TIOCGWINSZ, {ws_row=24, ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
writev(1, [{iov_base="hello", iov_len=5}, {iov_base="\n", iov_len=1}], 2hello
) = 6
exit_group(0) = ?
+++ exited with 0 +++
Notice that there is no separate step for finding and loading the libc. This is because:
[lostghost1@archlinux c]$ file /lib/ld-musl-x86_64.so.1
/lib/ld-musl-x86_64.so.1: symbolic link to /usr/lib/musl/lib/libc.so
With musl, the C library is its own loader! So you don’t need to load the loader, then the C library, then the executable: the loader and the library are loaded together (due to being the same file), and only the executable is left. Compare this to glibc:
[lostghost1@archlinux c]$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffc43b5b910 /* 41 vars */) = 0
brk(NULL) = 0x5adf33995000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=138995, ...}) = 0
mmap(NULL, 138995, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7894a5b10000
close(3) = 0
openat(AT_FDCWD, "/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0px\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 840, 64) = 840
fstat(3, {st_mode=S_IFREG|0755, st_size=2006328, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7894a5b0e000
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 840, 64) = 840
mmap(NULL, 2030680, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7894a591e000
mmap(0x7894a5942000, 1507328, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x24000) = 0x7894a5942000
mmap(0x7894a5ab2000, 319488, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x194000) = 0x7894a5ab2000
mmap(0x7894a5b00000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e1000) = 0x7894a5b00000
mmap(0x7894a5b06000, 31832, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7894a5b06000
close(3) = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7894a591b000
arch_prctl(ARCH_SET_FS, 0x7894a591b740) = 0
set_tid_address(0x7894a591ba10) = 1632711
set_robust_list(0x7894a591ba20, 24) = 0
rseq(0x7894a591b680, 0x20, 0, 0x53053053) = 0
mprotect(0x7894a5b00000, 16384, PROT_READ) = 0
mprotect(0x5adf31a26000, 4096, PROT_READ) = 0
mprotect(0x7894a5b67000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7894a5b10000, 138995) = 0
exit_group(1) = ?
+++ exited with 1 +++
It has to go through the effort of finding the libc first. Let’s now compare the actual memory maps. For that, modify the source code:
[lostghost1@archlinux c]$ cat main.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main() {
    printf("My PID is: %d\n", getpid());
    printf("Press Enter to exit...");
    calloc(2048,2048); /* 4 MiB, so the allocation stands out in pmap */
    getchar();         /* keep the process alive while we inspect it */
    return 0;
}
Run the program in one terminal. In another, run pmap <PID>:
[lostghost1@archlinux ~]$ pmap 1634216
1634216: ./a.out
000055aa8a1c4000 4K r---- a.out
000055aa8a1c5000 4K r-x-- a.out
000055aa8a1c6000 4K r---- a.out
000055aa8a1c7000 4K r---- a.out
000055aa8a1c8000 4K rw--- a.out
000055aa8bdb0000 132K rw--- [ anon ]
00007be23b9b4000 4112K rw--- [ anon ]
00007be23bdb8000 144K r---- libc.so.6
00007be23bddc000 1472K r-x-- libc.so.6
00007be23bf4c000 312K r---- libc.so.6
00007be23bf9a000 16K r---- libc.so.6
00007be23bf9e000 8K rw--- libc.so.6
00007be23bfa0000 40K rw--- [ anon ]
00007be23bfcc000 4K r---- ld-linux-x86-64.so.2
00007be23bfcd000 164K r-x-- ld-linux-x86-64.so.2
00007be23bff6000 44K r---- ld-linux-x86-64.so.2
00007be23c001000 8K r---- ld-linux-x86-64.so.2
00007be23c003000 4K rw--- ld-linux-x86-64.so.2
00007be23c004000 4K rw--- [ anon ]
00007ffe8077a000 132K rw--- [ stack ]
00007ffe807b2000 16K r---- [ anon ]
00007ffe807b6000 8K r-x-- [ anon ]
ffffffffff600000 4K --x-- [ anon ]
total 6644K
We can see that our program takes up 5 pages, with different permissions. The second one is definitely code, or .text: it has “rx” permissions. The last one, under “rw”, is global variables. The rest are constants, read-only data.
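As far as I know, pmap gets its data from /proc/<PID>/maps, where each line starts with an address range and a permission string such as `r-xp`. The sketch below parses those leading fields and checks the execute bit. The helper names are made up for illustration, and the code assumes a 64-bit `unsigned long`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parses the leading "start-end perms" fields of one
 * /proc/<pid>/maps line.
 * Returns 1 on success, 0 if the line does not match that shape. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char perms[5])
{
    assert(line && start && end && perms);
    return sscanf(line, "%lx-%lx %4s", start, end, perms) == 3;
}

/* Returns 1 if the permission string marks the mapping as executable. */
int is_executable(const char perms[5])
{
    return perms[2] == 'x';
}
```

Fed a line that begins with `55aa8a1c5000-55aa8a1c6000 r-xp`, the parser should report a 4096-byte executable mapping, matching the second a.out row in the pmap output.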
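Then comes the heap: the 132K region right after the program. The largest element in the output, the 4112K region just below the libc, holds the memory I allocated. I requested a large block so that it stands out, and glibc serves requests that large with a separate mmap rather than by growing the heap. Either way, dynamic memory is boxed in by the libc and the program itself. Let’s see how much room there is between the two: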
(0x00007be23bdb8000 - 0x000055aa8bdb0000) / 1024 / 1024/1024/1024
= 38.2175293266773223877
38 Terabytes. Should be enough 🙂
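The same arithmetic can be sketched in C. The helper name `address_gap` is made up for illustration; it divides the distance between two addresses by 1024 the requested number of times.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size of the address range [lo, hi) expressed in
 * units of 1024^power bytes (power 4 gives TiB, power 6 gives EiB). */
double address_gap(uint64_t lo, uint64_t hi, unsigned power)
{
    assert(hi >= lo);
    double gap = (double)(hi - lo);
    for (unsigned i = 0; i < power; i++)
        gap /= 1024.0;
    return gap;
}
```

With the two addresses from the pmap output and a power of 4, this should reproduce the figure of roughly 38.2.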
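After the libc and the loader comes the stack. On x86-64 the stack grows downward, toward lower addresses, so the gap above it is not really room to grow: the last entry, at the fff address, is a special kernel-provided page (the vsyscall page). Still, let’s calculate how large that gap is:
(0xffffffffff600000 - 0x00007ffe8077a000) / 1024 / 1024/1024/1024/1024/1024
= 15.99987793525954060669
Almost 16 Exabytes of address space. Even the gap below the stack, down to the loader, is about 4 Terabytes. So why is it that a program so easily crashes with “stack overflow”, when it has that much room?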
[lostghost1@archlinux c]$ ulimit -s
8192
That’s the answer. The stack is artificially limited (to 8192 KiB, or 8 MiB, here), to catch “endless recursion” bugs. If you remove that limit, you won’t get a “stack overflow”; the program will eventually hit a regular “Out of memory” instead.
Let’s compare this pmap to that of a static executable:
[lostghost1@archlinux ~]$ pmap 1636409
1636409: ./a.out
0000000000400000 4K r---- a.out
0000000000401000 20K r-x-- a.out
0000000000406000 4K r---- a.out
0000000000407000 8K rw--- a.out
0000000002036000 4096K rw--- [ anon ]
00007ffd8a010000 132K rw--- [ stack ]
00007ffd8a189000 16K r---- [ anon ]
00007ffd8a18d000 8K r-x-- [ anon ]
ffffffffff600000 4K --x-- [ anon ]
total 4292K
Much nicer!
But pmap doesn’t tell us the full story, only the part that relates to our program. To learn the full story of the virtual memory mapping for a userspace executable on Linux, refer to this document.
An even better writeup can be found here.
Why is it important to know all this? Because dynamic libraries are a security nightmare. If the attacker tricks the executable into loading a malicious library, the executable is compromised, and can execute arbitrary code (with the privileges of the executable, made worse by SUID). How can an executable be tricked into loading a malicious library?
- With LD_PRELOAD
- With write access to folders in library search path
- With rpath pointing to a writable directory
- By replacing the libc loader
- By replacing the loader path in a binary
Yeah, this is a broken system. Containers help, I guess. But I’d say, getting rid of shared libraries altogether is a good idea.
Thanks for reading!