Computer-Science

01. Debugging a program

(intel instruction set: refer to http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/Chapter_6/CH06-1.html)

Editing, compiling, running, and debugging a C program in Linux. Understanding ASM code: where local variables are stored, how the stack changes during call/ret instructions, …. Understanding the process image.

0. Background for Buffer Overflow Attack

1. Example program: ex1.c

#include <stdio.h>
void main(){
   int x;
   x=30;
   printf("x is %d\n", x);
}

2. Compiling and Running

$ gcc -m32 -o ex1 ex1.c
$ ./ex1
x is 30

3. ASM code

$ objdump -D -M intel ex1 > ex1.txt
$ vi ex1.txt
/main

repeat / until you find "<main>:"

080483c4 <main>:
80483c4: 55                        push   ebp
80483c5: 89 e5                     mov    ebp, esp
80483c7: 83 e4 f0                  and    esp, 0xfffffff0
80483ca: 83 ec 20                  sub    esp, 0x20
80483cd: c7 44 24 1c 1e 00 00 00   mov   DWORD PTR [esp+0x1c], 0x1e
80483d5: b8 b4 84 04 08            mov   eax, 0x80484b4
80483da: 8b 54 24 1c               mov   edx, DWORD PTR [esp+0x1c]
80483de: 89 54 24 04               mov   DWORD PTR [esp+0x4], edx
80483e2: 89 04 24                  mov   DWORD PTR [esp], eax
80483e5: e8 0a ff ff ff            call   80482f4
80483ea: c9                        leave
80483eb: c3                        ret
......
push  x
     esp = esp - 4
     mem[esp] <- x
pop  x
     x <- mem[esp]
     esp = esp + 4
mov  reg1, data
     reg1 <- data
and  reg, data
     reg <- reg AND data
sub  reg, data
     reg <- reg - data
mov DWORD PTR [addr], data
     4 bytes in mem[addr] <- data
call  x
     push return-addr (the address of the instruction after "call x")
     jump to x
leave
     esp <- ebp
     pop  ebp
ret
     eip <- mem[esp]
     esp = esp + 4

Exercise

(1) Edit, compile, and run ex1.c.

$ vi ex1.c
$ gcc -m32 -o ex1 ex1.c
$ ./ex1
x is 30

(2) Get ex1.txt as above and show the asm code for main.

$ objdump -D -M intel ex1 > ex1.txt
$ vi ex1.txt
/main

Using the objdump command, the machine-code file ex1 was converted into Intel-format assembly and saved as a "txt" file.

(3) Draw the memory map and show all the changes in registers and memory after each instruction up to the ret instruction.

Assume esp = 0xbffff63c and ebp = 0xbffff6b8 in the beginning of main.

First, the asm code for <main> is shown below.

<main> asm code
0804841c <main>:
 804841c:       55                      push   ebp
 804841d:       89 e5                   mov    ebp,esp
 804841f:       83 e4 f0                and    esp,0xfffffff0
 8048422:       83 ec 20                sub    esp,0x20
 8048425:       c7 44 24 1c 1e 00 00    mov    DWORD PTR [esp+0x1c],0x1e
 804842c:       00
 804842d:       8b 44 24 1c             mov    eax,DWORD PTR [esp+0x1c]
 8048431:       89 44 24 04             mov    DWORD PTR [esp+0x4],eax
 8048435:       c7 04 24 e4 84 04 08    mov    DWORD PTR [esp],0x80484e4
 804843c:       e8 af fe ff ff          call   80482f0 <printf@plt>
 8048441:       c9                      leave
 8048442:       c3                      ret
 8048443:       66 90                   xchg   ax,ax
 8048445:       66 90                   xchg   ax,ax
 8048447:       66 90                   xchg   ax,ax
 8048449:       66 90                   xchg   ax,ax
 804844b:       66 90                   xchg   ax,ax
 804844d:       66 90                   xchg   ax,ax
 804844f:       90                      nop
0. Beginning of main

1. push ebp

2. mov ebp, esp

3. and esp,0xfffffff0

4. sub esp,0x20

- Subtract 0x20 from esp and store the result in esp.

5. mov DWORD PTR [esp+0x1c], 0x1e

6. mov eax, DWORD PTR [esp+0x1c]

7. mov DWORD PTR [esp+0x4], eax

8. mov DWORD PTR [esp], 0x80484e4

9. call 80482f0 <printf@plt>

080482f0 <printf@plt>:
 80482f0:       ff 25 0c a0 04 08       jmp    DWORD PTR ds:0x804a00c
 80482f6:       68 00 00 00 00          push   0x0
 80482fb:       e9 e0 ff ff ff          jmp    80482e0 <_init+0x2c>

10. leave

11. ret

(4) Find corresponding instructions for x=30; and printf("x is %d\n",x); in the ASM code.

x=30;

In x=30;, the decimal value 30 is 0x1e in hexadecimal.

Searching the <main> asm code for 1e gives the following.

0804841c <main>:
 ......         ......                  ...    ......
 8048425:       c7 44 24 1c 1e 00 00    mov    DWORD PTR [esp+0x1c],0x1e
 ......         ......                  ...    ......

The instruction at 0x08048425 is mov DWORD PTR [esp+0x1c],0x1e, so it stores 30 at esp+0x1c. That is, the bytes c7 44 24 1c 1e 00 00 00 represent x=30;.

printf("x is %d\n",x);
0804841c <main>:
 ......         ......                  ...    ......
 804842d:       8b 44 24 1c             mov    eax,DWORD PTR [esp+0x1c]
 8048431:       89 44 24 04             mov    DWORD PTR [esp+0x4],eax
 8048435:       c7 04 24 e4 84 04 08    mov    DWORD PTR [esp],0x80484e4
 ......         ......                  ...    ......

Checking the contents at 0x80484e4 gives the following.

(5) What is the memory location of the variable x?

0804841c <main>:
 ......         ......                  ...    ......
 8048425:       c7 44 24 1c 1e 00 00    mov    DWORD PTR [esp+0x1c],0x1e
 ......         ......                  ...    ......

As in Exercise (4) above, x is located at esp + 0x1c.

(6) Find the memory address where the string "x is %d\n" is stored. Confirm the ascii codes for "x is %d\n" at that address.

(7) Show the memory address where main() begins.

The <main> asm code shows that main() begins at memory address 0x0804841c.

4. Debugging

1. compile with -m32 (for 32 bit environment) and -g (for gdb) option

$ gcc -m32 -g -o ex1 ex1.c

2. copy .gdbinit to configure gdb

$ cp ../../linuxer1/.gdbinit  .

3. run gdb

$ gdb ex1
     ....................
gdb$ set disassembly-flavor intel  # to see asm output in intel syntax
gdb$ disassemble main              # disassemble main() and show asm code for main
Dump of assembler code for function main:
     0x804841c <+0>:  push ebp          # first instruction of main
     ....................
End of assembler dump.
gdb$ display $esp # display the value of esp after each ni
gdb$ display $ebp
gdb$ display $eax
gdb$ b *0x804841c # set break point at addr=0x804841c (first instr addr of main)
     ....................
gdb$ r # start running the program
[0x002B:0xFFFFD5EC]------------------------------------------------------[stack]
0xFFFFD63C : 20 83 04 08 00 00 00 00 - F0 5D D0 44 79 D7 D2 44  ........].Dy..D
0xFFFFD62C : 00 00 00 00 00 00 00 00 - 00 00 00 00 01 00 00 00 ................
0xFFFFD61C : 00 00 00 00 00 00 00 00 - 5D 83 CC CE 2B 26 D7 94 ........]...+&..
0xFFFFD60C : 02 00 00 00 02 00 00 00 - 00 60 EC 44 00 00 00 00 .........`.D....
0xFFFFD5FC : B0 C6 FF F7 01 00 00 00 - 01 00 00 00 00 00 00 00 ................
0xFFFFD5EC : 65 D8 D2 44 01 00 00 00 - 84 D6 FF FF 8C D6 FF FF e..D............
--------------------------------------------------------------------------[code]
=> 0x804841c <main>:    push   ebp
   0x804841d <main+1>:  mov    ebp,esp
   0x804841f <main+3>:  and    esp,0xfffffff0
   0x8048422 <main+6>:  sub    esp,0x20
--------------------------------------------------------------------------------

Breakpoint 1, main () at ex1.c:2
2       void main(){
3: $eax = 0x1
2: $ebp = (void *) 0x0
1: $esp = (void *) 0xffffd5ec

gdb$ ni # execute next instruction ("push ebp")
gdb$ ni # execute next instruction ("mov ebp, esp")
gdb$ ni # execute next instruction ("and esp, 0xfffffff0")
     ....................
gdb$ ni # execute "sub esp, 0x20"
     ....................
gdb$ ni # execute "mov DWORD PTR [esp+0x1c], 0x1e"
gdb$ ni # execute "mov eax, DWORD PTR [esp+0x1c]"
     ....................
gdb$ ni # execute "mov DWORD PTR [esp+0x4], eax"
     ....................
gdb$ ni # execute "mov DWORD PTR [esp], 0x80484e4"
     ....................
gdb$ si # execute "call printf" with si to enter the function

Exercise

(8) Follow the steps above to show the contents of the registers or memory that have changed after each instruction in main(). Indicate the changed part in your picture (the captured gdb output screen) for every instruction, one by one. For the "call" instruction, use the si command to enter the function and show the changes in the stack and registers.

The program was compiled with the -g option to make debugging possible. After copying .gdbinit, gdb was started.

set disassembly-flavor intel makes gdb print ASM in Intel format, and the disassemble main command was used to check the ASM code of main.

The breakpoint was set at 0x0804841c, the memory address of the main function. display was used for $esp, $ebp, $eax, and $eip so that their values are printed every time an instruction is executed.

Debugging was started with r. Execution stopped at the breakpoint at 0x804841c, where the stack and the ASM code could be inspected, along with the displayed values of eip, eax, ebp, and esp.


The next instruction was executed with ni.

push ebp performs 1) esp = esp-4, 2) mem[esp] = ebp, so esp decreases by 4 and eip advances to the next instruction.

The stack at 0xffffd4f8 now holds 00 00 00 00, the old value of ebp.

and esp,0xfffffff0 sets esp to esp AND 0xfffffff0, which clears the low 4 bits: 0xffffd4f8 becomes 0xffffd4f0.

sub esp,0x20 sets esp to esp-0x20, so esp becomes 0xffffd4d0.

mov DWORD PTR [esp+0x1c],0x1e stores 0x1e as a 4-byte value. The stack shows 0x1e stored at [esp+0x1c].

mov eax,DWORD PTR [esp+0x1c] loads the value at [esp+0x1c]. Since that location holds 0x1e, eax becomes 30 (0x1e).

mov DWORD PTR [esp+0x4],eax stores eax at [esp+0x4].

mov DWORD PTR [esp],0x80484e4 stores 0x80484e4 at the location esp points to.

call 0x80482f0 <printf@plt> first decreases esp by 4 and stores the return address 0x08048441 there, then jumps to 0x80482f0.

jmp DWORD PTR ds:0x804a00c

push 0x0 decreases esp by 4 and stores 0x0 at the new top of the stack.

jmp 0x80482e0 makes eip jump to 0x80482e0.

push DWORD PTR ds:0x804a004 decreases esp by 4 and stores the value read from 0x804a004 on the stack.

jmp DWORD PTR ds:0x804a008 makes eip jump to the address stored at 0x804a008.

5. gdb commands

gdb$ b *addr        # break at addr
gdb$ b funcname     # break at function "funcname"
gdb$ r              # run the program (or rerun it from the start)
gdb$ bt             # backtrace: show the chain of stack frames
gdb$ p expr         # print the value of expr, e.g. p $sp or p/x $eax (in hex)
gdb$ nexti          # run next instruction (do not go into a function). same as ni.
gdb$ stepi          # run next instruction (go inside a function). same as si.
gdb$ info f         # show the stack frame of the current function
gdb$ display $eip   # show the value of eip after every gdb command
gdb$ display $esp   # show the value of esp after every gdb command
gdb$ info registers # show the value of all registers
gdb$ info registers eip # show the value of eip
gdb$ info line      # code address range of the current source line
gdb$ info line main # code address where function main starts
gdb$ x/8xb addr     # show 8 bytes in hex starting from addr
gdb$ x/20xh addr    # show 20 halfwords (2 bytes each) in hex starting from addr
gdb$ x/13xw addr    # show 13 words (4 bytes each) in hex starting from addr

6. 64-bit Linux shows different output compared to 32-bit Linux

  1. It uses 8-byte registers (rsp, rip, rbp, …) instead of esp, eip, ebp, …
  2. The calling convention (how to pass function arguments) is different.
    • 64-bit Linux passes function arguments in registers instead of on the stack:
    • The first six integer or pointer arguments are passed in rdi, rsi, rdx, rcx, r8, r9 (in that order, left to right), while xmm0, xmm1, xmm2, ..., xmm7 are used for floating-point arguments.
    • Additional arguments are passed on the stack, and the return value is stored in rax.