Interrupt

An operating system is a collection of service routines. A service routine runs when an application requests it (system call interrupt), when a serious error occurs while an application is running (exception interrupt), or when an external hardware event occurs that the operating system has to handle (hardware interrupt).
These service routines are called ISRs (Interrupt Service Routines).

| external event | interrupt number | ISR (ISR1 => ISR2) |
|---|---|---|
| An application calls read(...) | int 128 (syscall num 3) | system_call() => sys_read() |
| An application calls write(...) | int 128 (syscall num 4) | system_call() => sys_write() |
| timer ticks | int 32 | interrupt[0] => timer_interrupt() |
| key stroke | int 33 | interrupt[1] => atkbd_interrupt() |
| An application runs x=x/0; | int 0 | divide_error() => do_divide_error() |
| page fault while an application runs | int 14 | page_fault() => do_page_fault() |

ISR1s are all located in arch/x86/kernel/entry_32.S.
ISR2s are located in various locations of the kernel.

When an interrupt, INT x, happens, the CPU pushes the current cs, eip, and flags register onto the stack and jumps to the ISR1 for INT x.
The ISR1 locations are stored in the IDT (Interrupt Descriptor Table), and the CPU jumps to the location stored in IDT[x].
ISR1 knows the location of ISR2 because it is hard-coded (exception case), stored in the irq_desc table (hardware interrupt case), or stored in sys_call_table (system call case).

1. Interrupt classification and Interrupt number

In Linux, hardware interrupts are assigned the following interrupt numbers.

| device | interrupt number | irq number |
|---|---|---|
| timer | 32 | 0 |
| keyboard | 33 | 1 |
| PIC cascading | 34 | 2 |
| second serial port | 35 | 3 |
| first serial port | 36 | 4 |
| floppy disk | 38 | 6 |
| system clock | 40 | 8 |
| network interface | 42 | 10 |
| usb port, sound card | 43 | 11 |
| ps/2 mouse | 44 | 12 |
| math coprocessor | 45 | 13 |
| eide disk, first chain | 46 | 14 |
| eide disk, second chain | 47 | 15 |

Exceptions are assigned the following interrupt numbers.

| exception | interrupt number |
|---|---|
| divide-by-zero error | 0 |
| debug | 1 |
| NMI | 2 |
| breakpoint | 3 |
| overflow | 4 |
| bounds check | 5 |
| invalid opcode | 6 |
| device not available | 7 |
| double fault | 8 |
| coprocessor segment overrun | 9 |
| invalid TSS | 10 |
| segment not present | 11 |
| stack segment fault | 12 |
| general protection | 13 |
| page fault | 14 |
| intel-reserved | 15 |
| floating point error | 16 |
| alignment check | 17 |
| machine check | 18 |
| simd floating point | 19 |

Finally, system calls in Linux are all assigned the same interrupt number, 128 (0x80). To differentiate between different system calls, a unique system call number has been given to each system call. For the full table, look at arch/x86/kernel/syscall_table_32.S.

| system call | interrupt number | system call number |
|---|---|---|
| exit | 128 | 1 |
| fork | 128 | 2 |
| read | 128 | 3 |
| write | 128 | 4 |
| open | 128 | 5 |
| close | 128 | 6 |
| ... | ... | ... |

2. How interrupts are detected

Interrupts are detected by the CPU. Exceptions are detected when the corresponding error happens. System calls are detected when the program executes the INT 128 instruction. Hardware interrupts are detected when the corresponding device raises an event. Hardware interrupts need a more detailed explanation.

The above picture shows how hardware interrupts are detected by the CPU. All hardware devices are connected to the 8259A interrupt controller through IRQ lines. The timer is connected through the IRQ0 line, the keyboard through the IRQ1 line, and so on. When an event happens in one of these devices, the corresponding IRQ line is activated, and the 8259A signals the CPU about this event along with the interrupt number for this IRQ line. In Linux the interrupt number is computed as (IRQ line number + 32).

3. How interrupts are handled

Interrupts are first handled by the CPU, and then the operating system takes care of the rest.

3.1 cpu part

3.2 OS part

Interrupt numbers with their ISR1 and ISR2:

| interrupt number | ISR1 | ISR2 |
|---|---|---|
| 0 | divide_error | do_divide_error |
| 1 | debug | do_debug |
| ... | | |
| 32 | interrupt[0] | timer_interrupt |
| 33 | interrupt[1] | atkbd_interrupt |
| ... | | |
| 128 (syscall num 1) | system_call | sys_exit |
| 128 (syscall num 2) | system_call | sys_fork |
| ... | | |

4. Creating a new system call and using it

Creating a new system call : 2 steps

Using ex1.c:

void main(){
  syscall(x);
}

./ex1

==> syscall(x)
==> mov eax, x
       int   0x80
==> system_call
==> my_syscall

5. Exercise

1) Following events will cause interrupts in the system. What interrupt number will be assigned to each event? For system call interrupt, also give the system call number.

2) Change drivers/input/keyboard/atkbd.c as follows.

$ vi drivers/input/keyboard/atkbd.c

drivers/input/keyboard/atkbd.c :

static irqreturn_t atkbd_interrupt(....){
   return IRQ_HANDLED;  // Add this at the first line
   .............
}

Recompile the kernel and reboot with it.

$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot

What happens and why does this happen? Show the sequence of events that happen when you hit a key in a normal Linux kernel (in as much detail as possible): hit a key => keyboard controller sends a signal through IRQ line 1 => ... etc. Now, with the changed Linux kernel, show which step in this sequence has been modified and prevents the kernel from displaying the pressed key on the monitor.

After booting finishes, the login prompt is displayed, but keyboard input has no effect.

The original code processes the keyboard input after the interrupt occurs and then returns IRQ_HANDLED. Because the modified handler returns immediately and skips that processing, nothing happens no matter which key is pressed.

3) Change the kernel such that it prints "x pressed" for each key press, where x is the scan code of the key. After you change the kernel and reboot, do the following to see the effect of your change.

$ vi drivers/input/keyboard/atkbd.c

We added printk("%x pressed\n", code); so that code is printed with printk.

drivers/input/keyboard/atkbd.c :

Then we compiled the kernel and rebooted to apply the new Linux kernel.

$ cat /proc/sys/kernel/printk
1  4  1  7

The four values are the current console log level, the default message log level, the minimum console log level, and the default console log level. The current level is 1, which is lower than the default message level, so messages printed with printk() do not appear on the screen.

$ echo 8 > /proc/sys/kernel/printk

Changing the current console log level to 8 with the command above changes the levels as shown below.

$ cat /proc/sys/kernel/printk
8   4   1   7

Now that the level is 8, messages printed with printk appear on the screen, and the code of each key that is typed is displayed.

$ echo 1 > /proc/sys/kernel/printk

To hide printk output again, set the current log level back to 1.

4) Change the kernel such that it displays the next character in the keyboard scancode table. For example, when you type "root", the monitor would display "tppy". How can you log in as root with this kernel?

$ vi drivers/input/keyboard/atkbd.c

We changed unsigned int code = data; to unsigned int code = data+1; so that each keystroke is treated as the next key in the scancode table.

drivers/input/keyboard/atkbd.c :

Then we compiled the kernel and rebooted to apply the new Linux kernel.

After rebooting, typing "root" at the login prompt displays "tppy", with each character shifted by one key on the keyboard.

5) Define a function mydelay in init/main.c which, whenever called, stops the booting process until you hit 's'. Call this function after the do_basic_setup() call in kernel_init() to make the kernel stop and wait for 's' during the booting process. You need to modify atkbd.c such that it changes exit_mydelay to 1 when the user presses 's'.


init/main.c :

........
int exit_mydelay;    // define a global variable
void mydelay(char *str){
   printk("%s", str);   // print via "%s" so str is never parsed as a format string
   printk("enter s to continue\n");
   exit_mydelay=0;  // init to zero
   for(;;){  // and wait here until the user presses 's'
      msleep(1); // sleep 1 millisecond so that the keyboard interrupt ISR
                 // can do its job
      if (exit_mydelay==1) break; // if the user pressed 's', break
   }
}
void kernel_init(){
    ...............
    do_basic_setup();
    mydelay("after do basic setup in kernel_init\n"); // wait here
    .........
}

drivers/input/keyboard/atkbd.c :

.........
extern int exit_mydelay;  // declare as extern since it is defined in main.c
static irqreturn_t atkbd_interrupt(....){
    .............
    // detect 's' key pressed and change exit_mydelay
    if (code == 31) {
        printk("s pressed\n");
        exit_mydelay = 1;
    }
    .............
}

Then we compiled the kernel and rebooted to apply the new Linux kernel.

During boot, the kernel prints "enter s to continue" and waits for user input.

Pressing 's' lets the boot process continue.

5-1) Add mydelay before do_basic_setup(). What happens and why?


init/main.c :

void kernel_init(){
    ...............
    mydelay("before do basic setup in kernel_init\n"); // wait here
    do_basic_setup();
    mydelay("after do basic setup in kernel_init\n"); // wait here
    .........
}

Then we compiled the kernel and rebooted to apply the new Linux kernel.

On boot, mydelay() runs and its message is displayed.

However, keyboard input is not possible before do_basic_setup() has run, because the keyboard driver is initialized by the driver setup that do_basic_setup() performs. So 's' could not be entered, and the rest of the boot process could not continue.

6) Which function call in atkbd_interrupt() actually displays the pressed key in the monitor?


drivers/input/keyboard/atkbd.c :

In Problem 3), which identified the pressed key, we added printk("%x pressed\n", code); to drivers/input/keyboard/atkbd.c. From this we can infer that the function that displays the key on the screen must take the code variable as an argument.

So, excluding uses of code in assignments, comparisons, and conditionals, we looked up all the related functions.

Because code is also used to derive a variable named keycode, we also looked for functions that take keycode as an argument.

Looking into the input_event and input_report_key functions should lead to the answer.

include/linux/input.h :

We can see that input_report_key also uses input_event.

drivers/input/input.c:

Guessing that input_handle_event is the function that processes the event, we looked up its definition in the same file.

The last line of that function calls input_pass_event, passing the code variable as an argument.

The definition of input_pass_event shows that it calls the event callback of the handler structure, so we can expect that input_event is the call in atkbd_interrupt() that displays the key on the monitor.

6-1) What are the interrupt numbers for divide-by-zero exception, keyboard interrupt, and "read" system call? Where is ISR1 and ISR2 for each of them (write the exact code location)? Show their code, too.

| | Interrupt Number | ISR1 | ISR2 |
|---|---|---|---|
| divide-by-zero exception | 0 | divide_error | do_divide_error |
| keyboard interrupt | 33 | interrupt[1] | atkbd_interrupt |
| read system call | 128 | system_call | sys_read |

| | location |
|---|---|
| ISR1 | arch/x86/kernel/entry_32.S |
| ISR2(do_divide_error) | arch/x86/kernel/traps_32.c |
| ISR2(atkbd_interrupt) | drivers/input/keyboard/atkbd.c |
| ISR2(sys_read) | fs/read_write.c |

ISR1

Let's look at ISR1 (divide_error) for the divide-by-zero exception.
arch/x86/kernel/entry_32.S :


Let's look at ISR1 (interrupt[1]) for the keyboard interrupt.
arch/x86/kernel/entry_32.S :


Let's look at ISR1 (system_call) for the read system call.
arch/x86/kernel/entry_32.S :

... (middle part omitted) ...

ISR2

Actual code of ISR2 (do_divide_error) for the divide-by-zero exception
arch/x86/kernel/traps_32.c :

We can confirm that ISR2 (do_divide_error) for the divide-by-zero exception is assigned number 0.


Actual code of ISR2 (atkbd_interrupt) for the keyboard interrupt
drivers/input/keyboard/atkbd.c :


Actual code of ISR2 (sys_read) for the read system call
fs/read_write.c :

7) sys_call_table[] is in arch/x86/kernel/syscall_table_32.S. How many system calls does Linux 2.6 support?
What are the system call numbers for exit, fork, execve, wait4, read, write, and mkdir? Find system call numbers for sys_ni_syscall, which is defined at kernel/sys_ni.c. What is the role of sys_ni_syscall?

$ vi arch/x86/kernel/syscall_table_32.S

arch/x86/kernel/syscall_table_32.S : the system calls are numbered from 0 to 326, so Linux 2.6 supports 327 system calls.

Referring to the first screenshot, exit is number 1, fork is 2, execve is 11, read is 3, and write is 4.

In addition, wait4 is number 114 and mkdir is number 39.

sys_ni_syscall appears at several system call numbers, such as 17, 31, and 32. Opening kernel/sys_ni.c shows its definition.

kernel/sys_ni.c :

sys_ni_syscall is the function that table entries for unimplemented system calls point to, and it returns -ENOSYS. ENOSYS is the error code raised when an unimplemented function is used.

8) Change the kernel such that it prints "length 17 string found" for each printf(s) when the length of s is 17. Run a program that contains a printf() statement to see the effect. printf(s) calls write(1, s, strlen(s)) system call which in turn runs

printf(s) internally calls write(1, s, strlen(s)). So, to insert code that runs whenever printf is called, we need to know where the write call ends up.

fs/read_write.c :

The write function invokes sys_write, so we insert code there that checks whether count is 17.

Compile the kernel and reboot.

After booting, we created files with the following code.

hello_world.c :

#include <stdio.h>

int main() {
  printf("Hello World!\n");
}

ex.c :

#include <stdio.h>

int main() {
  printf("1234567890123456\n");
}

ex.c prints 17 characters, including the newline character (\n).

Because the log level was too low for printk output to be visible, we changed it to 8.

We then printed a 17-character string with printf. "Hello World!" is not 17 characters long, so printk is not called, but "1234567890123456" is 17 characters in total including the trailing newline (\n), so "length 17 string found" was printed.

9) You can call a system call indirectly with syscall().

write(1, "hi", 2);

can be written as

syscall(4, 1, "hi", 2); // 4 is the system call number for `write` system call

Write a program that prints "hello" on the screen using syscall.

Sol)

ex2.c :

#include <unistd.h>      // declares syscall()

int main() {
    syscall(4, 1, "hello", 5); // 4 is the system call number for `write` system call on 32-bit x86
    return 0;
}

10) Create a new system call, my_sys_call with system call number 17 (system call number 17 is one that is not being used currently). Define my_sys_call() just before sys_write() in fs/read_write.c. Write a program that uses this system call:

#include <unistd.h>

int main(){
    syscall(17); // calls a system call with syscall number 17
    return 0;
}

When the above program runs, the kernel should display

hello from my_sys_call

Sol)

10-1) Create another system call that will add two numbers given by the user.

Suppose 31 is an empty entry in sys_call_table.

arch/x86/kernel/syscall_table_32.S :

We replaced entry 31 with the new system call, my_sys_sum.

fs/read_write.c :

ex2.c :

#include <stdio.h>
#include <unistd.h>

int main(){
    int sum;
    sum = syscall(31, 4, 9);  // suppose 31 is an empty entry in sys_call_table
    printf("sum is %d\n", sum);
    return 0;
}

11) Modify the kernel such that it displays the system call number for all system calls. Run a simple program that displays "hello" on the screen and find out what system calls have been called. Also explain for each system call why that system call has been used.

Suppose 31 is an empty entry in sys_call_table.

arch/x86/kernel/syscall_table_32.S :

We replaced entry 31 with the new system call, my_sys_call_num.

fs/read_write.c :

arch/x86/kernel/entry_32.S :

Below syscall_call, we insert assembly code that calls my_sys_call_num. To pass the system call number as the function's first argument, we execute pushl %eax before the call. After the call returns, popl %eax restores the register.

As a result, we can see the system calls being invoked.

12) What system calls are being called when you remove a file? Use system() function to run a Linux command as below. Explain what each system call is doing. You need to make f1 file before you run it. Also explain for each system call why that system call has been used.

We created the ex.c file as follows.
ex.c :

#include <stdlib.h>

int main() {
    system("rm x");
    return 0;
}

system is a function that runs a Linux command directly.
The Linux command rm x removes the file named x.

$ gcc -o ex ex.c
$ echo 8 > /proc/sys/kernel/printk
$ ./ex







Listing the system calls that were used, with each number paired with its name, gives the following.

number name
45 sys_brk
33 sys_access
5 sys_open
197 sys_fstat64
192 sys_mmap2
6 sys_close
5 sys_open
197 sys_fstat64
192 sys_mmap2
192 sys_mmap2
192 sys_mmap2
192 sys_mmap2
6 sys_close
192 sys_mmap2
243 sys_set_thread_area
125 sys_mprotect
125 sys_mprotect
125 sys_mprotect
91 sys_munmap
45 sys_brk
33 sys_access
5 sys_open
197 sys_fstat64
192 sys_mmap2
6 sys_close
5 sys_open
4 sys_write
197 sys_fstat64
192 sys_mmap2
192 sys_mmap2
192 sys_mmap2
192 sys_mmap2
6 sys_close
5 sys_open
3 sys_read
197 sys_fstat64
192 sys_mmap2
192 sys_mmap2
6 sys_close
5 sys_open
3 sys_read
197 sys_fstat64
192 sys_mmap2
192 sys_mmap2
192 sys_mmap2
6 sys_close
192 sys_mmap2
243 sys_set_thread_area
125 sys_mprotect
125 sys_mprotect
125 sys_mprotect
125 sys_mprotect
125 sys_mprotect
91 sys_munmap
45 sys_brk
33 sys_access
5 sys_open
197 sys_fstat64
192 sys_mmap2
6 sys_close
5 sys_open
3 sys_read
197 sys_fstat64
192 sys_mmap2
6 sys_close
5 sys_open
3 sys_read
197 sys_fstat64
192 sys_mmap2
192 sys_mmap2
192 sys_mmap2
192 sys_mmap2
6 sys_close
192 sys_mmap2
243 sys_set_thread_area
125 sys_mprotect
125 sys_mprotect
125 sys_mprotect
91 sys_munmap
119 sys_sigreturn

The system call flow shows that the system first checks whether the rm program exists, reads information about that file, and executes it. rm then checks information about f1 and deletes the file.

sys_mmap2 and sys_munmap map files such as shared libraries into the process's memory and unmap them again. sys_set_thread_area sets up the thread-local storage area of the new program, and sys_mprotect changes the access permissions of memory pages, for example to make loaded library data read-only. Neither of them locks a file.

13) Find rm.c in busybox-1.31.1 and show the code that actually removes f1. Note that all Linux commands are actually programs, and running the rm command means running the rm.c program. rm needs a system call defined in uClibc-0.9.33.2 to remove a file. You may want to continue the code tracing all the way up to "INT 0x80" in uClibc for this system call.

busybox-1.31.1/coreutils/rm.c :

#include "libbb.h"

int rm_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int rm_main(int argc UNUSED_PARAM, char **argv)
{
	int status = 0;
	int flags = 0;
	unsigned opt;

	opt = getopt32(argv, "^" "fiRrv" "\0" "f-i:i-f");
	argv += optind;
	if (opt & 1)
		flags |= FILEUTILS_FORCE;
	if (opt & 2)
		flags |= FILEUTILS_INTERACTIVE;
	if (opt & (8|4))
		flags |= FILEUTILS_RECUR;
	if ((opt & 16) && FILEUTILS_VERBOSE)
		flags |= FILEUTILS_VERBOSE;

	if (*argv != NULL) {
		do {
			const char *base = bb_get_last_path_component_strip(*argv);

			if (DOT_OR_DOTDOT(base)) {
				bb_error_msg("can't remove '.' or '..'");
			} else if (remove_file(*argv, flags) >= 0) {
				continue;
			}
			status = 1;
		} while (*++argv);
	} else if (!(flags & FILEUTILS_FORCE)) {
		bb_show_usage();
	}

	return status;
}

The code of the rm program is shown above. It checks the various flags and calls the remove_file function to delete the file.

busybox-1.31.1/libbb/remove_file.c :

#include "libbb.h"

int FAST_FUNC remove_file(const char *path, int flags)
{
	struct stat path_stat;

	if (lstat(path, &path_stat) < 0) {
		if (errno != ENOENT) {
			bb_perror_msg("can't stat '%s'", path);
			return -1;
		}
		if (!(flags & FILEUTILS_FORCE)) {
			bb_perror_msg("can't remove '%s'", path);
			return -1;
		}
		return 0;
	}

	if (S_ISDIR(path_stat.st_mode)) {
		DIR *dp;
		struct dirent *d;
		int status = 0;

		if (!(flags & FILEUTILS_RECUR)) {
			bb_error_msg("'%s' is a directory", path);
			return -1;
		}

		if ((!(flags & FILEUTILS_FORCE) && access(path, W_OK) < 0 && isatty(0))
		 || (flags & FILEUTILS_INTERACTIVE)
		) {
			fprintf(stderr, "%s: descend into directory '%s'? ",
					applet_name, path);
			if (!bb_ask_y_confirmation())
				return 0;
		}

		dp = opendir(path);
		if (dp == NULL) {
			return -1;
		}

		while ((d = readdir(dp)) != NULL) {
			char *new_path;

			new_path = concat_subpath_file(path, d->d_name);
			if (new_path == NULL)
				continue;
			if (remove_file(new_path, flags) < 0)
				status = -1;
			free(new_path);
		}

		if (closedir(dp) < 0) {
			bb_perror_msg("can't close '%s'", path);
			return -1;
		}

		if (flags & FILEUTILS_INTERACTIVE) {
			fprintf(stderr, "%s: remove directory '%s'? ",
					applet_name, path);
			if (!bb_ask_y_confirmation())
				return status;
		}

		if (status == 0 && rmdir(path) < 0) {
			bb_perror_msg("can't remove '%s'", path);
			return -1;
		}

		if (flags & FILEUTILS_VERBOSE) {
			printf("removed directory: '%s'\n", path);
		}

		return status;
	}

	/* !ISDIR */
	if ((!(flags & FILEUTILS_FORCE)
	     && access(path, W_OK) < 0
	     && !S_ISLNK(path_stat.st_mode)
	     && isatty(0))
	 || (flags & FILEUTILS_INTERACTIVE)
	) {
		fprintf(stderr, "%s: remove '%s'? ", applet_name, path);
		if (!bb_ask_y_confirmation())
			return 0;
	}

	if (unlink(path) < 0) {
		bb_perror_msg("can't remove '%s'", path);
		return -1;
	}

	if (flags & FILEUTILS_VERBOSE) {
		printf("removed '%s'\n", path);
	}

	return 0;
}

uClibc-0.9.33.2/libc/sysdeps/linux/common/rmdir.c :

#include <sys/syscall.h>
#include <unistd.h>

_syscall1(int, rmdir, const char *, pathname)
libc_hidden_def(rmdir)

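The actual code of rmdir is in uClibc. (rmdir is the call remove_file uses for directories; for a regular file such as f1, the listing above calls unlink(path) instead.)

Let's follow _syscall1.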

uClibc-0.9.33.2/libc/sysdeps/linux/common/bits/syscalls-common.h :

#define _syscall1(args...)  SYSCALL_FUNC(1, args)

#define SYSCALL_FUNC(nargs, type, name, args...)                    \
type name(C_DECL_ARGS_##nargs(args)) {                              \
    return (type)INLINE_SYSCALL(name, nargs, C_ARGS_##nargs(args)); \
}

#define INLINE_SYSCALL(name, nr, args...) INLINE_SYSCALL_NCS(__NR_##name, nr, args)

#define INLINE_SYSCALL_NCS(name, nr, args...)                       \
(__extension__                                                      \
 ({                                                                 \
    INTERNAL_SYSCALL_DECL(__err);                                   \
    (__extension__                                                  \
     ({                                                             \
       long __res = INTERNAL_SYSCALL_NCS(name, __err, nr, args);    \
       if (unlikely(INTERNAL_SYSCALL_ERROR_P(__res, __err))) {      \
        __set_errno(INTERNAL_SYSCALL_ERRNO(__res, __err));          \
        __res = -1L;                                                \
       }                                                            \
       __res;                                                       \
      })                                                            \
    );                                                              \
  })                                                                \
)

uClibc-0.9.33.2/libc/sysdeps/linux/i386/bits/syscalls.h :

#define INTERNAL_SYSCALL_NCS(name, err, nr, args...)    \
(__extension__                                          \
 ({                                                     \
    register unsigned int resultvar;                    \
    __asm__ __volatile__ (                              \
        LOADARGS_##nr                                   \
        "movl   %1, %%eax\n\t"                          \
        "int    $0x80\n\t"                              \
        RESTOREARGS_##nr                                \
        : "=a" (resultvar)                              \
        : "g" (name) ASMFMT_##nr(args) : "memory", "cc" \
    );                                                  \
    (int) resultvar;                                    \
  })                                                    \
)

As the code shows, INTERNAL_SYSCALL_NCS generates the assembly code that raises the interrupt for rmdir with int $0x80. Here, 0x80 is 128 in decimal, the interrupt number for system calls.