An operating system is a collection of service routines. A service routine runs when an application requests it (system call interrupt), when a serious error occurs while an application is running (exception interrupt), or when an external hardware event occurs that the operating system has to handle (hardware interrupt).
These service routines are called ISRs (Interrupt Service Routines).
external event | interrupt number | ISR (ISR1 => ISR2) |
---|---|---|
An application calls read(...) | int 128 (syscall num 3) | system_call() => sys_read() |
An application calls write(...) | int 128 (syscall num 4) | system_call() => sys_write() |
timer ticks | int 32 | interrupt[0] => timer_interrupt() |
key stroke | int 33 | interrupt[1] => atkbd_interrupt() |
An application runs x=x/0; | int 0 | divide_error() => do_divide_error() |
page fault while an application runs | int 14 | page_fault() => do_page_fault() |

ISR1s are all located in `arch/x86/kernel/entry_32.S`.
ISR2s are located in various places in the kernel.
When an interrupt, INT x, happens, the CPU stores the current cs, eip, and flag register on the stack and jumps to the ISR1 for INT x.
The ISR1 locations are written in the IDT (Interrupt Descriptor Table), and the CPU jumps to the location written in `IDT[x]`.
ISR1 knows the location of ISR2 because it is hard-coded (exception interrupt case), written in the `irq_desc` table (hardware interrupt case), or written in `sys_call_table` (system call interrupt case).
Hardware interrupts are assigned the following interrupt numbers in Linux.
device | interrupt number | irq number |
---|---|---|
timer | 32 | 0 |
keyboard | 33 | 1 |
PIC cascading | 34 | 2 |
second serial port | 35 | 3 |
first serial port | 36 | 4 |
floppy disk | 38 | 6 |
system clock | 40 | 8 |
network interface | 42 | 10 |
usb port, sound card | 43 | 11 |
ps/2 mouse | 44 | 12 |
math coprocessor | 45 | 13 |
eide disk, first chain | 46 | 14 |
eide disk, second chain | 47 | 15 |
Exceptions are assigned the following interrupt numbers.
exception | interrupt number |
---|---|
divide-by-zero error | 0 |
debug | 1 |
NMI | 2 |
breakpoint | 3 |
overflow | 4 |
bounds check | 5 |
invalid opcode | 6 |
device not available | 7 |
double fault | 8 |
coprocessor segment overrun | 9 |
invalid TSS | 10 |
segment not present | 11 |
stack segment fault | 12 |
general protection | 13 |
page fault | 14 |
intel-reserved | 15 |
floating point error | 16 |
alignment check | 17 |
machine check | 18 |
simd floating point | 19 |
Finally, system calls in Linux are all assigned the same interrupt number, 128 (0x80). To differentiate between system calls, a unique system call number is given to each one. For the full table, look at `arch/x86/kernel/syscall_table_32.S`.
system calls | interrupt number | system call number |
---|---|---|
exit | 128 | 1 |
fork | 128 | 2 |
read | 128 | 3 |
write | 128 | 4 |
open | 128 | 5 |
close | 128 | 6 |
... | ... | ... |

Interrupts are detected by the CPU. Exceptions are detected when the corresponding error happens. System calls are detected when the program executes the INT 128 instruction. Hardware interrupts are detected when the corresponding device raises its interrupt line; this case needs a more detailed explanation.
The above picture shows how hardware interrupts are detected by the CPU. All hardware devices are connected to the 8259A interrupt controller through IRQ lines. The timer is connected through the IRQ0 line, the keyboard through the IRQ1 line, and so on. When an event happens in one of these devices, the corresponding IRQ line is activated, and the 8259A signals the CPU about this event along with the interrupt number for this IRQ line. The interrupt number is computed as (IRQ line number + 32) in Linux.
Interrupts are first handled by the CPU, and then the operating system takes care of the rest.

The CPU handles the `INT x` instruction in two steps:

1. Save the current cs, eip, and flag register on the stack.
2. Jump to the location written in `IDT[x]`.
   - If `IDT[32]` indicates address `0x10200`, the ISR for the timer interrupt is located at address `0x10200`, which means whenever the timer ticks, the CPU jumps to address `0x10200`.
   - If `IDT[33]` indicates address `0x10300`, the ISR for the keyboard is located at `0x10300`: when a key is pressed, the CPU jumps to `0x10300` and starts to execute whatever program is stored there.

The operating system prepares the tables used above:

- The IDT is initialized in `arch/x86/kernel/traps_32.c/trap_init()` (for exception interrupts and the system call interrupt) and in `arch/x86/kernel/i8259_32.c/init_IRQ()` (for hardware interrupts).
- ISR1 addresses are written into the IDT by calling `set_intr_gate()` for hardware interrupts in `arch/x86/kernel/i8259_32.c/native_init_IRQ()`, and by calling `set_trap_gate()` for most of the exceptions and `set_system_gate()` for system call interrupts in `arch/x86/kernel/traps_32.c/trap_init()`.
- ISR2s for hardware interrupts are registered in the `irq_desc[]` table by calling `request_irq()`.
- ISR2s for system calls are written in `sys_call_table[]` in `arch/x86/kernel/syscall_table_32.S`.
- ISR1s are defined in `arch/x86/kernel/entry_32.S`, and ISR2s are defined in various places.

Interrupt numbers and their ISR1 and ISR2 list.

interrupt number | ISR1 | ISR2 |
---|---|---|
0 | divide_error | do_divide_error |
1 | debug | do_debug |
... | | |
32 | interrupt[0] | timer_interrupt |
33 | interrupt[1] | atkbd_interrupt |
... | | |
128 (syscall num 1) | system_call | sys_exit |
128 (syscall num 2) | system_call | sys_fork |
... | | |

Creating a new system call: 2 steps

1. Find an empty `sys_call_table` entry (`sys_ni_syscall`) at some index x and replace it with `my_syscall`.
2. Define `my_syscall` (in an appropriate file such as `fs/read_write.c`):

asmlinkage void my_syscall(){
    printk("hello from my_syscall\n");
}

A user program, `ex1.c`, can then call it:

void main(){
    syscall(x);
}

Running it follows this path:

./ex1
==> syscall(x)
==> mov eax, x
    int 0x80
==> system_call
==> my_syscall

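- An application calls `scanf()`: `system_call()` => `sys_read()` (3) : 128
- A key is pressed: `interrupt[1]` : 33
- An application calls `printf()`: `system_call()` => `sys_write()` (4) : 128
- A page fault occurs: `page_fault()` => `do_page_fault()` : 14

Modify `drivers/input/keyboard/atkbd.c` as follows.
$ vi drivers/input/keyboard/atkbd.c
drivers/input/keyboard/atkbd.c: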
static irqreturn_t atkbd_interrupt(....){
return IRQ_HANDLED; // Add this at the first line
.............
}
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
After booting finishes, the login prompt is displayed, but keyboard input has no effect.
The original code processes the keyboard input after the interrupt occurs and then returns `IRQ_HANDLED`. Because the handler now returns immediately without doing that processing, typing any character has no effect.
$ vi drivers/input/keyboard/atkbd.c
Added `printk("%x pressed\n", code);` so that `code` is printed with `printk`.
drivers/input/keyboard/atkbd.c:
Then compiled the kernel and rebooted to apply the new Linux kernel.
$ cat /proc/sys/kernel/printk
1 4 1 7
The four values are the current console log level, the default message log level, the minimum console log level, and the default console log level. The current console log level is 1, which is lower than the default message level, so the text printed by `printk()` does not appear on the screen.
$ echo 8 > /proc/sys/kernel/printk
This command changes the current console log level to 8, as shown below.
$ cat /proc/sys/kernel/printk
8 4 1 7
Now that the level is 8, the text printed by `printk` is visible on the screen, and the key codes being typed appear as shown above.
$ echo 1 > /proc/sys/kernel/printk
To hide the `printk` output again, set the current log level back to 1.
$ vi drivers/input/keyboard/atkbd.c
So that each key press is handled as if the next key had been typed, `unsigned int code = data;` was changed to `unsigned int code = data+1;`.
drivers/input/keyboard/atkbd.c:
Then compiled the kernel and rebooted to apply the new Linux kernel.
After rebooting, typing "root" to log in prints "tppy", with each character shifted by one key on the keyboard.
Define a function `mydelay` in `init/main.c` which, whenever called, stops the booting process until you hit "s". Call this function after the `do_basic_setup()` call in `kernel_init()` in order to make the kernel stop and wait for "s" during the booting process. You need to modify `atkbd.c` such that it changes `exit_mydelay` to 1 when the user presses "s".
init/main.c:
........
int exit_mydelay; // define a global variable
void mydelay(char *str){
printk("%s", str); // print str as data, not as a format string
printk("enter s to continue\n");
exit_mydelay=0; // init to zero
for(;;){ // and wait here until the user press 's'
msleep(1); // sleep 1 millisecond so that keyboard interrupt ISR
// can do its job
if (exit_mydelay==1) break; // if the user press 's', break
}
}
void kernel_init(){
...............
do_basic_setup();
mydelay("after do basic setup in kernel_init\n"); // wait here
.........
}
drivers/input/keyboard/atkbd.c
:
.........
extern int exit_mydelay; // declare as extern since it is defined in main.c
static irqreturn_t atkbd_interrupt(....){
.............
// detect 's' key pressed and change exit_mydelay
if (code == 31) {
printk("s pressed\n");
exit_mydelay = 1;
}
.............
}
Then compiled the kernel and rebooted to apply the new Linux kernel.
During boot, the kernel prints "enter s to continue" and waits for user input.
Pressing "s" lets booting continue.
Call `mydelay()` before `do_basic_setup()` as well. What happens and why?
init/main.c:
void kernel_init(){
...............
mydelay("before do basic setup in kernel_init\n"); // wait here
do_basic_setup();
mydelay("after do basic setup in kernel_init\n"); // wait here
.........
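}
Then compiled the kernel and rebooted to apply the new Linux kernel.
On boot, `mydelay()` runs and the message below appears.
However, before `do_basic_setup()` has run, keyboard input is not possible, so "s" could not be entered and the rest of the boot process could not proceed. This is presumably because device drivers, including the keyboard driver that registers `atkbd_interrupt()`, are only initialized inside `do_basic_setup()`, so nothing can set `exit_mydelay` to 1 before then.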
Which function call in `atkbd_interrupt()` actually displays the pressed key on the monitor?
drivers/input/keyboard/atkbd.c:
In Prob 3), which looked for the key typed on the keyboard, `printk("%x pressed\n", code);` was added to `drivers/input/keyboard/atkbd.c`. From this we can infer that the function that displays the key on the screen must take the `code` variable as an argument.
So, excluding uses of `code` in assignments, comparisons, and conditionals, all the related functions were looked up as shown below.
Also, because a variable called `keycode` is derived from `code` and used afterwards, functions that take `keycode` as an argument were looked up as well.
Looking into `input_event` and `input_report_key` seems likely to give the answer.
include/linux/input.h:
We can see that `input_report_key` also uses `input_event`.
drivers/input/input.c:
Guessing that `input_handle_event` is the function that processes the event, its definition was looked up in the same file.
On its last line, it calls `input_pass_event`, passing the `code` variable as an argument.
The definition of `input_pass_event` shows that it calls the `event` member of the `handler` structure, so we can expect that `input_event`, called inside `atkbd_interrupt()`, is what puts the key on the monitor.

event | Interrupt Number | ISR1 | ISR2 |
---|---|---|---|
divide-by-zero exception | 0 | divide_error | do_divide_error |
keyboard interrupt | 33 | interrupt[1] | atkbd_interrupt |
read system call | 128 | system_call | sys_read |

routine | location |
---|---|
ISR1 | arch/x86/kernel/entry_32.S |
ISR2(do_divide_error) | arch/x86/kernel/traps_32.c |
ISR2(atkbd_interrupt) | drivers/input/keyboard/atkbd.c |
ISR2(sys_read) | fs/read_write.c |

Let us check ISR1 (`divide_error`) for the divide-by-zero exception.
arch/x86/kernel/entry_32.S:
Let us check ISR1 (`interrupt[1]`) for the keyboard interrupt.
arch/x86/kernel/entry_32.S:
Let us check ISR1 (`system_call`) for the `read` system call.
arch/x86/kernel/entry_32.S:
... (middle omitted) ...
Actual code of ISR2 (`do_divide_error`) for the divide-by-zero exception
arch/x86/kernel/traps_32.c:
We can see that ISR2 (`do_divide_error`) for the divide-by-zero exception is number 0.
Actual code of ISR2 (`atkbd_interrupt`) for the keyboard interrupt
drivers/input/keyboard/atkbd.c:
Actual code of ISR2 (`sys_read`) for the `read` system call
fs/read_write.c:
`sys_call_table[]` is in `arch/x86/kernel/syscall_table_32.S`. How many system calls does Linux 2.6 support? What are the system call numbers for `exit`, `fork`, `execve`, `wait4`, `read`, `write`, and `mkdir`? Find system call numbers for `sys_ni_syscall`, which is defined at `kernel/sys_ni.c`. What is the role of `sys_ni_syscall`?
$ vi arch/x86/kernel/syscall_table_32.S
arch/x86/kernel/syscall_table_32.S:
The system calls are numbered from 0 to 326, so this Linux 2.6 kernel supports 327 system calls.
Referring to the first screenshot, `exit` is number 1, `fork` is 2, `execve` is 11, `read` is 3, and `write` is 4.
In addition, `wait4` is number 114 and `mkdir` is 39.
`sys_ni_syscall` occupies several system call numbers, such as 17, 31, and 32. Opening `kernel/sys_ni.c` shows the following.
kernel/sys_ni.c:
`sys_ni_syscall` is the function that unimplemented system call entries point to, and it returns `-ENOSYS`. `ENOSYS` is the error code produced when a function that is not implemented is used.
Modify the kernel so that it prints "length 17 string found" for each `printf(s)` when the length of `s` is 17. Run a program that contains a `printf()` statement to see the effect. `printf(s)` calls the `write(1, s, strlen(s))` system call, which in turn runs `sys_write()`.
`printf(s)` internally calls `write(1, s, strlen(s))`. Therefore, to insert code that runs whenever `printf` is called, we need to know the point at which `write` is invoked.
fs/read_write.c:
The `write` function calls `sys_write`, so it is enough to insert code there that checks whether `count` is `17`.
Compiled the kernel and rebooted.
After booting, files with the code below were created.
hello_world.c
:
#include <stdio.h>
int main() {
printf("Hello World!\n");
}
ex.c
:
#include <stdio.h>
int main() {
printf("1234567890123456\n");
}
`ex.c` prints 17 characters, including the newline character (`\n`).
Because the log level is low and `printk` output is not visible, it is changed to 8.
An arbitrary 17-character string is printed with `printf`. "Hello World!" is not 17 characters long, so `printk` is not called, but "1234567890123456" is 17 characters in total once the trailing newline (`\n`) is counted, so "length 17 string found" was printed.
You can call a system call indirectly with `syscall()`.
write(1, "hi", 2);
can be written as
syscall(4, 1, "hi", 2); // 4 is the system call number for `write` system call
Write a program that prints "hello" on the screen using syscall.
ex2.c:
#include <stdio.h>
#include <unistd.h> // declares syscall()
int main() {
    syscall(4, 1, "hello", 5); // 4 is the system call number for `write` system call
    return 0;
}
Create a new system call, `my_sys_call`, with system call number 17 (system call number 17 is one that is not being used currently). Define `my_sys_call()` just before `sys_write()` in `fs/read_write.c`. Write a program that uses this system call:
void main(){
    syscall(17); // calls a system call with syscall number 17
}
When the above program runs, the kernel should display
hello from my_sys_call

The steps are:

1. Find an empty `sys_call_table` entry (`sys_ni_syscall`) at index x.
2. Write `my_sys_call` into `arch/x86/kernel/syscall_table_32.S` at index x.
3. Define `my_sys_call` (in `fs/read_write.c`):

asmlinkage void my_sys_call(){
    printk("hello from my_sys_call\n");
}

4. Call it from a user program:

void main(){
    syscall(x);
}

Suppose 31 is an empty entry in sys_call_table.
arch/x86/kernel/syscall_table_32.S:
Entry 31 was replaced with the new system call `my_sys_sum`.
fs/read_write.c:
ex2.c:
#include <stdio.h>
#include <unistd.h> // declares syscall()
int main(){
    int sum;
    sum = syscall(31, 4, 9); // suppose 31 is an empty entry in sys_call_table
    printf("sum is %d\n", sum);
    return 0;
}
Suppose 31 is an empty entry in sys_call_table.
arch/x86/kernel/syscall_table_32.S:
Entry 31 was replaced with the new system call `my_sys_call_num`.
fs/read_write.c:
arch/x86/kernel/entry_32.S:
Assembly code that calls `my_sys_call_num` is inserted below `syscall_call`. To pass the system call number as the first argument of the function, `pushl %eax` is executed before the call. After the function returns, `popl %eax` restores the register state.
As shown above, we can see the system calls being invoked.
Write a program that uses the `system()` function to run a Linux command as below. Explain what each system call is doing. You need to make the `f1` file before you run it. Also explain for each system call why that system call has been used.
The file `ex.c` was created as below.
ex.c:
#include <stdlib.h>
int main() {
system("rm x");
return 0;
}
`system` is a function that runs a Linux command directly.
The Linux command `rm x` removes the file named `x`.
$ gcc -o ex ex.c
$ echo 8 > /proc/sys/kernel/printk
$ ./ex
The system calls that were used are listed below, each number paired with its name.

number | name |
---|---|
45 | sys_brk |
33 | sys_access |
5 | sys_open |
197 | sys_fstat64 |
192 | sys_mmap2 |
6 | sys_close |
5 | sys_open |
197 | sys_fstat64 |
192 | sys_mmap2 |
192 | sys_mmap2 |
192 | sys_mmap2 |
192 | sys_mmap2 |
6 | sys_close |
192 | sys_mmap2 |
243 | sys_set_thread_area |
125 | sys_mprotect |
125 | sys_mprotect |
125 | sys_mprotect |
91 | sys_munmap |
45 | sys_brk |
33 | sys_access |
5 | sys_open |
197 | sys_fstat64 |
192 | sys_mmap2 |
6 | sys_close |
5 | sys_open |
4 | sys_write |
197 | sys_fstat64 |
192 | sys_mmap2 |
192 | sys_mmap2 |
192 | sys_mmap2 |
192 | sys_mmap2 |
6 | sys_close |
5 | sys_open |
3 | sys_read |
197 | sys_fstat64 |
192 | sys_mmap2 |
192 | sys_mmap2 |
6 | sys_close |
5 | sys_open |
3 | sys_read |
197 | sys_fstat64 |
192 | sys_mmap2 |
192 | sys_mmap2 |
192 | sys_mmap2 |
6 | sys_close |
192 | sys_mmap2 |
243 | sys_set_thread_area |
125 | sys_mprotect |
125 | sys_mprotect |
125 | sys_mprotect |
125 | sys_mprotect |
125 | sys_mprotect |
91 | sys_munmap |
45 | sys_brk |
33 | sys_access |
5 | sys_open |
197 | sys_fstat64 |
192 | sys_mmap2 |
6 | sys_close |
5 | sys_open |
3 | sys_read |
197 | sys_fstat64 |
192 | sys_mmap2 |
6 | sys_close |
5 | sys_open |
3 | sys_read |
197 | sys_fstat64 |
192 | sys_mmap2 |
192 | sys_mmap2 |
192 | sys_mmap2 |
192 | sys_mmap2 |
6 | sys_close |
192 | sys_mmap2 |
243 | sys_set_thread_area |
125 | sys_mprotect |
125 | sys_mprotect |
125 | sys_mprotect |
91 | sys_munmap |
119 | sys_sigreturn |

- `sys_access` checks the permissions of a file.
- `sys_brk` grows or shrinks the heap (data) area.
- `sys_fstat64` reads file information.
- `sys_mmap2` maps a file or device into memory.
- `sys_set_thread_area` manipulates thread-local storage.
- `sys_mprotect` controls access to a memory region.
- `sys_munmap` releases memory pages created by mmap.
- `sys_sigreturn` returns from a signal handler and cleans up the stack frame.

Looking at the system call flow, the system first checks whether the `rm` file exists, reads information about it, and then executes it.
`rm` then checks the information about `f1` and deletes the file.
`sys_mmap2` and `sys_munmap` appear to be used to map files stored on disk, such as the executable and its shared libraries, into memory so they can be accessed.
`sys_set_thread_area` and `sys_mprotect` do not lock files: during program start-up they most likely set up thread-local storage and adjust the access permissions of the freshly mapped memory regions.
Find `rm.c` in `busybox-1.31.1` and show the code that actually removes `f1`. Note that all Linux commands are actually programs, and running the `rm` command means running the rm.c program. `rm` needs a system call defined in `uClibc-0.9.33.2` to remove a file. You may want to continue the code tracing all the way up to "INT 0x80" in uClibc for this system call.
busybox-1.31.1/coreutils/rm.c:
#include "libbb.h"
int rm_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int rm_main(int argc UNUSED_PARAM, char **argv)
{
int status = 0;
int flags = 0;
unsigned opt;
opt = getopt32(argv, "^" "fiRrv" "\0" "f-i:i-f");
argv += optind;
if (opt & 1)
flags |= FILEUTILS_FORCE;
if (opt & 2)
flags |= FILEUTILS_INTERACTIVE;
if (opt & (8|4))
flags |= FILEUTILS_RECUR;
if ((opt & 16) && FILEUTILS_VERBOSE)
flags |= FILEUTILS_VERBOSE;
if (*argv != NULL) {
do {
const char *base = bb_get_last_path_component_strip(*argv);
if (DOT_OR_DOTDOT(base)) {
bb_error_msg("can't remove '.' or '..'");
} else if (remove_file(*argv, flags) >= 0) {
continue;
}
status = 1;
} while (*++argv);
} else if (!(flags & FILEUTILS_FORCE)) {
bb_show_usage();
}
return status;
}
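The code of the `rm` program is shown above. It checks the various flags and calls the `remove_file` function to delete the file. `remove_file` is defined in busybox's `libbb/remove_file.c`: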
#include "libbb.h"
int FAST_FUNC remove_file(const char *path, int flags)
{
struct stat path_stat;
if (lstat(path, &path_stat) < 0) {
if (errno != ENOENT) {
bb_perror_msg("can't stat '%s'", path);
return -1;
}
if (!(flags & FILEUTILS_FORCE)) {
bb_perror_msg("can't remove '%s'", path);
return -1;
}
return 0;
}
if (S_ISDIR(path_stat.st_mode)) {
DIR *dp;
struct dirent *d;
int status = 0;
if (!(flags & FILEUTILS_RECUR)) {
bb_error_msg("'%s' is a directory", path);
return -1;
}
if ((!(flags & FILEUTILS_FORCE) && access(path, W_OK) < 0 && isatty(0))
|| (flags & FILEUTILS_INTERACTIVE)
) {
fprintf(stderr, "%s: descend into directory '%s'? ",
applet_name, path);
if (!bb_ask_y_confirmation())
return 0;
}
dp = opendir(path);
if (dp == NULL) {
return -1;
}
while ((d = readdir(dp)) != NULL) {
char *new_path;
new_path = concat_subpath_file(path, d->d_name);
if (new_path == NULL)
continue;
if (remove_file(new_path, flags) < 0)
status = -1;
free(new_path);
}
if (closedir(dp) < 0) {
bb_perror_msg("can't close '%s'", path);
return -1;
}
if (flags & FILEUTILS_INTERACTIVE) {
fprintf(stderr, "%s: remove directory '%s'? ",
applet_name, path);
if (!bb_ask_y_confirmation())
return status;
}
if (status == 0 && rmdir(path) < 0) {
bb_perror_msg("can't remove '%s'", path);
return -1;
}
if (flags & FILEUTILS_VERBOSE) {
printf("removed directory: '%s'\n", path);
}
return status;
}
/* !ISDIR */
if ((!(flags & FILEUTILS_FORCE)
&& access(path, W_OK) < 0
&& !S_ISLNK(path_stat.st_mode)
&& isatty(0))
|| (flags & FILEUTILS_INTERACTIVE)
) {
fprintf(stderr, "%s: remove '%s'? ", applet_name, path);
if (!bb_ask_y_confirmation())
return 0;
}
if (unlink(path) < 0) {
bb_perror_msg("can't remove '%s'", path);
return -1;
}
if (flags & FILEUTILS_VERBOSE) {
printf("removed '%s'\n", path);
}
return 0;
}
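For a regular file such as `f1`, `remove_file` ends by calling `unlink(path)`; a directory is removed with `rmdir(path)`. Both are thin system call wrappers in uClibc. The `rmdir` wrapper is traced here, and `unlink` presumably follows the same `_syscall1` pattern.
uClibc-0.9.33.2/libc/sysdeps/linux/common/rmdir.c:
#include <sys/syscall.h>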
#include <unistd.h>
_syscall1(int, rmdir, const char *, pathname)
libc_hidden_def(rmdir)
The actual code of `rmdir` is in `uClibc`.
Let us follow `_syscall1`.
uClibc-0.9.33.2/libc/sysdeps/linux/common/bits/syscalls-common.h:
#define _syscall1(args...) SYSCALL_FUNC(1, args)
#define SYSCALL_FUNC(nargs, type, name, args...) \
type name(C_DECL_ARGS_##nargs(args)) { \
return (type)INLINE_SYSCALL(name, nargs, C_ARGS_##nargs(args)); \
}
#define INLINE_SYSCALL(name, nr, args...) INLINE_SYSCALL_NCS(__NR_##name, nr, args)
#define INLINE_SYSCALL_NCS(name, nr, args...) \
(__extension__ \
({ \
INTERNAL_SYSCALL_DECL(__err); \
(__extension__ \
({ \
long __res = INTERNAL_SYSCALL_NCS(name, __err, nr, args); \
if (unlikely(INTERNAL_SYSCALL_ERROR_P(__res, __err))) { \
__set_errno(INTERNAL_SYSCALL_ERRNO(__res, __err)); \
__res = -1L; \
} \
__res; \
}) \
); \
}) \
)
`_syscall1` expands to `SYSCALL_FUNC`.
`SYSCALL_FUNC` uses `INLINE_SYSCALL`, and `INLINE_SYSCALL` uses `INLINE_SYSCALL_NCS`.
`INLINE_SYSCALL_NCS` in turn uses `INTERNAL_SYSCALL_NCS`.
uClibc-0.9.33.2/libc/sysdeps/linux/i386/bits/syscalls.h:
#define INTERNAL_SYSCALL_NCS(name, err, nr, args...) \
(__extension__ \
({ \
register unsigned int resultvar; \
__asm__ __volatile__ ( \
LOADARGS_##nr \
"movl %1, %%eax\n\t" \
"int $0x80\n\t" \
RESTOREARGS_##nr \
: "=a" (resultvar) \
: "g" (name) ASMFMT_##nr(args) : "memory", "cc" \
); \
(int) resultvar; \
}) \
)
As the code shows, `INTERNAL_SYSCALL_NCS` generates assembly code that loads the system call number into eax and raises the interrupt for `rmdir` with `int $0x80`. Here, `0x80` is 128 in decimal, which is the interrupt number for system calls.