Computer-Science

String Process Algorithm

Contents


์ŠคํŠธ๋ง ํƒ์ƒ‰ ์•Œ๊ณ ๋ฆฌ์ฆ˜


์ง์„ ์  ์•Œ๊ณ ๋ฆฌ์ฆ˜

aka. ๋ฌด์ž‘์ • ๋ฌด์ง€์„ฑ ์•Œ๊ณ ๋ฆฌ์ฆ˜

Pseudo Code

BruteForce(p[], t[]) // ๋‹จ์ˆœํ•œ algo์ง€๋งŒ ์ง์ ‘ ๊ตฌํ˜„ํ•˜๋ฉด ์–ด๋ ค์šธ ์ˆ˜ ์žˆ๋‹ค.
    M โ† ํŒจํ„ด์˜ ๊ธธ์ด; N โ† ํ…์ŠคํŠธ์˜ ๊ธธ์ด;
        for (i โ† 0, j โ† 0; j < M and i < N; i โ† i + 1, j โ† j + 1) do {
            if (t[i] =ฬธ p[j]) then {
                i โ† i - j; // i๊ฐ€ 1์นธ ์ด๋™
                j โ† -1;
            }
        }
    if (j = M) then return i - M;
    else return i;
end ButeForce()

์‹œ๊ฐ„๋ณต์žก๋„

์ตœ์•…์˜ ๊ฒฝ์šฐ ์‹œ๊ฐ„ ๋ณต์žก๋„๋Š” ํ…์ŠคํŠธ์˜ ๋ชจ๋“  ์œ„์น˜์—์„œ ํŒจํ„ด์„ ๋น„๊ตํ•ด์•ผ ํ•˜๋ฏ€๋กœ O(MN)์ด ๋จ

KMP ์•Œ๊ณ ๋ฆฌ์ฆ˜

์‹œ๊ฐ„ ๋ณต์žก๋„

O(M+N)

Pseudo Code: ์žฌ์‹œ์ž‘ ์œ„์น˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜

InitNext(p[])
    M โ† ํŒจํ„ด์˜ ๊ธธ์ด;
    next[0] โ† -1;
    for (i โ† 0, j โ† -1; i < M; i โ† i + 1, j โ† j + 1) do
    {
        next[i] โ† j;
        while ((j โ‰ฅ 0) and (p[i] =ฬธ p[j])) do
            j โ† next[j];
    }
end InitNext()

Pseudo Code: KMP ์•Œ๊ณ ๋ฆฌ์ฆ˜

KMP(p[], t[])
    M โ† ํŒจํ„ด์˜ ๊ธธ์ด; N โ† ํ…์ŠคํŠธ์˜ ๊ธธ์ด;
    InitNext(p);
    for(iโ†0,jโ†0; j <M and i<N; iโ†i+1,jโ†j+1) do
        while ((j โ‰ฅ 0) and (t[i] =ฬธ p[j])) do
            j โ† next[j];
    if (j = M) then return i - M;
    else return i;
end KMP()

ํŒจํ„ด์ด ๋‚ด์žฅ๋œ KMP ์•Œ๊ณ ๋ฆฌ์ฆ˜

if (p[i] == p[j]) then next[i] โ† next[j];
else next[i] โ† j

แ„‰แ…ณแ„แ…ณแ„…แ…ตแ†ซแ„‰แ…ฃแ†บ 2022-0-19 - แ…ฉแ„Œแ…ฅแ†ซ 11 24 32

๋ณด์ด์–ด-๋ฌด์–ด ์•Œ๊ณ ๋ฆฌ์ฆ˜

๋ถˆ์ผ์น˜ ๋ฌธ์ž ๋ฐฉ์ฑ…๊ณผ ์ผ์น˜ ์ ‘๋ฏธ๋ถ€ ๋ฐฉ์ฑ…

แ„‰แ…ณแ„แ…ณแ„…แ…ตแ†ซแ„‰แ…ฃแ†บ 2022-04-19 แ„‹แ…ฉแ„Œแ…ฅแ†ซ 11 26 39

Pseudo Code: ๋ถˆ์ผ์น˜ ๋ฌธ์ž ๋ฐฉ์ฑ… ์•Œ๊ณ ๋ฆฌ์ฆ˜

void InitSkip(char *p) {
    int i, M = strlen(p);
    for (i = 0; i < NUM; i++) skip[i] = M;
    for (i = 0; i < M; i++) skip[index(p[i])] = M - i - 1;
}
MisChar(p[], t[])
    M โ† ํŒจํ„ด์˜ ๊ธธ์ด; N โ† ํ…์ŠคํŠธ์˜ ๊ธธ์ด;
    InitSkip(p);
    for (i โ† M-1, j โ† M-1; j โ‰ฅ 0; i โ† i - 1, j โ† j - 1) do
        while (t[i] =ฬธ p[j]) do {
            k โ† skip[index(t[i])];
            if (M-j > k) then i โ† i + M - j;  // M-j = 5-1 = 4
            else i โ† i + k;                   // k = 0
            if (i โ‰ฅ N ) then return N;
            j โ† M - 1;
        }
    return i+1;
end MisChar()

๋ผ๋นˆ-์นดํ”„ ์•Œ๊ณ ๋ฆฌ์ฆ˜

Levenshtein Distance ํŽธ์ง‘๊ฑฐ๋ฆฌ ์•Œ๊ณ ๋ฆฌ์ฆ˜

๋งํฌ ์ฐธ๊ณ 

ํŒจํ„ด ๋งค์นญ ์•Œ๊ณ ๋ฆฌ์ฆ˜

ํŒจํ„ด ๋งค์นญ (pattern matching)

IMG_AB632FFAE60A-1

์ •๊ทœ์‹ (regular expression)

ํŒจํ„ด ๋งค์นญ ์žฅ์น˜ (pattern matching machine)

ํŒจํ„ด ๋งค์นญ ์žฅ์น˜ ๊ตฌํ˜„

๋™์ž‘ ๊ณผ์ •

<img src=โ€https://user-images.githubusercontent.com/73745836/171323280-63ef8485-dd51-483a-8c14-61671bfe78d8.jpegโ€ width = 60%/>

ํŒŒ์ผ ์••์ถ• ์•Œ๊ณ ๋ฆฌ์ฆ˜

ํ—ˆํ”„๋งŒ ์ธ์ฝ”๋”ฉ (Huffman encoding)

with Priority Queue

์•Œ๊ณ ๋ฆฌ์ฆ˜

  1. ๋ชจ๋“  ๋…ธ๋“œ๋ฅผ PQ์— insertํ•œ๋‹ค.
  2. freq๊ฐ€ ๊ฐ€์žฅ ์ž‘์€ ๋‘ ๋…ธ๋“œ๋ฅผ PQ์—์„œ ๋นผ์˜จ๋‹ค.
  3. ๊ทธ ๋‘ ๋…ธ๋“œ๋ฅผ ํ•˜๋‚˜์˜ ๋…ธ๋“œ๋กœ ๋ฌถ๋Š”๋‹ค.
  4. ํ•˜๋‚˜๋กœ ๋ฌถ์€ ๋…ธ๋“œ๋ฅผ ๋‹ค์‹œ PQ์— ๋„ฃ๋Š”๋‹ค.
  5. 2๋ฒˆ ~ 4๋ฒˆ์˜ ๊ณผ์ •์„ ๊ณ„์† ๋ฐ˜๋ณตํ•œ๋‹ค.

์•”ํ˜ธํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜

๊ณต๊ฐœ ํ‚ค ์•”ํ˜ธํ™” ์‹œ์Šคํ…œ

RSA(Rivest-Shamir-Adleman) ์•Œ๊ณ ๋ฆฌ์ฆ˜

>>> p = 3
>>> q = 5
>>> r = p*q
>>> r
15
>>> (p-1)*(q-1)
8
>>>
>>> # let e be 11 (prime greater than both p, q)
>>> e = 11
>>>
>>> # find d
>>> # d*e = 1 modulo (p-1)*(q-1)
>>> # d*11 = 1 modulo 8
>>> # so, d = 3
>>> d = 3
>>>
>>> ## Encryption ###
>>> # let P = 13 (Plain text)
>>> P = 13
>>> # C = P^e modulo r
>>> # C = P**e % r
>>> # C = 13*11 % 15
>>> C = P**e % r
>>> C
7
>>>
>>> ### Decryption ###
>>> # P = C^d modulo r
>>> # P = C**d % r
>>> # P = 7**3 % 15
>>> P = C**d % r
>>> P
13

๋ฐœ์†ก์ธ S๊ฐ€ ์ˆ˜์‹ ์ธ R์—๊ฒŒ ๋ฉ”์‹œ์ง€ P ์ „์†ก

๋ฐœ์†ก์ธ S :

์ˆ˜์‹ ์ธ R :

S๊ฐ€ ๋ฉ”์‹œ์ง€ P๋ฅผ ์ „์†กํ•˜์˜€๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Œ!