The Fractional Knapsack Problem,Huffman Coding

3-2/알고리즘설계와분석

The Fractional Knapsack Problem,Huffman Coding

코딩하는 랄뚜기 2021. 11. 16. 17:40

A greedy approach

Pi/Wi를 내림차순으로 정렬한다.(Greedy)
knapsack이 찰 때까지 item을 고른다.

Implementation 1

Sort the item -> O(nlogn)
Repeat the choice -> O(n)

O(n+nlogn) = O(n logn)

Implementation 2

Put the items in a heap -> O(n)
Repeat the choice -> O(k logn)

O(n+k logn) = ?

Implementation2 방식으로 하면 k값에 따라 시간복잡도가 달라진다.

확실한 것은 가장 최악이 O(n logn)으로 Implementation1 방식보다 훨씬 효율적이다.

Knapsack Example 2: n=6, W=10

Huffman Coding

Data compression
- Data compression can save storage space for files.
- Huffman coding is just one of many data compression techniques.
Problem
- Given a file, find a binary character code for the characters in the file, which represents the file in the least number of bits.

변환의 목적은 크기를 줄이는 것이므로, character를 변환을 할 때, 반드시 최소의 bit를 사용하도록 한다.

Example
- Original text file : ababcbbbc
- Huffman codes : a = 10, b = 0, c =11
- Compressed file : 1001001100011

여기서 b가 가장 빈도수가 높으므로 1개의 bit만을 할당하였다.

a = 01, b = 0, c = 11로 한다면 01....이런 코드를 보았을 때, 0 (b)로 읽어야 하는지 01(a)로 읽어야하는지 헷갈리게 된다. (Ambiguous하다)

Prefix Codes

코드를 unambiguous하게 만들려면 각각의 character에 할당되는 bit들이 다른 character의 prefix(접두사)가 되면 안된다.

ex) a=01, b = 010 X

위 처럼 binary tree를 이용하면 prefix를 겹치지 않게하여 각각의 character에 bit를 할당 할 수 있다.

bit들이 주어지고 bit를 해석하는 binary tree가 주어졌을 때, 총 cost는 freq(vi)*depth(vi)가 된다.

여기서 freq는 vi가 몇 번 나타나는지를 depth는 vi의 bit개수를 의미한다.

(0)번 처럼 character와 frequcency가 주어진 리스트가 주어지면 frequency를 기준으로 min_heap을 만든다.

그리고 min_heap에서 하나씩 값을 빼면서 (1)~(5)번 과정처럼 tree를 만들어 준다.

min_heap을 만드는데 시간복잡도는 O(n) 그리고 n개를 뽑아내는데 O(nlogn)이므로 총 비용은 O(n+nlogn) = O(nlogn)이 된다.

'3-2 > 알고리즘설계와분석' 카테고리의 다른 글

Intractable Problem (0)	2021.11.12
Knapsack Problem (0)	2021.11.05
The Gapped Alignment Problem, Longest Increasing Subsequence (0)	2021.11.05
The Checkerboard Problem, Longest Common Subsequence (0)	2021.11.05
The Manhattan Tourist Problem, Chained Matrix Multiplication (0)	2021.10.21

현재글The Fractional Knapsack Problem,Huffman Coding

누적 합, 시뮬레이션, simulation, math, Prefix sum, 넓이 우선 탐색, 우선순위 큐, SUAPC, sp, 수학, concurrent programming, 시스템프로그래밍, BFS, Codeforces, SYSTEM PROGRAMMING, 컴퓨터아키텍쳐, 구현, Standard I/O, 데이터베이스시스템, 백준,

Today :
Yesterday :

일	월	화	수	목	금	토
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28

랄뚜기의 코딩일기