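Abstract:
This post introduces several common data compression algorithms and provides detailed Python implementations of each.
I. Lossless compression algorithms
1. Huffman coding
Huffman coding is a frequency-based encoding method. It builds a Huffman tree from the character frequencies and derives a unique, prefix-free code for each character, so that more frequent characters get shorter codes.
Detailed code example (Python):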
import heapq
from collections import Counter


class Node:
    """Huffman tree node. Leaves hold a character; internal nodes hold None."""

    def __init__(self, char, freq):
        self.char = char
        self.freq = freq
        self.left = None
        self.right = None

    def __lt__(self, other):
        # heapq orders nodes by frequency
        return self.freq < other.freq


def build_huffman_tree(frequency):
    """Repeatedly merge the two least frequent nodes until one root remains."""
    heap = [Node(char, freq) for char, freq in frequency.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        left = heapq.heappop(heap)
        right = heapq.heappop(heap)
        merged = Node(None, left.freq + right.freq)
        merged.left = left
        merged.right = right
        heapq.heappush(heap, merged)

    # An empty input has no tree at all
    return heap[0] if heap else None


def generate_codes(node, prefix="", code_dict=None):
    """Walk the tree: left edges append "0", right edges append "1"."""
    if code_dict is None:
        code_dict = {}
    if node is not None:
        if node.char is not None:
            # With a single distinct character the root is a leaf;
            # give it the code "0" instead of an empty string
            code_dict[node.char] = prefix or "0"
        generate_codes(node.left, prefix + "0", code_dict)
        generate_codes(node.right, prefix + "1", code_dict)
    return code_dict


def huffman_encode(s):
    frequency = Counter(s)
    huffman_tree = build_huffman_tree(frequency)
    huffman_codes = generate_codes(huffman_tree)
    encoded_string = ''.join(huffman_codes[char] for char in s)
    return encoded_string, huffman_codes


def huffman_decode(encoded_string, huffman_codes):
    reverse_dict = {code: char for char, code in huffman_codes.items()}
    current_code = ""
    decoded = []
    for bit in encoded_string:
        current_code += bit
        # Codes are prefix-free, so the first match is the only possible match
        if current_code in reverse_dict:
            decoded.append(reverse_dict[current_code])
            current_code = ""
    if current_code:
        raise ValueError("Encoded string ends in the middle of a code")
    return ''.join(decoded)


# Example
s = "this is an example for huffman encoding"
encoded_string, huffman_codes = huffman_encode(s)
print("Encoded string:", encoded_string)
print("Huffman dictionary:", huffman_codes)
decoded_string = huffman_decode(encoded_string, huffman_codes)
print("Decoded string:", decoded_string)
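As a rough check on how close a Huffman code gets to the theoretical limit, the sketch below computes only the code lengths and compares the average number of bits per character with the Shannon entropy of the text. The helper names (`huffman_code_lengths`, `average_bits`, `entropy_bits`) are illustrative, and `average_bits` and `entropy_bits` assume a non-empty input string.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs one bit per occurrence
        return {next(iter(freq)): 1}
    # Heap items: (weight, unique tie-breaker, {symbol: depth so far})
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def average_bits(text):
    """Average Huffman code length per character (text must be non-empty)."""
    lengths = huffman_code_lengths(text)
    freq = Counter(text)
    return sum(freq[sym] * lengths[sym] for sym in freq) / len(text)


def entropy_bits(text):
    """Shannon entropy of the character distribution, in bits per character."""
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())


text = "this is an example for huffman encoding"
print("Entropy (bits/char):     ", round(entropy_bits(text), 3))
print("Huffman avg (bits/char): ", round(average_bits(text), 3))
print("Fixed 8-bit (bits/char): ", 8)
```

For any non-empty text the Huffman average lies between the entropy and the entropy plus one bit, which is why the scheme works best when the character frequencies are strongly skewed.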
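2. Lempel-Ziv-Welch (LZW) coding
LZW is a dictionary-based compression algorithm. It builds its dictionary on the fly while reading the input and replaces repeated substrings with dictionary indices.
Detailed code example (Python):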
def lzw_encode(s):
    # Start with one entry per single character (code points 0-255 only)
    dictionary = {chr(i): i for i in range(256)}
    w = ""
    result = []
    for c in s:
        wc = w + c
        if wc in dictionary:
            # Keep extending the current match
            w = wc
        else:
            # Emit the longest known match and register the new string
            result.append(dictionary[w])
            dictionary[wc] = len(dictionary)
            w = c
    if w:
        result.append(dictionary[w])
    return result


def lzw_decode(encoded):
    if not encoded:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    codes = iter(encoded)  # iterate without mutating the caller's list
    w = dictionary[next(codes)]
    result = [w]
    for k in codes:
        if k in dictionary:
            entry = dictionary[k]
        elif k == len(dictionary):
            # Special case: the code refers to the entry being built right now
            entry = w + w[0]
        else:
            raise ValueError(f"Invalid LZW code: {k}")
        result.append(entry)
        dictionary[len(dictionary)] = w + entry[0]
        w = entry
    return ''.join(result)


# Example
s = "TOBEORNOTTOBEORTOBEORNOT"
encoded = lzw_encode(s)
print("Encoded:", encoded)
decoded = lzw_decode(encoded)
print("Decoded:", decoded)
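The encoder returns a list of Python integers, which is not yet a compressed byte stream. One possible way to serialize it is to pack every code into a fixed number of bits. The sketch below assumes a 12-bit width (enough for 4096 dictionary entries); the function names and the width are illustrative choices, and many real formats use variable-width codes instead.

```python
def pack_codes(codes, width=12):
    """Pack integer codes into bytes, using `width` bits per code (big-endian)."""
    buffer = 0
    for code in codes:
        if not 0 <= code < (1 << width):
            raise ValueError(f"code {code} does not fit in {width} bits")
        buffer = (buffer << width) | code
    total_bits = width * len(codes)
    padding = (-total_bits) % 8  # zero bits appended to reach a whole byte
    buffer <<= padding
    return buffer.to_bytes((total_bits + padding) // 8, "big")


def unpack_codes(data, count, width=12):
    """Recover `count` codes of `width` bits each from packed bytes."""
    padding = len(data) * 8 - width * count
    if padding < 0:
        raise ValueError("not enough data for the requested number of codes")
    buffer = int.from_bytes(data, "big") >> padding
    mask = (1 << width) - 1
    return [(buffer >> (width * (count - 1 - i))) & mask for i in range(count)]


# Codes produced by LZW for "TOBEORNOTTOBEORTOBEORNOT"
codes = [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
packed = pack_codes(codes)
print("Packed size in bytes:", len(packed))
print("Round trip ok:", unpack_codes(packed, len(codes)) == codes)
```

Sixteen 12-bit codes occupy 24 bytes, the same as the 24-character input, so on a string this short LZW gains nothing; the dictionary only pays off on longer inputs with more repetition.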
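3. Run-Length Encoding (RLE)
RLE is a simple lossless compression algorithm that replaces each run of consecutive identical characters with the character followed by the run length.
Detailed code example (Python):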
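DIGITS = "0123456789"


def rle_encode(s):
    if not s:
        return ""
    # The text format "<char><count>" is ambiguous if the data contains digits
    if any(char in DIGITS for char in s):
        raise ValueError("Input must not contain digit characters")

    result = []
    prev_char = s[0]
    count = 1

    for char in s[1:]:
        if char == prev_char:
            count += 1
        else:
            result.append((prev_char, count))
            prev_char = char
            count = 1
    result.append((prev_char, count))

    return ''.join(f"{char}{count}" for char, count in result)


def rle_decode(encoded):
    result = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        i += 1
        # Read all digits of the count so that runs of 10 or more decode correctly
        start = i
        while i < len(encoded) and encoded[i] in DIGITS:
            i += 1
        if start == i:
            raise ValueError(f"Missing run length after {char!r}")
        result.append(char * int(encoded[start:i]))
    return ''.join(result)


# Example
s = "AAAABBBCCDAA"
encoded = rle_encode(s)
print("Encoded:", encoded)
decoded = rle_decode(encoded)
print("Decoded:", decoded)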
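The text-based format above cannot represent data that itself contains digits. A common alternative is to work on bytes and store each run as a (count, value) byte pair. The sketch below is one such variant, with runs capped at 255 so that the count fits in one byte; the function names and the pair layout are assumptions made for illustration, not a standard format.

```python
def rle_encode_bytes(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs, with count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte matches and the count still fits
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode_bytes(encoded: bytes) -> bytes:
    """Invert rle_encode_bytes."""
    if len(encoded) % 2:
        raise ValueError("encoded data must consist of (count, value) pairs")
    out = bytearray()
    for i in range(0, len(encoded), 2):
        count, value = encoded[i], encoded[i + 1]
        out += bytes([value]) * count
    return bytes(out)


sample = b"AAAABBBCCDAA"
packed = rle_encode_bytes(sample)
print("Encoded pairs:", list(packed))
print("Round trip ok:", rle_decode_bytes(packed) == sample)
```

Note that data without long runs grows under this scheme, because every isolated byte costs two bytes in the output.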
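II. Lossy compression and library-based compressors
1. JPEG compression (lossy)
JPEG is a widely used image compression standard and is normally used in its lossy mode. Implementing JPEG from scratch is fairly involved, but Python's Pillow library can read and write JPEG images.
Detailed code example (Python):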
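from PIL import Image

# Re-encode an image as JPEG; lower quality gives a smaller file with more loss
def compress_image(input_path, output_path, quality=85):
    with Image.open(input_path) as image:
        # JPEG cannot store alpha or palette images, so convert those to RGB
        if image.mode in ("RGBA", "LA", "P"):
            image = image.convert("RGB")
        image.save(output_path, "JPEG", quality=quality)

# Example
compress_image("input.jpg", "output.jpg", quality=50)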
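Pillow hides where the loss actually comes from. In JPEG, most of it is introduced by quantization: transform coefficients are divided by a step size and rounded, and the rounding error cannot be undone. The toy sketch below applies that idea directly to a short list of numbers. It is not JPEG (there is no color conversion, DCT, or entropy coding), and `quantize`, `dequantize`, `max_error`, and the sample values are made up for illustration.

```python
def quantize(values, step):
    """Map each value to the index of its nearest multiple of `step`."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Reconstruct approximate values from quantization indices."""
    return [level * step for level in levels]


def max_error(values, step):
    """Largest absolute difference between original and reconstructed values."""
    restored = dequantize(quantize(values, step), step)
    return max(abs(a - b) for a, b in zip(values, restored))


samples = [52, 55, 61, 66, 70, 61, 64, 73]
for step in (1, 4, 16):
    levels = quantize(samples, step)
    print(f"step={step:>2}  distinct levels={len(set(levels))}  "
          f"max error={max_error(samples, step)}")
```

A larger step leaves fewer distinct levels, which an entropy coder can store in fewer bits, at the cost of a larger reconstruction error. The `quality` argument in the Pillow call plays a comparable role: lower quality means coarser quantization.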
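2. DEFLATE (ZIP compression)
DEFLATE is a lossless compression algorithm that combines LZ77 with Huffman coding. It is widely used in the ZIP file format, and Python exposes it through the standard zlib module.
Detailed code example (Python):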
import zlib


def deflate_compress(data):
    # zlib works on bytes, so encode the text first (UTF-8 by default)
    compressed_data = zlib.compress(data.encode())
    return compressed_data


def deflate_decompress(compressed_data):
    decompressed_data = zlib.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for deflate compression"
compressed_data = deflate_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = deflate_decompress(compressed_data)
print("Decompressed data:", decompressed_data)
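One detail worth knowing: `zlib.compress` does not return a bare DEFLATE stream. It wraps the stream in the zlib container, which adds a 2-byte header and a 4-byte Adler-32 checksum. If a raw DEFLATE stream is needed (as stored inside ZIP entries), a negative `wbits` value can be passed to the zlib objects. The sketch below shows that variant; the function names are illustrative.

```python
import zlib


def raw_deflate(data: bytes, level: int = 6) -> bytes:
    """Compress to a raw DEFLATE stream (no zlib header or checksum)."""
    # A negative wbits value tells zlib to omit the container
    compressor = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    return compressor.compress(data) + compressor.flush()


def raw_inflate(blob: bytes) -> bytes:
    """Decompress a raw DEFLATE stream produced by raw_deflate."""
    return zlib.decompress(blob, -zlib.MAX_WBITS)


payload = b"this is an example for deflate compression " * 20
raw = raw_deflate(payload)
wrapped = zlib.compress(payload, 6)
print("Original size:    ", len(payload))
print("Raw DEFLATE size: ", len(raw))
print("zlib-wrapped size:", len(wrapped))
print("Round trip ok:    ", raw_inflate(raw) == payload)
```

The raw stream carries no checksum, so corruption goes undetected unless the surrounding format (ZIP, for example, stores a CRC-32 per entry) verifies the data itself.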
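3. Brotli
Brotli is a modern lossless compression algorithm that combines several techniques (LZ77-style matching, Huffman coding, and context modeling) and typically achieves a better compression ratio than DEFLATE.
Detailed code example (Python):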
import brotli  # third-party package, not part of the standard library


def brotli_compress(data):
    # brotli works on bytes, so encode the text first
    compressed_data = brotli.compress(data.encode())
    return compressed_data


def brotli_decompress(compressed_data):
    decompressed_data = brotli.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for brotli compression"
compressed_data = brotli_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = brotli_decompress(compressed_data)
print("Decompressed data:", decompressed_data)
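4. LZMA
LZMA is a lossless compression algorithm with a high compression ratio, widely used in the 7z file format. Python's standard lzma module implements the algorithm and writes the .xz container by default.
Detailed code example (Python):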
import lzma


def lzma_compress(data):
    # lzma works on bytes, so encode the text first
    compressed_data = lzma.compress(data.encode())
    return compressed_data


def lzma_decompress(compressed_data):
    decompressed_data = lzma.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for lzma compression"
compressed_data = lzma_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = lzma_decompress(compressed_data)
print("Decompressed data:", decompressed_data)
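The one-shot `lzma.compress` call needs the whole input in memory. For larger inputs, the same module offers an incremental interface. The sketch below feeds data chunk by chunk into an `lzma.LZMACompressor`; the function name `compress_chunks` and the preset value are illustrative choices.

```python
import lzma


def compress_chunks(chunks, preset=6):
    """Compress an iterable of bytes chunks into a single .xz stream."""
    compressor = lzma.LZMACompressor(format=lzma.FORMAT_XZ, preset=preset)
    parts = [compressor.compress(chunk) for chunk in chunks]
    # flush() emits whatever the compressor is still buffering
    parts.append(compressor.flush())
    return b"".join(parts)


chunks = [b"first block of data " * 100, b"second block of data " * 100]
blob = compress_chunks(chunks)
print("Input size: ", sum(len(chunk) for chunk in chunks))
print("Output size:", len(blob))
print("Round trip ok:", lzma.decompress(blob) == b"".join(chunks))
```

The result is an .xz stream, which `lzma.decompress` and the `xz` command-line tool can read. It is not a 7z archive, even though both formats are built on LZMA.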
5. Zstandard (Zstd)
Zstd is a modern lossless compression algorithm that combines a high compression ratio with fast decompression.
Detailed code example (Python):
import zstandard  # third-party package, not part of the standard library


def zstd_compress(data):
    # One-shot compression at the default level; the frame records the
    # original size, which the one-shot decompressor relies on
    compressed_data = zstandard.ZstdCompressor().compress(data.encode())
    return compressed_data


def zstd_decompress(compressed_data):
    decompressed_data = zstandard.ZstdDecompressor().decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for zstd compression"
compressed_data = zstd_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = zstd_decompress(compressed_data)
print("Decompressed data:", decompressed_data)
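Conclusion
Each of these algorithms has its own strengths and suits different scenarios. Lossless algorithms such as Huffman coding, LZW, RLE, DEFLATE, Brotli, LZMA, and Zstd are appropriate when the original data must be recovered exactly. Lossy methods such as JPEG are appropriate when some loss of fidelity is acceptable in exchange for a smaller size. Choosing an algorithm that fits the specific requirements can save a significant amount of storage space and transmission bandwidth.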
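When it is unclear which compressor fits a given workload, measuring on representative data is usually more reliable than general rules. The sketch below compares the two compressors that ship with the Python standard library; the function name and the sample payload are illustrative, and third-party codecs can be added to the table in the same way.

```python
import lzma
import zlib


def compare_compressors(data: bytes) -> dict:
    """Return the compressed size in bytes for each standard-library codec."""
    codecs = {
        "zlib (DEFLATE)": (zlib.compress, zlib.decompress),
        "lzma (xz)": (lzma.compress, lzma.decompress),
    }
    sizes = {"original": len(data)}
    for name, (compress, decompress) in codecs.items():
        blob = compress(data)
        # Verify the round trip before trusting the size
        if decompress(blob) != data:
            raise RuntimeError(f"{name} failed to round-trip the data")
        sizes[name] = len(blob)
    return sizes


sample = b"this is an example for compression " * 200
for name, size in compare_compressors(sample).items():
    print(f"{name:>15}: {size} bytes")
```

Results depend heavily on the input: highly repetitive data like this sample shrinks dramatically, while already compressed data (JPEG files, for instance) barely shrinks at all.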