Youth Programming and Mathematics 02-016 Python Data Structures and Algorithms, Topic 30: Data Compression Algorithms ...

宝塔山 · 2025-4-20 23:42:41

Topic summary:
This lesson introduces several common data compression algorithms and provides detailed Python implementations of each.

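I. Lossless Compression Algorithms

1. Huffman Coding

Huffman coding is an encoding method based on character frequency. It builds a Huffman tree from the frequencies and uses it to generate a unique prefix-free code for each character, so that more frequent characters receive shorter codes.
Detailed code example (Python):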

import heapq
from collections import Counter


class Node:
    """A Huffman tree node. Leaves carry a character; internal nodes carry None."""

    def __init__(self, char, freq):
        self.char = char
        self.freq = freq
        self.left = None
        self.right = None

    def __lt__(self, other):
        # heapq orders nodes by frequency.
        return self.freq < other.freq


def build_huffman_tree(frequency):
    heap = [Node(char, freq) for char, freq in frequency.items()]
    if not heap:
        return None  # empty input: there is no tree to build
    heapq.heapify(heap)
    # Repeatedly merge the two least frequent nodes until one root remains.
    while len(heap) > 1:
        left = heapq.heappop(heap)
        right = heapq.heappop(heap)
        merged = Node(None, left.freq + right.freq)
        merged.left = left
        merged.right = right
        heapq.heappush(heap, merged)
    return heap[0]


def generate_codes(node, prefix="", code_dict=None):
    if code_dict is None:
        code_dict = {}
    if node is not None:
        if node.char is not None:
            # A lone root leaf would otherwise get an empty code, so use "0".
            code_dict[node.char] = prefix or "0"
        generate_codes(node.left, prefix + "0", code_dict)
        generate_codes(node.right, prefix + "1", code_dict)
    return code_dict


def huffman_encode(s):
    frequency = Counter(s)
    huffman_tree = build_huffman_tree(frequency)
    huffman_codes = generate_codes(huffman_tree)
    encoded_string = ''.join(huffman_codes[char] for char in s)
    return encoded_string, huffman_codes


def huffman_decode(encoded_string, huffman_codes):
    reverse_dict = {code: char for char, code in huffman_codes.items()}
    current_code = ""
    decoded_chars = []
    # Codes are prefix-free, so the first match is always the right one.
    for bit in encoded_string:
        current_code += bit
        if current_code in reverse_dict:
            decoded_chars.append(reverse_dict[current_code])
            current_code = ""
    return ''.join(decoded_chars)


# Example
s = "this is an example for huffman encoding"
encoded_string, huffman_codes = huffman_encode(s)
print("Encoded string:", encoded_string)
print("Huffman dictionary:", huffman_codes)
decoded_string = huffman_decode(encoded_string, huffman_codes)
print("Decoded string:", decoded_string)
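
To see how much a frequency-based code actually saves, the sketch below computes only the Huffman code lengths and compares the encoded size with a fixed 8-bit encoding and with the Shannon entropy of the text. The helper names huffman_code_lengths and shannon_entropy are illustrative and not part of the lesson's code; the 8-bit baseline is an assumption about how the text would otherwise be stored.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(text):
    """Return {char: code length in bits} for a Huffman code over text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap items: (weight, tie_breaker, {char: depth so far}).
    # The unique tie_breaker keeps tuples comparable without comparing dicts.
    heap = [(w, i, {ch: 0}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf in them one level deeper.
        merged = {ch: depth + 1 for ch, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def shannon_entropy(text):
    """Entropy in bits per symbol of the empirical character distribution."""
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())


text = "this is an example for huffman encoding"
lengths = huffman_code_lengths(text)
freq = Counter(text)
total_bits = sum(freq[ch] * lengths[ch] for ch in freq)
avg_bits = total_bits / len(text)
print(f"fixed 8-bit size : {8 * len(text)} bits")
print(f"Huffman size     : {total_bits} bits")
print(f"average code len : {avg_bits:.3f} bits/symbol")
print(f"entropy bound    : {shannon_entropy(text):.3f} bits/symbol")
```

The average code length should land between the entropy and the entropy plus one bit. The figures ignore the cost of storing the code table, which a real file format would also have to carry.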

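2. Lempel-Ziv-Welch (LZW) Coding

LZW is a dictionary-based compression algorithm. It builds its dictionary dynamically while reading the input and replaces repeated strings with the dictionary codes assigned to them.
Detailed code example (Python):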

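def lzw_encode(s):
    # Start with one entry per single-byte character (code points 0-255),
    # so the input must not contain characters outside that range.
    dictionary = {chr(i): i for i in range(256)}
    w = ""
    result = []
    for c in s:
        wc = w + c
        if wc in dictionary:
            w = wc
        else:
            result.append(dictionary[w])
            dictionary[wc] = len(dictionary)  # register the new phrase
            w = c
    if w:
        result.append(dictionary[w])
    return result


def lzw_decode(encoded):
    if not encoded:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    w = chr(encoded[0])  # read the first code without mutating the caller's list
    result = [w]
    for k in encoded[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == len(dictionary):
            # The code refers to the phrase that is about to be added.
            entry = w + w[0]
        else:
            raise ValueError(f"Invalid LZW code: {k}")
        result.append(entry)
        dictionary[len(dictionary)] = w + entry[0]
        w = entry
    return ''.join(result)


# Example
s = "TOBEORNOTTOBEORTOBEORNOT"
encoded = lzw_encode(s)
print("Encoded:", encoded)
decoded = lzw_decode(encoded)
print("Decoded:", decoded)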
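
The encoder above returns a list of Python integers, which is not yet a compact byte stream. One simple way to store the codes, sketched here under the assumption of a fixed 12-bit code width (a choice made for this illustration, not something the lesson prescribes), is to pack them into bytes. The helpers lzw_codes, pack_codes and unpack_codes are hypothetical names.

```python
def lzw_codes(text):
    """Plain LZW over single-byte characters; returns the list of codes."""
    table = {chr(i): i for i in range(256)}
    w, codes = "", []
    for c in text:
        if w + c in table:
            w += c
        else:
            codes.append(table[w])
            table[w + c] = len(table)
            w = c
    if w:
        codes.append(table[w])
    return codes


def pack_codes(codes, width=12):
    """Pack integer codes into bytes using a fixed bit width per code."""
    if any(code >= 1 << width for code in codes):
        raise ValueError("a code does not fit in the chosen bit width")
    bits = "".join(format(code, f"0{width}b") for code in codes)
    bits += "0" * (-len(bits) % 8)  # pad to a whole number of bytes
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def unpack_codes(data, count, width=12):
    """Inverse of pack_codes; count is the number of codes that were packed."""
    bits = "".join(format(byte, "08b") for byte in data)
    return [int(bits[i * width:(i + 1) * width], 2) for i in range(count)]


text = "TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_codes(text)
packed = pack_codes(codes)
print(len(text), "input bytes ->", len(codes), "codes ->", len(packed), "packed bytes")
```

On an input this short the packed size may be no smaller than the original, because every 12-bit code costs more than one 8-bit character. LZW pays off on longer inputs, where each code stands for a longer repeated phrase.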

3. Run-Length Encoding (RLE)

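RLE is a simple lossless compression algorithm that replaces each run of consecutive identical characters with the character followed by the number of repetitions.
Detailed code example (Python):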

def rle_encode(s):
    if not s:
        return ""
    result = []
    prev_char = s[0]
    count = 1
    for char in s[1:]:
        if char == prev_char:
            count += 1
        else:
            result.append((prev_char, count))
            prev_char = char
            count = 1
    result.append((prev_char, count))
    return ''.join(f"{char}{count}" for char, count in result)


def rle_decode(encoded):
    # Assumes the original text contains no digit characters; otherwise the
    # "character + count" pairs cannot be told apart.
    result = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        i += 1
        # Read every digit of the count so that runs of 10 or more decode correctly.
        start = i
        while i < len(encoded) and encoded[i].isdigit():
            i += 1
        result.append(char * int(encoded[start:i]))
    return ''.join(result)


# Example
s = "AAAABBBCCDAA"
encoded = rle_encode(s)
print("Encoded:", encoded)
decoded = rle_decode(encoded)
print("Decoded:", decoded)
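
A string-based RLE format becomes ambiguous once the input itself contains digits. As a possible alternative, the sketch below keeps the runs as (character, count) pairs using itertools.groupby, so no parsing is needed. The names rle_pairs and rle_expand are illustrative only.

```python
from itertools import groupby


def rle_pairs(text):
    """Encode text as a list of (char, run_length) pairs."""
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(text)]


def rle_expand(pairs):
    """Rebuild the original text from (char, run_length) pairs."""
    return "".join(ch * n for ch, n in pairs)


for sample in ("AAAABBBCCDAA", "1112222", "A" * 12 + "B"):
    pairs = rle_pairs(sample)
    print(repr(sample), "->", pairs)
    assert rle_expand(pairs) == sample
```

The pair representation trades compactness for unambiguity; a real format would still need to decide how to serialize the pairs.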

II. Lossy Compression and General-Purpose Library Algorithms

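1. JPEG Compression (Lossy)

JPEG is a widely used image compression standard and is normally lossy. Implementing JPEG from scratch is fairly involved, but Python's Pillow library can be used to read and write JPEG images.
Detailed code example (Python):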

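from PIL import Image

# Compress an image by re-saving it as JPEG at the given quality.
def compress_image(input_path, output_path, quality=85):
    with Image.open(input_path) as image:
        # JPEG has no alpha channel, so convert modes such as RGBA or P first.
        if image.mode not in ("RGB", "L", "CMYK"):
            image = image.convert("RGB")
        image.save(output_path, "JPEG", quality=quality)

# Example (expects input.jpg to exist in the working directory)
compress_image("input.jpg", "output.jpg", quality=50)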
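
Where the loss in JPEG comes from can be illustrated without any image library. The sketch below is a deliberately simplified 1-D illustration, not the JPEG pipeline itself (which works on 8x8 blocks with a 2-D DCT, quantization tables and entropy coding): it transforms one row of eight hypothetical pixel values with a DCT, rounds the coefficients to multiples of a step size, and transforms back. The step value of 16 is an arbitrary assumption.

```python
import math


def dct(samples):
    """Orthonormal DCT-II of a sequence of samples."""
    n_samples = len(samples)
    out = []
    for k in range(n_samples):
        scale = math.sqrt((1 if k == 0 else 2) / n_samples)
        out.append(scale * sum(
            x * math.cos(math.pi * (n + 0.5) * k / n_samples)
            for n, x in enumerate(samples)))
    return out


def idct(coeffs):
    """Inverse of dct (orthonormal DCT-III)."""
    n_samples = len(coeffs)
    out = []
    for n in range(n_samples):
        out.append(sum(
            math.sqrt((1 if k == 0 else 2) / n_samples) * c
            * math.cos(math.pi * (n + 0.5) * k / n_samples)
            for k, c in enumerate(coeffs)))
    return out


def quantize(coeffs, step):
    """Round each coefficient to the nearest multiple of step (the lossy part)."""
    return [round(c / step) * step for c in coeffs]


row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel values
exact = idct(dct(row))
lossy = idct(quantize(dct(row), step=16))
max_error = max(abs(a - b) for a, b in zip(row, lossy))
print("exact :", [round(v) for v in exact])
print("lossy :", [round(v) for v in lossy])
print("max error:", round(max_error, 3))
```

Without quantization the transform is reversible; with it, small coefficient details are discarded and cannot be recovered. A larger step discards more, which is roughly the role a lower quality setting plays.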

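2. DEFLATE (ZIP Compression)

DEFLATE is a lossless compression algorithm that combines the LZ77 algorithm with Huffman coding and is widely used in the ZIP file format. Python's standard zlib module provides it, wrapped in the zlib container format.
Detailed code example (Python):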

import zlib


def deflate_compress(data):
    # zlib works on bytes, so encode the text first (UTF-8 by default).
    compressed_data = zlib.compress(data.encode())
    return compressed_data


def deflate_decompress(compressed_data):
    decompressed_data = zlib.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for deflate compression"
compressed_data = deflate_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = deflate_decompress(compressed_data)
print("Decompressed data:", decompressed_data)
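
The bytes returned by zlib.compress are a zlib stream, that is, DEFLATE data with a small header and a checksum around it. As a rough sketch of the difference, the following compares that stream with a raw DEFLATE stream obtained by passing a negative wbits value. The repeated sample text and the compression level 9 are arbitrary choices for the illustration.

```python
import zlib

data = b"this is an example for deflate compression " * 20

# zlib.compress emits a zlib stream: 2-byte header + DEFLATE data + Adler-32.
wrapped = zlib.compress(data, 9)

# A negative wbits value asks for a raw DEFLATE stream with no header or checksum.
compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = compressor.compress(data) + compressor.flush()

print("original:", len(data), "bytes")
print("zlib    :", len(wrapped), "bytes")
print("raw     :", len(raw), "bytes")

# Each stream must be decompressed with the matching wbits setting.
assert zlib.decompress(wrapped) == data
assert zlib.decompress(raw, -15) == data
```

ZIP archives store raw DEFLATE data inside their own container, which is why the raw form matters when working with that format directly.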

3. Brotli

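Brotli is a modern lossless compression algorithm that combines several compression techniques and usually achieves a better compression ratio than DEFLATE. The example uses the third-party brotli package.
Detailed code example (Python):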

import brotli  # third-party package, not part of the standard library


def brotli_compress(data):
    # brotli works on bytes, so encode the text first.
    compressed_data = brotli.compress(data.encode())
    return compressed_data


def brotli_decompress(compressed_data):
    decompressed_data = brotli.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for brotli compression"
compressed_data = brotli_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = brotli_decompress(compressed_data)
print("Decompressed data:", decompressed_data)

4. LZMA

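LZMA is a lossless algorithm with a high compression ratio, widely used in the 7z file format. Python's standard lzma module writes the .xz container by default.
Detailed code example (Python):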

import lzma


def lzma_compress(data):
    # lzma works on bytes, so encode the text first.
    compressed_data = lzma.compress(data.encode())
    return compressed_data


def lzma_decompress(compressed_data):
    decompressed_data = lzma.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for lzma compression"
compressed_data = lzma_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = lzma_decompress(compressed_data)
print("Decompressed data:", decompressed_data)
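
The standard lzma module does not read or write 7z archives; it produces LZMA data in either the .xz container (the default) or the legacy .lzma container. The sketch below shows both, using a repeated sample string chosen only for illustration.

```python
import lzma

data = b"this is an example for lzma compression " * 50

xz_stream = lzma.compress(data)  # default container: FORMAT_XZ
legacy = lzma.compress(data, format=lzma.FORMAT_ALONE)  # legacy .lzma container

print("original    :", len(data), "bytes")
print("xz container:", len(xz_stream), "bytes")
print(".lzma       :", len(legacy), "bytes")
print("xz magic    :", xz_stream[:6])

# lzma.decompress auto-detects both containers by default (FORMAT_AUTO).
assert lzma.decompress(xz_stream) == data
assert lzma.decompress(legacy) == data
```

Working with actual 7z archives would need a separate library, which is outside the scope of this example.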

5. Zstandard (Zstd)

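Zstd is a modern lossless compression algorithm that combines a high compression ratio with fast decompression. The example uses the third-party zstandard package.
Detailed code example (Python):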

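import zstandard  # third-party package, not part of the standard library


def zstd_compress(data):
    # The compressor object API works across zstandard releases.
    compressed_data = zstandard.ZstdCompressor().compress(data.encode())
    return compressed_data


def zstd_decompress(compressed_data):
    # One-shot decompression relies on the content size stored in the frame,
    # which ZstdCompressor.compress() writes by default.
    decompressed_data = zstandard.ZstdDecompressor().decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for zstd compression"
compressed_data = zstd_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = zstd_decompress(compressed_data)
print("Decompressed data:", decompressed_data)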

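Summary

Each of these compression algorithms has its own strengths and suits different scenarios. Lossless algorithms such as Huffman coding, LZW, RLE, DEFLATE, Brotli, LZMA and Zstd fit cases where the original data must be recovered exactly, while lossy algorithms such as JPEG fit cases where some loss of quality is acceptable. Choosing an algorithm that matches the actual requirements can save a significant amount of storage space and transmission bandwidth.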
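
How much any of these algorithms saves depends heavily on the data. As a small experiment using only the standard library, the sketch below compresses a highly repetitive text and a block of random bytes with zlib and lzma. The sample inputs are assumptions made for the illustration, with os.urandom standing in for data that is already compressed or encrypted.

```python
import lzma
import os
import zlib

samples = {
    "repetitive text": b"to be or not to be, that is the question. " * 100,
    "random bytes": os.urandom(4096),
}

ratios = {}
for name, data in samples.items():
    for label, compress in (("zlib", zlib.compress), ("lzma", lzma.compress)):
        # Ratio of compressed size to original size; lower is better.
        ratios[(name, label)] = len(compress(data)) / len(data)
        print(f"{name:16s} {label}: {ratios[(name, label)]:.2%} of original size")
```

Repetitive input shrinks to a small fraction of its size, while random input typically comes out slightly larger than it went in because of container overhead.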
