Artificial Intelligence - Android: Converting a Whisper Model to TFLite and Running It Directly in an Android App

Posted by 天津储鑫盛钢材现货供应商 on 2025-4-17 11:57:22

Whisper

Complete steps for converting a Whisper model to TFLite and running it directly in an Android app:
1. Model conversion workflow

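1.1 Install dependencies

bash
# Install PyTorch, ONNX, TensorFlow, and the conversion tools
# (onnx-tf provides the onnx_tf.backend module used in step 1.3)
pip install torch onnx onnx-tf tf2onnx transformers tensorflow

1.2 Export the PyTorch model to ONNX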

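python
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the model and processor (the processor computes log-mel features from audio)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny.en")
model.eval()


class WhisperLogits(torch.nn.Module):
    # Wrapper so the exporter sees two tensor inputs and one tensor output
    def __init__(self, whisper):
        super().__init__()
        self.whisper = whisper

    def forward(self, input_features, decoder_input_ids):
        return self.whisper(
            input_features=input_features,
            decoder_input_ids=decoder_input_ids,
            use_cache=False,
        ).logits


# Example inputs. Whisper consumes a log-mel spectrogram, not raw audio:
# 30 s at 16 kHz (480,000 samples) becomes 80 mel bins x 3000 frames.
dummy_features = torch.randn(1, 80, 3000)
dummy_decoder_ids = torch.tensor([[model.config.decoder_start_token_id]])

# Export to ONNX (one forward pass: logits for the given decoder tokens)
torch.onnx.export(
    WhisperLogits(model),
    (dummy_features, dummy_decoder_ids),
    "whisper_tiny.onnx",
    input_names=["input_features", "decoder_input_ids"],
    output_names=["logits"],
    dynamic_axes={
        "input_features": {0: "batch_size"},
        "decoder_input_ids": {0: "batch_size", 1: "decoder_length"},
    },
)

1.3 Convert the ONNX model to a TensorFlow model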

python
import onnx
from onnx_tf.backend import prepare

# Load the ONNX model and convert it to a TensorFlow SavedModel
onnx_model = onnx.load("whisper_tiny.onnx")
tf_rep = prepare(onnx_model)
tf_rep.export_graph("whisper_tiny_tf")

1.4 Convert to a TFLite model (with quantization)

python
import tensorflow as tf

# Load the TensorFlow SavedModel
converter = tf.lite.TFLiteConverter.from_saved_model("whisper_tiny_tf")

# Enable dynamic range quantization (reduces model size)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Convert to a TFLite model
tflite_model = converter.convert()

# Save the model
with open("whisper_tiny.tflite", "wb") as f:
    f.write(tflite_model)

2. Android integration

2.1 Add the TFLite dependencies

Add the following to app/build.gradle:
gradle
dependencies {
    implementation 'org.tensorflow:tensorflow-lite:2.10.0'
    implementation 'org.tensorflow:tensorflow-lite-support:0.4.4'
}

2.2 Place the model file in assets

app/src/main/assets/
└── whisper_tiny.tflite

3. Java implementation

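3.1 Audio preprocessing utility class

java
import android.content.Context;
import org.tensorflow.lite.support.audio.TensorAudio;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class AudioPreprocessor {
    private static final int WAV_HEADER_BYTES = 44; // assumes a canonical PCM WAV header

    // Load a 16 kHz mono 16-bit PCM WAV file into a TensorAudio (no resampling is done here)
    public static TensorAudio preprocessAudio(Context context, String audioPath) {
        byte[] bytes;
        try (RandomAccessFile file = new RandomAccessFile(audioPath, "r")) {
            bytes = new byte[(int) file.length()];
            file.readFully(bytes);
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read " + audioPath, e);
        }
        // Convert little-endian 16-bit PCM to floats in [-1, 1)
        ByteBuffer pcm = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        float[] samples = new float[(bytes.length - WAV_HEADER_BYTES) / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = pcm.getShort(WAV_HEADER_BYTES + 2 * i) / 32768f;
        }
        TensorAudio tensor = TensorAudio.create(
                TensorAudio.TensorAudioFormat.builder().setChannels(1).setSampleRate(16000).build(),
                samples.length);
        tensor.load(samples);
        return tensor;
    }
}

3.2 TFLite model inference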

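java
import android.content.Context;
import org.tensorflow.lite.Interpreter;
import org.tensorflow.lite.support.audio.TensorAudio;
import org.tensorflow.lite.support.common.FileUtil;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.util.Arrays;

public class WhisperInference {
    private final Interpreter interpreter;

    public WhisperInference(Context context) {
        // Load the TFLite model from assets
        try {
            MappedByteBuffer modelBuffer = FileUtil.loadMappedFile(context, "whisper_tiny.tflite");
            interpreter = new Interpreter(modelBuffer);
        } catch (IOException e) {
            throw new IllegalStateException("Cannot load whisper_tiny.tflite", e);
        }
    }

    public String transcribe(TensorAudio audioTensor) {
        // Input: log-mel spectrogram, 80 mel bins x 3000 frames for 30 s of audio.
        // It must be filled from audioTensor (see 4.2); it is left as zeros here.
        float[][][] input = new float[1][80][3000];

        // Output buffer sized from the model. This sketch assumes one input and one
        // int32 output of token IDs with shape [1, N]. A model whose output is logits
        // (as named in the 1.2 export) instead needs a token-by-token decoding loop.
        int[] shape = interpreter.getOutputTensor(0).shape();
        int[][] output = new int[shape[0]][shape[1]];

        // Run inference
        interpreter.run(input, output);

        // Mapping token IDs to text requires the Whisper vocabulary (not shown)
        return Arrays.toString(output[0]);
    }

    public void close() {
        interpreter.close();
    }
}

3.3 Main activity logic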

java
import android.os.Bundle;
import android.widget.Button;
import androidx.appcompat.app.AppCompatActivity;
import org.tensorflow.lite.support.audio.TensorAudio;

public class MainActivity extends AppCompatActivity {
    private WhisperInference whisperInference;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        // Initialize the model
        whisperInference = new WhisperInference(this);

        // Record-and-transcribe button
        Button btnProcess = findViewById(R.id.btn_process);
        btnProcess.setOnClickListener(v -> {
            // 1. Record audio and save it to a file (recording logic must be implemented)
            String audioPath = "/path/to/audio.wav";

            // 2. Preprocess the audio
            TensorAudio audio = AudioPreprocessor.preprocessAudio(this, audioPath);

            // 3. Run inference (this runs on the UI thread; see 4.3 for a background thread)
            String text = whisperInference.transcribe(audio);

            // 4. Play the result with TTS
            // (refer to the earlier TTS code)
        });
    }

    @Override
    protected void onDestroy() {
        whisperInference.close();
        super.onDestroy();
    }
}

4. Key optimization techniques

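4.1 Model quantization

- Dynamic range quantization: shrinks the model roughly 4x (from 150 MB to ~40 MB)
- Full integer quantization (requires a calibration dataset):
python
# calibration_dataset is a generator function that yields sample model inputs
converter.representative_dataset = calibration_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]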
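
If the fully quantized model is also converted with int8 input and output tensors (an assumption: the snippet above does not set this, and the converter otherwise keeps float inputs and outputs), the Android side has to quantize its float features before inference and dequantize the results afterwards. Below is a minimal sketch of that affine mapping in plain Java. The class and method names are illustrative and not part of TFLite; in a real app the scale and zero point would come from the model's tensor quantization parameters rather than being passed in by hand.

```java
/** Affine int8 quantization helpers (illustrative names, not a TFLite API). */
final class Int8Quantizer {
    private Int8Quantizer() {}

    /** Maps real values to int8: q = round(x / scale) + zeroPoint, clamped to [-128, 127]. */
    static byte[] quantize(float[] values, float scale, int zeroPoint) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            int q = Math.round(values[i] / scale) + zeroPoint;
            out[i] = (byte) Math.max(-128, Math.min(127, q));
        }
        return out;
    }

    /** Maps int8 values back to real values: x = (q - zeroPoint) * scale. */
    static float[] dequantize(byte[] values, float scale, int zeroPoint) {
        float[] out = new float[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (values[i] - zeroPoint) * scale;
        }
        return out;
    }
}
```

For example, with a scale of 0.5 and a zero point of 0, the value 0.5 maps to 1 and the value 100.0 saturates at 127.
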
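4.2 Input/output optimization

- Mel-spectrogram preprocessing: convert audio to a spectrogram on the Android side (replacing the Python code)
- Caching: reuse the Interpreter instance to avoid reloading the model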
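
Two of the simpler steps of Whisper-style preprocessing can be sketched in plain Java: padding or trimming the waveform to 30 s, and log-scaling a mel power spectrogram. The STFT and mel filterbank that sit between these two steps are omitted. The class and method names are illustrative; the constants (a 1e-10 clamp, an 8.0 dynamic-range floor, and the (x + 4) / 4 rescaling) are taken from the reference Whisper preprocessing and should be checked against the model you actually convert.

```java
import java.util.Arrays;

/** Partial Whisper-style feature helpers (illustrative; STFT and mel filterbank not included). */
final class WhisperFeatures {
    static final int SAMPLE_RATE = 16000;
    static final int CHUNK_SAMPLES = 30 * SAMPLE_RATE; // 480,000 samples

    private WhisperFeatures() {}

    /** Zero-pads or truncates the waveform to exactly 30 s. */
    static float[] padOrTrim(float[] samples) {
        // Arrays.copyOf truncates longer input and pads shorter input with zeros
        return Arrays.copyOf(samples, CHUNK_SAMPLES);
    }

    /** Log-scales a mel power spectrogram of shape [nMels][nFrames] in place. */
    static void logCompress(float[][] mel) {
        // Pass 1: log10 with a small clamp to avoid log(0); track the global maximum
        float max = Float.NEGATIVE_INFINITY;
        for (float[] row : mel) {
            for (int t = 0; t < row.length; t++) {
                row[t] = (float) Math.log10(Math.max(row[t], 1e-10f));
                max = Math.max(max, row[t]);
            }
        }
        // Pass 2: limit the dynamic range to 8 below the maximum, then rescale
        float floor = max - 8.0f;
        for (float[] row : mel) {
            for (int t = 0; t < row.length; t++) {
                row[t] = (Math.max(row[t], floor) + 4.0f) / 4.0f;
            }
        }
    }
}
```

The result of logCompress on an 80 x 3000 array is what would be copied into the model's input buffer.
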
4.3 Multithreaded inference

java
// Run inference on a background thread
new Thread(() -> {
    String text = whisperInference.transcribe(audio);
    runOnUiThread(() -> updateUI(text));
}).start();

5. Full comparison of the two approaches

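| Aspect | Python + Chaquopy approach | TFLite approach |
|---|---|---|
| Model size | 200 MB+ (including the Python runtime) | 40 MB (after quantization) |
| Inference speed | Slower (depends on the Python interpreter) | Fast (native C++ execution) |
| Memory usage | High (loads the full PyTorch model) | Low (TFLite optimizations) |
| Development complexity | Low (calls Python directly) | Medium (requires model conversion and the C++ interface) |

6. Notes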

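- Operator compatibility: conversion may hit unsupported operators (such as custom layers); these must be implemented manually, or the model must be simplified.
- Accuracy loss: quantization may lower recognition accuracy; adjust the quantization strategy based on testing.
- Real-time requirements: long audio must be processed in segments to avoid running out of memory.
- Device compatibility: make sure the NDK build includes all target ABIs (armeabi-v7a, arm64-v8a).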
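
For the point about long audio, one way to keep per-inference memory bounded is to walk the recording in fixed 30 s windows and transcribe each window in turn. The sketch below splits a waveform into zero-padded chunks. The class name is illustrative, and it uses non-overlapping windows for simplicity, which can cut a word at a chunk boundary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Splits a waveform into fixed-size chunks (illustrative helper, not a library API). */
final class AudioChunker {
    static final int SAMPLE_RATE = 16000;
    static final int CHUNK_SAMPLES = 30 * SAMPLE_RATE; // 30 s per chunk

    private AudioChunker() {}

    /** Returns consecutive chunks of chunkSamples samples; the last one is zero-padded. */
    static List<float[]> split(float[] samples, int chunkSamples) {
        if (chunkSamples <= 0) {
            throw new IllegalArgumentException("chunkSamples must be positive");
        }
        List<float[]> chunks = new ArrayList<>();
        for (int start = 0; start < samples.length; start += chunkSamples) {
            // copyOfRange pads with zeros when the end index runs past the array
            chunks.add(Arrays.copyOfRange(samples, start, start + chunkSamples));
        }
        return chunks;
    }
}
```

Each chunk returned by AudioChunker.split(samples, AudioChunker.CHUNK_SAMPLES) can then go through preprocessing and inference separately, with the partial transcripts joined in order. Reading the file in pieces instead of loading the whole waveform first would reduce memory further.
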
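With the steps above, the Whisper model can be deployed to Android devices efficiently, improving performance and reducing dependency complexity relative to the Python + Chaquopy approach.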
TTS

TensorFlowTTS: https://github.com/TensorSpeech/TensorFlowTTS
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
Android for TTS: https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/android
