Fix CUDA OOM by mutually unloading Whisper and ChatTTS on 8GB GPU.
Release GPU memory before TTS/ASR switches, lower TTS token limits, and set PYTORCH_CUDA_ALLOC_CONF in PM2.

Co-authored-by: Cursor <cursoragent@cursor.com>
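The mutual unloading described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class `ExclusiveModelSlot`, its method names, and the loader callables are all hypothetical. It assumes the models are ordinary Python objects whose GPU memory becomes reclaimable once the last reference is dropped.

```python
import gc
from typing import Any, Callable, Dict, Optional

try:
    import torch  # optional: the sketch still runs without PyTorch installed
except ImportError:
    torch = None


class ExclusiveModelSlot:
    """Keep at most one named model resident at a time (hypothetical helper)."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]]) -> None:
        self._loaders = loaders
        self._name: Optional[str] = None
        self._model: Any = None

    @property
    def resident(self) -> Optional[str]:
        """Name of the model currently held, or None."""
        return self._name

    def acquire(self, name: str) -> Any:
        """Return the requested model, unloading the other one first."""
        if name not in self._loaders:
            raise KeyError(f"no loader registered for {name!r}")
        if self._name == name:
            return self._model
        self.release()
        self._model = self._loaders[name]()
        self._name = name
        return self._model

    def release(self) -> None:
        """Drop the held model and ask PyTorch to return cached GPU memory."""
        self._model = None
        self._name = None
        gc.collect()
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()
```

A caller would register one loader for Whisper and one for ChatTTS, then call `acquire("whisper")` before recognition and `acquire("chattts")` before synthesis. Note that memory is only freed if no other code still holds a reference to the previous model.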
@@ -733,10 +733,26 @@ nvidia-smi
 fuser -v /dev/nvidia*
 ```
 
-Whisper and ChatTTS do not both stay resident at their maximum VRAM footprint, but peak usage is high when a model is first loaded. Recommendations:
+Whisper and ChatTTS **cannot both stay resident** in 8 GB of VRAM (this causes CUDA OOM). The application now unloads them mutually and automatically:
 
-- Lock the power limit at 120 W
-- `max_memory_restart: "6G"` is already set in the PM2 configuration
+- ChatTTS is unloaded before recognition
+- Whisper is unloaded before synthesis / voice locking
 
+If OOM still occurs:
+
+```bash
+pm2 restart trading_studio
+nvidia-smi   # confirm no other process is using the GPU
+```
+
+Lower the synthesis peak in `.env`:
+
+```ini
+TTS_MAX_CHARS_PER_CHUNK=150
+TTS_MAX_NEW_TOKEN=768
+```
+
+PM2 is configured with `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` to mitigate fragmentation. Locking the power limit at 120 W is recommended.
+
 ### 10.3 Whisper model load failure
 
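To illustrate how a per-chunk character limit such as `TTS_MAX_CHARS_PER_CHUNK` might bound the synthesis peak, here is a minimal sketch. It is an assumption about the general technique, not the application's implementation: the function names `chunk_limit_from_env` and `split_for_tts` are hypothetical, and the sentence-boundary rule is deliberately simple.

```python
import os
import re
from typing import List


def chunk_limit_from_env(default: int = 150) -> int:
    """Read TTS_MAX_CHARS_PER_CHUNK from the environment, falling back to a default."""
    return int(os.environ.get("TTS_MAX_CHARS_PER_CHUNK", str(default)))


def split_for_tts(text: str, max_chars: int) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # Zero-width split after ASCII or CJK sentence-ending punctuation;
    # whitespace stays attached to the following piece.
    pieces = re.split(r"(?<=[.!?。!?])", text.strip())
    chunks: List[str] = []
    current = ""
    for piece in pieces:
        # A single piece longer than the limit is cut at the limit.
        while len(piece) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece[:max_chars])
            piece = piece[max_chars:]
        # Start a new chunk when adding this piece would exceed the limit.
        if len(current) + len(piece) > max_chars:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return [c.strip() for c in chunks if c.strip()]
```

Each chunk would then be synthesized separately, so the longest single generation is bounded by the configured limit rather than by the length of the whole input.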