Fix CUDA OOM by mutually unloading Whisper and ChatTTS on 8GB GPU.
Release GPU memory before TTS/ASR switches, lower TTS token limits, and set PYTORCH_CUDA_ALLOC_CONF in PM2.

Co-authored-by: Cursor <cursoragent@cursor.com>
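The mutual-unload idea in the message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `ExclusiveModelSlot` and the loader callables are hypothetical names, and it assumes PyTorch is the backend (the cache flush is skipped when `torch` is not installed).

```python
import gc
from typing import Any, Callable, Dict, Optional


class ExclusiveModelSlot:
    """Hypothetical helper: keep at most one model resident at a time."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]]) -> None:
        self._loaders = loaders
        self._name: Optional[str] = None
        self._model: Any = None

    @property
    def active(self) -> Optional[str]:
        """Name of the model currently held, or None."""
        return self._name

    def acquire(self, name: str) -> Any:
        """Return the requested model, unloading the other one first."""
        if name not in self._loaders:
            raise KeyError(f"unknown model: {name}")
        if self._name != name:
            self.release()
            self._model = self._loaders[name]()
            self._name = name
        return self._model

    def release(self) -> None:
        """Drop the held model and hand cached GPU blocks back to the driver."""
        self._model = None
        self._name = None
        gc.collect()  # collect the dropped model before flushing the cache
        try:
            import torch  # assumption: PyTorch backend
        except ImportError:
            return
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
```

A caller would register one loader per model, for example `ExclusiveModelSlot({"asr": load_whisper, "tts": load_chattts})` with hypothetical loader functions, and call `acquire("tts")` before synthesis and `acquire("asr")` before transcription. Memory is only freed if no other reference to the old model survives elsewhere in the process.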
@@ -13,3 +13,7 @@ OLLAMA_PORT=11434
# WHISPER_MODEL_DIR=/opt/Trading_Studio/models/whisper
# WHISPER_MODEL_SIZE=small
# HF_ENDPOINT=https://hf-mirror.com
# Lower these if an 8GB GPU runs out of memory (synthesis is split into chunks)
# TTS_MAX_CHARS_PER_CHUNK=150
# TTS_MAX_NEW_TOKEN=768
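
How these two settings might be consumed is sketched below. The sketch assumes the commented values above (150 and 768) double as defaults, which the diff does not confirm; `read_int_env` and `split_for_tts` are hypothetical helpers, not functions from this repository.

```python
import os
import re
from typing import List


def read_int_env(name: str, default: int) -> int:
    """Read a positive integer from the environment, else use the default."""
    raw = os.environ.get(name, "").strip()
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def split_for_tts(text: str, max_chars: int) -> List[str]:
    """Pack whole sentences into chunks of at most max_chars characters."""
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    # Split after sentence-ending punctuation (Latin and CJK), keeping it.
    parts = re.split(r"(?<=[.!?。!?;;])", text)
    sentences = [p.strip() for p in parts if p.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


# Assumed defaults, taken from the commented example values.
MAX_CHARS = read_int_env("TTS_MAX_CHARS_PER_CHUNK", 150)
MAX_NEW_TOKEN = read_int_env("TTS_MAX_NEW_TOKEN", 768)
```

Each chunk would then be synthesized separately, with `MAX_NEW_TOKEN` passed to whatever generation-length parameter the TTS call exposes, so peak memory depends on chunk size and not on total text length.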