6 Commits

Author SHA1 Message Date
dekun 8be34a2fd5 Fix ChatTTS CUDA device-side assert with text sanitize and GPU recovery.
Re-enable KV cache by default, normalize digits and unsafe chars, disable per-chunk split_text, and reload ChatTTS after CUDA errors.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:13:57 +08:00
dekun 1779449bba Fix ChatTTS recursion depth exceeded on empty generation.
Disable ensure_non_empty retries, set min_new_token, always refine text, and use per-chunk manual_seed.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:10:26 +08:00
dekun 0cce6cda7c Fix CUDA OOM by mutually unloading Whisper and ChatTTS on 8GB GPU.
Release GPU memory before TTS/ASR switches, lower TTS token limits, and set PYTORCH_CUDA_ALLOC_CONF in PM2.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:03:37 +08:00
dekun 0f5277c22e Add Whisper offline loading for air-gapped servers.
Pre-download via HF mirror scripts so inner-network deploys avoid Hub Network is unreachable errors.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 16:11:57 +08:00
dekun aacdffac77 Fix ChatTTS load: pre-download via HF mirror, avoid GitHub timeout.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 15:16:27 +08:00
dekun aea39a00ae Support .env for server-local Ollama config to avoid git pull conflicts.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:53:47 +08:00