Re-enable KV cache by default, normalize digits and unsafe characters, disable per-chunk split_text, and reload ChatTTS after CUDA errors.
Co-authored-by: Cursor <cursoragent@cursor.com>
Release GPU memory before TTS/ASR switches, lower TTS token limits, and set PYTORCH_CUDA_ALLOC_CONF in PM2.
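
A minimal sketch of the release step before a TTS/ASR switch, assuming the loaded models are kept in a plain dict. The names `release_model` and `models` are hypothetical and not taken from this repository. The torch import is guarded so the snippet also runs on a machine without PyTorch. PYTORCH_CUDA_ALLOC_CONF is not touched here because it is set in the process environment (PM2 in this commit), not in code.

```python
import gc

try:
    import torch  # optional: only used to flush the CUDA caching allocator
except ImportError:
    torch = None


def release_model(models: dict, name: str) -> bool:
    """Drop the reference to models[name] and return cached GPU memory.

    Returns True if a model was registered under `name`, False otherwise.
    `models` and `name` are illustrative; the real service may track its
    TTS/ASR instances differently.
    """
    model = models.pop(name, None)
    if model is None:
        return False

    # Remove the last local reference so the weights can be collected.
    del model
    gc.collect()

    # empty_cache() only returns blocks that are no longer referenced,
    # so it has to run after the reference is dropped and collected.
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()
    return True
```

Calling `release_model(models, "tts")` before loading the ASR model (and the reverse on the way back) keeps only one model resident at a time, at the cost of a reload on each switch.
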
Co-authored-by: Cursor <cursoragent@cursor.com>