Replace Group with Column, use CSS grid for equal card sizes, and override Gradio's default primary borders for a cleaner 1800px layout.
Co-authored-by: Cursor <cursoragent@cursor.com>
Set container max-width to 1800px, redesign one-click tab with hero bar, card inputs, toolbar row, and three-column output grid.
Co-authored-by: Cursor <cursoragent@cursor.com>
Wrap workflow in one panel with side-by-side input columns, inline options, embedded history accordion, and fixed tab bar distribution.
Co-authored-by: Cursor <cursoragent@cursor.com>
Evenly distribute main tabs, make upload and transcript inputs equal height, and fold voice selection into a collapsed accordion.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add a save button that exports WAV at the current slider speed using Web Audio, matching what the user hears during preview.
Co-authored-by: Cursor <cursoragent@cursor.com>
Replace Gradio Audio output with an HTML player that supports play/pause and a 0.5x-2.0x speed slider, plus direct /outputs WAV download.
Co-authored-by: Cursor <cursoragent@cursor.com>
Use four tabs (one-click, pipeline, voice lock, config) as the main navigation and relocate the project intro, install button, and Ollama and speaker status into the config tab to save vertical space.
Co-authored-by: Cursor <cursoragent@cursor.com>
Use the same manual_seed for every chunk and normalize per-segment peaks before concat so long voiceovers no longer sound like different speakers between segments.
Co-authored-by: Cursor <cursoragent@cursor.com>
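The per-segment peak normalization named in the commit above can be sketched as follows. This is a minimal illustration, not the project's code: the helper names `normalize_peak` and `concat_normalized` and the `target_peak` value are hypothetical, segments are assumed to be plain lists of float samples, and the fixed-seed part of the change is not shown.

```python
def normalize_peak(samples, target_peak=0.9):
    """Scale one segment so its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent or empty segment: there is nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


def concat_normalized(segments, target_peak=0.9):
    """Normalize each segment independently, then join them in order."""
    out = []
    for seg in segments:
        out.extend(normalize_peak(seg, target_peak))
    return out
```

Normalizing each segment before joining keeps a quiet chunk from sounding like a different speaker next to a loud one, which is the symptom the commit describes.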
Only pass show_download_button and show_share_button when the installed Gradio Audio component supports them, fixing PM2 startup TypeError.
Co-authored-by: Cursor <cursoragent@cursor.com>
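One way to pass optional keyword arguments only when a constructor accepts them is to inspect its signature. The sketch below assumes that approach; the commit does not say how the check is implemented. `supported_kwargs`, `OldAudio`, and `NewAudio` are hypothetical names, and the two classes are stand-ins rather than real Gradio components.

```python
import inspect


def supported_kwargs(component_cls, **candidates):
    """Return only the keyword arguments that component_cls.__init__ accepts."""
    params = inspect.signature(component_cls.__init__).parameters
    # A **kwargs catch-all means the constructor accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(candidates)
    return {k: v for k, v in candidates.items() if k in params}


class OldAudio:
    """Hypothetical stand-in for a component version without the newer flags."""

    def __init__(self, label=None):
        self.label = label


class NewAudio:
    """Hypothetical stand-in for a component version that has the flags."""

    def __init__(self, label=None, show_download_button=True, show_share_button=False):
        self.label = label
        self.show_download_button = show_download_button
        self.show_share_button = show_share_button


wanted = {"label": "Result", "show_download_button": True, "show_share_button": False}
old = OldAudio(**supported_kwargs(OldAudio, **wanted))  # no TypeError
new = NewAudio(**supported_kwargs(NewAudio, **wanted))
```

Filtering by signature avoids hard-coding version numbers, so the same call site keeps working when the installed library is upgraded or downgraded.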
Keep synthesized WAV files browsable with playback and download, default to the preset steady male voice, show the one-click pipeline as the first tab, and reduce post-synthesis UI flicker.
Co-authored-by: Cursor <cursoragent@cursor.com>
Enable Gradio queue, immediate pending feedback, segment progress, and gr.update for Audio so long syntheses show logs and playback correctly.
Co-authored-by: Cursor <cursoragent@cursor.com>
Refresh status only every 60 s, shorten the synthesis log, update the log before the audio, and isolate repaint regions.
Co-authored-by: Cursor <cursoragent@cursor.com>
Re-enable KV cache by default, normalize digits and unsafe characters, disable per-chunk split_text, and reload ChatTTS after CUDA errors.
Co-authored-by: Cursor <cursoragent@cursor.com>
Disable ensure_non_empty retries, set min_new_token, always refine text, and use per-chunk manual_seed.
Co-authored-by: Cursor <cursoragent@cursor.com>
Release GPU memory before TTS/ASR switches, lower TTS token limits, and set PYTORCH_CUDA_ALLOC_CONF in PM2.
Co-authored-by: Cursor <cursoragent@cursor.com>
Use spk_smp plus txt_smp for voice clone instead of mis-encoding into spk_emb; migrate legacy speaker_emb.pt and improve error hints.
Co-authored-by: Cursor <cursoragent@cursor.com>
Strip Markdown and stage directions before ChatTTS synthesis with chunked long scripts; document model pre-download, server-update, and microphone HTTPS notes.
Co-authored-by: Cursor <cursoragent@cursor.com>
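A rough sketch of the cleanup and chunking step named in the commit above, using only the standard library. The function names, the `max_chars` limit, and the rule that parenthesized or bracketed text counts as a stage direction are all assumptions for illustration; the actual rules in the project may differ.

```python
import re


def clean_script(text):
    """Remove common Markdown markup and bracketed stage directions."""
    text = re.sub(r"```.*?```", " ", text, flags=re.DOTALL)  # fenced code blocks
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", text)  # links/images: keep label
    # Leading heading, list, and quote markers on each line.
    text = re.sub(r"^[ \t]{0,3}(#{1,6}|[-*+]|>)[ \t]+", "", text, flags=re.MULTILINE)
    text = re.sub(r"[*_`~]+", "", text)  # emphasis and inline-code markers
    # Assumption: stage directions are written in (...) or [...].
    text = re.sub(r"\([^()]*\)|\[[^\[\]]*\]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def chunk_text(text, max_chars=200):
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = re.findall(r"[^.!?。！？]+[.!?。！？]*", text)
    chunks, current = [], ""
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk returned by `chunk_text(clean_script(raw))` could then be synthesized separately and the audio joined afterwards.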