Commit Graph

37 Commits

Author SHA1 Message Date
dekun 56f14206dd Widen layout to 1800px and polish one-click page design
Set container max-width to 1800px, redesign one-click tab with hero bar, card inputs, toolbar row, and three-column output grid.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 19:31:18 +08:00
dekun d63cb318b2 Redesign one-click tab as single compact page
Wrap workflow in one panel with side-by-side input columns, inline options, embedded history accordion, and fixed tab bar distribution.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 19:22:52 +08:00
dekun ca49b2feed Improve one-click tab layout: even nav, equal cards, collapsible voice
Evenly distribute main tabs, make upload and transcript inputs equal height, and fold voice selection into a collapsed accordion.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 18:59:27 +08:00
dekun 54523e39af Allow saving voiceover at adjusted playback speed
Add a save button that exports WAV at the current slider speed using Web Audio, matching what the user hears during preview.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 18:58:00 +08:00
dekun 1acba0349c Add playback speed control for generated voiceovers
Replace Gradio Audio output with an HTML player that supports play/pause and a 0.5x-2.0x speed slider, plus direct /outputs WAV download.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 18:53:35 +08:00
dekun 2dd642598f Move header and status into config tab for compact nav
Use four tabs (one-click, pipeline, voice lock, config) as the main navigation and relocate the project intro, install button, and Ollama and speaker status into the config tab to save vertical space.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 18:50:35 +08:00
dekun 541df29722 Fix inconsistent voice across TTS segments
Use the same manual_seed for every chunk and normalize per-segment peaks before concat so long voiceovers no longer sound like different speakers between segments.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 18:46:25 +08:00
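
Note on 541df29722: the per-segment peak normalization before concatenation might look roughly like the sketch below. It works on plain lists of float samples. The function names and the 0.9 target peak are illustrative assumptions, not taken from the repository. The seeding half of the fix (reusing one seed for every chunk) is not shown because it depends on the TTS library.

```python
from typing import List, Sequence


def normalize_peak(samples: Sequence[float], target_peak: float = 0.9) -> List[float]:
    """Scale one segment so its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent (or empty) segment: nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


def concat_normalized(segments: Sequence[Sequence[float]],
                      target_peak: float = 0.9) -> List[float]:
    """Normalize each segment on its own, then join them in order."""
    out: List[float] = []
    for seg in segments:
        out.extend(normalize_peak(seg, target_peak))
    return out
```

Normalizing each segment separately keeps one quiet chunk from sounding like a different take next to a loud one. It does not change timbre, which is why the commit pairs it with a fixed seed.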
dekun 4255cf7cd7 Fix Gradio 4.x Audio compatibility on server
Only pass show_download_button and show_share_button when the installed Gradio Audio component supports them, fixing PM2 startup TypeError.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 18:40:19 +08:00
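
Note on 4255cf7cd7: one way to pass optional keyword arguments only when a component accepts them is to inspect the constructor signature first. The sketch below uses two hypothetical stand-in classes rather than the real Gradio Audio component, and the helper name supported_kwargs is an assumption. It shows the general technique, not the code in this commit.

```python
import inspect
from typing import Any, Callable, Dict


def supported_kwargs(factory: Callable[..., Any],
                     candidates: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the candidate kwargs named in factory's signature."""
    params = inspect.signature(factory).parameters
    return {k: v for k, v in candidates.items() if k in params}


# Hypothetical stand-in for a component version without the newer options.
class OldAudio:
    def __init__(self, label: str = "") -> None:
        self.label = label


# Hypothetical stand-in for a component version that has them.
class NewAudio:
    def __init__(self, label: str = "", show_download_button: bool = True,
                 show_share_button: bool = False) -> None:
        self.label = label
        self.show_download_button = show_download_button
        self.show_share_button = show_share_button


optional = {"show_download_button": True, "show_share_button": False}

# Unsupported names are dropped, so neither call raises TypeError.
old = OldAudio(label="voiceover", **supported_kwargs(OldAudio, optional))
new = NewAudio(label="voiceover", **supported_kwargs(NewAudio, optional))
```

Checking the signature at startup avoids pinning the app to a single library version, at the cost of silently dropping options on older installs.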
dekun bdc63c04df Add voice history, default preset voice, and one-click tab
Keep synthesized wav files browsable with playback and download, default to preset steady male voice, show one-click pipeline as the first tab, and reduce post-synthesis UI flicker.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 18:37:53 +08:00
dekun 7c50b13c57 Fix TTS synthesis UI stuck on loading state
Enable Gradio queue, immediate pending feedback, segment progress, and gr.update for Audio so long syntheses show logs and playback correctly.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 18:02:34 +08:00
dekun 97c11e08e0 Reduce post-synthesis UI flicker by removing 1s status timer.
Refresh status every 60s only, shorten synth log, update log before audio, and isolate repaint regions.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:46:03 +08:00
dekun 038e00fbcf Use vertical pipeline layout after polish to fix cramped UI.
Stack steps in full-width cards, compact voice grid, and guide user to Step 3 after polish.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:39:13 +08:00
dekun 131cbf070a Fix voice selector white screen on dark mobile UI.
Replace Dropdown with styled Radio and add dark-theme CSS for select lists.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:32:56 +08:00
dekun eb71e28427 Add local GPU preset voices with dropdown selection.
Generate ChatTTS sample_random_speaker presets without cloud APIs; choose clone or preset in synthesize UI.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:28:17 +08:00
dekun 8be34a2fd5 Fix ChatTTS CUDA device-side assert with text sanitization and GPU recovery.
Re-enable KV cache by default, normalize digits and unsafe chars, disable per-chunk split_text, and reload ChatTTS after CUDA errors.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:13:57 +08:00
dekun 1779449bba Fix ChatTTS "recursion depth exceeded" error on empty generation.
Disable ensure_non_empty retries, set min_new_token, always refine text, and use per-chunk manual_seed.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:10:26 +08:00
dekun 0cce6cda7c Fix CUDA OOM by mutually unloading Whisper and ChatTTS on 8GB GPU.
Release GPU memory before TTS/ASR switches, lower TTS token limits, and set PYTORCH_CUDA_ALLOC_CONF in PM2.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 17:03:37 +08:00
dekun 82f99c0b89 Fix ChatTTS "Corrupt input data" error by correcting speaker params.
Use spk_smp plus txt_smp for voice clone instead of mis-encoding into spk_emb; migrate legacy speaker_emb.pt and improve error hints.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 16:41:23 +08:00
dekun f36056d293 Add TTS markdown sanitization and expand deployment docs.
Strip Markdown and stage directions before ChatTTS synthesis with chunked long scripts; document model pre-download, server-update, and microphone HTTPS notes.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 16:31:06 +08:00
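
Note on f36056d293: stripping Markdown and stage directions and then chunking a long script could be sketched as below. The regular expressions, the assumption that stage directions appear in square brackets or parentheses, the 200-character limit, and both function names are illustrative guesses. The repository's actual rules may differ.

```python
import re
from typing import List


def strip_markup(text: str) -> str:
    """Remove common Markdown syntax and bracketed stage directions."""
    text = re.sub(r"`{3}.*?`{3}", " ", text, flags=re.DOTALL)  # fenced code
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", " ", text)           # images
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)        # links -> label
    text = re.sub(r"\[[^\]]*\]|\([^)]*\)", " ", text)           # stage directions
    # Leading heading, bullet, quote, and numbered-list markers.
    text = re.sub(r"^\s{0,3}(#{1,6}|[-*+]|>|\d+\.)\s+", "", text, flags=re.MULTILINE)
    text = re.sub(r"[*_`~]+", "", text)                          # emphasis marks
    return re.sub(r"\s+", " ", text).strip()


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    chunks: List[str] = []
    current = ""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        while len(sentence) > max_chars:
            # A single over-long sentence is cut hard at the limit.
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

A script would pass through strip_markup first and chunk_text second, so that chunk boundaries are computed on the text the synthesizer actually receives.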
dekun 0f5277c22e Add Whisper offline loading for air-gapped servers.
Pre-download models via HF mirror scripts so internal-network deployments avoid Hugging Face Hub "Network is unreachable" errors.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 16:11:57 +08:00
dekun 39e29fe6a9 Load mobile audio via ffmpeg to avoid librosa PySoundFile warnings.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 16:05:55 +08:00
dekun aacdffac77 Fix ChatTTS load: pre-download via HF mirror, avoid GitHub timeout.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 15:16:27 +08:00
dekun 1ab1ede1b5 Fix PWA_NPS.md section numbering.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 15:11:55 +08:00
dekun 984f2e03a4 Add mobile microphone HTTPS hints and Permissions-Policy header.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 15:11:31 +08:00
dekun 90e77f8f70 Move reverse proxy docs to PWA_NPS.md for NPS setup; remove bundled nginx config.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 15:04:43 +08:00
dekun 21400700c5 Add HTTPS reverse proxy guide and PNG icons for real PWA install.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 15:00:15 +08:00
dekun 1d00c36cd3 Add server-update.sh for force sync when CRLF causes git pull conflicts.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:55:31 +08:00
dekun aea39a00ae Support .env for server-local Ollama config to avoid git pull conflicts.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:53:47 +08:00
dekun 7e65349878 Optimize tablet load: defer health check, lighten service worker, drop Google Fonts.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:49:58 +08:00
dekun f0bb40c605 Fix hint visibility and add PWA install button with one-click prompt.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:30:57 +08:00
dekun 3a0dff87bf Center responsive layout and add PWA install support for mobile, tablet, and desktop.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:25:57 +08:00
dekun e11caa59ab Improve UI contrast: high-visibility theme and status cards.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:20:05 +08:00
dekun fc96f834a0 Fix Gradio 6.0 theme/css warning and refresh speaker status after lock.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:17:26 +08:00
dekun 4a4f40fac4 Improve deploy.sh: fix git sync, CN pip mirrors, and pip retry on timeout.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 14:09:38 +08:00
dekun 136fc51f62 Fix deploy.sh CRLF line endings for Linux execution.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 13:37:24 +08:00
dekun b38b821c35 Add one-click deploy script for /opt production setup with PM2.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 13:32:06 +08:00
dekun 5e95d3af2f Initial commit: add Trading Studio voice-over pipeline for quant trading review videos.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 13:19:44 +08:00