features update

2026-05-04 19:01:11 -06:00
parent dd4ce58920
commit 137dc80cde
2 changed files with 10 additions and 3 deletions
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -54,6 +54,9 @@ Use project virtualenvs where available (`.venv312`, `.venv`, or `venv`) for bac
 - Startup/rendering on Linux WebKit can regress when reintroducing remote fonts/CSP allowances; prefer local font assets.
 - Media URL handling between project load paths should remain consistent to avoid format-specific regressions (especially WAV/MP3 behavior).
 - Export pipeline changes must preserve caption modes (`none`, `sidecar`, `burn-in`) and audio enhancement behavior.
+- WAV export uses `pcm_s16le` codec — only available for audio-only inputs (no video stream). Format selector conditionally shows WAV based on input file extension.
+- `<select>` dropdowns need `[color-scheme:dark]` Tailwind class on Linux WebKit or the native popup renders white-on-light-gray.
+- Frontend gain ranges use camelCase (`gainDb`) but the backend expects snake_case (`gain_db`). The ExportDialog maps them before sending. Any new call sites must do the same.

 ## Recent Changes

@@ -64,6 +67,10 @@ Use project virtualenvs where available (`.venv312`, `.venv`, or `venv`) for bac
 - **Audio normalization (#018)**: New backend endpoint `POST /audio/normalize` in `backend/routers/audio.py`. Two-pass FFmpeg `loudnorm` (measure then apply) implemented in `backend/services/audio_cleaner.py:normalize_audio()`. Falls back to single-pass if measurement fails. Frontend UI in Export panel: target selector (YouTube -14, Spotify -16, Broadcast -23, etc.) with "Normalize" button.
 - **Store**: New `updateWordText(index, text)` action in `editorStore.ts` updates both `words[]` and recomputes `segments[].text`.
 - **Settings panel**: New confidence threshold slider (0–1 range).
+- **WAV export format**: Format selector shows "WAV (Uncompressed)" for audio-only inputs. Backend uses `pcm_s16le` codec via `_get_codec_args()` helper. Codec selection centralized in `backend/services/video_editor.py:_get_codec_args(format_hint, has_video)`.
+- **Normalization moved to export**: No longer a standalone button. Integrated as `normalizeAudio` checkbox + LUFS target selector in ExportPanel. Sent as `normalize_loudness`/`normalize_target_lufs` to backend. Applied via `loudnorm` in FFmpeg audio filter chain during export.
+- **Export camelCase fix**: `ExportDialog.tsx` now manually maps `gainRanges`→`gain_db` and `muteRanges`→`{start,end}` before sending to backend. Prevents Pydantic v2 field rejection.
+- **color-scheme:dark**: All `<select>` elements in ExportDialog use `[color-scheme:dark]` to ensure readable native dropdown popups on Linux WebKit.

 ## Update Rules (Important)

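Note on the codec rule added above: WAV output uses `pcm_s16le` and is only offered when the input has no video stream. A minimal sketch of that rule follows. It is an illustration only, not the code in `backend/services/video_editor.py`. The name `get_codec_args_sketch`, the codecs chosen for the non-WAV formats, and the decision to raise an error are all assumptions that the diff does not confirm.

```python
from typing import List


def get_codec_args_sketch(format_hint: str, has_video: bool) -> List[str]:
    """Hypothetical stand-in for _get_codec_args(format_hint, has_video)."""
    fmt = format_hint.lower().lstrip(".")

    if fmt == "wav":
        if has_video:
            # Assumption: fail loudly instead of silently dropping the video stream.
            raise ValueError("WAV export is only available for audio-only inputs")
        # 16-bit little-endian PCM, the codec named in the notes above.
        return ["-c:a", "pcm_s16le"]

    # Assumed codec pairs; the diff does not state which codecs the project uses.
    if fmt == "webm":
        video_codec, audio_codec = "libvpx-vp9", "libopus"
    else:
        video_codec, audio_codec = "libx264", "aac"

    args = ["-c:a", audio_codec]
    if has_video:
        # Only request a video codec when there is a video stream to encode.
        args = ["-c:v", video_codec] + args
    return args
```

Keeping the WAV check in one helper means the format selector and the export endpoint can both rely on a single rule, which matches the intent of centralizing codec selection.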
--- a/FEATURES.md
+++ b/FEATURES.md
@@ -12,11 +12,11 @@ Features are grouped by priority. Check off items as they are implemented.

 - [x] [#012] **Low-confidence word highlighting** — words with `confidence < 0.6` (configurable in Settings) get an orange dotted underline. Hover shows exact confidence %. (2026-05-04)

-- [x] [#018] **Audio normalization / loudness targeting** — "Normalize" button in Export panel with LUFS target selector (-14 YouTube, -16 Spotify, -23 Broadcast). Backend: FFmpeg two-pass `loudnorm`. (2026-05-04)
+- [x] [#018] **Audio normalization / loudness targeting** — Integrated checkbox in Export panel with LUFS target selector (-14 YouTube, -16 Spotify, -23 Broadcast). Applied during export via FFmpeg `loudnorm` in the audio filter chain. No intermediate files. (2026-05-04)

 - [ ] [#024] **Export to transcript text / SRT only** — some users just want a clean `.txt` or `.srt` of the edited transcript without rendering video.

-- [ ] [#023] **Batch silence removal** — full-file scan + remove all pauses above threshold in one click. Distinct from the manual trimmer above; this is a "fix the whole file" operation.
+- [x] [#023] **Batch silence removal** — full-file scan + remove all pauses above threshold in one click. Implemented by `SilenceTrimmerPanel` + `POST /audio/detect-silence` (FFmpeg silencedetect).

 ---

@@ -80,7 +80,7 @@ These aren't features to build — they're things to make more visible in the UI
 - [#029] AI filler word detection and removal (Ollama / OpenAI / Claude)
 - [#030] AI clip suggestions for social media
 - [#031] Noise reduction (DeepFilterNet or FFmpeg ANLMDN)
-- [#032] Export: fast stream-copy or full reencode (MP4/MOV/WebM, 720p/1080p/4K)
+- [#032] Export: fast stream-copy or full reencode (MP4/MOV/WebM/WAV, 720p/1080p/4K). WAV available for audio-only inputs.
 - [#033] Captions: SRT, VTT, ASS burn-in with font/color/position options
 - [#034] Speaker diarization
 - [#035] Project save / load (.aive JSON format)
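
Note on #023: the FEATURES.md entry says batch silence removal relies on FFmpeg `silencedetect` behind `POST /audio/detect-silence`. The filter writes `silence_start:` and `silence_end:` lines to stderr, so the scan step amounts to pairing those lines and keeping the pauses above a threshold. The sketch below shows one way that parsing might look. The function name `parse_silences`, the `min_duration` parameter, and the sample log are hypothetical, and the endpoint's real implementation is not shown in this diff.

```python
import re
from typing import List, Optional, Tuple

# silencedetect logs lines such as:
#   [silencedetect @ 0x...] silence_start: 1.5
#   [silencedetect @ 0x...] silence_end: 2.0 | silence_duration: 0.5
_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def parse_silences(stderr_text: str, min_duration: float = 0.0) -> List[Tuple[float, float]]:
    """Pair start/end lines and keep pauses of at least min_duration seconds."""
    spans: List[Tuple[float, float]] = []
    start: Optional[float] = None
    for line in stderr_text.splitlines():
        match = _START.search(line)
        if match:
            start = float(match.group(1))
            continue
        match = _END.search(line)
        if match and start is not None:
            end = float(match.group(1))
            if end - start >= min_duration:
                spans.append((start, end))
            start = None
    return spans


# Hypothetical stderr, as might be produced by a command like:
#   ffmpeg -i input.wav -af silencedetect=noise=-30dB:d=0.5 -f null -
SAMPLE_LOG = """\
[silencedetect @ 0x55d0c0] silence_start: 1.5
[silencedetect @ 0x55d0c0] silence_end: 2.0 | silence_duration: 0.5
[silencedetect @ 0x55d0c0] silence_start: 10.25
[silencedetect @ 0x55d0c0] silence_end: 13.25 | silence_duration: 3
"""

long_pauses = parse_silences(SAMPLE_LOG, min_duration=1.0)
```

With the sample log, `long_pauses` holds only the pause from 10.25 s to 13.25 s, because the first pause lasts 0.5 s and falls below the 1.0 s threshold.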