TalkEdit Copilot Instructions (Living Project Context)
Purpose: give AI assistants immediate, accurate context for this repository and define what must be kept in sync when the project evolves.
How To Use This File
- This is a workspace instruction file for VS Code Chat/Copilot.
- Treat this as the first source of truth for architecture and workflow expectations.
- If your code changes make any section outdated, update this file in the same change.
Project Snapshot
- Name: TalkEdit
- Product: local-first, AI-powered, text-based audio/video editor.
- Primary runtime: Tauri + React frontend + Python FastAPI backend.
- Desktop only (Electron has been removed; Tauri is the exclusive desktop runtime).
Tech Stack
- Frontend: React 19, TypeScript, Vite, Tailwind, Zustand.
- Desktop bridge: Tauri API (IPC commands via a window.electronAPI polyfill in frontend/src/lib/tauri-bridge.ts for a unified call-site interface).
- Backend: FastAPI + Uvicorn (backend/main.py) with routers in backend/routers and core services in backend/services.
- Media tooling: FFmpeg for edit/export and codec operations.
- AI tooling: WhisperX/faster-whisper for transcription; provider layer supports OpenAI/Anthropic/Ollama.
Code Map
- frontend/src/components: editor UI (player, transcript, waveform, settings, export, AI panel).
- frontend/src/store: Zustand state (editorStore, aiStore).
- frontend/src/hooks: keyboard/video sync behavior.
- backend/routers: API surface (/transcribe, /export, /ai/*, /captions, /audio/*).
- backend/services: heavy operations (transcription, captioning, diarization, video editing, cleanup).
- shared/project-schema.json: saved project schema contract.
- src-tauri: Rust/Tauri host code and app configuration.
Run And Build (Preferred)
- Frontend dev: npm run dev
- Backend dev: npm run dev:backend
- Tauri dev: npm run dev:tauri
- Tauri build: npm run build:tauri
Use project virtualenvs where available (.venv312, .venv, or venv) for backend execution.
Working Conventions
- Keep router files thin; put heavy logic in backend/services.
- Preserve response compatibility for existing frontend callers unless the task explicitly allows API breakage.
- Frontend uses the unified window.electronAPI interface (Tauri-backed via tauri-bridge.ts); desktop APIs are implemented exclusively in Tauri.
- Prefer small, focused edits over broad refactors.
Known Risk Areas
- Startup/rendering on Linux WebKit can regress when reintroducing remote fonts/CSP allowances; prefer local font assets.
- Media URL handling between project load paths should remain consistent to avoid format-specific regressions (especially WAV/MP3 behavior).
- Export pipeline changes must preserve caption modes (none, sidecar, burn-in) and audio enhancement behavior.
- WAV export uses the pcm_s16le codec, which is only available for audio-only inputs (no video stream). The format selector conditionally shows WAV based on input file extension.
- <select> dropdowns need the [color-scheme:dark] Tailwind class on Linux WebKit or the native popup renders white-on-light-gray.
- Frontend gain ranges use camelCase (gainDb) but the backend expects snake_case (gain_db). The ExportDialog maps them before sending. Any new call sites must do the same.
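The camelCase-to-snake_case mapping described in the last item can be sketched as a small pure function. This is a minimal illustration, not the actual ExportDialog code: the interface names, the toBackendRanges helper, and the gain_ranges/mute_ranges payload keys are hypothetical; only gainDb, gain_db, and the {start,end} shape come from the notes in this file.

```typescript
// Hypothetical frontend shapes. Only gainDb and start/end are taken from this file.
interface GainRange {
  start: number; // seconds
  end: number; // seconds
  gainDb: number;
}

interface MuteRange {
  start: number;
  end: number;
}

// Hypothetical backend shape: snake_case, as the Pydantic models are said to expect.
interface BackendGainRange {
  start: number;
  end: number;
  gain_db: number;
}

interface BackendRanges {
  gain_ranges: BackendGainRange[]; // key name is an assumption
  mute_ranges: MuteRange[]; // key name is an assumption
}

// Map UI ranges to the backend payload, renaming gainDb and dropping extra fields.
function toBackendRanges(gainRanges: GainRange[], muteRanges: MuteRange[]): BackendRanges {
  return {
    gain_ranges: gainRanges.map(({ start, end, gainDb }) => ({ start, end, gain_db: gainDb })),
    // Copy only start/end so UI-only properties never reach the backend.
    mute_ranges: muteRanges.map(({ start, end }) => ({ start, end })),
  };
}

const examplePayload = toBackendRanges(
  [{ start: 1.5, end: 3, gainDb: -6 }],
  [{ start: 10, end: 12 }],
);
console.log(JSON.stringify(examplePayload));
```

Keeping the mapping in one helper means a new call site can reuse it instead of repeating the rename by hand.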
Recent Changes
2026-05-04 — Word text correction, low-confidence highlighting, audio normalization
- Word text correction (#015): Double-click any word in the transcript editor to edit its text inline. Press Enter to commit, Escape to cancel. State is updated in both the words[] and segments[] arrays (segment text recomposed from updated words). Pure frontend; no backend changes needed.
- Low-confidence word highlighting (#012): Words with confidence < threshold (default 0.6, configurable in Settings panel) render with an orange dotted underline. Tooltip shows exact confidence percentage. Threshold is persisted in localStorage key talkedit:confidenceThreshold.
- Audio normalization (#018): New backend endpoint POST /audio/normalize in backend/routers/audio.py. Two-pass FFmpeg loudnorm (measure then apply) implemented in backend/services/audio_cleaner.py:normalize_audio(). Falls back to single-pass if measurement fails. Frontend UI in Export panel: target selector (YouTube -14, Spotify -16, Broadcast -23, etc.) with "Normalize" button.
- Store: New updateWordText(index, text) action in editorStore.ts updates words[] and recomputes segments[].text.
- Settings panel: New confidence threshold slider (0–1 range).
- WAV export format: Format selector shows "WAV (Uncompressed)" for audio-only inputs. Backend uses pcm_s16le codec via _get_codec_args() helper. Codec selection centralized in backend/services/video_editor.py:_get_codec_args(format_hint, has_video).
- Normalization moved to export: No longer a standalone button. Integrated as normalizeAudio checkbox + LUFS target selector in ExportPanel. Sent as normalize_loudness / normalize_target_lufs to backend. Applied via loudnorm in FFmpeg audio filter chain during export.
- Export camelCase fix: ExportDialog.tsx now manually maps gainRanges → gain_db and muteRanges → {start,end} before sending to backend. Prevents Pydantic v2 field rejection.
- color-scheme:dark: All <select> elements in ExportDialog use [color-scheme:dark] to ensure readable native dropdown popups on Linux WebKit.
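The word-correction behavior noted above (update words[], then recompose the affected segment text) can be illustrated with a pure function. This is a sketch under stated assumptions, not the editorStore.ts implementation: the Word, Segment, and TranscriptState shapes are hypothetical, segments are assumed to reference a contiguous inclusive range of word indices, and ignoring empty edits is a choice made for this example.

```typescript
// Hypothetical shapes: only words[], segments[], and segment text come from this file.
interface Word {
  text: string;
  start: number;
  end: number;
  confidence: number;
}

interface Segment {
  text: string;
  // Assumption: each segment covers a contiguous, inclusive range of word indices.
  firstWord: number;
  lastWord: number;
}

interface TranscriptState {
  words: Word[];
  segments: Segment[];
}

// Return a new state with one word's text replaced and its segment text rebuilt.
function updateWordText(state: TranscriptState, index: number, text: string): TranscriptState {
  const trimmed = text.trim();
  // Ignore out-of-range indices and empty edits instead of corrupting the transcript.
  if (index < 0 || index >= state.words.length || trimmed === "") {
    return state;
  }

  const words = state.words.map((w, i) => (i === index ? { ...w, text: trimmed } : w));

  // Only the segment containing the edited word is recomposed; others keep their identity.
  const segments = state.segments.map((seg) =>
    index < seg.firstWord || index > seg.lastWord
      ? seg
      : {
          ...seg,
          text: words
            .slice(seg.firstWord, seg.lastWord + 1)
            .map((w) => w.text)
            .join(" "),
        },
  );

  return { words, segments };
}
```

Returning new arrays rather than mutating in place fits the usual Zustand update pattern, and untouched segments keep their object identity so dependent components can skip re-rendering.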
Update Rules (Important)
When a task changes architecture, app wiring, commands, API shape, project schema, or major conventions, update this file before finishing.
Always update these sections if affected:
- Project Snapshot
- Tech Stack
- Code Map
- Run And Build (Preferred)
- Working Conventions
- Known Risk Areas
- Recent Changes section (if applicable)
If behavior changed significantly, add a short note under a new Recent Changes section with:
- Date (YYYY-MM-DD)
- What changed
- What future edits should preserve
Assistant Behavior For This Repo
- Validate assumptions against current files before editing.
- Prefer existing patterns in neighboring files over introducing new patterns.
- Call out uncertainty explicitly when code and docs disagree.
- If you discover stale docs, fix them as part of the same task when reasonable.