Files
TalkEdit/.github/copilot-instructions.md

97 lines
5.0 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# TalkEdit Copilot Instructions (Living Project Context)
Purpose: give AI assistants immediate, accurate context for this repository and define what must be kept in sync when the project evolves.
## How To Use This File
- This is a workspace instruction file for VS Code Chat/Copilot.
- Treat this as the first source of truth for architecture and workflow expectations.
- If your code changes make any section outdated, update this file in the same change.
## Project Snapshot
- Name: TalkEdit
- Product: local-first, AI-powered, text-based audio/video editor.
- Primary runtime: Tauri + React frontend + Python FastAPI backend.
- Desktop only (Electron has been removed; Tauri is the exclusive desktop runtime).
## Tech Stack
- Frontend: React 19, TypeScript, Vite, Tailwind, Zustand.
- Desktop bridge: Tauri API (IPC commands via `window.electronAPI` polyfill in `frontend/src/lib/tauri-bridge.ts` for unified call-site interface).
- Backend: FastAPI + Uvicorn (`backend/main.py`) with routers in `backend/routers` and core services in `backend/services`.
- Media tooling: FFmpeg for edit/export and codec operations.
- AI tooling: WhisperX/faster-whisper for transcription; provider layer supports OpenAI/Anthropic/Ollama.
## Code Map
- `frontend/src/components`: editor UI (player, transcript, waveform, settings, export, AI panel).
- `frontend/src/store`: Zustand state (`editorStore`, `aiStore`).
- `frontend/src/hooks`: keyboard/video sync behavior.
- `backend/routers`: API surface (`/transcribe`, `/export`, `/ai/*`, `/captions`, `/audio/*`).
- `backend/services`: heavy operations (transcription, captioning, diarization, video editing, cleanup).
- `shared/project-schema.json`: saved project schema contract.
- `src-tauri`: Rust/Tauri host code and app configuration.
## Run And Build (Preferred)
- Frontend dev: `npm run dev`
- Backend dev: `npm run dev:backend`
- Tauri dev: `npm run dev:tauri`
- Tauri build: `npm run build:tauri`
Use project virtualenvs where available (`.venv312`, `.venv`, or `venv`) for backend execution.
## Working Conventions
- Keep router files thin; put heavy logic in `backend/services`.
- Preserve response compatibility for existing frontend callers unless task explicitly allows API breakage.
- Frontend uses unified `window.electronAPI` interface (Tauri-backed via tauri-bridge.ts); desktop APIs are implemented exclusively in Tauri.
- Prefer small, focused edits over broad refactors.
## Known Risk Areas
- Startup/rendering on Linux WebKit can regress when reintroducing remote fonts/CSP allowances; prefer local font assets.
- Media URL handling between project load paths should remain consistent to avoid format-specific regressions (especially WAV/MP3 behavior).
- Export pipeline changes must preserve caption modes (`none`, `sidecar`, `burn-in`) and audio enhancement behavior.
## Recent Changes
### 2026-05-04 — Word text correction, low-confidence highlighting, audio normalization
- **Word text correction (#015)**: Double-click any word in the transcript editor to edit its text inline. Press Enter to commit, Escape to cancel. State is updated in both `words[]` and `segments[]` arrays (segment text recomposed from updated words). Pure frontend; no backend changes needed.
- **Low-confidence word highlighting (#012)**: Words with `confidence < threshold` (default 0.6, configurable in Settings panel) render with an orange dotted underline. Tooltip shows exact confidence percentage. Threshold is persisted in `localStorage` key `talkedit:confidenceThreshold`.
- **Audio normalization (#018)**: New backend endpoint `POST /audio/normalize` in `backend/routers/audio.py`. Two-pass FFmpeg `loudnorm` (measure then apply) implemented in `backend/services/audio_cleaner.py:normalize_audio()`. Falls back to single-pass if measurement fails. Frontend UI in Export panel: target selector (YouTube -14, Spotify -16, Broadcast -23, etc.) with "Normalize" button.
- **Store**: New `updateWordText(index, text)` action in `editorStore.ts` updates both `words[]` and recomputes `segments[].text`.
- **Settings panel**: New confidence threshold slider (01 range).
## Update Rules (Important)
When a task changes architecture, app wiring, commands, API shape, project schema, or major conventions, update this file before finishing.
Always update these sections if affected:
- `Project Snapshot`
- `Tech Stack`
- `Code Map`
- `Run And Build (Preferred)`
- `Working Conventions`
- `Known Risk Areas`
- Recent changes section (if applicable)
- `Code Map`
- `Run And Build (Preferred)`
- `Known Risk Areas`
If behavior changed significantly, add a short note under a new `Recent Changes` section with:
- Date (`YYYY-MM-DD`)
- What changed
- What future edits should preserve
## Assistant Behavior For This Repo
- Validate assumptions against current files before editing.
- Prefer existing patterns in neighboring files over introducing new patterns.
- Call out uncertainty explicitly when code and docs disagree.
- If you discover stale docs, fix them as part of the same task when reasonable.