# TalkEdit **Edit video by editing text.** An offline, local-first desktop video editor where deleting a word from the transcript cuts it from the video. TalkEdit screenshot --- ## Features - **Text-based editing** — delete, reorder, or correct words in the transcript to edit the underlying video. No razor tool, no timeline slicing. - **Word-level transcription** — Whisper.cpp with per-word timestamps and confidence scores. Low-confidence words get a visual warning. - **Four zone types** — Cut, Mute, Sound Gain, and Speed Adjust. Create zones on the waveform timeline and drag edges to refine. - **Waveform timeline** — zoomable, scrollable waveform with playhead scrubbing, zone visualization, markers, chapters, and thumbnail strips. - **AI-powered editing** - Filler word detection and removal - Smart Clean: one-click filler removal + silence trim + noise reduction + loudness normalization - Clip suggestions for social media shorts - Sentence rephrase with AI alternatives - Supports **Ollama** (local), **OpenAI**, and **Claude** backends - **Background music** — import a second audio track with auto-ducking via sidechain compression. - **Export** — fast stream-copy or full re-encode to MP4, MOV, WebM, or WAV. Resolution up to 4K. - **Captions** — generate SRT, VTT, or burn-in ASS subtitles with configurable font, color, and position. - **Speaker diarization** — identify and label multiple speakers. - **Audio tools** — noise reduction (DeepFilterNet), loudness normalization (LUFS targeting), background removal (MediaPipe), batch silence removal, video zoom/punch-in. - **Project save/load** — `.aive` JSON format preserves all edits, zones, markers, and AI config. - **Customizable hotkeys** — two presets (Standard / Left-hand) with per-key remapping and conflict detection. - **100% offline, no account required** — everything runs on your machine. No telemetry, no cloud dependency. - **7-day free trial** with one-time license key purchase. No subscription. --- ## Tech Stack | Layer | Technology | |-------|------------| | Desktop shell | **Tauri 2.0** (Rust) | | Frontend | **React** + **TypeScript** + **Tailwind CSS** | | State management | **Zustand** with Zundo undo/redo | | Transcription | **Whisper.cpp** (word-level timestamps) | | AI / LLM | **Ollama**, **OpenAI**, **Claude** (plugable backends) | | Media processing | **FFmpeg** | | Python services | **FastAPI** (spawned as a child process) | --- ## Quick Start ### Prerequisites - **Node.js** 18+ - **Python** 3.10+ - **FFmpeg** (in PATH) - **Rust** toolchain (for Tauri) - **Ollama** (optional, for local AI features) ### Install ```bash # Root and frontend dependencies npm install cd frontend && npm install && cd .. # Backend dependencies cd backend && pip install -r requirements.txt && cd .. ``` ### Run (Development) ```bash # Start everything: backend + frontend + Tauri npm run dev:tauri ``` Or run components separately: ```bash # Terminal 1: Python backend npm run dev:backend # Terminal 2: Frontend + Tauri cd frontend && cargo tauri dev ``` ### Build ```bash npm run build:tauri ``` --- ## Project Structure ``` talkedit/ ├── src-tauri/ # Tauri 2.0 Rust runtime │ ├── Cargo.toml │ └── src/ │ ├── main.rs # App entry, backend spawner │ ├── lib.rs # Command handlers (IPC bridge) │ ├── transcription.rs # Whisper.cpp integration │ ├── video_editor.rs # FFmpeg-based editing │ ├── caption_generator.rs │ ├── diarization.rs │ ├── ai_provider.rs # Ollama / OpenAI / Claude │ ├── audio_cleaner.rs │ ├── background_removal.rs │ ├── licensing.rs # Trial + key activation │ ├── models.rs # Shared data types │ └── paths.rs ├── frontend/ # React + Vite + Tailwind │ └── src/ │ ├── components/ # UI components │ │ ├── TranscriptEditor.tsx │ │ ├── WaveformTimeline.tsx │ │ ├── VideoPlayer.tsx │ │ ├── AIPanel.tsx │ │ ├── ExportDialog.tsx │ │ ├── SettingsPanel.tsx │ │ ├── BackgroundMusicPanel.tsx │ │ ├── MarkersPanel.tsx │ │ ├── ZoneEditor.tsx │ │ ├── SilenceTrimmerPanel.tsx │ │ ├── AppendClipPanel.tsx │ │ ├── LicenseDialog.tsx │ │ └── DevPanel.tsx │ ├── store/ # Zustand state (editorStore, aiStore, settingsStore) │ ├── hooks/ # Custom React hooks │ ├── lib/ # Utilities and Tauri bridge │ └── types/ # TypeScript interfaces ├── backend/ # FastAPI Python services │ ├── main.py │ ├── routers/ # API endpoints │ │ ├── transcribe.py │ │ ├── ai.py │ │ ├── audio.py │ │ ├── captions.py │ │ └── export.py │ ├── services/ # Core logic │ ├── video_editor.py │ ├── caption_generator.py │ ├── ai_provider.py │ ├── diarization.py │ ├── audio_cleaner.py │ ├── background_removal.py │ └── license_server.py ├── shared/ # Schema definitions (project format) ├── models/ # Whisper model storage └── docs/ # Documentation ``` --- ## Keyboard Shortcuts | Key | Action | |-----|--------| | Space | Play / Pause | | J / K / L | Reverse / Pause / Forward | | I / O | Mark In / Mark Out | | ← / → | Seek ±5 seconds | | Delete | Delete selected words or zones | | Ctrl+Z | Undo | | Ctrl+Shift+Z | Redo | | Ctrl+S | Save project | | Ctrl+E | Export | | Ctrl+F | Search transcript | | Ctrl+Scroll | Zoom waveform | | ? | Shortcut cheatsheet | --- ## License Source code is MIT — see [LICENSE](LICENSE) for details. The distributed binary includes a 7-day free trial requiring a one-time license key purchase for continued use.