TalkEdit
Edit video by editing text. An offline, local-first desktop video editor where deleting a word from the transcript cuts it from the video.
Features
- Text-based editing — delete, reorder, or correct words in the transcript to edit the underlying video. No razor tool, no timeline slicing.
- Word-level transcription — Whisper.cpp with per-word timestamps and confidence scores. Low-confidence words get a visual warning.
- Four zone types — Cut, Mute, Sound Gain, and Speed Adjust. Create zones on the waveform timeline and drag edges to refine.
- Waveform timeline — zoomable, scrollable waveform with playhead scrubbing, zone visualization, markers, chapters, and thumbnail strips.
- AI-powered editing
- Filler word detection and removal
- Smart Clean: one-click filler removal + silence trim + noise reduction + loudness normalization
- Clip suggestions for social media shorts
- Sentence rephrase with AI alternatives
- Supports Ollama (local), OpenAI, and Claude backends
- Background music — import a second audio track with auto-ducking via sidechain compression.
- Export — fast stream-copy or full re-encode to MP4, MOV, WebM, or WAV. Resolution up to 4K.
- Captions — generate SRT, VTT, or burn-in ASS subtitles with configurable font, color, and position.
- Speaker diarization — identify and label multiple speakers.
- Audio tools — noise reduction (DeepFilterNet), loudness normalization (LUFS targeting), background removal (MediaPipe), batch silence removal, video zoom/punch-in.
- Project save/load —
.aiveJSON format preserves all edits, zones, markers, and AI config. - Customizable hotkeys — two presets (Standard / Left-hand) with per-key remapping and conflict detection.
- 100% offline, no account required — everything runs on your machine. No telemetry, no cloud dependency.
- 7-day free trial with one-time license key purchase. No subscription.
Tech Stack
| Layer | Technology |
|---|---|
| Desktop shell | Tauri 2.0 (Rust) |
| Frontend | React + TypeScript + Tailwind CSS |
| State management | Zustand with Zundo undo/redo |
| Transcription | Whisper.cpp (word-level timestamps) |
| AI / LLM | Ollama, OpenAI, Claude (plugable backends) |
| Media processing | FFmpeg |
| Python services | FastAPI (spawned as a child process) |
Quick Start
Prerequisites
- Node.js 18+
- Python 3.10+
- FFmpeg (in PATH)
- Rust toolchain (for Tauri)
- Ollama (optional, for local AI features)
Install
# Root and frontend dependencies
npm install
cd frontend && npm install && cd ..
# Backend dependencies
cd backend && pip install -r requirements.txt && cd ..
Run (Development)
# Start everything: backend + frontend + Tauri
npm run dev:tauri
Or run components separately:
# Terminal 1: Python backend
npm run dev:backend
# Terminal 2: Frontend + Tauri
cd frontend && cargo tauri dev
Build
npm run build:tauri
Project Structure
talkedit/
├── src-tauri/ # Tauri 2.0 Rust runtime
│ ├── Cargo.toml
│ └── src/
│ ├── main.rs # App entry, backend spawner
│ ├── lib.rs # Command handlers (IPC bridge)
│ ├── transcription.rs # Whisper.cpp integration
│ ├── video_editor.rs # FFmpeg-based editing
│ ├── caption_generator.rs
│ ├── diarization.rs
│ ├── ai_provider.rs # Ollama / OpenAI / Claude
│ ├── audio_cleaner.rs
│ ├── background_removal.rs
│ ├── licensing.rs # Trial + key activation
│ ├── models.rs # Shared data types
│ └── paths.rs
├── frontend/ # React + Vite + Tailwind
│ └── src/
│ ├── components/ # UI components
│ │ ├── TranscriptEditor.tsx
│ │ ├── WaveformTimeline.tsx
│ │ ├── VideoPlayer.tsx
│ │ ├── AIPanel.tsx
│ │ ├── ExportDialog.tsx
│ │ ├── SettingsPanel.tsx
│ │ ├── BackgroundMusicPanel.tsx
│ │ ├── MarkersPanel.tsx
│ │ ├── ZoneEditor.tsx
│ │ ├── SilenceTrimmerPanel.tsx
│ │ ├── AppendClipPanel.tsx
│ │ ├── LicenseDialog.tsx
│ │ └── DevPanel.tsx
│ ├── store/ # Zustand state (editorStore, aiStore, settingsStore)
│ ├── hooks/ # Custom React hooks
│ ├── lib/ # Utilities and Tauri bridge
│ └── types/ # TypeScript interfaces
├── backend/ # FastAPI Python services
│ ├── main.py
│ ├── routers/ # API endpoints
│ │ ├── transcribe.py
│ │ ├── ai.py
│ │ ├── audio.py
│ │ ├── captions.py
│ │ └── export.py
│ ├── services/ # Core logic
│ ├── video_editor.py
│ ├── caption_generator.py
│ ├── ai_provider.py
│ ├── diarization.py
│ ├── audio_cleaner.py
│ ├── background_removal.py
│ └── license_server.py
├── shared/ # Schema definitions (project format)
├── models/ # Whisper model storage
└── docs/ # Documentation
Keyboard Shortcuts
| Key | Action |
|---|---|
| Space | Play / Pause |
| J / K / L | Reverse / Pause / Forward |
| I / O | Mark In / Mark Out |
| ← / → | Seek ±5 seconds |
| Delete | Delete selected words or zones |
| Ctrl+Z | Undo |
| Ctrl+Shift+Z | Redo |
| Ctrl+S | Save project |
| Ctrl+E | Export |
| Ctrl+F | Search transcript |
| Ctrl+Scroll | Zoom waveform |
| ? | Shortcut cheatsheet |
License
Source code is MIT — see LICENSE for details. The distributed binary includes a 7-day free trial requiring a one-time license key purchase for continued use.
Languages
TypeScript
60.5%
Python
27.6%
Rust
9.3%
Shell
2%
JavaScript
0.3%
Other
0.3%