Text-based editing — delete, reorder, or correct words in the transcript to edit the underlying video. No razor tool, no timeline slicing.
Word-level transcription — Whisper.cpp with per-word timestamps and confidence scores. Low-confidence words get a visual warning.
Four zone types — Cut, Mute, Sound Gain, and Speed Adjust. Create zones on the waveform timeline and drag edges to refine.
Waveform timeline — zoomable, scrollable waveform with playhead scrubbing, zone visualization, markers, chapters, and thumbnail strips.
AI-powered editing
- Filler word detection and removal
- Smart Clean: one-click filler removal + silence trim + noise reduction + loudness normalization
- Clip suggestions for social media shorts
- Sentence rephrase with AI alternatives
- Supports Ollama (local), OpenAI, and Claude backends
Background music — import a second audio track with auto-ducking via sidechain compression.
Export — fast stream-copy or full re-encode to MP4, MOV, WebM, or WAV. Resolution up to 4K.
Captions — generate SRT, VTT, or burn-in ASS subtitles with configurable font, color, and position.
Speaker diarization — identify and label multiple speakers.
Audio and video tools — noise reduction (DeepFilterNet), loudness normalization (LUFS targeting), batch silence removal, background removal (MediaPipe), video zoom/punch-in.
Project save/load — .aive JSON format preserves all edits, zones, markers, and AI config.
Customizable hotkeys — two presets (Standard / Left-hand) with per-key remapping and conflict detection.
100% offline, no account required — everything runs on your machine. No telemetry, no required cloud dependency; the optional OpenAI and Claude backends need a network connection.
7-day free trial with one-time license key purchase. No subscription.
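
To make the text-based editing idea concrete, here is a minimal sketch of how deleted transcript words could be turned into the time ranges that remain in the video. The `Word` and `Segment` shapes and the `keepSegments` function are hypothetical names chosen for illustration, not the project's actual types.

```typescript
// Hypothetical shape of a transcribed word; field names are assumptions.
interface Word {
  text: string;
  start: number; // seconds
  end: number; // seconds
  deleted: boolean; // true once the user removes the word from the transcript
}

interface Segment {
  start: number;
  end: number;
}

// Walk the words in order and collect the time ranges that survive editing.
// Consecutive deleted words collapse into a single cut, so the exporter
// receives as few segments as possible.
function keepSegments(words: Word[], duration: number): Segment[] {
  const segments: Segment[] = [];
  let cursor = 0; // start of the range currently being kept
  let i = 0;
  while (i < words.length) {
    if (!words[i].deleted) {
      i++;
      continue;
    }
    // Start of a deleted run: extend the cut over every adjacent deleted word.
    const cutStart = words[i].start;
    let cutEnd = words[i].end;
    while (i + 1 < words.length && words[i + 1].deleted) {
      i++;
      cutEnd = words[i].end;
    }
    if (cutStart > cursor) segments.push({ start: cursor, end: cutStart });
    cursor = cutEnd;
    i++;
  }
  if (cursor < duration) segments.push({ start: cursor, end: duration });
  return segments;
}
```

The resulting segments are the kind of input an FFmpeg-based exporter could concatenate, which is why no manual timeline slicing is needed.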

Tech Stack

Layer	Technology
Desktop shell	Tauri 2.0 (Rust)
Frontend	React + TypeScript + Tailwind CSS
State management	Zustand with Zundo undo/redo
Transcription	Whisper.cpp (word-level timestamps)
AI / LLM	Ollama, OpenAI, Claude (pluggable backends)
Media processing	FFmpeg
Python services	FastAPI (spawned as a child process)
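
The word-level timestamps listed for the transcription layer are what make caption export possible. As an illustration only, the sketch below groups timed words into cues and renders them in the SRT format. `TimedWord`, `srtTime`, `toSrt`, and the `maxWords` grouping rule are assumptions made for this example and are not taken from the project's caption generator.

```typescript
// Hypothetical word record; the real transcription output may carry more fields.
interface TimedWord {
  text: string;
  start: number; // seconds
  end: number; // seconds
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function srtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Group words into cues of at most `maxWords` words and render them as SRT.
// Each cue is: index, time range, text, then a blank line.
function toSrt(words: TimedWord[], maxWords: number = 7): string {
  const cues: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    const chunk = words.slice(i, i + maxWords);
    const start = chunk[0].start;
    const end = chunk[chunk.length - 1].end;
    const text = chunk.map((w) => w.text).join(" ");
    cues.push(`${cues.length + 1}\n${srtTime(start)} --> ${srtTime(end)}\n${text}\n`);
  }
  return cues.join("\n");
}
```

A fixed word count per cue is the simplest possible grouping; a real caption generator would more likely break on pauses or punctuation.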

Quick Start

Prerequisites

Node.js 18+
Python 3.10+
FFmpeg (in PATH)
Rust toolchain (for Tauri)
Ollama (optional, for local AI features)

Install

# Root and frontend dependencies
npm install
cd frontend && npm install && cd ..

# Backend dependencies
cd backend && pip install -r requirements.txt && cd ..

Run (Development)

# Start everything: backend + frontend + Tauri
npm run dev:tauri

Or run components separately:

# Terminal 1: Python backend
npm run dev:backend

# Terminal 2: Frontend + Tauri
cd frontend && cargo tauri dev

Build

npm run build:tauri

Project Structure

talkedit/
├── src-tauri/             # Tauri 2.0 Rust runtime
│   ├── Cargo.toml
│   └── src/
│       ├── main.rs             # App entry, backend spawner
│       ├── lib.rs              # Command handlers (IPC bridge)
│       ├── transcription.rs    # Whisper.cpp integration
│       ├── video_editor.rs     # FFmpeg-based editing
│       ├── caption_generator.rs
│       ├── diarization.rs
│       ├── ai_provider.rs      # Ollama / OpenAI / Claude
│       ├── audio_cleaner.rs
│       ├── background_removal.rs
│       ├── licensing.rs        # Trial + key activation
│       ├── models.rs           # Shared data types
│       └── paths.rs
├── frontend/              # React + Vite + Tailwind
│   └── src/
│       ├── components/         # UI components
│       │   ├── TranscriptEditor.tsx
│       │   ├── WaveformTimeline.tsx
│       │   ├── VideoPlayer.tsx
│       │   ├── AIPanel.tsx
│       │   ├── ExportDialog.tsx
│       │   ├── SettingsPanel.tsx
│       │   ├── BackgroundMusicPanel.tsx
│       │   ├── MarkersPanel.tsx
│       │   ├── ZoneEditor.tsx
│       │   ├── SilenceTrimmerPanel.tsx
│       │   ├── AppendClipPanel.tsx
│       │   ├── LicenseDialog.tsx
│       │   └── DevPanel.tsx
│       ├── store/             # Zustand state (editorStore, aiStore, settingsStore)
│       ├── hooks/             # Custom React hooks
│       ├── lib/               # Utilities and Tauri bridge
│       └── types/             # TypeScript interfaces
├── backend/               # FastAPI Python services
│   ├── main.py
│   ├── routers/               # API endpoints
│   │   ├── transcribe.py
│   │   ├── ai.py
│   │   ├── audio.py
│   │   ├── captions.py
│   │   └── export.py
│   ├── services/              # Core logic
│   ├── video_editor.py
│   ├── caption_generator.py
│   ├── ai_provider.py
│   ├── diarization.py
│   ├── audio_cleaner.py
│   ├── background_removal.py
│   └── license_server.py
├── shared/                # Schema definitions (project format)
├── models/                # Whisper model storage
└── docs/                  # Documentation
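
The tree lists shared/ as the home of the project format schema. As a rough illustration of what a .aive JSON document might hold, the sketch below models the four zone types and a save/load round trip. Every type and field name here (`Zone`, `Project`, `value`, `mediaPath`, and so on) is hypothetical; the actual schema in shared/ may look quite different.

```typescript
// All names below are illustrative assumptions, not the real .aive schema.
type ZoneType = "cut" | "mute" | "gain" | "speed";

interface Zone {
  type: ZoneType;
  start: number; // seconds
  end: number; // seconds
  value?: number; // gain in dB or speed multiplier; unused for cut and mute
}

interface Marker {
  time: number; // seconds
  label: string;
}

interface Project {
  version: number;
  mediaPath: string;
  zones: Zone[];
  markers: Marker[];
}

const ZONE_TYPES: string[] = ["cut", "mute", "gain", "speed"];

// Serialize the project as pretty-printed JSON.
function saveProject(project: Project): string {
  return JSON.stringify(project, null, 2);
}

// Parse a project and apply minimal validation to its zones:
// reject unknown zone types and empty or inverted time ranges.
function loadProject(json: string): Project {
  const data = JSON.parse(json) as Project;
  for (const zone of data.zones) {
    if (ZONE_TYPES.indexOf(zone.type) === -1) {
      throw new Error(`Unknown zone type: ${zone.type}`);
    }
    if (zone.end <= zone.start) {
      throw new Error("Zone end must be after zone start");
    }
  }
  return data;
}
```

Keeping the format as plain JSON means a project file can be inspected or diffed with ordinary text tools.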

Keyboard Shortcuts

Key	Action
Space	Play / Pause
J / K / L	Reverse / Pause / Forward
I / O	Mark In / Mark Out
← / →	Seek ±5 seconds
Delete	Delete selected words or zones
Ctrl+Z	Undo
Ctrl+Shift+Z	Redo
Ctrl+S	Save project
Ctrl+E	Export
Ctrl+F	Search transcript
Ctrl+Scroll	Zoom waveform
?	Shortcut cheatsheet
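
Because keys can be remapped, two actions may end up on the same combination. The following is one possible sketch of conflict detection over a binding map, using some of the defaults from the table above. The `Bindings` type, the action names, and both functions are hypothetical and only illustrate the idea.

```typescript
// Hypothetical binding map: action name -> key combination.
type Bindings = Record<string, string>;

// Normalize a combination so "shift+ctrl+z" and "Ctrl+Shift+Z" compare equal:
// lowercase everything and sort the modifiers, keeping the main key last.
function normalizeCombo(combo: string): string {
  const parts = combo.split("+").map((p) => p.trim().toLowerCase());
  const key = parts[parts.length - 1];
  const modifiers = parts.slice(0, -1).sort();
  return modifiers.concat(key).join("+");
}

// Return every normalized combination that is bound to more than one action.
function findConflicts(bindings: Bindings): Record<string, string[]> {
  const byCombo: Record<string, string[]> = {};
  for (const action of Object.keys(bindings)) {
    const combo = normalizeCombo(bindings[action]);
    if (!byCombo[combo]) byCombo[combo] = [];
    byCombo[combo].push(action);
  }
  const conflicts: Record<string, string[]> = {};
  for (const combo of Object.keys(byCombo)) {
    if (byCombo[combo].length > 1) conflicts[combo] = byCombo[combo];
  }
  return conflicts;
}
```

For example, remapping a hypothetical `export` action to `shift+ctrl+z` would be reported as a conflict with the default Redo binding, since both normalize to `ctrl+shift+z`.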

License

Source code is released under the MIT license; see LICENSE for details. The distributed binary includes a 7-day free trial, after which continued use requires a one-time license key purchase.
