dillonj a64ae78833
Some checks failed
CI / rust (push) Failing after 2m46s
CI / frontend (push) Successful in 36s
CI / python (push) Failing after 8s
Validate All / validate-all (push) Failing after 4m53s
Update app icons to custom waveform SVG
2026-05-07 02:58:45 -06:00
2026-04-15 16:36:21 -06:00
2026-05-07 01:32:19 -06:00
2026-04-15 18:02:25 -06:00
2026-04-15 18:02:25 -06:00
2025-01-28 16:57:23 -05:00
2026-04-09 01:36:28 -06:00
2026-04-03 10:35:07 -06:00
2026-04-15 18:00:34 -06:00
2026-04-15 17:40:27 -06:00
2026-05-06 16:47:54 -06:00
2026-03-24 23:56:08 -06:00
2026-04-09 01:36:28 -06:00
2026-04-15 17:40:27 -06:00
2026-05-06 23:11:00 -06:00
2026-05-06 13:18:53 -06:00
2026-05-06 16:15:38 -06:00
2026-04-03 10:46:49 -06:00
2026-05-06 16:15:38 -06:00
2026-04-09 01:36:28 -06:00
2026-05-06 23:11:00 -06:00

TalkEdit

Edit video by editing text. An offline, local-first desktop video editor where deleting a word from the transcript cuts it from the video.

TalkEdit screenshot

Features

  • Text-based editing — delete, reorder, or correct words in the transcript to edit the underlying video. No razor tool, no timeline slicing.
  • Word-level transcription — Whisper.cpp with per-word timestamps and confidence scores. Low-confidence words get a visual warning.
  • Four zone types — Cut, Mute, Sound Gain, and Speed Adjust. Create zones on the waveform timeline and drag edges to refine.
  • Waveform timeline — zoomable, scrollable waveform with playhead scrubbing, zone visualization, markers, chapters, and thumbnail strips.
  • AI-powered editing
    • Filler word detection and removal
    • Smart Clean: one-click filler removal + silence trim + noise reduction + loudness normalization
    • Clip suggestions for social media shorts
    • Sentence rephrase with AI alternatives
    • Supports Ollama (local), OpenAI, and Claude backends
  • Background music — import a second audio track with auto-ducking via sidechain compression.
  • Export — fast stream-copy or full re-encode to MP4, MOV, WebM, or WAV. Resolution up to 4K.
  • Captions — generate SRT, VTT, or burn-in ASS subtitles with configurable font, color, and position.
  • Speaker diarization — identify and label multiple speakers.
  • Audio tools — noise reduction (DeepFilterNet), loudness normalization (LUFS targeting), background removal (MediaPipe), batch silence removal, video zoom/punch-in.
  • Project save/load.aive JSON format preserves all edits, zones, markers, and AI config.
  • Customizable hotkeys — two presets (Standard / Left-hand) with per-key remapping and conflict detection.
  • 100% offline, no account required — everything runs on your machine. No telemetry, no cloud dependency.
  • 7-day free trial with one-time license key purchase. No subscription.

Tech Stack

Layer Technology
Desktop shell Tauri 2.0 (Rust)
Frontend React + TypeScript + Tailwind CSS
State management Zustand with Zundo undo/redo
Transcription Whisper.cpp (word-level timestamps)
AI / LLM Ollama, OpenAI, Claude (plugable backends)
Media processing FFmpeg
Python services FastAPI (spawned as a child process)

Quick Start

Prerequisites

  • Node.js 18+
  • Python 3.10+
  • FFmpeg (in PATH)
  • Rust toolchain (for Tauri)
  • Ollama (optional, for local AI features)

Install

# Root and frontend dependencies
npm install
cd frontend && npm install && cd ..

# Backend dependencies
cd backend && pip install -r requirements.txt && cd ..

Run (Development)

# Start everything: backend + frontend + Tauri
npm run dev:tauri

Or run components separately:

# Terminal 1: Python backend
npm run dev:backend

# Terminal 2: Frontend + Tauri
cd frontend && cargo tauri dev

Build

npm run build:tauri

Project Structure

talkedit/
├── src-tauri/             # Tauri 2.0 Rust runtime
│   ├── Cargo.toml
│   └── src/
│       ├── main.rs             # App entry, backend spawner
│       ├── lib.rs              # Command handlers (IPC bridge)
│       ├── transcription.rs    # Whisper.cpp integration
│       ├── video_editor.rs     # FFmpeg-based editing
│       ├── caption_generator.rs
│       ├── diarization.rs
│       ├── ai_provider.rs      # Ollama / OpenAI / Claude
│       ├── audio_cleaner.rs
│       ├── background_removal.rs
│       ├── licensing.rs        # Trial + key activation
│       ├── models.rs           # Shared data types
│       └── paths.rs
├── frontend/              # React + Vite + Tailwind
│   └── src/
│       ├── components/         # UI components
│       │   ├── TranscriptEditor.tsx
│       │   ├── WaveformTimeline.tsx
│       │   ├── VideoPlayer.tsx
│       │   ├── AIPanel.tsx
│       │   ├── ExportDialog.tsx
│       │   ├── SettingsPanel.tsx
│       │   ├── BackgroundMusicPanel.tsx
│       │   ├── MarkersPanel.tsx
│       │   ├── ZoneEditor.tsx
│       │   ├── SilenceTrimmerPanel.tsx
│       │   ├── AppendClipPanel.tsx
│       │   ├── LicenseDialog.tsx
│       │   └── DevPanel.tsx
│       ├── store/             # Zustand state (editorStore, aiStore, settingsStore)
│       ├── hooks/             # Custom React hooks
│       ├── lib/               # Utilities and Tauri bridge
│       └── types/             # TypeScript interfaces
├── backend/               # FastAPI Python services
│   ├── main.py
│   ├── routers/               # API endpoints
│   │   ├── transcribe.py
│   │   ├── ai.py
│   │   ├── audio.py
│   │   ├── captions.py
│   │   └── export.py
│   ├── services/              # Core logic
│   ├── video_editor.py
│   ├── caption_generator.py
│   ├── ai_provider.py
│   ├── diarization.py
│   ├── audio_cleaner.py
│   ├── background_removal.py
│   └── license_server.py
├── shared/                # Schema definitions (project format)
├── models/                # Whisper model storage
└── docs/                  # Documentation

Keyboard Shortcuts

Key Action
Space Play / Pause
J / K / L Reverse / Pause / Forward
I / O Mark In / Mark Out
← / → Seek ±5 seconds
Delete Delete selected words or zones
Ctrl+Z Undo
Ctrl+Shift+Z Redo
Ctrl+S Save project
Ctrl+E Export
Ctrl+F Search transcript
Ctrl+Scroll Zoom waveform
? Shortcut cheatsheet

License

Source code is MIT — see LICENSE for details. The distributed binary includes a 7-day free trial requiring a one-time license key purchase for continued use.

Description
No description provided
Readme MIT 3.1 MiB
Languages
TypeScript 60.5%
Python 27.6%
Rust 9.3%
Shell 2%
JavaScript 0.3%
Other 0.3%