182 lines
9.6 KiB
Markdown
182 lines
9.6 KiB
Markdown
# TalkEdit — Features & Roadmap
|
||
|
||
**Niche:** "Descript for long-form content" — works on hour+ files without degrading, fully offline, one-time payment.
|
||
|
||
---
|
||
|
||
## ✅ Already Implemented
|
||
|
||
### Core editing
|
||
- [x] [#001] **Cut / Mute sections** — remove or silence segments from output
|
||
- [x] [#002] **Silence / pause trimmer** — batch detect and remove silent pauses
|
||
- [x] [#006] **Volume / gain control** — per-zone and global gain adjustment
|
||
- [x] [#007] **Speed adjustment** — per-zone playback speed changes (0.25x–4x)
|
||
- [x] [#008] **Cut preview** — preview zones before export with configurable padding
|
||
- [x] [#009] **Timeline shows output length** — adjusted timeline with cut compression
|
||
- [x] [#011] **Mark In / Out** — I/O keys to set selection range on timeline
|
||
|
||
### Transcript
|
||
- [x] [#010] **Transcript search (Ctrl+F)** — find words, navigate matches
|
||
- [x] [#012] **Low-confidence word highlighting** — orange dotted underline with confidence %
|
||
- [x] [#013] **Re-transcribe selection** — re-run Whisper on a selected word range
|
||
- [x] [#015] **Word text correction** — double-click any word to edit text in-place
|
||
- [x] [#016] **Named timeline markers** — colored pins with labels, editable
|
||
- [x] [#017] **Chapters** — auto-form from markers, copy as YouTube timestamps
|
||
- [x] [#025] Word-level transcript editing (click, shift+click, drag select)
|
||
- [x] [#026] Ctrl+click word → seek video to that timestamp
|
||
- [x] [#027] Waveform timeline with zoom (Ctrl+scroll), scroll, drag-to-scrub
|
||
- [x] [#028] Auto-scroll waveform when playhead goes off-screen
|
||
|
||
### AI features
|
||
- [x] [#029] **AI filler word detection** — find and remove "um", "uh", "like" etc.
|
||
- [x] [#030] **AI clip suggestions** — find best 20-60s segments for social media
|
||
- [x] [#031] **Noise reduction** — DeepFilterNet or FFmpeg ANLMDN
|
||
- [x] [#034] **Speaker diarization** — label speakers in transcript
|
||
- [x] [#042] **Background removal** — MediaPipe segmentation, blur/color/image replacement
|
||
|
||
### Export
|
||
- [x] [#018] **Audio loudness normalization** — LUFS targets (-14 YouTube, -16 Spotify, -23 Broadcast)
|
||
- [x] [#019] **Background music** — auto-ducking via FFmpeg sidechain compress
|
||
- [x] [#020] **Video zoom / punch-in** — crop, zoom, pan during export
|
||
- [x] [#021] **Multi-clip / append** — concatenate multiple video files
|
||
- [x] [#024] **Export transcript** — plain text or SRT without video
|
||
- [x] [#032] **Export** — fast stream-copy or full re-encode (MP4/MOV/WebM/WAV, 720p–4K)
|
||
- [x] [#033] **Captions** — SRT, VTT, ASS burn-in with font/color/position options
|
||
|
||
### Project & state
|
||
- [x] [#003] **Undo / redo** — 100-level history via Zundo
|
||
- [x] [#004] **Grouped silence-trim zones** — editable batch groups
|
||
- [x] [#005] **Edit silence-trim group** settings after applying
|
||
- [x] [#022] **Clip thumbnail strip** — canvas capture from video, clickable
|
||
- [x] [#035] **Project save / load** — .aive JSON format
|
||
- [x] [#037] **Multi-format input** — MP4, MKV, MOV, AVI, WebM, M4A
|
||
- [x] [#038] **Keyboard shortcuts** — Space, J/K/L, arrows, Ctrl+Z/S/E, ?
|
||
- [x] [#039] **Settings panel** — AI provider config (Ollama, OpenAI, Claude)
|
||
- [x] [#040] **Zone creation on timeline** — draggable edits, Delete to remove
|
||
- [x] [#041] **Customizable hotkeys** — two presets, click-to-remap, conflict detection
|
||
- [x] **[M] Manage Models** — view/delete downloaded Whisper and LLM files
|
||
- [x] **[M] Keyboard cheatsheet** — `?` overlay with close button, preset indicator
|
||
- [x] **[M] Visual toolbar** — grouped buttons with section dividers
|
||
- [x] **[M] Help panel** — full feature documentation in sidebar
|
||
- [x] **[M] First-run welcome overlay** — 3-step quick-start guide
|
||
- [x] **[M] Responsive welcome screen** — animated audio bars, model picker
|
||
- [x] **[M] Error boundary** — catches React crashes, shows fallback + reload
|
||
- [x] **[M] Global error logging** — uncaught errors logged to Rust backend
|
||
- [x] **[M] Store input validation** — NaN rejection, bounds clamping, min zone duration
|
||
- [x] **[M] Runtime assertions** — dev-mode guards in critical paths
|
||
- [x] **[M] Backend health check** — polls every 30s, shows reconnecting banner
|
||
|
||
### Licensing
|
||
- [x] **[L] 7-day free trial** — no credit card required
|
||
- [x] **[L] License activation** — email confirmation step to deter key sharing
|
||
- [x] **[L] Ed25519-signed license keys** — offline verification
|
||
- [x] **[L] Trial integrity** — sentinel file prevents delete-and-reset, XOR checksum deters timestamp editing
|
||
- [x] **[L] canEdit gate** — defaults to locked, only unlocks after verified status
|
||
- [x] **[L] Expired state** — export and loading still work, editing and AI locked
|
||
|
||
### Robustness
|
||
- [x] **[R] Auto-save crash recovery** — every 60s, restore prompt on next launch
|
||
- [x] **[R] Bad project state recovery** — auto-prunes invalid zones on load
|
||
- [x] **[R] Zone/marker deletion confirmations** — prevents accidental removals
|
||
- [x] **[R] Progress bars** — export (determinate), transcription (indeterminate)
|
||
- [x] **[R] Loading spinners** — waveform, AI processing
|
||
- [x] **[R] Error states with retry** — AIPanel, WaveformTimeline
|
||
- [x] **[R] Empty states** — MarkersPanel, AIPanel, ZoneEditor
|
||
- [x] **[R] Canvas zone handles enlarged** — radius 6px, hit area increased
|
||
- [x] **[R] Search match contrast** — thicker rings, higher opacity
|
||
- [x] **[R] Split panes keyboard-accessible** — arrow keys, tabIndex, ARIA
|
||
|
||
### Testing
|
||
- [x] **95 frontend tests** — editorStore (68), licenseStore (22), aiStore (15), assert (4)
|
||
- [x] **12 Rust tests** — licensing (7), models (5)
|
||
- [x] **CI pipeline** — GitHub Actions (Rust: test+clippy, Frontend: tsc+vitest, Python: pytest)
|
||
|
||
---
|
||
|
||
## 🔴 What's Next — highest impact
|
||
|
||
- [ ] **[LLM] Bundled Qwen3 LLM** — auto-download on first AI use, no API keys needed. Replace Python `ai_provider.py` with llama.cpp Rust bindings. Two sizes: 4B (2.5GB, 8GB+ RAM) and 1.7B (1GB, 4GB+ RAM)
|
||
- [ ] **[SHORTS] Smart Shorts finder** — scan transcript for self-contained 10–90s segments, ranked by engagement. One-click export as separate clips
|
||
- [ ] **[PAYMENT] Wire checkout** — payment page at talked.it, Stripe → license key generation → delivery email
|
||
- [ ] **[BETA] Beta testers** — give 5–10 podcasters free licenses in exchange for feedback
|
||
- [ ] **[BUILD] Production builds** — `cargo tauri build` for Windows, macOS, Linux
|
||
|
||
---
|
||
|
||
## 🟡 Medium impact — AI features
|
||
|
||
- [ ] [#044] **AI Transcript Summarization** — bullet-point summary from transcript
|
||
- [ ] [#045] **AI Sentence Rephrase** — right-click word → see alternatives → replace
|
||
- [ ] [#046] **AI Smart Speed** — detect slow sections → suggest speed adjustments
|
||
- [ ] [#047] **AI Auto-Chapters** — topic detection from transcript → markers
|
||
- [ ] [#048] **AI Show Notes** — title, description, keywords, timestamps
|
||
- [ ] [#049] **AI Find Fluff** — detect rambles, off-topic chatter
|
||
- [ ] [#050] **AI Smooth Cuts** — crossfade between deleted segments
|
||
|
||
---
|
||
|
||
## 🟢 Lower impact — expansion
|
||
|
||
- [ ] **Project stitching** — load multiple .aive projects into one export
|
||
- [ ] **Batch export** — multiple projects/cuts in sequence
|
||
- [ ] **Smart chunking** — overlapping chunks for files >2hr
|
||
- [ ] [#014] Alternate transcription backend (VibeVoice-ASR-HF)
|
||
- [ ] [#051] **AI B-roll** — generate footage from text prompt
|
||
- [ ] [#052] **Smart Layouts** — auto-switch speakers in video frame
|
||
- [ ] [#053] **Per-track audio levels** — gain per speaker track
|
||
- [ ] [#054] **Intro/Outro templates** — reusable segment presets
|
||
- [ ] [#055] **Built-in free music library** — CC0 loops shipped with app
|
||
- [ ] [#056] **Stock media browser** — browse local resources/media/
|
||
- [ ] [#057] **Sample content downloader** — test video with pre-made transcript
|
||
|
||
---
|
||
|
||
## 🎬 OpenShot-inspired (long-term)
|
||
|
||
- [ ] Keyframe animations — clip position, scale, opacity over time
|
||
- [ ] Video transitions — crossfade, wipe between clips
|
||
- [ ] Title / text overlays — SVG templates, adjustable font/color
|
||
- [ ] Chroma key / greenscreen — per-clip effect
|
||
- [ ] Speed ramps — animate speed within a clip
|
||
- [ ] Frame-accurate stepping — arrow keys frame by frame
|
||
- [ ] Clip trimming on timeline — drag edges to trim
|
||
- [ ] Snapping — magnetic snap to markers and edges
|
||
|
||
---
|
||
|
||
## 💡 Competitive advantages
|
||
|
||
- **7-day free trial (no CC)** — full features, no risk
|
||
- **One-time purchase** — $39 Pro, $79 Business, no subscription
|
||
- **100% offline** — no account, no cloud, no data leaves your machine
|
||
- **Local AI** — filler detection, clip suggestions, Smart Clean work offline
|
||
- **Word-level precision** — edit video by deleting words, not razor cuts
|
||
- **Per-segment re-transcription** — fix transcription errors on just the bad part
|
||
- **Auto-ducking background music** — music lowers when speech detected, no keyframing
|
||
- **Works on long files** — virtualized transcript + chunked waveform handles 1hr+
|
||
|
||
---
|
||
|
||
## 🚫 Explicitly deferred
|
||
|
||
- Cloud sync / collaboration
|
||
- Voice cloning / TTS
|
||
- Full multi-track NLE (compositing, keyframes, nested sequences)
|
||
- Mobile app
|
||
- Subscription model
|
||
- Image/video generation models
|
||
|
||
TalkEdit's advantage is that it isn't a timeline editor — the text-is-the-timeline model makes spoken-word editing drastically faster than dragging razor cuts.
|
||
|
||
---
|
||
|
||
## 📦 Launch checklist
|
||
|
||
- [ ] Landing page at talked.it (features, screenshots, pricing, downloads)
|
||
- [ ] Demo video (3–5 min walkthrough)
|
||
- [ ] Product Hunt listing + 50 free licenses
|
||
- [ ] r/podcasting, r/VideoEditing, r/selfhosted posts
|
||
- [ ] Hacker News "Show HN"
|
||
- [ ] GitHub v1.0.0 release with Windows/macOS/Linux binaries
|
||
- [ ] Compare page: TalkEdit vs Descript
|