TalkEdit — Features & Roadmap
Niche: "Descript for long-form content" — works on hour+ files without degrading, fully offline, one-time payment.
✅ Already Implemented
Core editing
- [#001] Cut / Mute sections — remove or silence segments from output
- [#002] Silence / pause trimmer — batch detect and remove silent pauses
- [#006] Volume / gain control — per-zone and global gain adjustment
- [#007] Speed adjustment — per-zone playback speed changes (0.25x–4x)
- [#008] Cut preview — preview zones before export with configurable padding
- [#009] Timeline shows output length — adjusted timeline with cut compression
- [#011] Mark In / Out — I/O keys to set selection range on timeline
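The output-length timeline in #009 comes down to simple arithmetic over zones. The sketch below is only an illustration of that arithmetic, not TalkEdit's actual code: the `Zone` class, its field names, and the assumption that zones never overlap are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """Hypothetical zone record; times are seconds in the source file."""
    start: float
    end: float
    kind: str           # "cut", "mute", or "speed"
    speed: float = 1.0  # only used when kind == "speed"


def output_duration(source_duration: float, zones: list[Zone]) -> float:
    """Estimate exported length, assuming zones do not overlap."""
    total = source_duration
    for z in zones:
        # Clip the zone to the media bounds before measuring it.
        span = max(0.0, min(z.end, source_duration) - max(z.start, 0.0))
        if z.kind == "cut":
            total -= span                   # cut zones vanish from the output
        elif z.kind == "speed":
            total -= span - span / z.speed  # 2x halves the span, 0.5x doubles it
        # "mute" zones silence audio but keep their length.
    return total
```

For a one-hour file with a 60 s cut and a 100 s zone played at 2x, `output_duration(3600, [Zone(0, 60, "cut"), Zone(100, 200, "speed", 2.0)])` gives 3490 seconds.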
Transcript
- [#010] Transcript search (Ctrl+F) — find words, navigate matches
- [#012] Low-confidence word highlighting — orange dotted underline with confidence %
- [#013] Re-transcribe selection — re-run Whisper on a selected word range
- [#015] Word text correction — double-click any word to edit text in-place
- [#016] Named timeline markers — colored pins with labels, editable
- [#017] Chapters — auto-form from markers, copy as YouTube timestamps
- [#025] Word-level transcript editing (click, shift+click, drag select)
- [#026] Ctrl+click word → seek video to that timestamp
- [#027] Waveform timeline with zoom (Ctrl+scroll), scroll, drag-to-scrub
- [#028] Auto-scroll waveform when playhead goes off-screen
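The "copy as YouTube timestamps" step in #017 can be pictured as a small formatting pass over the markers. This is a minimal sketch under assumptions: markers are taken to be `(seconds, label)` pairs, and the function names and the fallback "Intro" label are invented for illustration. YouTube's chapter rules expect the list to start at 0:00, so the sketch adds that entry when it is missing.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS once past the hour."""
    whole = int(seconds)
    hours, rem = divmod(whole, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def youtube_chapters(markers: list[tuple[float, str]]) -> str:
    """Turn (seconds, label) markers into a pasteable chapter list."""
    ordered = sorted(markers)
    # Prepend a 0:00 entry if no marker sits in the first second.
    if not ordered or int(ordered[0][0]) != 0:
        ordered.insert(0, (0.0, "Intro"))
    return "\n".join(f"{format_timestamp(t)} {label}" for t, label in ordered)
```

Calling `youtube_chapters([(75.4, "Topic A"), (3725, "Wrap-up")])` yields three lines: `0:00 Intro`, `1:15 Topic A`, and `1:02:05 Wrap-up`.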
AI features
- [#029] AI filler word detection — find and remove "um", "uh", "like" etc.
- [#030] AI clip suggestions — find best 20-60s segments for social media
- [#031] Noise reduction — DeepFilterNet or FFmpeg's anlmdn filter
- [#034] Speaker diarization — label speakers in transcript
- [#042] Background removal — MediaPipe segmentation, blur/color/image replacement
Export
- [#018] Audio loudness normalization — LUFS targets (-14 YouTube, -16 Spotify, -23 Broadcast)
- [#019] Background music — auto-ducking via FFmpeg's sidechaincompress filter
- [#020] Video zoom / punch-in — crop, zoom, pan during export
- [#021] Multi-clip / append — concatenate multiple video files
- [#024] Export transcript — plain text or SRT without video
- [#032] Export — fast stream-copy or full re-encode (MP4/MOV/WebM/WAV, 720p–4K)
- [#033] Captions — SRT, VTT, ASS burn-in with font/color/position options
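For the SRT output mentioned in #024 and #033, the format itself is simple: a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line, the text, and a blank line. The sketch below shows that layout only. The `(start, end, text)` segment shape and the function names are assumptions, and caption styling (ASS burn-in, fonts, position) is out of scope here.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 01:01:01,500."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start, end, text) segments, numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

A real exporter would also need to remap times through cut and speed zones so captions line up with the edited output; that step is omitted here.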
Project & state
- [#003] Undo / redo — 100-level history via Zundo
- [#004] Grouped silence-trim zones — editable batch groups
- [#005] Edit silence-trim group settings after applying
- [#022] Clip thumbnail strip — canvas capture from video, clickable
- [#035] Project save / load — .aive JSON format
- [#037] Multi-format input — MP4, MKV, MOV, AVI, WebM, M4A
- [#038] Keyboard shortcuts — Space, J/K/L, arrows, Ctrl+Z/S/E, ?
- [#039] Settings panel — AI provider config (Ollama, OpenAI, Claude)
- [#040] Zone creation on timeline — draggable edits, Delete to remove
- [#041] Customizable hotkeys — two presets, click-to-remap, conflict detection
- [M] Manage Models — view/delete downloaded Whisper and LLM files
- [M] Keyboard cheatsheet — ? overlay with close button, preset indicator
- [M] Visual toolbar — grouped buttons with section dividers
- [M] Help panel — full feature documentation in sidebar
- [M] First-run welcome overlay — 3-step quick-start guide
- [M] Responsive welcome screen — animated audio bars, model picker
- [M] Error boundary — catches React crashes, shows fallback + reload
- [M] Global error logging — uncaught errors logged to Rust backend
- [M] Store input validation — NaN rejection, bounds clamping, min zone duration
- [M] Runtime assertions — dev-mode guards in critical paths
- [M] Backend health check — polls every 30s, shows reconnecting banner
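The store validation item above (NaN rejection, bounds clamping, minimum zone duration) is easiest to see as a single guard function. The real store lives in the frontend, so treat this as a language-neutral sketch written in Python: the function names and the 0.05 s minimum are hypothetical, while the 0.25x to 4x speed range comes from #007.

```python
import math

MIN_ZONE_DURATION = 0.05  # seconds; hypothetical threshold for illustration
MIN_SPEED, MAX_SPEED = 0.25, 4.0  # range listed under #007


def sanitize_zone(start, end, media_duration, min_duration=MIN_ZONE_DURATION):
    """Return a clamped (start, end) pair, or None if the zone is unusable."""
    for value in (start, end, media_duration):
        # Reject NaN, infinities, and non-numeric input outright.
        if not isinstance(value, (int, float)) or not math.isfinite(value):
            return None
    # Clamp both edges into the media bounds.
    start = min(max(start, 0.0), media_duration)
    end = min(max(end, 0.0), media_duration)
    if end < start:
        start, end = end, start
    # Drop zones too short to be meaningful after clamping.
    if end - start < min_duration:
        return None
    return (start, end)


def clamp_speed(speed):
    """Clamp playback speed into range; fall back to 1.0 on bad input."""
    if not isinstance(speed, (int, float)) or not math.isfinite(speed):
        return 1.0
    return min(max(speed, MIN_SPEED), MAX_SPEED)
```

Returning `None` rather than raising lets the caller silently ignore a bad update, which matches the "rejection" wording above; a stricter design could raise instead.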
Licensing
- [L] 7-day free trial — no credit card required
- [L] License activation — email confirmation step to deter key sharing
- [L] Ed25519-signed license keys — offline verification
- [L] Trial integrity — sentinel file prevents delete-and-reset, XOR checksum deters timestamp editing
- [L] canEdit gate — defaults to locked, only unlocks after verified status
- [L] Expired state — export and loading still work, editing and AI locked
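To make the trial-integrity and canEdit items concrete, here is a toy sketch of a locked-by-default gate with an XOR checksum over the trial start time. Everything in it is an assumption made for illustration (the salt, the payload layout, the function names); it is not TalkEdit's actual scheme, and an XOR checksum only deters casual edits rather than providing real security.

```python
TRIAL_SECONDS = 7 * 24 * 3600           # 7-day trial
SALT = b"hypothetical-app-salt"         # placeholder, not a real key


def xor_checksum(payload: bytes, salt: bytes = SALT) -> int:
    """Fold salted payload bytes into a single byte with XOR."""
    acc = 0
    for i, byte in enumerate(payload):
        acc ^= byte ^ salt[i % len(salt)]
    return acc


def can_edit(started_at: int, checksum: int, now: int) -> bool:
    """Locked by default: unlock only for an untampered, unexpired trial."""
    payload = str(int(started_at)).encode()
    if xor_checksum(payload) != checksum:
        return False  # stored timestamp does not match its checksum
    return 0 <= now - started_at < TRIAL_SECONDS
```

In this sketch, editing the stored start time without also recomputing the checksum leaves the gate locked, and so does a clock set before the trial began.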
Robustness
- [R] Auto-save crash recovery — every 60s, restore prompt on next launch
- [R] Bad project state recovery — auto-prunes invalid zones on load
- [R] Zone/marker deletion confirmations — prevents accidental removals
- [R] Progress bars — export (determinate), transcription (indeterminate)
- [R] Loading spinners — waveform, AI processing
- [R] Error states with retry — AIPanel, WaveformTimeline
- [R] Empty states — MarkersPanel, AIPanel, ZoneEditor
- [R] Canvas zone handles enlarged — radius 6px, hit area increased
- [R] Search match contrast — thicker rings, higher opacity
- [R] Split panes keyboard-accessible — arrow keys, tabIndex, ARIA
Testing
- 109 frontend tests — editorStore (68), licenseStore (22), aiStore (15), assert (4)
- 12 Rust tests — licensing (7), models (5)
- CI pipeline — GitHub Actions (Rust: test+clippy, Frontend: tsc+vitest, Python: pytest)
🔴 What's Next — highest impact
- [LLM] Bundled Qwen3 LLM — auto-download on first AI use, no API keys needed. Replace Python ai_provider.py with llama.cpp Rust bindings. Two sizes: 4B (2.5GB, 8GB+ RAM) and 1.7B (1GB, 4GB+ RAM)
- [SHORTS] Smart Shorts finder — scan transcript for self-contained 10–90s segments, ranked by engagement. One-click export as separate clips
- [PAYMENT] Wire checkout — payment page at talked.it, Stripe → license key generation → delivery email
- [BETA] Beta testers — give 5–10 podcasters free licenses in exchange for feedback
- [BUILD] Production builds — cargo tauri build for Windows, macOS, Linux
🟡 Medium impact — AI features
- [#044] AI Transcript Summarization — bullet-point summary from transcript
- [#045] AI Sentence Rephrase — right-click word → see alternatives → replace
- [#046] AI Smart Speed — detect slow sections → suggest speed adjustments
- [#047] AI Auto-Chapters — topic detection from transcript → markers
- [#048] AI Show Notes — title, description, keywords, timestamps
- [#049] AI Find Fluff — detect rambles, off-topic chatter
- [#050] AI Smooth Cuts — crossfade audio across cut points where segments were removed
🟢 Lower impact — expansion
- Project stitching — load multiple .aive projects into one export
- Batch export — multiple projects/cuts in sequence
- Smart chunking — overlapping chunks for files >2hr
- [#014] Alternate transcription backend (VibeVoice-ASR-HF)
- [#051] AI B-roll — generate footage from text prompt
- [#052] Smart Layouts — auto-switch speakers in video frame
- [#053] Per-track audio levels — gain per speaker track
- [#054] Intro/Outro templates — reusable segment presets
- [#055] Built-in free music library — CC0 loops shipped with app
- [#056] Stock media browser — browse local resources/media/
- [#057] Sample content downloader — test video with pre-made transcript
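The smart chunking item above (overlapping chunks for files over 2 hours) could work roughly as follows. This is a sketch under assumptions: the 10-minute chunk length, 15-second overlap, and function name are illustrative defaults, not values taken from the project.

```python
def overlapping_chunks(duration: float, chunk_len: float = 600.0,
                       overlap: float = 15.0) -> list[tuple[float, float]]:
    """Split [0, duration) into windows that share `overlap` seconds."""
    if chunk_len <= 0 or overlap < 0 or overlap >= chunk_len:
        raise ValueError("need chunk_len > overlap >= 0")
    chunks = []
    step = chunk_len - overlap  # each window starts this far after the last
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)
        chunks.append((start, end))
        if end >= duration:
            break  # the final window reached the end of the file
        start += step
    return chunks
```

The overlap gives each window a few seconds of shared audio with its neighbor, so a word straddling a boundary is heard whole in at least one chunk; merging the duplicated words from overlapping regions is a separate step not shown here.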
🎬 OpenShot-inspired (long-term)
- Keyframe animations — clip position, scale, opacity over time
- Video transitions — crossfade, wipe between clips
- Title / text overlays — SVG templates, adjustable font/color
- Chroma key / greenscreen — per-clip effect
- Speed ramps — animate speed within a clip
- Frame-accurate stepping — arrow keys frame by frame
- Clip trimming on timeline — drag edges to trim
- Snapping — magnetic snap to markers and edges
💡 Competitive advantages
- 7-day free trial (no CC) — full features, no risk
- One-time purchase — $39 Pro, $79 Business, no subscription
- 100% offline — no account, no cloud, no data leaves your machine
- Local AI — filler detection, clip suggestions, Smart Clean work offline
- Word-level precision — edit video by deleting words, not razor cuts
- Per-segment re-transcription — fix transcription errors on just the bad part
- Auto-ducking background music — music lowers when speech detected, no keyframing
- Works on long files — virtualized transcript + chunked waveform handles 1hr+
🚫 Explicitly deferred
- Cloud sync / collaboration
- Voice cloning / TTS
- Full multi-track NLE (compositing, keyframes, nested sequences)
- Mobile app
- Subscription model
- Image/video generation models
TalkEdit's advantage is that it isn't a timeline editor — the text-is-the-timeline model makes spoken-word editing drastically faster than dragging razor cuts.
📦 Launch checklist
- Landing page at talked.it (features, screenshots, pricing, downloads)
- Demo video (3–5 min walkthrough)
- Product Hunt listing + 50 free licenses
- r/podcasting, r/VideoEditing, r/selfhosted posts
- Hacker News "Show HN"
- GitHub v1.0.0 release with Windows/macOS/Linux binaries
- Compare page: TalkEdit vs Descript