Files
TalkEdit/FEATURES.md
2026-05-06 16:47:54 -06:00

9.6 KiB
Raw Permalink Blame History

TalkEdit — Features & Roadmap

Niche: "Descript for long-form content" — works on hour+ files without degrading, fully offline, one-time payment.


Already Implemented

Core editing

  • [#001] Cut / Mute sections — remove or silence segments from output
  • [#002] Silence / pause trimmer — batch detect and remove silent pauses
  • [#006] Volume / gain control — per-zone and global gain adjustment
  • [#007] Speed adjustment — per-zone playback speed changes (0.25x4x)
  • [#008] Cut preview — preview zones before export with configurable padding
  • [#009] Timeline shows output length — adjusted timeline with cut compression
  • [#011] Mark In / Out — I/O keys to set selection range on timeline

Transcript

  • [#010] Transcript search (Ctrl+F) — find words, navigate matches
  • [#012] Low-confidence word highlighting — orange dotted underline with confidence %
  • [#013] Re-transcribe selection — re-run Whisper on a selected word range
  • [#015] Word text correction — double-click any word to edit text in-place
  • [#016] Named timeline markers — colored pins with labels, editable
  • [#017] Chapters — auto-form from markers, copy as YouTube timestamps
  • [#025] Word-level transcript editing (click, shift+click, drag select)
  • [#026] Ctrl+click word → seek video to that timestamp
  • [#027] Waveform timeline with zoom (Ctrl+scroll), scroll, drag-to-scrub
  • [#028] Auto-scroll waveform when playhead goes off-screen

AI features

  • [#029] AI filler word detection — find and remove "um", "uh", "like" etc.
  • [#030] AI clip suggestions — find best 20-60s segments for social media
  • [#031] Noise reduction — DeepFilterNet or FFmpeg ANLMDN
  • [#034] Speaker diarization — label speakers in transcript
  • [#042] Background removal — MediaPipe segmentation, blur/color/image replacement

Export

  • [#018] Audio loudness normalization — LUFS targets (-14 YouTube, -16 Spotify, -23 Broadcast)
  • [#019] Background music — auto-ducking via FFmpeg sidechain compress
  • [#020] Video zoom / punch-in — crop, zoom, pan during export
  • [#021] Multi-clip / append — concatenate multiple video files
  • [#024] Export transcript — plain text or SRT without video
  • [#032] Export — fast stream-copy or full re-encode (MP4/MOV/WebM/WAV, 720p4K)
  • [#033] Captions — SRT, VTT, ASS burn-in with font/color/position options

Project & state

  • [#003] Undo / redo — 100-level history via Zundo
  • [#004] Grouped silence-trim zones — editable batch groups
  • [#005] Edit silence-trim group settings after applying
  • [#022] Clip thumbnail strip — canvas capture from video, clickable
  • [#035] Project save / load — .aive JSON format
  • [#037] Multi-format input — MP4, MKV, MOV, AVI, WebM, M4A
  • [#038] Keyboard shortcuts — Space, J/K/L, arrows, Ctrl+Z/S/E, ?
  • [#039] Settings panel — AI provider config (Ollama, OpenAI, Claude)
  • [#040] Zone creation on timeline — draggable edits, Delete to remove
  • [#041] Customizable hotkeys — two presets, click-to-remap, conflict detection
  • [M] Manage Models — view/delete downloaded Whisper and LLM files
  • [M] Keyboard cheatsheet? overlay with close button, preset indicator
  • [M] Visual toolbar — grouped buttons with section dividers
  • [M] Help panel — full feature documentation in sidebar
  • [M] First-run welcome overlay — 3-step quick-start guide
  • [M] Responsive welcome screen — animated audio bars, model picker
  • [M] Error boundary — catches React crashes, shows fallback + reload
  • [M] Global error logging — uncaught errors logged to Rust backend
  • [M] Store input validation — NaN rejection, bounds clamping, min zone duration
  • [M] Runtime assertions — dev-mode guards in critical paths
  • [M] Backend health check — polls every 30s, shows reconnecting banner

Licensing

  • [L] 7-day free trial — no credit card required
  • [L] License activation — email confirmation step to deter key sharing
  • [L] Ed25519-signed license keys — offline verification
  • [L] Trial integrity — sentinel file prevents delete-and-reset, XOR checksum deters timestamp editing
  • [L] canEdit gate — defaults to locked, only unlocks after verified status
  • [L] Expired state — export and loading still work, editing and AI locked

Robustness

  • [R] Auto-save crash recovery — every 60s, restore prompt on next launch
  • [R] Bad project state recovery — auto-prunes invalid zones on load
  • [R] Zone/marker deletion confirmations — prevents accidental removals
  • [R] Progress bars — export (determinate), transcription (indeterminate)
  • [R] Loading spinners — waveform, AI processing
  • [R] Error states with retry — AIPanel, WaveformTimeline
  • [R] Empty states — MarkersPanel, AIPanel, ZoneEditor
  • [R] Canvas zone handles enlarged — radius 6px, hit area increased
  • [R] Search match contrast — thicker rings, higher opacity
  • [R] Split panes keyboard-accessible — arrow keys, tabIndex, ARIA

Testing

  • 95 frontend tests — editorStore (68), licenseStore (22), aiStore (15), assert (4)
  • 12 Rust tests — licensing (7), models (5)
  • CI pipeline — GitHub Actions (Rust: test+clippy, Frontend: tsc+vitest, Python: pytest)

🔴 What's Next — highest impact

  • [LLM] Bundled Qwen3 LLM — auto-download on first AI use, no API keys needed. Replace Python ai_provider.py with llama.cpp Rust bindings. Two sizes: 4B (2.5GB, 8GB+ RAM) and 1.7B (1GB, 4GB+ RAM)
  • [SHORTS] Smart Shorts finder — scan transcript for self-contained 1090s segments, ranked by engagement. One-click export as separate clips
  • [PAYMENT] Wire checkout — payment page at talked.it, Stripe → license key generation → delivery email
  • [BETA] Beta testers — give 510 podcasters free licenses in exchange for feedback
  • [BUILD] Production buildscargo tauri build for Windows, macOS, Linux

🟡 Medium impact — AI features

  • [#044] AI Transcript Summarization — bullet-point summary from transcript
  • [#045] AI Sentence Rephrase — right-click word → see alternatives → replace
  • [#046] AI Smart Speed — detect slow sections → suggest speed adjustments
  • [#047] AI Auto-Chapters — topic detection from transcript → markers
  • [#048] AI Show Notes — title, description, keywords, timestamps
  • [#049] AI Find Fluff — detect rambles, off-topic chatter
  • [#050] AI Smooth Cuts — crossfade between deleted segments

🟢 Lower impact — expansion

  • Project stitching — load multiple .aive projects into one export
  • Batch export — multiple projects/cuts in sequence
  • Smart chunking — overlapping chunks for files >2hr
  • [#014] Alternate transcription backend (VibeVoice-ASR-HF)
  • [#051] AI B-roll — generate footage from text prompt
  • [#052] Smart Layouts — auto-switch speakers in video frame
  • [#053] Per-track audio levels — gain per speaker track
  • [#054] Intro/Outro templates — reusable segment presets
  • [#055] Built-in free music library — CC0 loops shipped with app
  • [#056] Stock media browser — browse local resources/media/
  • [#057] Sample content downloader — test video with pre-made transcript

🎬 OpenShot-inspired (long-term)

  • Keyframe animations — clip position, scale, opacity over time
  • Video transitions — crossfade, wipe between clips
  • Title / text overlays — SVG templates, adjustable font/color
  • Chroma key / greenscreen — per-clip effect
  • Speed ramps — animate speed within a clip
  • Frame-accurate stepping — arrow keys frame by frame
  • Clip trimming on timeline — drag edges to trim
  • Snapping — magnetic snap to markers and edges

💡 Competitive advantages

  • 7-day free trial (no CC) — full features, no risk
  • One-time purchase — $39 Pro, $79 Business, no subscription
  • 100% offline — no account, no cloud, no data leaves your machine
  • Local AI — filler detection, clip suggestions, Smart Clean work offline
  • Word-level precision — edit video by deleting words, not razor cuts
  • Per-segment re-transcription — fix transcription errors on just the bad part
  • Auto-ducking background music — music lowers when speech detected, no keyframing
  • Works on long files — virtualized transcript + chunked waveform handles 1hr+

🚫 Explicitly deferred

  • Cloud sync / collaboration
  • Voice cloning / TTS
  • Full multi-track NLE (compositing, keyframes, nested sequences)
  • Mobile app
  • Subscription model
  • Image/video generation models

TalkEdit's advantage is that it isn't a timeline editor — the text-is-the-timeline model makes spoken-word editing drastically faster than dragging razor cuts.


📦 Launch checklist

  • Landing page at talked.it (features, screenshots, pricing, downloads)
  • Demo video (35 min walkthrough)
  • Product Hunt listing + 50 free licenses
  • r/podcasting, r/VideoEditing, r/selfhosted posts
  • Hacker News "Show HN"
  • GitHub v1.0.0 release with Windows/macOS/Linux binaries
  • Compare page: TalkEdit vs Descript