TalkEdit/polish_plan.md
2026-05-06 10:53:27 -06:00

# TalkEdit — UI Polish Plan
## 1. Tooltips: show what it does + keyboard shortcut
Every toolbar button and action button should have a `title` that explains the action and shows the keyboard shortcut if one exists.
### Toolbar buttons (App.tsx)
Current: `title={label}` → shows just the name.
New format: `title="Cut selected word range or mark in/out area [Ctrl+X]"`
| Button | Current tooltip | New tooltip |
|--------|----------------|-------------|
| Cut | "Cut" | "Cut selected word range or mark in/out area [Ctrl+X]" |
| Mute | "Mute" | "Mute selected word range or mark in/out area [Ctrl+M]" |
| Gain Zone | "Gain Zone" | "Add gain zone from selection or mark in/out [Ctrl+G]" |
| Speed Zone | "Speed Zone" | "Add speed zone from selection or mark in/out [Ctrl+Shift+S]" |
| Zones | "Zones" | "Open zone editor panel [Ctrl+Shift+Z]" |
| Pause Trim | "Pause Trim" | "Detect and remove silent pauses [Ctrl+T]" |
| Markers | "Markers" | "Add and manage timeline markers [Ctrl+Shift+M]" |
| Music | "Music" | "Add background music track [Ctrl+Shift+B]" |
| Append | "Append" | "Append additional video clips [Ctrl+Shift+A]" |
| Reprocess | "Reprocess transcript with selected model" | "Re-transcribe entire video with selected model" |
| AI | "AI" | "AI filler detection and clip suggestions [Ctrl+I]" |
| Export | "Export" | "Export video with current edits [Ctrl+E]" |
| Settings | "Settings" | "Configure AI providers, shortcuts, models [Ctrl+,]" |
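
One way to keep the "description [shortcut]" format consistent across all buttons is to build every `title` string through a single helper. The sketch below is illustrative only: `formatTooltip` is a hypothetical name, not existing TalkEdit code, and the example strings are taken from the table above.

```typescript
// Hypothetical helper: builds a tooltip in the "description [shortcut]" format.
// When no shortcut exists, the description is returned unchanged.
function formatTooltip(description: string, shortcut?: string): string {
  return shortcut ? `${description} [${shortcut}]` : description;
}

// Export has a shortcut; Reprocess does not.
const exportTitle = formatTooltip("Export video with current edits", "Ctrl+E");
const reprocessTitle = formatTooltip("Re-transcribe entire video with selected model");
```

In a component this could be used as `title={formatTooltip(description, shortcut)}`, so a remapped shortcut shows up in the tooltip without editing the string by hand.
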
### File menu dropdown items
| Item | Current | New |
|------|---------|-----|
| New Project | none | "Start a new empty project" |
| Open File | none | "Open a video or audio file for transcription" |
| Load Project | none | "Open a saved .aive project file" |
| Save | none | "Save current project [Ctrl+S]" |
| Save As | none | "Save a copy of the current project" |
### Waveform timeline controls
| Element | New tooltip |
|---------|-------------|
| Show adjusted timeline checkbox | "Compress cut regions to see the output timeline without gaps" |
| Cut zones toggle | "Show/hide cut ranges on the timeline" |
| Mute zones toggle | "Show/hide mute ranges on the timeline" |
| Gain zones toggle | "Show/hide gain ranges on the timeline" |
| Speed zones toggle | "Show/hide speed ranges on the timeline" |
| Zoom instruction text | "Scroll to pan · Ctrl+Scroll to zoom [Ctrl+= to reset zoom]" |
| Thumbnail toggle | "Show waveform thumbnail previews from the video" |
### Transcript selection toolbar
| Button | New tooltip |
|--------|-------------|
| Cut | "Remove this word range from the output" |
| Mute | "Silence audio for this word range" |
| Gain | "Adjust volume for this word range — positive boosts, negative reduces" |
| Speed | "Change playback speed for this word range — lower is slower, higher is faster" |
| Re-transcribe | "Re-run Whisper transcription on just this segment to improve accuracy" |
### AIPanel buttons
| Button | New tooltip |
|--------|-------------|
| Detect Filler Words | "Scan the entire transcript for filler words (um, uh, like, you know…) and mark for removal" |
| Apply All | "Create cut ranges for all detected filler words at once" |
| Dismiss | "Clear detected filler word results without applying" |
| Find Best Clips | "Analyze transcript to find the most engaging 20-60 second segments for social media" |
| Preview clip | "Seek to this clip's position and play a preview" |
| Export clip | "Export just this segment as a standalone video file" |
### ExportDialog controls
Every control needs a tooltip: this is the most complex panel, and it currently has none.
| Control | Tooltip |
|---------|---------|
| Fast export card | "Stream copy — no re-encoding, fast but no effects or cuts applied" |
| Re-encode card | "Full re-encode — applies cuts, gain, speed, zoom, captions, and effects" |
| Resolution select | "Output video resolution — higher = larger file" |
| Format select | "Output container format — MP4 is most compatible" |
| Enable zoom checkbox | "Crop and reposition the video frame — useful for removing black bars or reframing" |
| Zoom slider | "Magnification level — 1.0x is original, higher values zoom in" |
| Pan X slider | "Horizontal position of the crop window — negative moves left, positive moves right" |
| Pan Y slider | "Vertical position of the crop window — negative moves up, positive moves down" |
| Background removal checkbox | "Remove or replace the background behind the speaker" |
| Background blur slider | "Amount of Gaussian blur applied to the background" |
| Loudness normalization checkbox | "Normalize audio to a consistent loudness target — recommended for YouTube" |
| LUFS target select | "Loudness target: YouTube (-14), Spotify (-16), Broadcast (-23)" |
| Audio enhancement checkbox | "Apply noise reduction and speech enhancement (DeepFilterNet)" |
| Captions select | "Burn captions into video, export as separate file (SRT/VTT), or none" |
| Export Transcript section | "Export just the transcript text or subtitles without the video" |
### SettingsPanel controls
| Control | Tooltip |
|---------|---------|
| Zone preview padding | "Extra context time shown before and after each zone when previewing" |
| Confidence threshold | "Words below this confidence get an orange underline — lower = show fewer warnings" |
| AI provider selector | "Choose which AI backend powers filler detection, chapters, and suggestions" |
| Ollama base URL | "URL of your Ollama instance — default is localhost:11434" |
| Ollama model | "Model name to use for AI features — requires Ollama running with this model pulled" |
| OpenAI API key | "Your OpenAI API key — stored encrypted on your machine" |
| Claude API key | "Your Anthropic Claude API key — stored encrypted on your machine" |
| Keyboard shortcut inputs | "Click then press the key combination you want to assign" |
### Zone detail tooltips
| Element | Tooltip |
|---------|---------|
| Zone preview button | "Preview this zone with {N}s of context before and after" |
| Gain dB input | "Volume adjustment in decibels: +6 dB roughly doubles the amplitude, -6 dB roughly halves it" |
| Speed multiplier | "Playback speed multiplier — 1.0x is normal, 2.0x is twice as fast" |
| Delete zone button | "Remove this zone permanently" |
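
For reference, the decibel figures in the gain tooltip follow from the standard amplitude relation, gain = 10^(dB / 20). A minimal sketch of that arithmetic (the function name `dbToAmplitude` is hypothetical and not taken from the TalkEdit codebase):

```typescript
// Convert a gain in decibels to a linear amplitude multiplier: 10^(dB / 20).
function dbToAmplitude(db: number): number {
  return Math.pow(10, db / 20);
}

const plusSix = dbToAmplitude(6);   // about 1.995
const minusSix = dbToAmplitude(-6); // about 0.501
```

So 6 dB up or down is approximately, not exactly, a factor of two in amplitude.
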
---
## 2. Help menu / feature documentation
### 2.1 Help button in toolbar
Add a `?` help button to the right side of the toolbar (next to Settings):
```
[? Help]
```
Clicking it opens a **Help panel** (not a dialog — uses the existing sidebar panel system, or slides in as an overlay).
### 2.2 Help panel sections
#### Getting Started (for first-time users)
```
Welcome to TalkEdit
1. Open a video file → click "Open Video File" or press Ctrl+O
2. Wait for transcription — Whisper processes your audio and creates a word-level transcript
3. Edit by selecting words → choose Cut, Mute, Gain, or Speed from the toolbar
4. Use AI tools → detect filler words, find clips, auto-chapter
5. Export → apply all edits and save your final video
Pro tip: press ? anytime to see all keyboard shortcuts
```
#### Feature reference
**Transcription**
- Select a Whisper model from the toolbar dropdown (larger = more accurate but slower)
- Click a word to select it, Shift+click to extend the selection
- Ctrl+click any word to seek the video to that timestamp
- Double-click any word to edit its text
- Right-click or use the selection toolbar to apply Cut/Mute/Gain/Speed
- Select a word range and click Re-transcribe to improve accuracy on that segment
**Zones (Cut / Mute / Gain / Speed)**
- Zones are time-range edits applied during export
- Create zones by: selecting words in the transcript, using mark-in/mark-out on the timeline, or dragging on the waveform while in zone mode
- Cut = removes the segment from output entirely
- Mute = silences audio but keeps the video
- Gain = adjust volume (positive = louder, negative = quieter)
- Speed = change playback speed
- All zones can be resized and moved on the waveform timeline
- View and manage all zones in the Zone Editor panel
**Waveform Timeline**
- The waveform shows your audio with all zone overlays
- Click to seek, drag to scrub
- Enter Cut/Mute/Gain/Speed mode from the toolbar, then drag on the waveform to create a zone
- Click an existing zone to select it — drag edges to resize, drag body to move
- Press Delete or Backspace to remove the selected zone
- Ctrl+Scroll to zoom in/out, Scroll to pan horizontally
- Toggle individual zone types on/off with the colored buttons
- "Show adjusted timeline" compresses cut regions to preview the output
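
The "Show adjusted timeline" behaviour can be sketched as a mapping from source time to output time that subtracts the duration of every cut before a given position. This is a sketch under stated assumptions, not the app's actual implementation: `CutRange` and `toAdjustedTime` are hypothetical names, times are in seconds, and cut ranges are assumed not to overlap.

```typescript
// A cut zone in source time (seconds). Assumes start < end.
interface CutRange {
  start: number;
  end: number;
}

// Map a source-time position to its position on the adjusted (output) timeline.
// Assumes the cuts do not overlap; their order does not matter.
function toAdjustedTime(sourceTime: number, cuts: CutRange[]): number {
  let removed = 0;
  for (const cut of cuts) {
    if (sourceTime >= cut.end) {
      // The whole cut lies before this position.
      removed += cut.end - cut.start;
    } else if (sourceTime > cut.start) {
      // The position falls inside the cut, so it collapses to the cut's start.
      removed += sourceTime - cut.start;
    }
  }
  return sourceTime - removed;
}

const exampleCuts: CutRange[] = [
  { start: 10, end: 15 },
  { start: 30, end: 40 },
];
const afterFirstCut = toAdjustedTime(20, exampleCuts);  // 15
const insideFirstCut = toAdjustedTime(12, exampleCuts); // 10
const afterBothCuts = toAdjustedTime(50, exampleCuts);  // 35
```
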
**AI Features**
- Filler word detection: finds "um", "uh", "like", "you know" and similar words. Add custom fillers in the AI panel. Apply All to create cut ranges for all detected fillers at once.
- Clip suggestions: analyzes your transcript to find the best 20-60 second segments for TikTok, YouTube Shorts, or Instagram Reels.
- AI features work locally with the bundled Qwen3 model (no internet needed) or via Ollama/OpenAI/Claude — configure in Settings.
**Markers**
- Markers are named timestamps pinned to the waveform
- Add markers at the current playhead position with a label and color
- Markers auto-sort as chapters — copy as YouTube timestamps format
- Useful for chapter breaks, key moments, or section headings
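
Copying markers as YouTube-style timestamps amounts to sorting them by time and printing one `m:ss Label` line per marker. A minimal sketch, assuming a marker has a label and a time in seconds (`TimelineMarker`, `formatTimestamp`, and `toChapterList` are hypothetical names, not existing TalkEdit code):

```typescript
// Hypothetical marker shape: a label pinned to a time in seconds.
interface TimelineMarker {
  label: string;
  time: number;
}

// Zero-pad a non-negative integer to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Format seconds as m:ss, or h:mm:ss once the time reaches one hour.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.floor(totalSeconds);
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  if (hours > 0) {
    return `${hours}:${pad2(minutes)}:${pad2(seconds)}`;
  }
  return `${minutes}:${pad2(seconds)}`;
}

// Sort a copy of the markers by time and emit one "timestamp label" line each.
function toChapterList(markers: TimelineMarker[]): string {
  return markers
    .slice()
    .sort((a, b) => a.time - b.time)
    .map((m) => `${formatTimestamp(m.time)} ${m.label}`)
    .join("\n");
}

const chapters = toChapterList([
  { label: "Outro", time: 3725 },
  { label: "Intro", time: 0 },
  { label: "Demo", time: 95 },
]);
// chapters is "0:00 Intro\n1:35 Demo\n1:02:05 Outro"
```
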
**Music & Append**
- Background Music: add a music track with auto-ducking (music lowers when someone speaks)
- Append Clips: load additional video files to concatenate during export
- Both are applied during re-encode export only
**Export**
- Fast mode (stream copy): no quality loss, but it doesn't apply cuts, effects, or music, so it is only suitable when you haven't made any edits
- Re-encode mode: applies all edits, cuts, effects, captions, and music
- Captions: burn directly into video or export as separate SRT/VTT file
- Loudness normalization: match YouTube (-14 LUFS), Spotify (-16), or Broadcast (-23) standards
- Audio enhancement: noise reduction and speech clarity via DeepFilterNet
- Video zoom: crop and reposition the frame (useful for removing letterboxing or reframing)
**Keyboard Shortcuts**
[Full table of all shortcuts — same as the ? cheatsheet but always visible in this section]
**Settings**
- AI Providers: configure Ollama (local), OpenAI (cloud), or Claude (cloud). The bundled Qwen3 model works with zero setup.
- Model Management: view and delete downloaded Whisper and LLM models to free disk space
- Keyboard Shortcuts: remap any shortcut — click a binding then press your desired combination
- Confidence threshold: adjust the low-confidence word highlighting sensitivity
- Zone preview padding: how much context to show before/after zones during preview
### 2.3 First-run onboarding
When a user opens the app for the first time (no license activated, no project loaded):
Show a **welcome overlay** with:
1. "Welcome to TalkEdit" heading
2. Brief description: "The offline video editor for long-form content"
3. Three quick-start steps with icons:
   - Open a video → starts transcription
   - Edit by deleting words → cuts out the matching video
   - Export your final cut
4. "Got it" button that dismisses permanently (store in localStorage)
5. A "Show this again" checkbox in the Help panel
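
The dismiss-permanently behaviour could be sketched as a single flag in `localStorage`. The code below is an assumption about how this might look, not existing TalkEdit code: the key name and all function names are hypothetical, and the storage is passed in through a small interface so the logic can also run outside a browser.

```typescript
// Minimal subset of the Web Storage API used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical key name for the "welcome overlay dismissed" flag.
const WELCOME_DISMISSED_KEY = "talkedit.welcomeDismissed";

// Show the overlay unless the user has already clicked "Got it".
function shouldShowWelcome(store: KeyValueStore): boolean {
  return store.getItem(WELCOME_DISMISSED_KEY) !== "true";
}

// Called by the "Got it" button.
function dismissWelcome(store: KeyValueStore): void {
  store.setItem(WELCOME_DISMISSED_KEY, "true");
}

// Called when the user ticks "Show this again".
function showWelcomeAgain(store: KeyValueStore): void {
  store.removeItem(WELCOME_DISMISSED_KEY);
}

// In-memory stand-in for localStorage, useful for tests.
function createMemoryStore(): KeyValueStore {
  const data: { [key: string]: string } = {};
  return {
    getItem: (key) => {
      const value = data[key];
      return value === undefined ? null : value;
    },
    setItem: (key, value) => {
      data[key] = value;
    },
    removeItem: (key) => {
      delete data[key];
    },
  };
}
```

In the app, `window.localStorage` satisfies this interface and could be passed in directly.
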
---
## 3. Keyboard shortcut cheatsheet improvements
Current: `?` key appends a `<div>` to `document.body` with a table of shortcuts.
### Fixes:
- [ ] Render the cheatsheet as a React portal (inside a modal overlay) instead of manual DOM
- [ ] Add a close button (×) in the top-right corner
- [ ] Group shortcuts by category with visual headers (Transport, Editing, File, View)
- [ ] Show the current active preset name at the top
- [ ] List the `?` shortcut in the cheatsheet itself, described as "Show/hide keyboard shortcuts"
- [ ] Show the cheatsheet from the Help panel too (not just `?` key)
### Categories and grouping:
| Transport | Edit | File | View |
|-----------|------|------|------|
| Space — Play/Pause | Delete — Cut selection | Ctrl+S — Save | ? — Toggle cheatsheet |
| ← → — Skip 5s | I — Mark in | Ctrl+O — Open | Ctrl+F — Find |
| J — Slow down | O — Mark out | Ctrl+E — Export | |
| K — Pause | Ctrl+Z — Undo | | |
| L — Speed up | Ctrl+Shift+Z — Redo | | |
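
Grouping could be driven by a `category` field on each shortcut entry, with a fixed category order for the visual headers. A minimal sketch, assuming a flat list of entries (the `ShortcutEntry` shape and `groupByCategory` are hypothetical, not the app's actual data model):

```typescript
type ShortcutCategory = "Transport" | "Edit" | "File" | "View";

// Hypothetical shape of one cheatsheet row.
interface ShortcutEntry {
  keys: string;
  action: string;
  category: ShortcutCategory;
}

interface ShortcutGroup {
  category: ShortcutCategory;
  entries: ShortcutEntry[];
}

// Display order of the category headers.
const CATEGORY_ORDER: ShortcutCategory[] = ["Transport", "Edit", "File", "View"];

// Group entries under their category header, skipping categories with no entries.
function groupByCategory(entries: ShortcutEntry[]): ShortcutGroup[] {
  return CATEGORY_ORDER
    .map((category) => ({
      category,
      entries: entries.filter((e) => e.category === category),
    }))
    .filter((group) => group.entries.length > 0);
}

const groups = groupByCategory([
  { keys: "Ctrl+S", action: "Save", category: "File" },
  { keys: "Space", action: "Play/Pause", category: "Transport" },
  { keys: "I", action: "Mark in", category: "Edit" },
  { keys: "O", action: "Mark out", category: "Edit" },
]);
// groups holds Transport (1 entry), Edit (2 entries), File (1 entry); View is omitted.
```
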
---
## 4. Missing states (empty/loading/error)
### Empty states
| Component | Current | Fix |
|-----------|---------|-----|
| MarkersPanel | Shows nothing when no markers | Add: "No markers yet. Press M or click Add Marker to create one." |
| AIPanel (clips) | Shows nothing before first detection | Add: "Click 'Find Best Clips' to discover the most shareable moments in your video." |
| AppendClipPanel | "No additional clips loaded" | Keep but add hint: "Add video files to concatenate during export." |
| WaveformTimeline (zones) | Canvas is empty | No change needed — zones are overlays, not content |
### Error states
| Component | Current | Fix |
|-----------|---------|-----|
| AIPanel | Errors logged to console only | Show error message in the panel with a retry button |
| ExportDialog | Shows export error in a red box | Keep, but add a "Copy error" button |
| VideoPlayer | No error for broken video | Add an error state with "Could not load video" + re-select button |
| WaveformTimeline | Shows error text in a `<pre>` tag | Keep, but add a "Retry" button |
| Silence detection | Errors use `alert()` | Show error inline in the panel |
### Loading states
| Component | Current | Fix |
|-----------|---------|-----|
| WaveformTimeline | Blank canvas while audio loads | Add a centered "Loading waveform…" spinner |
| Export | Percentage text only | Add a determinate progress bar |
| Transcription | Spinning waveform bars + text | Add a determinate progress bar for model download phase |
| AI features | Spinner + "Processing…" | Add descriptive step text ("Analyzing transcript…") |
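
One way to make sure no panel forgets a state is to model empty, loading, error, and ready as a single discriminated union, so the compiler flags any unhandled case. This is a sketch only: `PanelState` and `describePanelState` are hypothetical names, and the messages are taken from the tables above.

```typescript
// Hypothetical state model shared by sidebar panels.
type PanelState<T> =
  | { kind: "empty"; hint: string }
  | { kind: "loading"; step: string; progress?: number } // progress in 0..1 when determinate
  | { kind: "error"; message: string }
  | { kind: "ready"; data: T };

// Text to show in place of content; an empty string means "render the data".
function describePanelState<T>(state: PanelState<T>): string {
  switch (state.kind) {
    case "empty":
      return state.hint;
    case "loading":
      return state.progress === undefined
        ? state.step
        : `${state.step} ${Math.round(state.progress * 100)}%`;
    case "error":
      return `Error: ${state.message}`;
    case "ready":
      return "";
  }
}

const markersState: PanelState<string[]> = {
  kind: "empty",
  hint: "No markers yet. Press M or click Add Marker to create one.",
};
const exportState: PanelState<string> = {
  kind: "loading",
  step: "Exporting",
  progress: 0.5,
};
```

A retry button would only be rendered for the `error` variant, and a determinate progress bar only when `progress` is present.
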
---
## 5. Consistency fixes
### 5.1 Fix mute zone color in ZoneEditor
`ZoneEditor.tsx` uses `border-orange-500/40` for mute zones — should be `border-blue-500/40` to match the waveform timeline's blue mute color.
### 5.2 Unify disabled opacity
- All disabled buttons: `opacity-40` (currently some use 50%)
### 5.3 Unify border radius
- All toolbar buttons: `rounded-md` (keep)
- All sidebar panel inputs: `rounded-lg` (keep)
- All zone/detection list items: `rounded-lg` (currently `rounded`)
### 5.4 Remove orphaned VolumePanel
`VolumePanel.tsx` is not imported anywhere. Either wire it into the sidebar or remove it.
---
## 6. Quick wins (implement first)
- [ ] Add `title` tooltips to ALL toolbar buttons with shortcut hints
- [ ] Add `title` tooltips to ALL ExportDialog controls
- [ ] Fix mute zone color in ZoneEditor (orange → blue)
- [ ] Add empty state to MarkersPanel
- [ ] Add error display to AIPanel
- [ ] Add close button to keyboard cheatsheet
- [ ] Unify disabled opacity to 40% everywhere
- [ ] Remove orphaned VolumePanel.tsx
- [ ] Add loading spinner to WaveformTimeline
## 7. Help system (implement second)
- [ ] Create `HelpContent.tsx` with all feature documentation content
- [ ] Add Help button to toolbar (`?` icon, opens sidebar)
- [ ] Wire Help as a sidebar panel (like AI, Export, Settings)
- [ ] Build first-run welcome overlay component
- [ ] Add "Show help on startup" checkbox to Settings
- [ ] Render keyboard cheatsheet as React portal with close button
## 8. Polish (implement third)
- [ ] Progress bar for export (determinate bar, not just text)
- [ ] Progress bar for model downloads
- [ ] Retry button on waveform load error
- [ ] Confirmation dialog for zone/marker deletion
- [ ] Keyboard-accessible split pane resizing
- [ ] Larger hit targets for canvas zone handles (r=4 → r=6)
- [ ] Search bar match indicator contrast improvement