updated features and docs
This commit is contained in:
17
FEATURES.md
17
FEATURES.md
@ -131,9 +131,13 @@ Features adapted from OpenShot Video Editor that would fill capability gaps.
|
|||||||
|
|
||||||
These aren't features to build — they're things to make more visible in the UI and README:
|
These aren't features to build — they're things to make more visible in the UI and README:
|
||||||
|
|
||||||
|
- **7-day free trial (no CC required)** — full feature access for 7 days with no credit card. Try everything risk-free.
|
||||||
|
- **One-time purchase, no subscription** — pay once, own forever. No recurring fees, no cloud dependency.
|
||||||
- **100% offline / no account required** — CapCut requires login and sends data to servers. Descript is cloud-first. TalkEdit never leaves the machine.
|
- **100% offline / no account required** — CapCut requires login and sends data to servers. Descript is cloud-first. TalkEdit never leaves the machine.
|
||||||
- **Local AI models** — Ollama support means no API costs and no data leaving the device.
|
- **Local AI models** — Ollama support means no API costs and no data leaving the device.
|
||||||
- **Word-level precision** — editing by deleting words (not dragging razor cuts) is faster for talking-head content than any timeline-based editor.
|
- **Word-level precision editing** — editing by deleting words (not dragging razor cuts) is faster for talking-head content than any timeline-based editor. Double-click any word to correct its text in-place.
|
||||||
|
- **Per-segment re-transcription** — select any word range and re-run Whisper on just that segment instead of re-transcribing the entire file.
|
||||||
|
- **Auto-ducking background music** — add a second audio track that automatically lowers when speech is detected, no manual keyframing needed.
|
||||||
- **Works on long files** — virtualized transcript + chunked waveform handles 1hr+ content that bogs down CapCut.
|
- **Works on long files** — virtualized transcript + chunked waveform handles 1hr+ content that bogs down CapCut.
|
||||||
|
|
||||||
---
|
---
|
||||||
@ -171,3 +175,14 @@ Everything beyond that (picture-in-picture, multi-layer compositing, per-layer k
|
|||||||
- [#038] Keyboard shortcuts (Space, J/K/L, arrows, Ctrl+Z/Shift+Z, Ctrl+S, Ctrl+E)
|
- [#038] Keyboard shortcuts (Space, J/K/L, arrows, Ctrl+Z/Shift+Z, Ctrl+S, Ctrl+E)
|
||||||
- [#039] Settings panel: AI provider config (Ollama, OpenAI, Claude)
|
- [#039] Settings panel: AI provider config (Ollama, OpenAI, Claude)
|
||||||
- [#040] Cut/mute range creation on timeline with draggable zone edits and Delete-to-remove
|
- [#040] Cut/mute range creation on timeline with draggable zone edits and Delete-to-remove
|
||||||
|
- [#058] Help panel with full feature documentation and searchable index
|
||||||
|
- [#059] First-run welcome overlay with quick-start guides
|
||||||
|
- [#060] Auto-save crash recovery — periodic project snapshots, restore on next launch
|
||||||
|
- [#061] Error boundary + global error logging to `~/.TalkEdit/logs/`
|
||||||
|
- [#062] Store input validation — Zustand middleware guards against invalid state
|
||||||
|
- [#063] Trial system — 7-day free trial, sentinel file with integrity check, no CC required
|
||||||
|
- [#064] License activation — email confirmation flow, offline validation
|
||||||
|
- [#065] Model management — view/delete downloaded Whisper and DeepFilterNet models
|
||||||
|
- [#066] Keyboard cheatsheet — press `?` overlay with close button and active preset indicator
|
||||||
|
- [#067] Visual toolbar — grouped icon buttons with section dividers
|
||||||
|
- [#068] Backend health check — reconnecting banner when backend is unreachable
|
||||||
|
|||||||
183
README.md
183
README.md
@ -1,26 +1,58 @@
|
|||||||
# CutScript
|
# TalkEdit
|
||||||
|
|
||||||
An open-source, local-first, Descript-like text-based audio and video editor powered by AI. Edit audio/video by editing text — delete a word from the transcript and it's cut from the audio/video.
|
**Edit video by editing text.** An offline, local-first desktop video editor where deleting a word from the transcript cuts it from the video.
|
||||||
|
|
||||||
<img width="1034" height="661" alt="image" src="https://github.com/user-attachments/assets/b1ed9505-792e-42ca-bb73-85458d0f02a5" />
|
<img width="1034" height="661" alt="TalkEdit screenshot" src="https://github.com/user-attachments/assets/b1ed9505-792e-42ca-bb73-85458d0f02a5" />
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Architecture
|
## Features
|
||||||
|
|
||||||
- **Tauri + React** desktop app with Tailwind CSS
|
- **Text-based editing** — delete, reorder, or correct words in the transcript to edit the underlying video. No razor tool, no timeline slicing.
|
||||||
- **FastAPI** Python backend (spawned as child process)
|
- **Word-level transcription** — Whisper.cpp with per-word timestamps and confidence scores. Low-confidence words get a visual warning.
|
||||||
- **WhisperX** for word-level transcription with alignment
|
- **Four zone types** — Cut, Mute, Sound Gain, and Speed Adjust. Create zones on the waveform timeline and drag edges to refine.
|
||||||
- **FFmpeg** for video processing (stream-copy and re-encode)
|
- **Waveform timeline** — zoomable, scrollable waveform with playhead scrubbing, zone visualization, markers, chapters, and thumbnail strips.
|
||||||
- **Ollama / OpenAI / Claude** for AI features (filler removal, clip creation)
|
- **AI-powered editing**
|
||||||
|
- Filler word detection and removal
|
||||||
|
- Smart Clean: one-click filler removal + silence trim + noise reduction + loudness normalization
|
||||||
|
- Clip suggestions for social media shorts
|
||||||
|
- Sentence rephrase with AI alternatives
|
||||||
|
- Supports **Ollama** (local), **OpenAI**, and **Claude** backends
|
||||||
|
- **Background music** — import a second audio track with auto-ducking via sidechain compression.
|
||||||
|
- **Export** — fast stream-copy or full re-encode to MP4, MOV, WebM, or WAV. Resolution up to 4K.
|
||||||
|
- **Captions** — generate SRT, VTT, or burn-in ASS subtitles with configurable font, color, and position.
|
||||||
|
- **Speaker diarization** — identify and label multiple speakers.
|
||||||
|
- **Audio tools** — noise reduction (DeepFilterNet), loudness normalization (LUFS targeting), background removal (MediaPipe), batch silence removal, video zoom/punch-in.
|
||||||
|
- **Project save/load** — `.aive` JSON format preserves all edits, zones, markers, and AI config.
|
||||||
|
- **Customizable hotkeys** — two presets (Standard / Left-hand) with per-key remapping and conflict detection.
|
||||||
|
- **100% offline, no account required** — everything runs on your machine. No telemetry, no cloud dependency.
|
||||||
|
- **7-day free trial** with one-time license key purchase. No subscription.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Tech Stack
|
||||||
|
|
||||||
|
| Layer | Technology |
|
||||||
|
|-------|------------|
|
||||||
|
| Desktop shell | **Tauri 2.0** (Rust) |
|
||||||
|
| Frontend | **React** + **TypeScript** + **Tailwind CSS** |
|
||||||
|
| State management | **Zustand** with Zundo undo/redo |
|
||||||
|
| Transcription | **Whisper.cpp** (word-level timestamps) |
|
||||||
|
| AI / LLM | **Ollama**, **OpenAI**, **Claude** (plugable backends) |
|
||||||
|
| Media processing | **FFmpeg** |
|
||||||
|
| Python services | **FastAPI** (spawned as a child process) |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Quick Start
|
## Quick Start
|
||||||
|
|
||||||
### Prerequisites
|
### Prerequisites
|
||||||
|
|
||||||
- Node.js 18+
|
- **Node.js** 18+
|
||||||
- Python 3.10+
|
- **Python** 3.10+
|
||||||
- FFmpeg (in PATH)
|
- **FFmpeg** (in PATH)
|
||||||
- (Optional) Ollama for local AI features
|
- **Rust** toolchain (for Tauri)
|
||||||
|
- **Ollama** (optional, for local AI features)
|
||||||
|
|
||||||
### Install
|
### Install
|
||||||
|
|
||||||
@ -36,65 +68,89 @@ cd backend && pip install -r requirements.txt && cd ..
|
|||||||
### Run (Development)
|
### Run (Development)
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Start Tauri dev environment (includes backend + frontend)
|
# Start everything: backend + frontend + Tauri
|
||||||
npm run dev:tauri
|
npm run dev:tauri
|
||||||
```
|
```
|
||||||
|
|
||||||
Or run them separately:
|
Or run components separately:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Terminal 1: Backend
|
# Terminal 1: Python backend
|
||||||
cd backend && python -m uvicorn main:app --reload --port 8642
|
npm run dev:backend
|
||||||
|
|
||||||
# Terminal 2: Frontend + Tauri
|
# Terminal 2: Frontend + Tauri
|
||||||
cd frontend && cargo tauri dev
|
cd frontend && cargo tauri dev
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Build
|
||||||
|
|
||||||
|
```bash
|
||||||
|
npm run build:tauri
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Project Structure
|
## Project Structure
|
||||||
|
|
||||||
```
|
```
|
||||||
talkedit/
|
talkedit/
|
||||||
├── src-tauri/ # Tauri Rust runtime
|
├── src-tauri/ # Tauri 2.0 Rust runtime
|
||||||
│ ├── Cargo.toml
|
│ ├── Cargo.toml
|
||||||
│ ├── src/
|
|
||||||
│ │ ├── main.rs # App entry & backend spawner
|
|
||||||
│ │ └── commands/ # Tauri IPC handlers
|
|
||||||
├── frontend/ # React + Vite + Tailwind
|
|
||||||
│ └── src/
|
│ └── src/
|
||||||
│ ├── components/ # VideoPlayer, TranscriptEditor, etc.
|
│ ├── main.rs # App entry, backend spawner
|
||||||
│ ├── store/ # Zustand state (editorStore, aiStore)
|
│ ├── lib.rs # Command handlers (IPC bridge)
|
||||||
│ ├── lib/tauri-bridge.ts # Tauri API polyfill
|
│ ├── transcription.rs # Whisper.cpp integration
|
||||||
│ └── types/ # TypeScript interfaces
|
│ ├── video_editor.rs # FFmpeg-based editing
|
||||||
├── backend/ # FastAPI Python backend
|
│ ├── caption_generator.rs
|
||||||
|
│ ├── diarization.rs
|
||||||
|
│ ├── ai_provider.rs # Ollama / OpenAI / Claude
|
||||||
|
│ ├── audio_cleaner.rs
|
||||||
|
│ ├── background_removal.rs
|
||||||
|
│ ├── licensing.rs # Trial + key activation
|
||||||
|
│ ├── models.rs # Shared data types
|
||||||
|
│ └── paths.rs
|
||||||
|
├── frontend/ # React + Vite + Tailwind
|
||||||
|
│ └── src/
|
||||||
|
│ ├── components/ # UI components
|
||||||
|
│ │ ├── TranscriptEditor.tsx
|
||||||
|
│ │ ├── WaveformTimeline.tsx
|
||||||
|
│ │ ├── VideoPlayer.tsx
|
||||||
|
│ │ ├── AIPanel.tsx
|
||||||
|
│ │ ├── ExportDialog.tsx
|
||||||
|
│ │ ├── SettingsPanel.tsx
|
||||||
|
│ │ ├── BackgroundMusicPanel.tsx
|
||||||
|
│ │ ├── MarkersPanel.tsx
|
||||||
|
│ │ ├── ZoneEditor.tsx
|
||||||
|
│ │ ├── SilenceTrimmerPanel.tsx
|
||||||
|
│ │ ├── AppendClipPanel.tsx
|
||||||
|
│ │ ├── LicenseDialog.tsx
|
||||||
|
│ │ └── DevPanel.tsx
|
||||||
|
│ ├── store/ # Zustand state (editorStore, aiStore, settingsStore)
|
||||||
|
│ ├── hooks/ # Custom React hooks
|
||||||
|
│ ├── lib/ # Utilities and Tauri bridge
|
||||||
|
│ └── types/ # TypeScript interfaces
|
||||||
|
├── backend/ # FastAPI Python services
|
||||||
│ ├── main.py
|
│ ├── main.py
|
||||||
│ ├── routers/ # API endpoints
|
│ ├── routers/ # API endpoints
|
||||||
│ ├── services/ # Core logic (transcription, editing, AI)
|
│ │ ├── transcribe.py
|
||||||
│ └── utils/ # GPU, cache, audio helpers
|
│ │ ├── ai.py
|
||||||
└── shared/ # Project schema
|
│ │ ├── audio.py
|
||||||
|
│ │ ├── captions.py
|
||||||
|
│ │ └── export.py
|
||||||
|
│ ├── services/ # Core logic
|
||||||
|
│ ├── video_editor.py
|
||||||
|
│ ├── caption_generator.py
|
||||||
|
│ ├── ai_provider.py
|
||||||
|
│ ├── diarization.py
|
||||||
|
│ ├── audio_cleaner.py
|
||||||
|
│ ├── background_removal.py
|
||||||
|
│ └── license_server.py
|
||||||
|
├── shared/ # Schema definitions (project format)
|
||||||
|
├── models/ # Whisper model storage
|
||||||
|
└── docs/ # Documentation
|
||||||
```
|
```
|
||||||
|
|
||||||
## Features
|
---
|
||||||
|
|
||||||
| Feature | Status |
|
|
||||||
|---------|--------|
|
|
||||||
| Word-level transcription (WhisperX) | Done |
|
|
||||||
| Text-based video editing | Done |
|
|
||||||
| Undo/redo | Done |
|
|
||||||
| Waveform timeline | Done |
|
|
||||||
| FFmpeg stream-copy export | Done |
|
|
||||||
| FFmpeg re-encode (up to 4K) | Done |
|
|
||||||
| AI filler word removal | Done |
|
|
||||||
| AI clip creation (Shorts) | Done |
|
|
||||||
| Ollama + OpenAI + Claude | Done |
|
|
||||||
| Word-level captions (SRT/VTT/ASS) | Done |
|
|
||||||
| Caption burn-in on export | Done |
|
|
||||||
| Studio Sound (DeepFilterNet) | Done |
|
|
||||||
| Keyboard shortcuts (J/K/L) | Done |
|
|
||||||
| Speaker diarization | Done |
|
|
||||||
| Virtualized transcript (react-virtuoso) | Done |
|
|
||||||
| Encrypted API key storage | Done |
|
|
||||||
| Project save/load (.cutscript) | Done |
|
|
||||||
| AI background removal | Planned |
|
|
||||||
|
|
||||||
## Keyboard Shortcuts
|
## Keyboard Shortcuts
|
||||||
|
|
||||||
@ -102,28 +158,19 @@ talkedit/
|
|||||||
|-----|--------|
|
|-----|--------|
|
||||||
| Space | Play / Pause |
|
| Space | Play / Pause |
|
||||||
| J / K / L | Reverse / Pause / Forward |
|
| J / K / L | Reverse / Pause / Forward |
|
||||||
|
| I / O | Mark In / Mark Out |
|
||||||
| ← / → | Seek ±5 seconds |
|
| ← / → | Seek ±5 seconds |
|
||||||
| Delete | Delete selected words |
|
| Delete | Delete selected words or zones |
|
||||||
| Ctrl+Z | Undo |
|
| Ctrl+Z | Undo |
|
||||||
| Ctrl+Shift+Z | Redo |
|
| Ctrl+Shift+Z | Redo |
|
||||||
| Ctrl+S | Save project |
|
| Ctrl+S | Save project |
|
||||||
| Ctrl+E | Export |
|
| Ctrl+E | Export |
|
||||||
|
| Ctrl+F | Search transcript |
|
||||||
|
| Ctrl+Scroll | Zoom waveform |
|
||||||
| ? | Shortcut cheatsheet |
|
| ? | Shortcut cheatsheet |
|
||||||
|
|
||||||
## API Endpoints
|
---
|
||||||
|
|
||||||
| Method | Endpoint | Description |
|
|
||||||
|--------|----------|-------------|
|
|
||||||
| GET | /health | Health check |
|
|
||||||
| POST | /transcribe | Transcribe video with WhisperX |
|
|
||||||
| POST | /export | Export edited video (stream copy or re-encode) |
|
|
||||||
| POST | /ai/filler-removal | Detect filler words via LLM |
|
|
||||||
| POST | /ai/create-clip | AI-suggested clips for shorts |
|
|
||||||
| GET | /ai/ollama-models | List local Ollama models |
|
|
||||||
| POST | /captions | Generate SRT/VTT/ASS captions |
|
|
||||||
| POST | /audio/clean | Noise reduction (DeepFilterNet) |
|
|
||||||
| GET | /audio/capabilities | Check audio processing availability |
|
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|
||||||
MIT License — see [LICENSE](LICENSE) for details.
|
Source code is MIT — see [LICENSE](LICENSE) for details. The distributed binary includes a 7-day free trial requiring a one-time license key purchase for continued use.
|
||||||
|
|||||||
@ -1,50 +1,62 @@
|
|||||||
# TalkEdit — Tech Stack, Tools, and Planned Features
|
# TalkEdit — Tech Stack, Tools, and Features
|
||||||
|
|
||||||
This document summarizes the chosen technology, tooling, the full planned feature set for the MVP, recommended additions, removals, and items to put on the back burner.
|
This document summarizes the chosen technology, tooling, the full feature set, recommended additions, and items on the back burner.
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
- Goal: Offline, local text-based audio/video editor (Descript-style) focused on spoken-word creators (podcasters, YouTubers). Fast, privacy-first, single-file installer.
|
- Goal: Offline, local text-based audio/video editor (Descript-style) focused on spoken-word creators (podcasters, YouTubers). Fast, privacy-first, single-file installer.
|
||||||
|
|
||||||
## Tech Stack
|
## Tech Stack
|
||||||
- Frontend: React + Vite + Tailwind CSS + shadcn/ui
|
- Frontend: React 19 + Vite + TypeScript + Tailwind CSS + Zustand (with zundo undo/redo) + Virtuoso (virtualized transcript)
|
||||||
- Backend: Tauri 2.0 (Rust) for file I/O, invoking native binaries, and exposing commands to the UI
|
- Backend: Tauri 2.0 (Rust) for file I/O, licensing, licensing crypto (Ed25519), model management, error logging
|
||||||
- Transcription: Whisper.cpp (Rust bindings like `whisper-rs` / `whisper-cpp-sys`) — word-level timestamps
|
- Transcription: Python faster-whisper with WhisperX for word-level alignment. Models downloaded on demand.
|
||||||
- Audio/Video Processing: FFmpeg invoked from Rust (or `ffmpeg-next` Rust crate)
|
- Audio/Video Processing: FFmpeg invoked from Rust via Python scripts (video_editor.py, audio_cleaner.py, caption_generator.py)
|
||||||
- State: Zustand (in-frontend store)
|
- AI: Ollama, OpenAI, Claude through Python ai_provider.py. Bundled Qwen3 LLM planned.
|
||||||
|
- State: Zustand (in-frontend store) + zundo middleware for undo/redo history
|
||||||
- Packaging: Tauri `tauri build` for cross-platform installers
|
- Packaging: Tauri `tauri build` for cross-platform installers
|
||||||
- Optional local tools: Ollama (optional local LLMs) for advanced on-device heuristics
|
|
||||||
|
|
||||||
## Developer Tools
|
## Developer Tools
|
||||||
- Rust toolchain (cargo, rustc)
|
- Rust toolchain (cargo, rustc)
|
||||||
- Node.js + npm/yarn for frontend
|
- Node.js + npm for frontend
|
||||||
|
- Python 3.11+ (faster-whisper, WhisperX, AI providers)
|
||||||
- FFmpeg binaries (platform-specific; bundled or downloaded at install)
|
- FFmpeg binaries (platform-specific; bundled or downloaded at install)
|
||||||
- Build/test: Tauri CLI, Vite dev server
|
- Build/test: Tauri CLI, Vite dev server
|
||||||
|
- Testing: Vitest (frontend), cargo test (Rust), pytest (Python)
|
||||||
|
- CI: GitHub Actions (Rust clippy/test, Frontend tsc/vitest, Python pytest)
|
||||||
|
|
||||||
## MVP Feature List (Planned)
|
## Implemented Features
|
||||||
1. Drag-and-drop import (audio/video auto audio-extract)
|
|
||||||
2. One-click local transcription (model selector: tiny/base → larger models)
|
- [x] 1. Media import via file dialog (audio/video auto audio-extract)
|
||||||
3. Scrollable, Google-Doc-style transcript editor
|
- [x] 2. One-click local transcription with model selector (tiny/base → larger models) and model-size chooser
|
||||||
|
- [x] 3. Scrollable, Google-Doc-style transcript editor (Virtuoso virtualized)
|
||||||
- Click word → seek video/audio
|
- Click word → seek video/audio
|
||||||
- Highlight + Delete → remove corresponding media segment (smart 150–250ms fades)
|
- Select words → cut corresponding media segment (smart 150–250ms fades)
|
||||||
4. One-click "Clean it" button
|
- [x] 4. Smart Cleanup
|
||||||
- Remove fillers (configurable list)
|
- Filler word removal (configurable list per-project)
|
||||||
- Remove long pauses (>0.8s) by default
|
- Silence trimming
|
||||||
5. One-click audio polish chain (FFmpeg): normalize, light compression, basic noise reduction
|
- [x] 5. Audio Polish chain (FFmpeg): normalize, compression, noise reduction
|
||||||
6. Preview with synced playback, undo/redo, project save/load
|
- [x] 6. Preview with synced playback, undo/redo (zundo), project save/load
|
||||||
7. Export MP4/audio with optional SRT/VTT captions and burned-in captions
|
- [x] 7. Export MP4/audio with SRT/VTT/ASS captions (speaker-labeled)
|
||||||
|
- [x] 8. Speaker diarization
|
||||||
|
- [x] 9. Custom filler lists per-project
|
||||||
|
- [x] 10. Background music with auto-ducking
|
||||||
|
- [x] 11. Append clips (concatenation)
|
||||||
|
- [x] 12. Settings: AI provider config (Ollama, OpenAI, Claude)
|
||||||
|
- [x] 13. Keyboard shortcuts with custom remapping
|
||||||
|
- [x] 14. Help panel + cheatsheet
|
||||||
|
- [x] 15. 7-day licensing with Ed25519-signed license keys
|
||||||
|
|
||||||
## Recommended Additions (near-term, high ROI)
|
## Recommended Additions (near-term, high ROI)
|
||||||
- Model-size chooser + progressive fallback (start fast, upgrade model later)
|
|
||||||
- Local GPU/CPU detection & recommended model/settings UI
|
- [ ] Local GPU/CPU detection & recommended model/settings UI
|
||||||
- Per-project incremental transcription: re-run only edited segments
|
- [ ] Per-project incremental transcription: re-run only edited segments
|
||||||
- "Preview cleaning" dry-run that highlights candidate removals before applying
|
- [ ] "Preview cleaning" dry-run that highlights candidate removals before applying
|
||||||
- Export size/time estimator and suggested export presets
|
- [ ] Export size/time estimator and suggested export presets
|
||||||
- Custom filler lists per-project and import/export of filler lists
|
- [ ] Accessibility export presets (podcast vs YouTube presets)
|
||||||
- High-quality offline captions export (SRT + VTT + speaker labels)
|
- [ ] Bundled Qwen3 LLM for offline AI features
|
||||||
- Accessibility export presets (podcast vs YouTube presets)
|
|
||||||
|
|
||||||
## Remove / Defer (Back Burner)
|
## Remove / Defer (Back Burner)
|
||||||
These broaden scope or add legal/privacy surface — defer for now.
|
These broaden scope or add legal/privacy surface — defer for now.
|
||||||
|
|
||||||
- Voice cloning / TTS: DEFER
|
- Voice cloning / TTS: DEFER
|
||||||
- Multi-track, full timeline NLE features: DEFER
|
- Multi-track, full timeline NLE features: DEFER
|
||||||
- Real-time collaboration / cloud sync: DEFER
|
- Real-time collaboration / cloud sync: DEFER
|
||||||
@ -52,18 +64,20 @@ These broaden scope or add legal/privacy surface — defer for now.
|
|||||||
|
|
||||||
## Risks & Mitigations
|
## Risks & Mitigations
|
||||||
- Large model sizes: don't bundle large models; download on-demand and document storage location.
|
- Large model sizes: don't bundle large models; download on-demand and document storage location.
|
||||||
- Timestamp accuracy: provide manual word-adjust UI and per-segment re-run.
|
- Timestamp accuracy: WhisperX word-level alignment + manual per-segment re-run available.
|
||||||
- FFmpeg packaging/licensing: ship platform-specific binaries or use Tauri bundling guidance; document license compliance.
|
- FFmpeg packaging/licensing: ship platform-specific binaries or use Tauri bundling guidance; document license compliance.
|
||||||
|
|
||||||
## Prioritized Quick Wins
|
## Prioritized Quick Wins
|
||||||
1. Model chooser UI + auto-fallback settings
|
1. Per-project incremental transcription
|
||||||
2. "Preview cleaning" dry-run UI
|
2. "Preview cleaning" dry-run UI
|
||||||
3. Per-project incremental transcription saving
|
3. Export presets (podcast vs YouTube)
|
||||||
|
|
||||||
## Next Steps for Implementation
|
## Next Steps for Implementation
|
||||||
- Add model chooser UI and capability detection early in the frontend iteration.
|
- Bundle Qwen3 LLM for offline AI processing.
|
||||||
- Implement Rust transcription command and a compact API for incremental transcription.
|
- Implement incremental transcription to speed up re-editing workflows.
|
||||||
- Implement FFmpeg polish templates and a minimal preview pipeline.
|
- Add export presets and size estimation.
|
||||||
|
- Improve GPU/CPU detection and model recommendations.
|
||||||
|
|
||||||
---
|
---
|
||||||
Generated as requested to capture tech, tools, planned features, and the recommended add/remove/defer list.
|
|
||||||
|
Generated to capture tech, tools, implemented features, and the recommended add/remove/defer list.
|
||||||
|
|||||||
261
plan.md
261
plan.md
@ -4,193 +4,124 @@
|
|||||||
|
|
||||||
TalkEdit's defensible position: **works on hour+ files without degrading**, fully offline, one-time payment. No competitor owns this — Descript chokes on long content, CapCut limits mobile uploads, and both require accounts.
|
TalkEdit's defensible position: **works on hour+ files without degrading**, fully offline, one-time payment. No competitor owns this — Descript chokes on long content, CapCut limits mobile uploads, and both require accounts.
|
||||||
|
|
||||||
---
|
**Current status (May 2026):** All core editing features are built and stable. Polish pass completed. 107 automated tests (95 frontend + 12 Rust). Ready for beta testing.
|
||||||
|
|
||||||
## Phase 1: Polish (pre-launch, do this first)
|
|
||||||
|
|
||||||
### Reliability & error handling
|
|
||||||
- [ ] Handle backend crashes gracefully — if Python backend dies, show a reconnect banner, don't leave UI in broken state
|
|
||||||
- [ ] Transcription failure recovery — show which model/download step failed, suggest alternatives
|
|
||||||
- [ ] Export failure reporting — surface FFmpeg stderr to user in a readable way
|
|
||||||
- [ ] File locking / concurrent access — prevent exporting while transcription is running and vice versa
|
|
||||||
|
|
||||||
### UX roughness
|
|
||||||
- [ ] Drag-and-drop file import onto the welcome screen
|
|
||||||
- [ ] Loading spinners for every async action with descriptive messages ("Downloading model...", "Analyzing silence...")
|
|
||||||
- [ ] Undo/redo visual feedback — toast notification "Undo: removed cut"
|
|
||||||
- [ ] `?` keyboard shortcut opens a proper cheat sheet modal
|
|
||||||
- [ ] Save indicator — dot or "unsaved" badge next to project name
|
|
||||||
- [ ] Disabled state for all buttons during export/transcription to prevent double-clicks
|
|
||||||
- [ ] Empty states for every panel ("Add your first marker", "No silence detected", etc.)
|
|
||||||
|
|
||||||
### Trial & licensing UX
|
|
||||||
- [ ] Wire up the "Activate" link to a real payment page (not placeholder `talked.it`)
|
|
||||||
- [ ] Show days remaining in the welcome screen bar (done)
|
|
||||||
- [ ] After trial expires, clearly explain what still works (export, loading) vs what's locked (editing, AI)
|
|
||||||
- [ ] License key field should handle paste + validate format client-side before sending to Rust
|
|
||||||
|
|
||||||
### Performance
|
|
||||||
- [ ] Lazy-load the waveform for very long files (>2hr) — don't fetch entire WAV at once
|
|
||||||
- [ ] Virtualize the waveform canvas rendering (only draw visible portion), not just transcript
|
|
||||||
- [ ] Debounce project auto-save
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Phase 2: Standout features (own the niche)
|
## Phase 1: Polish ✅ COMPLETED
|
||||||
|
|
||||||
### Long-form content (win here)
|
### Reliability & error handling ✅
|
||||||
- [ ] **Chapter-based navigation** — markers auto-sorted, click to jump, usable on 3hr files (partially done)
|
- [x] Backend health check — polls `/health` every 30s, shows reconnecting banner
|
||||||
- [ ] **Per-segment re-transcription** without losing surrounding context (done)
|
- [x] Export failure reporting — surfaces FFmpeg stderr with copy-to-clipboard
|
||||||
- [ ] **Append multiple clips** into one timeline (done)
|
- [x] React ErrorBoundary catches render crashes, shows fallback with reload
|
||||||
- [ ] **Project stitching** — load multiple `.aive` projects, combine into one export
|
- [x] Global JS error logging — `window.onerror` + `onunhandledrejection` logged to Rust backend
|
||||||
- [ ] **Session memory** — re-open last project on launch, auto-restore cursor position and scroll
|
|
||||||
- [ ] **Smart chunking** for transcription — for files >2hr, transcribe in overlapping chunks and stitch seamlessly
|
|
||||||
|
|
||||||
### Export differentiation
|
### UX polish ✅
|
||||||
- [ ] **YouTube chapters** — auto-generate from markers, copy as timestamps (done)
|
- [x] Tooltips on every button/control across all panels
|
||||||
- [ ] **Export chapter markers** — embed in MP4/MKV metadata for chapter skip in players
|
- [x] Loading spinners for waveform, waveform retry button
|
||||||
- [ ] **Batch export** — export multiple projects or multiple cuts from one project in sequence
|
- [x] Export progress bar (visual, not just text)
|
||||||
- [ ] **Export transcript format presets** — SRT, VTT, TXT, TXT with timestamps, markdown (done)
|
- [x] Help panel with full feature documentation
|
||||||
|
- [x] Keyboard cheatsheet overlay with close button and preset indicator
|
||||||
|
- [x] First-run welcome overlay with 3-step guide
|
||||||
|
- [x] `?` keyboard shortcut opens cheatsheet (accessible from Help panel)
|
||||||
|
- [x] Empty states: MarkersPanel, AIPanel, WaveformTimeline
|
||||||
|
- [x] Error states: AIPanel with retry, WaveformTimeline with retry
|
||||||
|
- [x] Auto-save crash recovery every 60s, restore prompt on next launch
|
||||||
|
- [x] Confirmation dialogs for zone/marker deletion
|
||||||
|
- [x] Disabled state for all buttons during export/transcription
|
||||||
|
- [x] Export button disabled when no video loaded
|
||||||
|
|
||||||
### AI features (local-first moat)
|
### Consistency ✅
|
||||||
|
- [x] Mute zone color unified (blue everywhere)
|
||||||
|
- [x] Disabled opacity unified (40% everywhere)
|
||||||
|
- [x] Zone list items border radius unified (`rounded-lg`)
|
||||||
|
- [x] Toolbar button groups separated with visual dividers
|
||||||
|
- [x] Labels simplified: "Sound Gain", "Speed Adjust", "Trim Silence", "Chapter Marks", "Edit Zones", "Add Clips", "Bkg. Music", "AI Tools"
|
||||||
|
- [x] Model selector moved to AIPanel reprocess tab
|
||||||
|
- [x] Orphaned VolumePanel.tsx removed
|
||||||
|
|
||||||
All powered by the bundled Qwen3 LLM. No API keys, no cloud calls. Features are grouped by how much they contribute to the core workflow.
|
### Trial & licensing ✅
|
||||||
|
- [x] Trial duration: 7 days
|
||||||
|
- [x] Trial bar on welcome screen with days remaining
|
||||||
|
- [x] Sentinel file prevents deleting trial.json to reset trial
|
||||||
|
- [x] XOR integrity check prevents editing trial.json timestamp
|
||||||
|
- [x] `canEdit` defaults to `false` (locked until status check confirms)
|
||||||
|
- [x] Email confirmation step before license activation (deters key sharing)
|
||||||
|
- [x] `verify_license` command (verify without caching)
|
||||||
|
- [x] Expired banner explains what still works (export, loading)
|
||||||
|
|
||||||
#### Content creation (biggest value)
|
### Robustness ✅
|
||||||
- [ ] **Smart Shorts finder** — scan the full transcript for self-contained, engaging segments 10–90s long. Ranks by: narrative completeness (has a beginning/end), energy cues from transcript sentiment, and topic boundaries. Results shown as a list of suggested cut ranges with preview. One-click export as a separate short video or copy timecodes.
|
- [x] React ErrorBoundary
|
||||||
- [ ] **Sound bite / quotable moment finder** — find punchy, standalone sentences that work as clips for social media. Ranks by quotability. Different from Shorts finder: these are <15s, single-sentence, high-impact lines.
|
- [x] Store-level input validation (reject NaN, clamp bounds, enforce min zone duration)
|
||||||
- [ ] **Hook analyzer** — score the first 30 seconds of the video for engagement. Suggests cuts or rephrases to make the intro stronger. Shows a "hook score" 1–10.
|
- [x] Runtime assertions in critical paths (TranscriptEditor, WaveformTimeline, ExportDialog)
|
||||||
- [ ] **Title + description generator** — suggest 5 titles and a YouTube description from the transcript + markers. One-click copy.
|
- [x] Auto-save crash recovery
|
||||||
|
- [x] CI pipeline (GitHub Actions: Rust + Frontend + Python)
|
||||||
#### Editing acceleration
|
- [x] Bad project state recovery (auto-prunes invalid zones on load, Dev Panel reset button)
|
||||||
- [ ] **AI auto-chapters** — detect topic shifts from transcript → create timeline markers. Uses topic segmentation from LLM, not just silence gaps.
|
- [x] 95 frontend tests (editorStore, licenseStore, aiStore, assert)
|
||||||
- [ ] **AI Smart Clean** — one-pass: filler removal + silence trim + normalize (done)
|
- [x] 12 Rust tests (licensing, models)
|
||||||
- [ ] **AI sentence rephrase** — right-click word → rephrase with AI → replace in transcript (also done, uses backend)
|
- [x] Canvas zone handles enlarged (r=6), hit area increased
|
||||||
- [ ] **AI pacing analysis** — flag segments where the speaker talks too fast or too slow for sustained periods. Suggests speed range adjustments or cuts.
|
- [x] Search match contrast improved
|
||||||
- [ ] **AI dead-air finder** — finds moments where nothing interesting is said for >5s (rambling, off-topic, false starts). Different from silence trimmer — this is content-based, not audio-level-based.
|
- [x] Split panes keyboard-accessible (arrow keys, tabIndex, ARIA)
|
||||||
- [ ] **AI readabilty scan** — flag sentences that are too long, complex, or jargon-heavy for spoken word. Suggests simpler alternatives.
|
|
||||||
|
|
||||||
#### Metadata & distribution
|
|
||||||
- [ ] **AI show notes** — generate title, description, key moments, and timestamps from transcript + markers
|
|
||||||
- [ ] **AI keyword/tag extraction** — pull out 5–10 topic tags from the transcript. Useful for YouTube SEO or categorization
|
|
||||||
- [ ] **AI question finder** — detect all questions asked in the video (speaker or guest). Useful for Q&A videos, AMAs, interviews — jump to each question instantly
|
|
||||||
- [ ] **AI thumbnail text suggestion** — suggest short overlay text for video thumbnails based on the most compelling line in the video
|
|
||||||
- [ ] **AI call-to-action finder** — detect where the speaker asks for likes/subscribes/comments. Lets you trim or reposition CTAs
|
|
||||||
|
|
||||||
#### Accessibility & compliance
|
|
||||||
- [ ] **AI content flagging** — flag profanity, sensitive topics, or copyrighted references in the transcript. Color-coded by category. Useful before publishing to restricted platforms
|
|
||||||
- [ ] **AI language leveling** — rewrite transcript segments at a target reading grade level (e.g., "simplify to 8th-grade level"). Useful for educational content or broad audiences
|
|
||||||
|
|
||||||
### Bundled local LLM (killer friction-killer)
|
|
||||||
The biggest UX gap: users must set up Ollama or paste an API key to use AI features. Bundle two small models — download on first AI use, just like Whisper models.
|
|
||||||
|
|
||||||
**Three-tier AI provider choice** (set once, persisted):
|
|
||||||
|
|
||||||
| Option | Hardware needed | Download | Setup |
|
|
||||||
|--------|----------------|----------|-------|
|
|
||||||
| Qwen3 4B (recommended) | 8GB+ free RAM | 2.5 GB | None — auto-download |
|
|
||||||
| Qwen3 1.7B (lightweight) | 4GB+ free RAM | 1.0 GB | None — auto-download |
|
|
||||||
| Ollama (bring your own) | Any | None | User starts Ollama themselves |
|
|
||||||
|
|
||||||
Default to Qwen3 4B. If the machine can't meet the RAM threshold (checked via Tauri at runtime), fall back to 1.7B or prompt to set up Ollama.
|
|
||||||
|
|
||||||
- [ ] **Integrate llama.cpp Rust bindings** (`llama-cpp-rs` or `candle`) — replace Python `ai_provider.py` calls with native inference for bundled models
|
|
||||||
- [ ] **Auto-download Qwen3 4B or 1.7B** on first AI action based on hardware check (GGUF Q4_K_M format, ~2.5GB / ~1GB). Same UX as Whisper download: progress bar, resume on interrupt
|
|
||||||
- [ ] **Model selector** in Settings: "Qwen3 4B (fast, no setup)" vs "Qwen3 1.7B (lightweight)" vs Ollama vs OpenAI vs Claude. Default to Qwen3 4B
|
|
||||||
- [ ] **Hardware detection on first AI use** — check total system RAM, recommend 4B if ≥8GB free, 1.7B if less. Skip download entirely if machine can't run either (fall back to Ollama/API prompt)
|
|
||||||
- [ ] **GPU acceleration** — llama.cpp supports CUDA/Metal/Vulkan. Detect at runtime and enable if available
|
|
||||||
- [ ] **For lightweight AI tasks** (filler detection, chapter titles, summarization) the bundled model handles them directly. Only task requiring heavier reasoning (rephrase, smart speed) get the full model
|
|
||||||
- [ ] **Remove the Python backend dependency for AI** — once bundled LLM + Whisper.cpp handle everything, no need to ship Python for AI features. One less runtime dependency
|
|
||||||
- [ ] GGUF model files are cached in app data dir, same as Whisper models
|
|
||||||
|
|
||||||
**Why this wins:**
|
|
||||||
- New users open the app, click "Detect filler words" and it just works. No API signup. No Docker. No "install Ollama" README steps.
|
|
||||||
- Descript charges $24/mo and still requires internet. CapCut's AI features are cloud-only. TalkEdit gives you local AI with zero setup.
|
|
||||||
- The same download-on-demand pattern already works for Whisper models — users understand it.
|
|
||||||
- Two size options means it works on everything from a 16GB gaming laptop to an 8GB office machine.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Phase 3: Marketing & launch
|
## Phase 2: Standout features (post-beta)
|
||||||
|
|
||||||
### Target audiences (three distinct segments)
|
### Long-form content
|
||||||
|
- [x] Chapter-based navigation — markers auto-sorted, click to jump (partially done)
|
||||||
|
- [x] Per-segment re-transcription (done)
|
||||||
|
- [x] Append multiple clips into one timeline (done)
|
||||||
|
- [ ] Project stitching — load multiple `.aive` projects, combine into one export
|
||||||
|
- [ ] Smart chunking for transcription — for files >2hr
|
||||||
|
|
||||||
| Segment | Pain point | TalkEdit's message |
|
### Export
|
||||||
|---------|-----------|-------------------|
|
- [x] YouTube chapters from markers (done)
|
||||||
| **Podcasters** | Editing a 1hr episode takes 4hrs in traditional editors. Descript is expensive and cloud-only. | *"Transcribe, clean, and export your podcast in minutes. One payment, forever."* |
|
- [x] Export transcript formats: SRT, VTT, TXT (done)
|
||||||
| **YouTube creators (long-form)** | CapCut chokes on 30min+ files. Need AI tools but don't want subscriptions. | *"Edit hour-long videos like a doc. AI chapters, filler removal, Smart Shorts — all local."* |
|
- [ ] Batch export — multiple projects/cuts in sequence
|
||||||
| **Privacy-conscious / enterprise** | Can't upload content to cloud editors (legal, compliance, NDAs). | *"100% offline. Your video never leaves your machine. No account required."* |
|
|
||||||
|
### AI features
|
||||||
|
- [x] AI Smart Clean — filler removal + silence trim + normalize (done)
|
||||||
|
- [x] AI sentence rephrase (done)
|
||||||
|
- [x] AI clip suggestions for social media (done)
|
||||||
|
- [ ] Smart Shorts finder — scan transcript for 10–90s segments
|
||||||
|
- [ ] AI auto-chapters — topic detection from transcript
|
||||||
|
- [ ] AI show notes — title, description, key moments
|
||||||
|
- [ ] AI dead-air finder — content-based silence detection
|
||||||
|
|
||||||
|
### Bundled local LLM
|
||||||
|
- [ ] Integrate llama.cpp Rust bindings
|
||||||
|
- [ ] Auto-download Qwen3 on first AI use (4B: 2.5GB / 1.7B: 1GB)
|
||||||
|
- [ ] Hardware detection at runtime, model selection in Settings
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 3: Marketing & launch (post-beta)
|
||||||
|
|
||||||
### Messaging pillars
|
### Messaging pillars
|
||||||
|
1. "The offline video editor that doesn't slow down on long files"
|
||||||
|
2. "No subscription. One price, owned forever."
|
||||||
|
3. "Zero-setup AI" — bundled Qwen3, no API keys
|
||||||
|
4. "Your podcast → 10 TikToks in one click" — Smart Shorts finder
|
||||||
|
|
||||||
1. **"The offline video editor that doesn't slow down on long files"** — core positioning
|
### Channels
|
||||||
2. **"No subscription. One price, owned forever."** — pricing differentiator
|
- [ ] r/podcasting, r/VideoEditing, r/selfhosted
|
||||||
3. **"Zero-setup AI"** — bundled Qwen3, no API keys, no Docker, no Ollama
|
- [ ] Product Hunt, Hacker News "Show HN"
|
||||||
4. **"Your podcast → 10 TikToks in one click"** — Smart Shorts finder hook
|
- [ ] YouTube demo (3-5 min walkthrough)
|
||||||
|
- [ ] Free licenses to 20 podcasters for testimonials
|
||||||
### Launch channels
|
|
||||||
|
|
||||||
#### Creator communities (highest ROI)
|
|
||||||
- [ ] **r/podcasting** — post a demo video: "I edited a 1hr podcast in 4 minutes with this free tool." Free trial link. Emphasize offline + no sub.
|
|
||||||
- [ ] **r/VideoEditing** — comparison post: "TalkEdit vs Descript for long-form." Let the features speak. Include benchmarks (2hr file load time, export speed).
|
|
||||||
- [ ] **r/selfhosted** — this audience cares deeply about offline/local. Post: "Fully offline Descript alternative I've been building. Built-in local AI, no cloud." Free license giveaways to top commenters.
|
|
||||||
- [ ] **r/SaaS** — post the journey, get feedback. Good for building awareness among builders who might recommend it.
|
|
||||||
- [ ] **Hacker News** — "Show HN: I built an offline video editor with bundled local AI." The technical audience will appreciate the Rust + llama.cpp + Whisper stack. Be ready for technical questions.
|
|
||||||
|
|
||||||
#### Video demos (the product is visual — show it)
|
|
||||||
- [ ] **Product Hunt launch** — video + GIF-heavy listing. Tagline: *"Descript for long-form content, 100% offline."* Give away 50 free Pro licenses on launch day.
|
|
||||||
- [ ] **YouTube demo** — 3-5 min screencap: open 1hr file → auto-transcribe → Smart Clean → Smart Shorts finds 10 clips → export all. No cuts, real-time.
|
|
||||||
- [ ] **TikTok/Shorts** — 30s clips of the Smart Shorts finder in action. "This tool turned my 1hr podcast into 10 TikToks automatically." Each short is itself a demo of the feature.
|
|
||||||
|
|
||||||
#### Earned / low-cost
|
|
||||||
- [ ] **Offer free Pro licenses** to 20 podcasters with >10K followers in exchange for a public review or mention. Target: Joe Rogan–style solo podcasters who edit their own content.
|
|
||||||
- [ ] **GitHub release** — tag v1.0.0 with detailed release notes, screenshots, and binary downloads for all three platforms. Encourage issues/feature requests.
|
|
||||||
- [ ] **Write a "why I built this" post** — submit to Indie Hackers, Medium, Dev.to. Focus on: frustration with Descript pricing, desire for offline tools, the technical challenge of bundling an LLM.
|
|
||||||
- [ ] **Comparison landing page** — `talked.it/vs/descript` with a feature table, pricing comparison, and "privacy" as a highlighted column. SEO target: "Descript alternative"
|
|
||||||
|
|
||||||
#### Paid (only after product-market fit is validated)
|
|
||||||
- [ ] YouTube ads targeting "how to edit a podcast" and "Descript tutorial" search terms
|
|
||||||
- [ ] Podcast sponsorship on indie podcasting shows (target audience overlap is perfect)
|
|
||||||
|
|
||||||
### Free-to-paid funnel
|
|
||||||
- [ ] 30-day full-feature trial — no credit card required, no account signup
|
|
||||||
- [ ] After trial: locked editing + AI, but export still works (people can still get value from completed projects)
|
|
||||||
- [ ] Pro license: $39 one-time. Business license: $79 one-time (priority support, volume licensing)
|
|
||||||
- [ ] No subscriptions. Emphasize: *"Buy once, don't think about it again."*
|
|
||||||
|
|
||||||
### Launch checklist
|
|
||||||
- [ ] Landing page at talked.it with: feature list, screenshots, pricing, download buttons
|
|
||||||
- [ ] Demo video (3-5 min walkthrough)
|
|
||||||
- [ ] 30s Smart Shorts demo clip for TikTok/Shorts
|
|
||||||
- [ ] Product Hunt listing ready (logo, description, GIFs, launch day plan)
|
|
||||||
- [ ] Reddit drafts for r/podcasting, r/VideoEditing, r/selfhosted
|
|
||||||
- [ ] HN Show HN draft
|
|
||||||
- [ ] 20 free Pro licenses queued for influencers
|
|
||||||
- [ ] GitHub v1.0.0 release with binaries
|
- [ ] GitHub v1.0.0 release with binaries
|
||||||
- [ ] Compare page: TalkEdit vs Descript
|
|
||||||
|
### Pricing
|
||||||
|
- 7-day free trial (no CC, no account)
|
||||||
|
- Pro: $39 one-time
|
||||||
|
- Business: $79 one-time (priority support, volume licensing)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Phase 4: Post-launch
|
## Non-goals (explicitly deferred)
|
||||||
|
|
||||||
### Retention
|
|
||||||
- [ ] In-app changelog on update
|
|
||||||
- [ ] Email list for major releases (optional, no account required)
|
|
||||||
- [ ] Community templates/sharing for export presets and filler lists
|
|
||||||
|
|
||||||
### Growth features
|
|
||||||
- [ ] **Sample video download** — "Try without your own media" button downloads a test file + pre-made transcript
|
|
||||||
- [ ] **Built-in free music library** — 5-10 CC0 loops shipped with the app
|
|
||||||
- [ ] **Export presets** — community-contributed, loaded from a JSON file
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Non-goals (explicitly defer)
|
|
||||||
|
|
||||||
- Cloud sync / collaboration
|
- Cloud sync / collaboration
|
||||||
- Voice cloning / TTS
|
- Voice cloning / TTS
|
||||||
- Full multi-track NLE timeline (transitions, picture-in-picture, etc.)
|
- Full multi-track NLE timeline
|
||||||
- Mobile app
|
- Mobile app
|
||||||
- Subscription model
|
- Subscription model
|
||||||
- Training/fine-tuning models in-app
|
- Image/video generation models
|
||||||
- Image/video generation models (Stable Diffusion, etc.) — text-only LLM is sufficient for transcript tasks
|
|
||||||
|
|||||||
Reference in New Issue
Block a user