# CutScript
An open-source, local-first, Descript-like text-based audio and video editor powered by AI. Edit audio/video by editing text — delete a word from the transcript and it's cut from the audio/video.

<img width="1034" height="661" alt="image" src="https://github.com/user-attachments/assets/b1ed9505-792e-42ca-bb73-85458d0f02a5" />

## Architecture
- **Tauri + React** desktop app with Tailwind CSS
- **FastAPI** Python backend (spawned as child process)
- **WhisperX** for word-level transcription with alignment
- **FFmpeg** for video processing (stream-copy and re-encode)
- **Ollama / OpenAI / Claude** for AI features (filler removal, clip creation)
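
The text-based editing idea can be sketched in a few lines: given word-level timestamps from transcription, the words the user did not delete determine which time ranges of the media to keep. This is only an illustrative sketch, not the project's actual implementation; the `Word` class and `keep_segments` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word record: text plus start/end times in seconds."""
    text: str
    start: float
    end: float
    deleted: bool = False


def keep_segments(words: list[Word]) -> list[tuple[float, float]]:
    """Merge runs of consecutive non-deleted words into (start, end) ranges."""
    segments: list[tuple[float, float]] = []
    prev_kept = False
    for w in words:
        if w.deleted:
            prev_kept = False
            continue
        if prev_kept:
            # Extend the current range so pauses between kept words survive.
            segments[-1] = (segments[-1][0], w.end)
        else:
            segments.append((w.start, w.end))
        prev_kept = True
    return segments


words = [
    Word("So", 0.0, 0.3),
    Word("um", 0.35, 0.6, deleted=True),
    Word("hello", 0.7, 1.1),
    Word("world", 1.15, 1.6),
]
print(keep_segments(words))  # [(0.0, 0.3), (0.7, 1.6)]
```

Each resulting range could then be handed to FFmpeg for cutting and concatenation; how the real backend does this is not specified here.
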
## Quick Start
### Prerequisites
- Node.js 18+
- Python 3.10+
- FFmpeg (in PATH)
- (Optional) Ollama for local AI features
### Install

```bash
# Root and frontend dependencies
npm install
cd frontend && npm install && cd ..

# Backend dependencies
cd backend && pip install -r requirements.txt && cd ..
```

### Run (Development)

```bash
# Start Tauri dev environment (includes backend + frontend)
npm run dev:tauri
```

Or run them separately:

```bash
# Terminal 1: Backend
cd backend && python -m uvicorn main:app --reload --port 8642

# Terminal 2: Frontend + Tauri
cd frontend && cargo tauri dev
```

## Project Structure

```
talkedit/
├── src-tauri/                      # Tauri Rust runtime
│   ├── Cargo.toml
│   ├── src/
│   │   ├── main.rs                 # App entry & backend spawner
│   │   └── commands/               # Tauri IPC handlers
├── frontend/                       # React + Vite + Tailwind
│   └── src/
│       ├── components/             # VideoPlayer, TranscriptEditor, etc.
│       ├── store/                  # Zustand state (editorStore, aiStore)
│       ├── lib/tauri-bridge.ts     # Tauri API polyfill
│       └── types/                  # TypeScript interfaces
├── backend/                        # FastAPI Python backend
│   ├── main.py
│   ├── routers/                    # API endpoints
│   ├── services/                   # Core logic (transcription, editing, AI)
│   └── utils/                      # GPU, cache, audio helpers
└── shared/                         # Project schema
```

## Features
| Feature | Status |
|---------|--------|
| Word-level transcription (WhisperX) | Done |
| Text-based video editing | Done |
| Undo/redo | Done |
| Waveform timeline | Done |
| FFmpeg stream-copy export | Done |
| FFmpeg re-encode (up to 4K) | Done |
| AI filler word removal | Done |
| AI clip creation (Shorts) | Done |
| Ollama + OpenAI + Claude | Done |
| Word-level captions (SRT/VTT/ASS) | Done |
| Caption burn-in on export | Done |
| Studio Sound (DeepFilterNet) | Done |
| Keyboard shortcuts (J/K/L) | Done |
| Speaker diarization | Done |
| Virtualized transcript (react-virtuoso) | Done |
| Encrypted API key storage | Done |
| Project save/load (.cutscript) | Done |
| AI background removal | Planned |
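
As an illustration of the word-level caption feature, the sketch below formats times and cues in the SRT style (`HH:MM:SS,mmm`). The function names `srt_timestamp` and `srt_cue` are hypothetical and are not taken from the project's code; the real caption generator may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms_total = round(seconds * 1000)
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: index line, time range line, then the caption text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


print(srt_cue(1, 0.0, 1.5, "hello world"))
```
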
## Keyboard Shortcuts
| Key | Action |
|-----|--------|
| Space | Play / Pause |
| J / K / L | Reverse / Pause / Forward |
| ← / → | Seek ±5 seconds |
| Delete | Delete selected words |
| Ctrl+Z | Undo |
| Ctrl+Shift+Z | Redo |
| Ctrl+S | Save project |
| Ctrl+E | Export |
| ? | Shortcut cheatsheet |
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /health | Health check |
| POST | /transcribe | Transcribe video with WhisperX |
| POST | /export | Export edited video (stream copy or re-encode) |
| POST | /ai/filler-removal | Detect filler words via LLM |
| POST | /ai/create-clip | AI-suggested clips for shorts |
| GET | /ai/ollama-models | List local Ollama models |
| POST | /captions | Generate SRT/VTT/ASS captions |
| POST | /audio/clean | Noise reduction (DeepFilterNet) |
| GET | /audio/capabilities | Check audio processing availability |
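
A minimal way to check that the backend is reachable, assuming it was started on port 8642 as in the development command above. The host `127.0.0.1` and the helper names `endpoint` and `backend_is_up` are assumptions for illustration; the response body of `/health` is not documented here, so only the HTTP status is inspected.

```python
import urllib.request

# Port taken from the uvicorn dev command; the host is an assumption.
BASE_URL = "http://127.0.0.1:8642"


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"


def backend_is_up(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(endpoint("/health", base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    print("backend reachable:", backend_is_up())
```
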
## License
MIT License. See [LICENSE](LICENSE) for details.